# GEMS PedTools Example Usage for R
Below is a simple example that illustrates how to access data in the PedTools database.
### Set up an HTTP client using R's httr package
We use `add_headers()` to build a header object that holds our API key, and we pass that object to each request.

Note that we have an `api_key.R` file in the Exchange-Notebooks directory. The file contains only the line below.
```
api_key <- 'SECRET'
```
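
Because every request depends on this key, it can help to fail early when the file still holds the placeholder. The sketch below is one way to do that; `check_api_key()` is a hypothetical helper, not part of httr or the PedTools API.

```r
# Hypothetical helper (not part of httr or PedTools): stop early if the key
# is missing, empty, or still the 'SECRET' placeholder from api_key.R.
check_api_key <- function(key) {
  ok <- is.character(key) && length(key) == 1 && !is.na(key) &&
    nzchar(key) && key != "SECRET"
  if (!ok) {
    stop("Set a real API key in api_key.R before sending requests.", call. = FALSE)
  }
  invisible(key)
}

# Example with a made-up key; in the notebook you would pass api_key instead.
check_api_key("example-key-123")
```

Calling the helper right after `source("api_key.R")` would surface a missing key before any request is sent.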

In [1]:
# Load necessary libraries
library(httr)
library(jsonlite)
library(dplyr)


Attaching package: ‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union




### Set up an HTTP client

In [2]:
# Grab your API Key and assign it to api_key
source("api_key.R")

# Build the header object that carries the API key
headers <- add_headers(apikey = api_key)

# Define the base URL of the PedTools API
base <- "https://exchange-1.gems.msi.umn.edu/pedtools/v1"

### Grab all information for a specific variety

In [3]:
# Query information for a specific variety
params <- list(pedigree_depth = 5)
variety <- URLencode("TURKEY", reserved = TRUE)
url <- paste0(base, "/", variety)

# Perform GET request
response <- GET(url, headers, query = params)

# Check if the request was successful
if (status_code(response) == 200) {
  # Parse the JSON response and convert it to a data frame
  body_text <- content(response, as = "text", encoding = "UTF-8")
  data <- fromJSON(body_text, flatten = TRUE) %>% as.data.frame()
  print(data)
} else {
  cat("Error:", status_code(response), "-", content(response, as = "text", encoding = "UTF-8"), "\n")
}

    preferred_name crop_name backcross_str selfing_count market_class
1 TURKEY GID:10509     wheat                           0           NA
2   TURKEY GID:135     wheat                           2           NA
  release_date developer parentage
1           NA        NA        NA
2           NA        NA        NA
                                                                                                                                   aliases
1                                                                          10509, TURKEY, TURKEY 13.R, germplasm_bank_id, bcid, cross_name
2 135, CWI51783, P.1066-?-?, TK, TURKEY, germplasm_bank_id, germplasm_bank_accession_id, selection_history, cross_abbreviation, cross_name
  father                  pedigree mother.preferred_name mother.crop_name
1     NA          TURKEY GID:10509                  <NA>             <NA>
2     NA P.1066 GID:25 (P.1066) F2     P.1066-? GID:3700            wheat
  mother.backcross_str mother.selfing_count mo
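
The URL construction in the cell above can be wrapped in a small function so that names containing spaces or other reserved characters are always encoded the same way. This is a sketch only: `variety_url()` is a hypothetical helper, and the base URL in the example is a placeholder rather than the real endpoint.

```r
# Hypothetical helper: build the lookup URL for one variety name.
# URLencode() comes from the utils package, which ships with R.
variety_url <- function(base, name) {
  paste0(base, "/", utils::URLencode(name, reserved = TRUE))
}

# Placeholder base URL, for illustration only
example_url <- variety_url("https://example.org/pedtools/v1", "TURKEY 13.R")
print(example_url)
```

With `reserved = TRUE`, the space in the name is percent-encoded, so the name stays a single path segment.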

### What are the aliases for the returned varieties?

In [4]:
if ("aliases" %in% colnames(data)) {
    print("Aliases for the varieties:")
    print(data$aliases)
} else {
    print("No aliases found.")
}

[1] "Aliases for the varieties:"
[[1]]
         name              type
1       10509 germplasm_bank_id
2      TURKEY              bcid
3 TURKEY 13.R        cross_name

[[2]]
        name                        type
1        135           germplasm_bank_id
2   CWI51783 germplasm_bank_accession_id
3 P.1066-?-?           selection_history
4         TK          cross_abbreviation
5     TURKEY                  cross_name
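
Since `data$aliases` is a list with one data frame per variety, it can be convenient to stack the entries into a single table. The sketch below rebuilds the two alias tables printed above by hand, so it runs without an API call; the `alias_table` name is only for illustration.

```r
# Alias tables copied from the printed output above (no API call needed)
aliases <- list(
  data.frame(
    name = c("10509", "TURKEY", "TURKEY 13.R"),
    type = c("germplasm_bank_id", "bcid", "cross_name"),
    stringsAsFactors = FALSE
  ),
  data.frame(
    name = c("135", "CWI51783", "P.1066-?-?", "TK", "TURKEY"),
    type = c("germplasm_bank_id", "germplasm_bank_accession_id",
             "selection_history", "cross_abbreviation", "cross_name"),
    stringsAsFactors = FALSE
  )
)
preferred <- c("TURKEY GID:10509", "TURKEY GID:135")

# Prefix each alias table with its variety's preferred name, then stack them
labelled <- Map(
  function(nm, df) data.frame(preferred_name = nm, df, stringsAsFactors = FALSE),
  preferred, aliases
)
alias_table <- do.call(rbind, unname(labelled))
rownames(alias_table) <- NULL
print(alias_table)
```

In the notebook, `aliases` and `preferred` would come from `data$aliases` and `data$preferred_name` instead of being typed in.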



### Let's check another entry and retrieve its pedigree to a depth of 5 (great-great-great-grandparents).

In [5]:
# Query pedigree information for a specific variety
params <- list(pedigree_depth = 5)
variety <- URLencode("SANDPIPER", reserved = TRUE)
url <- paste0(base, "/", variety)

# Perform GET request
response <- GET(url, headers, query = params)

# Check if the request was successful
if (status_code(response) == 200) {
  # Parse the JSON response and convert it to a data frame
  body_text <- content(response, as = "text", encoding = "UTF-8")
  data <- fromJSON(body_text, flatten = TRUE) %>% as.data.frame()
  print(data$pedigree)
} else {
  cat("Error:", status_code(response), "-", content(response, as = "text", encoding = "UTF-8"), "\n")
}

[1] "###123561 / STEWART 63 /2/ 980 GID:980 / WELLS /3/ ###123633 / TEHUACAN 60 /2/ 980 GID:980 / WELLS /4/ POLONICUM PI185309 /2/ ###250171 / TEHUACAN 60 /3/ ###123633 / TEHUACAN 60 /2/ 980 GID:980 / WELLS /5/ II23584 GID:417 (II23584) F6 /2/ 3852 (###6385) F2 /3/ 3855 (###3926) F1 (###7220) F6"


### How do I find a matrix of Coefficient of Parentage (COP) values among any arbitrary list of varieties?
We send a POST request to the `cop/matrix` endpoint to obtain the COP matrix. Note that variety names do *not* need to be URL-encoded here, since they are sent in the body of the POST request rather than in the URL.
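
To make the difference concrete, the sketch below prepares names in both ways. It assumes only the jsonlite package loaded earlier: `toJSON()` builds a JSON array suitable for a request body, while `URLencode()` is needed only when a name goes into the URL path.

```r
library(jsonlite)

varieties <- c("FLAMINGO", "SANDPIPER", "GAVIOTA", "TURKEY")

# For a POST body: serialize the character vector as a JSON array
body_json <- toJSON(varieties)
print(body_json)

# For a URL path: characters such as spaces must be percent-encoded
path_part <- utils::URLencode("TURKEY 13.R", reserved = TRUE)
print(path_part)
```

Building the body from a character vector avoids hand-writing quotes and brackets when the list of varieties is long.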

In [6]:
params <- list(max_depth = 20)
# The body is a JSON array written as a string; httr sends a character body as is
var_list <- '["FLAMINGO", "SANDPIPER", "GAVIOTA", "TURKEY"]'
url <- paste0(base, "/cop/matrix")

# Perform POST request
response <- POST(url, headers, query = params, body = var_list, encode = "json")

# Check if the request was successful
if (status_code(response) == 200) {
  # Parse the JSON response and convert it to a data frame
  body_text <- content(response, as = "text", encoding = "UTF-8")
  data <- fromJSON(body_text, flatten = TRUE) %>% as.data.frame()
  print(data)
} else {
  cat("Error:", status_code(response), "-", content(response, as = "text", encoding = "UTF-8"), "\n")
}

  reverse_mapping.FLAMINGO reverse_mapping.GAVIOTA reverse_mapping.SANDPIPER
1                 FLAMINGO                 GAVIOTA                 SANDPIPER
2                 FLAMINGO                 GAVIOTA                 SANDPIPER
  reverse_mapping.TURKEY.GID.135 reverse_mapping.TURKEY.GID.10509
1                         TURKEY                           TURKEY
2                         TURKEY                           TURKEY
  forward_mapping.FLAMINGO forward_mapping.GAVIOTA forward_mapping.SANDPIPER
1                 FLAMINGO                 GAVIOTA                 SANDPIPER
2                 FLAMINGO                 GAVIOTA                 SANDPIPER
  forward_mapping.TURKEY cop.FLAMINGO.FLAMINGO cop.FLAMINGO.GAVIOTA
1         TURKEY GID:135             0.9921875                    0
2       TURKEY GID:10509             0.9921875                    0
  cop.FLAMINGO.SANDPIPER cop.FLAMINGO.TURKEY.GID.135
1              0.4960938                           0
2              0.4960938      
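
Calling `as.data.frame()` on this response flattens the nested result into many columns, which is hard to read. A possible alternative is to keep the parsed list and turn its `cop` element into a numeric matrix. The sketch below assumes, based on the column names printed above, that `cop` is a JSON object keyed by variety name whose values are objects keyed the same way. The sample text contains only the FLAMINGO values visible in the output; it is not a real response.

```r
library(jsonlite)

# Hand-written sample limited to values visible in the printed output;
# the real response layout is an assumption inferred from the column names.
sample_json <- '{
  "cop": {
    "FLAMINGO": {
      "FLAMINGO": 0.9921875,
      "GAVIOTA": 0,
      "SANDPIPER": 0.4960938,
      "TURKEY GID:135": 0
    }
  }
}'

parsed <- fromJSON(sample_json)

# Each inner object becomes one named row of the matrix
cop_matrix <- do.call(rbind, lapply(parsed$cop, unlist))
print(cop_matrix)
```

With a full response, the same two lines would give one row per variety, so a single value can be read with `cop_matrix["FLAMINGO", "SANDPIPER"]`.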