# GEMS PedTools Example Usage
Below is a simple example that illustrates how to access data in the PedTools database.
### Set up an HTTP client using Python's requests library
We use a `Session` object to store our API key and automatically include it in the header for each request.

Note that we have an `api_key.py` file in the Exchange-Notebooks directory. The file contains only the line below.
```
api_key = 'SECRET'
```

In [None]:
import json
import pandas as pd
from requests import Session
from urllib.parse import quote as escape  # percent-encode names for URL paths (html.escape targets HTML markup)
import sys
sys.path.append('..')
from api_key import api_key

s = Session()
s.headers.update({'apikey': api_key})

#base = 'https://exchange-1.gems.msi.umn.edu/pedtools/v1'
base = 'https://exchange-dev.gems.msi.umn.edu/pedtools/v1'
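
The endpoints used in this notebook place the variety name directly in the URL path, so names containing spaces, slashes, or other reserved characters need percent-encoding. The sketch below shows one way to do that with the standard library's `urllib.parse.quote`. The helper name `variety_url` and the second sample name are hypothetical and are not part of the PedTools API. Note that `html.escape` targets HTML markup rather than URLs, so it leaves spaces and slashes untouched.

```python
from urllib.parse import quote


def variety_url(base_url: str, variety: str) -> str:
    """Build the lookup URL for one variety (hypothetical helper)."""
    # safe='' also encodes '/', so a slash inside a name cannot be
    # mistaken for a path separator.
    return f"{base_url}/{quote(variety, safe='')}"


base_url = 'https://exchange-dev.gems.msi.umn.edu/pedtools/v1'
print(variety_url(base_url, 'TURKEY'))
print(variety_url(base_url, 'TURKEY RED'))  # made-up name containing a space
```

The resulting string can be passed to `s.get(...)` in place of the f-string used in the cells below.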

### Find info on a variety of interest
We use the `/{variety}` endpoint to obtain all recorded info on a variety, in this case 'TURKEY'.

In [2]:
params = {'pedigree_depth': 5}
variety = escape('TURKEY')
res = s.get(f'{base}/{variety}', params=params)
df = pd.json_normalize(res.json())
df

| | preferred_name | crop_name | is_landrace | backcross_depth | selfing_count | market_class | release_date | developer | parentage | aliases | ... | mother.crop_name | mother.is_landrace | mother.backcross_depth | mother.selfing_count | mother.market_class | mother.release_date | mother.developer | mother.parentage | mother.aliases | mother |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 135 (GID:135) | wheat | False | 0 | 2 | | | | | [{'name': '135', 'type': 'germplasm_bank_id'},... | ... | wheat | False | 0.0 | 1.0 | | | | | [{'name': '3700', 'type': 'germplasm_bank_id'}... | |
| 1 | TURKEY (GID:10509) | wheat | False | 0 | 0 | | | | | [{'name': '10509', 'type': 'germplasm_bank_id'... | ... | | | | | | | | | | |


### Cool, two items were returned. But why was the variety '135' returned? Check the aliases.

In [3]:
df['aliases'][0]

[{'name': '135', 'type': 'germplasm_bank_id'},
 {'name': 'CWI51783', 'type': 'germplasm_bank_accession_id'},
 {'name': 'P.1066-?-?', 'type': 'selection_history'},
 {'name': 'TK', 'type': 'cross_abbreviation'},
 {'name': 'TURKEY', 'type': 'cross_name'}]
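
The alias list above explains the match: one of the aliases of variety '135' is the cross name 'TURKEY'. A minimal sketch of how to pick out the matching alias programmatically follows. The helper `matching_aliases` is hypothetical, and the case-insensitive comparison is an assumption, since how the service itself compares names is not stated here.

```python
# Alias list copied from the output above.
aliases = [
    {'name': '135', 'type': 'germplasm_bank_id'},
    {'name': 'CWI51783', 'type': 'germplasm_bank_accession_id'},
    {'name': 'P.1066-?-?', 'type': 'selection_history'},
    {'name': 'TK', 'type': 'cross_abbreviation'},
    {'name': 'TURKEY', 'type': 'cross_name'},
]


def matching_aliases(aliases, query):
    """Return alias entries whose name equals the query, ignoring case."""
    wanted = query.casefold()
    return [a for a in aliases if a['name'].casefold() == wanted]


print(matching_aliases(aliases, 'TURKEY'))
# → [{'name': 'TURKEY', 'type': 'cross_name'}]
```

In the notebook, the same helper could be applied to `df['aliases'][0]`.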

### What are all the columns I can check out from the above data frame?

In [4]:
df.columns

Index(['preferred_name', 'crop_name', 'is_landrace', 'backcross_depth',
       'selfing_count', 'market_class', 'release_date', 'developer',
       'parentage', 'aliases', 'father', 'pedigree', 'mother.preferred_name',
       'mother.crop_name', 'mother.is_landrace', 'mother.backcross_depth',
       'mother.selfing_count', 'mother.market_class', 'mother.release_date',
       'mother.developer', 'mother.parentage', 'mother.aliases', 'mother'],
      dtype='object')

Notice that there are many columns describing attributes of the mother but none describing attributes of the father. As you'll see below, this is because the father of both returned varieties is null (`None`).

In [5]:
df['father']

0    None
1    None
Name: father, dtype: object
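
The column layout follows from how nested records are flattened: `pd.json_normalize` expands a nested dictionary into dotted column names such as `mother.crop_name`, while a `None` value stays in a single column such as `father`. The pure-Python sketch below is a simplified stand-in for that behaviour, not pandas' actual implementation, and the two records are made up for illustration.

```python
def flatten(record, sep='.'):
    """Flatten nested dicts into dotted keys; leave other values (including None) as they are."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, dict):
            # Recurse and prefix each nested key with its parent key.
            for sub_key, sub_value in flatten(value, sep).items():
                flat[f'{key}{sep}{sub_key}'] = sub_value
        else:
            flat[key] = value
    return flat


# Made-up records: one with a known mother, one with no recorded parents.
records = [
    {'preferred_name': 'CHILD', 'mother': {'preferred_name': 'MOM'}, 'father': None},
    {'preferred_name': 'ORPHAN', 'mother': None, 'father': None},
]

for record in records:
    print(sorted(flatten(record)))
```

The first record yields a `mother.preferred_name` key, while the second keeps a plain `mother` key, which is why the data frame above contains both `mother.*` columns and a bare `mother` column.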

### Let's check another entry, and retrieve its pedigree to a depth of 5 (great-great-great grandparents).

In [16]:
params = {'pedigree_depth': 5}
variety = escape('SANDPIPER')
res = s.get(f'{base}/{variety}', params=params)
df = pd.json_normalize(res.json())
df

JSONDecodeError: Expecting value: line 1 column 1 (char 0)
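
A `JSONDecodeError` at `line 1 column 1 (char 0)` means the parser found no JSON value at the very start of the response body, which typically happens when the body is empty or is an error page rather than JSON. Why this particular request failed is not recorded here. As a sketch, one can check the status code and parse the body defensively before building a data frame; the helper `parse_json_response` is hypothetical and uses only the standard library.

```python
import json


def parse_json_response(status_code, body):
    """Return (data, error); error is None when the body parsed successfully."""
    if status_code != 200:
        return None, f'HTTP {status_code}'
    try:
        return json.loads(body), None
    except json.JSONDecodeError as exc:
        return None, f'body is not JSON: {exc}'


print(parse_json_response(200, '[{"preferred_name": "X"}]'))
print(parse_json_response(200, ''))           # empty body: reports the decode error
print(parse_json_response(404, 'Not Found'))  # non-200 status: body is not parsed
```

With the session from this notebook, the call would look like `data, err = parse_json_response(res.status_code, res.text)`, and `pd.json_normalize(data)` would run only when `err` is `None`.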

### A successful request returns one match. Let's retrieve its pedigree.

In [15]:
df['pedigree'][0]

'D23055 (GID:1010) / ###252110 /2/ ###7220 (###7220) F6'

### How do I find the Coefficient of Parentage (COP) between any arbitrary pair of varieties?
We use the `/{variety1}/{variety2}/cop` endpoint to obtain the COP between two varieties, variety1 and variety2.

In [None]:
var1 = escape('GAVIOTA')
var2 = escape('SANDPIPER')
res = s.get(f'{base}/{var1}/{var2}/cop')
pd.json_normalize(res.json())
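
To compare more than two varieties, the same endpoint can be called once per pair. The sketch below only builds the URLs, assuming the COP is symmetric in its two arguments so that unordered pairs suffice. The helper `cop_urls` is hypothetical, and the variety list simply reuses names from this notebook.

```python
from itertools import combinations
from urllib.parse import quote


def cop_urls(base_url, varieties):
    """Map each unordered pair of varieties to its /cop endpoint URL."""
    return {
        (a, b): f"{base_url}/{quote(a, safe='')}/{quote(b, safe='')}/cop"
        for a, b in combinations(varieties, 2)
    }


base_url = 'https://exchange-dev.gems.msi.umn.edu/pedtools/v1'
for pair, url in cop_urls(base_url, ['GAVIOTA', 'SANDPIPER', 'TURKEY']).items():
    # In the notebook, s.get(url) would fetch the COP for this pair.
    print(pair, url)
```

Three varieties give three pairs; in general, n varieties give n(n-1)/2 requests.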