# Download EDD Study From Jupyter Notebook
This notebook illustrates how to use Python to export an EDD study into a pandas dataframe for downstream analytics and processing in any bioinformatics workflow. It also includes a function that exports the study's line metadata.

First, install the edd-utils package if needed, then import the functions required to log in and export a study from the edd_utils module.

In [242]:
# Install the edd-utils pip package into the current Jupyter kernel if running on your personal computer.
print("If you run this on jupyter.jbei.org select JBEI-py3.6 in the top right corner and ignore this cell and its error.")
import sys
!{sys.executable} -m pip install edd-utils --user

If you run this on jupyter.jbei.org select JBEI-py3.6 in the top right corner and ignore this cell and its error.


In [241]:
from edd_utils import login, export_study

Each EDD study has a unique identifier called a *slug*. The slug is the string at the end of the study's URL, between the last two slashes (``/``). We pass this string to the exporter to tell it which study to download.
Below is an example.

In [4]:
# Study to download
study_slug = 'f2ftest1'
slug = study_slug  # shorter alias, used when exporting the metadata below
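
The rule described above (take the string between the last two slashes) can also be applied programmatically when only the study URL is at hand. The following is a minimal sketch, not part of edd_utils: `slug_from_url` is a hypothetical helper, and the example URL shape is an assumption made for illustration.

```python
from urllib.parse import urlparse


def slug_from_url(url):
    """Return the last non-empty path segment of a study URL (hypothetical helper)."""
    path = urlparse(url).path
    # Dropping empty segments lets the helper accept URLs with or without a trailing slash.
    segments = [segment for segment in path.split('/') if segment]
    if not segments:
        raise ValueError(f'No slug found in {url!r}')
    return segments[-1]


# Assumed URL shape, used only for illustration.
study_slug = slug_from_url('https://edd.jbei.org/s/f2ftest1/')
print(study_slug)
```

If the URL copied from the browser ends in a sub-page rather than the slug itself, this helper returns that sub-page name, so it is worth checking the result against the study before exporting.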

If the desired EDD server is not `edd.jbei.org`, it should be specified (e.g. `public-edd.jbei.org`, `public-edd.agilebiofoundry.org`).

In [5]:
# EDD server
edd_server = 'edd.jbei.org'

Now we use the login function from edd_utils to **log in to EDD** on the server set above (here the default instance, edd.jbei.org).

In [8]:
session = login(edd_server=edd_server)

Password for rgentz:  ········


Finally, we **download the study** using the export_study function. It returns a pandas dataframe that can be manipulated for downstream data analysis.

In [9]:
try:
    df = export_study(session, study_slug, edd_server=edd_server)
except:
    print("Slug name and/or EDD password are wrong. Please correct them before proceeding.")




In [10]:
df.head()  # Show the first five rows of the exported data

Unnamed: 0,Study ID,Study Name,Line ID,Line Name,Line Description,Protocol,Assay ID,Assay Name,Formal Type,Measurement Type,Compartment,Units,Value,Hours
0,119511,F2Ftest1,119512,1,F2F 1,Transcriptomics,119657,1,,O2 Consumption,0,,21.51014,0.0
1,119511,F2Ftest1,119512,1,F2F 1,Transcriptomics,119657,1,cid:5793,D-Glucose,0,,33.65349,0.0
2,119511,F2Ftest1,119513,2,F2F 2,Transcriptomics,119658,2,,O2 Consumption,0,,28.50065,0.0
3,119511,F2Ftest1,119513,2,F2F 2,Transcriptomics,119658,2,cid:5793,D-Glucose,0,,42.70434,0.0
4,119511,F2Ftest1,119514,3,F2F 3,Transcriptomics,119659,3,,O2 Consumption,0,,45.79955,0.0
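
As one example of downstream manipulation, the long-format export (one row per measurement) can be reshaped so that each measurement type becomes its own column. This is only a sketch: it rebuilds a small dataframe from the five rows shown above, keeping just the columns it needs, rather than calling EDD. With a real export, `sample` would be replaced by `df`.

```python
import pandas as pd

# Rows copied from the df.head() output above (only the columns used here).
sample = pd.DataFrame({
    'Line Name': [1, 1, 2, 2, 3],
    'Measurement Type': ['O2 Consumption', 'D-Glucose', 'O2 Consumption',
                         'D-Glucose', 'O2 Consumption'],
    'Hours': [0.0, 0.0, 0.0, 0.0, 0.0],
    'Value': [21.51014, 33.65349, 28.50065, 42.70434, 45.79955],
})

# One row per (line, time point), one column per measurement type.
# Combinations with no measurement in the input come out as NaN.
wide = sample.pivot_table(index=['Line Name', 'Hours'],
                          columns='Measurement Type',
                          values='Value')
print(wide)
```

Note that `pivot_table` averages duplicate entries by default, so if a study contains replicate measurements for the same line, time, and type, they are collapsed to their mean unless another `aggfunc` is passed.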


The code below collects the metadata for every line in the study and returns it as a pandas dataframe indexed by line name.

In [246]:
# Get metadata
import sys

import pandas as pd
import requests


def export_metadata(session, slug, edd_server='edd.jbei.org', verbose=False):
    '''Export line metadata for an EDD study as a pandas dataframe'''

    lookup_response = session.get(f'https://{edd_server}/rest/studies/?slug={slug}')

    # session.get() does not raise on HTTP error codes, so check the status explicitly
    if lookup_response.status_code != requests.codes.ok:
        if lookup_response.status_code == requests.codes.forbidden:
            print('Access to EDD not granted.\n')
        elif lookup_response.status_code == requests.codes.not_found:
            print('EDD study was not found.\n')
        elif lookup_response.status_code == requests.codes.server_error:
            print('Server error.\n')
        else:
            print('An error with EDD export has occurred.\n')
        sys.exit()

    json_response = lookup_response.json()
    # Catch the error if study slug is not found in edd_server
    try: 
        study_id = json_response["results"][0]["pk"]
    except IndexError:
        if json_response["results"] == []:
            print(f'Slug \'{slug}\' not found in {edd_server}.\n')
            sys.exit()
    # TODO: catch the error if the study is found but cannot be accessed by this user
    
    if verbose:
        print("Study id is ",study_id)
    # Get the metadata type names, following the pagination links until all pages are read
    export_response = session.get(f'https://{edd_server}/rest/metadata_types/')
    page = export_response.json()
    names = []      # names of all metadata types in EDD
    pknumbers = []  # primary keys of all metadata types in EDD, as strings
    while True:
        for i in page['results']:
            names.append(i["type_name"])
            pknumbers.append(str(i['pk']))
        if page["next"] is None:
            break
        page = session.get(page["next"]).json()
           
    # Get the metadata values
    export_response = session.get(f'https://{edd_server}/rest/lines/?study={study_id}')
    metadata=export_response.json()
    usednames=["Line Name","Description"] #others will be added dynamically
    pkn=[] #metadata keys present in the data that map to a known name
    for j in metadata['results'][0]["metadata"]:
        if j in pknumbers:
            usednames.append(names[pknumbers.index(j)])
            pkn.append(j) #only keep keys with a column name, so columns and values stay aligned

    df=pd.DataFrame(columns=usednames)
    
    for i in metadata['results']:
        data=[i["name"],i["description"]] #line name and description
        for k in pkn:
            data.append(i["metadata"].get(k)) #None if this line lacks the field
        df.loc[len(df)]=data
    while metadata["next"] is not None: #Get the next page of lines until all are read
        export_response = session.get(metadata["next"])
        metadata=export_response.json()
        for i in metadata['results']:
            data=[i["name"],i["description"]]
            for k in pkn:
                data.append(i["metadata"].get(k))
            df.loc[len(df)]=data
    df=df.set_index('Line Name')
    if verbose:
        print(df)
    return df
    
export_metadata(session, slug, edd_server=edd_server, verbose=False)

Line Name,Description,Flask Volume,Growth temperature,Date Grown,Date of harvest,Growth Site Type,Growth Site Location,Growth Site Plot ID,Tissue type,IL Name,...,Fermentation Media,Fermentation Starting OD,Fermentation Time,Fermentation Temperature,Fermentation pH set point,Fermentation working volume,Separation method,Overlay Ratio,Overlay Compound,Saccharification biomass loading %
1,F2F 1,4,,6/1/19,10/1/19,Greenhouse,Davis,1,Stem,Cholinium Phosphate,...,hydrolysate&amonium sulfate,0.2,72,30,NU,150ul,Overlay,25%,Dodecane spiked with pentadecane,100%
2,F2F 2,4,,6/1/19,10/1/19,Greenhouse,Davis,1,Stem,Cholinium Phosphate,...,hydrolysate&amonium sulfate,0.2,72,30,NU,150ul,Overlay,25%,Dodecane spiked with pentadecane,100%
3,F2F 3,4,,6/1/19,10/1/19,Greenhouse,Davis,1,Stem,Cholinium Phosphate,...,hydrolysate&amonium sulfate,0.2,72,30,NU,150ul,Overlay,25%,Dodecane spiked with pentadecane,100%
4,F2F 4,4,,6/1/19,10/1/19,Greenhouse,Davis,1,Stem,Cholinium Phosphate,...,hydrolysate&amonium sulfate,0.2,72,30,NU,150ul,Overlay,25%,Dodecane spiked with pentadecane,100%
5,F2F 5,4,,6/1/19,10/1/19,Greenhouse,Davis,1,Stem,Cholinium Phosphate,...,hydrolysate&amonium sulfate,0.2,72,30,NU,150ul,Overlay,25%,Dodecane spiked with pentadecane,100%
6,F2F 6,4,,6/1/19,10/1/19,Greenhouse,Davis,1,Stem,Cholinium Phosphate,...,hydrolysate&amonium sulfate,0.2,72,30,NU,150ul,Overlay,25%,Dodecane spiked with pentadecane,100%
7,F2F 7,4,,6/1/19,10/1/19,Greenhouse,Davis,1,Stem,Cholinium Phosphate,...,hydrolysate&amonium sulfate,0.2,72,30,NU,150ul,Overlay,25%,Dodecane spiked with pentadecane,100%
8,F2F 8,4,,6/1/19,10/1/19,Greenhouse,Davis,1,Stem,Cholinium Phosphate,...,hydrolysate&amonium sulfate,0.2,72,30,NU,150ul,Overlay,25%,Dodecane spiked with pentadecane,100%
9,F2F 9,4,,6/1/19,10/1/19,Greenhouse,Davis,1,Stem,Cholinium Phosphate,...,hydrolysate&amonium sulfate,0.2,72,30,NU,150ul,Overlay,25%,Dodecane spiked with pentadecane,100%
10,F2F 10,4,,6/1/19,10/1/19,Greenhouse,Davis,1,Stem,Cholinium Phosphate,...,hydrolysate&amonium sulfate,0.2,72,30,NU,150ul,Overlay,25%,Dodecane spiked with pentadecane,100%
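
The export_metadata function walks two paginated REST responses with the same pattern: read `results`, then follow `next` until it is empty. The sketch below factors that pattern into a generator. It assumes only what the code above already relies on, namely that each page is JSON with `results` and `next` keys. `iter_results`, `FakeResponse`, and `FakeSession` are hypothetical names, the `pk` values are placeholders, and the fake session exists only so the example runs without an EDD login.

```python
def iter_results(session, url):
    """Yield every item from a paginated endpoint, following `next` links."""
    while url is not None:
        page = session.get(url).json()
        yield from page['results']
        url = page['next']


# Hypothetical stand-ins for a logged-in session, used to exercise the helper offline.
class FakeResponse:
    def __init__(self, payload):
        self._payload = payload

    def json(self):
        return self._payload


class FakeSession:
    def __init__(self, pages):
        self._pages = pages

    def get(self, url):
        return FakeResponse(self._pages[url])


# Two fake pages; the pk values are placeholders, not real EDD identifiers.
pages = {
    'page1': {'results': [{'pk': 1, 'type_name': 'Flask Volume'}], 'next': 'page2'},
    'page2': {'results': [{'pk': 2, 'type_name': 'Growth temperature'}], 'next': None},
}

# Map each metadata key (as a string) to its readable name in a single pass.
type_names = {str(item['pk']): item['type_name']
              for item in iter_results(FakeSession(pages), 'page1')}
print(type_names)
```

With a real session, the same generator could be pointed at the `metadata_types` and `lines` endpoints used above, and a dictionary lookup would replace the paired `names` and `pknumbers` lists.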
