# How to download sequence files with UDN Gateway and FileService APIs
This example uses Python 3, but the concepts carry over to any scripting language. More detailed documentation for the UDN Gateway API can be found [here](https://documenter.getpostman.com/view/1367615/RVu1JBWH).

### Setup
Import the Python packages and set up the header information. Two different headers are used in this example: one for the UDN Gateway API and the other for the FileService API.

In [None]:
import requests
import json

An authorization token is needed to access the UDN Gateway API. This token is shown in the dictionary below as the `Authorization` token. Log in to the web UDN Gateway and navigate to the API to obtain an authorization token.

A second token is needed to access details about files stored in FileService, the application that manages metadata for all UDN Gateway sequencing files. This token is shown in the dictionary below as `FSAuthorization`. The `FSAuthorization` key is specific to the UDN Gateway API.

Log in to FileService to obtain an authorization token.

Development URLs: 
- FileService: https://fileservicedev.aws.dbmi.hms.harvard.edu/
- UDN Gateway: https://dev.undiagnosed.hms.harvard.edu

Email the UDN Coordinating Center to obtain the URLs for the production servers.

In [None]:
gateway_token = 'xxxx'
fileservice_token = 'xxxx'
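
Hard-coding tokens in a notebook makes them easy to leak. As one possible alternative, the sketch below reads both tokens from environment variables. The function name `load_tokens` and the variable names `UDN_GATEWAY_TOKEN` and `UDN_FILESERVICE_TOKEN` are assumptions made for illustration; neither API requires them.

```python
import os


def load_tokens(environ=os.environ):
    """Return (gateway_token, fileservice_token) read from the environment.

    The environment variable names are hypothetical; use whatever names
    fit your own setup.
    """
    try:
        return environ['UDN_GATEWAY_TOKEN'], environ['UDN_FILESERVICE_TOKEN']
    except KeyError as missing:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError('Missing environment variable: {}'.format(missing))


# Usage, once both variables are exported in the shell:
# gateway_token, fileservice_token = load_tokens()
```

Passing the mapping as an argument keeps the function easy to try out with a plain dictionary.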

In [None]:
headers = {
    'Content-Type': 'application/json', 
    'Authorization': 'Token {gw_token}'.format(gw_token=gateway_token), 
    'FSAuthorization': 'FSToken {fs_token}'.format(fs_token=fileservice_token)
}

### Get patient indexed file metadata from the UDN Gateway
To get a list of files associated with a specific patient, make a GET request to

`/api/sequence/files/<patient_udnid>/`

where `<patient_udnid>` is the UDN ID of the patient and can be obtained either through the web UDN Gateway interface or using the UDN Gateway API.  

This request returns a list of JSON objects that provide details for each file associated with a patient.  

In [None]:
gateway_host = 'gateway.undiagnosed.hms.harvard.edu'

In [None]:
udn_id = 'UDN510878'

-----
Data is returned as a list of JSON objects. The FileService UUID is required to obtain a secure download link. The filename is optional but recommended.  

----

In [None]:
url = 'https://{}/api/sequences/{}/'.format(gateway_host, udn_id)
r = requests.get(url, headers=headers)
r.json()[0]['sequencingfiles'][0]

In [None]:
fileservice_uuid = r.json()[0]['sequencingfiles'][0]['fileserviceuuid']
fileservice_uuid

In [None]:
filename = r.json()[0]['sequencingfiles'][0]['filename']
filename
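
The cells above read only the first file of the first entry. To collect every file in one pass, a small helper along the following lines may be convenient. The function name `list_sequencing_files` and the sample payload are made up for illustration; the payload mirrors only the keys used above (`sequencingfiles`, `fileserviceuuid`, `filename`), and a real response is assumed to contain additional fields.

```python
def list_sequencing_files(payload):
    """Return (fileserviceuuid, filename) pairs for every file in the payload."""
    pairs = []
    for sequence in payload:
        # Entries without a 'sequencingfiles' key contribute nothing.
        for seq_file in sequence.get('sequencingfiles', []):
            # The filename is optional, so fall back to None when it is absent.
            pairs.append((seq_file['fileserviceuuid'], seq_file.get('filename')))
    return pairs


# Made-up payload with the same shape as the keys accessed above.
sample_payload = [
    {'sequencingfiles': [
        {'fileserviceuuid': '00000000-0000-0000-0000-000000000001',
         'filename': 'sample_1.bam'},
        {'fileserviceuuid': '00000000-0000-0000-0000-000000000002'},
    ]},
]

for file_uuid, name in list_sequencing_files(sample_payload):
    print(file_uuid, name)
```

With a live response, the call would be `list_sequencing_files(r.json())`.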

### Get signed download URL from FileService
First set up a new set of header information. The `Token` value here is the same as the `FSToken` value in the header used for the UDN Gateway API. These requests now go to the FileService API.

In [None]:
fs_headers = {
    'Content-Type': 'application/json; charset=UTF-8', 
    'Authorization': 'Token {fs_token}'.format(fs_token=fileservice_token)
}

Use the `fileserviceuuid` from the file metadata returned in the previous section. Then make a GET request to the following endpoint to obtain a signed download URL from FileService.

`/filemaster/api/file/<file_uuid>/download/`

In [None]:
fs_host = 'fileservice.dbmi.hms.harvard.edu'

In [None]:
url = 'https://{}/filemaster/api/file/{}/download/'.format(fs_host, fileservice_uuid)
r = requests.get(url, headers=fs_headers)
r.json()

In [None]:
download_url = r.json()['url']

### File Metadata

A GET request to the following endpoint returns the full set of metadata associated with the file:

`/filemaster/api/file/<file_uuid>/`

In [None]:
url = 'https://{}/filemaster/api/file/{}/'.format(fs_host, fileservice_uuid)
r = requests.get(url, headers=fs_headers)
r.json()
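
The two FileService endpoints differ only in the trailing `download/` segment, so both URLs can be derived from one base. The helper below is a minimal sketch of that idea; `fileservice_urls` is a hypothetical name, and the host and UUID in the usage lines are placeholders.

```python
def fileservice_urls(fs_host, file_uuid):
    """Build the metadata and signed-download URLs for one FileService file."""
    base = 'https://{}/filemaster/api/file/{}/'.format(fs_host, file_uuid)
    return {
        'metadata': base,                # full metadata for the file
        'download': base + 'download/',  # returns a signed download URL
    }


# Placeholder values for illustration only.
urls = fileservice_urls('fileservice.example.org', '1234-abcd')
print(urls['metadata'])
print(urls['download'])
```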

### Download file
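Use a download tool such as `wget` to download the file from the `url` field returned by the download endpoint. Be sure to include the quotation marks: signed URLs typically contain characters such as `&` that the shell would otherwise interpret.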

```
wget -O "<filename>" "<url>"
```

The example below shows how to call `wget` programmatically within a Python script.

In [None]:
from subprocess import call

In [None]:
# passing the arguments as a list avoids shell quoting problems with the signed URL
call(['wget', '-O', filename, download_url])
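
If `wget` is not installed, the download can also be done from Python itself. The sketch below uses only the standard library and streams the response to disk, so large sequencing files are not held in memory. It assumes the signed URL needs no extra headers, as the `wget` command above suggests; `download_file` is a hypothetical helper name.

```python
import shutil
import urllib.request


def download_file(download_url, filename):
    """Stream the content at download_url into a local file and return its name."""
    with urllib.request.urlopen(download_url) as response, \
            open(filename, 'wb') as out:
        # copyfileobj copies in chunks rather than reading the whole body at once.
        shutil.copyfileobj(response, out)
    return filename


# Usage with the values obtained earlier:
# download_file(download_url, filename)
```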

# Entire process scripted

The following section shows an aggregation of the previous examples into a single scripted solution for downloading sequencing files.

In [None]:
import requests
import json
from subprocess import call

# setup tokens to easily switch between systems (eg dev and prod)
gateway_token = 'xxxx'
fileservice_token = 'xxxx'

headers = {
    'Content-Type': 'application/json', 
    'Authorization': 'Token {gw_token}'.format(gw_token=gateway_token), 
    'FSAuthorization': 'FSToken {fs_token}'.format(fs_token=fileservice_token)
}

# FileService API needs a separate set of headers
fs_headers = {
    'Content-Type': 'application/json; charset=UTF-8', 
    'Authorization': 'Token {fs_token}'.format(fs_token=fileservice_token)
}

# setup the host to easily switch between systems
gateway_host = 'udndev.dbmi.hms.harvard.edu'
fileservice_host = 'fileservicedev.aws.dbmi.hms.harvard.edu'

# for a single patient
udn_id = 'some UDN ID'

# this endpoint returns a list of JSON objects with file info
# (verify=False skips TLS certificate verification)
url = 'https://{}/api/sequences/{}/'.format(gateway_host, udn_id)
r = requests.get(url, headers=headers, verify=False)

# we can loop through that list to download each file
for seq_file in r.json()[0]['sequencingfiles']:
    uuid = seq_file['fileserviceuuid']
    filename = seq_file['filename']

    try:
        url = 'https://{}/filemaster/api/file/{}/download/'.format(fileservice_host, uuid)
        fs_response = requests.get(url, headers=fs_headers, verify=False)
    except requests.RequestException:
        # skip this file if the request to FileService fails
        continue
    else:
        download_url = fs_response.json()['url']
        # passing a list avoids shell quoting problems with the signed URL
        call(['wget', '-O', filename, download_url])