# Download _austraits_ data in Python
by: J.R. Ferrer-Paris [@jrfep](https://github.com/jrfep)

We will download data from [AusTraits](https://austraits.org/) ([pre-print](https://www.biorxiv.org/content/10.1101/2021.01.04.425314v1)) to a local folder.

## Libraries
Let's start by loading the libraries:

In [1]:
from pathlib import Path
import os
import json
import urllib.request
from zipfile import ZipFile

## Read _austraits_ data 
We will download the file from the [Zenodo repository](https://zenodo.org/record/5112001) using the API URL and save it under the data folder.

In [2]:
repodir = Path("../") 
outputdir = repodir / "data/austraits/"

if not os.path.isdir(outputdir):
    os.makedirs(outputdir)

We use `urllib` to open the URL and read the data if the connection succeeds. We write this as a function:

In [3]:
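def getResponse(url):
    # Open the url; urlopen itself raises HTTPError for most failed requests
    operUrl = urllib.request.urlopen(url)
    if operUrl.getcode() == 200:
        data = operUrl.read()
    else:
        # Set data explicitly so the return below never hits an undefined name
        print("Error receiving data", operUrl.getcode())
        data = None
    return data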
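
A slightly more defensive variant of the function above is sketched here. It relies on `urllib.request.urlopen` raising an exception when a request fails, so errors are caught rather than detected by checking the status code by hand. The name `fetch_bytes` and the default `timeout` are illustrative assumptions, not part of the original notebook.

```python
import urllib.error
import urllib.request


def fetch_bytes(url, timeout=30):
    """Return the body of `url` as bytes, or raise RuntimeError on failure.

    Hypothetical helper: the name and the default timeout are assumptions.
    """
    try:
        # The context manager closes the connection once the body is read
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.URLError as err:
        # HTTPError is a subclass of URLError, so HTTP failures land here too
        raise RuntimeError(f"Error receiving data from {url}: {err}") from err
```

Raising an exception instead of printing a message means a failed download stops the notebook at the cell where it happened, rather than later when the missing data is used.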


Now we need to select the right `url` for the master version of the repository. This should always point to the latest version of the data.

In [4]:
dataset = "https://zenodo.org/api/records/3568417"

Now we run the function `getResponse` and save the results as an object `zrecord`:

In [5]:
zrecord = getResponse(dataset)

The response data is in JSON format, so we need to parse it:

In [6]:
jsonData = json.loads(zrecord)

In [7]:
jsonData 

{'conceptdoi': '10.5281/zenodo.3568417',
 'conceptrecid': '3568417',
 'created': '2021-07-18T06:32:30.575319+00:00',
 'doi': '10.5281/zenodo.5112001',
 'files': [{'bucket': '9c997956-8254-4fcc-a17b-5fe1fd079022',
   'checksum': 'md5:cd7ba1c395b976a02fd4c3c772d88d78',
   'key': 'austraits-3.0.2.rds',
   'links': {'self': 'https://zenodo.org/api/files/9c997956-8254-4fcc-a17b-5fe1fd079022/austraits-3.0.2.rds'},
   'size': 12325324,
   'type': 'rds'},
  {'bucket': '9c997956-8254-4fcc-a17b-5fe1fd079022',
   'checksum': 'md5:ed44176eb71466fe9a4ca1773d6b5961',
   'key': 'austraits-3.0.2.zip',
   'links': {'self': 'https://zenodo.org/api/files/9c997956-8254-4fcc-a17b-5fe1fd079022/austraits-3.0.2.zip'},
   'size': 14738862,
   'type': 'zip'},
  {'bucket': '9c997956-8254-4fcc-a17b-5fe1fd079022',
   'checksum': 'md5:7047ae5b30b1727140000a4daa484722',
   'key': 'dictionary.html',
   'links': {'self': 'https://zenodo.org/api/files/9c997956-8254-4fcc-a17b-5fe1fd079022/dictionary.html'},
   'size': 1

The JSON data includes a list of files:

In [8]:
for files in jsonData['files']:
    print(files['key'])

austraits-3.0.2.rds
austraits-3.0.2.zip
dictionary.html
NEWS.md
readme.txt
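
Because the order of the entries in `files` could change between versions, it may be safer to pick an entry by its `type` field rather than by its position. The sketch below uses a trimmed copy of the first two entries from the record shown earlier; the helper name `find_file` is an assumption made for illustration.

```python
# Trimmed copy of the first two 'files' entries from the record shown above
files = [
    {"key": "austraits-3.0.2.rds", "type": "rds"},
    {"key": "austraits-3.0.2.zip", "type": "zip"},
]


def find_file(files, filetype):
    """Return the first entry whose 'type' matches `filetype`, or None."""
    return next((f for f in files if f.get("type") == filetype), None)


zip_entry = find_file(files, "zip")
print(zip_entry["key"])  # austraits-3.0.2.zip
```

With the real record, the same call would be `find_file(jsonData['files'], 'zip')`.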


We want to download the `zip` file, which contains the `csv` files:

In [9]:
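# The zip archive is the second entry in the list of files
zipinfo = jsonData['files'][1]
outputfile = outputdir / zipinfo['key']

if os.path.isfile(outputfile):
    print('File exists')
else:
    resp = getResponse(zipinfo['links']['self'])
    # The context manager closes the file even if writing fails
    with open(outputfile, 'wb') as output:
        output.write(resp)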
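
Each file entry in the record carries a `checksum` field of the form `md5:<hexdigest>`, which can be used to check that the download is intact. The following is a minimal sketch of such a check; the function name `md5_matches` and the chunk size are assumptions, not part of the original notebook.

```python
import hashlib


def md5_matches(path, checksum, chunk_size=65536):
    """Compare a file's MD5 digest with a string such as 'md5:<hexdigest>'."""
    algorithm, _, expected = checksum.partition(":")
    if algorithm != "md5":
        raise ValueError(f"Unsupported checksum algorithm: {algorithm}")
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files need not fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected
```

With the variables defined in this notebook, `md5_matches(outputfile, jsonData['files'][1]['checksum'])` would be expected to return `True` for a complete download.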

We can list the contents of the compressed file using `ZipFile`:

In [10]:
zfobj = ZipFile(outputfile)
zfobj.namelist()

['austraits-3.0.2/',
 'austraits-3.0.2/taxa.csv',
 'austraits-3.0.2/methods.csv',
 'austraits-3.0.2/definitions.yml',
 'austraits-3.0.2/build_info.md',
 'austraits-3.0.2/contributors.csv',
 'austraits-3.0.2/contexts.csv',
 'austraits-3.0.2/excluded_data.csv',
 'austraits-3.0.2/traits.csv',
 'austraits-3.0.2/taxonomic_updates.csv',
 'austraits-3.0.2/sites.csv',
 'austraits-3.0.2/sources.bib']
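
The `csv` files can also be read directly from the archive without extracting it first. The sketch below uses only the standard library and returns the header plus the first few rows of one member. The function name `read_csv_from_zip`, the `max_rows` parameter, and the UTF-8 encoding are assumptions.

```python
import csv
import io
from itertools import islice
from zipfile import ZipFile


def read_csv_from_zip(archive, member, max_rows=5):
    """Return the header and the first `max_rows` rows of a csv inside a zip."""
    with ZipFile(archive) as zf:
        with zf.open(member) as raw:
            # ZipFile.open yields bytes; wrap it so csv.reader receives text
            # (UTF-8 is assumed here)
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.reader(text)
            header = next(reader)
            rows = list(islice(reader, max_rows))
    return header, rows
```

For example, `read_csv_from_zip(outputfile, 'austraits-3.0.2/taxa.csv')` would give a quick preview of the taxa table listed above.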

## Next steps

That is it for this step; we can now start exploring the AusTraits data.