# The Paleobiology Database Tutorial
This tutorial will teach you how to access data from the [Paleobiology Database (PBDB)](https://paleobiodb.org/navigator/).

## What is the Paleobiology Database?
PBDB is a public database of paleontological data that anyone can use, maintained by an international non-governmental group of paleontologists. One of its main features is its navigator, which allows a user to sort data by geological time, taxa, authorizer, stratigraphy, and more. PBDB is run by the Department of Geoscience at the University of Wisconsin-Madison. The project team consists of Shanan Peters, Michael McClennan, and John Czaplewski. 

## How do you access the data?
PBDB is free to use and has no access requirements. After using the PBDB navigator to find the dataset you want, click the button on the left called "save map data". A window will appear with two options: you can download the data as a CSV, JSON, TSV, or RIS file, or you can obtain a URL that can be called from a script written in a language such as R or Python. A downloaded file can be used for analysis right away. Accessing the data through HTTP requests takes a little more setup. This tutorial shows how to obtain the data through the URL, and it requires a Python installation along with the `requests` library (which can typically be installed with `pip install requests`). Instructions for installing Python can be found [at this link](https://realpython.com/installing-python/).


## Obtaining data through HTTP requests

1) Obtain your URL by clicking on the "Get URL" tab in the window discussed above. <br>
2) Click "Get URL" next to the label "Data URL". Copy this URL for later use. <br>
3) Open the command prompt (on Windows) by typing "cmd" into your search bar, usually found at the top or bottom of your screen. A black window should appear. <br>
4) Open up a Python interpreter by typing "python". <br>
5) Enter this next block of code line by line:

import requests <br>
URL = "your URL here" <br>
r = requests.get(url=URL) <br>
data = r.json() <br>

Try printing the data by calling print(data), and you should see it printed to the command line!
As an example, I will use the URL "https://paleobiodb.org/data1.2/occs/list.json?lngmin=-143.2617&lngmax=-75.7617&latmin=31.3536&latmax=48.5748&interval_id=15&base_id=10707,52775&show=coords,attr,loc,prot,time,strat,stratext,lith,lithext,geo,rem,ent,entname,crmod&datainfo", which is a dataset on Jurassic dinosaurs in the western US.
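
A long URL like the one above is easier to read and change when its query string is assembled from named parameters. The following is a minimal sketch using only the standard library. `build_pbdb_url` is a hypothetical helper written for this tutorial (it is not part of PBDB or `requests`), and treating `datainfo` as a value-less switch is an assumption based on how it appears in the example URL.

```python
from urllib.parse import urlencode

# Base endpoint copied from the example URL in this tutorial.
BASE_URL = "https://paleobiodb.org/data1.2/occs/list.json"


def build_pbdb_url(params, flags=()):
    """Return BASE_URL followed by a query string built from params.

    Hypothetical helper for illustration; not part of PBDB or requests.
    """
    # safe="," keeps comma-separated lists such as "10707,52775" unescaped.
    query = urlencode(params, safe=",")
    # Switches without a value (assumed here for "datainfo") are appended as bare names.
    for flag in flags:
        query += "&" + flag
    return BASE_URL + "?" + query


# Parameter values copied from the example URL above.
params = {
    "lngmin": "-143.2617",
    "lngmax": "-75.7617",
    "latmin": "31.3536",
    "latmax": "48.5748",
    "interval_id": "15",
    "base_id": "10707,52775",
    "show": "coords,attr,loc,prot,time,strat,stratext,lith,lithext,geo,rem,ent,entname,crmod",
}

url = build_pbdb_url(params, flags=["datainfo"])
print(url)
```

The printed string should match the example URL, so it can be passed to `requests.get` in the same way. Changing one value in `params` (for instance the latitude and longitude bounds) is then less error-prone than editing the URL by hand.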

import requests
URL = "https://paleobiodb.org/data1.2/occs/list.json?lngmin=-143.2617&lngmax=-75.7617&latmin=31.3536&latmax=48.5748&interval_id=15&base_id=10707,52775&show=coords,attr,loc,prot,time,strat,stratext,lith,lithext,geo,rem,ent,entname,crmod&datainfo"
r = requests.get(url=URL) 
dino_data = r.json()  # parses the JSON response into a Python dictionary
dino_data

{'access_time': 'Fri 2018-10-05 21:20:41 GMT',
 'data_license': 'Creative Commons CC-BY',
 'data_provider': 'The Paleobiology Database',
 'data_source': 'The Paleobiology Database',
 'data_url': 'http://paleobiodb.org/data1.2/occs/list.json?lngmin=-143.2617&lngmax=-75.7617&latmin=31.3536&latmax=48.5748&interval_id=15&base_id=10707,52775&show=coords,attr,loc,prot,time,strat,stratext,lith,lithext,geo,rem,ent,entname,crmod&datainfo',
 'documentation_url': 'http://paleobiodb.org/data1.2/occs/list_doc.html',
 'elapsed_time': 0.482,
 'license_url': 'http://creativecommons.org/licenses/by/4.0/',
 'parameters': {'base_id': '10707,52775',
  'interval_id': '15',
  'latmax': '48.5748',
  'latmin': '31.3536',
  'lngmax': '-75.7617',
  'lngmin': '-143.2617',
  'show': 'coords,attr,loc,prot,time,strat,stratext,lith,lithext,geo,rem,ent,entname,crmod',
  'taxon_status': 'all',
  'timerule': 'major'},
 'records': [{'ath': 'M. Carrano',
   'ati': 'prs:14',
   'cc2': 'US',
   'cid': 'col:11924',
   'cny'

Your data has now been parsed from JSON into a Python dictionary and awaits further exploration!
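
As a starting point for that exploration, here is a small sketch of how the `records` list in the response might be inspected. So that it runs without a network connection, it uses a tiny stand-in dictionary shaped like the output printed above; only the first record is copied from that output, and the other two are invented purely for illustration. `count_by_field` is a hypothetical helper, and reading `cc2` as a two-letter country code is an assumption based on the value `'US'`.

```python
from collections import Counter

# A small stand-in for dino_data, shaped like the response printed above.
sample = {
    "data_license": "Creative Commons CC-BY",
    "records": [
        # Copied from the first record in the printed output.
        {"ath": "M. Carrano", "ati": "prs:14", "cc2": "US", "cid": "col:11924"},
        # The two records below are made up for illustration only.
        {"ath": "M. Carrano", "cc2": "US", "cid": "col:00001"},
        {"ath": "A. Example", "cc2": "CA", "cid": "col:00002"},
    ],
}


def count_by_field(data, field):
    """Count how often each value of `field` occurs across data["records"]."""
    records = data.get("records", [])
    # A record may lack the field, so fall back to a placeholder value.
    return Counter(rec.get(field, "unknown") for rec in records)


print(len(sample["records"]))         # number of records
print(count_by_field(sample, "cc2"))  # records per country code (assumed meaning)
```

With the real response, `sample` can be replaced by `dino_data`; the documentation linked under `documentation_url` in the response is the place to confirm what each abbreviated field name means.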