## Using NY Times APIs

What are APIs?

Structured ways people can give you their data.

Why?

Usually because they want to help web/mobile developers attract more users to their service.

Twitter doesn't have an API to help you out.

They want developers to build apps to drive more eyeballs to their service.

![](https://raw.github.com/nealcaren/workshop_2014/master/notebooks/images/times_inequality.png)

No love with the scrape!!!

![](https://raw.github.com/nealcaren/workshop_2014/master/notebooks/images/no_luck.png)

In [1]:
import requests

Do me a favor and sign up to be a [developer](http://developer.nytimes.com) with the New York Times and get your own API key.

In [2]:
my_times_api_key = 'd745e332f909406f9d3854a483af1702'

APIs can be accessed like a normal URL, but they are often very long, complicated, and involve variables you want to change. For example, you can get the information about the first 10 articles published in the New York Times that used the word "food" with:

[http://api.nytimes.com/svc/search/v2/articlesearch.json?sort=newest&begin_date=20170101&end_date=20171015&api-key=d20bc9ac37156ecc4cb3d78eb956201d%3A0%3A54059647&q=food&page=0](http://api.nytimes.com/svc/search/v2/articlesearch.json?sort=newest&begin_date=20170101&end_date=20171015&api-key=d20bc9ac37156ecc4cb3d78eb956201d%3A0%3A54059647&q=food&page=0)

Requests allows you to do this in a more civilized way.

In [3]:
payload = {'q'         : 'food', 
           'begin_date': '20180101' ,
           'end_date'  : '20181001',
           'api-key'   :  my_times_api_key,
           'sort'      : 'oldest' ,
           'offset'    :  20}

base_url = 'http://api.nytimes.com/svc/search/v2/articlesearch.json?'

In [4]:
r = requests.get(base_url, params = payload)

#r.url

In [5]:
r.text

'{"status":"OK","copyright":"Copyright (c) 2018 The New York Times Company. All Rights Reserved.","response":{"docs":[{"web_url":"https://cooking.nytimes.com/recipes/1018901-cumin-roasted-salmon-with-cilantro-sauce","snippet":"Roasting a whole fillet of fish might seem like a weekend-only treat, but cooking salmon this way is a luxury you should allow yourself on any old Tuesday, as it requires no additional preparation or skill. Be sure to slather the vinegary herb sau...","blog":{},"source":"du_recipe","multimedia":[{"rank":1,"subtype":"thumbnail","caption":"Alison Roman\'s cumin-roasted salmon with cilantro sauce.","credit":"Julia Gartland for The New York Times","type":"image","url":"images/2018/08/28/dining/roman-salmon-horizontal/roman-salmon-horizontal-thumbStandard.jpg","height":75,"width":75,"legacy":{},"subType":"thumbnail","crop_name":"thumbStandard"},{"rank":1,"subtype":"large","caption":"Alison Roman\'s cumin-roasted salmon with cilantro sauce.","credit":"Julia Gartland fo

In [6]:
r.json()

{'copyright': 'Copyright (c) 2018 The New York Times Company. All Rights Reserved.',
 'response': {'docs': [{'_id': '5995a1de7c459f246b61ecc0',
    'blog': {},
    'byline': {'organization': None,
     'original': 'Alison Roman',
     'person': [{'firstname': 'Alison',
       'lastname': 'Roman',
       'middlename': None,
       'organization': '',
       'qualifier': None,
       'rank': 1,
       'role': 'reported',
       'title': None}]},
    'document_type': 'recipe',
    'headline': {'content_kicker': None,
     'kicker': None,
     'main': '',
     'name': 'Cumin-Roasted Salmon With Cilantro Sauce',
     'print_headline': None,
     'seo': None,
     'sub': None},
    'keywords': [],
    'multimedia': [{'caption': "Alison Roman's cumin-roasted salmon with cilantro sauce.",
      'credit': 'Julia Gartland for The New York Times',
      'crop_name': 'thumbStandard',
      'height': 75,
      'legacy': {},
      'rank': 1,
      'subType': 'thumbnail',
      'subtype': 'thumbnail'

In [7]:
json = r.json()


In [8]:
json.keys()

dict_keys(['status', 'copyright', 'response'])

In [9]:
json['status']

'OK'

Output from `json['response']` ommitted because it was really long.

In [10]:
json['response']['docs']

[{'_id': '5995a1de7c459f246b61ecc0',
  'blog': {},
  'byline': {'organization': None,
   'original': 'Alison Roman',
   'person': [{'firstname': 'Alison',
     'lastname': 'Roman',
     'middlename': None,
     'organization': '',
     'qualifier': None,
     'rank': 1,
     'role': 'reported',
     'title': None}]},
  'document_type': 'recipe',
  'headline': {'content_kicker': None,
   'kicker': None,
   'main': '',
   'name': 'Cumin-Roasted Salmon With Cilantro Sauce',
   'print_headline': None,
   'seo': None,
   'sub': None},
  'keywords': [],
  'multimedia': [{'caption': "Alison Roman's cumin-roasted salmon with cilantro sauce.",
    'credit': 'Julia Gartland for The New York Times',
    'crop_name': 'thumbStandard',
    'height': 75,
    'legacy': {},
    'rank': 1,
    'subType': 'thumbnail',
    'subtype': 'thumbnail',
    'type': 'image',
    'url': 'images/2018/08/28/dining/roman-salmon-horizontal/roman-salmon-horizontal-thumbStandard.jpg',
    'width': 75},
   {'caption': "A

In [11]:
json['response']['meta']['hits']

5708

In [13]:
from time import sleep

base_url = 'http://api.nytimes.com/svc/search/v2/articlesearch.json?'

payload = { 'q'         : 'food', 
            'api-key'   :  my_times_api_key,
            'sort'      : 'newest' ,
            'page'      :  10}

years = [2010, 2011, 2012, 2013, 2014, 2015, 2016]
counts = []
for year in years:
    year_string = str(year)
    payload['begin_date'] = year_string + '0101'
    payload['end_date']   = year_string + '1001'
    r = requests.get(base_url, params = payload)
    json = r.json()
    count  = json['response']['meta']['hits']
    counts.append(count)
    sleep(.1)   

KeyError: 'response'

This can be plotted in Python.

In [None]:
%pylab inline
import matplotlib.pyplot as plt

plt.scatter(years,counts)
plt.ticklabel_format(useOffset=False)

Your turn. Modify the script below to output a csv with the monthly total of "food" articles. For an extra challenge, add an additional column with the count of the number of "food security" articles.