# W11 Working with Web APIs

Many websites provide public APIs that let users either access a data feed or download web page content as JSON, XML, HTML, or some other format. Here we use the APIs provided by The New York Times to show how to read JSON with Python libraries and load it into a Pandas data frame.

## First Dataset

I use the Books API as my first demo example. This API covers the best sellers lists and book reviews. The JSON file that I want to extract is "best-sellers/history.json". The steps are:

  * Create an API key on The New York Times Developers website.
  * Select the dataset.
  * Authorize your API key.
  * Get the URL through "Try this Key".
  * Use requests to read the JSON data.
  * Load the JSON data into a Pandas data frame.
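
The request URL from the steps above can also be assembled in code instead of being copied from the developer portal. The sketch below is only an illustration: the `build_url` helper and the `NYT_API_KEY` environment variable are assumptions and not part of the original notebook, while the host and path are the ones used in this section.

```python
import os
from urllib.parse import urlencode

BASE = "https://api.nytimes.com/svc"  # host and prefix as used in this notebook

def build_url(path, api_key, **params):
    """Join an API path with its query string (hypothetical helper)."""
    query = urlencode({**params, "api-key": api_key})
    return f"{BASE}/{path}?{query}"

# Assumption: the key is kept in an environment variable named NYT_API_KEY.
api_key = os.environ.get("NYT_API_KEY", "YOUR_KEY")
url = build_url("books/v3/lists/best-sellers/history.json", api_key)
print(url)
```

Keeping the key outside the notebook makes it easier to share the code without sharing the credential.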
    

In [2]:
import requests
url = 'https://api.nytimes.com/svc/books/v3/lists/best-sellers/history.json?api-key=VOjk1yB2zhDZ06DG9NFvhxmvVNoMAKYe'
resp = requests.get(url)
resp

<Response [200]>

In [3]:
# now convert the 'resp' object's JSON content into
# native Python objects (here, a dict)
data = resp.json()
type(data)


dict

In [4]:
# now, see the length of the data.
len(data)

4
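
The length of 4 counts the top-level keys of the response dict, not the number of books. A small sketch of unpacking that envelope follows; the `extract_results` helper and the trimmed `sample` payload are illustrative assumptions modeled on the output printed in this section.

```python
def extract_results(payload):
    """Return the list of records from an NYT-style response dict.

    Hypothetical helper: it assumes the envelope carries 'status' and
    'results' keys, as in the responses printed in this notebook.
    """
    if payload.get("status") != "OK":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    results = payload.get("results")
    if not isinstance(results, list):
        raise ValueError("'results' is missing or not a list")
    return results

# Trimmed stand-in for resp.json(); the real payload holds many more records.
sample = {
    "status": "OK",
    "copyright": "...",
    "num_results": 31991,
    "results": [{"title": "#GIRLBOSS", "author": "Sophia Amoruso"}],
}

print(len(sample))                   # number of top-level keys
print(sorted(sample))                # the key names
print(len(extract_results(sample)))  # number of records in this payload
```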

In [5]:
# check the content of the dict
data

{'status': 'OK',
 'copyright': 'Copyright (c) 2019 The New York Times Company.  All Rights Reserved.',
 'num_results': 31991,
 'results': [{'title': '"I GIVE YOU MY BODY ..."',
   'description': 'The author of the Outlander novels gives tips on writing sex scenes, drawing on examples from the books.',
   'contributor': 'by Diana Gabaldon',
   'author': 'Diana Gabaldon',
   'contributor_note': '',
   'price': 0,
   'age_group': '',
   'publisher': 'Dell',
   'isbns': [{'isbn10': '0399178570', 'isbn13': '9780399178573'}],
   'ranks_history': [{'primary_isbn10': '0399178570',
     'primary_isbn13': '9780399178573',
     'rank': 8,
     'list_name': 'Advice How-To and Miscellaneous',
     'display_name': 'Advice, How-To & Miscellaneous',
     'published_date': '2016-09-04',
     'bestsellers_date': '2016-08-20',
     'weeks_on_list': 1,
     'ranks_last_week': None,
     'asterisk': 0,
     'dagger': 0}],
   'reviews': [{'book_review_link': '',
     'first_chapter_link': '',
     'sunday_r

In [6]:
# resp.json() returned a dict rather than a list of records, so the
# records under its 'results' key are what go into the data frame.
# I suggest using the json_normalize function for this.
from pandas import json_normalize  # pandas >= 1.0; older versions: from pandas.io.json import json_normalize
df = json_normalize(data['results'])
df.head(5)

| | age_group | author | contributor | contributor_note | description | isbns | price | publisher | ranks_history | reviews | title |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | | Diana Gabaldon | by Diana Gabaldon | | The author of the Outlander novels gives tips ... | [{'isbn10': '0399178570', 'isbn13': '978039917... | 0.0 | Dell | [{'primary_isbn10': '0399178570', 'primary_isb... | [{'book_review_link': '', 'first_chapter_link'... | "I GIVE YOU MY BODY ..." |
| 1 | | Annette Gordon-Reed and Peter S Onuf | by Annette Gordon-Reed and Peter S. Onuf | | A character study that attempts to make sense ... | [{'isbn10': '0871404427', 'isbn13': '978087140... | 0.0 | Liveright | [{'primary_isbn10': '0871404427', 'primary_isb... | [{'book_review_link': '', 'first_chapter_link'... | "MOST BLESSED OF THE PATRIARCHS" |
| 2 | | Gary Vaynerchuk | by Gary Vaynerchuk | | The entrepreneur expands on subjects addressed... | [{'isbn10': '0062273124', 'isbn13': '978006227... | 0.0 | HarperCollins | [{'primary_isbn10': '0062273124', 'primary_isb... | [{'book_review_link': '', 'first_chapter_link'... | #ASKGARYVEE |
| 3 | | Sophia Amoruso | by Sophia Amoruso | | An online fashion retailer traces her path to ... | [{'isbn10': '039916927X', 'isbn13': '978039916... | 0.0 | Portfolio/Penguin/Putnam | [{'primary_isbn10': '1591847931', 'primary_isb... | [{'book_review_link': '', 'first_chapter_link'... | #GIRLBOSS |
| 4 | | David Hogg and Lauren Hogg | by David Hogg and Lauren Hogg | | Students from Marjory Stoneman Douglas High Sc... | [{'isbn10': '198480183X', 'isbn13': '978198480... | 0.0 | Random House | [{'primary_isbn10': '198480183X', 'primary_isb... | [{'book_review_link': '', 'first_chapter_link'... | #NEVERAGAIN |
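
Columns such as `isbns`, `ranks_history`, and `reviews` still hold lists of dicts after this call. If one row per ranking entry is wanted, `json_normalize` accepts `record_path` and `meta` arguments. The following is a minimal sketch on two hand-made records shaped like the output above (the second record is invented for illustration); it is not the notebook's own code.

```python
import pandas as pd

# Two records shaped like data['results']. The first is trimmed from the
# response shown earlier; the second is made up to show a book with two
# ranking entries.
records = [
    {
        "title": '"I GIVE YOU MY BODY ..."',
        "author": "Diana Gabaldon",
        "ranks_history": [
            {"rank": 8, "list_name": "Advice How-To and Miscellaneous", "weeks_on_list": 1},
        ],
    },
    {
        "title": "EXAMPLE BOOK",
        "author": "Jane Doe",
        "ranks_history": [
            {"rank": 3, "list_name": "Hardcover Nonfiction", "weeks_on_list": 2},
            {"rank": 5, "list_name": "Hardcover Nonfiction", "weeks_on_list": 3},
        ],
    },
]

# record_path picks the nested list to expand into rows;
# meta copies parent-level fields onto every expanded row.
ranks = pd.json_normalize(records, record_path="ranks_history", meta=["title", "author"])
print(ranks)
```

Each entry of `ranks_history` becomes its own row, with the book's `title` and `author` repeated alongside it.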


In [7]:
# select the columns needed.
df1 = df[['author','contributor','description','price','publisher','title']]
df1.head(5)

| | author | contributor | description | price | publisher | title |
|---|---|---|---|---|---|---|
| 0 | Diana Gabaldon | by Diana Gabaldon | The author of the Outlander novels gives tips ... | 0.0 | Dell | "I GIVE YOU MY BODY ..." |
| 1 | Annette Gordon-Reed and Peter S Onuf | by Annette Gordon-Reed and Peter S. Onuf | A character study that attempts to make sense ... | 0.0 | Liveright | "MOST BLESSED OF THE PATRIARCHS" |
| 2 | Gary Vaynerchuk | by Gary Vaynerchuk | The entrepreneur expands on subjects addressed... | 0.0 | HarperCollins | #ASKGARYVEE |
| 3 | Sophia Amoruso | by Sophia Amoruso | An online fashion retailer traces her path to ... | 0.0 | Portfolio/Penguin/Putnam | #GIRLBOSS |
| 4 | David Hogg and Lauren Hogg | by David Hogg and Lauren Hogg | Students from Marjory Stoneman Douglas High Sc... | 0.0 | Random House | #NEVERAGAIN |


## Second Dataset

The second dataset is pulled from the Most Popular API. This API records the articles that have been shared most often by email, Facebook, or Twitter. The dataset I pulled focuses on Twitter. The steps are the same as those listed above.

In [8]:
url1 = 'https://api.nytimes.com/svc/mostpopular/v2/shared/30/twitter.json?api-key=VOjk1yB2zhDZ06DG9NFvhxmvVNoMAKYe'
resp1 = requests.get(url1)
resp1

<Response [200]>

In [9]:
# now convert the 'resp1' object's JSON content into
# native Python objects (here, a dict)
data1 = resp1.json()
type(data1)

dict

In [10]:
# now, see the length of the data.
len(data1)

4

In [11]:
# check the content of the dict
data1

{'status': 'OK',
 'copyright': 'Copyright (c) 2019 The New York Times Company.  All Rights Reserved.',
 'num_results': 1743,
 'results': [{'url': 'https://www.nytimes.com/2019/03/28/health/woman-pain-anxiety.html',
   'adx_keywords': 'Pain;Anxiety and Stress;Cameron, Jo;Srivastava, Devjit;British Journal of Anaesthesia;Genetics and Heredity;Opioids and Opiates;Pain-Relieving Drugs;University College London;Scotland',
   'subsection': '',
   'share_count': 1,
   'count_type': 'SHARED-TWITTER',
   'column': None,
   'eta_id': 0,
   'section': 'Health',
   'id': 100000006432324,
   'asset_id': 100000006432324,
   'nytdsection': 'health',
   'byline': 'By HEATHER MURPHY',
   'type': 'Article',
   'title': 'At 71, She’s Never Felt Pain or Anxiety. Now Scientists Know Why.',
   'abstract': 'Scientists discovered a previously unidentified genetic mutation in a Scottish woman. They hope it could lead to the development of new pain treatment.',
   'published_date': '2019-03-28',
   'source': 'T

In [12]:
# resp1.json() returned a dict rather than a list of records, so the
# records under its 'results' key are what go into the data frame.
# I suggest using the json_normalize function for this.
df2 = json_normalize(data1['results'])
df2.head(5)

| | abstract | adx_keywords | asset_id | byline | column | count_type | des_facet | eta_id | geo_facet | id | ... | published_date | section | share_count | source | subsection | title | type | updated | uri | url |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Scientists discovered a previously unidentifie... | Pain;Anxiety and Stress;Cameron, Jo;Srivastava... | 100000006432324 | By HEATHER MURPHY | | SHARED-TWITTER | [ANXIETY AND STRESS, GENETICS AND HEREDITY, OP... | 0 | [SCOTLAND] | 100000006432324 | ... | 2019-03-28 | Health | 1 | The New York Times | | At 71, She’s Never Felt Pain or Anxiety. Now S... | Article | 2019-04-01 19:08:07 | nyt://article/02e0395b-1496-540a-8e5f-6687a52a... | https://www.nytimes.com/2019/03/28/health/woma... |
| 1 | Members of the special counsel’s team have tol... | Trump, Donald J;Mueller, Robert S III;Barr, Wi... | 100000006441465 | By NICHOLAS FANDOS, MICHAEL S. SCHMIDT and MAR... | | SHARED-TWITTER | [RUSSIAN INTERFERENCE IN 2016 US ELECTIONS AND... | 0 | | 100000006441465 | ... | 2019-04-03 | U.S. | 2 | The New York Times | politics | Some on Mueller’s Team Say Report Was More Dam... | Article | 2019-04-04 18:27:18 | nyt://article/6d5ac849-af3f-5a1b-90f7-88f56c37... | https://www.nytimes.com/2019/04/03/us/politics... |
| 2 | The number of female solo travelers has skyroc... | Women and Girls;Travel and Vacations;Sex Crime... | 100000006346102 | By MEGAN SPECIA and TARIRO MZEZEWA | | SHARED-TWITTER | [WOMEN AND GIRLS, ASSAULTS] | 0 | [COSTA RICA, BOLIVIA, THAILAND] | 100000006346102 | ... | 2019-03-25 | Travel | 3 | The New York Times | | Adventurous. Alone. Attacked. | Article | 2019-04-02 13:16:42 | nyt://article/f723a721-2e1d-59bf-b747-17225966... | https://www.nytimes.com/2019/03/25/travel/solo... |
| 3 | A reconstruction of the moment when a truck be... | Humanitarian Aid;Guaido, Juan;Maduro, Nicolas;... | 100000006400090 | By NICHOLAS CASEY, CHRISTOPH KOETTL and DEBORA... | | SHARED-TWITTER | [POLITICS AND GOVERNMENT, RUMORS AND MISINFORM... | 0 | [VENEZUELA] | 100000006400090 | ... | 2019-03-10 | World | 4 | The New York Times | americas | Footage Contradicts U.S. Claim That Nicolás Ma... | Article | 2019-03-11 17:40:56 | nyt://article/b8b9ab4e-90c7-57d1-b65f-8fe3b832... | https://www.nytimes.com/2019/03/10/world/ameri... |
| 4 | The hummingbirds were dying. Cockroaches were ... | Greenhouse Gas Emissions;Sustainable Living;Un... | 100000006401893 | By CARL ZIMMER | news analysis | SHARED-TWITTER | | 0 | | 100000006401893 | ... | 2019-03-29 | Sunday Review | 5 | The New York Times | | The Lost History of One of the World’s Strange... | Article | 2019-04-01 00:31:43 | nyt://article/5ce9cefb-fadf-5e01-bb8a-0267160a... | https://www.nytimes.com/2019/03/29/sunday-revi... |


In [13]:
# select the columns needed.
df3 = df2[['abstract','adx_keywords','byline','des_facet','geo_facet']]
df3.head(5)

| | abstract | adx_keywords | byline | des_facet | geo_facet |
|---|---|---|---|---|---|
| 0 | Scientists discovered a previously unidentifie... | Pain;Anxiety and Stress;Cameron, Jo;Srivastava... | By HEATHER MURPHY | [ANXIETY AND STRESS, GENETICS AND HEREDITY, OP... | [SCOTLAND] |
| 1 | Members of the special counsel’s team have tol... | Trump, Donald J;Mueller, Robert S III;Barr, Wi... | By NICHOLAS FANDOS, MICHAEL S. SCHMIDT and MAR... | [RUSSIAN INTERFERENCE IN 2016 US ELECTIONS AND... | |
| 2 | The number of female solo travelers has skyroc... | Women and Girls;Travel and Vacations;Sex Crime... | By MEGAN SPECIA and TARIRO MZEZEWA | [WOMEN AND GIRLS, ASSAULTS] | [COSTA RICA, BOLIVIA, THAILAND] |
| 3 | A reconstruction of the moment when a truck be... | Humanitarian Aid;Guaido, Juan;Maduro, Nicolas;... | By NICHOLAS CASEY, CHRISTOPH KOETTL and DEBORA... | [POLITICS AND GOVERNMENT, RUMORS AND MISINFORM... | [VENEZUELA] |
| 4 | The hummingbirds were dying. Cockroaches were ... | Greenhouse Gas Emissions;Sustainable Living;Un... | By CARL ZIMMER | | |
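
The `adx_keywords` column packs several keywords into one string separated by semicolons. One possible way to work with them, sketched here on a tiny hand-made frame with shortened keyword strings rather than the real `df3`, is to split the string and give each keyword its own row.

```python
import pandas as pd

# Hand-made stand-in for df3 with shortened keyword strings.
demo = pd.DataFrame({
    "byline": ["By HEATHER MURPHY", "By CARL ZIMMER"],
    "adx_keywords": [
        "Pain;Anxiety and Stress;Genetics and Heredity",
        "Greenhouse Gas Emissions;Sustainable Living",
    ],
})

# Split on ';' to get a list per article, then explode to one keyword per row.
keywords = demo.assign(keyword=demo["adx_keywords"].str.split(";")).explode("keyword")
print(keywords[["byline", "keyword"]])
```

From this long form, a call such as `keywords["keyword"].value_counts()` would count how often each keyword appears across articles.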


## Third Dataset

The third dataset comes from the Movie Reviews API. The request asks for all reviews (`all.json`), starting at an offset of 1000 and ordered by opening date. The steps are the same as above.

In [14]:
url2 = 'https://api.nytimes.com/svc/movies/v2/reviews/all.json?offset=1000&order=by-opening-date&api-key=VOjk1yB2zhDZ06DG9NFvhxmvVNoMAKYe'
resp2 = requests.get(url2)
resp2

<Response [200]>

In [15]:
# now convert the 'resp2' object's JSON content into
# native Python objects (here, a dict)
data2 = resp2.json()
type(data2)

dict

In [16]:
len(data2)

5

In [17]:
# check the content of the dict
data2

{'status': 'OK',
 'copyright': 'Copyright (c) 2019 The New York Times Company. All Rights Reserved.',
 'has_more': True,
 'num_results': 20,
 'results': [{'display_title': 'Eden',
   'mpaa_rating': 'R',
   'critics_pick': 1,
   'byline': 'STEPHEN HOLDEN',
   'headline': 'True Story Inspires Tale of Sex Trade; in a Twist, a U.S. Marshal Is the Bad Guy',
   'summary_short': '“Eden,” directed by Megan Griffiths, is an examination of sex trafficking in the United States.',
   'publication_date': '2013-03-19',
   'opening_date': None,
   'date_updated': '2017-11-02 04:16:35',
   'link': {'type': 'article',
    'url': 'http://www.nytimes.com/2013/03/20/movies/eden-depicts-sex-trafficking-in-the-united-states.html',
    'suggested_link_text': 'Read the New York Times Review of Eden'},
   'multimedia': None},
  {'display_title': '108',
   'mpaa_rating': '',
   'critics_pick': 1,
   'byline': 'JEANNETTE CATSOULIS',
   'headline': 'A Death in Paraguay Leads a Niece to Brutal Truths',
   'summary
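
The `has_more` flag together with `num_results: 20` suggests that this endpoint returns reviews in pages, with `offset` choosing where a page starts. A hedged sketch of walking several pages follows. The `fetch_page` callable, the page size of 20, and the `max_pages` cap are assumptions for illustration; in practice `fetch_page` could wrap `requests.get(...).json()` with the desired offset.

```python
def collect_results(fetch_page, start_offset=0, page_size=20, max_pages=3):
    """Gather 'results' from consecutive pages until has_more is false.

    fetch_page(offset) must return a dict shaped like data2 above.
    """
    records = []
    offset = start_offset
    for _ in range(max_pages):
        payload = fetch_page(offset)
        records.extend(payload.get("results") or [])
        if not payload.get("has_more"):
            break
        offset += page_size
    return records

# Offline stand-in for the API: 45 fake reviews served 20 at a time.
FAKE = [{"display_title": f"Movie {i}"} for i in range(45)]

def fake_fetch(offset):
    page = FAKE[offset:offset + 20]
    return {"status": "OK", "has_more": offset + 20 < len(FAKE),
            "num_results": len(page), "results": page}

print(len(collect_results(fake_fetch)))               # pages at offsets 0, 20, 40
print(len(collect_results(fake_fetch, max_pages=2)))  # stops after two pages
```

The `max_pages` cap keeps the loop from issuing an unbounded number of requests against a rate-limited API.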

In [20]:
# resp2.json() returned a dict rather than a list of records, so the
# records under its 'results' key are what go into the data frame.
# I suggest using the json_normalize function for this.
df4 = json_normalize(data2['results'])
df4.head(5)

| | byline | critics_pick | date_updated | display_title | headline | link.suggested_link_text | link.type | link.url | mpaa_rating | multimedia | opening_date | publication_date | summary_short |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | STEPHEN HOLDEN | 1 | 2017-11-02 04:16:35 | Eden | True Story Inspires Tale of Sex Trade; in a Tw... | Read the New York Times Review of Eden | article | http://www.nytimes.com/2013/03/20/movies/eden-... | R | | | 2013-03-19 | “Eden,” directed by Megan Griffiths, is an exa... |
| 1 | JEANNETTE CATSOULIS | 1 | 2017-11-02 04:16:34 | 108 | A Death in Paraguay Leads a Niece to Brutal Tr... | Read the New York Times Review of 108 | article | http://www.nytimes.com/2013/03/18/movies/108-c... | | | | 2013-03-17 | In “108 (Cuchillo de Palo)” Renate Costa inves... |
| 2 | MANOHLA DARGIS | 1 | 2017-11-02 04:16:36 | Reality | Onstage All of His Life, but Aching to Be Seen | Read the New York Times Review of Reality | article | http://www.nytimes.com/2013/03/15/movies/reali... | R | | | 2013-03-14 | “Reality,” a film by Matteo Garrone, follows a... |
| 3 | A. O. SCOTT | 1 | 2017-11-02 04:16:40 | Philip Roth Unmasked | Looking Past the Alter Egos to the Novelist | Read the New York Times Review of Philip Roth ... | article | http://www.nytimes.com/2013/03/13/movies/phili... | | | | 2013-03-12 | It’s fitting that Mr. Roth dominates the scree... |
| 4 | RACHEL SALTZ | 1 | 2017-11-02 04:16:32 | I Killed My Mother | A Mother-Son Dance With Many Awkward Steps | Read the New York Times Review of I Killed My ... | article | http://www.nytimes.com/2013/03/13/movies/i-kil... | Not Rated | | | 2013-03-12 | The plot of “I Killed My Mother” centers on th... |
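
The `link.suggested_link_text`, `link.type`, and `link.url` columns come from the nested `link` dict: `json_normalize` flattens nested dicts and joins the key names with a dot. A small sketch on one hand-copied record (trimmed from the output above) shows this, along with the `sep` argument for choosing a different separator.

```python
import pandas as pd

# One record trimmed from data2['results'].
record = {
    "display_title": "Eden",
    "mpaa_rating": "R",
    "link": {
        "type": "article",
        "suggested_link_text": "Read the New York Times Review of Eden",
    },
}

# Nested keys become dotted column names such as 'link.type'.
flat = pd.json_normalize([record])
print(list(flat.columns))

# sep changes the joiner, which avoids dots in column names.
flat_us = pd.json_normalize([record], sep="_")
print(list(flat_us.columns))
```

Underscore-separated names can be convenient later, since they also work with attribute-style access such as `flat_us.link_type`.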


In [21]:
# select the columns needed.
df5 = df4[['byline','date_updated','display_title','headline','link.url','mpaa_rating','publication_date']]
df5.head(5)

| | byline | date_updated | display_title | headline | link.url | mpaa_rating | publication_date |
|---|---|---|---|---|---|---|---|
| 0 | STEPHEN HOLDEN | 2017-11-02 04:16:35 | Eden | True Story Inspires Tale of Sex Trade; in a Tw... | http://www.nytimes.com/2013/03/20/movies/eden-... | R | 2013-03-19 |
| 1 | JEANNETTE CATSOULIS | 2017-11-02 04:16:34 | 108 | A Death in Paraguay Leads a Niece to Brutal Tr... | http://www.nytimes.com/2013/03/18/movies/108-c... | | 2013-03-17 |
| 2 | MANOHLA DARGIS | 2017-11-02 04:16:36 | Reality | Onstage All of His Life, but Aching to Be Seen | http://www.nytimes.com/2013/03/15/movies/reali... | R | 2013-03-14 |
| 3 | A. O. SCOTT | 2017-11-02 04:16:40 | Philip Roth Unmasked | Looking Past the Alter Egos to the Novelist | http://www.nytimes.com/2013/03/13/movies/phili... | | 2013-03-12 |
| 4 | RACHEL SALTZ | 2017-11-02 04:16:32 | I Killed My Mother | A Mother-Son Dance With Many Awkward Steps | http://www.nytimes.com/2013/03/13/movies/i-kil... | Not Rated | 2013-03-12 |
