# Live Coding - Module 3
## Rachel Holman

In this live coding session we will access three different datasets which we can access on the internet without having to supply any API keys or other credentials (we'll cover APIs with credentials next week).

The goal is to examine the structure of the data, decide what is metadata and what constitutes the content of the dataframe we are trying to build, and to use the various tools available to us to convert the data to a pandas dataframe in our Python environment.

We'll work on the following three examples together:

1. [Cocktail recipes](https://www.thecocktaildb.com/api/json/v1/1/filter.php?c=Cocktail) from [The Cocktail DB](https://www.thecocktaildb.com/api.php)

2. Data on the pages on [Wikipedia](https://en.wikipedia.org/w/index.php?search=Virginia&title=Special%3ASearch&fulltext=1&ns0=1) that pop up when searching for the term ["Virginia"](https://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=Virginia&format=json&srlimit=500). (This search gets the first 500 hits, but there are 210391 results. If time, we will use the sroffset parameter described in the [API documentation](https://en.wikipedia.org/w/api.php?action=help&modules=query%2Bsearch) to get the full list)

3. Articles from [NewsAPI.org](https://newsapi.org/docs/endpoints/everything). We will need to register for a [free API key](https://newsapi.org/docs/authentication) to be able to use this API. We'll talk about the best practices for storing API keys and keeping them secret during Module 4.


In [1]:
import numpy as np
import pandas as pd
import json
import requests

In [2]:
url = "https://www.thecocktaildb.com/api/json/v1/1/filter.php?c=Cocktail"
r = requests.get(url)
r  #response 200 means it worked without error!

<Response [200]>

In [3]:
# output what was returned by the requests.get(): r.text
# output in json format: json.loads(r.text)
myjson = json.loads(r.text)

In [4]:
# get drink name in first index of drinks
myjson['drinks'][0]['strDrink']

'155 Belmont'

In [5]:
numbers= [1,2,3,4,5]
numbers

[1, 2, 3, 4, 5]

In [6]:
# list comprehension
# to square every number, subtract 1, and take sqrt:
[np.sqrt(x**2 - 1)  for x in numbers]

[0.0,
 1.7320508075688772,
 2.8284271247461903,
 3.872983346207417,
 4.898979485566356]

In [7]:
# print out name of drink for every drink in the json
[d['strDrink'] for d in myjson['drinks']]

['155 Belmont',
 '57 Chevy with a White License Plate',
 '747 Drink',
 '9 1/2 Weeks',
 "A Gilligan's Island",
 'A True Amaretto Sour',
 'A.D.M. (After Dinner Mint)',
 'A1',
 'Abbey Martini',
 'Absolut Summertime',
 'Absolutely Fabulous',
 'Absolutly Screwed Up',
 'Ace',
 'Adam & Eve',
 'Addington',
 'Addison',
 'Addison Special',
 'Adios Amigos Cocktail',
 'Afterglow',
 'Alice Cocktail',
 'Amaretto fizz',
 'Aperol Spritz',
 'Apple Highball',
 'Apple Karate',
 'Applejack',
 'Aquamarine',
 'Arizona Stingers',
 'Arizona Twister',
 'Army special',
 'Autumn Garibaldi',
 'Aviation',
 'Bahama Mama',
 'Banana Cream Pi',
 "Bee's Knees",
 'Bijou',
 'Blue Hurricane',
 'Blueberry Mojito',
 'Bombay Cassis',
 'Bora Bora',
 'Boulevardier',
 'Bounty Hunter',
 'Brigadier',
 'Broadside',
 'Brooklyn',
 'Butterfly Effect',
 "Captain Kidd's Punch",
 'Cherry Electric Lemonade',
 'Cocktail Horse’s Neck',
 'Corn n Oil',
 'Corpse Reviver',
 'Cosmopolitan',
 'Cosmopolitan Martini',
 'Cream Soda',
 'Dark Caipiri

In [8]:
# for making a json dataset 
pd.json_normalize(myjson, record_path=['drinks'])

Unnamed: 0,strDrink,strDrinkThumb,idDrink
0,155 Belmont,https://www.thecocktaildb.com/images/media/dri...,15346
1,57 Chevy with a White License Plate,https://www.thecocktaildb.com/images/media/dri...,14029
2,747 Drink,https://www.thecocktaildb.com/images/media/dri...,178318
3,9 1/2 Weeks,https://www.thecocktaildb.com/images/media/dri...,16108
4,A Gilligan's Island,https://www.thecocktaildb.com/images/media/dri...,16943
...,...,...,...
95,Miami Vice,https://www.thecocktaildb.com/images/media/dri...,13936
96,Michelada,https://www.thecocktaildb.com/images/media/dri...,178343
97,Midnight Mint,https://www.thecocktaildb.com/images/media/dri...,14842
98,Mojito,https://www.thecocktaildb.com/images/media/dri...,11000


In [9]:
url = 'https://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=Virginia&format=json&srlimit=500'
r = requests.get(url)
r

<Response [200]>

In [10]:
myjson = json.loads(r.text)
#myjson

In [11]:
wikidf = pd.json_normalize(myjson, record_path = ['query', 'search'])
wikidf

Unnamed: 0,ns,title,pageid,size,wordcount,snippet,timestamp
0,0,Virginia,32432,299870,26084,"<span class=""searchmatch"">Virginia</span>, off...",2023-06-27T14:42:03Z
1,0,West Virginia,32905,183464,17649,"<span class=""searchmatch"">Virginia</span> is a...",2023-06-25T23:50:52Z
2,0,"Virginia Beach, Virginia",91239,141238,12558,"<span class=""searchmatch"">Virginia</span> Beac...",2023-06-27T02:07:26Z
3,0,Virginia Woolf,32742,331505,31948,"Adeline <span class=""searchmatch"">Virginia</sp...",2023-06-26T16:46:32Z
4,0,Virgínia,1392216,2035,26,"<span class=""searchmatch"">Virgínia</span> is a...",2022-01-27T06:45:35Z
...,...,...,...,...,...,...,...
495,0,"Ballston, Arlington, Virginia",603147,22831,1764,Ballston is a neighborhood in Arlington County...,2023-06-24T15:59:04Z
496,0,"Christiansburg, Virginia",137687,28407,3282,Christiansburg (formerly Hans Meadows) is a to...,2023-06-12T19:22:16Z
497,0,"Petersburg, Virginia",91268,78607,8248,Petersburg is an independent city in the Commo...,2023-06-16T16:07:00Z
498,0,1964 United States presidential election in Vi...,43161140,77986,1368,The 1964 United States presidential election i...,2023-06-27T05:19:41Z


In [12]:
# harder example
# don't do this! in the future, don't paste your API key in your code like this:
apikey = "38d9de88dc9d4dc09c5764977a193672"
url = 'https://newsapi.org/v2/everything'
parameters = {'apiKey': apikey, 
             'q': 'hasbulla'} #search parameter
r= requests.get(url, params=parameters)
r

<Response [200]>

In [13]:
myjson = json.loads(r.text)

In [14]:
myjson['articles']

[{'source': {'id': None, 'name': 'Huffingtonpost.es'},
  'author': 'Sergio Coto',
  'title': 'Le manda esta imagen a su madre, le pregunta si sabe quién es y la continuación es ORO',
  'description': '<![CDATA[<p>Las conversaciones de WhatsApp han aparecido para cambiarlo la forma de comunicarse, pero, por si fuera poco, también se han convertido en un arma de doble filo por las bromas y los memes que se acaban compartiendo y que terminan en redes sociales…',
  'url': 'https://www.huffingtonpost.es/virales/le-manda-imagen-madre-le-pregunta-continuacion-oro.html',
  'urlToImage': 'https://img.huffingtonpost.es/files/og_thumbnail/uploads/2023/06/24/imagen-de-la-conversacion.jpeg',
  'publishedAt': '2023-06-24T10:53:11Z',
  'content': 'Las conversaciones de WhatsApp han aparecido para cambiarlo la forma de comunicarse, pero, por si fuera poco, también se han convertido en un arma de doble filo por las bromas y los memes que se acab… [+1240 chars]'},
 {'source': {'id': None, 'name': 'Clari

In [15]:
pd.json_normalize(myjson, record_path=['articles'])

Unnamed: 0,author,title,description,url,urlToImage,publishedAt,content,source.id,source.name
0,Sergio Coto,"Le manda esta imagen a su madre, le pregunta s...",<![CDATA[<p>Las conversaciones de WhatsApp han...,https://www.huffingtonpost.es/virales/le-manda...,https://img.huffingtonpost.es/files/og_thumbna...,2023-06-24T10:53:11Z,Las conversaciones de WhatsApp han aparecido p...,,Huffingtonpost.es
1,"Mississippi Clarion Ledger, David Eckert, Miss...",Ole Miss says goodbye to rivalry with Vanderbi...,See Ole Miss football's tribute to its rivalry...,https://www.clarionledger.com/story/sports/col...,https://www.gannett-cdn.com/presto/2021/11/21/...,2023-06-15T20:05:13Z,OXFORD The social media teams around the SEC g...,,Clarion Ledger
2,marca.com,Prime Tyson Fury or prime Mike Tyson? Evander ...,Tyson Fury and Mike Tyson are two names that w...,https://www.marca.com/en/boxing/2023/06/19/649...,https://phantom-marca.unidadeditorial.es/10648...,2023-06-19T12:25:04Z,Tyson Fury and Mike Tyson are two names that w...,marca,Marca
3,(abc),El Barça se redime de sus pecados,PESTAÑA unicaja-barcelona-semis-tercero-acb22/...,https://www.abc.es/deportes/baloncesto/barca-r...,https://s2.abcstatics.com/abc/www/multimedia/d...,2023-06-11T18:45:58Z,El Barcelona al fin tuvo fe ante la dificultad...,,Www.abc.es
