<a href="https://colab.research.google.com/github/tjsoftworks/Agile_Data_Code_2/blob/master/Intro_to_APIs_.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Welcome to Google Colab

Colab is essentially Google's hosted version of a [Jupyter notebook](https://jupyter.org/), a very popular tool among data scientists.

It allows us to write code, documentation, and output visuals all in one place.

To run and edit the code in this workshop, please make a copy for yourself:

`File > Save a copy in Drive`

This should open a new tab with your own copy of this notebook. It can take a minute to load.

Colab also gives you some options for running heavy computations, such as training a deep learning model. To access those options:

`Runtime > Change runtime type`, then select `GPU`, `TPU`, or `None`.

We don't need to change anything for this workshop, but it's a great resource if you start learning deep learning and don't have a powerful GPU at home.

This is a text cell. 

You can add a new text cell by clicking `+ Text` above. 

It does not highlight wrong spelling. I apologize for any typos!


# REST

## Let's cover a simple REST GET request

The TacoFancy API was built on top of TacoFancy, a GitHub repository of community-contributed taco recipes.

See https://github.com/evz/tacofancy-api and https://github.com/sinker/tacofancy for more details.

One of the most difficult parts of APIs can be discovering valid ways to interact with them. 

In a best-case scenario, the API is thoroughly documented AND kept current...

> Just like all programmers documenting their code...

So with that in mind, let's check out the source code for the TacoFancy API to discover the Flask endpoints: https://github.com/evz/tacofancy-api/blob/master/app.py

In [None]:
import requests

# main endpoint url (no trailing slash, because every endpoint path below starts with '/')
taco_url='http://taco-randomizer.herokuapp.com'

# specific endpoint for a random full-taco
taco_options='/random/?full-taco=true'

#### Now we'll use the URL with the endpoint `/random/?full-taco=true` to GET a random taco.

In [None]:
response = requests.get(taco_url+taco_options)

# We can access the JSON payload response by calling the json method
response.json()

{'base_layer': {'name': 'Garlic Black Beans',
  'slug': 'garlic_black_beans',
  'url': 'https://raw.github.com/sinker/tacofancy/master/base_layers/garlic_black_beans.md'},
 'base_layer_url': 'https://raw.github.com/sinker/tacofancy/master/base_layers/garlic_black_beans.md',
 'condiment_url': None,
 'mixin_url': None,
 'name': 'Black Bean, Potato, and Onion Tacos',
 'seasoning_url': None,
 'shell_url': None,
 'slug': 'black_bean_potato_and_onion_tacos',
 'url': 'https://raw.github.com/sinker/tacofancy/master/full_tacos/black_bean_potato_onion_tacos.md'}
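Several fields in the payload above are `None`, so code that consumes a taco should not assume every component is present. The sketch below works on a hand-copied subset of that payload; the helper name `component_urls` is made up for this example and is not part of the API.

```python
# A subset of the payload shown above, copied by hand for illustration
sample_taco = {
  'name': 'Black Bean, Potato, and Onion Tacos',
  'base_layer_url': 'https://raw.github.com/sinker/tacofancy/master/base_layers/garlic_black_beans.md',
  'condiment_url': None,
  'mixin_url': None,
  'seasoning_url': None,
  'shell_url': None,
}

def component_urls(taco):
  """Return only the '*_url' fields that actually hold a URL (hypothetical helper)."""
  return {
    key: value
    for key, value in taco.items()
    if key.endswith('_url') and value is not None
  }

present = component_urls(sample_taco)
print(present)
```

With a live response you could call `component_urls(response.json())` in the same way.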

In [None]:
import time

random_tacos=[]

for _ in range(10):
  # Retrieve the page
  response = requests.get(taco_url+taco_options)
  random_tacos.append(response.json())
  # Wait between requests to avoid overloading the app
  time.sleep(5)
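Because each request draws a taco at random, the same taco can come back more than once. A minimal sketch for dropping repeats, assuming `slug` identifies a taco uniquely (that is an assumption, and `unique_by_slug` is a hypothetical helper):

```python
def unique_by_slug(tacos):
  """Keep the first taco seen for each slug, preserving the original order."""
  seen = set()
  unique = []
  for taco in tacos:
    slug = taco.get('slug')
    if slug in seen:
      continue
    seen.add(slug)
    unique.append(taco)
  return unique

# Hand-written sample records standing in for API responses
sample_tacos = [
  {'name': 'Fish Tacos', 'slug': 'fish_tacos'},
  {'name': 'Swiss Chard Tacos', 'slug': 'swiss_chard_tacos'},
  {'name': 'Fish Tacos', 'slug': 'fish_tacos'},
]
unique_tacos = unique_by_slug(sample_tacos)
print([taco['slug'] for taco in unique_tacos])
```

In the notebook you could apply it as `random_tacos = unique_by_slug(random_tacos)` before building a DataFrame.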

#### We can store these random tacos in a list and turn them into a Pandas DataFrame

In [None]:
import pandas as pd

df = pd.DataFrame(random_tacos)

In [None]:
df.head()

| | base_layer | condiment | slug | mixin_url | seasoning_url | url | shell_url | recipe | base_layer_url | name | condiment_url | mixin |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | `{'url': 'https://raw.github.com/sinker/tacofan...` | `{'url': 'https://raw.github.com/sinker/tacofan...` | asian_style_tacos | | | https://raw.github.com/sinker/tacofancy/master... | | `Asian Style Tacos\n=================\n\nIf you...` | https://raw.github.com/sinker/tacofancy/master... | Asian Style Tacos | https://raw.github.com/sinker/tacofancy/master... | |
| 1 | `{'url': 'https://raw.github.com/sinker/tacofan...` | `{'url': 'https://raw.github.com/sinker/tacofan...` | fish_tacos | https://raw.github.com/sinker/tacofancy/master... | | https://raw.github.com/sinker/tacofancy/master... | | `Fish Tacos\n==========\n\nFish tacos tend to o...` | https://raw.github.com/sinker/tacofancy/master... | Fish Tacos | https://raw.github.com/sinker/tacofancy/master... | `{'url': 'https://raw.github.com/sinker/tacofan...` |
| 2 | `{'url': 'https://raw.github.com/sinker/tacofan...` | `{'url': 'https://raw.github.com/sinker/tacofan...` | chicken_verde_corn_and_black_bean_tacos_with_p... | | | https://raw.github.com/sinker/tacofancy/master... | | `Chicken Verde, Corn and Black Bean Tacos with ...` | https://raw.github.com/sinker/tacofancy/master... | Chicken Verde, Corn and Black Bean Tacos with ... | https://raw.github.com/sinker/tacofancy/master... | |
| 3 | `{'url': 'https://raw.github.com/sinker/tacofan...` | `{'url': 'https://raw.github.com/sinker/tacofan...` | chorizo_sweet_potato_and_apple_tacos_with_chip... | https://raw.github.com/sinker/tacofancy/master... | | https://raw.github.com/sinker/tacofancy/master... | | `Chorizo, Sweet Potato and Apple Tacos with Chi...` | https://raw.github.com/sinker/tacofancy/master... | Chorizo, Sweet Potato and Apple Tacos with Chi... | https://raw.github.com/sinker/tacofancy/master... | `{'url': 'https://raw.github.com/sinker/tacofan...` |
| 4 | `{'url': 'https://raw.github.com/sinker/tacofan...` | | swiss_chard_tacos | | | https://raw.github.com/sinker/tacofancy/master... | | `# Swiss Chard Tacos\n\nGot chard or another le...` | https://raw.github.com/sinker/tacofancy/master... | Swiss Chard Tacos | | |


In [None]:
response=requests.get(taco_url+'/base_layers/')

In [None]:
base_layers_df = pd.DataFrame(response.json())

In [None]:
base_layers_df.head()

| | url | recipe | name | slug |
|---|---|---|---|---|
| 0 | https://raw.github.com/sinker/tacofancy/master... | `Soyrizo\n=======\n\n* Soyrizo (The El Burrito ...` | Soyrizo | soyrizo |
| 1 | https://raw.github.com/sinker/tacofancy/master... | `Roasted Butternut Squash\n====================...` | Roasted Butternut Squash | roasted_butternut_squash |
| 2 | https://raw.github.com/sinker/tacofancy/master... | `Baja Beer Battered Fish\n=====================...` | Baja Beer Battered Fish | baja_beer_battered_fish |
| 3 | https://raw.github.com/sinker/tacofancy/master... | `Carnitas\n========\n\nThis recipe calls for bo...` | Carnitas | carnitas |
| 4 | https://raw.github.com/sinker/tacofancy/master... | `Chopped Steak\n=============\n\nI like all kin...` | Chopped Steak | chopped_steak |


In [None]:
response=requests.get(taco_url+'/base_layers/chorizo')
response.json()


{'name': 'Chorizo',
 'slug': 'chorizo',
 'url': 'https://raw.github.com/sinker/tacofancy/master/base_layers/chorizo.md'}
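The requests in this section all follow the same pattern: a base URL, a path such as `/random/` or `/base_layers/<slug>`, and sometimes a query string. As a rough sketch, the standard library can assemble these pieces so that slashes are never doubled and query parameters are encoded for you. `build_url` and `BASE_URL` are names invented for this example.

```python
from urllib.parse import urlencode, urljoin

# Base URL of the TacoFancy API used in this section
BASE_URL = 'http://taco-randomizer.herokuapp.com/'

def build_url(path, params=None, base=BASE_URL):
  """Join base and path without doubling slashes, then append query parameters."""
  url = urljoin(base, path.lstrip('/'))
  if params:
    url += '?' + urlencode(params)
  return url

random_taco_url = build_url('/random/', {'full-taco': 'true'})
chorizo_url = build_url('base_layers/chorizo')
print(random_taco_url)
print(chorizo_url)
```

`requests.get` also accepts a `params` dictionary directly, which performs the same encoding step.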

# SOAP

## What about SOAP APIs? 

Here we will work with a SOAP API dedicated to validating ISBNs.

ISBN - International Standard Book Number

ISBNs are 10 or 13 digits long. Here we will work through an example of making a SOAP request for a 10-digit ISBN.
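To get a feel for what "valid" means here, the standard ISBN-10 checksum can be computed locally: weight the ten characters from 10 down to 1 (a final `X` counts as 10) and check that the total is divisible by 11. This is only a sketch of that well-known rule; how the web service validates internally is not documented here, and `is_valid_isbn10` is a hypothetical helper.

```python
def is_valid_isbn10(isbn):
  """Check an ISBN-10 with the standard mod-11 checksum, ignoring hyphens and spaces."""
  chars = [ch for ch in isbn if ch not in '- ']
  if len(chars) != 10:
    return False
  total = 0
  for position, ch in enumerate(chars):
    if ch == 'X' and position == 9:
      value = 10  # 'X' is only allowed as the final check character
    elif ch in '0123456789':
      value = int(ch)
    else:
      return False
    total += (10 - position) * value
  return total % 11 == 0

print(is_valid_isbn10('0-19-852663-6'))
```

A local check like this is handy for comparing against whatever the remote service returns.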

---


We can check out the service at: http://webservices.daehosting.com/services/isbnservice.wso



In [None]:
# URL of SOAP Endpoint
soap_url='http://webservices.daehosting.com/services/isbnservice.wso'

# Header to indicate content type
headers={'content-type':'text/xml'}

# The bulk of the request which must match the SOAP specification of the server
body10="""<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
 <soap:Body>
  <IsValidISBN10 xmlns="http://webservices.daehosting.com/ISBN">
   <sISBN>0-19-852663-6</sISBN>
  </IsValidISBN10>
 </soap:Body>
</soap:Envelope>"""

In [None]:
response = requests.post(soap_url,data=body10,headers=headers)

In [None]:
from xml.dom import minidom

xml = minidom.parseString(response.text)
print(xml.toprettyxml())

<?xml version="1.0" ?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
	
  
	<soap:Body>
		
    
		<m:IsValidISBN10Response xmlns:m="http://webservices.daehosting.com/ISBN">
			
      
			<m:IsValidISBN10Result>true</m:IsValidISBN10Result>
			
    
		</m:IsValidISBN10Response>
		
  
	</soap:Body>
	

</soap:Envelope>



## Challenge: 

Create a new request given an ISBN of your choosing

In [None]:
# Your code here


## Challenge:

Try validating a 13-digit ISBN

## Extra Challenge:

Turn those steps into a custom function

In [None]:
# 13 digit ISBN validation code


In [None]:
# Create a function that can work with either


In [None]:
# Can you integrate the above with this new function?

def parse_isbn_xml(xml, length=10):
  """
  Return Boolean result of an XML response from ISBN validator
  """
  # The service prefixes its result elements with 'm:' (see the output above)
  tag_name=f'm:IsValidISBN{length}Result'
  element = xml.getElementsByTagName(tag_name)[0]
  # The element's only child is a text node holding 'true' or 'false'
  node = element.lastChild
  return node.nodeValue == 'true'

In [None]:
parse_isbn_xml(xml)

True
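`parse_isbn_xml` looks the element up by its literal `m:` prefix, which works as long as the server keeps using that prefix. A possible alternative, sketched here with the standard library's `xml.etree.ElementTree`, matches on the namespace URI instead. The sample XML is a hand-tidied copy of the response printed earlier, and `parse_isbn_result` is a name chosen for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the response shown earlier
ISBN_NS = 'http://webservices.daehosting.com/ISBN'

sample_response = """<?xml version="1.0" ?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <m:IsValidISBN10Response xmlns:m="http://webservices.daehosting.com/ISBN">
      <m:IsValidISBN10Result>true</m:IsValidISBN10Result>
    </m:IsValidISBN10Response>
  </soap:Body>
</soap:Envelope>"""

def parse_isbn_result(xml_text, length=10):
  """Return the Boolean result, locating the element by namespace URI rather than prefix."""
  root = ET.fromstring(xml_text)
  element = root.find(f'.//{{{ISBN_NS}}}IsValidISBN{length}Result')
  if element is None:
    raise ValueError(f'No IsValidISBN{length}Result element found')
  return (element.text or '').strip() == 'true'

print(parse_isbn_result(sample_response))
```

Unlike the `minidom` version, this takes the raw text (for example `response.text`) rather than a parsed document.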

# GraphQL

#### Now onto GraphQL!!!

Of the API styles covered here, GraphQL comes closest to querying a database directly: much like SQL, you state exactly which fields you want back.

Check out the API explorer at https://api.graphql.jobs/

In [None]:
query="""{
  jobs{
    id
    title
    cities{
      name
    }
    company{
      name
    }
    description
  }
}"""

In [None]:
graphql_url="https://api.graphql.jobs/"

In [None]:
response=requests.post(graphql_url,json={'query':query})

In [None]:
response.json()

{'data': {'jobs': [{'cities': [{'name': 'San Francisco'}],
    'company': {'name': 'Segment'},
    'description': "# **Overview\xa0**\n\nAt Segment, we believe companies should be able to send their customer data wherever they want, whenever they want, with no fuss.\xa0We make this easy with a single pipeline that collects, stores, filters, transforms, and sends data to hundreds of business tools with the flip of a switch.\xa0Historically, we’ve built integrations with more than 250 different customer data tools ourselves(think Mixpanel, Google Analytics, Stripe).This March, we opened up our [++**Developer Center**++](https://segment.com/partners/developer-center/). For the first time, new companies could build integrations upon Segment data, using our self-service workflow. In that time, we’ve onboarded **60 separate companies**, each of whom built endpoints to work with our spec.\xa0We're now looking for a Senior Fullstack Engineer to help us expand our platform…we want to offer ever

We can even build a Pandas DataFrame from our results

In [None]:
# filter down the JSON hierarchy
jobs_df=pd.DataFrame(response.json()['data']['jobs'])
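Indexing straight into `['data']['jobs']` assumes the query succeeded. GraphQL servers commonly report problems in a top-level `errors` list inside the JSON body, so it can be worth checking for that key first. The sketch below uses hand-written payloads; the error message text is invented for illustration and `extract_jobs` is a hypothetical helper.

```python
def extract_jobs(payload):
  """Return the list under data.jobs, or raise if the server reported errors."""
  if payload.get('errors'):
    messages = '; '.join(err.get('message', 'unknown error') for err in payload['errors'])
    raise RuntimeError(f'GraphQL errors: {messages}')
  data = payload.get('data') or {}
  return data.get('jobs', [])

# Hand-written payloads standing in for response.json()
ok_payload = {'data': {'jobs': [{'title': 'Senior Fullstack Engineer - Platform'}]}}
error_payload = {'errors': [{'message': 'Cannot query field "salary" on type "Job".'}]}

print(extract_jobs(ok_payload))
```

In the notebook this would be used as `pd.DataFrame(extract_jobs(response.json()))`.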

In [None]:
# function to pull the company name out of the nested dict
def get_details(x):
  return x['name']

In [None]:
jobs_df['company']=jobs_df.company.apply(get_details)

In [None]:
jobs_df.head()

| | id | title | cities | company | description |
|---|---|---|---|---|---|
| 0 | cjz1ipl9x009a0758hg68h7vy | Senior Fullstack Engineer - Platform | `[{'name': 'San Francisco'}]` | Segment | `# **Overview **\n\nAt Segment, we believe comp...` |
| 1 | cjwt2a8j700by0793lnvon5c9 | Full Stack JavaScript Developer | `[{'name': 'Berlin'}]` | Unrealists | `We are looking for a strong (Midlevel to Senio...` |
| 2 | cjw1yogu0007j079339al4zyu | Senior Software Engineer, API Development | `[{'name': 'London'}]` | DeepCrawl | `**Overview of DeepCrawl**\n\nDeepCrawl is the ...` |
| 3 | cjwqsa7wj007d0778obv0dfs9 | Experienced Backend Engineer - Go | `[]` | Theorem | `Do you enjoy collaborating in a consultative e...` |
| 4 | cjwnlolmk03hm0756mmvkw5d2 | Senior Software Engineer - Frontend | `[]` | Close | `**About Us**\n\nAt [Close](https://close.com/)...` |
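The `cities` column still holds lists of dictionaries, and some of those lists are empty. One way to flatten it, sketched on plain Python records that mirror the rows above (`city_names` is a hypothetical helper):

```python
def city_names(cities):
  """Join the 'name' of each city dict into one string; an empty list gives ''."""
  return ', '.join(city['name'] for city in cities)

# Records mirroring two rows of the table above
sample_rows = [
  {'title': 'Senior Fullstack Engineer - Platform', 'cities': [{'name': 'San Francisco'}]},
  {'title': 'Experienced Backend Engineer - Go', 'cities': []},
]
flattened = [city_names(row['cities']) for row in sample_rows]
print(flattened)
```

On the DataFrame itself the same function could be applied with `jobs_df['cities'] = jobs_df.cities.apply(city_names)`.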


In [None]:
# Note: gensim.summarization was removed in gensim 4.0, so this import needs gensim 3.x
from gensim.summarization import keywords

keywords_list=jobs_df.description.apply(lambda x: keywords(x).split('\n'))

In [None]:
from collections import Counter

In [None]:
c=Counter()
for job_kws in keywords_list:
  c.update(job_kws)

In [None]:
c.most_common(20)

[('experience', 42),
 ('work', 40),
 ('team', 39),
 ('working', 38),
 ('development', 34),
 ('products', 31),
 ('new', 30),
 ('build', 29),
 ('building', 27),
 ('product', 22),
 ('engineering', 21),
 ('status', 20),
 ('developers', 20),
 ('engineer', 20),
 ('engineers', 20),
 ('developer', 19),
 ('develop', 19),
 ('graphql', 19),
 ('production', 19),
 ('developing', 17)]
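If `gensim.summarization` is not available in your environment, plain word counts give a rough substitute. To be clear, this is a different and much cruder technique than gensim's keyword extraction: it simply tokenizes, drops a few common words, and counts. The stop-word list and the sample descriptions below are made up for illustration.

```python
import re
from collections import Counter

# Deliberately tiny, hypothetical stop-word list; a real one would be much longer
STOP_WORDS = {'a', 'an', 'and', 'are', 'for', 'in', 'is', 'of', 'on',
              'the', 'to', 'we', 'with', 'you'}

def count_terms(descriptions, top_n=5):
  """Count lowercase words across all descriptions, skipping stop words."""
  counts = Counter()
  for text in descriptions:
    words = re.findall(r'[a-z]+', text.lower())
    counts.update(word for word in words if word not in STOP_WORDS)
  return counts.most_common(top_n)

# Invented descriptions standing in for jobs_df.description
sample_descriptions = [
  'We are looking for a strong developer to build products with the team.',
  'You will build and ship products with a small team.',
]
top_terms = count_terms(sample_descriptions, top_n=3)
print(top_terms)
```

In the notebook the call would be `count_terms(jobs_df.description)`.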

What about limiting the results of our query?

In [None]:
query="""{
  jobs {
    title
    tags(first:1) {
      name
    }
  }
}
"""

In [None]:
response=requests.post(graphql_url,json={'query':query})

In [None]:
response.json()

{'data': {'jobs': [{'tags': [{'name': 'TypeScript'}],
    'title': 'Senior Fullstack Engineer - Platform'},
   {'tags': [{'name': 'JavaScript'}],
    'title': 'Full Stack JavaScript Developer'},
   {'tags': [{'name': 'TypeScript'}],
    'title': 'Senior Software Engineer, API Development'},
   {'tags': [{'name': 'Backend'}],
    'title': 'Experienced Backend Engineer - Go'},
   {'tags': [{'name': 'JavaScript'}],
    'title': 'Senior Software Engineer - Frontend'},
   {'tags': [{'name': 'Backend'}],
    'title': 'Senior Software Engineer - Backend'},
   {'tags': [{'name': 'TypeScript'}],
    'title': 'Full Stack Javascript Developer'},
   {'tags': [{'name': 'Python'}], 'title': 'Deep Learning Engineer'},
   {'tags': [{'name': 'JavaScript'}], 'title': 'Full Stack Engineer'},
   {'tags': [{'name': 'JavaScript'}], 'title': 'React Developer'},
   {'tags': [{'name': 'Swift'}], 'title': 'iOS Engineer'},
   {'tags': [{'name': 'Android'}], 'title': 'Android Engineer'},
   {'tags': [{'name': 'Fr
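Rather than hard-coding the limit inside the query string, GraphQL lets you declare a variable and send its value alongside the query in the request body. The sketch below only builds that body. It assumes the `first` argument accepts an `Int`, which has not been checked against this API's schema, and `build_payload` is a hypothetical helper.

```python
import json

# Same query as above, with the limit turned into a variable.
# Assumption: the `first` argument is of type Int in this API's schema.
query_with_variable = """query ($n: Int) {
  jobs {
    title
    tags(first: $n) {
      name
    }
  }
}"""

def build_payload(query, variables=None):
  """Build the JSON body for a GraphQL POST request."""
  payload = {'query': query}
  if variables:
    payload['variables'] = variables
  return payload

payload = build_payload(query_with_variable, {'n': 1})
print(json.dumps(payload['variables']))
```

The payload would then be sent the same way as before: `requests.post(graphql_url, json=payload)`.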

## Resources for other APIs

#### GraphQL

https://github.com/APIs-guru/graphql-apis

#### General (Mostly REST)
https://github.com/public-apis/public-apis#environment


# What's next?

### Community
- [Spotify: The Tech Pivot](https://open.spotify.com/show/5hsVXoBEshedfp4P9HSmTC)
- [Apple Podcasts: The Tech Pivot](https://podcasts.apple.com/us/podcast/the-tech-pivot/id1547353713?uo=4&itscg=30200&itsct=podcast_box)

### Keep Learning:
The most important thing to do is keep learning!

- [Data Science Prep Course](https://bit.ly/DSIPREP-32q7lQj) 

See all upcoming Galvanize online events [here](https://www.hackreactor.com/webinars)

### Challenge ideas:
- Find a new API to query
- Try creating an API of your own (start with hello world!)
- Integrate an API into your machine learning pipeline


### Stay Connected:
- LinkedIn: [https://www.linkedin.com/in/andrewmeans/](https://www.linkedin.com/in/andrewmeans/)

- Email: andrew.means@galvanize.com