# Working with APIs using the requests library

A task you will take on over and over again is pulling data off of an API. If you are very lucky there will be an SDK (software development kit) that you can import like any other library and that gives you a clean interface to the service. Usually that is not the case, and you are left figuring out how to munge HTTP yourself. Fortunately, Python's requests library makes the process much easier.

You'll notice below that I'm creating a token variable from an environment variable I passed into the Docker container while spinning it up. This is to prevent you criminals from accessing my shit :) If you want to run the later examples on your own GitHub repo, you'll need to generate a Personal Access Token (PAT) with the correct permissions. You can find instructions [here](https://help.github.com/articles/creating-a-personal-access-token-for-the-command-line/).

All jokes aside, be very, very careful with your credentials. Careers have been ruined by careless committing of secrets to public repos. Once you've generated a PAT, the wrong way to pass it into your version of this notebook is to paste in the string literal: people could see it and, worse, it is super easy to commit it somewhere public. Read it from an environment variable instead, as the cell below does.

In [1]:
import os
import requests
import json

In [2]:
# Change these to the appropriate values to follow along using your own repo
USER = "JCPistell"
TOKEN = os.environ["GITHUB_TOKEN"]
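
If the environment variable is missing, the cell above dies with a bare `KeyError`. Here is a slightly friendlier sketch of the same idea. `load_token` is a hypothetical helper name I made up for this notebook, not something requests or GitHub provides, and the optional prompt assumes you are running interactively.

```python
import os
from getpass import getpass


def load_token(env_var="GITHUB_TOKEN", prompt_if_missing=False):
    """Return the token stored in env_var without ever hardcoding it.

    Hypothetical helper for this notebook, not a requests or GitHub function.
    """
    token = os.environ.get(env_var)
    if token:
        return token
    if prompt_if_missing:
        # getpass hides what you type, so the token never lands in the notebook
        return getpass(f"{env_var}: ")
    raise RuntimeError(
        f"Set the {env_var} environment variable before running this notebook"
    )
```

Calling `load_token()` with the variable set behaves just like `os.environ["GITHUB_TOKEN"]`; the only difference is a clearer error message when it is not set.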

## REST... yep... I totally know what that is...

The vast majority of the APIs you will interact with are RESTful. REST stands for "Representational State Transfer". The basic idea is that two sides communicating over a network should use a common protocol, and neither side should have to track the state of the other. In very simple terms, what this *actually* means is computers talking to each other using HTTP. The vast majority of our interactions with the internet involve `GET` requests, but HTTP has a few more verbs (`POST`, `PATCH`, `PUT`, `DELETE`) and RESTful APIs use them.

I sincerely hope that clears up all potential questions because my knowledge of networking has now been exhausted. Let's just try it, shall we? :)

We'll use the GitHub API for the rest of these examples... as always, it's important to [RTFM](https://developer.github.com/v3/).
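
Before we send anything, here is a rough sketch of how the other HTTP verbs mentioned above show up in requests. It builds one request per verb and inspects it without sending, so nothing touches the network. The URL is just an illustration, and building requests this way is only for poking around; the cells below use the usual shortcuts.

```python
import requests

# Build (but never send) one request per verb so we can inspect them offline
url = "https://api.github.com/zen"
verbs = ["GET", "POST", "PATCH", "PUT", "DELETE"]

prepared = [requests.Request(verb, url).prepare() for verb in verbs]

for p in prepared:
    # requests also exposes a shortcut per verb: requests.get, requests.post, ...
    print(p.method, p.url, hasattr(requests, p.method.lower()))
```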

In [26]:
# The basics: let's do a standard get request
r = requests.get("https://api.github.com/zen")

In [27]:
r.text

'Half measures are as bad as nothing at all.'

In [5]:
# Let's get some public info on me
r = requests.get(f"https://api.github.com/users/{USER}")

In [6]:
# requests won't raise an exception on a 404; check the status code to catch failures
r.status_code

200
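
A sketch of what checking for failure can look like. To keep it runnable offline I build a `Response` by hand and set its status to 404; in real code that object comes back from `requests.get(...)`. The hand-built response is purely a stand-in, not how you would normally get one.

```python
import requests

# Stand-in for a failed call so nothing hits the network
missing = requests.Response()
missing.status_code = 404

# .ok is a quick boolean check: False for 4xx and 5xx responses
print(missing.ok)

# raise_for_status() turns a bad status into an exception you can catch
try:
    missing.raise_for_status()
    outcome = "no error"
except requests.HTTPError as err:
    outcome = f"HTTPError: {err.response.status_code}"

print(outcome)
```

Whether you check `status_code` yourself or call `raise_for_status()` is mostly taste; the exception route is handy when you would rather fail loudly than carry on with a bad response.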

In [7]:
# We have full access to the response's header... useful for debugging 
r.headers["content-type"]

'application/json; charset=utf-8'

In [8]:
# If the response body is json we have a convenience function to parse it into a dict
user_info = r.json()

In [9]:
# Which we can then pretty-print
print(json.dumps(user_info, indent=4))

{
    "login": "JCPistell",
    "id": 8353467,
    "node_id": "MDQ6VXNlcjgzNTM0Njc=",
    "avatar_url": "https://avatars3.githubusercontent.com/u/8353467?v=4",
    "gravatar_id": "",
    "url": "https://api.github.com/users/JCPistell",
    "html_url": "https://github.com/JCPistell",
    "followers_url": "https://api.github.com/users/JCPistell/followers",
    "following_url": "https://api.github.com/users/JCPistell/following{/other_user}",
    "gists_url": "https://api.github.com/users/JCPistell/gists{/gist_id}",
    "starred_url": "https://api.github.com/users/JCPistell/starred{/owner}{/repo}",
    "subscriptions_url": "https://api.github.com/users/JCPistell/subscriptions",
    "organizations_url": "https://api.github.com/users/JCPistell/orgs",
    "repos_url": "https://api.github.com/users/JCPistell/repos",
    "events_url": "https://api.github.com/users/JCPistell/events{/privacy}",
    "received_events_url": "https://api.github.com/users/JCPistell/received_events",
    "type": "User"

In [10]:
# Or access specific info
user_info["id"]

8353467
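
Since `r.json()` hands back a plain dict, all the usual dict tricks apply. A tiny sketch, using a trimmed copy of the output above so it runs without a network call; `favorite_editor` is a made-up key that is not part of the response, used only to show what happens when a field is absent.

```python
# Trimmed copy of the response shown above
user_info = {"login": "JCPistell", "id": 8353467, "type": "User"}

# .get() returns a default instead of raising KeyError when a field is absent
editor = user_info.get("favorite_editor", "not listed")

# Pull several fields at once with a dict comprehension
wanted = ("login", "id")
summary = {key: user_info[key] for key in wanted}

print(editor)
print(summary)
```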

## Authorization

So far we've been using calls that don't require authorization. Let's now dig into some functionality that requires us to authorize the call with some credentials. The standard way to go about this is to provide a token in the header of your request. The requests library makes generating these headers very easy:

In [11]:
# Now we get to use the token!
header = {"Authorization": f"token {TOKEN}"}
url = "https://api.github.com/user/repos"

# We'll use a similar technique to generate some parameters. What do you think this does?
params = {"per_page": 100}

In [12]:
r = requests.get(url, headers=header, params=params)
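
If you're curious what requests actually does with `headers` and `params`, here is a sketch that prepares the same kind of request without sending it. It also uses a `Session`, which lets you set the auth header once instead of passing it on every call. The token is an obvious placeholder so this runs anywhere.

```python
import requests

# Placeholder token so the sketch runs anywhere; use your real TOKEN in practice
demo_token = "not-a-real-token"

session = requests.Session()
# Headers set on the session go out with every request made through it
session.headers.update({"Authorization": f"token {demo_token}"})

request = requests.Request(
    "GET", "https://api.github.com/user/repos", params={"per_page": 100}
)
# prepare_request merges the session's settings in without sending anything
prepared = session.prepare_request(request)

print(prepared.url)
print(prepared.headers["Authorization"])
```

The params dict ends up as a query string on the URL, which answers the question in the comment above. With a real token you would then just call `session.get(url, params=params)`.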

In [13]:
# Let's get fancy and only pull the names of each public repo using a comprehension

[i["name"] for i in r.json() if not i["private"]]

['auth0-golang-web-app',
 'dotfiles',
 'gocsv',
 'iamauth',
 'maxwell',
 'pubsub4s',
 'ReactiveBaseSync',
 'shopify_theme',
 'windows-development-environment',
 'API-created-example',
 'Baltimore',
 'branchingandloops',
 'cdnow',
 'CLT-Simulation',
 'CodingOrientation',
 'colinreverser',
 'colinslooker',
 'dataAnalyticsFinal',
 'datasciencecoursera',
 'datasharing',
 'dotfiles',
 'dplyr_deck',
 'Earthquake-slides',
 'Earthquakes',
 'eduService1',
 'ExData_Plotting1',
 'fireProject',
 'GetDataProject',
 'interview-repo',
 'JCPistell.github.io',
 'jenkins-python-test',
 'lookerVagrant',
 'machine_learning_assignment',
 'movieCluster',
 'msbaNotebooks',
 'msbaPresentations',
 'msba_local_mysql',
 'msba_vagrantfile',
 'MSBC5070',
 'pilgrimBank',
 'ProgrammingAssignment2',
 'RegModels_1',
 'RepData_PeerAssessment1',
 'R_functions',
 'R_intro',
 'test-repo',
 'testslidify',
 'colintest']
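
One caveat with `per_page`: you only get one page per call. My understanding of the GitHub docs is that further pages are advertised in a `Link` response header, which requests parses for you as `r.links`. A minimal offline sketch, using a hand-built response and illustrative URLs (not captured from a real call); `next_page_url` is a hypothetical helper name.

```python
import requests

# Offline stand-in for a real response: we set the Link header by hand
fake = requests.Response()
fake.status_code = 200
fake.headers["Link"] = (
    '<https://api.github.com/user/repos?per_page=100&page=2>; rel="next", '
    '<https://api.github.com/user/repos?per_page=100&page=3>; rel="last"'
)


def next_page_url(response):
    """Return the URL of the next page, or None when there isn't one."""
    return response.links.get("next", {}).get("url")


next_url = next_page_url(fake)
print(next_url)

# A response without a Link header has no further pages
last_page = requests.Response()
print(next_page_url(last_page))
```

In a real loop you would keep requesting `next_page_url(r)` until it returns `None`.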

## Using POST requests to make things

Let's now create a repo. This requires a `POST` request and a content payload, which we can build as a dict. We serialize the dict with `json.dumps`, and requests does the heavy lifting of structuring the rest of the request.

In [14]:
REPO_NAME = "API-created-example"

payload = {
  "name": REPO_NAME,
  "description": "repo created with api",
  "private": False
}

url = "https://api.github.com/user/repos"
header = {"Authorization": f"token {TOKEN}", "content-type": "application/json"}

In [15]:
r = requests.post(url, headers=header, data=json.dumps(payload))
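
A side note: requests can also do the serializing for you if you pass the dict as `json=` instead of `data=`. The sketch below prepares both versions without sending them (so no token is needed) and compares the results. It is an alternative to the cell above, not a correction.

```python
import json
import requests

url = "https://api.github.com/user/repos"
payload = {
    "name": "API-created-example",
    "description": "repo created with api",
    "private": False,
}

# The approach used above: serialize by hand and set the content type yourself
by_hand = requests.Request(
    "POST", url, data=json.dumps(payload), headers={"content-type": "application/json"}
).prepare()

# The shortcut: json= serializes the dict and sets the header for you
shortcut = requests.Request("POST", url, json=payload).prepare()

same_body = json.loads(by_hand.body) == json.loads(shortcut.body)
print(same_body, shortcut.headers["Content-Type"])
```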

In [16]:
# Did it work? A 201 means the repo was created
r.status_code

422
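
That 422 is not a success code. The repo name already shows up in the listing earlier in this notebook, so the likely explanation is that an earlier run had already created it and GitHub rejected the duplicate. Here is a rough sketch of turning the status into something readable. `creation_outcome` is a hypothetical helper, and the meanings are my reading of the GitHub docs rather than an exhaustive list.

```python
def creation_outcome(status_code):
    """Translate a status code from the create-repo call into plain English.

    Hypothetical helper; the meanings are assumptions based on GitHub's docs.
    """
    meanings = {
        201: "created",
        401: "bad or missing token",
        403: "token lacks permission",
        422: "validation failed (for example, the name is already taken)",
    }
    return meanings.get(status_code, f"unexpected status {status_code}")


print(creation_outcome(422))
```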

In [17]:
# Did it actually work??

r = requests.get(f"https://api.github.com/repos/{USER}/{REPO_NAME}")
{k:v for k,v in r.json().items() if k in payload.keys()}

{'name': 'API-created-example',
 'private': False,
 'description': 'repo created with api'}

In [21]:
# That's a shit description. Let's make it much better with a PATCH request
payload = {
    "name": REPO_NAME,
    "description": "A much better description"
}

url = f"https://api.github.com/repos/{USER}/{REPO_NAME}"

r = requests.patch(url, headers=header, data=json.dumps(payload))

In [22]:
r.status_code

200

In [24]:
r.json()["description"]

'A much better description'