# Intermediate APIs

We looked at a basic API in the last mission. That API didn't require authentication, but most do. Imagine that you're using the Reddit API to pull a list of your private messages. **It would be a huge privacy breach for Reddit to give that information to anyone, so requiring authentication makes sense.**

**APIs also use authentication to perform rate limiting**. Developers typically use APIs to build interesting applications or services. To stay available and responsive for all users, an API will prevent you from making too many requests in too short a time. We call this restriction rate limiting. **It ensures that one user can't overload the API server by making too many requests too fast.**
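To make rate limiting concrete from the client's side: GitHub reports the remaining quota in response headers named `X-RateLimit-Remaining` and `X-RateLimit-Reset` (the reset time as a Unix timestamp). The sketch below is a hypothetical helper, not part of the mission's code, that decides how long a script should pause before its next request. The function name and the sample header values are made up for illustration.

```python
import time


def seconds_to_wait(response_headers, now=None):
    """Return how many seconds to pause before the next request.

    Hypothetical helper: reads the rate-limit headers GitHub includes
    in its responses. Returns 0 while requests remain in the quota.
    """
    if now is None:
        now = time.time()
    remaining = int(response_headers.get("X-RateLimit-Remaining", 1))
    reset_at = int(response_headers.get("X-RateLimit-Reset", 0))
    if remaining > 0:
        return 0
    # Quota exhausted: wait until the reset time (never a negative wait).
    return max(0, reset_at - now)


# Made-up header values for illustration.
sample_headers = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1700000060"}
print(seconds_to_wait(sample_headers, now=1700000000))
```

In a real script, the `headers` attribute of a requests response could be passed in directly, and the result handed to `time.sleep`.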

In this mission, we'll explore the GitHub API and use it to pull some interesting data on repositories and users. GitHub has:

* user accounts ([example](https://github.com/torvalds))
* repositories that contain code ([example](https://github.com/torvalds/linux))
* organizations that companies can create ([example](https://github.com/dataquestio)).

Take a look at the [documentation for the GitHub API](https://developer.github.com/v3/), and specifically the [authentication section](https://developer.github.com/v3/#authentication).

To authenticate with the GitHub API, we'll need to use an access token. An access token is a credential we can [generate on GitHub's website](https://github.com/settings/tokens). The token is a string that the API can read and associate with your account.

Using a token is preferable to a username and password for a few reasons:

* Typically, you'll be accessing an API from a script. **If you put your username and password in the script and someone manages to get their hands on it, they can take over your account.** In contrast, you can revoke an access token to cancel an unauthorized person's access if there's a security breach.
* Access tokens can have scopes and specific permissions. For instance, you can make a token that has permission to write to your GitHub repositories and make new ones. Or, you can make a token that can only read from your repositories. **Using read-access-only tokens in potentially insecure or shared scripts gives you more control over security.**

You'll need to pass your token to the GitHub API through an Authorization header. Just like the server sends headers in response to our request, we can send the server headers when we make a request. Headers contain metadata about the request. We can create a Python dictionary of headers, and then pass it into our request using the requests library.

* We need to include the word `token` in the Authorization header, followed by our access token. Here's an example of an Authorization header:

```
{"Authorization": "token 1f36137fbbe1602f779300dad26e4c1b7fbab631"}
```

In this case, our access token is 1f36137fbbe1602f779300dad26e4c1b7fbab631. GitHub generated this token and associated it with the account of Vik Paruchuri.

* You should never share your token with anyone you don't want to have access to your account. We've revoked the token you'll be using throughout this mission, so it isn't valid anymore. Consider a token somewhat equivalent to a password, and store it securely.
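One common way to keep a token out of a script is to read it from an environment variable. The sketch below assumes a variable named `GITHUB_TOKEN` (an arbitrary name chosen for this example) and a hypothetical helper `build_auth_headers`; neither comes from the mission itself.

```python
import os


def build_auth_headers(token):
    """Return the headers dictionary GitHub expects for token authentication."""
    if not token:
        raise ValueError("No access token provided")
    return {"Authorization": "token " + token}


# GITHUB_TOKEN is an assumed variable name; set it in your shell, not in the script.
token = os.environ.get("GITHUB_TOKEN", "")
if token:
    auth_headers = build_auth_headers(token)
else:
    print("Set the GITHUB_TOKEN environment variable before making requests.")
```

With this approach, the script can be shared or committed without exposing the token, and revoking the token on GitHub still cuts off access immediately.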

We've imported requests for you already, so please avoid doing it again in this mission. Importing requests will overwrite some of the custom API logic we've developed for answer checking.

In [1]:
import requests

In [3]:
# Create a dictionary of headers containing our Authorization header.
headers = {"Authorization": "token 7d9e3e26294 ... b89f0576"}

# Make a GET request to the GitHub API with our headers.
# This API endpoint lists the organizations Vik Paruchuri belongs to.
response = requests.get("https://api.github.com/users/VikParuchuri/orgs", headers=headers)

# Print the content of the response: a list with one dictionary per organization.
print(response.json())

orgs = response.json()

[{'login': 'dataquestio', 'id': 11148054, 'url': 'https://api.github.com/orgs/dataquestio', 'repos_url': 'https://api.github.com/orgs/dataquestio/repos', 'events_url': 'https://api.github.com/orgs/dataquestio/events', 'hooks_url': 'https://api.github.com/orgs/dataquestio/hooks', 'issues_url': 'https://api.github.com/orgs/dataquestio/issues', 'members_url': 'https://api.github.com/orgs/dataquestio/members{/member}', 'public_members_url': 'https://api.github.com/orgs/dataquestio/public_members{/member}', 'avatar_url': 'https://avatars3.githubusercontent.com/u/11148054?v=4', 'description': 'Learn data science online'}]


APIs usually let us retrieve information about specific objects in a database. On the previous screen, for example, we retrieved information about a specific user object, `VikParuchuri`. We could also retrieve information about other GitHub users through the same endpoint. For example, https://api.github.com/users/torvalds would get us information about [Linus Torvalds](https://en.wikipedia.org/wiki/Linus_Torvalds).


In [5]:
response = requests.get("https://api.github.com/users/torvalds", headers=headers)
torvalds = response.json()
torvalds

{'avatar_url': 'https://avatars0.githubusercontent.com/u/1024025?v=4',
 'bio': None,
 'blog': '',
 'company': 'Linux Foundation',
 'created_at': '2011-09-03T15:26:22Z',
 'email': None,
 'events_url': 'https://api.github.com/users/torvalds/events{/privacy}',
 'followers': 63347,
 'followers_url': 'https://api.github.com/users/torvalds/followers',
 'following': 0,
 'following_url': 'https://api.github.com/users/torvalds/following{/other_user}',
 'gists_url': 'https://api.github.com/users/torvalds/gists{/gist_id}',
 'gravatar_id': '',
 'hireable': None,
 'html_url': 'https://github.com/torvalds',
 'id': 1024025,
 'location': 'Portland, OR',
 'login': 'torvalds',
 'name': 'Linus Torvalds',
 'organizations_url': 'https://api.github.com/users/torvalds/orgs',
 'public_gists': 0,
 'public_repos': 4,
 'received_events_url': 'https://api.github.com/users/torvalds/received_events',
 'repos_url': 'https://api.github.com/users/torvalds/repos',
 'site_admin': False,
 'starred_url': 'https://api.gith

In addition to users, the GitHub API has a few other types of objects. For example, https://api.github.com/orgs/dataquestio will retrieve information about the Dataquest organization on GitHub. https://api.github.com/repos/octocat/Hello-World will give us information about the Hello-World repository that the user octocat owns.

* GitHub offers [full documentation](https://developer.github.com/v3/) for all of the API's endpoints.
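The endpoint patterns above can be captured in a few small helpers. This is only a sketch: the function names are invented for illustration, and the URL shapes are the ones shown in this mission.

```python
BASE_URL = "https://api.github.com"


def user_url(login):
    """Endpoint for a single user object."""
    return f"{BASE_URL}/users/{login}"


def org_url(org):
    """Endpoint for a single organization object."""
    return f"{BASE_URL}/orgs/{org}"


def repo_url(owner, repo):
    """Endpoint for a single repository, identified by its owner and name."""
    return f"{BASE_URL}/repos/{owner}/{repo}"


print(user_url("torvalds"))
print(org_url("dataquestio"))
print(repo_url("octocat", "Hello-World"))
```

Any of these strings can be passed as the first argument to `requests.get`, together with the `headers` dictionary.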

In [8]:
response = requests.get("https://api.github.com/repos/octocat/Hello-World", headers=headers)
hello_world = response.json()
hello_world

{'archive_url': 'https://api.github.com/repos/octocat/Hello-World/{archive_format}{/ref}',
 'archived': False,
 'assignees_url': 'https://api.github.com/repos/octocat/Hello-World/assignees{/user}',
 'blobs_url': 'https://api.github.com/repos/octocat/Hello-World/git/blobs{/sha}',
 'branches_url': 'https://api.github.com/repos/octocat/Hello-World/branches{/branch}',
 'clone_url': 'https://github.com/octocat/Hello-World.git',
 'collaborators_url': 'https://api.github.com/repos/octocat/Hello-World/collaborators{/collaborator}',
 'comments_url': 'https://api.github.com/repos/octocat/Hello-World/comments{/number}',
 'commits_url': 'https://api.github.com/repos/octocat/Hello-World/commits{/sha}',
 'compare_url': 'https://api.github.com/repos/octocat/Hello-World/compare/{base}...{head}',
 'contents_url': 'https://api.github.com/repos/octocat/Hello-World/contents/{+path}',
 'contributors_url': 'https://api.github.com/repos/octocat/Hello-World/contributors',
 'created_at': '2011-01-26T19:01:12Z'

### Sometimes, a request can return a lot of objects. 
This might happen when you're doing something like listing out all of a user's repositories, for example. Returning too much data will take a long time and slow the server down. For example, if a user has 1,000+ repositories, requesting all of them might take 10+ seconds. This isn't a great user experience, so it's typical for API providers to implement pagination. This means that the API provider will only return a certain number of records per page. 
* You can specify the page number that you want to access. To access all of the pages, you'll need to write a loop.

To get the repositories a user has starred (marked as interesting), we can use the following API endpoint:
* https://api.github.com/users/VikParuchuri/starred

We can add two pagination query parameters to it: `page` and `per_page`.
* `page` is the page we want to access
* `per_page` is the number of records we want to see on each page. 

Typically, **API providers enforce a cap on how high per_page can be**, because setting it to an extremely high value defeats the purpose of pagination.

* Check out the [GitHub API documentation on pagination](https://developer.github.com/v3/#pagination).
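The loop needed to walk through every page might look like the sketch below. To keep it runnable without a network connection, the page-fetching step is passed in as a function; `fetch_all_pages`, `fetch_page`, and the fake records are all hypothetical names and data made up for this example.

```python
def fetch_all_pages(fetch_page, per_page=30, max_pages=100):
    """Collect records from every page.

    fetch_page(page, per_page) is a caller-supplied function that
    returns the list of records for one page.
    """
    records = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page, per_page)
        records.extend(batch)
        # A short (or empty) page means there is nothing left to request.
        if len(batch) < per_page:
            break
    return records


# Stand-in for the API: 7 fake records served in pages, no network needed.
FAKE_RECORDS = [{"id": n} for n in range(1, 8)]


def fake_fetch_page(page, per_page):
    start = (page - 1) * per_page
    return FAKE_RECORDS[start:start + per_page]


all_records = fetch_all_pages(fake_fetch_page, per_page=3)
print(len(all_records))
```

Against the real API, `fetch_page` could wrap a call such as `requests.get(url, headers=headers, params={"page": page, "per_page": per_page}).json()`. The `max_pages` cap is a safety net so a mistake can't cause an endless loop of requests.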

In [12]:
params = {"per_page": 1, "page": 2}
response = requests.get("https://api.github.com/users/VikParuchuri/starred", headers=headers, params=params)
page2_repos = response.json()
page2_repos

[{'archive_url': 'https://api.github.com/repos/ooyala/barkeep/{archive_format}{/ref}',
  'archived': False,
  'assignees_url': 'https://api.github.com/repos/ooyala/barkeep/assignees{/user}',
  'blobs_url': 'https://api.github.com/repos/ooyala/barkeep/git/blobs{/sha}',
  'branches_url': 'https://api.github.com/repos/ooyala/barkeep/branches{/branch}',
  'clone_url': 'https://github.com/ooyala/barkeep.git',
  'collaborators_url': 'https://api.github.com/repos/ooyala/barkeep/collaborators{/collaborator}',
  'comments_url': 'https://api.github.com/repos/ooyala/barkeep/comments{/number}',
  'commits_url': 'https://api.github.com/repos/ooyala/barkeep/commits{/sha}',
  'compare_url': 'https://api.github.com/repos/ooyala/barkeep/compare/{base}...{head}',
  'contents_url': 'https://api.github.com/repos/ooyala/barkeep/contents/{+path}',
  'contributors_url': 'https://api.github.com/repos/ooyala/barkeep/contributors',
  'created_at': '2011-09-01T18:30:15Z',
  'default_branch': 'master',
  'deploym

Since we've authenticated with our token, the system knows who we are, and can show us some relevant information without us having to specify our username.

* Making a GET request to https://api.github.com/user will give us information about the user the authentication token is for.

In [13]:
user = requests.get("https://api.github.com/user", headers=headers).json()

In [14]:
user

{'avatar_url': 'https://avatars1.githubusercontent.com/u/10291339?v=4',
 'bio': '[Korea University, Seoul]\r\n - B.A in Media and Communication, 2016\r\n[Com2us, Seoul] \r\n - Staff in Game Business Dept. 2015-2017',
 'blog': 'choigww.github.io',
 'company': None,
 'created_at': '2014-12-24T17:58:55Z',
 'email': 'choigww@gmail.com',
 'events_url': 'https://api.github.com/users/choigww/events{/privacy}',
 'followers': 4,
 'followers_url': 'https://api.github.com/users/choigww/followers',
 'following': 4,
 'following_url': 'https://api.github.com/users/choigww/following{/other_user}',
 'gists_url': 'https://api.github.com/users/choigww/gists{/gist_id}',
 'gravatar_id': '',
 'hireable': True,
 'html_url': 'https://github.com/choigww',
 'id': 10291339,
 'location': None,
 'login': 'choigww',
 'name': 'Kyu Hyung Choi',
 'organizations_url': 'https://api.github.com/users/choigww/orgs',
 'public_gists': 0,
 'public_repos': 5,
 'received_events_url': 'https://api.github.com/users/choigww/recei

## POST requests

So far, we've been making GET requests. We use GET requests to retrieve information from a server (hence the name GET). There are a few other types of API requests.

### For example, we use POST requests to send information (instead of retrieve it), and to create objects on the API's server. 
  * With the GitHub API, **we can use POST requests to create new repositories.**

Different API endpoints choose what types of requests they will accept. 
* Not all endpoints will accept a POST request, and not all will accept a GET request. 
* You'll have to consult the [API's documentation](https://developer.github.com/v3/) to figure out which endpoints accept which types of requests.

We can make POST requests using `requests.post`. **POST requests almost always include data, because we need to send the data** the server will use to create the new object.

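```
payload = {"name": "test"}
# The headers dictionary carries our access token, so GitHub knows whose account to use.
requests.post("https://api.github.com/user/repos", json=payload, headers=headers)
```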

The code above will create a new repository named test under the account of the currently authenticated user. It will **convert the payload dictionary to JSON, and pass it along with the POST request**.

Check out [GitHub's API documentation for repositories](https://developer.github.com/v3/repos/) to see a full list of what data we can pass in with this POST request. Here are a couple of the fields:

* `name` -- Required, the name of the repository
* `description` -- Optional, the description of the repository

A successful POST request will usually return a `201` **status code** indicating that it was able to create the object on the server. Sometimes, the API will return the JSON representation of the new object as the content of the response.
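As a small sketch of how the payload might be assembled, the hypothetical helper below builds the dictionary from the required `name` and the optional `description`, then uses `json.dumps` to show the JSON text such a dictionary corresponds to. The helper name and the sample description are made up for this example.

```python
import json


def build_repo_payload(name, description=None):
    """Build the dictionary to send when creating a repository."""
    if not name:
        raise ValueError("name is required")
    payload = {"name": name}
    # Only include optional keys the caller actually supplied.
    if description is not None:
        payload["description"] = description
    return payload


payload = build_repo_payload("test", description="A practice repository")
# json.dumps shows the JSON text that corresponds to the dictionary.
print(json.dumps(payload))
```

The resulting dictionary can be passed as `json=payload` to `requests.post`, which handles the conversion to JSON itself.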

In [15]:
# Create the data we'll pass into the API endpoint.  While this endpoint only requires the "name" key, there are other optional keys.
payload = {"name": "test-repo-learning-about-apis"}

# We need to pass in our authentication headers!
response = requests.post("https://api.github.com/user/repos", json=payload, headers=headers)

status = response.status_code
status

201

![](img/making-repo-using-api-.png)

## PUT/PATCH Requests

### Sometimes we want to update an existing object, rather than create a new one. 
This is where `PATCH` and `PUT` requests come into play.

* We use PATCH requests when we want to change a few attributes of an object, but don't want to resend the entire object to the server. Maybe we just want to change the name of our repository, for example.

* We use PUT requests to send the complete object we're revising as a replacement for the server's existing version.

In practice, API developers don't always respect this convention. Sometimes API endpoints that accept PUT requests will treat them like PATCH requests, and not require us to send the whole object back.
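The difference between the two conventions can be illustrated with plain dictionaries. The sketch below only simulates what a server following the convention would do; it is not GitHub's implementation, and the repository record is made up.

```python
def apply_patch(stored, changes):
    """PATCH-style update: keep existing attributes, overwrite only those sent."""
    updated = dict(stored)
    updated.update(changes)
    return updated


def apply_put(stored, replacement):
    """PUT-style update: the object sent replaces the stored one entirely."""
    return dict(replacement)


# A made-up repository record for illustration.
repo = {"name": "test", "description": None, "private": False}

patched = apply_patch(repo, {"description": "The best repository ever!"})
replaced = apply_put(repo, {"name": "test", "description": "The best repository ever!"})

print(patched)
print(replaced)
```

After the PATCH-style update the `private` attribute is still present, while the PUT-style update drops it because it was not part of the object that was sent.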

* We send a payload of data with PATCH requests, the same way we do with POST requests:

```
payload = {"description": "The best repository ever!", "name": "test"}
# As with POST, the headers dictionary carries our access token.
response = requests.patch("https://api.github.com/repos/VikParuchuri/test", json=payload, headers=headers)
```

The code above will change the description of the test repository to "The best repository ever!" (we didn't specify a description when we created it).

* A successful PATCH request will usually return a `200` status code.

In [17]:
payload = {"description": "Learning about requests!", "name": "test-repo-learning-about-apis"}
response = requests.patch("https://api.github.com/repos/choigww/test-repo-learning-about-apis", json=payload, headers=headers)
print(response.status_code)

status = response.status_code

200


![](img/patch-description.png)

The final major request type is the `DELETE` request. The DELETE request removes objects from the server. We can use the DELETE request to remove repositories.

```
response = requests.delete("https://api.github.com/repos/VikParuchuri/test", headers=headers)
```

The above code will delete the test repository from GitHub.

A successful DELETE request will usually return a `204` status code indicating that it successfully deleted the object.

Use DELETE requests carefully - it's very easy to remove something important by accident.
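The success codes mentioned in this mission can be gathered into one lookup, which makes it easy to check a response before moving on. This is a hedged sketch: the names are invented for illustration, and a given API may legitimately return other success codes for some endpoints.

```python
# Usual success codes per request type (200 for GET is standard HTTP;
# the others are the codes described in this mission).
EXPECTED_STATUS = {"GET": 200, "POST": 201, "PATCH": 200, "DELETE": 204}


def request_succeeded(method, status_code):
    """Return True if status_code is the usual success code for this method."""
    return EXPECTED_STATUS.get(method.upper()) == status_code


print(request_succeeded("post", 201))
print(request_succeeded("delete", 404))
```

A script could call `request_succeeded("DELETE", response.status_code)` after a deletion and stop with an error message when the result is `False`.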

In [19]:
response = requests.delete("https://api.github.com/repos/choigww/test-repo-learning-about-apis", headers=headers)
print(response.status_code)
status = response.status_code

204


![](img/del-repo.png)