# Working on a Project

### Introduction

While we do not formally require a side project in this course, many students find working on a project a very effective way to reinforce what they learned in class.

### Getting Started

To get started on a project, it is often useful to consider a topic or dataset that you are interested in, and then pull some data.

> We do not recommend spending too much time considering and reconsidering a project topic.  The whole point of this is to reinforce learning, and your project will evolve over time.

So how do you pull some data?  A good way is to find an API, a mechanism through which a company or organization makes its data available, and then use the `requests` library to access data from that API.

### Let's see it

For example, many cities provide lots of information through an "open data" website, and from there you can access their API.

NYC's open data website is [here](https://data.cityofnewyork.us/).  Click on it, and on the left-hand side you will see a list of topics.  If we click on `education`, we can find the 2012 SAT results.

> You can also just click the link [here](https://data.cityofnewyork.us/Education/2012-SAT-Results/f9bf-2cp4).

From there, you can find the API by clicking on the API tab towards the left, and then copying the API endpoint shown towards the bottom of the popup.

<img src="./api-work.png" width="100%">

Ok, now it's time to call the API.

### Getting data from the API

The API endpoint that we copied above is just the URL where we can access the API.  We can get data from that URL with the following.

In [1]:
import requests
url = "https://data.cityofnewyork.us/resource/f9bf-2cp4.json"
response = requests.get(url)
schools = response.json()

And then if we look at our data, we see that we are back to our list of dictionaries.

In [3]:
schools[:1]

[{'dbn': '01M292',
  'school_name': 'HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES',
  'num_of_sat_test_takers': '29',
  'sat_critical_reading_avg_score': '355',
  'sat_math_avg_score': '404',
  'sat_writing_avg_score': '363'}]

> **Note:** Above, we got back a list of dictionaries, but some APIs will return a dictionary instead of a list.  It's the developer's responsibility to look at the structure of the data that the API returns.
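
One way to check the structure is to look at the type of the parsed JSON before working with it.  The sketch below is only an illustration: `describe_structure` is a hypothetical helper (it is not part of `requests`), and `sample` is a one-record copy of the output shown above, so the code runs without calling the API.

```python
def describe_structure(data):
    """Summarize the top-level shape of data returned by response.json()."""
    if isinstance(data, list):
        # A list of records: report how many there are and the first record's keys.
        first = data[0] if data else {}
        keys = sorted(first) if isinstance(first, dict) else []
        return {"type": "list", "length": len(data), "keys": keys}
    if isinstance(data, dict):
        # A single dictionary: the records are often nested under one of its keys.
        return {"type": "dict", "length": len(data), "keys": sorted(data)}
    return {"type": type(data).__name__, "length": None, "keys": []}


# One record copied from the output above, standing in for response.json().
sample = [{'dbn': '01M292',
           'school_name': 'HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES',
           'num_of_sat_test_takers': '29',
           'sat_critical_reading_avg_score': '355',
           'sat_math_avg_score': '404',
           'sat_writing_avg_score': '363'}]

summary = describe_structure(sample)
print(summary["type"], summary["length"])
print(summary["keys"])
```

If the summary reports a `dict` rather than a `list`, the next step is usually to look through its keys to find where the records are stored.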

Let's break down the code above and look one more time at how we called the API.

In [4]:
import requests # import the requests library
url = "https://data.cityofnewyork.us/resource/f9bf-2cp4.json"
response = requests.get(url) # request from the url

The first line above imports the `requests` library so that we can call the API.  Then we call the `requests.get` function, passing in the URL as an argument, and assign the result to the variable `response`.

In [5]:
response

<Response [200]>
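
The `200` in the output above is the HTTP status code, and it means the request succeeded.  As a minimal sketch, we could check that code before parsing the data.  Here `json_or_raise` is a hypothetical helper, not part of `requests`, and `FakeResponse` is a stand-in with the same `status_code` attribute and `json()` method, so the example runs without a network call.

```python
def json_or_raise(response):
    """Return the parsed JSON body, or raise if the request did not succeed."""
    if response.status_code != 200:
        raise RuntimeError(f"Request failed with status {response.status_code}")
    return response.json()


class FakeResponse:
    """Hypothetical stand-in for the object that requests.get returns."""

    def __init__(self, status_code, payload):
        self.status_code = status_code
        self._payload = payload

    def json(self):
        return self._payload


# A successful reply: the parsed data is returned.
ok = FakeResponse(200, [{"dbn": "01M292"}])
print(json_or_raise(ok))

# A failed reply: an error is raised instead of returning unusable data.
missing = FakeResponse(404, {"message": "Not found"})
try:
    json_or_raise(missing)
except RuntimeError as error:
    print(error)
```

With a real call, this would look like `schools = json_or_raise(requests.get(url))`.  The `requests` library also offers `response.raise_for_status()`, which raises an exception for 4xx and 5xx status codes.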

The `response` variable holds a `Response` object, which represents the server's reply, and inside that object is our data.  To get the data out as Python lists and dictionaries, we call `response.json()` and assign the result to a variable.

In [6]:
schools = response.json()
schools[:2]

[{'dbn': '01M292',
  'school_name': 'HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES',
  'num_of_sat_test_takers': '29',
  'sat_critical_reading_avg_score': '355',
  'sat_math_avg_score': '404',
  'sat_writing_avg_score': '363'},
 {'dbn': '01M448',
  'school_name': 'UNIVERSITY NEIGHBORHOOD HIGH SCHOOL',
  'num_of_sat_test_takers': '91',
  'sat_critical_reading_avg_score': '383',
  'sat_math_avg_score': '423',
  'sat_writing_avg_score': '366'}]
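
Notice that the numeric values above come back as strings (for example `'29'`), so they need converting before we can do arithmetic with them.  The following is a rough sketch rather than the definitive approach: `to_int` and `clean_school` are hypothetical helper names, the two records are copied from the output above, and returning `None` for values that cannot be converted is an assumption about how one might handle missing scores.

```python
SCORE_KEYS = ['num_of_sat_test_takers',
              'sat_critical_reading_avg_score',
              'sat_math_avg_score',
              'sat_writing_avg_score']


def to_int(value):
    """Convert a value such as '404' to an int; return None if that is not possible."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def clean_school(school):
    """Return a copy of one record with its numeric fields converted to ints."""
    cleaned = dict(school)
    for key in SCORE_KEYS:
        cleaned[key] = to_int(school.get(key))
    return cleaned


# The two records shown above, standing in for schools[:2].
sample_schools = [
    {'dbn': '01M292',
     'school_name': 'HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES',
     'num_of_sat_test_takers': '29',
     'sat_critical_reading_avg_score': '355',
     'sat_math_avg_score': '404',
     'sat_writing_avg_score': '363'},
    {'dbn': '01M448',
     'school_name': 'UNIVERSITY NEIGHBORHOOD HIGH SCHOOL',
     'num_of_sat_test_takers': '91',
     'sat_critical_reading_avg_score': '383',
     'sat_math_avg_score': '423',
     'sat_writing_avg_score': '366'},
]

cleaned_schools = [clean_school(school) for school in sample_schools]
math_scores = [school['sat_math_avg_score'] for school in cleaned_schools]
print(math_scores)                          # [404, 423]
print(sum(math_scores) / len(math_scores))  # 413.5
```

Because `clean_school` works on a copy, the original records from the API stay untouched, which makes it easy to compare the raw and cleaned data while exploring.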

### Other APIs

So how do you find these APIs?  Well, you can Google, of course.  You can also browse [this list](https://github.com/public-apis/public-apis).  One thing to note is that some of the APIs require authentication.  We'll learn how to authenticate with an API in this course, but it can be tricky if you've never done it.  So perhaps start with an API that does not require authentication.

> For example, in the list of APIs there is an `Auth` column, and we can look for entries where it says `No`.

> <img src="./auth-no.png" width="70%">

Later, once you are more familiar with API authentication, you can always add an API that requires it.

### Need a quick idea?

Finally, if you don't have a project idea, a really interesting API is the Texas Mixed Drink Receipts API.  It contains the reports that Texas establishments file about the amount of alcohol they sell each month.  You can find documentation on the API [here](https://dev.socrata.com/foundry/data.texas.gov/naix-2893).  And you can call the API with the following.

In [7]:
import requests
url = "https://data.texas.gov/api/views/naix-2893/rows.json"
response = requests.get(url)
receipt_data = response.json()