# Finding Web APIs & Storing Data with Python (Rutgers Example)

In this tutorial, we’ll learn a practical three-step workflow for fetching data from web APIs and storing it in a database:

1. **Observe** what API request a website makes (using Chrome DevTools)
2. **Recreate** that same request in Python (with `requests`)
3. **Store** the data in a local **SQLite** database

We’ll use the [Rutgers Schedule of Classes website](https://classes.rutgers.edu/soc/#home) as a real example, but the process applies to almost any website.


### What you’ll end with
By the end, you’ll have:
- A working Python script (inside this notebook) that fetches real data from an API endpoint
- A SQLite database file saved locally in the `data/` folder
- A repeatable pattern you can use on other websites


## Before we start (quick notes)

- This tutorial is for **educational purposes**. Always respect a website’s Terms of Service and avoid sending excessive requests.
- When you recreate API requests outside the browser (such as with Python), some things may not work exactly the same way. This is normal. The goal is to learn the method, not just memorize one endpoint.
- This is a very basic introduction. Real-world APIs can be more complex, requiring authentication, rate limiting, and other considerations.

**Now let's get into the tutorial!**

## Background: What is an API?

An **API** (Application Programming Interface) is a structured way for one program to communicate with another.

When you use a modern website, your browser often does this behind the scenes:
- You click a button (like “Search”)
- The site sends a request to an API endpoint, which is handled by the backend server
- The server responds with data (often in JSON format)
- The browser uses that JSON to update the page

That API request is what we’re going to “copy” — not by scraping the page HTML, but by calling the same endpoint directly.

If you want to learn more about APIs in detail, check out this article posted by AWS: [What is an API?](https://aws.amazon.com/what-is/api/)


## Background: What makes up a network request?

When you inspect a network request, you'll see it has several key components:

### Overview of components

##### 1) Method
This tells the server what type of request is being made. The most common methods are **GET** and **POST**.
- GET requests usually retrieve data, with the parameters in the URL
- POST requests usually send data in the request body

##### 2) URL (Endpoint)
This is the web address where the request is sent.
- It often looks like `https://example.com/api/resource`


##### 3) Parameters
These control what data you get back in the case of a GET request, or what data you send in the case of a POST request.
- For GET requests, parameters are often appended to the URL as a query string (e.g., `?key=value&key2=value2`, giving the full URL `https://example.com/api/resource?key=value&key2=value2`)
- For POST requests, parameters are often in the request body (e.g., JSON or form data)
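
To make the GET/POST difference concrete, here is a minimal sketch that uses `requests` to build (but not send) one request of each kind. The `example.com` endpoint and the `key`/`key2` parameters are placeholders for illustration, not a real API.

```python
import requests

# Placeholder endpoint and parameters for illustration only (not a real API).
ENDPOINT = "https://example.com/api/resource"
PARAMS = {"key": "value", "key2": "value2"}

# GET: the parameters are encoded into the URL as a query string.
get_request = requests.Request("GET", ENDPOINT, params=PARAMS).prepare()
print(get_request.method, get_request.url)

# POST: the same parameters travel in the request body, here as JSON.
post_request = requests.Request("POST", ENDPOINT, json=PARAMS).prepare()
print(post_request.method, post_request.url)
print(post_request.headers["Content-Type"], post_request.body)
```

Calling `.prepare()` only assembles the request, so this is a safe way to see what would be sent before contacting any server.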

##### 4) Headers
These provide additional information about the request which can affect how the server processes it. Common headers include:
- `Content-Type`: Indicates the media type of the request body, which matters mainly for POST requests (e.g., `application/json`)
- `Accept`: Specifies the response formats the client can handle (e.g., `application/json`)
- `Authorization`: Contains credentials for authentication (if required)
- `User-Agent`: Identifies the client software making the request
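
As a rough illustration of how headers end up on a request, the sketch below builds a request with `requests` without sending it and prints the headers that would go out. The URL is a placeholder; the `Accept` value is the one this tutorial uses later.

```python
import requests

# The only custom header this tutorial sets.
HEADERS = {"Accept": "application/json"}

# Placeholder URL; nothing is sent, we only build the request.
request = requests.Request("GET", "https://example.com/api/resource", headers=HEADERS)

# A Session merges its own default headers with the ones we pass in.
prepared = requests.Session().prepare_request(request)

for name, value in prepared.headers.items():
    print(f"{name}: {value}")
```

Note that the `User-Agent` printed here identifies `python-requests` rather than Chrome, which is one reason a request made from Python can behave differently from the same request made in the browser.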


##### 5) Response
This is the data the server sends back after processing the request. For APIs, it’s often in **JSON** format, but it can also be XML, HTML, or plain text.
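
For a sense of what "JSON format" means in practice, here is a small sketch that parses a JSON string with Python's built-in `json` module. The response text and its fields (`title`, `credits`, `open`) are made up for illustration; a real API returns its own fields.

```python
import json

# Made-up response body; real APIs return their own fields.
response_text = '[{"title": "Intro to Example Studies", "credits": 3, "open": true}]'

# JSON arrays become Python lists, JSON objects become dicts.
parsed = json.loads(response_text)

first = parsed[0]
print(type(parsed).__name__, type(first).__name__)
print(first["title"], first["credits"], first["open"])
```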



## Inspecting Network Requests on the Schedule of Classes website with Chrome DevTools

Modern websites rarely load all their data at once. Instead, they make **network requests** to fetch data dynamically when you interact with the page.

To find the API request we want to recreate, we’ll use **Chrome DevTools**.

### Step 1: Open Chrome DevTools
1. Open Google Chrome and navigate to the [Rutgers Schedule of Classes website](https://classes.rutgers.edu/soc/#home).
2. Right-click anywhere on the page and select **Inspect** to open DevTools.
3. Click on the **Network** tab in DevTools to monitor network activity.

### Step 2: Filter for XHR/Fetch Requests

1. In the Network tab, find the filter bar and click the **Fetch/XHR** filter (older Chrome versions label it **XHR**). This hides page assets such as images and stylesheets so that mostly API requests remain.
2. At the top of the Network panel, check **Disable cache** so that every request is actually sent and shows up in the list. (With caching on, the browser can reuse locally stored data instead of sending the request again.)

### Step 3: Perform a Search on the Website

1. On the Schedule of Classes website, fill out the search form (select a term, location, and level) and click the **Continue** button to submit the form.
2. Watch the Network tab for new requests that appear after you click **Continue**.

### Step 4: Identify the Relevant API Request
1. Look for a request that seems to correspond to the search you just performed.
2. Click on that request and confirm that it returns the data you expect (by checking the **Response** tab).
   - If you can't see any data in the Response tab, right-click the request and select **Open in new tab** to see the full JSON response.

### Step 5: Note down the request details
1. Copy the **URL** (Endpoint) by right-clicking the request and selecting **Copy > Copy URL**.
2. Navigate to the **Headers** tab of the selected request.
3. Note down the **Method** (GET or POST).
4. Scroll down to the **Request Headers** section and copy important headers like `Content-Type`, `Accept`, and any others that seem relevant. (For this example, the code below only sets the `Accept` header.)
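
Once you have copied a URL, it can help to split it into the endpoint and its query parameters so you can see which values you might change later. The sketch below does this with the standard library, using the example Schedule of Classes URL that appears later in this tutorial; the variable names are just for illustration.

```python
from urllib.parse import parse_qsl, urlsplit

# Example URL as copied from DevTools.
copied_url = "https://classes.rutgers.edu/soc/api/courses.json?year=2026&term=1&campus=NB"

parts = urlsplit(copied_url)

# Endpoint without the query string.
base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"

# Query string as a dict of parameter names to values (all strings).
params = dict(parse_qsl(parts.query))

print(base_url)
print(params)
```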

Tip: An app called Postman can help you test and debug API requests interactively. It's worth trying, especially if you plan on working with APIs frequently. You can download it at https://www.postman.com/downloads/, and here's a short 4-minute [introduction video](https://www.youtube.com/watch?v=aHvsOiHK-tI) on the basics of using Postman.








## Recreating the API Request in Python

Now that we have all the details of the API request, we can call it directly ourselves without going through the Schedule of Classes website. This lets us fetch data programmatically and use it however we want.

The goal of this section is to recreate the same request we observed in DevTools using Python's `requests` library and print out the JSON response.

We'll start by importing the necessary libraries and setting up the request details we noted down earlier.

### Step 0: Import Libraries and Define Request Details

First, define the request details we noted down earlier in DevTools, such as the URL and headers.

TODO: 
Paste the API request URL you copied from DevTools into the URL variable below.

In [None]:
import requests
import json

TODO = 0  # Placeholder value so lines that still say TODO can run without a NameError

# Define the request details we noted down earlier

URL = ''  # TODO: Paste the API request URL you copied from DevTools here

HEADERS = {
    "Accept": "application/json",  # Tells the server we expect JSON data in response
}

### Step 1: Make the request to the API endpoint and check if it was successful

An important part of working with new libraries is reading the documentation. The [requests documentation](https://docs.python-requests.org/en/latest/) is very well written and easy to follow. If you ever get stuck or confused about how to use a certain function, the documentation is a great place to start!

TODO: 
1. Fill in the required parameters of the `requests.get()` function to make the GET request to the API endpoint. (HINT: you will need the `URL` and `HEADERS` variables we defined earlier.)
2. Print the status code of the response to confirm the request was successful. A status code of 200 means OK; if you get anything else, double-check the URL and headers you are using.
3. Parse the JSON response using `response.json()`, store it in the `data` variable, and print it out.


In [None]:

# TODO:
# 1. Fill in the required parameters
# 2. Print the status code of the response
# 3. Parse the JSON response, store it in the 'data' variable, and print it out.

response = requests.get()  # TODO: Fill in the required parameters
# response = requests.get('https://classes.rutgers.edu/soc/api/courses.json?year=2026&term=1&campus=NB', headers=HEADERS)

data = TODO  # TODO: Replace TODO with the parsed JSON response
print("Status Code:", TODO)  # TODO: Replace TODO with the appropriate variable to print the status code
print("JSON Response:", TODO)  # TODO: Replace TODO with the appropriate variable to print the JSON response




Status Code: 200
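
Once the basic request works, you may want a slightly more defensive version. The sketch below is one possible pattern, not the only way to do it: `fetch_json` is a hypothetical helper name, and the 10-second timeout is an arbitrary choice.

```python
import requests


def fetch_json(url, headers=None, timeout=10):
    """Send a GET request and return the parsed JSON body.

    Raises requests.HTTPError for 4xx/5xx responses instead of
    returning the body of an error page.
    """
    # The timeout keeps the call from hanging forever if the server is slow.
    response = requests.get(url, headers=headers, timeout=timeout)
    # Does nothing on success; raises requests.HTTPError on 4xx/5xx.
    response.raise_for_status()
    return response.json()
```

In the notebook, you could call it as `data = fetch_json(URL, headers=HEADERS)` once `URL` has been filled in.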


### Step 2: Parse and explore the JSON data

Now that we have the JSON data, we can start exploring its structure to understand how to extract the information we need.
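
A quick way to get oriented in an unfamiliar response is to check types, lengths, and keys before reading individual values. The helper below is a minimal sketch of that idea; `summarize` is a hypothetical name and the sample data is made up, so the real response will show different fields.

```python
def summarize(value, name="data"):
    """Return lines describing a parsed JSON value (lists: first item only)."""
    if isinstance(value, list):
        lines = [f"{name}: list with {len(value)} item(s)"]
        if value:
            # Look inside the first item to see what each entry contains.
            lines += summarize(value[0], f"{name}[0]")
        return lines
    if isinstance(value, dict):
        return [f"{name}: dict with keys {sorted(value)}"]
    return [f"{name}: {type(value).__name__} = {value!r}"]


# Made-up sample; the real response has its own fields.
sample = [{"title": "Intro to Example Studies", "credits": 3}]
print("\n".join(summarize(sample)))
```

In the notebook, you would pass the `data` variable from Step 1 instead of `sample`.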

In this tutorial we are going to extract all the 