 # Assignment 1-2: Data Collection Using Web APIs

 ## Objective

 Many websites (such as Twitter, Yelp, Spotify) provide free APIs to allow users to access their data. In this assignment, you will learn the following:



 * How to ask insightful questions about data.

 * How to collect data from Web APIs using standard Python libraries.



 **Requirements:**



 1. Use [pandas.DataFrame](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html) to manipulate data.



 2. Follow the Python code style guide, PEP 8 (https://www.python.org/dev/peps/pep-0008/). If your code is hard to read, you may lose points. This requirement applies for the whole semester.

 ## Preliminary

 To complete this assignment, you will use Python libraries such as:



 - `requests` to make HTTP requests to the API.

 - `json` to parse JSON responses.

 - `pandas` to process and manipulate data.
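
As a rough sketch of how these libraries fit together, the snippet below parses a JSON string and loads it into a DataFrame. The payload is a made-up stand-in for an API response (the field names `results`, `name`, and `rating` are assumptions, not taken from any real API), so no network call is needed. With a live API, `requests.get(...).json()` would give you the same kind of dictionary.

```python
import json

import pandas as pd

# Hypothetical payload standing in for the body of an API response.
raw_body = (
    '{"results": ['
    '{"name": "A", "rating": 4.5}, '
    '{"name": "B", "rating": 3.0}]}'
)

payload = json.loads(raw_body)         # JSON text -> Python dict
df = pd.DataFrame(payload["results"])  # list of records -> DataFrame

print(df)
```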



 Before starting, you can refer to these tutorials:



 * [Python `requests` library](https://realpython.com/python-requests/)

 * [Working with JSON in Python](https://realpython.com/python-json/)

 * [Pandas basics](https://pandas.pydata.org/docs/user_guide/10min.html)

## Finding APIs



 If you're unsure which API to explore, the following resource is a good starting point:

 - [Public APIs GitHub Repository](https://github.com/public-apis/public-apis): A curated list of free APIs for development.



 Make sure the API you choose aligns with the assignment requirements and provides sufficient data to answer your questions.

 ## Overview

 This is a **group** assignment.

 Please check your group in this [Spreadsheet](https://1sfu-my.sharepoint.com/:x:/g/personal/sbergner_sfu_ca/EfiqKEqv4_pGgGjG0CvYBN4BvNM4FnJ-SvBAkIqVKN-iJA?e=LhVSqk).



 To complete this assignment, your group needs to go through the following steps:



 1. Select a new Web API.

 2. Explore the API documentation to understand its capabilities and endpoints.

 3. Formulate four questions that can be answered using the API.

 4. Write Python code to query the API and answer these questions.



 ### Step 3: Formulating Questions

 Good questions should:

 - Be **useful**, answering common or novel data use cases.

 - Be **diverse**, covering various endpoints or use cases.

 - Have a range of **difficulty**, from simple (few parameters) to complex (multiple parameters or computations).

 ### Example Questions (Yelp API)



 * Q1. What's the phone number of Capilano Suspension Bridge Park?

 * Q2. Which yoga store has the highest review count in Vancouver?

 * Q3. How many Starbucks stores are in Seattle and where are they located?

 * Q4. What are the ratings for a list of restaurants?



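 These questions range from a single lookup (Q1) to queries that require sorting, counting, or combining several results (Q2 to Q4), so together they cover different levels of usefulness, diversity, and difficulty.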
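
To make a question like Q2 concrete, here is a minimal sketch of the pandas step that would follow the API call. The records are invented placeholders, and the field names `name` and `review_count` are assumptions about what a business-search response might contain. Check your API's documentation for the actual schema.

```python
import pandas as pd

# Made-up records for illustration only; real values come from the API.
businesses = [
    {"name": "Studio A", "review_count": 120},
    {"name": "Studio B", "review_count": 340},
    {"name": "Studio C", "review_count": 85},
]

df = pd.DataFrame(businesses)

# Select the row with the largest review count.
top = df.loc[df["review_count"].idxmax()]
print(top["name"], top["review_count"])
```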

 ## Now, it's your turn! :)

 ### Instructions:

 1. Choose an API and obtain access credentials (e.g., API key).

 2. Write Python functions to query the API.

 3. Answer each question using the API data.

 4. Use `pandas` to format and display your results.
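
For step 4, JSON responses are often nested, which makes a plain `pd.DataFrame(...)` awkward to read. One possible approach, sketched below on hypothetical records (the `location.city` structure is an assumption, not part of any specific API), is to flatten the records with `pd.json_normalize` and then keep only the columns needed for the answer.

```python
import pandas as pd

# Hypothetical nested records, shaped like a typical JSON API response.
records = [
    {"name": "Place A", "rating": 4.0, "location": {"city": "Vancouver"}},
    {"name": "Place B", "rating": 4.8, "location": {"city": "Burnaby"}},
]

# Flatten nested dictionaries into dotted column names.
df = pd.json_normalize(records)

# Keep and rename the columns that matter, highest rating first.
table = (
    df[["name", "location.city", "rating"]]
    .rename(columns={"location.city": "city"})
    .sort_values("rating", ascending=False)
    .reset_index(drop=True)
)
print(table.to_string(index=False))
```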

In [None]:
# Provide your API key here so that TAs can reproduce your results.
API_KEY = "96ea7050c7574dca93f0aae38effe795"


 ### Q0: Write a function to fetch data from the API
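
Before sending real requests, it can help to see how `requests` turns a `params` dictionary into a query string. The sketch below builds the request without sending it, so it needs no network access or API key. The endpoint and parameter names are placeholders, not a real API.

```python
import requests

# Placeholder endpoint and parameters, for illustration only.
endpoint = "https://api.example.com/search"
params = {"term": "coffee", "limit": 5}

# Prepare the request without sending it, then inspect the final URL.
prepared = requests.Request("GET", endpoint, params=params).prepare()
print(prepared.url)
```

Printing the prepared URL is a quick way to check that your parameters are spelled and encoded the way the API documentation expects.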

In [1]:
import requests
import pandas as pd

def fetch_data(endpoint, params, headers=None):
    """
    Fetch data from the given API endpoint with specified parameters.
    
    Args:
        endpoint (str): The API endpoint URL.
        params (dict): Dictionary of query parameters.
        headers (dict): Optional headers for the request.

    Returns:
        dict or None: Parsed JSON response from the API, or None if the
            request fails (the error is printed, not re-raised).
    """
    try:
        response = requests.get(endpoint, params=params, headers=headers, timeout=10)
        response.raise_for_status()  # Raise an error for bad responses (4XX, 5XX)
        return response.json()
    except requests.exceptions.RequestException as e:
        print(f"An error occurred: {e}")
        return None

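# Example call: replace the endpoint, parameters, and headers with those of
# your chosen API. The Bearer header is a placeholder; check your API's
# documentation for the authentication scheme it expects.
response = fetch_data(
    "https://api.spoonacular.com/recipes/complexSearch",
    {"param1": "value1"},
    headers={"Authorization": f"Bearer {API_KEY}"},
)
# print(response)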


 ### Q1: Retrieve business details for a specific place (TODO: change question)

In [None]:
def answer_question_1():
    endpoint = "https://api.example.com/endpoint"
    params = {"query": "example"}
    headers = {"Authorization": f"Bearer {API_KEY}"}
    data = fetch_data(endpoint, params, headers)

    if data:
        df = pd.DataFrame(data["results"])
        print("First 5 results:")
        print(df.head())  # head() returns the first 5 rows by default

answer_question_1()


 ### Q2: Analyze and sort data by a specific attribute
TODO: change question in title

In [None]:
def answer_question_2():
    endpoint = "https://api.example.com/another_endpoint"
    params = {"search": "example_search"}
    headers = {"Authorization": f"Bearer {API_KEY}"}
    data = fetch_data(endpoint, params, headers)

    if data:
        df = pd.DataFrame(data["results"])
        sorted_df = df.sort_values("rating", ascending=False)  # Sort by rating
        print("Sorted Results:")
        print(sorted_df.head())

answer_question_2()


 ### Q3: Filter results based on a condition
 TODO: change question in title

In [None]:
def answer_question_3():
    endpoint = "https://api.example.com/third_endpoint"
    params = {"location": "example_location"}
    headers = {"Authorization": f"Bearer {API_KEY}"}
    data = fetch_data(endpoint, params, headers)

    if data:
        df = pd.DataFrame(data["results"])
        filtered_df = df[df["rating"] > 4.5]  # Filter by rating > 4.5
        print("Filtered Results:")
        print(filtered_df)

answer_question_3()


 ### Q4: Optional - Visualizing data

In [None]:
import matplotlib.pyplot as plt

def visualize_question_4():
    endpoint = "https://api.example.com/fourth_endpoint"
    params = {"term": "example_term"}
    headers = {"Authorization": f"Bearer {API_KEY}"}
    data = fetch_data(endpoint, params, headers)

    if data:
        df = pd.DataFrame(data["results"])
        # Count each rating, then order the bars by rating value.
        df["rating"].value_counts().sort_index().plot(kind="bar")
        plt.title("Distribution of Ratings")
        plt.xlabel("Rating")
        plt.ylabel("Frequency")
        plt.show()

visualize_question_4()


 ## Submission

 Complete this notebook, rename it to `A1-2.ipynb`, and submit it along with any necessary credentials or configuration files to the CourSys activity [`Assignment 1 - Part 2`](https://coursys.sfu.ca/2025sp-cmpt-733-g1/+a1-2/).

 ## Submission Checklist



 - [ ] Completed notebook file (`A1-2.ipynb`).

 - [ ] Included API keys or other necessary credentials (if applicable).

 - [ ] Verified that questions and answers are documented with meaningful titles.

 - [ ] Optional visualizations are added to enhance insights.