<a href="https://colab.research.google.com/github/PalakPoddar/my-notebooks/blob/main/API_Challenge_Unemployment_(Fall_2022)_SOLUTION.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

> NOTE: this notebook contains solutions to a previous exercise, authored by [Prof. Rossetti](https://github.com/prof-rossetti/intro-to-python)

# Prerequisites

Obtain an [AlphaVantage API Key](https://www.alphavantage.co/support/#api-key). A normal key should be fine; alternatively, you can use one of the professor's "premium" keys.

# Instructions


  1. Read the docs for the [AlphaVantage API](https://www.alphavantage.co/documentation/).
  2. Find the endpoint and corresponding URL for requesting **unemployment data**.
  3. Make an example request in the browser, to verify the data is there and start to visually inspect its structure.
  4. Make the request programmatically via Python.
  5. Parse the response data, to answer the questions below.



> NOTE: Remember to run the setup cell before you try to make a request. It reads your API key with `getpass`, which helps keep your secret credential out of the notebook.


**Further Exploration:**

After answering the questions using JSON-formatted data, answer the same questions again, this time using CSV-formatted data.

> HINT: modify your request URL to specify the `datatype` parameter (see the unemployment endpoint docs)

> HINT: [filtering dataframe by substring](https://stackoverflow.com/questions/11350770/filter-pandas-dataframe-by-substring-criteria)
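
The `datatype` hint above can be sketched as follows. This is a minimal illustration, not the only way to build the URL: the helper name `build_unemployment_url` is hypothetical, and `"demo"` is a placeholder standing in for the `API_KEY` collected in the setup cell.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.alphavantage.co/query"

def build_unemployment_url(api_key, datatype="json"):
    """Build the unemployment request URL for the requested response format."""
    params = {"function": "UNEMPLOYMENT", "apikey": api_key, "datatype": datatype}
    return f"{BASE_URL}?{urlencode(params)}"

# "demo" is a placeholder; substitute the API_KEY from the setup cell.
csv_url = build_unemployment_url("demo", datatype="csv")
print(csv_url)
```

Passing the parameters as a dictionary keeps the JSON and CSV requests identical apart from the one value that changes.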


# Challenges

**Challenge A**

What is the most recent unemployment rate? And the corresponding date? Display the unemployment rate using a percent sign.

> NOTE: assume the most recent data is first (i.e. no need to sort, although it may be helpful / safe to do so)

> HINT: use a format string or string concat to help with the percent sign display 
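
The note and hint above can be tried on a small sample before touching live data. The sketch below assumes records shaped like the items in the response's `"data"` list (a `"date"` string and a `"value"` string); the three sample records are deliberately shuffled to show why sorting is the safe option.

```python
# Sample records shaped like the API's "data" list (values are strings).
sample_data = [
    {"date": "2022-07-01", "value": "3.5"},
    {"date": "2022-09-01", "value": "3.5"},
    {"date": "2022-08-01", "value": "3.7"},
]

# ISO-formatted dates (YYYY-MM-DD) sort chronologically as plain strings.
sorted_data = sorted(sample_data, key=lambda d: d["date"], reverse=True)
latest = sorted_data[0]

# Format string version:
message = f"{latest['value']}% as of {latest['date']}"

# String concatenation version (works because the value is already a string):
message_concat = latest["value"] + "% as of " + latest["date"]

print(message)
```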


**Challenge B**

What is the average unemployment rate for all months during this calendar year? How many months are included?

> HINT: use a filtering operation to keep only the datapoints we care about. For example, your filter condition could check whether each date contains the year as a substring.

> HINT: you might need to convert the string values to float datatype
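
One way to combine the two hints above is a small helper that filters by year and converts to float in a single pass. This is a sketch under assumptions: `rates_for_year` is a hypothetical name, it matches on the start of the date string rather than anywhere inside it, and the 2021 record is a made-up placeholder included only to show that other years are excluded.

```python
from datetime import date
from statistics import mean

def rates_for_year(data, year):
    """Return the float rates for records whose date falls in the given year."""
    prefix = f"{year}-"
    return [float(d["value"]) for d in data if d["date"].startswith(prefix)]

sample_data = [
    {"date": "2022-02-01", "value": "3.8"},
    {"date": "2022-01-01", "value": "4.0"},
    {"date": "2021-12-01", "value": "3.9"},  # made-up placeholder record
]

rates = rates_for_year(sample_data, 2022)
print("MONTHS:", len(rates))
print("AVERAGE:", mean(rates))

# With live data, the year need not be hard-coded:
current_year = date.today().year
```

Taking the year as a parameter means the same function answers "this calendar year" whenever the notebook is re-run, by passing `current_year`.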

**Challenge C**

Plot a line chart of unemployment rates over time.

> HINT: use the [plotly package](https://github.com/prof-rossetti/intro-to-python/blob/main/notes/python/packages/plotly.md)!

> HINT: you might need to convert the string values to float datatype


# Setup

In [None]:
from getpass import getpass

API_KEY = getpass("Please input your AlphaVantage API Key: ") 


Please input your AlphaVantage API Key: ··········


# Solution (JSON)

In [None]:
import requests
import json
from pprint import pprint

request_url = f"https://www.alphavantage.co/query?function=UNEMPLOYMENT&apikey={API_KEY}"

response = requests.get(request_url)

# Parse the JSON response body into a Python dictionary.
parsed_response = json.loads(response.text)
print(type(parsed_response))
pprint(parsed_response)

In [None]:
data = parsed_response["data"]
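
If the request fails (for example, because of a mistyped key), the parsed response may not contain a `"data"` key, and the lookup above would raise a bare `KeyError`. A minimal sketch of a friendlier check follows; `extract_data` is a hypothetical helper, and both payloads are invented for illustration rather than copied from real API responses.

```python
def extract_data(parsed_response):
    """Return the list stored under "data", or raise if the payload has none."""
    if "data" not in parsed_response:
        # Include the payload so any message from the API stays visible.
        raise ValueError(f"No 'data' key in response: {parsed_response}")
    return parsed_response["data"]

# Hypothetical payloads, for illustration only.
ok_payload = {"data": [{"date": "2022-09-01", "value": "3.5"}]}
bad_payload = {"message": "something went wrong"}

print(extract_data(ok_payload)[0])

try:
    extract_data(bad_payload)
except ValueError as err:
    print("Request failed:", err)
```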

In [None]:
# Challenge A
#
# What is the most recent unemployment rate? And the corresponding date? 
# Display the unemployment rate using a percent sign.

print("-------------------------")
print("LATEST UNEMPLOYMENT RATE:")
#print(data[0])
print(f"{data[0]['value']}%", "as of", data[0]["date"])

-------------------------
LATEST UNEMPLOYMENT RATE:
3.5% as of 2022-09-01


In [None]:

# Challenge B
# 
# What is the average unemployment rate for all months during this calendar year?
# ... How many months does this cover?

from statistics import mean

this_year = [d for d in data if "2022-" in d["date"]]

rates_this_year = [float(d["value"]) for d in this_year]
#print(rates_this_year)

print("-------------------------")
print("AVG UNEMPLOYMENT THIS YEAR:", f"{mean(rates_this_year)}%")
print("NO MONTHS:", len(this_year))

-------------------------
AVG UNEMPLOYMENT THIS YEAR: 3.6555555555555554%
NO MONTHS: 9


In [None]:
# Challenge C
# 
# Plot a line chart of unemployment rates over time.

from plotly.express import line

dates = [d["date"] for d in data]
rates = [float(d["value"]) for d in data]

fig = line(
    x=dates,
    y=rates,
    title="United States Unemployment Rate over time",
    labels={"x": "Month", "y": "Unemployment Rate"},
)
fig.show()

# Solution (CSV)

In [None]:
from pandas import read_csv

request_url = f"https://www.alphavantage.co/query?function=UNEMPLOYMENT&apikey={API_KEY}&datatype=csv"

df = read_csv(request_url)

print(df.head())
print(df.columns)
print(len(df))

    timestamp  value
0  2022-09-01    3.5
1  2022-08-01    3.7
2  2022-07-01    3.5
3  2022-06-01    3.6
4  2022-05-01    3.6
Index(['timestamp', 'value'], dtype='object')
897


In [None]:
# Challenge A
#
# What is the most recent unemployment rate? And the corresponding date? 
# Display the unemployment rate using a percent sign.

print("-------------------------")
print("LATEST UNEMPLOYMENT RATE:")
first_row = df.iloc[0]
#print(first_row)
print(f"{first_row['value']}%", "as of", first_row["timestamp"])

-------------------------
LATEST UNEMPLOYMENT RATE:
3.5% as of 2022-09-01


In [None]:

# Challenge B
# 
# What is the average unemployment rate for all months during this calendar year?
# ... How many months does this cover?

# https://stackoverflow.com/questions/11350770/filter-pandas-dataframe-by-substring-criteria
this_year_df = df[df["timestamp"].str.contains("2022-")]
print(this_year_df)

print("-------------------------")
print("AVG UNEMPLOYMENT THIS YEAR:", f"{this_year_df['value'].mean()}%")
print("NO MONTHS:", len(this_year_df))

    timestamp  value
0  2022-09-01    3.5
1  2022-08-01    3.7
2  2022-07-01    3.5
3  2022-06-01    3.6
4  2022-05-01    3.6
5  2022-04-01    3.6
6  2022-03-01    3.6
7  2022-02-01    3.8
8  2022-01-01    4.0
-------------------------
AVG UNEMPLOYMENT THIS YEAR: 3.6555555555555563%
NO MONTHS: 9


In [None]:
# Challenge C
# 
# Plot a line chart of unemployment rates over time.

from plotly.express import line

fig = line(
    x=df["timestamp"],
    y=df["value"],
    title="United States Unemployment Rate over time",
    labels={"x": "Month", "y": "Unemployment Rate"},
)
fig.show()

# Cleanup Prep

In [None]:
#def format_pct(my_number:float) -> str:

def format_pct(my_number):
    """
    Formats a number like 3.6555554 as a percentage string, rounded to two decimal places.

    Param my_number (float) like 3.6555554

    Returns (str) like '3.66%'
    """
    return f"{my_number:.2f}%"


In [None]:
print(format_pct(3.65554)) #> 3.66%

print(format_pct(25.4)) #> 25.40%

result = format_pct(25.4)
print(result) #> 25.40%

3.66%
25.40%
25.40%


In [None]:
assert format_pct(3.65554) == '3.66%'

assert format_pct(25.4) == '25.40%'

result = format_pct(25.4)
assert result == '25.40%'
#assert result == 'oops%'
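
As a possible next step, the helper could tidy up the Challenge B output, which currently prints the average with many decimal places. The sketch below repeats `format_pct` so it runs on its own, and uses the nine 2022 values shown in the CSV output earlier in this notebook.

```python
from statistics import mean

def format_pct(my_number):
    # Same helper as above, repeated so this sketch runs on its own.
    return f"{my_number:.2f}%"

# The nine 2022 values shown in the CSV output above.
rates_this_year = [3.5, 3.7, 3.5, 3.6, 3.6, 3.6, 3.6, 3.8, 4.0]

summary = f"AVG UNEMPLOYMENT THIS YEAR: {format_pct(mean(rates_this_year))}"
print(summary)
```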