### 1. Choose an API

- a) Choose an API and briefly describe the type of data you can obtain from it. Note: Please do not use any of the APIs we covered in lecture (e.g. NYTimes, Github etc.).

    The API I chose is OpenWeatherMap (https://openweathermap.org/api), which provides current, forecast, and historical weather data in JSON, XML, and HTML formats.
    
    
- b) Provide a link to the API documentation and

    a link to the API documentation: https://openweathermap.org/api
    
  
- c) the base URL of the API you intend to use.

    The base URL of the API: https://api.openweathermap.org/data/2.5/onecall

### 2. Authentication

- a) Briefly explain how the API authenticates the user. 

    After signing up on the website, the user automatically receives an API key. The key is sent with every request as the `appid` query parameter appended to the URL.
    
    
- b) Apply for an API key if necessary and provide the information (with relevant URL) how that can be done. Do not include the API key in the assignment submission.
    
    https://openweathermap.org/api. This page explains how to get an API key: the user signs up on the website and automatically receives an API key that matches their subscription plan.
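The authentication scheme described in 2a can be sketched as follows. This is a minimal illustration using only the standard library; the `build_url` helper and the `YOUR_API_KEY` placeholder are hypothetical names introduced here, and `requests` performs the same encoding when it is given a `params` dictionary.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openweathermap.org/data/2.5/onecall"


def build_url(api_key, lat, lon):
    """Return the request URL with the API key passed as the `appid` query parameter."""
    query = urlencode({"lat": lat, "lon": lon, "appid": api_key})
    return f"{BASE_URL}?{query}"


# "YOUR_API_KEY" is a placeholder, not a real key.
url = build_url("YOUR_API_KEY", 50, 50)
print(url)
```

Because the key travels in the URL, it should be loaded from an environment variable or a `.env` file rather than written into the notebook.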

### 3. Send a Simple GET request

##### a) Execute a simple GET request to obtain a small amount of data from the API. Describe a few query parameters and add them to the query. If you have a choice of the output the API returns (e.g. XML or JSON), I suggest choosing JSON because it is easier to work with. Your output here should include the code for the GET request, including the query parameters, as well as a snippet of the output.

In [225]:
import os
import requests
from dotenv import load_dotenv

load_dotenv(dotenv_path = '/Users/humphreyhan/Documents/GitHub/Xiangyu_Han/hw08/openweather_env') 
token = os.getenv('OPEN_WEATHER_TOKEN')

In [265]:
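# Query parameters: appid is the API key; lat and lon are the latitude and longitude of the requested location
params = {'appid':token, 'lat':'50','lon':'50'}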
r = requests.get('https://api.openweathermap.org/data/2.5/onecall?',params = params)

In [297]:
import json
print(json.dumps(r.json(),indent=2,sort_keys=True))

{
  "current": {
    "clouds": 13,
    "dew_point": 265.79,
    "dt": 1637118248,
    "feels_like": 260.81,
    "humidity": 84,
    "pressure": 1025,
    "sunrise": 1637121361,
    "sunset": 1637153650,
    "temp": 267.81,
    "uvi": 0,
    "visibility": 10000,
    "weather": [
      {
        "description": "few clouds",
        "icon": "02n",
        "id": 801,
        "main": "Clouds"
      }
    ],
    "wind_deg": 258,
    "wind_gust": 13.61,
    "wind_speed": 6.87
  },
  "daily": [
    {
      "clouds": 8,
      "dew_point": 266.94,
      "dt": 1637136000,
      "feels_like": {
        "day": 265.46,
        "eve": 266.17,
        "morn": 260.82,
        "night": 264.47
      },
      "humidity": 69,
      "moon_phase": 0.44,
      "moonrise": 1637151300,
      "moonset": 1637112600,
      "pop": 0,
      "pressure": 1025,
      "sunrise": 1637121361,
      "sunset": 1637153650,
      "temp": {
        "day": 272,
        "eve": 271.34,
        "max": 273.18,
        "min": 267.47
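The temperatures in the output above are in Kelvin (the API's default unit), and fields such as `dt`, `sunrise`, and `sunset` are Unix timestamps in UTC seconds. A small sketch of converting them to more readable values, using numbers copied from the `current` block above; the helper names are illustrative and not part of the API.

```python
from datetime import datetime, timezone


def kelvin_to_celsius(kelvin):
    """Convert a temperature from Kelvin to degrees Celsius, rounded to 2 decimals."""
    return round(kelvin - 273.15, 2)


def unix_to_utc(timestamp):
    """Convert a Unix timestamp (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


# Values copied from the "current" block of the response shown above.
current = {"dt": 1637118248, "temp": 267.81, "feels_like": 260.81}

temp_c = kelvin_to_celsius(current["temp"])
observed_at = unix_to_utc(current["dt"])
print(temp_c, observed_at.isoformat())
```

Alternatively, adding `units=metric` to the query parameters should make the API return Celsius directly, which avoids the manual conversion.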

##### b) Check (and show) the status of the request.

In [258]:
r.status_code

200
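A status code of 200 means the request succeeded. As a rough sketch of how other codes could be turned into readable messages, the standard library's `http.HTTPStatus` maps a code to its standard phrase; the `describe_status` helper is a hypothetical name introduced for this example.

```python
from http import HTTPStatus


def describe_status(status_code):
    """Return a short human-readable description of an HTTP status code."""
    try:
        status = HTTPStatus(status_code)
    except ValueError:
        # Codes that are not registered in the standard library enum
        return f"{status_code} (unknown status code)"
    outcome = "success" if 200 <= status_code < 300 else "error"
    return f"{status_code} {status.phrase} ({outcome})"


for code in (200, 401, 404, 429):
    print(describe_status(code))
```

With `requests`, calling `r.raise_for_status()` is another option: it raises an exception for 4xx and 5xx responses instead of returning silently.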

##### c) Check (and show) the type of the response (e.g. XML, JSON, csv).

In [259]:
r.headers.get('content-type') 

'application/json; charset=utf-8'
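The header value combines the media type with a `charset` parameter. A minimal sketch of extracting just the media type before deciding how to parse the body; `media_type` is a hypothetical helper, and the header string is the one shown above.

```python
def media_type(content_type):
    """Return the media type part of a Content-Type header, without parameters."""
    return content_type.split(";")[0].strip().lower()


header = "application/json; charset=utf-8"
is_json = media_type(header) == "application/json"
print(media_type(header), is_json)
```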

### 4. Parse the response and Create a dataset

##### a) Take the response returned by the API and turn it into a useful Python object (e.g. a list, vector, or pandas data frame). Show the code how this is done.

In [266]:
weather_json = r.json()
print(weather_json.keys())

dict_keys(['lat', 'lon', 'timezone', 'timezone_offset', 'current', 'minutely', 'hourly', 'daily'])


In [283]:
import pandas as pd

# Flatten the list of hourly forecast records into one row per hour
weather_json_df = pd.json_normalize(weather_json['hourly'])
weather_json_df.head()

Unnamed: 0,dt,temp,feels_like,pressure,humidity,dew_point,uvi,clouds,visibility,wind_speed,wind_deg,wind_gust,weather,pop
0,1637118000,267.81,260.81,1025,84,265.79,0.0,13,10000,6.87,258,13.61,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
1,1637121600,267.79,260.79,1025,85,265.91,0.0,13,10000,6.85,259,13.6,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
2,1637125200,267.99,260.99,1025,84,265.97,0.0,12,10000,6.79,260,13.41,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
3,1637128800,268.87,261.87,1025,81,266.41,0.42,11,10000,7.19,262,13.11,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
4,1637132400,270.25,263.25,1024,75,266.87,0.68,7,10000,7.98,270,12.21,"[{'id': 800, 'main': 'Clear', 'description': '...",0
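In the table above, the `weather` column still holds a list of dictionaries in each row. One possible way to flatten it before building the data frame is sketched below in plain Python. The `flatten_hourly` helper and the new column names are assumptions made for this example, and the sample record reuses the fully printed `weather` entry from the `current` block in section 3.

```python
def flatten_hourly(record):
    """Copy one record, replacing the nested `weather` list with flat columns."""
    flat = {key: value for key, value in record.items() if key != "weather"}
    # The API returns a list; this sketch keeps only the first condition.
    conditions = record.get("weather") or [{}]
    first = conditions[0]
    flat["weather_main"] = first.get("main")
    flat["weather_description"] = first.get("description")
    return flat


# Sample record modelled on the "current" block shown in section 3.
sample = {
    "dt": 1637118248,
    "temp": 267.81,
    "weather": [
        {"id": 801, "main": "Clouds", "description": "few clouds", "icon": "02n"}
    ],
}

flat = flatten_hourly(sample)
print(flat)
```

A list of such flattened dictionaries can be passed straight to `pd.DataFrame`, which then yields one plain column per field.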


##### b) Using the API, create a dataset (in data frame format) for multiple records. I'd say a sample size greater than 100 is sufficient for the example but feel free to get more data if you feel ambitious and the API allows you to do that fairly easily. The dataset can include only a small subset of the returned data. Just choose some interesting features. There is no need to be inclusive here.

In [277]:
params = {'appid':token, 'lat':'40','lon':'116'}
r1 = requests.get('https://api.openweathermap.org/data/2.5/onecall?',params = params)

In [286]:
params = {'appid':token, 'lat':'40','lon':'116'}
r1 = requests.get('https://api.openweathermap.org/data/2.5/onecall?',params = params)
weather_json1 = r1.json()
weather_json_df1 = pd.json_normalize(weather_json1['hourly'])

# Same coordinates as r1, so this request returns hourly records for the same location again
params = {'appid':token, 'lat':'40','lon':'116'}
r2 = requests.get('https://api.openweathermap.org/data/2.5/onecall?',params = params)
weather_json2 = r2.json()
weather_json_df2 = pd.json_normalize(weather_json2['hourly'])

weather_df = pd.concat([weather_json_df,weather_json_df1,weather_json_df2])
print(weather_df.shape)
weather_df.head()

(144, 14)


Unnamed: 0,dt,temp,feels_like,pressure,humidity,dew_point,uvi,clouds,visibility,wind_speed,wind_deg,wind_gust,weather,pop
0,1637118000,267.81,260.81,1025,84,265.79,0.0,13,10000,6.87,258,13.61,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
1,1637121600,267.79,260.79,1025,85,265.91,0.0,13,10000,6.85,259,13.6,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
2,1637125200,267.99,260.99,1025,84,265.97,0.0,12,10000,6.79,260,13.41,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
3,1637128800,268.87,261.87,1025,81,266.41,0.42,11,10000,7.19,262,13.11,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
4,1637132400,270.25,263.25,1024,75,266.87,0.68,7,10000,7.98,270,12.21,"[{'id': 800, 'main': 'Clear', 'description': '...",0
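Once rows from several requests are concatenated, nothing in the table says which location a row belongs to. Since each response carries top-level `lat` and `lon` keys (see the `dict_keys` output in 4a), one option is to copy them onto every hourly record before combining. The sketch below works on plain dictionaries; the function names are hypothetical and the two `responses` are small hand-made stand-ins, not real API output.

```python
def hourly_with_location(response_json):
    """Return the hourly records of one response, each tagged with lat and lon."""
    lat, lon = response_json["lat"], response_json["lon"]
    return [{"lat": lat, "lon": lon, **record} for record in response_json["hourly"]]


def combine(responses):
    """Concatenate the tagged hourly records of several responses into one list."""
    rows = []
    for response_json in responses:
        rows.extend(hourly_with_location(response_json))
    return rows


# Hand-made stand-ins for two parsed responses (placeholder records only).
responses = [
    {"lat": 50, "lon": 50, "hourly": [{"dt": 1637118000}, {"dt": 1637121600}]},
    {"lat": 40, "lon": 116, "hourly": [{"dt": 1637118000}]},
]

rows = combine(responses)
print(len(rows), rows[0])
```

The resulting list can be turned into a data frame with `pd.DataFrame(rows)`, after which summary statistics can be grouped by location.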


##### c) Provide some summary statistics of the data. Include the data frame in a .csv file called data.csv with your submission for the grader.



In [287]:
weather_df.describe()

Unnamed: 0,dt,temp,feels_like,pressure,humidity,dew_point,uvi,clouds,visibility,wind_speed,wind_deg,wind_gust,pop
count,144.0,144.0,144.0,144.0,144.0,144.0,144.0,144.0,144.0,144.0,144.0,144.0,144.0
mean,1637203000.0,277.414722,275.353056,1021.409722,56.465278,268.032153,0.299306,50.611111,10000.0,2.267847,219.743056,3.494653,0.0
std,50046.31,5.247493,6.658489,5.496869,22.97792,1.570589,0.549461,42.234297,0.0,1.679581,105.518815,3.232412,0.0
min,1637118000.0,267.79,260.79,1012.0,26.0,262.64,0.0,0.0,10000.0,0.16,0.0,0.3,0.0
25%,1637160000.0,272.53,269.08,1017.0,41.0,266.785,0.0,6.75,10000.0,1.08,158.75,1.375,0.0
50%,1637203000.0,278.275,277.875,1020.5,45.5,268.005,0.0,53.0,10000.0,1.79,261.5,2.31,0.0
75%,1637245000.0,281.0475,280.3325,1025.0,83.0,269.12,0.3825,98.25,10000.0,2.89,297.75,4.4025,0.0
max,1637287000.0,287.15,285.28,1031.0,97.0,271.12,1.91,100.0,10000.0,7.98,358.0,13.61,0.0


In [290]:
weather_df.to_csv('data.csv')

### 5. API Client

##### a) API client function
- allows the user to specify some smallish set of query parameters (from Q.3a)
- run a GET request with these parameters
- check the status of the request the server returns and inform the user of any errors (from Q.3b)
- parse the response and return a Python object to the user of the function. You can choose whether returning a list (from Q.4a) or a data frame (from Q.4b) is best.
- Add docstrings to the API client function that explain the parameters, the output, and ideally include a quick example.

In [295]:
def get_weather_data(lat = 50, lon = 50):
    
    '''
    parameters:
        lat: float, the latitude of the requested location
        lon: float, the longitude of the requested location
    output:
        a dataframe of the hourly weather forecast if the request succeeds,
        otherwise the HTTP status code of the failed request
    example:
        get_weather_data(lat = 40, lon = 116)
    '''
    
    params = {'appid':token, 'lat':lat,'lon':lon}
    r = requests.get('https://api.openweathermap.org/data/2.5/onecall?',params = params)
    if r.status_code == 200:
        weather_json = r.json()
        weather_json_df = pd.json_normalize(weather_json['hourly'])
        return weather_json_df
    else:
        # Not every failure is an invalid key, so report the actual status code
        print(f'Request failed with status code {r.status_code}. A 401 means the API key is invalid; see http://openweathermap.org/faq#error401 for more info.')
        return r.status_code

In [296]:
get_weather_data()

Unnamed: 0,dt,temp,feels_like,pressure,humidity,dew_point,uvi,clouds,visibility,wind_speed,wind_deg,wind_gust,weather,pop
0,1637118000,267.63,260.63,1025,86,265.88,0.0,25,10000,6.71,260,13.3,"[{'id': 802, 'main': 'Clouds', 'description': ...",0
1,1637121600,267.61,260.61,1025,87,266.0,0.0,25,10000,6.63,260,13.11,"[{'id': 802, 'main': 'Clouds', 'description': ...",0
2,1637125200,267.72,260.72,1025,86,265.97,0.0,22,10000,6.48,262,13.0,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
3,1637128800,268.37,261.37,1025,84,266.34,0.42,20,10000,7.09,266,12.7,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
4,1637132400,269.55,262.55,1025,79,266.79,0.68,17,10000,7.41,271,11.41,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
5,1637136000,271.06,264.46,1025,73,267.35,0.92,20,10000,7.19,279,10.41,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
6,1637139600,272.65,266.61,1024,68,267.44,0.95,17,10000,6.97,278,9.6,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
7,1637143200,273.05,267.14,1024,67,267.71,0.9,17,10000,6.92,279,9.4,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
8,1637146800,273.16,267.69,1024,70,268.31,0.53,17,10000,6.04,277,9.3,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
9,1637150400,272.48,267.04,1024,74,268.35,0.21,17,10000,5.63,272,9.6,"[{'id': 801, 'main': 'Clouds', 'description': ...",0
