<a href="https://colab.research.google.com/github/Waleed18574/Waleed_Data_Analyst_Portfolio_Projects/blob/main/Python_Data_Analytics_Projects/Non_Business_Data_Analysis_Projects/Collecting_Analyzing_and_Visualizing_Weather_Temperature_Data_at_LaGuardia_Airport/Collecting%2C_Analyzing_and_Visualizing_Weather_Temperature_Data_at_LaGuardia_Airport.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Collecting, Analyzing, and Visualizing Weather Temperature Data at LaGuardia Airport
____
## Introduction:
In this project, I requested daily temperature data from the National Centers for Environmental Information (NCEI) API using the `requests` library. Next, I explored, cleaned, and reshaped the data into the required form and analyzed it with the `pandas` library. Finally, I visualized the results using the `plotly` library. The [NCEI](https://www.ncei.noaa.gov/access) has a helpful getting started page that shows how to form requests.

This project was executed by <a href="https://www.linkedin.com/in/waleed-abdulla-b00155a1/" target="_blank">Waleed</a>

In [1]:
#import required libraries
import pandas as pd
import requests
import plotly.express as px
import plotly.graph_objects as go

In [2]:
#write a function to request the json file
def make_request(endpoint, payload=None):
    """
    Make a request to a specific endpoint on the weather API by passing headers and optional payload.
    Parameters:
        - endpoint: The endpoint of the API you want to make a GET request to.
        - payload: A dictionary of data to pass along with the request.
    Returns: Response object.    
    """
    
    return requests.get(f'https://www.ncdc.noaa.gov/cdo-web/api/v2/{endpoint}',
                       headers = {'token':'RpXamnPSTpdoIWZUnlsZzuQPsDFNdYuP'},
                       params = payload)
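
The function above embeds the API token directly in the notebook. A minimal sketch of one alternative, assuming the token is stored in an environment variable: the variable name `NCEI_TOKEN` and the helper `build_request_args` are hypothetical and not part of the original notebook.

```python
import os

API_ROOT = 'https://www.ncdc.noaa.gov/cdo-web/api/v2'

def build_request_args(endpoint, payload=None, env_var='NCEI_TOKEN'):
    """Assemble the keyword arguments for requests.get without hard-coding the token.

    `env_var` is a hypothetical variable name; use whatever name you export.
    """
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f'Set the {env_var} environment variable to your API token')
    return {
        'url': f'{API_ROOT}/{endpoint}',
        'headers': {'token': token},
        'params': payload,
    }

# usage (needs the requests library and network access):
# response = requests.get(**build_request_args('datasets'))
```

Keeping the token out of the source means the notebook can be shared without exposing the credential.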

In [3]:
# request the JSON response of the 'datasets' endpoint
response = make_request('datasets')

In [4]:
# check that the request succeeded (200 means OK)
response.status_code

200

In [5]:
import pprint
pprint.pprint(response.json())

{'metadata': {'resultset': {'count': 11, 'limit': 25, 'offset': 1}},
 'results': [{'datacoverage': 1,
              'id': 'GHCND',
              'maxdate': '2022-01-16',
              'mindate': '1750-02-01',
              'name': 'Daily Summaries',
              'uid': 'gov.noaa.ncdc:C00861'},
             {'datacoverage': 1,
              'id': 'GSOM',
              'maxdate': '2021-12-01',
              'mindate': '1763-01-01',
              'name': 'Global Summary of the Month',
              'uid': 'gov.noaa.ncdc:C00946'},
             {'datacoverage': 1,
              'id': 'GSOY',
              'maxdate': '2021-01-01',
              'mindate': '1763-01-01',
              'name': 'Global Summary of the Year',
              'uid': 'gov.noaa.ncdc:C00947'},
             {'datacoverage': 0.95,
              'id': 'NEXRAD2',
              'maxdate': '2022-01-15',
              'mindate': '1991-06-05',
              'name': 'Weather Radar (Level II)',
              'uid': 'gov.noaa.ncd

In [6]:
# check the keys of the returned json file
response.json().keys()

dict_keys(['metadata', 'results'])

In [7]:
# look at the information stored under the 'metadata' key
response.json()['metadata']

{'resultset': {'count': 11, 'limit': 25, 'offset': 1}}

In [8]:
# check the structure of the 'results' key
response.json()['results']

[{'datacoverage': 1,
  'id': 'GHCND',
  'maxdate': '2022-01-16',
  'mindate': '1750-02-01',
  'name': 'Daily Summaries',
  'uid': 'gov.noaa.ncdc:C00861'},
 {'datacoverage': 1,
  'id': 'GSOM',
  'maxdate': '2021-12-01',
  'mindate': '1763-01-01',
  'name': 'Global Summary of the Month',
  'uid': 'gov.noaa.ncdc:C00946'},
 {'datacoverage': 1,
  'id': 'GSOY',
  'maxdate': '2021-01-01',
  'mindate': '1763-01-01',
  'name': 'Global Summary of the Year',
  'uid': 'gov.noaa.ncdc:C00947'},
 {'datacoverage': 0.95,
  'id': 'NEXRAD2',
  'maxdate': '2022-01-15',
  'mindate': '1991-06-05',
  'name': 'Weather Radar (Level II)',
  'uid': 'gov.noaa.ncdc:C00345'},
 {'datacoverage': 0.95,
  'id': 'NEXRAD3',
  'maxdate': '2022-01-16',
  'mindate': '1994-05-20',
  'name': 'Weather Radar (Level III)',
  'uid': 'gov.noaa.ncdc:C00708'},
 {'datacoverage': 1,
  'id': 'NORMAL_ANN',
  'maxdate': '2010-01-01',
  'mindate': '2010-01-01',
  'name': 'Normals Annual/Seasonal',
  'uid': 'gov.noaa.ncdc:C00821'},
 {'data

In [9]:
# the 'results' key contains a list of dictionaries
# select the first dictionary and look at its keys to see what fields the data contains
response.json()['results'][0].keys()

dict_keys(['uid', 'mindate', 'maxdate', 'name', 'datacoverage', 'id'])

In [10]:
# check the 'id' and 'name' fields
[(data['id'],data['name']) for data in response.json()['results']]

[('GHCND', 'Daily Summaries'),
 ('GSOM', 'Global Summary of the Month'),
 ('GSOY', 'Global Summary of the Year'),
 ('NEXRAD2', 'Weather Radar (Level II)'),
 ('NEXRAD3', 'Weather Radar (Level III)'),
 ('NORMAL_ANN', 'Normals Annual/Seasonal'),
 ('NORMAL_DLY', 'Normals Daily'),
 ('NORMAL_HLY', 'Normals Hourly'),
 ('NORMAL_MLY', 'Normals Monthly'),
 ('PRECIP_15', 'Precipitation 15 Minute'),
 ('PRECIP_HLY', 'Precipitation Hourly')]

For the purposes of this notebook, the required dataset id is 'GHCND' (Daily Summaries).

In [11]:
# make a request to the 'datacategories' endpoint with 'datasetid' = 'GHCND' to identify the
# required 'datacategoryid'
response = make_request('datacategories',payload = {'datasetid':'GHCND'})

In [12]:
# check to see if the response status is ok
response.status_code

200

In [13]:
# check the json of the response

pprint.pprint(response.json())

{'metadata': {'resultset': {'count': 9, 'limit': 25, 'offset': 1}},
 'results': [{'id': 'EVAP', 'name': 'Evaporation'},
             {'id': 'LAND', 'name': 'Land'},
             {'id': 'PRCP', 'name': 'Precipitation'},
             {'id': 'SKY', 'name': 'Sky cover & clouds'},
             {'id': 'SUN', 'name': 'Sunshine'},
             {'id': 'TEMP', 'name': 'Air Temperature'},
             {'id': 'WATER', 'name': 'Water'},
             {'id': 'WIND', 'name': 'Wind'},
             {'id': 'WXTYPE', 'name': 'Weather Type'}]}


{'name': 'Air Temperature', 'id': 'TEMP'}\
Based on the results, the required 'datacategoryid' is 'TEMP'.

In [14]:
# make a request with the endpoint 'datatypes' and the 'datacategoryid' of 'TEMP' to identify the required
# datatypes
response = make_request('datatypes', payload = {'datacategoryid':'TEMP'})

In [15]:
# check the status of the request
response.status_code

200

In [16]:
pprint.pprint(response.json())

{'metadata': {'resultset': {'count': 59, 'limit': 25, 'offset': 1}},
 'results': [{'datacoverage': 1,
              'id': 'CDSD',
              'maxdate': '2021-12-01',
              'mindate': '1763-01-01',
              'name': 'Cooling Degree Days Season to Date'},
             {'datacoverage': 1,
              'id': 'DATN',
              'maxdate': '2022-01-15',
              'mindate': '1750-02-01',
              'name': 'Number of days included in the multiday minimum '
                      'temperature (MDTN)'},
             {'datacoverage': 1,
              'id': 'DATX',
              'maxdate': '2022-01-14',
              'mindate': '1750-02-01',
              'name': 'Number of days included in the multiday maximum '
                      'temperature (MDTX)'},
             {'datacoverage': 1,
              'id': 'DLY-DUTR-NORMAL',
              'maxdate': '2010-12-31',
              'mindate': '2010-01-01',
              'name': 'Long-term averages of daily diurnal temperat

In [17]:
# check the 'id' and 'name' of the results
[(data['id'],data['name']) for data in response.json()['results']]

[('CDSD', 'Cooling Degree Days Season to Date'),
 ('DATN',
  'Number of days included in the multiday minimum temperature (MDTN)'),
 ('DATX',
  'Number of days included in the multiday maximum temperature (MDTX)'),
 ('DLY-DUTR-NORMAL', 'Long-term averages of daily diurnal temperature range'),
 ('DLY-DUTR-STDDEV',
  'Long-term standard deviations of daily diurnal temperature range'),
 ('DLY-TAVG-NORMAL', 'Long-term averages of daily average temperature'),
 ('DLY-TAVG-STDDEV',
  'Long-term standard deviations of daily average temperature'),
 ('DLY-TMAX-NORMAL', 'Long-term averages of daily maximum temperature'),
 ('DLY-TMAX-STDDEV',
  'Long-term standard deviations of daily maximum temperature'),
 ('DLY-TMIN-NORMAL', 'Long-term averages of daily minimum temperature'),
 ('DLY-TMIN-STDDEV',
  'Long-term standard deviations of daily minimum temperature'),
 ('EMNT', 'Extreme minimum temperature for the period.'),
 ('EMXT', 'Extreme maximum temperature for the period.'),
 ('HDSD', 'Heating De

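The last three data types in this listing are temperature extremes (the data request further below uses the daily `TMIN` and `TMAX` types instead):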

('HTMN', 'Highest minimum temperature')\
('HTMX', 'Highest maximum temperature')\
('LTMN', 'Lowest minimum temperature')
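
The metadata for this response reports a `count` of 59 but a `limit` of 25, so the listing shows only the first page of data types. A minimal sketch of how the remaining pages could be addressed, assuming `offset` is the 1-based position of the first record on a page (as the `'offset': 1` in the metadata suggests). The helper `page_offsets` is hypothetical.

```python
def page_offsets(count, limit):
    """Return the 1-based offset of the first record on each page."""
    return list(range(1, count + 1, limit))

# 59 records at 25 per page need three requests
print(page_offsets(59, 25))

# usage with the notebook's make_request (needs network access):
# for offset in page_offsets(59, 25):
#     page = make_request('datatypes', {'datacategoryid': 'TEMP',
#                                       'offset': offset, 'limit': 25})
```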

In [18]:
# make a request to the 'locationcategories' endpoint with 'datasetid' = 'GHCND' to identify the
# required location category
response = make_request('locationcategories',{'datasetid':'GHCND'})

In [19]:
# check the status of the request
response.status_code

200

In [20]:
# check the json of the response
import pprint
pprint.pprint(response.json())

{'metadata': {'resultset': {'count': 12, 'limit': 25, 'offset': 1}},
 'results': [{'id': 'CITY', 'name': 'City'},
             {'id': 'CLIM_DIV', 'name': 'Climate Division'},
             {'id': 'CLIM_REG', 'name': 'Climate Region'},
             {'id': 'CNTRY', 'name': 'Country'},
             {'id': 'CNTY', 'name': 'County'},
             {'id': 'HYD_ACC', 'name': 'Hydrologic Accounting Unit'},
             {'id': 'HYD_CAT', 'name': 'Hydrologic Cataloging Unit'},
             {'id': 'HYD_REG', 'name': 'Hydrologic Region'},
             {'id': 'HYD_SUB', 'name': 'Hydrologic Subregion'},
             {'id': 'ST', 'name': 'State'},
             {'id': 'US_TERR', 'name': 'US Territory'},
             {'id': 'ZIP', 'name': 'Zip Code'}]}


City is the proper 'locationcategory':

{'id': 'CITY', 'name': 'City'}

In [21]:
# write a binary search function to search for the city of new york
def get_item(name, what, endpoint, start=1, end=None):
    """
    Grab the JSON payload for a given item using binary search.
    Parameters:
        - name: The item to look for.
        - what: Dictionary specifying what the item in `name` is.
        - endpoint: Where to look for the item.
        - start: The position to start at. We don't need to touch this, but the function will manipulate 
          this with recursion.
        - end: The last position of the cities.
    Returns: Dictionary of the information for the item if found,
    otherwise an empty dictionary.
    """
    # find the midpoint which we use to cut the data in half each time
    mid = (start + (end if end else 1)) // 2
    
    # lowercase the name so this is not case-sensitive
    name = name.lower()
    
    # define the payload we will send with each request
    payload = {'datasetid' : 'GHCND',
               'sortfield' : 'name',
               'offset' : mid, 
               'limit' : 1}
    
    # make request adding additional filter parameters from `what`
    response = make_request(endpoint, {**payload, **what})
    
    if response.ok:
        # grab the end index from the response metadata the first time through
        end = end if end else response.json()['metadata']['resultset']['count']
        
        # grab the lowercase version of the current name
        current_name = response.json()['results'][0]['name'].lower()
        
        # if name is in the current name, the item is found
        if name in current_name:
            # return the found item
            return response.json()['results'][0]
        else:
            # if the start index is greater than or equal to end index, return empty dictionary
            if start >= end:
                return {}
            # otherwise, search the half of the range that can still contain the name
            elif name < current_name:
                return get_item(name, what, endpoint, start, mid-1)
            elif name > current_name:
                return get_item(name, what, endpoint, mid + 1, end)
    else:
        # response wasn't ok, use code to determine why
        print(f'Response not OK, status: {response.status_code}')
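
The halving logic in `get_item` can be tried offline. Below is a minimal sketch that runs the same idea over an in-memory list instead of the API. It assumes the list is already sorted by name, which is what `sortfield: name` requests from the server. The function `find_sorted` and the city list are illustrative only.

```python
def find_sorted(names, target):
    """Binary search over a name-sorted list; a substring match counts as found."""
    target = target.lower()
    lo, hi = 0, len(names) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        current = names[mid].lower()
        if target in current:
            return mid          # found: target is contained in the current name
        if target < current:
            hi = mid - 1        # target sorts earlier, keep the lower half
        else:
            lo = mid + 1        # target sorts later, keep the upper half
    return None                 # range exhausted without a match

cities = ['Albany, NY US', 'Boston, MA US', 'Chicago, IL US',
          'New York, NY US', 'Seattle, WA US']
print(find_sorted(cities, 'New York'))
print(find_sorted(cities, 'Denver'))
```

Each step discards half of the remaining range, so the number of lookups (API requests, in the notebook's case) grows with the logarithm of the list size rather than linearly.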

In [22]:
# write a function that looks up a location using the get_item() function
def get_location(name):
    """
    Grab the JSON payload for a given location using binary search
    Parameters:
        - name: The city to look for
    Returns: Dictionary of the information for the city if found,
    otherwise an empty dictionary.
    """
    return get_item(name, {'locationcategoryid' : 'CITY'}, 'locations')

In [23]:
# get the location info of NYC
nyc = get_location('New York')

In [24]:
nyc

{'datacoverage': 1,
 'id': 'CITY:US360019',
 'maxdate': '2022-01-16',
 'mindate': '1869-01-01',
 'name': 'New York, NY US'}

In [25]:
# drill down to the LaGuardia Airport station using the location id of NYC
# (the stations endpoint filters on 'locationid')
laguardiaAirport = get_item('Laguardia Airport',{'locationid':nyc['id']},'stations')

In [26]:
laguardiaAirport

{'datacoverage': 1,
 'elevation': 3,
 'elevationUnit': 'METERS',
 'id': 'GHCND:USW00014732',
 'latitude': 40.77945,
 'longitude': -73.88027,
 'maxdate': '2022-01-16',
 'mindate': '1939-10-07',
 'name': 'LAGUARDIA AIRPORT, NY US'}

In [27]:
# request one year (2021-01-16 to 2022-01-16) of daily min/max temperatures recorded at LaGuardia Airport
response = make_request('data',{'datasetid':'GHCND',
                                'stationid':laguardiaAirport['id'],
                                'locationid':nyc['id'],
                                'startdate':'2021-01-16',
                                'enddate':'2022-01-16',
                                'datatypeid':['TMIN','TMAX'],
                                'units':'metric',
                                'limit':1000
                               }
                       )
response.status_code

200
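
The request above fits in a single call because one year of two data types is roughly 730 rows, which is under the `limit` of 1000. For longer ranges, one option is to split the period into windows and request each separately. The sketch below assumes two rows per day and a cap of 1000 rows per request. The helper `date_chunks` is hypothetical.

```python
from datetime import date, timedelta

def date_chunks(start, end, max_days):
    """Yield (start, end) ISO date pairs covering [start, end] in windows of at most max_days days."""
    current = start
    while current <= end:
        chunk_end = min(current + timedelta(days=max_days - 1), end)
        yield current.isoformat(), chunk_end.isoformat()
        current = chunk_end + timedelta(days=1)

# with 2 rows per day, 500 days stays within 1000 rows per request
chunks = list(date_chunks(date(2021, 1, 16), date(2022, 1, 16), 500))
print(chunks)

# usage with the notebook's make_request (needs network access):
# for startdate, enddate in chunks:
#     make_request('data', {..., 'startdate': startdate, 'enddate': enddate})
```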

In [28]:
import pandas as pd
# load the 'results' of the response into a DataFrame
df = pd.DataFrame(response.json()['results'])

In [29]:
# count the non-null values in each column
df.count()

date          728
datatype      728
station       728
attributes    728
value         728
dtype: int64

In [30]:
# check for null values and the column data types
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 728 entries, 0 to 727
Data columns (total 5 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   date        728 non-null    object 
 1   datatype    728 non-null    object 
 2   station     728 non-null    object 
 3   attributes  728 non-null    object 
 4   value       728 non-null    float64
dtypes: float64(1), object(4)
memory usage: 28.6+ KB


In [31]:
# list the distinct values of the string columns (few distinct values make them candidates for the category dtype)
cols = ['datatype','station','attributes']
for col in cols:
    print(col,' distinct values are :\n',df[col].unique())

datatype  distinct values are :
 ['TMAX' 'TMIN']
station  distinct values are :
 ['GHCND:USW00014732']
attributes  distinct values are :
 [',,W,2400' ',,S,2400' ',,D,2400']
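
Each `attributes` value packs several comma-separated fields into one string. A minimal sketch of how it could be unpacked follows. The field names are an assumption based on the usual GHCN-Daily flag order (measurement flag, quality flag, source flag, observation time) and should be checked against the dataset documentation. The helper `parse_attributes` is hypothetical.

```python
def parse_attributes(attributes):
    """Split an attributes string into named fields (field names are assumed, not confirmed)."""
    fields = ['measurement_flag', 'quality_flag', 'source_flag', 'observation_time']
    return dict(zip(fields, attributes.split(',')))

print(parse_attributes(',,W,2400'))
```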


In [32]:
# convert the 'datatype', 'station' and 'attributes' columns to the category dtype to save memory
for col in cols:
    df[col] = df[col].astype('category')
    
# convert the date column's data-type into datetime
df['date'] = pd.to_datetime(df['date'])

In [33]:
# rename the columns
df.rename(columns = {'attributes':'flags','value':'temp_c'}, inplace = True)
df.columns

Index(['date', 'datatype', 'station', 'flags', 'temp_c'], dtype='object')

In [34]:
df.dtypes

date        datetime64[ns]
datatype          category
station           category
flags             category
temp_c             float64
dtype: object

nice!
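
To see why the category dtype helps, the memory use of the two representations can be compared directly. This is a small sketch on a toy column that mimics the repeated `datatype` labels. The values are illustrative, not the notebook's actual data.

```python
import pandas as pd

# toy column: 728 labels with only two distinct values
labels = pd.Series(['TMAX', 'TMIN'] * 364)

# deep=True counts the string contents, not just the pointers to them
as_object = labels.astype('object').memory_usage(deep=True)
as_category = labels.astype('category').memory_usage(deep=True)

print(f'object: {as_object} bytes, category: {as_category} bytes')
```

A categorical column stores each distinct string once plus a small integer code per row, so the saving grows with the number of repeated values.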

In [35]:
# check the general statistics of df
df.describe()

Unnamed: 0,temp_c
count,728.0
mean,14.584066
std,10.276967
min,-10.5
25%,6.1
50%,15.6
75%,22.8
max,37.8


In [36]:
df

Unnamed: 0,date,datatype,station,flags,temp_c
0,2021-01-16,TMAX,GHCND:USW00014732,",,W,2400",9.4
1,2021-01-16,TMIN,GHCND:USW00014732,",,W,2400",4.4
2,2021-01-17,TMAX,GHCND:USW00014732,",,W,2400",7.8
3,2021-01-17,TMIN,GHCND:USW00014732,",,W,2400",3.3
4,2021-01-18,TMAX,GHCND:USW00014732,",,W,2400",8.3
...,...,...,...,...,...
723,2022-01-13,TMIN,GHCND:USW00014732,",,D,2400",0.6
724,2022-01-14,TMAX,GHCND:USW00014732,",,D,2400",6.1
725,2022-01-14,TMIN,GHCND:USW00014732,",,D,2400",-5.6
726,2022-01-15,TMAX,GHCND:USW00014732,",,W,2400",-5.5


In [37]:
# plot the daily minimum and maximum temperatures
import plotly.express as px
fig = px.line(df, x = 'date', 
              y = 'temp_c', 
              color = 'datatype', 
              title = 'Daily Minimum & Maximum Temperature at LaGuardia Airport in °C',
              template = 'plotly_dark')
fig.show()

Over the one-year window (2021-01-16 to 2022-01-16), the highest temperature at LaGuardia Airport was recorded in June 2021 and the lowest in January 2022.
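
A reading like this can also be checked numerically instead of by eye. The sketch below uses `idxmax` and `idxmin` on a handful of rows copied from the table above, not the full year, so the dates it prints apply only to that sample.

```python
import pandas as pd

# a few rows copied from the table above (sample only, not the full data set)
sample = pd.DataFrame({
    'date': pd.to_datetime(['2021-01-16', '2021-01-17', '2022-01-13', '2022-01-14']),
    'datatype': ['TMAX', 'TMIN', 'TMIN', 'TMIN'],
    'temp_c': [9.4, 3.3, 0.6, -5.6],
})

# idxmax/idxmin return the index label of the extreme value
hottest = sample.loc[sample['temp_c'].idxmax()]
coldest = sample.loc[sample['temp_c'].idxmin()]

print('highest:', hottest['date'].date(), hottest['temp_c'])
print('lowest: ', coldest['date'].date(), coldest['temp_c'])
```

Running the same two lines against the full `df` would give the exact dates of the yearly extremes.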

In [38]:
# save in the data folder
#df.to_csv('../data/nyc_tempature.csv', index = False)

In [39]:
df.head()

Unnamed: 0,date,datatype,station,flags,temp_c
0,2021-01-16,TMAX,GHCND:USW00014732,",,W,2400",9.4
1,2021-01-16,TMIN,GHCND:USW00014732,",,W,2400",4.4
2,2021-01-17,TMAX,GHCND:USW00014732,",,W,2400",7.8
3,2021-01-17,TMIN,GHCND:USW00014732,",,W,2400",3.3
4,2021-01-18,TMAX,GHCND:USW00014732,",,W,2400",8.3


In [40]:
df_pivot = df.pivot_table(values = 'temp_c', 
                             index = ['date'], 
                             columns ='datatype')

df_maximum_col = df_pivot['TMAX']
df_minimum_col = df_pivot['TMIN']

In [41]:
df_pivot

datatype,TMAX,TMIN
date,Unnamed: 1_level_1,Unnamed: 2_level_1
2021-01-16,9.4,4.4
2021-01-17,7.8,3.3
2021-01-18,8.3,4.4
2021-01-19,6.1,3.3
2021-01-20,5.6,-1.6
...,...,...
2022-01-11,-3.3,-8.3
2022-01-12,5.6,-6.7
2022-01-13,7.8,0.6
2022-01-14,6.1,-5.6


In [42]:
df['TMAX'] = df['date'].map(df_maximum_col)
df['TMIN'] = df['date'].map(df_minimum_col)

In [43]:
df.head()

Unnamed: 0,date,datatype,station,flags,temp_c,TMAX,TMIN
0,2021-01-16,TMAX,GHCND:USW00014732,",,W,2400",9.4,9.4,4.4
1,2021-01-16,TMIN,GHCND:USW00014732,",,W,2400",4.4,9.4,4.4
2,2021-01-17,TMAX,GHCND:USW00014732,",,W,2400",7.8,7.8,3.3
3,2021-01-17,TMIN,GHCND:USW00014732,",,W,2400",3.3,7.8,3.3
4,2021-01-18,TMAX,GHCND:USW00014732,",,W,2400",8.3,8.3,4.4


In [44]:
# keep one row per date; .copy() makes df an independent DataFrame so the
# column changes below do not trigger SettingWithCopyWarning
df = df[df['datatype'] == 'TMAX'].copy()

In [45]:
# drop the columns that are now redundant
df.drop(columns = ['datatype', 'temp_c'], inplace = True)

In [46]:
# extract the year and month for grouping
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month

In [47]:
df.head()

Unnamed: 0,date,station,flags,TMAX,TMIN,year,month
0,2021-01-16,GHCND:USW00014732,",,W,2400",9.4,4.4,2021,1
2,2021-01-17,GHCND:USW00014732,",,W,2400",7.8,3.3,2021,1
4,2021-01-18,GHCND:USW00014732,",,W,2400",8.3,4.4,2021,1
6,2021-01-19,GHCND:USW00014732,",,W,2400",6.1,3.3,2021,1
8,2021-01-20,GHCND:USW00014732,",,W,2400",5.6,-1.6,2021,1
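
The reshaping done in the preceding cells (pivot, map the columns back onto `df`, keep the TMAX rows, drop the leftover columns) can also be sketched as one chained pivot. This is an alternative under an assumption, not the notebook's method: it drops the `station` and `flags` columns, which is fine only if the later analysis does not need them. The rows are copied from the table shown earlier.

```python
import pandas as pd

# long format: one row per (date, datatype), as returned by the API
long_df = pd.DataFrame({
    'date': pd.to_datetime(['2021-01-16', '2021-01-16', '2021-01-17', '2021-01-17']),
    'datatype': ['TMAX', 'TMIN', 'TMAX', 'TMIN'],
    'temp_c': [9.4, 4.4, 7.8, 3.3],
})

# wide format: one row per date with TMAX and TMIN as columns
wide_df = (
    long_df.pivot_table(values='temp_c', index='date', columns='datatype')
           .reset_index()
           .rename_axis(columns=None)   # remove the leftover 'datatype' axis label
)
wide_df['year'] = wide_df['date'].dt.year
wide_df['month'] = wide_df['date'].dt.month

print(wide_df)
```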


In [48]:
# average the daily values per month; select the columns with a list,
# since indexing a groupby with multiple bare keys is deprecated
df_grouped = df.groupby(['year','month'])[['TMAX','TMIN']].mean().reset_index()
df_grouped

Unnamed: 0,year,month,TMAX,TMIN
0,2021,1,4.0375,-1.3
1,2021,2,4.432143,-0.964286
2,2021,3,11.790323,3.064516
3,2021,4,17.29,7.89
4,2021,5,21.919355,13.003226
5,2021,6,28.93,19.55
6,2021,7,29.287097,21.519355
7,2021,8,29.732258,22.809677
8,2021,9,26.193333,19.053333
9,2021,10,21.4,14.893548


In [49]:
df_grouped.rename(columns = {'TMAX':'AVG_TMAX', 'TMIN':'AVG_TMIN'},inplace =True)

In [50]:
df_grouped.dtypes

year          int64
month         int64
AVG_TMAX    float64
AVG_TMIN    float64
dtype: object

In [51]:
month_names ={1:'January',2:'February',3:'March',4:'April',5:'May',6:'June',7:'July',8:'August',
              9:'September',10:'October',11:'November',12:'December'}

In [52]:
df_grouped['month_name'] = df_grouped['month'].map(month_names)
df_grouped['month_name']

0       January
1      February
2         March
3         April
4           May
5          June
6          July
7        August
8     September
9       October
10     November
11     December
12      January
Name: month_name, dtype: object

In [53]:
df_grouped['month_year'] = df_grouped['month_name']+'-'+df_grouped['year'].astype(str)
df_grouped['month_year']

0       January-2021
1      February-2021
2         March-2021
3         April-2021
4           May-2021
5          June-2021
6          July-2021
7        August-2021
8     September-2021
9       October-2021
10     November-2021
11     December-2021
12      January-2022
Name: month_year, dtype: object
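
The month-name dictionary and the string concatenation above can also be derived from the standard library. This is a minimal sketch. The helper `month_year_label` is hypothetical, and `calendar.month_name` follows the current locale, so English names are an assumption about the environment.

```python
import calendar

def month_year_label(year, month):
    """Build a 'Month-Year' label such as the ones in the month_year column."""
    return f'{calendar.month_name[month]}-{year}'

print(month_year_label(2021, 1))
print(month_year_label(2022, 1))
```

In the notebook this could be applied row by row, for example with `df_grouped.apply(lambda row: month_year_label(row['year'], row['month']), axis=1)`.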

In [54]:
df_grouped

Unnamed: 0,year,month,AVG_TMAX,AVG_TMIN,month_name,month_year
0,2021,1,4.0375,-1.3,January,January-2021
1,2021,2,4.432143,-0.964286,February,February-2021
2,2021,3,11.790323,3.064516,March,March-2021
3,2021,4,17.29,7.89,April,April-2021
4,2021,5,21.919355,13.003226,May,May-2021
5,2021,6,28.93,19.55,June,June-2021
6,2021,7,29.287097,21.519355,July,July-2021
7,2021,8,29.732258,22.809677,August,August-2021
8,2021,9,26.193333,19.053333,September,September-2021
9,2021,10,21.4,14.893548,October,October-2021


In [55]:
df_grouped.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 13 entries, 0 to 12
Data columns (total 6 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   year        13 non-null     int64  
 1   month       13 non-null     int64  
 2   AVG_TMAX    13 non-null     float64
 3   AVG_TMIN    13 non-null     float64
 4   month_name  13 non-null     object 
 5   month_year  13 non-null     object 
dtypes: float64(2), int64(2), object(2)
memory usage: 752.0+ bytes


In [58]:
pip install plotly==5.5.0

Collecting plotly==5.5.0
  Downloading plotly-5.5.0-py2.py3-none-any.whl (26.5 MB)
Collecting tenacity>=6.2.0
  Downloading tenacity-8.0.1-py3-none-any.whl (24 kB)
Installing collected packages: tenacity, plotly
  Attempting uninstall: plotly
    Found existing installation: plotly 4.4.1
    Uninstalling plotly-4.4.1:
      Successfully uninstalled plotly-4.4.1
Successfully installed plotly-5.5.0 tenacity-8.0.1


In [73]:

fig = go.Figure()
# monthly average of the daily maximum temperature
fig.add_trace(go.Scatter(x=df_grouped['month_year'].tolist(),
                         y=df_grouped['AVG_TMAX'].tolist(),
                         name='AVG_TMAX'))  # name sets the legend entry

# monthly average of the daily minimum temperature
fig.add_trace(go.Scatter(x=df_grouped['month_year'].tolist(),
                         y=df_grouped['AVG_TMIN'].tolist(),
                         name='AVG_TMIN'))

fig.update_layout(title='Monthly Average Minimum & Maximum Temperature at LaGuardia Airport in °C',
                  xaxis_title='Month',
                  yaxis_title='Temperature (°C)',
                  template='plotly_dark')
fig.update_xaxes(tickangle=-45)
