#### Paphos Weather

This page shows various metrics for the weather in Paphos, Cyprus.

The weather data was purchased from the [OpenWeather Marketplace](https://home.openweathermap.org/marketplace/my_orders).



In [25]:

from datetime import date
from datetime import datetime
import time
import pandas as pd


def toDate(epoch_time):
    # Convert Unix epoch seconds to a datetime in the local timezone of this machine
    return datetime.fromtimestamp(epoch_time)
  


In [27]:
import pandas as pd

df=pd.read_json('/Users/walkerrowe/Documents/weather/pathosWeather.json')

In [28]:
df.columns

Index(['city_name', 'lat', 'lon', 'main', 'wind', 'clouds', 'weather', 'dt',
       'dt_iso', 'timezone', 'rain'],
      dtype='object')

In [3]:
toDate(946684800)

datetime.datetime(1999, 12, 31, 19, 0)

In [4]:
df['dt'].map(toDate)

0        1999-12-31 19:00:00
1        1999-12-31 20:00:00
2        1999-12-31 21:00:00
3        1999-12-31 22:00:00
4        1999-12-31 23:00:00
                 ...        
181459   2020-09-12 15:00:00
181460   2020-09-12 16:00:00
181461   2020-09-12 17:00:00
181462   2020-09-12 18:00:00
181463   2020-09-12 19:00:00
Name: dt, Length: 181464, dtype: datetime64[ns]

In [5]:
df.columns

Index(['city_name', 'lat', 'lon', 'main', 'wind', 'clouds', 'weather', 'dt',
       'dt_iso', 'timezone', 'rain'],
      dtype='object')

In [6]:
df['main'][0]

{'temp': 55.35,
 'temp_min': 51.8,
 'temp_max': 65.53,
 'feels_like': 49.44,
 'pressure': 1016,
 'humidity': 73}

We have to convert the epoch time to an ISO date, which is a format we can read.

The map function applies a function to every element of a series.  A series is a single column of a dataframe.  Writing df['new column'] = df['some column'].map(somefunction) adds the result to the dataframe as a new column.
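
As a small illustration of map, here is a minimal sketch on a toy dataframe rather than the real file. The three epoch values are chosen to match the first hourly rows of the data, and `to_date_utc` is a hypothetical variant of toDate that pins the result to UTC. Note that `datetime.fromtimestamp` without a timezone converts to the local time of the machine running the notebook, which is presumably why 946684800 shows up as 19:00 on 1999-12-31 instead of midnight UTC. The last step shows `pd.to_datetime` with `unit='s'` as a vectorized alternative.

```python
from datetime import datetime, timezone
import pandas as pd

def to_date_utc(epoch_time):
    # Hypothetical variant of toDate: same conversion, but fixed to UTC
    return datetime.fromtimestamp(epoch_time, tz=timezone.utc)

# Toy data: three hourly epoch values starting at 2000-01-01 00:00 UTC
toy = pd.DataFrame({'dt': [946684800, 946688400, 946692000]})

# map calls the function once per element and returns a new series
toy['isoDate'] = toy['dt'].map(to_date_utc)

# Vectorized alternative: pandas converts the whole column in one call
toy['isoDateVectorized'] = pd.to_datetime(toy['dt'], unit='s', utc=True)

print(toy)
```

Both new columns hold the same timestamps; the vectorized form is usually the faster choice on a column with many rows.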


In [17]:
df['isoDate']=df['dt'].map(toDate)

In [19]:
df.columns

Index(['city_name', 'lat', 'lon', 'main', 'wind', 'clouds', 'weather', 'dt',
       'dt_iso', 'timezone', 'rain', 'isoDate'],
      dtype='object')

In [18]:
df['isoDate']

0        1999-12-31 19:00:00
1        1999-12-31 20:00:00
2        1999-12-31 21:00:00
3        1999-12-31 22:00:00
4        1999-12-31 23:00:00
                 ...        
181459   2020-09-12 15:00:00
181460   2020-09-12 16:00:00
181461   2020-09-12 17:00:00
181462   2020-09-12 18:00:00
181463   2020-09-12 19:00:00
Name: isoDate, Length: 181464, dtype: datetime64[ns]

We want to calculate the average temperature by date using the mean() function.  The problem is that the temperature is a field inside the **main** object, so it is not yet a column that mean() can average.  See below how to fix that.

In [9]:
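df.index = pd.to_datetime(df['isoDate'])
# numeric_only=True averages only the numeric columns; newer pandas versions may raise an error on the dict columns otherwise
df.resample('D').mean(numeric_only=True)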

| isoDate | lat | lon | dt | timezone |
|---|---|---|---|---|
| 1999-12-31 | 34.753637 | 32.406951 | 9.466920e+08 | 7200.0 |
| 2000-01-01 | 34.753637 | 32.406951 | 9.467442e+08 | 7200.0 |
| 2000-01-02 | 34.753637 | 32.406951 | 9.468306e+08 | 7200.0 |
| 2000-01-03 | 34.753637 | 32.406951 | 9.469170e+08 | 7200.0 |
| 2000-01-04 | 34.753637 | 32.406951 | 9.470034e+08 | 7200.0 |
| ... | ... | ... | ... | ... |
| 2020-09-08 | 34.753637 | 32.406951 | 1.599579e+09 | 10800.0 |
| 2020-09-09 | 34.753637 | 32.406951 | 1.599665e+09 | 10800.0 |
| 2020-09-10 | 34.753637 | 32.406951 | 1.599752e+09 | 10800.0 |
| 2020-09-11 | 34.753637 | 32.406951 | 1.599838e+09 | 10800.0 |
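
The daily averages above cover only lat, lon, dt, and timezone, because the temperature is still nested inside main. A minimal sketch of how a daily mean temperature could be computed once the value is pulled out, assuming toy rows with invented temperatures and UTC timestamps (neither comes from the real file):

```python
import pandas as pd

# Toy rows shaped like the real data: an epoch column and a nested 'main' dict.
# The temperatures are invented for illustration.
toy = pd.DataFrame({
    'dt': [946684800, 946688400, 946771200, 946774800],
    'main': [{'temp': 50.0}, {'temp': 54.0}, {'temp': 60.0}, {'temp': 62.0}],
})

# Pull the nested temperature out into its own numeric column
toy['temp'] = toy['main'].map(lambda m: m['temp'])

# Index by timestamp (UTC is an assumption here) so resample can bucket by day
toy.index = pd.to_datetime(toy['dt'], unit='s', utc=True)

# Select the column before aggregating so non-numeric columns are never averaged
daily = toy['temp'].resample('D').mean()
print(daily)
```

Selecting `toy['temp']` before calling `resample` keeps the dict columns out of the aggregation entirely.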


Each element of the series df['main'] is a JSON object (a Python dict).  Write a function to split it into its separate fields, using the **series.map(function)** method as above.

{'temp': 55.35,
 'temp_min': 51.8,
 'temp_max': 65.53,
 'feels_like': 49.44,
 'pressure': 1016,
 'humidity': 73}
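
One way the split could look, as a minimal sketch on a one-row toy dataframe that reuses the dict shown above. The first option calls map once per field; the second uses `pd.json_normalize` to expand every key into its own column in one step. The names `toy` and `fields` are placeholders for illustration.

```python
import pandas as pd

# Toy frame with one row; the dict is the df['main'][0] value shown above
toy = pd.DataFrame({
    'main': [{'temp': 55.35, 'temp_min': 51.8, 'temp_max': 65.53,
              'feels_like': 49.44, 'pressure': 1016, 'humidity': 73}],
})

# Option 1: one map call per field
toy['temp'] = toy['main'].map(lambda m: m['temp'])

# Option 2: expand every key into its own column in one step
fields = pd.json_normalize(toy['main'].tolist())
fields.index = toy.index  # keep rows aligned with the original frame

print(fields.columns.tolist())
```

If only one or two fields are needed, the map form is enough; `json_normalize` is more convenient when all of them are wanted.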
 

In [16]:
df.columns

Index(['city_name', 'lat', 'lon', 'main', 'wind', 'clouds', 'weather', 'dt',
       'dt_iso', 'timezone', 'rain'],
      dtype='object')

In [11]:
df.head(1)

| isoDate | city_name | lat | lon | main | wind | clouds | weather | dt | dt_iso | timezone | rain | isoDate |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1999-12-31 19:00:00 | Paphos Castle | 34.753637 | 32.406951 | {'temp': 55.35, 'temp_min': 51.8, 'temp_max': ... | {'speed': 9.17, 'deg': 20} | {'all': 1} | [{'id': 800, 'main': 'Clear', 'description': '... | 946684800 | 2000-01-01 00:00:00 +0000 UTC | 7200 |  | 1999-12-31 19:00:00 |


In [29]:
df['temp']=df["main"].map(lambda x: x["temp"])

In [31]:
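import numpy as np

# dt_iso looks like '2000-01-01 00:00:00 +0000 UTC'; dropping the last 19 characters leaves the date
year=[]
month=[]
row=len(df.index)
for j in range(0,row):
    isoString=df["dt_iso"][j]    # renamed from date to avoid shadowing the datetime.date import
    onlyDate=isoString[:len(isoString)-19]
    year.append(onlyDate[0:4])          # year is kept as a string, e.g. '2000'
    month.append(int(onlyDate[5:7]))    # month as an integer from 1 to 12


month=np.asarray(month)
year=np.asarray(year)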
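
The same year and month extraction can also be written without an explicit loop. A minimal sketch, assuming every dt_iso value has the fixed layout shown in the df.head(1) output (the two toy strings below are made up in that layout):

```python
import pandas as pd

# Toy dt_iso values in the format shown in the df.head(1) output
toy = pd.DataFrame({'dt_iso': ['2000-01-01 00:00:00 +0000 UTC',
                               '2000-12-31 23:00:00 +0000 UTC']})

# The .str accessor slices every string in the column at once
toy['year'] = toy['dt_iso'].str[0:4]                 # kept as a string, like the loop
toy['month'] = toy['dt_iso'].str[5:7].astype(int)    # integer month, 1 to 12

print(toy[['year', 'month']])
```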

In [55]:
monthYear=pd.DataFrame(year,columns=['year'])
monthYear['month']=pd.DataFrame(month)
monthYear['temp']=df["temp"]

In [56]:
monthYear.astype('float').info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 181464 entries, 0 to 181463
Data columns (total 3 columns):
 #   Column  Non-Null Count   Dtype  
---  ------  --------------   -----  
 0   year    181464 non-null  float64
 1   month   181464 non-null  float64
 2   temp    181464 non-null  float64
dtypes: float64(3)
memory usage: 4.2 MB


In [70]:
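# numeric_only=True skips the string 'year' column, which newer pandas versions refuse to average
monthly=monthYear.groupby(['month']).mean(numeric_only=True)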
monthly

| month | temp |
|---|---|
| 1 | 55.434047 |
| 2 | 56.078107 |
| 3 | 59.114076 |
| 4 | 63.465818 |
| 5 | 69.825123 |
| 6 | 76.094807 |
| 7 | 80.706629 |
| 8 | 81.433214 |
| 9 | 77.987837 |
| 10 | 72.329584 |


In [69]:
# Months 1 to 3 only; numeric_only=True skips the string 'year' column
winter=monthYear.loc[monthYear['month'] < 4 ].groupby(['month']).mean(numeric_only=True)
winter


| month | temp |
|---|---|
| 1 | 55.434047 |
| 2 | 56.078107 |
| 3 | 59.114076 |


In [68]:
december=monthYear.loc[monthYear['month'] == 12 ].groupby(['year','month']).mean()
december

| year | month | temp |
|---|---|---|
| 2000 | 12 | 58.634516 |
| 2001 | 12 | 57.609099 |
| 2002 | 12 | 57.30922 |
| 2003 | 12 | 58.927715 |
| 2004 | 12 | 56.998159 |
| 2005 | 12 | 58.957675 |
| 2006 | 12 | 56.970282 |
| 2007 | 12 | 58.443884 |
| 2008 | 12 | 59.40711 |
| 2009 | 12 | 60.644704 |
