#### Paphos Weather

This page shows various metrics for the weather in Paphos, Cyprus.

The weather data was purchased from the OpenWeather Marketplace [here](https://home.openweathermap.org/marketplace/my_orders).



In [4]:

from datetime import datetime
import pandas as pd


def toDate(epoch_time):
    # Convert epoch seconds to a datetime in the local timezone of this machine.
    return datetime.fromtimestamp(epoch_time)



In [2]:
import pandas as pd

df=pd.read_json('/Users/walkerrowe/Documents/weather/pathosWeather.json')

In [3]:
df.columns

Index(['city_name', 'lat', 'lon', 'main', 'wind', 'clouds', 'weather', 'dt',
       'dt_iso', 'timezone', 'rain'],
      dtype='object')

In [5]:
df['dt'].map(toDate)

0        2000-01-01 02:00:00
1        2000-01-01 03:00:00
2        2000-01-01 04:00:00
3        2000-01-01 05:00:00
4        2000-01-01 06:00:00
                 ...        
181459   2020-09-12 22:00:00
181460   2020-09-12 23:00:00
181461   2020-09-13 00:00:00
181462   2020-09-13 01:00:00
181463   2020-09-13 02:00:00
Name: dt, Length: 181464, dtype: datetime64[ns]

The dt column holds epoch time, so we have to convert it to a date and time that we can read.

The map function applies a function to every element of a Series.  A Series is a single column of a dataframe.  When we write df['new column'] = df['some column'].map(somefunction), the result is added to the dataframe as a new column.
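
As a minimal sketch of that pattern, the cell below builds a tiny, made-up dataframe of epoch times and adds a readable column with map. The sample values and the helper name `to_utc_date` are illustrative assumptions, not part of the notebook; unlike toDate above, the helper pins the timezone to UTC so the result does not depend on the machine it runs on.

```python
from datetime import datetime, timezone

import pandas as pd

# Hypothetical sample: three hourly epoch timestamps (seconds since 1970-01-01 UTC).
sample = pd.DataFrame({"dt": [946684800, 946688400, 946692000]})


def to_utc_date(epoch_time):
    # Passing tz makes the result independent of the local timezone.
    return datetime.fromtimestamp(epoch_time, tz=timezone.utc)


# map() calls to_utc_date once per element and returns a new Series,
# which is then stored as a new column of the dataframe.
sample["isoDate"] = sample["dt"].map(to_utc_date)
print(sample)
```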


The rows with a non-null rain value are the hours when it rained.  We also need to know when it was cloudy.

In [81]:
df['dateTime']=df['dt'].map(lambda l: toDate(l).strftime("%Y-%m-%d %H:%M:%S"))

In [17]:
rainy = df[~df['rain'].isnull()]

In [20]:
rainy['rain']

isoDate
2000-01-03 00:00:00    {'1h': 0.30000000000000004}
2000-01-03 01:00:00    {'1h': 0.30000000000000004}
2000-01-03 02:00:00                    {'1h': 0.5}
2000-01-03 03:00:00                    {'1h': 0.5}
2000-01-03 04:00:00                    {'1h': 0.5}
                                  ...             
2020-04-25 09:00:00    {'1h': 0.30000000000000004}
2020-04-25 14:00:00                   {'1h': 0.33}
2020-05-03 04:00:00                   {'1h': 0.33}
2020-05-05 09:00:00    {'1h': 0.30000000000000004}
2020-05-24 04:00:00                   {'1h': 0.33}
Name: rain, Length: 5395, dtype: object
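
Each rain value is a dict keyed by '1h', which is awkward to filter or sum. A possible way to flatten it into a plain number is sketched below on made-up rows; the column name `rain_1h` is an assumption, and treating hours without a rain record as 0.0 is a choice you may not want.

```python
import pandas as pd

# Hypothetical sample mimicking the 'rain' column:
# a dict on rainy hours, NaN on hours without a rain record.
sample = pd.DataFrame({"rain": [{"1h": 0.3}, float("nan"), {"1h": 0.5}]})

# Pull out the hourly amount; anything that is not a dict counts as 0.0.
sample["rain_1h"] = sample["rain"].map(
    lambda r: r.get("1h", 0.0) if isinstance(r, dict) else 0.0
)

# Keep only the hours with measurable rain.
rainy_hours = sample[sample["rain_1h"] > 0]
print(rainy_hours["rain_1h"].tolist())
```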

In [7]:
df['clouds']

0          {'all': 1}
1         {'all': 15}
2         {'all': 30}
3          {'all': 1}
4         {'all': 14}
             ...     
181459    {'all': 40}
181460    {'all': 40}
181461    {'all': 40}
181462    {'all': 40}
181463    {'all': 20}
Name: clouds, Length: 181464, dtype: object
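
To answer the question of when it was cloudy, the 'all' value can be pulled out of each dict in the same way. This is only a sketch on made-up rows: the column name `cloud_pct` and the 50 cut-off are assumptions chosen for illustration, not values taken from the data provider.

```python
import pandas as pd

# Hypothetical sample mimicking the 'clouds' column.
sample = pd.DataFrame(
    {"clouds": [{"all": 1}, {"all": 15}, {"all": 40}, {"all": 90}]}
)

# Flatten the dict into a plain numeric column.
sample["cloud_pct"] = sample["clouds"].map(lambda c: c["all"])

# Assumed cut-off for calling an hour "cloudy"; adjust to taste.
CLOUDY_THRESHOLD = 50
cloudy = sample[sample["cloud_pct"] >= CLOUDY_THRESHOLD]
print(cloudy)
```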

In [8]:
df['weather']

0         [{'id': 800, 'main': 'Clear', 'description': '...
1         [{'id': 801, 'main': 'Clouds', 'description': ...
2         [{'id': 802, 'main': 'Clouds', 'description': ...
3         [{'id': 800, 'main': 'Clear', 'description': '...
4         [{'id': 801, 'main': 'Clouds', 'description': ...
                                ...                        
181459    [{'id': 802, 'main': 'Clouds', 'description': ...
181460    [{'id': 802, 'main': 'Clouds', 'description': ...
181461    [{'id': 802, 'main': 'Clouds', 'description': ...
181462    [{'id': 802, 'main': 'Clouds', 'description': ...
181463    [{'id': 801, 'main': 'Clouds', 'description': ...
Name: weather, Length: 181464, dtype: object

In [8]:
df['isoDate']=df['dt'].map(toDate)

In [9]:
df.columns

Index(['city_name', 'lat', 'lon', 'main', 'wind', 'clouds', 'weather', 'dt',
       'dt_iso', 'timezone', 'rain', 'isoDate'],
      dtype='object')

We want to calculate the average temperature by date using the mean() function.  But the problem is that the temperature is stored inside the **main** object.  See below how to fix that.

In [14]:
df['weatherMain']=df['weather'].map(lambda x: x[0]['main'])

In [15]:
df['weatherDesc']=df['weather'].map(lambda x: x[0]['description'])

In [17]:
df['weatherMain']

0          Clear
1         Clouds
2         Clouds
3          Clear
4         Clouds
           ...  
181459    Clouds
181460    Clouds
181461    Clouds
181462    Clouds
181463    Clouds
Name: weatherMain, Length: 181464, dtype: object

In [42]:
df['weatherDesc']

0             sky is clear
1               few clouds
2         scattered clouds
3             sky is clear
4               few clouds
                ...       
181459    scattered clouds
181460    scattered clouds
181461    scattered clouds
181462    scattered clouds
181463          few clouds
Name: weatherDesc, Length: 181464, dtype: object

In [16]:
df['temp']=df["main"].map(lambda x: x["temp"])
df['temp_min']=df["main"].map(lambda x: x["temp_min"])
df['temp_max']=df["main"].map(lambda x: x["temp_max"])
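
Once the temperature is a plain column, the average by date could be computed with groupby and mean(). The sketch below uses made-up rows; the dt_iso format shown and the column name `day` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample; dt_iso is assumed to look like "YYYY-MM-DD HH:MM:SS +0000 UTC".
sample = pd.DataFrame(
    {
        "dt_iso": [
            "2000-01-01 00:00:00 +0000 UTC",
            "2000-01-01 01:00:00 +0000 UTC",
            "2000-01-02 00:00:00 +0000 UTC",
        ],
        "main": [{"temp": 10.0}, {"temp": 14.0}, {"temp": 16.0}],
    }
)

# Lift the nested temperature into its own column.
sample["temp"] = sample["main"].map(lambda m: m["temp"])

# The first ten characters of dt_iso are the calendar date.
sample["day"] = sample["dt_iso"].str[:10]

# Average temperature per calendar date.
daily_mean = sample.groupby("day")["temp"].mean()
print(daily_mean)
```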

In [20]:
import numpy as np

year=[]
month=[]
row=len(df.index)
for j in range(0,row):
    date=df["dt_iso"][j]
    onlyDate=date[:len(date)-19]
    year.append(onlyDate[0:4])
    month.append(int(onlyDate[5:7]))
    
    
month=np.asarray(month)
year=np.asarray(year)

In [65]:
# The arrays have one entry per row, so they can be assigned directly.
df['month']=month
df['year']=year.astype('int32')
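
The loop above can also be written without iterating row by row. A minimal sketch using the pandas string accessor follows, on made-up rows and assuming dt_iso always starts with "YYYY-MM-DD"; it is an alternative, not the method the notebook used.

```python
import pandas as pd

# Hypothetical sample; dt_iso is assumed to start with "YYYY-MM-DD".
sample = pd.DataFrame(
    {
        "dt_iso": [
            "2000-01-01 00:00:00 +0000 UTC",
            "2019-12-31 23:00:00 +0000 UTC",
        ]
    }
)

# Slice the year and month out of every string at once, then cast to integers.
sample["year"] = sample["dt_iso"].str[0:4].astype("int32")
sample["month"] = sample["dt_iso"].str[5:7].astype("int32")
print(sample[["year", "month"]])
```

Storing year as an integer matters later, because a comparison such as df['year']==2019 matches nothing when the column holds strings.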

In [43]:
winter=df.loc[df['month'] < 4 ].groupby(['month']) 

In [82]:
december=df.loc[df['month'] == 12 ].groupby(['year','month','weatherMain'])

In [51]:
december.groups.keys()

dict_keys([('2000', 12, 'Clear'), ('2000', 12, 'Clouds'), ('2000', 12, 'Rain'), ('2000', 12, 'Thunderstorm'), ('2001', 12, 'Clear'), ('2001', 12, 'Clouds'), ('2001', 12, 'Rain'), ('2001', 12, 'Thunderstorm'), ('2002', 12, 'Clear'), ('2002', 12, 'Clouds'), ('2002', 12, 'Rain'), ('2002', 12, 'Thunderstorm'), ('2003', 12, 'Clear'), ('2003', 12, 'Clouds'), ('2003', 12, 'Rain'), ('2003', 12, 'Thunderstorm'), ('2004', 12, 'Clear'), ('2004', 12, 'Clouds'), ('2004', 12, 'Rain'), ('2004', 12, 'Thunderstorm'), ('2005', 12, 'Clear'), ('2005', 12, 'Clouds'), ('2005', 12, 'Haze'), ('2005', 12, 'Mist'), ('2005', 12, 'Rain'), ('2006', 12, 'Clear'), ('2006', 12, 'Clouds'), ('2006', 12, 'Rain'), ('2006', 12, 'Thunderstorm'), ('2007', 12, 'Clear'), ('2007', 12, 'Clouds'), ('2007', 12, 'Rain'), ('2007', 12, 'Thunderstorm'), ('2008', 12, 'Clear'), ('2008', 12, 'Clouds'), ('2008', 12, 'Rain'), ('2008', 12, 'Thunderstorm'), ('2009', 12, 'Clear'), ('2009', 12, 'Clouds'), ('2009', 12, 'Haze'), ('2009', 12, 'R

In [64]:
df['year']

0         2000
1         2000
2         2000
3         2000
4         2000
          ... 
181459    2020
181460    2020
181461    2020
181462    2020
181463    2020
Name: year, Length: 181464, dtype: object

In [None]:
this=(df.loc[(df['month'] == 12) & (df['year']==2019)])
pd.set_option('display.max_rows', this.shape[0]+1)
print(this)

In [104]:
this.groupby(['year','month','weatherMain'])['dt'].count()

year  month  weatherMain 
2019  12     Clear            42
             Clouds          602
             Mist              1
             Rain             48
             Thunderstorm     51
Name: dt, dtype: int64
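
The counts are hourly observations, so they may be easier to read as a share of the month. The sketch below re-enters the counts printed above by hand rather than recomputing them from df; the variable names are illustrative.

```python
import pandas as pd

# Hourly counts for December 2019, copied from the output above.
counts = pd.Series(
    {"Clear": 42, "Clouds": 602, "Mist": 1, "Rain": 48, "Thunderstorm": 51},
    name="hours",
)

# Share of all recorded hours, in percent, rounded to one decimal place.
share = (counts / counts.sum() * 100).round(1)
print(share.sort_values(ascending=False))
```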