# European Weather Assessment
### Original Taken From:


EUROPEAN CLIMATE ASSESSMENT & DATASET (ECA&D), file created on 22-04-2021
THESE DATA CAN BE USED FREELY PROVIDED THAT THE FOLLOWING SOURCE IS ACKNOWLEDGED:

Klein Tank, A.M.G. and Coauthors, 2002. Daily dataset of 20th-century surface
air temperature and precipitation series for the European Climate Assessment.
Int. J. of Climatol., 22, 1441-1453.
Data and metadata available at http://www.ecad.eu

-----------
Data collection selection and processing
----------

The initial meteorological data was retrieved from ECA&D [1], a project that makes available daily 
observations at meteorological stations throughout Europe and the Mediterranean. 18 different European 
cities or places were selected for which multiple daily observations were available throughout the 
years 2000 to 2010. Those were Basel (Switzerland), Budapest (Hungary), Dresden, Düsseldorf, 
Kassel, München (all Germany), De Bilt and Maastricht (the Netherlands), Heathrow (UK), Ljubljana (Slovenia), 
Malmo and Stockholm (Sweden), Montélimar, Perpignan and Tours (France), Oslo (Norway), Roma (Italy), and 
Sonnblick (Austria). 

Recordings of daily meteorological observations for these 18 locations span different periods; some 
collections go back to the 19th century. Here, however, we only selected the time span from 
2000 to 2010, resulting in 3654 daily observations. The dataset is then constructed from all data of 
those 18 locations.
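
As a quick sanity check on the row count, the sketch below assumes the 3654 rows are consecutive calendar days starting on 2000-01-01 (the first `DATE` value shown in this notebook). That assumption is not stated in the source, so the computed end date is only what would follow from it.

```python
import pandas as pd

# Assumption: one row per consecutive calendar day, starting 2000-01-01.
days = pd.date_range("2000-01-01", periods=3654, freq="D")

first_day = days[0]
last_day = days[-1]

# 2000-2009 contains three leap years (2000, 2004, 2008), i.e. 3653 days,
# so a 3654th consecutive day would fall in 2010.
print(first_day.date(), last_day.date())
```

If the real `DATE` column has gaps or duplicates, comparing it against such a range (for example with `set(days) - set(df['DATE'])`) would show where.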

In addition, the data consists of several kinds of observations. While all selected locations provide data 
for the variables ‘mean temperature’, ‘max temperature’, and ‘min temperature’, we also included 
data for the variables 'cloud_cover', 'wind_speed', 'wind_gust', 'humidity', 'pressure', 'global_radiation', 
'precipitation', 'sunshine' wherever those were available. 

After collecting the data, very basic cleaning of the data was performed. Columns with > 5% invalid 
entries (“-9999”) were removed; columns with <= 5% invalid entries were kept, but invalid entries 
were replaced by mean values. This resulted in 165 variables (or features) over the course of 3654 days.
Finally, we transformed several data units to achieve more similar ranges of the present values. 
This makes the data more suitable to be used for machine learning or deep learning even without 
additional processing. We deliberately chose not to fully standardize the data because we 
wanted to keep the presented units and values as intuitively accessible as possible. Temperatures are 
now given in degrees Celsius, wind speed and gust in m/s, humidity as a fraction of 100%, sea level 
pressure in 1000 hPa, global radiation in 100 W/m2, precipitation amounts in centimeters, sunshine in hours.
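
The cleaning rule described above (drop columns with more than 5% invalid entries, mean-fill the rest) can be sketched roughly as follows. This is not the original preprocessing code, which is not part of this notebook; the function name `clean_invalid` and the toy frame are hypothetical, and the sketch assumes the column mean is taken over the valid entries only.

```python
import numpy as np
import pandas as pd

def clean_invalid(raw: pd.DataFrame, sentinel: float = -9999,
                  max_invalid: float = 0.05) -> pd.DataFrame:
    """Drop columns with too many sentinel values; mean-fill the remaining gaps."""
    # Treat the sentinel as missing so pandas can count and fill it.
    data = raw.replace(sentinel, np.nan)
    # Fraction of invalid entries per column.
    invalid_share = data.isna().mean()
    kept = data.loc[:, invalid_share <= max_invalid]
    # Assumption: gaps are filled with the mean of each column's valid entries.
    return kept.fillna(kept.mean())

# Toy frame with made-up numbers: 'b' has 1/40 invalid (kept and filled),
# 'c' has 4/40 invalid (dropped).
toy = pd.DataFrame({
    "a": np.arange(40, dtype=float),
    "b": [-9999.0] + [2.0] * 39,
    "c": [-9999.0] * 4 + [1.0] * 36,
})
cleaned = clean_invalid(toy)
print(list(cleaned.columns))
```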


-----------
Physical units of the variables:
----------

ORIGINAL DATA UNITS:

- CC   : cloud cover in oktas
- DD   : wind direction in degrees
- FG   : wind speed in 0.1 m/s
- FX   : wind gust in 0.1 m/s
- HU   : humidity in 1 %
- PP   : sea level pressure in 0.1 hPa
- QQ   : global radiation in W/m2
- RR   : precipitation amount in 0.1 mm
- SS   : sunshine in 0.1 hours
- TG   : mean temperature in 0.1 °C
- TN   : minimum temperature in 0.1 °C
- TX   : maximum temperature in 0.1 °C



CONVERTED to:
- CC   : cloud cover in oktas
- DD   : wind direction in degrees
- FG   : wind speed in 1 m/s
- FX   : wind gust in 1 m/s
- HU   : humidity in fraction of 100 %
- PP   : sea level pressure in 1000 hPa
- QQ   : global radiation in 100 W/m2
- RR   : precipitation amount in 10 mm
- SS   : sunshine in 1 hours
- TG   : mean temperature in 1 °C
- TN   : minimum temperature in 1 °C
- TX   : maximum temperature in 1 °C
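
Comparing the two unit lists gives one multiplicative factor per variable, for example 0.1 for temperatures (0.1 °C to 1 °C) and 0.0001 for pressure (0.1 hPa to 1000 hPa). The sketch below writes those factors out. The helper `convert_units`, the use of bare ECA&D codes as column names, and the raw sample values (back-calculated from the first Basel row shown later in this notebook) are illustrative assumptions, not the original conversion script.

```python
import pandas as pd

# Factors implied by the ORIGINAL and CONVERTED unit lists (ECA&D code -> factor).
SCALE = {
    "CC": 1.0, "DD": 1.0,          # unchanged
    "FG": 0.1, "FX": 0.1,          # 0.1 m/s  -> 1 m/s
    "HU": 0.01,                    # 1 %      -> fraction of 100 %
    "PP": 0.0001,                  # 0.1 hPa  -> 1000 hPa
    "QQ": 0.01,                    # W/m2     -> 100 W/m2
    "RR": 0.01,                    # 0.1 mm   -> 10 mm
    "SS": 0.1,                     # 0.1 h    -> 1 h
    "TG": 0.1, "TN": 0.1, "TX": 0.1,  # 0.1 degC -> 1 degC
}

def convert_units(raw: pd.DataFrame) -> pd.DataFrame:
    """Scale every column whose name is a known ECA&D code; leave others untouched."""
    out = raw.astype(float)
    for code, factor in SCALE.items():
        if code in out.columns:
            out[code] = out[code] * factor
    return out

# Hypothetical raw values chosen so the result matches the first Basel row.
raw_day = pd.DataFrame({"TG": [29], "PP": [10286], "HU": [89], "RR": [3]})
converted = convert_units(raw_day)
print(converted.round(4))
```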



In [100]:
import pandas as pd
import numpy as np
import seaborn as sb
import matplotlib.pyplot as plt
import autoreload
import missingno as msno
import warnings
import csv
import os
import sys

from collections import defaultdict, Counter

%matplotlib inline

sb.set()
sb.set_style('ticks')
sb.set_palette('Accent')

pd.set_option('display.max_columns', 200)
pd.set_option('display.max_rows', 1000)

warnings.filterwarnings('ignore')

In [88]:
# Raw string so the backslashes in the Windows path are not read as escape sequences.
df = pd.read_csv(r"D:\Open Classroom\Datasets\Weather Prediction II\weather_prediction_dataset.csv")

In [89]:
df.shape

(3654, 165)

In [90]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3654 entries, 0 to 3653
Columns: 165 entries, DATE to TOURS_temp_max
dtypes: float64(150), int64(15)
memory usage: 4.6 MB


In [91]:
df.head()

Unnamed: 0,DATE,MONTH,BASEL_cloud_cover,BASEL_humidity,BASEL_pressure,BASEL_global_radiation,BASEL_precipitation,BASEL_sunshine,BASEL_temp_mean,BASEL_temp_min,BASEL_temp_max,BUDAPEST_cloud_cover,BUDAPEST_humidity,BUDAPEST_pressure,BUDAPEST_global_radiation,BUDAPEST_precipitation,BUDAPEST_sunshine,BUDAPEST_temp_mean,BUDAPEST_temp_max,DE_BILT_cloud_cover,DE_BILT_wind_speed,DE_BILT_wind_gust,DE_BILT_humidity,DE_BILT_pressure,DE_BILT_global_radiation,DE_BILT_precipitation,DE_BILT_sunshine,DE_BILT_temp_mean,DE_BILT_temp_min,DE_BILT_temp_max,DRESDEN_cloud_cover,DRESDEN_wind_speed,DRESDEN_wind_gust,DRESDEN_humidity,DRESDEN_global_radiation,DRESDEN_precipitation,DRESDEN_sunshine,DRESDEN_temp_mean,DRESDEN_temp_min,DRESDEN_temp_max,DUSSELDORF_cloud_cover,DUSSELDORF_wind_speed,DUSSELDORF_wind_gust,DUSSELDORF_humidity,DUSSELDORF_pressure,DUSSELDORF_global_radiation,DUSSELDORF_precipitation,DUSSELDORF_sunshine,DUSSELDORF_temp_mean,DUSSELDORF_temp_min,...,OSLO_cloud_cover,OSLO_wind_speed,OSLO_wind_gust,OSLO_humidity,OSLO_pressure,OSLO_global_radiation,OSLO_precipitation,OSLO_sunshine,OSLO_temp_mean,OSLO_temp_min,OSLO_temp_max,PERPIGNAN_wind_speed,PERPIGNAN_humidity,PERPIGNAN_pressure,PERPIGNAN_global_radiation,PERPIGNAN_precipitation,PERPIGNAN_temp_mean,PERPIGNAN_temp_min,PERPIGNAN_temp_max,ROMA_cloud_cover,ROMA_humidity,ROMA_pressure,ROMA_global_radiation,ROMA_sunshine,ROMA_temp_mean,ROMA_temp_min,ROMA_temp_max,SONNBLICK_cloud_cover,SONNBLICK_humidity,SONNBLICK_global_radiation,SONNBLICK_precipitation,SONNBLICK_sunshine,SONNBLICK_temp_mean,SONNBLICK_temp_min,SONNBLICK_temp_max,STOCKHOLM_cloud_cover,STOCKHOLM_pressure,STOCKHOLM_precipitation,STOCKHOLM_sunshine,STOCKHOLM_temp_mean,STOCKHOLM_temp_min,STOCKHOLM_temp_max,TOURS_wind_speed,TOURS_humidity,TOURS_pressure,TOURS_global_radiation,TOURS_precipitation,TOURS_temp_mean,TOURS_temp_min,TOURS_temp_max
0,20000101,1,8,0.89,1.0286,0.2,0.03,0.0,2.9,1.6,3.9,3,0.92,1.0268,0.52,0.0,3.7,-4.9,-0.7,7,2.5,8.0,0.97,1.024,0.11,0.1,0.0,6.1,3.5,8.1,8,3.2,7.2,0.89,0.09,0.32,0.0,1.0,-1.8,2.0,8,2.5,5.9,0.92,1.024,0.12,0.22,0.0,4.2,2.5,...,7,0.9,5.1,0.94,1.013,0.04,0.6,0.0,-5.0,-8.6,-3.2,4.4,0.71,1.0267,0.6,0.0,12.2,10.3,14.0,0,0.72,1.0244,0.92,8.4,1.6,3.0,8.0,7,0.89,0.82,1.34,0.0,-15.2,-17.0,-13.4,8,1.0163,0.17,0.0,-2.3,-9.3,0.7,1.6,0.97,1.0275,0.25,0.04,8.5,7.2,9.8
1,20000102,1,8,0.87,1.0318,0.25,0.0,0.0,3.6,2.7,4.8,8,0.94,1.0297,0.14,0.0,0.4,-3.6,-1.9,8,3.7,9.0,0.97,1.0267,0.11,0.0,0.0,7.3,5.4,8.7,7,4.0,8.8,0.89,0.23,0.0,0.4,2.5,1.4,4.0,6,3.0,7.4,0.87,1.0283,0.19,0.0,0.7,6.5,2.7,...,6,1.9,5.7,0.94,1.0076,0.11,0.0,1.6,-0.8,-6.7,2.4,2.9,0.67,1.0278,0.96,0.0,9.8,5.1,14.6,2,0.74,1.0263,0.81,6.5,4.2,0.0,8.4,5,0.86,0.6,0.39,2.8,-13.7,-15.0,-12.3,8,1.0108,0.2,0.0,1.3,0.5,2.0,2.0,0.99,1.0293,0.17,0.16,7.9,6.6,9.2
2,20000103,1,5,0.81,1.0314,0.5,0.0,3.7,2.2,0.1,4.8,6,0.95,1.0295,0.19,0.0,0.0,-0.8,1.1,8,6.1,13.0,0.94,1.0203,0.11,0.45,0.0,8.4,6.4,9.6,7,5.4,12.1,0.79,0.18,0.0,0.0,4.2,1.3,5.1,7,5.5,14.3,0.78,1.0235,0.12,0.28,0.0,7.7,6.9,...,6,1.7,8.7,0.88,1.0016,0.04,0.0,0.0,1.2,-1.1,3.8,2.5,0.85,1.0288,0.93,0.0,8.6,4.1,13.2,0,0.77,1.0288,0.89,0.0,3.8,11.1,21.1,3,0.41,0.81,0.0,5.1,-9.2,-12.5,-5.8,7,1.0071,0.08,1.8,0.8,-1.0,2.8,3.4,0.91,1.0267,0.27,0.0,8.1,6.6,9.6
3,20000104,1,7,0.79,1.0262,0.63,0.35,6.9,3.9,0.5,7.5,8,0.94,1.0252,0.21,0.0,0.0,-1.0,0.1,7,3.8,15.0,0.94,1.0142,0.11,1.09,0.0,6.4,4.3,9.4,8,6.0,14.4,0.88,0.11,0.22,0.0,4.4,3.4,5.2,7,6.0,16.8,0.87,1.0162,0.12,0.97,0.0,7.8,6.6,...,1,3.4,11.8,0.58,0.9982,0.13,0.0,5.3,2.1,-0.5,5.1,1.5,0.85,1.0269,0.56,0.02,8.6,4.3,12.8,1,0.85,1.0273,0.89,8.2,6.0,2.0,10.0,1,0.25,1.05,0.11,8.7,-5.6,-7.0,-4.2,2,0.9947,0.0,5.0,3.5,2.5,4.6,4.9,0.95,1.0222,0.11,0.44,8.6,6.4,10.8
4,20000105,1,5,0.9,1.0246,0.51,0.07,3.7,6.0,3.8,8.6,5,0.88,1.0235,0.43,0.0,0.8,0.2,3.9,3,4.0,12.0,0.9,1.0183,0.48,0.0,6.5,4.4,1.4,7.4,2,5.6,15.8,0.76,0.49,0.0,5.7,1.8,-0.5,6.9,4,4.5,11.2,0.8,1.0203,0.51,0.0,6.5,5.2,0.4,...,8,1.2,5.7,0.94,1.0055,0.05,0.06,0.0,-0.7,-4.0,0.5,2.6,0.74,1.0219,0.83,0.02,9.2,3.6,14.9,2,0.92,1.0238,0.74,7.5,5.0,-1.2,11.2,4,0.77,0.69,0.17,3.4,-7.6,-9.4,-5.8,5,1.0072,0.0,2.2,-0.6,-1.8,2.9,3.6,0.95,1.0209,0.39,0.04,8.0,6.4,9.5


In [92]:
df['DATE'].dtypes

dtype('int64')

In [93]:
df['DATE'] = df['DATE'].astype(str)

In [94]:
import datetime

df['DATE'] = list(map(lambda x: datetime.datetime.strptime(x, '%Y%m%d').strftime('%m/%d/%Y'), df['DATE']))

In [95]:
df.head()

Unnamed: 0,DATE,MONTH,BASEL_cloud_cover,BASEL_humidity,BASEL_pressure,BASEL_global_radiation,BASEL_precipitation,BASEL_sunshine,BASEL_temp_mean,BASEL_temp_min,BASEL_temp_max,BUDAPEST_cloud_cover,BUDAPEST_humidity,BUDAPEST_pressure,BUDAPEST_global_radiation,BUDAPEST_precipitation,BUDAPEST_sunshine,BUDAPEST_temp_mean,BUDAPEST_temp_max,DE_BILT_cloud_cover,DE_BILT_wind_speed,DE_BILT_wind_gust,DE_BILT_humidity,DE_BILT_pressure,DE_BILT_global_radiation,DE_BILT_precipitation,DE_BILT_sunshine,DE_BILT_temp_mean,DE_BILT_temp_min,DE_BILT_temp_max,DRESDEN_cloud_cover,DRESDEN_wind_speed,DRESDEN_wind_gust,DRESDEN_humidity,DRESDEN_global_radiation,DRESDEN_precipitation,DRESDEN_sunshine,DRESDEN_temp_mean,DRESDEN_temp_min,DRESDEN_temp_max,DUSSELDORF_cloud_cover,DUSSELDORF_wind_speed,DUSSELDORF_wind_gust,DUSSELDORF_humidity,DUSSELDORF_pressure,DUSSELDORF_global_radiation,DUSSELDORF_precipitation,DUSSELDORF_sunshine,DUSSELDORF_temp_mean,DUSSELDORF_temp_min,...,OSLO_cloud_cover,OSLO_wind_speed,OSLO_wind_gust,OSLO_humidity,OSLO_pressure,OSLO_global_radiation,OSLO_precipitation,OSLO_sunshine,OSLO_temp_mean,OSLO_temp_min,OSLO_temp_max,PERPIGNAN_wind_speed,PERPIGNAN_humidity,PERPIGNAN_pressure,PERPIGNAN_global_radiation,PERPIGNAN_precipitation,PERPIGNAN_temp_mean,PERPIGNAN_temp_min,PERPIGNAN_temp_max,ROMA_cloud_cover,ROMA_humidity,ROMA_pressure,ROMA_global_radiation,ROMA_sunshine,ROMA_temp_mean,ROMA_temp_min,ROMA_temp_max,SONNBLICK_cloud_cover,SONNBLICK_humidity,SONNBLICK_global_radiation,SONNBLICK_precipitation,SONNBLICK_sunshine,SONNBLICK_temp_mean,SONNBLICK_temp_min,SONNBLICK_temp_max,STOCKHOLM_cloud_cover,STOCKHOLM_pressure,STOCKHOLM_precipitation,STOCKHOLM_sunshine,STOCKHOLM_temp_mean,STOCKHOLM_temp_min,STOCKHOLM_temp_max,TOURS_wind_speed,TOURS_humidity,TOURS_pressure,TOURS_global_radiation,TOURS_precipitation,TOURS_temp_mean,TOURS_temp_min,TOURS_temp_max
0,01/01/2000,1,8,0.89,1.0286,0.2,0.03,0.0,2.9,1.6,3.9,3,0.92,1.0268,0.52,0.0,3.7,-4.9,-0.7,7,2.5,8.0,0.97,1.024,0.11,0.1,0.0,6.1,3.5,8.1,8,3.2,7.2,0.89,0.09,0.32,0.0,1.0,-1.8,2.0,8,2.5,5.9,0.92,1.024,0.12,0.22,0.0,4.2,2.5,...,7,0.9,5.1,0.94,1.013,0.04,0.6,0.0,-5.0,-8.6,-3.2,4.4,0.71,1.0267,0.6,0.0,12.2,10.3,14.0,0,0.72,1.0244,0.92,8.4,1.6,3.0,8.0,7,0.89,0.82,1.34,0.0,-15.2,-17.0,-13.4,8,1.0163,0.17,0.0,-2.3,-9.3,0.7,1.6,0.97,1.0275,0.25,0.04,8.5,7.2,9.8
1,01/02/2000,1,8,0.87,1.0318,0.25,0.0,0.0,3.6,2.7,4.8,8,0.94,1.0297,0.14,0.0,0.4,-3.6,-1.9,8,3.7,9.0,0.97,1.0267,0.11,0.0,0.0,7.3,5.4,8.7,7,4.0,8.8,0.89,0.23,0.0,0.4,2.5,1.4,4.0,6,3.0,7.4,0.87,1.0283,0.19,0.0,0.7,6.5,2.7,...,6,1.9,5.7,0.94,1.0076,0.11,0.0,1.6,-0.8,-6.7,2.4,2.9,0.67,1.0278,0.96,0.0,9.8,5.1,14.6,2,0.74,1.0263,0.81,6.5,4.2,0.0,8.4,5,0.86,0.6,0.39,2.8,-13.7,-15.0,-12.3,8,1.0108,0.2,0.0,1.3,0.5,2.0,2.0,0.99,1.0293,0.17,0.16,7.9,6.6,9.2
2,01/03/2000,1,5,0.81,1.0314,0.5,0.0,3.7,2.2,0.1,4.8,6,0.95,1.0295,0.19,0.0,0.0,-0.8,1.1,8,6.1,13.0,0.94,1.0203,0.11,0.45,0.0,8.4,6.4,9.6,7,5.4,12.1,0.79,0.18,0.0,0.0,4.2,1.3,5.1,7,5.5,14.3,0.78,1.0235,0.12,0.28,0.0,7.7,6.9,...,6,1.7,8.7,0.88,1.0016,0.04,0.0,0.0,1.2,-1.1,3.8,2.5,0.85,1.0288,0.93,0.0,8.6,4.1,13.2,0,0.77,1.0288,0.89,0.0,3.8,11.1,21.1,3,0.41,0.81,0.0,5.1,-9.2,-12.5,-5.8,7,1.0071,0.08,1.8,0.8,-1.0,2.8,3.4,0.91,1.0267,0.27,0.0,8.1,6.6,9.6
3,01/04/2000,1,7,0.79,1.0262,0.63,0.35,6.9,3.9,0.5,7.5,8,0.94,1.0252,0.21,0.0,0.0,-1.0,0.1,7,3.8,15.0,0.94,1.0142,0.11,1.09,0.0,6.4,4.3,9.4,8,6.0,14.4,0.88,0.11,0.22,0.0,4.4,3.4,5.2,7,6.0,16.8,0.87,1.0162,0.12,0.97,0.0,7.8,6.6,...,1,3.4,11.8,0.58,0.9982,0.13,0.0,5.3,2.1,-0.5,5.1,1.5,0.85,1.0269,0.56,0.02,8.6,4.3,12.8,1,0.85,1.0273,0.89,8.2,6.0,2.0,10.0,1,0.25,1.05,0.11,8.7,-5.6,-7.0,-4.2,2,0.9947,0.0,5.0,3.5,2.5,4.6,4.9,0.95,1.0222,0.11,0.44,8.6,6.4,10.8
4,01/05/2000,1,5,0.9,1.0246,0.51,0.07,3.7,6.0,3.8,8.6,5,0.88,1.0235,0.43,0.0,0.8,0.2,3.9,3,4.0,12.0,0.9,1.0183,0.48,0.0,6.5,4.4,1.4,7.4,2,5.6,15.8,0.76,0.49,0.0,5.7,1.8,-0.5,6.9,4,4.5,11.2,0.8,1.0203,0.51,0.0,6.5,5.2,0.4,...,8,1.2,5.7,0.94,1.0055,0.05,0.06,0.0,-0.7,-4.0,0.5,2.6,0.74,1.0219,0.83,0.02,9.2,3.6,14.9,2,0.92,1.0238,0.74,7.5,5.0,-1.2,11.2,4,0.77,0.69,0.17,3.4,-7.6,-9.4,-5.8,5,1.0072,0.0,2.2,-0.6,-1.8,2.9,3.6,0.95,1.0209,0.39,0.04,8.0,6.4,9.5


In [96]:
df['DATE'] = pd.to_datetime(df['DATE'])
df.head()

Unnamed: 0,DATE,MONTH,BASEL_cloud_cover,BASEL_humidity,BASEL_pressure,BASEL_global_radiation,BASEL_precipitation,BASEL_sunshine,BASEL_temp_mean,BASEL_temp_min,BASEL_temp_max,BUDAPEST_cloud_cover,BUDAPEST_humidity,BUDAPEST_pressure,BUDAPEST_global_radiation,BUDAPEST_precipitation,BUDAPEST_sunshine,BUDAPEST_temp_mean,BUDAPEST_temp_max,DE_BILT_cloud_cover,DE_BILT_wind_speed,DE_BILT_wind_gust,DE_BILT_humidity,DE_BILT_pressure,DE_BILT_global_radiation,DE_BILT_precipitation,DE_BILT_sunshine,DE_BILT_temp_mean,DE_BILT_temp_min,DE_BILT_temp_max,DRESDEN_cloud_cover,DRESDEN_wind_speed,DRESDEN_wind_gust,DRESDEN_humidity,DRESDEN_global_radiation,DRESDEN_precipitation,DRESDEN_sunshine,DRESDEN_temp_mean,DRESDEN_temp_min,DRESDEN_temp_max,DUSSELDORF_cloud_cover,DUSSELDORF_wind_speed,DUSSELDORF_wind_gust,DUSSELDORF_humidity,DUSSELDORF_pressure,DUSSELDORF_global_radiation,DUSSELDORF_precipitation,DUSSELDORF_sunshine,DUSSELDORF_temp_mean,DUSSELDORF_temp_min,...,OSLO_cloud_cover,OSLO_wind_speed,OSLO_wind_gust,OSLO_humidity,OSLO_pressure,OSLO_global_radiation,OSLO_precipitation,OSLO_sunshine,OSLO_temp_mean,OSLO_temp_min,OSLO_temp_max,PERPIGNAN_wind_speed,PERPIGNAN_humidity,PERPIGNAN_pressure,PERPIGNAN_global_radiation,PERPIGNAN_precipitation,PERPIGNAN_temp_mean,PERPIGNAN_temp_min,PERPIGNAN_temp_max,ROMA_cloud_cover,ROMA_humidity,ROMA_pressure,ROMA_global_radiation,ROMA_sunshine,ROMA_temp_mean,ROMA_temp_min,ROMA_temp_max,SONNBLICK_cloud_cover,SONNBLICK_humidity,SONNBLICK_global_radiation,SONNBLICK_precipitation,SONNBLICK_sunshine,SONNBLICK_temp_mean,SONNBLICK_temp_min,SONNBLICK_temp_max,STOCKHOLM_cloud_cover,STOCKHOLM_pressure,STOCKHOLM_precipitation,STOCKHOLM_sunshine,STOCKHOLM_temp_mean,STOCKHOLM_temp_min,STOCKHOLM_temp_max,TOURS_wind_speed,TOURS_humidity,TOURS_pressure,TOURS_global_radiation,TOURS_precipitation,TOURS_temp_mean,TOURS_temp_min,TOURS_temp_max
0,2000-01-01,1,8,0.89,1.0286,0.2,0.03,0.0,2.9,1.6,3.9,3,0.92,1.0268,0.52,0.0,3.7,-4.9,-0.7,7,2.5,8.0,0.97,1.024,0.11,0.1,0.0,6.1,3.5,8.1,8,3.2,7.2,0.89,0.09,0.32,0.0,1.0,-1.8,2.0,8,2.5,5.9,0.92,1.024,0.12,0.22,0.0,4.2,2.5,...,7,0.9,5.1,0.94,1.013,0.04,0.6,0.0,-5.0,-8.6,-3.2,4.4,0.71,1.0267,0.6,0.0,12.2,10.3,14.0,0,0.72,1.0244,0.92,8.4,1.6,3.0,8.0,7,0.89,0.82,1.34,0.0,-15.2,-17.0,-13.4,8,1.0163,0.17,0.0,-2.3,-9.3,0.7,1.6,0.97,1.0275,0.25,0.04,8.5,7.2,9.8
1,2000-01-02,1,8,0.87,1.0318,0.25,0.0,0.0,3.6,2.7,4.8,8,0.94,1.0297,0.14,0.0,0.4,-3.6,-1.9,8,3.7,9.0,0.97,1.0267,0.11,0.0,0.0,7.3,5.4,8.7,7,4.0,8.8,0.89,0.23,0.0,0.4,2.5,1.4,4.0,6,3.0,7.4,0.87,1.0283,0.19,0.0,0.7,6.5,2.7,...,6,1.9,5.7,0.94,1.0076,0.11,0.0,1.6,-0.8,-6.7,2.4,2.9,0.67,1.0278,0.96,0.0,9.8,5.1,14.6,2,0.74,1.0263,0.81,6.5,4.2,0.0,8.4,5,0.86,0.6,0.39,2.8,-13.7,-15.0,-12.3,8,1.0108,0.2,0.0,1.3,0.5,2.0,2.0,0.99,1.0293,0.17,0.16,7.9,6.6,9.2
2,2000-01-03,1,5,0.81,1.0314,0.5,0.0,3.7,2.2,0.1,4.8,6,0.95,1.0295,0.19,0.0,0.0,-0.8,1.1,8,6.1,13.0,0.94,1.0203,0.11,0.45,0.0,8.4,6.4,9.6,7,5.4,12.1,0.79,0.18,0.0,0.0,4.2,1.3,5.1,7,5.5,14.3,0.78,1.0235,0.12,0.28,0.0,7.7,6.9,...,6,1.7,8.7,0.88,1.0016,0.04,0.0,0.0,1.2,-1.1,3.8,2.5,0.85,1.0288,0.93,0.0,8.6,4.1,13.2,0,0.77,1.0288,0.89,0.0,3.8,11.1,21.1,3,0.41,0.81,0.0,5.1,-9.2,-12.5,-5.8,7,1.0071,0.08,1.8,0.8,-1.0,2.8,3.4,0.91,1.0267,0.27,0.0,8.1,6.6,9.6
3,2000-01-04,1,7,0.79,1.0262,0.63,0.35,6.9,3.9,0.5,7.5,8,0.94,1.0252,0.21,0.0,0.0,-1.0,0.1,7,3.8,15.0,0.94,1.0142,0.11,1.09,0.0,6.4,4.3,9.4,8,6.0,14.4,0.88,0.11,0.22,0.0,4.4,3.4,5.2,7,6.0,16.8,0.87,1.0162,0.12,0.97,0.0,7.8,6.6,...,1,3.4,11.8,0.58,0.9982,0.13,0.0,5.3,2.1,-0.5,5.1,1.5,0.85,1.0269,0.56,0.02,8.6,4.3,12.8,1,0.85,1.0273,0.89,8.2,6.0,2.0,10.0,1,0.25,1.05,0.11,8.7,-5.6,-7.0,-4.2,2,0.9947,0.0,5.0,3.5,2.5,4.6,4.9,0.95,1.0222,0.11,0.44,8.6,6.4,10.8
4,2000-01-05,1,5,0.9,1.0246,0.51,0.07,3.7,6.0,3.8,8.6,5,0.88,1.0235,0.43,0.0,0.8,0.2,3.9,3,4.0,12.0,0.9,1.0183,0.48,0.0,6.5,4.4,1.4,7.4,2,5.6,15.8,0.76,0.49,0.0,5.7,1.8,-0.5,6.9,4,4.5,11.2,0.8,1.0203,0.51,0.0,6.5,5.2,0.4,...,8,1.2,5.7,0.94,1.0055,0.05,0.06,0.0,-0.7,-4.0,0.5,2.6,0.74,1.0219,0.83,0.02,9.2,3.6,14.9,2,0.92,1.0238,0.74,7.5,5.0,-1.2,11.2,4,0.77,0.69,0.17,3.4,-7.6,-9.4,-5.8,5,1.0072,0.0,2.2,-0.6,-1.8,2.9,3.6,0.95,1.0209,0.39,0.04,8.0,6.4,9.5
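
The cells above go from integer to string, then to an `mm/dd/yyyy` string, and finally to `datetime64`. A shorter route, sketched here on a small stand-in frame named `dates` (a hypothetical name, not part of the notebook), is to hand the `YYYYMMDD` strings to `pd.to_datetime` with an explicit `format`. This should give the same dates without the intermediate string, and the explicit format avoids any day/month ambiguity.

```python
import pandas as pd

# Stand-in for the DATE column as it is read from the CSV (int64, YYYYMMDD).
dates = pd.DataFrame({"DATE": [20000101, 20000102, 20000103]})

# Parse in a single step; '%Y%m%d' matches values such as 20000101.
dates["DATE"] = pd.to_datetime(dates["DATE"].astype(str), format="%Y%m%d")

print(dates["DATE"].dtype)
```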


In [101]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3654 entries, 0 to 3653
Columns: 165 entries, DATE to TOURS_temp_max
dtypes: datetime64[ns](1), float64(150), int64(14)
memory usage: 4.6 MB


In [104]:
df2 = df.copy()

In [107]:
df2.head()

Unnamed: 0,DATE,MONTH,BASEL_cloud_cover,BASEL_humidity,BASEL_pressure,BASEL_global_radiation,BASEL_precipitation,BASEL_sunshine,BASEL_temp_mean,BASEL_temp_min,BASEL_temp_max,BUDAPEST_cloud_cover,BUDAPEST_humidity,BUDAPEST_pressure,BUDAPEST_global_radiation,BUDAPEST_precipitation,BUDAPEST_sunshine,BUDAPEST_temp_mean,BUDAPEST_temp_max,DE_BILT_cloud_cover,DE_BILT_wind_speed,DE_BILT_wind_gust,DE_BILT_humidity,DE_BILT_pressure,DE_BILT_global_radiation,DE_BILT_precipitation,DE_BILT_sunshine,DE_BILT_temp_mean,DE_BILT_temp_min,DE_BILT_temp_max,DRESDEN_cloud_cover,DRESDEN_wind_speed,DRESDEN_wind_gust,DRESDEN_humidity,DRESDEN_global_radiation,DRESDEN_precipitation,DRESDEN_sunshine,DRESDEN_temp_mean,DRESDEN_temp_min,DRESDEN_temp_max,DUSSELDORF_cloud_cover,DUSSELDORF_wind_speed,DUSSELDORF_wind_gust,DUSSELDORF_humidity,DUSSELDORF_pressure,DUSSELDORF_global_radiation,DUSSELDORF_precipitation,DUSSELDORF_sunshine,DUSSELDORF_temp_mean,DUSSELDORF_temp_min,DUSSELDORF_temp_max,HEATHROW_cloud_cover,HEATHROW_humidity,HEATHROW_pressure,HEATHROW_global_radiation,HEATHROW_precipitation,HEATHROW_sunshine,HEATHROW_temp_mean,HEATHROW_temp_min,HEATHROW_temp_max,KASSEL_wind_speed,KASSEL_wind_gust,KASSEL_humidity,KASSEL_pressure,KASSEL_global_radiation,KASSEL_precipitation,KASSEL_sunshine,KASSEL_temp_mean,KASSEL_temp_min,KASSEL_temp_max,LJUBLJANA_cloud_cover,LJUBLJANA_wind_speed,LJUBLJANA_humidity,LJUBLJANA_pressure,LJUBLJANA_global_radiation,LJUBLJANA_precipitation,LJUBLJANA_sunshine,LJUBLJANA_temp_mean,LJUBLJANA_temp_min,LJUBLJANA_temp_max,MAASTRICHT_cloud_cover,MAASTRICHT_wind_speed,MAASTRICHT_wind_gust,MAASTRICHT_humidity,MAASTRICHT_pressure,MAASTRICHT_global_radiation,MAASTRICHT_precipitation,MAASTRICHT_sunshine,MAASTRICHT_temp_mean,MAASTRICHT_temp_min,MAASTRICHT_temp_max,MALMO_wind_speed,MALMO_precipitation,MALMO_temp_mean,MALMO_temp_min,MALMO_temp_max,MONTELIMAR_wind_speed,MONTELIMAR_humidity,MONTELIMAR_pressure,MONTELIMAR_global_radiation,MONTELIMAR_precipitation,MONTELIMAR_temp_mean,MONTELIMAR_temp_min,MONTELIMAR_temp_max,MUENCHEN_cloud_cover,MUENCHEN_wind_speed,MUENCHEN_wind_gust,MUENCHEN_humidity,MUENCHEN_pressure,MUENCHEN_global_radiation,MUENCHEN_precipitation,MUENCHEN_sunshine,MUENCHEN_temp_mean,MUENCHEN_temp_min,MUENCHEN_temp_max,OSLO_cloud_cover,OSLO_wind_speed,OSLO_wind_gust,OSLO_humidity,OSLO_pressure,OSLO_global_radiation,OSLO_precipitation,OSLO_sunshine,OSLO_temp_mean,OSLO_temp_min,OSLO_temp_max,PERPIGNAN_wind_speed,PERPIGNAN_humidity,PERPIGNAN_pressure,PERPIGNAN_global_radiation,PERPIGNAN_precipitation,PERPIGNAN_temp_mean,PERPIGNAN_temp_min,PERPIGNAN_temp_max,ROMA_cloud_cover,ROMA_humidity,ROMA_pressure,ROMA_global_radiation,ROMA_sunshine,ROMA_temp_mean,ROMA_temp_min,ROMA_temp_max,SONNBLICK_cloud_cover,SONNBLICK_humidity,SONNBLICK_global_radiation,SONNBLICK_precipitation,SONNBLICK_sunshine,SONNBLICK_temp_mean,SONNBLICK_temp_min,SONNBLICK_temp_max,STOCKHOLM_cloud_cover,STOCKHOLM_pressure,STOCKHOLM_precipitation,STOCKHOLM_sunshine,STOCKHOLM_temp_mean,STOCKHOLM_temp_min,STOCKHOLM_temp_max,TOURS_wind_speed,TOURS_humidity,TOURS_pressure,TOURS_global_radiation,TOURS_precipitation,TOURS_temp_mean,TOURS_temp_min,TOURS_temp_max
0,2000-01-01,1,8,0.89,1.0286,0.2,0.03,0.0,2.9,1.6,3.9,3,0.92,1.0268,0.52,0.0,3.7,-4.9,-0.7,7,2.5,8.0,0.97,1.024,0.11,0.1,0.0,6.1,3.5,8.1,8,3.2,7.2,0.89,0.09,0.32,0.0,1.0,-1.8,2.0,8,2.5,5.9,0.92,1.024,0.12,0.22,0.0,4.2,2.5,6.9,7,0.94,1.0245,0.18,0.0,0.4,7.0,4.9,10.8,2.5,8.2,0.93,1.0237,0.06,0.13,0.0,3.5,1.5,5.0,6,0.4,0.83,1.0294,0.57,0.0,5.2,-4.8,-9.1,-1.3,8,3.1,7.0,0.98,1.0251,0.06,0.17,0.0,5.6,4.1,6.9,2.5,0.27,2.9,0.9,3.6,3.8,0.85,1.0269,0.3,0.0,5.5,2.5,8.5,8,2.6,9.4,0.91,1.0273,0.2,0.2,0.0,1.7,-0.5,2.6,7,0.9,5.1,0.94,1.013,0.04,0.6,0.0,-5.0,-8.6,-3.2,4.4,0.71,1.0267,0.6,0.0,12.2,10.3,14.0,0,0.72,1.0244,0.92,8.4,1.6,3.0,8.0,7,0.89,0.82,1.34,0.0,-15.2,-17.0,-13.4,8,1.0163,0.17,0.0,-2.3,-9.3,0.7,1.6,0.97,1.0275,0.25,0.04,8.5,7.2,9.8
1,2000-01-02,1,8,0.87,1.0318,0.25,0.0,0.0,3.6,2.7,4.8,8,0.94,1.0297,0.14,0.0,0.4,-3.6,-1.9,8,3.7,9.0,0.97,1.0267,0.11,0.0,0.0,7.3,5.4,8.7,7,4.0,8.8,0.89,0.23,0.0,0.4,2.5,1.4,4.0,6,3.0,7.4,0.87,1.0283,0.19,0.0,0.7,6.5,2.7,7.9,7,0.89,1.0253,0.2,0.02,0.7,7.9,5.0,11.5,2.9,9.6,0.92,1.029,0.33,0.0,2.9,2.3,0.3,4.7,6,0.4,0.76,1.031,0.59,0.0,5.0,-0.9,-4.9,2.0,7,3.8,9.0,0.95,1.0286,0.14,0.0,0.0,6.2,4.2,7.5,3.8,0.0,3.7,1.0,5.4,5.8,0.82,1.0287,0.54,0.0,8.3,6.8,9.8,6,2.1,8.2,0.9,1.0321,0.66,0.0,6.1,1.9,-0.2,5.8,6,1.9,5.7,0.94,1.0076,0.11,0.0,1.6,-0.8,-6.7,2.4,2.9,0.67,1.0278,0.96,0.0,9.8,5.1,14.6,2,0.74,1.0263,0.81,6.5,4.2,0.0,8.4,5,0.86,0.6,0.39,2.8,-13.7,-15.0,-12.3,8,1.0108,0.2,0.0,1.3,0.5,2.0,2.0,0.99,1.0293,0.17,0.16,7.9,6.6,9.2
2,2000-01-03,1,5,0.81,1.0314,0.5,0.0,3.7,2.2,0.1,4.8,6,0.95,1.0295,0.19,0.0,0.0,-0.8,1.1,8,6.1,13.0,0.94,1.0203,0.11,0.45,0.0,8.4,6.4,9.6,7,5.4,12.1,0.79,0.18,0.0,0.0,4.2,1.3,5.1,7,5.5,14.3,0.78,1.0235,0.12,0.28,0.0,7.7,6.9,9.1,8,0.91,1.0186,0.13,0.6,0.0,9.4,7.2,9.5,4.8,11.9,0.9,1.0251,0.2,0.01,0.0,3.5,2.2,4.6,6,0.3,0.83,1.0309,0.51,0.0,2.4,-0.3,-1.8,3.3,7,7.4,14.0,0.87,1.0236,0.15,0.02,0.9,6.8,6.1,7.9,4.3,0.06,5.6,4.0,6.9,0.4,0.92,1.0316,0.53,0.0,3.2,-2.1,8.5,7,2.1,6.9,0.92,1.0317,0.28,0.0,0.4,-0.4,-3.3,0.9,6,1.7,8.7,0.88,1.0016,0.04,0.0,0.0,1.2,-1.1,3.8,2.5,0.85,1.0288,0.93,0.0,8.6,4.1,13.2,0,0.77,1.0288,0.89,0.0,3.8,11.1,21.1,3,0.41,0.81,0.0,5.1,-9.2,-12.5,-5.8,7,1.0071,0.08,1.8,0.8,-1.0,2.8,3.4,0.91,1.0267,0.27,0.0,8.1,6.6,9.6
3,2000-01-04,1,7,0.79,1.0262,0.63,0.35,6.9,3.9,0.5,7.5,8,0.94,1.0252,0.21,0.0,0.0,-1.0,0.1,7,3.8,15.0,0.94,1.0142,0.11,1.09,0.0,6.4,4.3,9.4,8,6.0,14.4,0.88,0.11,0.22,0.0,4.4,3.4,5.2,7,6.0,16.8,0.87,1.0162,0.12,0.97,0.0,7.8,6.6,9.2,5,0.89,1.0148,0.34,0.02,2.9,7.0,4.4,11.0,4.5,12.7,0.94,1.0174,0.06,0.44,0.0,4.8,3.5,5.6,2,0.4,0.88,1.0262,0.7,0.0,3.5,-3.6,-6.1,0.4,8,7.2,15.0,0.92,1.0165,0.07,1.33,0.0,7.3,6.1,9.0,3.9,0.75,4.5,3.0,6.4,1.1,0.85,1.0274,0.64,0.0,7.2,2.3,12.1,6,2.7,11.7,0.75,1.026,0.58,0.04,4.5,3.8,-2.8,6.6,1,3.4,11.8,0.58,0.9982,0.13,0.0,5.3,2.1,-0.5,5.1,1.5,0.85,1.0269,0.56,0.02,8.6,4.3,12.8,1,0.85,1.0273,0.89,8.2,6.0,2.0,10.0,1,0.25,1.05,0.11,8.7,-5.6,-7.0,-4.2,2,0.9947,0.0,5.0,3.5,2.5,4.6,4.9,0.95,1.0222,0.11,0.44,8.6,6.4,10.8
4,2000-01-05,1,5,0.9,1.0246,0.51,0.07,3.7,6.0,3.8,8.6,5,0.88,1.0235,0.43,0.0,0.8,0.2,3.9,3,4.0,12.0,0.9,1.0183,0.48,0.0,6.5,4.4,1.4,7.4,2,5.6,15.8,0.76,0.49,0.0,5.7,1.8,-0.5,6.9,4,4.5,11.2,0.8,1.0203,0.51,0.0,6.5,5.2,0.4,8.6,5,0.85,1.0142,0.25,0.08,1.3,6.4,1.9,10.8,2.4,8.8,0.84,1.021,0.48,0.0,6.7,2.3,0.2,6.3,4,0.6,0.85,1.0271,0.57,0.0,4.6,-3.0,-6.1,1.1,4,4.1,10.0,0.87,1.0205,0.44,0.0,6.2,5.2,0.6,8.4,3.2,0.03,3.8,2.5,5.5,3.4,0.82,1.0234,0.7,0.0,8.2,1.5,14.8,5,3.3,13.2,0.87,1.0248,0.26,0.0,0.2,5.3,4.3,7.3,8,1.2,5.7,0.94,1.0055,0.05,0.06,0.0,-0.7,-4.0,0.5,2.6,0.74,1.0219,0.83,0.02,9.2,3.6,14.9,2,0.92,1.0238,0.74,7.5,5.0,-1.2,11.2,4,0.77,0.69,0.17,3.4,-7.6,-9.4,-5.8,5,1.0072,0.0,2.2,-0.6,-1.8,2.9,3.6,0.95,1.0209,0.39,0.04,8.0,6.4,9.5


In [117]:
cols = ['BASEL_cloud_cover', 'BASEL_humidity','BASEL_pressure','BASEL_global_radiation','BASEL_precipitation','BASEL_sunshine','BASEL_temp_mean','BASEL_temp_min',	
        'BASEL_temp_max', 'BUDAPEST_cloud_cover', 'BUDAPEST_humidity', 'BUDAPEST_pressure', 'BUDAPEST_global_radiation', 'BUDAPEST_precipitation', 'BUDAPEST_sunshine',	
        'BUDAPEST_temp_mean', 'BUDAPEST_temp_max', 'DE_BILT_cloud_cover', 'DE_BILT_wind_speed', 'DE_BILT_wind_gust', 'DE_BILT_humidity', 'DE_BILT_pressure','DE_BILT_global_radiation',	'DE_BILT_precipitation',	
        'DE_BILT_sunshine', 'DE_BILT_temp_mean', 'DE_BILT_temp_min', 'DE_BILT_temp_max', 'DRESDEN_cloud_cover', 'DRESDEN_wind_speed', 'DRESDEN_wind_gust', 'DRESDEN_humidity',	
        'DRESDEN_global_radiation', 'DRESDEN_precipitation', 'DRESDEN_sunshine', 'DRESDEN_temp_mean', 'DRESDEN_temp_min', 'DRESDEN_temp_max', 'DUSSELDORF_cloud_cover', 'DUSSELDORF_wind_speed',	
        'DUSSELDORF_wind_gust','DUSSELDORF_humidity', 'DUSSELDORF_pressure', 'DUSSELDORF_global_radiation', 'DUSSELDORF_precipitation', 'DUSSELDORF_sunshine', 'DUSSELDORF_temp_mean',	
        'DUSSELDORF_temp_min', 'DUSSELDORF_temp_max', 'HEATHROW_cloud_cover', 'HEATHROW_humidity', 'HEATHROW_pressure', 'HEATHROW_global_radiation', 'HEATHROW_precipitation',	
        'HEATHROW_sunshine' ,'HEATHROW_temp_mean' ,'HEATHROW_temp_min' ,'HEATHROW_temp_max' ,'KASSEL_wind_speed' ,'KASSEL_wind_gust' ,'KASSEL_humidity',	
        'KASSEL_pressure', 'KASSEL_global_radiation', 'KASSEL_precipitation', 'KASSEL_sunshine', 'KASSEL_temp_mean', 'KASSEL_temp_min', 'KASSEL_temp_max',	
        'LJUBLJANA_cloud_cover', 'LJUBLJANA_wind_speed', 'LJUBLJANA_humidity', 'LJUBLJANA_pressure', 'LJUBLJANA_global_radiation', 'LJUBLJANA_precipitation', 
        'LJUBLJANA_sunshine', 'LJUBLJANA_temp_mean', 'LJUBLJANA_temp_min', 'LJUBLJANA_temp_max', 'MAASTRICHT_cloud_cover', 'MAASTRICHT_wind_speed', 'MAASTRICHT_wind_gust',	
        'MAASTRICHT_humidity', 'MAASTRICHT_pressure', 'MAASTRICHT_global_radiation', 'MAASTRICHT_precipitation', 
        'MAASTRICHT_sunshine', 'MAASTRICHT_temp_mean', 'MAASTRICHT_temp_min', 'MAASTRICHT_temp_max', 'MALMO_wind_speed', 'MALMO_precipitation', 
        'MALMO_temp_mean', 'MALMO_temp_min', 'MALMO_temp_max', 'MONTELIMAR_wind_speed', 'MONTELIMAR_humidity', 'MONTELIMAR_pressure', 'MONTELIMAR_global_radiation',	
        'MONTELIMAR_precipitation', 'MONTELIMAR_temp_mean', 'MONTELIMAR_temp_min', 'MONTELIMAR_temp_max', 'MUENCHEN_cloud_cover', 
        'MUENCHEN_wind_speed', 'MUENCHEN_wind_gust', 'MUENCHEN_humidity', 'MUENCHEN_pressure', 'MUENCHEN_global_radiation', 'MUENCHEN_precipitation', 'MUENCHEN_sunshine',	
        'MUENCHEN_temp_mean', 'MUENCHEN_temp_min', 'MUENCHEN_temp_max', 'OSLO_cloud_cover', 'OSLO_wind_speed', 'OSLO_wind_gust', 'OSLO_humidity', 'OSLO_pressure', 
        'OSLO_global_radiation', 'OSLO_precipitation', 'OSLO_sunshine', 'OSLO_temp_mean', 'OSLO_temp_min', 'OSLO_temp_max', 'PERPIGNAN_wind_speed', 
        'PERPIGNAN_humidity', 'PERPIGNAN_pressure', 'PERPIGNAN_global_radiation', 'PERPIGNAN_precipitation', 'PERPIGNAN_temp_mean', 'PERPIGNAN_temp_min', 
        'PERPIGNAN_temp_max', 'ROMA_cloud_cover', 'ROMA_humidity', 'ROMA_pressure', 'ROMA_global_radiation', 'ROMA_sunshine', 'ROMA_temp_mean', 'ROMA_temp_min', 
        'ROMA_temp_max', 'SONNBLICK_cloud_cover', 'SONNBLICK_humidity', 'SONNBLICK_global_radiation', 'SONNBLICK_precipitation', 'SONNBLICK_sunshine', 'SONNBLICK_temp_mean',	
        'SONNBLICK_temp_min', 'SONNBLICK_temp_max', 'STOCKHOLM_cloud_cover', 'STOCKHOLM_pressure', 'STOCKHOLM_precipitation', 'STOCKHOLM_sunshine', 'STOCKHOLM_temp_mean',	
        'STOCKHOLM_temp_min', 'STOCKHOLM_temp_max', 'TOURS_wind_speed', 'TOURS_humidity', 'TOURS_pressure', 'TOURS_global_radiation', 'TOURS_precipitation', 'TOURS_temp_mean',	
        'TOURS_temp_min','TOURS_temp_max']

vals_name = ['cloud_cover', 'humidity', 'pressure', 'global_rad', 'ppt', 'sunshine', 'temp_mean', 'temp_min', 'temp_max']

In [121]:
df2 = pd.melt(df2, id_vars=['DATE', 'MONTH'],
             value_vars=[x for x in cols], 
             var_name=['location'], value_name=[y for y in vals_name])

TypeError: unhashable type: 'list'
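
The `TypeError` above most likely comes from `value_name`: `pd.melt` expects a single column label there, so a list of names cannot be used to fan the values out into several columns. One possible way to get a frame with one row per date and location and one column per variable is to melt with scalar names, split the old column name into location and variable, and pivot the variables back out. The sketch below does this on a tiny stand-in frame with values copied from the first two rows above; the names `wide`, `long_df`, `tidy` and `VARIABLES` are hypothetical, and the variable list is taken from the column suffixes rather than from `vals_name`.

```python
import pandas as pd

# Tiny stand-in for the wide frame (values from the first two rows shown above).
wide = pd.DataFrame({
    "DATE": pd.to_datetime(["2000-01-01", "2000-01-02"]),
    "MONTH": [1, 1],
    "BASEL_cloud_cover": [8, 8],
    "BASEL_humidity": [0.89, 0.87],
    "DE_BILT_cloud_cover": [7, 8],
    "DE_BILT_humidity": [0.97, 0.97],
})

VARIABLES = ["cloud_cover", "wind_speed", "wind_gust", "humidity", "pressure",
             "global_radiation", "precipitation", "sunshine",
             "temp_mean", "temp_min", "temp_max"]

# Step 1: melt with scalar names; every non-id column becomes a (column, value) row.
long_df = wide.melt(id_vars=["DATE", "MONTH"], var_name="column", value_name="value")

# Step 2: split e.g. 'DE_BILT_cloud_cover' into 'DE_BILT' and 'cloud_cover'.
# Matching on the known suffixes keeps locations that contain underscores intact.
pattern = r"^(?P<location>.+)_(?P<variable>" + "|".join(VARIABLES) + r")$"
long_df = pd.concat([long_df, long_df["column"].str.extract(pattern)], axis=1)

# Step 3: pivot the variables back out so each row is one (DATE, location) pair.
tidy = (long_df.pivot_table(index=["DATE", "MONTH", "location"],
                            columns="variable", values="value")
               .reset_index())
tidy.columns.name = None
print(tidy)
```

Variables that a station does not report would simply show up as `NaN` in that station's rows.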

In [129]:
df2 = pd.melt(df2, id_vars=['DATE', 'MONTH'],
             value_vars=['BASEL_cloud_cover', 'BASEL_humidity'], 
             var_name=['location'], value_name=['cloud_cover', 'humidity'])

KeyError: "The following 'value_vars' are not present in the DataFrame: ['BASEL_cloud_cover', 'BASEL_humidity']"
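
The `KeyError` is consistent with the cell numbering: `df2` had apparently already been overwritten by an earlier melt (the `In [126]` output below shows only `DATE`, `MONTH`, `location` and `cloud_cover`), so the wide columns no longer existed when this cell ran. A small sketch of one way to keep such a cell re-runnable is to leave the wide frame untouched and assign the melted result to a new name. The helper `melt_columns` and the stand-in frame are hypothetical.

```python
import pandas as pd

# Stand-in for the wide frame.
wide = pd.DataFrame({
    "DATE": pd.to_datetime(["2000-01-01", "2000-01-02"]),
    "MONTH": [1, 1],
    "BASEL_cloud_cover": [8, 8],
})

def melt_columns(frame, value_vars, value_name):
    """Melt the requested columns, failing early with a clear message if any are gone."""
    missing = [c for c in value_vars if c not in frame.columns]
    if missing:
        raise KeyError(f"columns not in frame (already melted?): {missing}")
    return frame.melt(id_vars=["DATE", "MONTH"], value_vars=value_vars,
                      var_name="location", value_name=value_name)

# Assigning to a new name keeps `wide` intact, so running the cell twice is harmless.
cloud_long = melt_columns(wide, ["BASEL_cloud_cover"], "cloud_cover")
cloud_long_again = melt_columns(wide, ["BASEL_cloud_cover"], "cloud_cover")
print(cloud_long)
```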

In [126]:
df2.head(750)

Unnamed: 0,DATE,MONTH,location,cloud_cover
0,2000-01-01,1,BASEL_cloud_cover,8
1,2000-01-02,1,BASEL_cloud_cover,8
2,2000-01-03,1,BASEL_cloud_cover,5
3,2000-01-04,1,BASEL_cloud_cover,7
4,2000-01-05,1,BASEL_cloud_cover,5
5,2000-01-06,1,BASEL_cloud_cover,3
6,2000-01-07,1,BASEL_cloud_cover,8
7,2000-01-08,1,BASEL_cloud_cover,4
8,2000-01-09,1,BASEL_cloud_cover,8
9,2000-01-10,1,BASEL_cloud_cover,8


In [124]:
df2.shape

(3654, 4)