# Exercise - Polynomial regression - Comfort temperature

## Introduction

The apparent temperature (also called the perceived or "feels-like" temperature) is a value used to assess the thermal sensation the body experiences under the combined effects of air temperature and other meteorological factors (wind, humidity, etc.).
The weather service of the Corporació Catalana de Mitjans Audiovisuals (CCMA) has asked us to prepare a model that predicts this apparent temperature.
They want to use our model to display it on the maps on their website: https://www.ccma.cat/el-temps/previsio/

To do this, they have given us a dataset, [weatherHistory.csv](weatherHistory.csv), containing meteorological data on which we can base the model. Its columns are:

* **Formatted Date:** Date and time of the observation, with UTC offset (e.g. `2006-04-01 00:00:00.000 +0200`).
* **Summary:** Description of the weather: Partly Cloudy, Overcast, Dry, Light Rain, ...
* **Precip Type:** Type of precipitation (rain, snow, null)
* **Temperature (C):** Temperature in degrees Celsius
* **Apparent Temperature (C):** Apparent temperature in degrees Celsius
* **Humidity:** Humidity as a fraction of 1 (0 to 1)
* **Wind Speed (km/h):** Wind speed in km/h
* **Wind Bearing (degrees):** Wind direction in degrees
* **Visibility (km):** Visibility in kilometres
* **Cloud Cover:** 0 (clear), 1-2 (few), 3-4 (scattered), 5-7 (broken), 8 (overcast), 9 (obscured)
* **Pressure (millibars):** Atmospheric pressure in millibars
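
The `Formatted Date` values carry an explicit UTC offset that changes between summer and winter time, so it can help to normalise them before deriving features such as the hour or the month. The following is a minimal sketch, assuming the timestamp layout seen in the sample rows; the `sample` series is made up for illustration and is not read from the CSV.

```python
import pandas as pd

# Hypothetical sample mirroring the "Formatted Date" layout in the dataset.
sample = pd.Series([
    "2006-04-01 00:00:00.000 +0200",
    "2006-04-01 01:00:00.000 +0200",
    "2006-12-01 00:00:00.000 +0100",
])

# An explicit format avoids relying on format inference; utc=True converts
# the mixed +0200 / +0100 offsets onto a single UTC timeline.
parsed = pd.to_datetime(sample, format="%Y-%m-%d %H:%M:%S.%f %z", utc=True)

# Calendar features that could later be used as model inputs (in UTC).
hours = parsed.dt.hour
months = parsed.dt.month
```

Whether time-based features actually improve the prediction of the apparent temperature would have to be checked on the real data.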

## Objective

Use the regression techniques covered so far to build a model that can predict the apparent temperature.

Document the process and justify the values obtained.
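
As a reminder of what polynomial regression involves, here is a minimal sketch that compares a degree-1 and a degree-2 least-squares fit. It assumes synthetic data generated on the spot (not taken from `weatherHistory.csv`); the helper names `fit_poly` and `r2_score` are hypothetical and only used for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT from weatherHistory.csv): a made-up quadratic
# relation between one feature x and a target y, plus Gaussian noise.
x = np.linspace(-10, 30, 200)
y = 0.02 * x**2 + 0.8 * x - 1.5 + rng.normal(0, 0.5, x.size)


def fit_poly(x, y, degree):
    """Least-squares polynomial fit; coefficients come highest power first."""
    return np.polyfit(x, y, degree)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


coef_lin = fit_poly(x, y, 1)
coef_quad = fit_poly(x, y, 2)

# R^2 measured on the same data used for fitting (no train/test split here).
r2_lin = r2_score(y, np.polyval(coef_lin, x))
r2_quad = r2_score(y, np.polyval(coef_quad, x))
```

Because both scores are computed on the training data, the higher-degree fit can never look worse; on the real dataset a held-out test set would give a fairer comparison between degrees.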


In [24]:
# Import numpy and pandas
import pandas as pd
import numpy as np

# Data visualization and statistical modelling
import matplotlib.pyplot as plot
import statsmodels.api as sm
import seaborn as sns

## Data exploration

Explore the data.

In [9]:
data = pd.read_csv('weatherHistory.csv')
data.head()

Unnamed: 0,Formatted Date,Summary,Precip Type,Temperature (C),Apparent Temperature (C),Humidity,Wind Speed (km/h),Wind Bearing (degrees),Visibility (km),Cloud Cover,Pressure (millibars),Daily Summary
0,2006-04-01 00:00:00.000 +0200,Partly Cloudy,rain,9.472222,7.388889,0.89,14.1197,251.0,15.8263,0.0,1015.13,Partly cloudy throughout the day.
1,2006-04-01 01:00:00.000 +0200,Partly Cloudy,rain,9.355556,7.227778,0.86,14.2646,259.0,15.8263,0.0,1015.63,Partly cloudy throughout the day.
2,2006-04-01 02:00:00.000 +0200,Mostly Cloudy,rain,9.377778,9.377778,0.89,3.9284,204.0,14.9569,0.0,1015.94,Partly cloudy throughout the day.
3,2006-04-01 03:00:00.000 +0200,Partly Cloudy,rain,8.288889,5.944444,0.83,14.1036,269.0,15.8263,0.0,1016.41,Partly cloudy throughout the day.
4,2006-04-01 04:00:00.000 +0200,Mostly Cloudy,rain,8.755556,6.977778,0.83,11.0446,259.0,15.8263,0.0,1016.51,Partly cloudy throughout the day.


In [11]:
# Rename the columns to make them easier to work with.

data = data.set_axis(['date', 'summary', 'precip_type', 'temp','atemp','humidity','wind_speed','wind_bearing','visibility','cloud_cover','pressure','daily_summary'], axis=1)
data.head()

Unnamed: 0,date,summary,precip_type,temp,atemp,humidity,wind_speed,wind_bearing,visibility,cloud_cover,pressure,daily_summary
0,2006-04-01 00:00:00.000 +0200,Partly Cloudy,rain,9.472222,7.388889,0.89,14.1197,251.0,15.8263,0.0,1015.13,Partly cloudy throughout the day.
1,2006-04-01 01:00:00.000 +0200,Partly Cloudy,rain,9.355556,7.227778,0.86,14.2646,259.0,15.8263,0.0,1015.63,Partly cloudy throughout the day.
2,2006-04-01 02:00:00.000 +0200,Mostly Cloudy,rain,9.377778,9.377778,0.89,3.9284,204.0,14.9569,0.0,1015.94,Partly cloudy throughout the day.
3,2006-04-01 03:00:00.000 +0200,Partly Cloudy,rain,8.288889,5.944444,0.83,14.1036,269.0,15.8263,0.0,1016.41,Partly cloudy throughout the day.
4,2006-04-01 04:00:00.000 +0200,Mostly Cloudy,rain,8.755556,6.977778,0.83,11.0446,259.0,15.8263,0.0,1016.51,Partly cloudy throughout the day.


### Analysing the data

In [14]:
# Check how many records we have, the mean, etc.
data.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
temp,96453.0,11.932678,9.551546,-21.822222,4.688889,12.0,18.838889,39.905556
atemp,96453.0,10.855029,10.696847,-27.716667,2.311111,12.0,18.838889,39.344444
humidity,96453.0,0.734899,0.195473,0.0,0.6,0.78,0.89,1.0
wind_speed,96453.0,10.81064,6.913571,0.0,5.8282,9.9659,14.1358,63.8526
wind_bearing,96453.0,187.509232,107.383428,0.0,116.0,180.0,290.0,359.0
visibility,96453.0,10.347325,4.192123,0.0,8.3398,10.0464,14.812,16.1
cloud_cover,96453.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
pressure,96453.0,1003.235956,116.969906,0.0,1011.9,1016.45,1021.09,1046.38
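
Two things stand out in the summary above: `cloud_cover` is 0 in every row, so it carries no information, and `pressure` has a minimum of 0 millibars, which is not a plausible atmospheric reading. One possible way to handle both is sketched below on a tiny made-up frame (`toy` is hypothetical, not the real dataset), under the assumption that a zero pressure marks a missing measurement.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame reproducing the two quirks seen in describe().
toy = pd.DataFrame({
    "temp": [9.4, 8.3, 12.0, 18.8],
    "cloud_cover": [0.0, 0.0, 0.0, 0.0],
    "pressure": [1015.13, 0.0, 1016.41, 1021.09],
})

# Columns with a single distinct value cannot help a regression model.
constant_cols = [c for c in toy.columns if toy[c].nunique(dropna=False) <= 1]
cleaned = toy.drop(columns=constant_cols)

# Assumption: a pressure of 0 mbar is a recording gap, so mark it as missing
# instead of letting it distort the mean and standard deviation.
cleaned["pressure"] = cleaned["pressure"].replace(0.0, np.nan)
```

How the resulting missing values are treated afterwards (dropping the rows or imputing them) is a separate modelling decision.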


In [1]:
# Check whether there are any NULL values