<a href="https://colab.research.google.com/github/IzaquielCordeiro/Jupyter-Notebook/blob/master/AUS_weather_plotly.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Dataset


* Rain in Australia

https://www.kaggle.com/jsphyg/weather-dataset-rattle-package?select=weatherAUS.csv


## Imports

In [None]:
!pip install -q plotly==4.2.1


In [None]:
import numpy as np
import pandas as pd
import plotly
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
from plotly.offline import plot, iplot
import cufflinks as cf

# Run cufflinks and Plotly in offline (notebook) mode
cf.go_offline()
plotly.offline.init_notebook_mode(connected=True)

# Render figures with the Colab renderer
pio.renderers.default = 'colab'

## Loading dataset

In [None]:
df = pd.read_csv("./weatherAUS.csv")

## Pre-processing, data cleaning, transformations, etc.


Reducing features

In [None]:
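# Drop the wind, humidity, pressure, cloud and 9am/3pm temperature columns, plus the rain labels
df.drop(columns=[
    "WindGustDir", "WindGustSpeed", "WindDir9am", "WindDir3pm",
    "WindSpeed9am", "WindSpeed3pm", "Humidity9am", "Humidity3pm",
    "Pressure9am", "Pressure3pm", "Cloud9am", "Cloud3pm",
    "Temp9am", "Temp3pm", "RainToday", "RainTomorrow",
], inplace=True)
# Keep only Canberra; .copy() avoids SettingWithCopyWarning on later in-place changes
df = df[df["Location"] == 'Canberra'].copy()
# Remove rows that still contain missing values
df.dropna(inplace=True)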

Adjusting date values (DateTime)

In [None]:
df["Date"] = pd.to_datetime(df["Date"])
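As a quick check on the conversion above, the sketch below parses a few date strings on a tiny stand-in frame and confirms that the column ends up with a datetime dtype, which the later date comparisons rely on. The sample values are invented for illustration, and the `YYYY-MM-DD` format is an assumption about how the CSV stores dates.

```python
import pandas as pd

# Hypothetical stand-in rows; the notebook itself reads these from weatherAUS.csv.
sample = pd.DataFrame({"Date": ["2008-12-01", "2008-12-02", "2009-01-15"]})

# An explicit format makes the parsing unambiguous (assumes YYYY-MM-DD strings).
sample["Date"] = pd.to_datetime(sample["Date"], format="%Y-%m-%d")

# Verify the dtype and pull out the years that are present.
is_datetime = pd.api.types.is_datetime64_any_dtype(sample["Date"])
years_seen = sorted(sample["Date"].dt.year.unique().tolist())
print(is_datetime, years_seen)
```

If the real file used a different layout, only the `format` string would need to change.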

## Interactive Charts


The ***Rainfall*** values spread over a wide horizontal range, yet most of them are low, and a single bin is far taller than all the others. On a static chart that one bar hides the differences among the smaller bins. An interactive chart suits this data well, because zooming in makes those smaller bins readable.

In [None]:
nbins=len(df['Rainfall'].unique())
fig = px.histogram(x=df["Rainfall"].values, nbins=nbins)
fig.update_layout(bargap=0.2)
fig.show()
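To put a number on the imbalance described above, the following sketch bins a synthetic, heavily right-skewed sample and reports the share of observations that lands in the first bin. The data is generated here purely for illustration and is not the Canberra record; the 800/200 split between dry and wet days and the bin count are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for daily rainfall: mostly dry days plus an exponential tail.
dry = np.zeros(800)
wet = rng.exponential(scale=5.0, size=200)
rainfall = np.concatenate([dry, wet])

# Bin the values and measure how much of the sample sits in the first bin.
counts, edges = np.histogram(rainfall, bins=20)
first_bin_share = counts[0] / counts.sum()

# log1p compresses the tall first bin while leaving empty bins at 0.
log_counts = np.log1p(counts)
print(f"share of values in the first bin: {first_bin_share:.2f}")
```

Comparing `counts` with `log_counts` shows why a logarithmic count axis, or zooming in an interactive figure, helps when one bin dominates.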

A side-by-side comparison of the distributions of the minimum and maximum temperature records.

In [None]:
import plotly.offline as py
minTemp = go.Histogram(
               x = df["MinTemp"],
               name = 'MinTemp')

maxTemp = go.Histogram(
                x = df["MaxTemp"],
                name = 'MaxTemp')

py.iplot([minTemp,maxTemp])

The following two charts show how the minimum and maximum temperature records track each other over a three-year window (2007 up to, but not including, 2010). The two are clearly positively related: when one record rises or falls, the other tends to move in the same direction.

In [None]:
years = ("2007", "2010")
d = df[(df["Date"] < pd.to_datetime("01/01/"+years[1])) & (df["Date"] >= pd.to_datetime("01/01/"+years[0]))]
minTemp = go.Scatter(
               x = d["Date"],
               y = d["MinTemp"],
               name = 'MinTemp')

maxTemp = go.Scatter(
                x = d["Date"],
                y = d["MaxTemp"],
                name = 'MaxTemp')

py.iplot([minTemp,maxTemp])

In [None]:
fig = px.scatter(d, x="MinTemp", y="MaxTemp", marginal_x="histogram", marginal_y="histogram")
fig.show()
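The positive relationship seen in the charts above can also be summarized with a Pearson correlation coefficient. The sketch below is a minimal illustration on synthetic series, not the notebook's data: both series share an invented seasonal cycle plus independent noise, and the offsets and noise levels are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in: a seasonal cycle shared by both series plus independent noise.
days = np.arange(3 * 365)
season = 10 * np.sin(2 * np.pi * days / 365)
min_temp = 7 + season + rng.normal(0, 2, size=days.size)
max_temp = 20 + season + rng.normal(0, 2, size=days.size)

# Pearson correlation; values near +1 indicate a strong positive linear relation.
r = np.corrcoef(min_temp, max_temp)[0, 1]
print(f"Pearson r = {r:.2f}")
```

On the filtered frame used in the charts, `d["MinTemp"].corr(d["MaxTemp"])` would compute the same statistic with pandas.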