# Rain in Australia

---
I find weather and meteorological data highly interesting to work with, so I created this notebook to understand this dataset and see whether I can find something interesting in it. If you find this notebook useful, feel free to share it and leave an upvote. Thanks!

In [None]:
import numpy as np
import pandas as pd
import os
import seaborn as sns
import plotly.express as px
import datetime as dt
from geopy.geocoders import Nominatim
import re

In [None]:
dataset = pd.read_csv("/kaggle/input/weather-dataset-rattle-package/weatherAUS.csv")
dataset.head()

In [None]:
print("Percentage of null values in columns")
100*np.round(dataset.isnull().sum()/len(dataset), 2)

In [None]:
dataset.describe()

## Understanding a bit about Australia's weather patterns
According to [Wikipedia](https://en.wikipedia.org/wiki/Climate_of_Australia), more than 80% of Australia has an annual rainfall of less than 600 mm (24 in); among the continents, only Antarctica receives less rainfall. A place inland near Lake Eyre (in South Australia) would only receive 81 mm (3 in) of rain annually. Another place, Troudaninna Bore (29°11′44″S 138°59′28″E, altitude: 46 m) in South Australia, received on average 104.9 mm (4.13 in) of precipitation a year from 1893 to 1936. At the other extreme, parts of the far North Queensland coast annually average over 4,000 mm (157 in), with the Australian annual record being 12,461 mm (491 in), set at the summit of Mount Bellenden Ker in 2000. Four factors contribute to the dryness of the Australian landmass:

* Cold ocean currents off the west coast
* Low elevation of landforms
* Dominance of high-pressure systems
* Shape of the landmass



Let us see whether our data shows similar trends.
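One rough way to compare the data against the 600 mm figure is to compute, for each location, the mean of its yearly rainfall totals and then the fraction of locations that fall below the threshold. The sketch below assumes a frame with `Location`, `Date` and `Rainfall` columns, as in this dataset; the function name `share_of_dry_locations` and the toy numbers are made up for illustration. Note that weather stations are not spread evenly over the continent and partial years pull totals down, so this is a share of stations, not of land area.

```python
import pandas as pd

def share_of_dry_locations(frame, threshold_mm=600.0):
    """Fraction of locations whose mean annual rainfall is below threshold_mm."""
    data = frame.copy()
    data["Year"] = pd.to_datetime(data["Date"]).dt.year
    # Total rainfall per location and year (missing readings are skipped by sum)
    yearly = data.groupby(["Location", "Year"])["Rainfall"].sum()
    # Average the yearly totals for each location
    mean_annual = yearly.groupby(level="Location").mean()
    return float((mean_annual < threshold_mm).mean())

# Toy frame with made-up numbers, only to show the call
toy = pd.DataFrame({
    "Location": ["A", "A", "B", "B"],
    "Date": ["2010-01-01", "2010-06-01", "2010-01-01", "2010-06-01"],
    "Rainfall": [100.0, 200.0, 500.0, 400.0],
})
dry_share = share_of_dry_locations(toy)
```

On the real data the call would be `share_of_dry_locations(dataset)`.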

In [None]:
df = dataset.copy()
# Parse the date strings once, then derive month and year with the .dt accessor
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m-%d")
df["Month"], df["Year"] = df["Date"].dt.month, df["Date"].dt.year
date_trend = df.groupby("Date").agg({"Rainfall" : "mean"}).reset_index()
px.line(date_trend, x = "Date", y="Rainfall", title = "Timeline of rainfall in Australia", hover_name="Date")


According to [Wikipedia](https://en.wikipedia.org/wiki/Climate_of_Australia), the majority of rainfall occurs between December and March (the Southern Hemisphere summer), when thunderstorms are common and afternoon relative humidity averages over 70% during the wettest months. On average more than 1,570 mm (62 in) of rain falls in the north. Thunderstorms can produce spectacular lightning displays.
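To check the December to March claim against the data, one option is to compute, per location, the share of total rainfall that fell in those months. This is a minimal sketch assuming `Location`, `Date` and `Rainfall` columns; `wet_season_share`, `WET_MONTHS` and the toy values are hypothetical names and numbers, not part of the original notebook.

```python
import pandas as pd

WET_MONTHS = (12, 1, 2, 3)  # December to March

def wet_season_share(frame, months=WET_MONTHS):
    """Per-location share of total rainfall that fell in the given months."""
    data = frame.copy()
    data["Month"] = pd.to_datetime(data["Date"]).dt.month
    total = data.groupby("Location")["Rainfall"].sum()
    in_season = data[data["Month"].isin(list(months))]
    wet = in_season.groupby("Location")["Rainfall"].sum()
    # Locations with no readings in the chosen months get a share of 0
    share = wet.reindex(total.index, fill_value=0.0) / total
    return share.sort_values(ascending=False)

# Toy frame with made-up numbers, only to show the call
toy_season = pd.DataFrame({
    "Location": ["North", "North", "South", "South"],
    "Date": ["2010-01-15", "2010-07-15", "2010-01-15", "2010-07-15"],
    "Rainfall": [80.0, 20.0, 10.0, 30.0],
})
season_share = wet_season_share(toy_season)
```

With the real data, `wet_season_share(dataset)` would rank locations by how summer-dominated their rainfall is; a high share would be expected for northern stations if the pattern above holds.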



In [None]:
avg_rainfall = df.groupby(["Location","Year", "Month"]).agg({"Rainfall" : "sum", "Humidity9am" : "mean", "Humidity3pm" : "mean"}).reset_index()
avg_monthly_rainfall = avg_rainfall.groupby(["Year", "Month"]).agg({"Rainfall" : "mean"}).reset_index()
px.area(avg_monthly_rainfall, x = "Year", y = "Rainfall", color = "Month", title = "Monthly Rainfall Trend")

In [None]:
px.box(avg_monthly_rainfall, x = "Month", y = "Rainfall", title = "Monthly Rainfall Distribution")

In [None]:
# Append the country name so the geocoder resolves each name within Australia
locations_list = [location + " Australia" for location in dataset["Location"].unique()]

geolocator = Nominatim(user_agent="my_application")
location_dict = dict()
for location in locations_list:
    geo_location = geolocator.geocode(location)
    # Skip names the geocoder could not resolve
    if geo_location is not None:
        location_dict[re.sub(" Australia", "", location)] = [geo_location.latitude, geo_location.longitude]

In [None]:
location_dict, len(location_dict)

The geocoder returned coordinates for only 35 of the locations! Nevertheless, let's see what the data looks like by location and on a map of Australia.
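One possible reason for the misses, assuming the dataset stores multi-word place names without spaces (for example `BadgerysCreek`), is that the geocoder cannot match the run-together name. The sketch below splits such names before building the query string. The helpers `split_camel_case` and `geocoder_query` are hypothetical, and whether this raises the hit rate has not been verified here.

```python
import re

def split_camel_case(name):
    """Insert a space wherever a lowercase letter is followed by an uppercase one."""
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name)

def geocoder_query(name, country="Australia"):
    """Build the text that would be passed to geolocator.geocode()."""
    return f"{split_camel_case(name)}, {country}"

# Example names are assumptions about how the Location column is written
queries = [geocoder_query(n) for n in ["BadgerysCreek", "Albury", "PearceRAAF"]]
```

The original, unsplit name should still be used as the key in `location_dict`, so that the later `map` call on the `Location` column keeps working.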

In [None]:
avg_rainfall_by_location = avg_rainfall.groupby(["Location", "Year"]).agg({"Rainfall": "sum"}).reset_index()
avg_yearly_rainfall_by_location = avg_rainfall_by_location.groupby("Location").agg({"Rainfall" : "mean"}).reset_index()
px.bar(avg_yearly_rainfall_by_location, x = "Location", y = "Rainfall", title = "Average Yearly Rainfall by location")

In [None]:
# Attach [lat, lon] to each location-year row and drop locations the geocoder missed
rainfall_coords = avg_rainfall_by_location.copy().sort_values("Year")
rainfall_coords["Coordinates"] = rainfall_coords["Location"].map(location_dict)
rainfall_coords = rainfall_coords[rainfall_coords["Coordinates"].notnull()].copy()
rainfall_coords["Lat"] = rainfall_coords["Coordinates"].apply(lambda x: x[0])
rainfall_coords["Long"] = rainfall_coords["Coordinates"].apply(lambda x: x[1])

# Note: in recent Plotly versions the Stamen styles may require a Stadia Maps token;
# "open-street-map" is a token-free alternative if the tiles fail to load.
fig = px.density_mapbox(rainfall_coords, lat='Lat', lon='Long', z='Rainfall', radius=20,
                        center=dict(lat=-30.86, lon=140.20), zoom=3,
                        mapbox_style="stamen-terrain", animation_frame = "Year",
                        range_color = [0, 1200],
                        title = "Yearly rainfall distribution",
                        height = 800)
fig.show()

## Summary

The following information from Wikipedia makes much more sense now:

- The average annual rainfall in the Australian desert is low, ranging from 81 to 250 mm (3 to 10 in). 
- The southern parts of Australia get the usual westerly winds and rain-bearing cold fronts that come when high-pressure systems move towards northern Australia during winter.
- The tropical areas of northern Australia have a wet summer because of the monsoon. During "the wet", typically October to April, humid north-westerly winds bring showers and thunderstorms. 
- Rainfall records tend to be concentrated along the east coast of Australia, particularly in tropical north Queensland. 
- Cold ocean currents off the coast of Western Australia result in little evaporation. Hence, rain clouds form sparsely and rarely persist long enough for a continuous period of rain to be recorded. Australia's arid/semi-arid zone extends to this region. The absence of any significant mountain range or area of substantial height above sea level results in very little rainfall caused by orographic uplift. In the east, the Great Dividing Range limits rain moving into inland Australia.
- Australia has a compact shape, and no significant bodies of water penetrate very far inland. This matters because moist winds are prevented from penetrating inland, which keeps rainfall low.


