# Visualisation Group Work

You can use this notebook as a template and add your plots in the cells below. We've already added some code to import the necessary packages and included an example plot to show you what a good plot might look like.

Now it's your turn to create your first plots with Python's plotting libraries. At the end of this exercise, your notebook should contain one plot per library. Since you will share your notebook with the other groups, make sure to add comments so it's easy for them to understand your code.

Your group number will tell you which kind of plot and data set you should use for the exercise. 

| Group | Plot | Dataset | 
|-------|------|---------|
|  1 | Scatterplot | Seattle Weather |
|  2 | Lineplot | Seattle Weather | 
|  3 | Barchart | Seattle Weather  | 
|  4 | Geographical Maps | Airports |  



## What makes a plot good?

For this exercise the charts do not have to be particularly fancy or provide mind-blowing insights into the data, but they should contain all the elements that make a good plot.
Take the following plot as an example:

![example_plot](image/example_plot.png)

Like the plot above, your figures should have/be...
1. ... a meaningful title.
2. ... labels (with units when necessary) on both axes.
3. ... a legend (if necessary). Make sure it doesn't overlap other important elements.
4. ... text that is easily readable. You can change the font, increase the font size, rotate tick labels, flip the axes, etc. to improve readability.
5. ... not overloaded with information. Try to keep it clean and simple.
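
The checklist above can be sketched in a few lines of Matplotlib. This is only a minimal illustration, not the code behind the example image: the numbers are made up, and the names `months`, `temp_max` and `temp_min` are placeholders for whatever columns you end up plotting.

```python
import matplotlib.pyplot as plt

# Hypothetical values, for illustration only (not taken from the dataset)
months = [1, 2, 3, 4, 5, 6]
temp_max = [8.0, 10.0, 12.5, 15.0, 19.0, 22.0]
temp_min = [2.0, 3.0, 4.5, 6.0, 9.0, 12.0]

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(months, temp_max, marker="o", color="coral", label="max")
ax.plot(months, temp_min, marker="o", color="cornflowerblue", label="min")

# meaningful title
ax.set_title("Mean Temperature per Month (illustrative data)", fontsize=14)

# axis labels, with units where they apply
ax.set_xlabel("Month", fontsize=12)
ax.set_ylabel("Mean Temperature (°C)", fontsize=12)

# legend placed outside the plotting area so it does not cover any data
ax.legend(title="Temperature", loc="upper left", bbox_to_anchor=(1, 1), frameon=False)

# readable tick labels
ax.tick_params(axis="both", labelsize=11)

# keep the figure clean: hide the top and right frame lines
for side in ("top", "right"):
    ax.spines[side].set_visible(False)

fig.tight_layout()
```

In a notebook cell you would typically finish with `plt.show()`, as the example cells further down do.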

In [None]:
# Import necessary libraries
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import altair as alt

## Import Data
Start by importing the data. We added some lines of code for the groups that work with the Seattle Weather data to help you get started. Depending on how you name your dataframe, you might have to slightly adjust the code in the cell below.

In [None]:
# Import data
df_weather = pd.read_csv("data_group_work/seattle-weather.csv")

In [None]:
# You might have to adjust the code in this cell if you named your dataframe differently
# Convert date to datetime format
df_weather["date"] = pd.to_datetime(df_weather["date"])

# Create new columns for year, month, day
df_weather["year"] = df_weather["date"].dt.year
df_weather["month"] = df_weather["date"].dt.month
df_weather["day"] = df_weather["date"].dt.day

# Create new columns for extreme weather
# (above the 98 % quantile; for temp_min, below the 2 % quantile)
df_weather["ex_precipitation"] = (
    df_weather.precipitation > df_weather.precipitation.quantile(q=0.98)
)
df_weather["ex_wind"] = df_weather.wind > df_weather.wind.quantile(q=0.98)
df_weather["ex_temp_max"] = df_weather.temp_max > df_weather.temp_max.quantile(q=0.98)
df_weather["ex_temp_min"] = df_weather.temp_min < df_weather.temp_min.quantile(q=0.02)

# Combine extreme weather in one column (rain and wind)
df_weather["ex_p_w"] = np.logical_or(df_weather.ex_precipitation, df_weather.ex_wind)

# Compute monthly averages: drop the string column "weather" and the helper
# columns, then group the remaining numeric columns by month
df_avg = df_weather.drop(
    [
        "weather",
        "date",
        "year",
        "day",
        "ex_precipitation",
        "ex_wind",
        "ex_temp_max",
        "ex_temp_min",
        "ex_p_w",
    ],
    axis=1,
)
df_avg = df_avg.groupby("month").mean()
df_avg["month"] = df_avg.index
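
The extreme-weather flags in the cell above compare each value against a quantile threshold. A minimal sketch of that idea on a toy series, assuming that "extreme" means the top 2 % for high values and the bottom 2 % for low values; the series `values` is hypothetical and only serves as an illustration.

```python
import pandas as pd

# Hypothetical measurements 1, 2, ..., 100 (illustration only)
values = pd.Series(range(1, 101), dtype=float)

# Thresholds: 98 % of the values lie at or below `upper`,
# 2 % of the values lie at or below `lower`
upper = values.quantile(q=0.98)
lower = values.quantile(q=0.02)

# Boolean masks marking the extreme observations in each tail
is_extreme_high = values > upper
is_extreme_low = values < lower

print(values[is_extreme_high].tolist())
print(values[is_extreme_low].tolist())
```

Note the direction of the comparison: the upper tail uses `>` with a high quantile, the lower tail uses `<` with a low quantile.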

In [None]:
df_weather.head()

In [None]:
df_avg.head()

## 1. Matplotlib

In [None]:
# Monthly mean precipitation and temperature in Seattle (2012 - 2015)

# select style
plt.style.use("default")

# create subplot with 2 rows and 1 column
fig, ax = plt.subplots(2, 1, figsize=(4.5, 7.5))
fig.tight_layout(pad=2)

fig.suptitle(
    "Mean Precipitation and Temperature in Seattle (2012 - 2015)",
)

ax[0].scatter(df_avg["month"], df_avg["precipitation"], color="steelblue")
ax[0].set(ylabel="Mean Precipitation (mm)", xlabel="")


ax[1].scatter(df_avg["month"], df_avg["temp_max"], color="coral", label="max")
ax[1].set(ylabel="Mean Temperature (°C)", xlabel="Month")
ax[1].scatter(df_avg["month"], df_avg["temp_min"], color="cornflowerblue", label="min")
ax[1].legend(
    loc="upper left", bbox_to_anchor=(1, 0.6), frameon=False, title="Temperature"
)
plt.show()

## 2. Seaborn

In [None]:
# Extreme weather in Seattle defined by rain or wind

# select style
sns.set_style("white")
plt.style.use("fivethirtyeight")
# plot (note: sns.scatterplot returns a matplotlib Axes, stored here as `fig`)
fig = sns.scatterplot(
    data=df_weather,
    x="date",
    y="precipitation",
    palette="Blues",
    size="wind",
    hue="ex_p_w",
    alpha=0.5,
    sizes=(1, 100),
    legend="brief",
)

# adjust aesthetics
## rotate x axis labels
plt.xticks(rotation=20)
## adjust figure title and x and y axis labels
fig.set(
    title="Extreme Weather in Seattle (2012 - 2015)",
    xlabel="Date",
    ylabel="Precipitation (mm)",
)

## Override legend subtitles; keep the remaining labels as generated by seaborn
handles, previous_labels = fig.get_legend_handles_labels()
subtitles = {"ex_p_w": "Extreme Weather", "wind": "Wind"}
fig.legend(
    handles=handles,
    labels=[subtitles.get(label, label) for label in previous_labels],
)

## move legend outside of plot, remove frame
sns.move_legend(fig, "upper left", bbox_to_anchor=(1, 0.75), frameon=False)

plt.show()

In [None]:
# Extreme weather dates by maximum temperature

# select style
sns.set_style("white")
plt.style.use("fivethirtyeight")

# define color palette
colors = {
    False: "#1f77b4",
    True: "#9467bd",
}

# plot
ax = sns.scatterplot(
    data=df_weather,
    x="date",
    y="temp_max",
    hue="ex_temp_max",
    alpha=0.3,
    sizes=(1, 100),
    legend=False,
    palette=colors,
)

# adjust aesthetics
## rotate x axis labels
plt.xticks(rotation=20)
## adjust figure title and x and y axis labels
ax.set(
    title="High Temperature in Seattle (2012 - 2015)",
    xlabel="Date",
    ylabel="Temperature (°C)",
)

plt.show()

In [None]:
# Plot min vs max temp by weather

# select default styles
plt.style.use("default")
sns.set_theme()

# define colors for weather category
# hint: sns.color_palette() generates nice color palettes
colors = {
    "rain": "#1f77b4",
    "drizzle": "#aec7e8",
    "fog": "#a7a7a7",
    "sun": "#e7ba52",
    "snow": "#9467bd",
}

# optional: take a sample from the population to reduce overplotting
# df_sample = df_weather.sample(200)
fig = sns.lmplot(
    data=df_weather,
    x="temp_min",
    y="temp_max",
    hue="weather",
    col="weather",
    palette=colors,
)
fig.set(
    #title="Max. Temperature vs. Min. Temperature in Seattle",
    xlabel="Min. Temperature (°C)",
    ylabel="Max. Temperature (°C)",
)

plt.show()

In [None]:
# Pairplot of temperature and precipitation (pairwise scatter plots by weather type)
sns.set_style("white")
plt.style.use("fivethirtyeight")

# define colors for weather category
colors = {
    "rain": "#1f77b4",
    "drizzle": "#aec7e8",
    "fog": "#a7a7a7",
    "sun": "#e7ba52",
    "snow": "#9467bd",
}

g = sns.pairplot(
    df_weather[["precipitation", "temp_max", "temp_min", "wind", "weather"]],
    vars=["temp_max", "temp_min", "precipitation"],
    hue="weather",
    palette=colors,
)

g.fig.suptitle("Correlation Between Max./Min. Temperature And Precipitation", y=1.05)


plt.show()

## 3. Plotly

## 4. Altair

In [None]:
# Example taken from: https://medium.com/analytics-vidhya/interactive-data-viz-using-altair-873139771fe2

scale = alt.Scale(
    domain=["sun", "fog", "drizzle", "rain", "snow"],
    range=["#e7ba52", "#a7a7a7", "#aec7e8", "#1f77b4", "#9467bd"],
)
color = alt.Color("weather:N", scale=scale)

# We create two selections:
# - a brush that is active on the top panel
# - a multi-click that is active on the bottom panel
# Note: Altair 5+ deprecates selection_multi/add_selection in favor of
# selection_point/add_params; adjust if you see deprecation warnings.
brush = alt.selection_interval(encodings=["x"])
click = alt.selection_multi(encodings=["color"])

# 1. Top panel is a scatter plot of temperature vs. time
points = (
    alt.Chart()
    .mark_point()
    .encode(
        alt.X("monthdate(date):T", title="Date"),
        alt.Y(
            "temp_max:Q",
            title="Maximum Daily Temperature (°C)",
            scale=alt.Scale(domain=[-5, 40]),
        ),
        color=alt.condition(brush, color, alt.value("lightgray")),
        size=alt.Size("precipitation:Q", scale=alt.Scale(range=[5, 200])),
    )
    .properties(width=550, height=300)
    .add_selection(brush)
    .transform_filter(click)
)

# 2. Bottom panel is a bar chart of weather type
bars = (
    alt.Chart()
    .mark_bar()
    .encode(
        x="count()",
        y="weather:N",
        color=alt.condition(click, color, alt.value("lightgray")),
    )
    .transform_filter(brush)
    .properties(
        width=550,
    )
    .add_selection(click)
)

# 3. Build compound plot
alt.vconcat(points, bars, data=df_weather, title="Seattle Weather: 2012-2015")