<div style="text-align:center"><img src="altair_img.png" /></div>
<h1 style="text-align: center;"> Altair </h1>

## Helpful Links:

#### Parent Documentation - https://altair-viz.github.io/
#### Simple Charts - https://altair-viz.github.io/gallery/index.html#simple-charts

---

# Plots Discussed in this Notebook

|Scatter Plot | Area Plot | Bar Chart |
|:---:|:---:|:---:|
|<img src="img/scatter.png" >|<img src="img/area_plot.png" >|<img src="img/bar_chart.png" width=300 height=300>|

| Heatmap | Histogram | Image Chart |
|:---:|:---:|:---:|
|<img src="img/heatmap.png" >|<img src="img/histogram.png" >|<img src="img/image_chart.png" >|

| Line Plot | Pie Chart | Pair Plot |
|:---:|:---:|:---:|
|<img src="img/line_chart.png" >|<img src="img/pie.png" >|<img src="img/pair.png" >|

---

# Import Libraries

In [1]:
# !pip install altair vega_datasets
import altair as alt
from vega_datasets import data
import numpy as np
import pandas as pd

# Prepare Demo Datasets
Each dataset is described in the section that uses it.
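
Two of the frames prepared in the next cell (`area_df` and `bar_df`) come from a group-and-count step. As a minimal sketch of what that step produces, the snippet below runs it on a made-up four-row frame; `toy` and its values are hypothetical and are not part of the notebook's data.

```python
import pandas as pd

# Hypothetical miniature of the cars data: one row per car
toy = pd.DataFrame({
    "Year": [1970, 1970, 1970, 1971],
    "Origin": ["USA", "USA", "Japan", "USA"],
})
toy["Unit"] = 1  # helper column, mirroring source["Unit"] = 1

# Count the rows in each (Year, Origin) group
counts = toy.groupby(by=["Year", "Origin"], as_index=False).agg({"Unit": "count"})
print(counts)
```

The string `"count"` is used here in place of `pd.Series.count`; both ask pandas for the number of non-null values per group.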

In [2]:
source = data.cars()
source["Unit"] = 1
scatter_df = source.copy()     # Scatter plot dataframe
pair_df = source.copy()        # Pair plot dataframe
hist_df = source.copy()        # Histogram dataframe

# Area chart dataframe
area_df = source.groupby(by=["Year","Origin"], as_index=False).agg({"Unit":pd.Series.count}).copy()

# Bar chart dataframe
bar_df = source.groupby(by=["Year","Origin"], as_index=False).agg({"Unit":pd.Series.count}).copy()

# Line plot dataframe
line_df = source.groupby(by="Year", as_index=False).agg({"Miles_per_Gallon": pd.Series.sum}).copy()

# Pie chart dataframe
pie_df = pd.DataFrame(data = source.Origin.value_counts().reset_index())
pie_df.columns = ["Origin", "Count"]

# Heatmap dataframe
x = np.random.randint(1, 11, size=(1000))
y = np.random.randint(1, 11, size=(1000))
z = np.random.randint(1, 11, size=(1000))
heat_df = pd.DataFrame({"x":x, "y":y, "z":z})

# Image Plot dataframe
fifa_df = pd.read_csv("players_22.csv", usecols=["short_name", "overall", "age", "club_name", "nationality_name"])
image_df = fifa_df[(fifa_df.club_name == "Manchester United")].copy()
image_df = image_df[image_df.nationality_name=="Portugal"].copy()
# One URL per remaining row; the list length must equal len(image_df)
image_df["player_url"] = ["https://futhead.cursecdn.com/static/img/20/players/20801.png",
                          "https://futhead.cursecdn.com/static/img/21/players/212198.png",
                          "https://futhead.cursecdn.com/static/img/21/players/234574.png"]

---

# Data Types
Altair encodings label each column with one of four data types:
<br>Ordinal **(O)**: Discrete values with a meaningful order (e.g. rank, year)
<br>Quantitative **(Q)**: Continuous numeric values (e.g. marks obtained by students in a class, stock prices)
<br>Nominal **(N)**: Unordered labels, names, or classes (e.g. country names, colors)
<br>Temporal **(T)**: Date-time values (e.g. dates, hours)

---

# 1. Scatter Plot

Format:<br>
`alt.Chart(dataframe).mark_point().encode(
    x=alt.X("quant_val_1:Q", axis=alt.Axis(title="<X-axis-label>")),
    y=alt.Y("quant_val_2:Q", axis=alt.Axis(title="<Y-axis-label>")),
    color="nominal_val_1:N",
    tooltip=["nominal_val_2:N"],
).properties(title="<Enter Title>")`

Example Usage:
<br>dataframe = scatter_df
<br>quant_val_1(Q) = Horsepower
<br>quant_val_2(Q) = Miles_per_Gallon
<br>nominal_val_1(N) = Origin
<br>nominal_val_2(N) = Name


In [3]:
scatter_df.head(3)

| | Name | Miles_per_Gallon | Cylinders | Displacement | Horsepower | Weight_in_lbs | Acceleration | Year | Origin | Unit |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | chevrolet chevelle malibu | 18.0 | 8 | 307.0 | 130.0 | 3504 | 12.0 | 1970-01-01 | USA | 1 |
| 1 | buick skylark 320 | 15.0 | 8 | 350.0 | 165.0 | 3693 | 11.5 | 1970-01-01 | USA | 1 |
| 2 | plymouth satellite | 18.0 | 8 | 318.0 | 150.0 | 3436 | 11.0 | 1970-01-01 | USA | 1 |


In [23]:
# Scatter Plot

alt.Chart(scatter_df).mark_point().encode(
    x=alt.X("Horsepower:Q", axis=alt.Axis(title="HP")),
    y=alt.Y("Miles_per_Gallon:Q", axis=alt.Axis(title="MPG")),
    color="Origin:N",
    tooltip=["Name:N"],
).properties(title="Scatter Plot", width=500, height=300)

---

# 2. Area Plot

Format:<br>
`alt.Chart(dataframe).mark_area().encode(
    x="temporal_val:T", y="quant_val:Q", color="nominal_val:N"
).properties(title="<Enter Title>", height=300, width=500)`

Example Usage:<br>dataframe = area_df
<br>temporal_val(T) = Year
<br>quant_val(Q) = Unit
<br>nominal_val(N) = Origin

In [27]:
area_df.head(3)

| | Year | Origin | Unit |
|---|---|---|---|
| 0 | 1970-01-01 | Europe | 6 |
| 1 | 1970-01-01 | Japan | 2 |
| 2 | 1970-01-01 | USA | 27 |


In [28]:
# Area Plot
alt.Chart(area_df).mark_area().encode(
    x="Year:T", y="Unit:Q", color="Origin:N"
).properties(title="Area Chart")

---

# 3. Bar Chart

A. Stacked Bar Chart Format

`alt.Chart(dataframe).mark_bar(size=<int>).encode(
    x="temporal_val:T", y="quantitative_val:Q", color="nominal_val:N"
).properties(title="<Enter Title>")`

Example Usage:<br>dataframe = bar_df <br>nominal_val(N) = Origin <br>quantitative_val(Q) = Unit <br>temporal_val(T) = Year
<br><br><br>
B. Unstacked Bar Chart Format

`alt.Chart(dataframe).mark_bar(size=<int>).encode(
    x="nominal_val:N", y="quantitative_val:Q", color="nominal_val:N", column="temporal_val:T",
).properties(title="<Enter Title>")`

Example Usage:<br>dataframe = bar_df <br>nominal_val(N) = Origin <br>quantitative_val(Q) = Unit <br>temporal_val(T) = Year

In [29]:
bar_df.head(3)

| | Year | Origin | Unit |
|---|---|---|---|
| 0 | 1970-01-01 | Europe | 6 |
| 1 | 1970-01-01 | Japan | 2 |
| 2 | 1970-01-01 | USA | 27 |


In [30]:
# Stacked
alt.Chart(bar_df).mark_bar(size=20).encode(
    x="Year:T", y="Unit:Q", color="Origin:N"
).properties(title="Stacked Bar Chart")

In [31]:
# Unstacked
alt.Chart(bar_df).mark_bar(size=5).encode(
    x="Origin:N", y="Unit:Q", color="Origin:N", column="Year:T"
).properties(title="Unstacked Bar Chart")

---

# 4. Heatmap

Format:<br>
`alt.Chart(dataframe).mark_rect().encode(
    x='ordinal_val_1:O',
    y='ordinal_val_2:O',
    color='quant_val:Q'
).properties(title="<Enter Title>", height=300, width=300)`

Example Usage:<br>dataframe = heat_df <br>ordinal_val_1(O) = x <br>ordinal_val_2(O) = y <br>quant_val(Q) = z

In [41]:
heat_df.head(3)

| | x | y | z |
|---|---|---|---|
| 0 | 4 | 4 | 2 |
| 1 | 1 | 1 | 6 |
| 2 | 6 | 6 | 2 |


In [42]:
# Heatmap
alt.Chart(heat_df).mark_rect().encode(
    x="x:O", y="y:O", color="z:Q"
).properties(title="Heat Map")

---

# 5. Histogram

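Format:<br>
`alt.Chart(dataframe).mark_bar().encode(x=alt.X("quant_val:Q", bin=True), y="count()").properties(title="<Enter Title>")`

`bin=True` groups the values into intervals before `count()` tallies them; without it, rows are counted per distinct value.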

Example Usage:<br>dataframe = hist_df <br> quant_val(Q) = Weight_in_lbs

In [51]:
hist_df.head(3)

| | Name | Miles_per_Gallon | Cylinders | Displacement | Horsepower | Weight_in_lbs | Acceleration | Year | Origin | Unit |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | chevrolet chevelle malibu | 18.0 | 8 | 307.0 | 130.0 | 3504 | 12.0 | 1970-01-01 | USA | 1 |
| 1 | buick skylark 320 | 15.0 | 8 | 350.0 | 165.0 | 3693 | 11.5 | 1970-01-01 | USA | 1 |
| 2 | plymouth satellite | 18.0 | 8 | 318.0 | 150.0 | 3436 | 11.0 | 1970-01-01 | USA | 1 |


In [54]:
# Histogram
# bin=True groups the weights into intervals before counting
alt.Chart(hist_df).mark_bar().encode(
    x=alt.X("Weight_in_lbs:Q", bin=True), y="count()"
).properties(title="Histogram", width=600, height=300)

---

# 6. Image Chart

Format:<br>
`alt.Chart(dataframe).mark_image(width=50, height=50).encode(
    x="quant_val_1:Q", y="quant_val_2:Q", url="url_column:N", tooltip=[<list of columns to show for each image>]
)`

Example Usage:<br>
dataframe = image_df<br>
quant_val_1(Q) = age<br>
quant_val_2(Q) = overall<br>
url_column(N) = player_url<br>
The tooltip can include short_name, overall, age, etc.

In [55]:
image_df

| | short_name | overall | age | club_name | nationality_name | player_url |
|---|---|---|---|---|---|---|
| 2 | Cristiano Ronaldo | 91 | 36 | Manchester United | Portugal | https://futhead.cursecdn.com/static/img/20/pla... |
| 28 | Bruno Fernandes | 88 | 26 | Manchester United | Portugal | https://futhead.cursecdn.com/static/img/21/pla... |
| 1396 | Diogo Dalot | 76 | 22 | Manchester United | Portugal | https://futhead.cursecdn.com/static/img/21/pla... |


In [57]:
# Image Chart
alt.Chart(image_df).mark_image(height=50, width=50).encode(
    x="age:Q", y="overall:Q", url="player_url", tooltip=["short_name", "age", "overall"]
).properties(title="Image Chart")

---

# 7. Line Plot

Format:<br>
`alt.Chart(dataframe).mark_line().encode(
    x=alt.X("temporal_val:T", axis=alt.Axis(title="<X-axis-label>")),
    y=alt.Y("quant_val:Q", axis=alt.Axis(title="<Y-axis-label>")),
    tooltip=["temporal_val:T", "quant_val:Q"],
).properties(title="<Plot Title>")`

Example Usage:
<br>dataframe = line_df
<br>temporal_val(T) = Year
<br>quant_val(Q) = Miles_per_Gallon

In [58]:
line_df.head(3)

| | Year | Miles_per_Gallon |
|---|---|---|
| 0 | 1970-01-01 | 513.0 |
| 1 | 1971-01-01 | 595.0 |
| 2 | 1972-01-01 | 524.0 |


In [61]:
# Line Plot
alt.Chart(line_df).mark_line().encode(
    x="Year:T", y="Miles_per_Gallon:Q", tooltip=["Year", "Miles_per_Gallon"]
).properties(title="Line Chart").interactive()

---

# 8. Pie Chart

Pie Chart Format:<br>
`alt.Chart(dataframe).mark_arc().encode(
    theta="quant_val:Q", color="nominal_val:N", tooltip=["quant_val:Q", "nominal_val:N"]
)`

Donut Chart Format:<br>
`alt.Chart(dataframe).mark_arc(innerRadius=<int>).encode(
    theta="quant_val:Q", color="nominal_val:N", tooltip=["quant_val:Q", "nominal_val:N"]
)`


Example Usage:<br>
dataframe = pie_df<br>
quant_val(Q) = Count<br>
nominal_val(N) = Origin

In [64]:
pie_df

| | Origin | Count |
|---|---|---|
| 0 | USA | 254 |
| 1 | Japan | 79 |
| 2 | Europe | 73 |


In [65]:
# Pie Chart
alt.Chart(pie_df).mark_arc().encode(
    theta="Count:Q", color="Origin:N", tooltip=["Origin", "Count"]
).properties(title="Pie Chart")

In [79]:
# Donut Chart
alt.Chart(pie_df).mark_arc(innerRadius=100).encode(
    theta="Count:Q", color="Origin:N", tooltip=["Origin", "Count"]
).properties(title="Donut Chart")

---

# 9. Pair Plot

Format:<br>
`alt.Chart(dataframe).mark_point().encode(
    x=alt.X(alt.repeat("column"), type="quantitative"),
    y=alt.Y(alt.repeat("row"), type="quantitative"),
    color="nominal_val_1:N",
    tooltip=["nominal_val_2:N"],
).properties(title="Pair Plot", height=200, width=200).repeat(
    row=quant_list,
    column=quant_list,
)`

Example Usage:
<br>dataframe = pair_df
<br>quant_list = ["Horsepower", "Acceleration", "Miles_per_Gallon"]
<br>nominal_val_1(N) = Origin
<br>nominal_val_2(N) = Name

In [92]:
pair_df.head(3)

| | Name | Miles_per_Gallon | Cylinders | Displacement | Horsepower | Weight_in_lbs | Acceleration | Year | Origin | Unit |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | chevrolet chevelle malibu | 18.0 | 8 | 307.0 | 130.0 | 3504 | 12.0 | 1970-01-01 | USA | 1 |
| 1 | buick skylark 320 | 15.0 | 8 | 350.0 | 165.0 | 3693 | 11.5 | 1970-01-01 | USA | 1 |
| 2 | plymouth satellite | 18.0 | 8 | 318.0 | 150.0 | 3436 | 11.0 | 1970-01-01 | USA | 1 |


In [93]:
# Pair Plot
alt.Chart(pair_df).mark_point().encode(
    x=alt.X(alt.repeat("column"), type="quantitative"),
    y=alt.Y(alt.repeat("row"), type="quantitative"),
    color="Origin:N",
    tooltip=["Name:N"],
).properties(title="Pairplot", height=200, width=200).repeat(
    row=["Horsepower", "Acceleration", "Miles_per_Gallon"],
    column=["Horsepower", "Acceleration", "Miles_per_Gallon"],
)

---

End of the Notebook<br>
Author: Shounak Deshpande (shounak.python@gmail.com)<br>
Youtube: https://www.youtube.com/channel/UCpODmuqv_ljQ_vMYwO71a_g