# Intro to Altair: Explore Ukraine Data

Altair is a declarative statistical visualization library for Python, built on Vega-Lite, that lets you build a wide range of charts with concise code. Read the [docs](https://altair-viz.github.io/) and browse the [example gallery](https://altair-viz.github.io/gallery/index.html).

#### Load Python tools

In [1]:
%load_ext lab_black

In [2]:
import pandas as pd
import altair as alt

In [3]:
pd.options.display.max_columns = 1000
pd.options.display.max_rows = 1000

---

### Ukrainian refugee counts since the war began

Source: [Humanitarian Data Exchange](https://data.humdata.org/visualization/ukraine-humanitarian-operations/)

In [4]:
refugees = pd.read_csv("../data/processed/ukraine_refugees_series_data.csv")

In [5]:
refugees.head()

Unnamed: 0,date,affected_refugees,cumsum
0,2022-02-24,84681,84681
1,2022-02-25,108301,192982
2,2022-02-26,148319,341301
3,2022-02-27,168364,509665
4,2022-02-28,162474,672139
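
The `cumsum` column looks like a running total of `affected_refugees`. A quick check, sketched here on the five rows shown above (the `sample` frame is a hand-copied stand-in, not the full file):

```python
import pandas as pd

# First five rows copied by hand from the refugees.head() output
sample = pd.DataFrame(
    {
        "date": ["2022-02-24", "2022-02-25", "2022-02-26", "2022-02-27", "2022-02-28"],
        "affected_refugees": [84681, 108301, 148319, 168364, 162474],
        "cumsum": [84681, 192982, 341301, 509665, 672139],
    }
)

# Use bracket access: sample.cumsum would resolve to the DataFrame method, not the column
running_total = sample["affected_refugees"].cumsum()
matches = bool((running_total == sample["cumsum"]).all())
print(matches)
```

Because `cumsum` is also the name of a DataFrame method, `sample["cumsum"]` is the safe way to reach the column.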


In [6]:
refugees.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25 entries, 0 to 24
Data columns (total 3 columns):
 #   Column             Non-Null Count  Dtype 
---  ------             --------------  ----- 
 0   date               25 non-null     object
 1   affected_refugees  25 non-null     int64 
 2   cumsum             25 non-null     int64 
dtypes: int64(2), object(1)
memory usage: 728.0+ bytes
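
Note that `date` loads as `object` (plain strings), not as a datetime. Altair can still treat it as a date when the encoding says so, but pandas date operations need a real datetime column. A minimal sketch of the conversion, using a hand-copied `sample` frame rather than the full file:

```python
import pandas as pd

# Dates and counts copied by hand from the refugees.head() output
sample = pd.DataFrame(
    {
        "date": ["2022-02-24", "2022-02-25", "2022-02-26", "2022-02-27", "2022-02-28"],
        "affected_refugees": [84681, 108301, 148319, 168364, 162474],
    }
)

# Convert the string column to datetimes in place
sample["date"] = pd.to_datetime(sample["date"])

# Datetime columns support date arithmetic and the .dt accessor
span_days = (sample["date"].max() - sample["date"].min()).days
first_weekday = sample["date"].dt.day_name().iloc[0]
print(sample["date"].dtype, span_days, first_weekday)
```

Passing `parse_dates=["date"]` to `pd.read_csv` would do the same conversion at load time.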


#### Build a [line chart](https://altair-viz.github.io/gallery/simple_line_chart.html) showing daily refugee counts

In [7]:
alt.Chart(refugees).mark_line().encode(x="date:T", y="affected_refugees")

#### Add more features: color, line width, title, size

In [8]:
alt.Chart(refugees).mark_line(color="red", size=5).encode(
    x="date:T", y="affected_refugees"
).properties(
    title="Daily refugees fleeing Ukraine since the Russian invasion",
    height=250,
    width=350,
)

#### Try it as a column chart

In [9]:
alt.Chart(refugees).mark_bar(size=10, color="red").encode(
    x="date:T", y="affected_refugees"
).properties(
    title="Daily refugees fleeing Ukraine since the Russian invasion",
    height=250,
    width=350,
)

---

#### Read fatalities data

In [10]:
fatalities_df = pd.read_csv(
    "../data/processed/ukraine_incidents_fatalities_types_melted.csv"
)

In [11]:
fatalities_df.head()

Unnamed: 0,date_occurred,incident_type,fatalities,cumsum
0,2022-02-24,Battles,51.0,51.0
1,2022-02-25,Battles,19.0,70.0
2,2022-02-26,Battles,53.0,123.0
3,2022-02-27,Battles,30.0,153.0
4,2022-02-28,Battles,14.0,167.0


#### Group to total fatalities by incident type

In [12]:
fatalities_grouped = (
    fatalities_df.groupby(["incident_type"])["fatalities"].sum().reset_index()
)

In [13]:
fatalities_grouped

Unnamed: 0,incident_type,fatalities
0,Battles,926.0
1,Explosions/Remote violence,599.0
2,Violence against civilians,91.0


In [14]:
alt.Chart(fatalities_grouped).mark_bar().encode(y="incident_type", x="fatalities")

#### Build a [stacked area chart](https://altair-viz.github.io/gallery/simple_stacked_area_chart.html) with fatalities data

In [15]:
alt.Chart(fatalities_df).mark_area().encode(
    x="date_occurred:T", y="fatalities:Q", color="incident_type:N"
)

#### Try to make it cumulative

In [16]:
alt.Chart(fatalities_df).mark_area().encode(
    x="date_occurred:T", y="cumsum:Q", color="incident_type:N"
).configure_legend(padding=10, orient="top")
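
The `cumsum` column arrives precomputed in the CSV, and the notebook does not show how it was built. One plausible way, assumed here rather than taken from the source, is a cumulative sum within each incident type. The sketch uses only the five `Battles` rows shown earlier:

```python
import pandas as pd

# Battles rows copied by hand from the fatalities_df.head() output
sample = pd.DataFrame(
    {
        "date_occurred": ["2022-02-24", "2022-02-25", "2022-02-26", "2022-02-27", "2022-02-28"],
        "incident_type": ["Battles"] * 5,
        "fatalities": [51.0, 19.0, 53.0, 30.0, 14.0],
    }
)

# Sort by type and date so each running total accumulates in chronological order
sample = sample.sort_values(["incident_type", "date_occurred"])

# groupby(...).cumsum() restarts the running total for every incident type
sample["cumsum"] = sample.groupby("incident_type")["fatalities"].cumsum()
print(sample)
```

With more than one incident type in the frame, the same two lines would produce a separate running total per type.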

---

#### Clean up the refugees column chart

In [17]:
alt.Chart(refugees).mark_bar(size=10, color="red").encode(
    x=alt.X("date:T", axis=alt.Axis(format="%b. %-d", grid=False), title="Date"),
    y=alt.Y(
        "affected_refugees",
        axis=alt.Axis(
            tickCount=5,
            domainOpacity=0,
            gridWidth=0.6,
            gridColor="#dddddd",
            offset=6,
            tickSize=0,
        ),
        title="Daily refugee counts",
    ),
).properties(
    title="Daily refugees fleeing Ukraine since the Russian invasion",
    height=250,
    width=350,
).configure_view(
    strokeOpacity=0
)

#### Facet charts for the fatalities

In [18]:
alt.Chart(fatalities_df).mark_area().encode(
    x=alt.X(
        "date_occurred:T",
        title="",
        axis=alt.Axis(format="%b. %-d", grid=False, tickCount=5),
    ),
    y=alt.Y(
        "cumsum:Q",
        title="Cumulative fatalities",
        axis=alt.Axis(
            tickCount=5,
            domainOpacity=0,
            gridWidth=0.6,
            gridColor="#dddddd",
            offset=6,
            tickSize=0,
        ),
    ),
    facet=alt.Facet("incident_type:N", title=" "),
    color=alt.Color("incident_type", legend=None),
).properties(
    width=150, height=150, title="Cumulative fatalities in Ukraine, by incident type"
).configure_view(
    strokeOpacity=0
)