# Intro to Altair: Explore Ukraine Data

Altair is a declarative Python visualization library that lets you build a wide range of statistical charts quickly and concisely. Read the [docs](https://altair-viz.github.io/) and browse the [example gallery](https://altair-viz.github.io/gallery/index.html).

#### Load Python tools

In [2]:
%load_ext lab_black

In [3]:
import pandas as pd
import altair as alt

In [4]:
pd.options.display.max_columns = 1000
pd.options.display.max_rows = 1000

---

### Ukrainian refugee counts since the war began


Source: [Humanitarian Data Exchange](https://data.humdata.org/visualization/ukraine-humanitarian-operations/)

In [5]:
refugees = pd.read_csv(
    "https://raw.githubusercontent.com/stiles/usc/main/data/processed/ukraine_refugees_series_data.csv"
)

In [6]:
refugees

| | date | affected_refugees | cumsum |
|---|---|---|---|
| 0 | 2022-02-24 | 84681 | 84681 |
| 1 | 2022-02-25 | 108301 | 192982 |
| 2 | 2022-02-26 | 148319 | 341301 |
| 3 | 2022-02-27 | 168364 | 509665 |
| 4 | 2022-02-28 | 162474 | 672139 |
| 5 | 2022-03-01 | 166690 | 838829 |
| 6 | 2022-03-02 | 194483 | 1033312 |
| 7 | 2022-03-03 | 167905 | 1201217 |
| 8 | 2022-03-04 | 172099 | 1373316 |
| 9 | 2022-03-05 | 197520 | 1570836 |
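Before charting, it can help to sanity-check the columns. The sketch below is an illustration only: it rebuilds the first five rows of the table by hand (so it runs without a network connection), parses `date` into real datetimes, and checks that `cumsum` is the running total of `affected_refugees`. The names `sample`, `recomputed` and `matches` are made up for this example.

```python
import pandas as pd

# First five rows of the refugees table, copied by hand from the output above
sample = pd.DataFrame(
    {
        "date": ["2022-02-24", "2022-02-25", "2022-02-26", "2022-02-27", "2022-02-28"],
        "affected_refugees": [84681, 108301, 148319, 168364, 162474],
        "cumsum": [84681, 192982, 341301, 509665, 672139],
    }
)

# read_csv loads dates as plain strings unless told otherwise; convert them explicitly
sample["date"] = pd.to_datetime(sample["date"])

# Recompute the running total and compare it with the column that ships in the CSV
recomputed = sample["affected_refugees"].cumsum()
matches = bool((recomputed == sample["cumsum"]).all())
print(matches)
```

If the check passes on the full frame as well, either column can be charted: the daily count for a rate, or the running total for overall scale.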


#### Build a [line chart](https://altair-viz.github.io/gallery/simple_line_chart.html) showing the refugees by day

In [7]:
alt.Chart(refugees).mark_line().encode(x="date:T", y="affected_refugees")

#### Add more features: color, line width, title, size

In [8]:
alt.Chart(refugees).mark_line(color="red", size=10).encode(
    x="date:T", y="affected_refugees"
).properties(
    width=380, height=300, title="Daily refugees fleeing Ukraine since Russia's invasion"
)

#### Try it as a column chart

In [9]:
alt.Chart(refugees).mark_bar(color="red", size=10).encode(
    x="date:T", y="affected_refugees"
).properties(
    width=380, height=300, title="Daily refugees fleeing Ukraine since Russia's invasion"
)

---

#### Read fatalities data

In [10]:
fatalities_df = pd.read_csv(
    "https://raw.githubusercontent.com/stiles/usc/main/data/processed/ukraine_incidents_fatalities_types_melted.csv"
)

In [11]:
fatalities_df.head()

Unnamed: 0,date_occurred,incident_type,fatalities,cumsum
0,2022-02-24,Battles,51.0,51.0
1,2022-02-25,Battles,19.0,70.0
2,2022-02-26,Battles,53.0,123.0
3,2022-02-27,Battles,30.0,153.0
4,2022-02-28,Battles,14.0,167.0


#### Group to count the number of fatalities by type

In [12]:
fatalities_df.groupby("incident_type")["fatalities"].sum().reset_index()

| | incident_type | fatalities |
|---|---|---|
| 0 | Battles | 926.0 |
| 1 | Explosions/Remote violence | 599.0 |
| 2 | Violence against civilians | 91.0 |


In [13]:
# Build a stacked area chart with the fatalities data

In [14]:
alt.Chart(fatalities_df).mark_area().encode(
    x="date_occurred:T", y="fatalities:Q", color="incident_type:N"
)

#### Try to make it cumulative

In [44]:
alt.Chart(fatalities_df).mark_area().encode(
    x="date_occurred:T", y="cumsum:Q", color="incident_type:N"
)
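The `cumsum` column arrives precomputed in the CSV. How it was built is not shown here, but one plausible way to derive a per-type running total in pandas is a grouped cumulative sum. The sketch below uses only the five `Battles` rows shown in the `head()` output; `battles` and `running_total` are names invented for this example.

```python
import pandas as pd

# The five rows shown by fatalities_df.head(), copied by hand
battles = pd.DataFrame(
    {
        "date_occurred": [
            "2022-02-24",
            "2022-02-25",
            "2022-02-26",
            "2022-02-27",
            "2022-02-28",
        ],
        "incident_type": ["Battles"] * 5,
        "fatalities": [51.0, 19.0, 53.0, 30.0, 14.0],
    }
)

# Sort so each group's rows are in date order before accumulating
battles = battles.sort_values(["incident_type", "date_occurred"])

# groupby(...).cumsum() restarts the running total for every incident type
battles["running_total"] = battles.groupby("incident_type")["fatalities"].cumsum()
print(battles["running_total"].tolist())
```

On this sample the result matches the `cumsum` values in the table above. With the full frame, the grouping keeps each incident type's total separate instead of accumulating across types.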

---

#### Clean up the refugees column chart

In [50]:
alt.Chart(refugees).mark_bar(color="red", size=10).encode(
    x=alt.X("date:T", axis=alt.Axis(format="%b. %-d", grid=False), title=""),
    y=alt.Y("affected_refugees", title="Number of refugees"),
).properties(
    width=380, height=300, title="Daily refugees fleeing Ukraine since Russia's invasion"
).configure_view(
    strokeOpacity=0
)


#### Facet charts for the fatalities

In [52]:
alt.Chart(fatalities_df).mark_area().encode(
    x="date_occurred:T", y="cumsum:Q", color="incident_type:N", facet="incident_type"
).properties(width=250, height=250)