# Data Visualization in Python
## Building your plots iteratively

Question
* How can a chart be customized using
  Altair's *grammar of graphics*?

Objectives
* Change the aesthetics of a plot such as color.
* Change the scale of the axes.
* Edit the plot title and the axis labels.

In [None]:
import pandas as pd

# Load the cleaned data
surveys_complete = pd.read_csv('../data/surveys_0_NA.csv')
surveys_complete

In [None]:
import altair as alt
alt.data_transformers.disable_max_rows()

Reminder: every Altair chart is a `Chart()`
object constructed from a DataFrame.
Then, a `mark_*()` method is called to specify the
type of chart, and some data fields are assigned
to encoding channels via the `encode()` method.

* We can then modify the chart in order to display more information.
  For example, with transparency:

In [None]:
alt.Chart(surveys_complete).mark_point().encode(
    x=alt.X('hindfoot_length'),
    y=alt.Y('weight'),
).configure_mark(
    opacity=0.05,
)

* To color the points by species, we need to encode
  the `species_id` field to the `color` channel:

In [None]:
alt.Chart(surveys_complete).mark_point().encode(
    x=alt.X('hindfoot_length'),
    y=alt.Y('weight'),
    color=alt.Color('species_id'),
).configure_mark(
    opacity=0.05,
)

* Because the colors are reused for multiple species, it is
  better to enable the `tooltip` channel with `species_id`:

In [None]:
alt.Chart(surveys_complete).mark_point().encode(
    x=alt.X('hindfoot_length'),
    y=alt.Y('weight'),
    color=alt.Color('species_id'),
    tooltip=['species_id'],
).configure_mark(
    opacity=0.05,
)

* The Y axis can be configured with a logarithmic scale:

In [None]:
alt.Chart(surveys_complete).mark_point().encode(
    x=alt.X('hindfoot_length'),
    y=alt.Y('weight').scale(type='log', base=2),
    color=alt.Color('species_id'),
    tooltip=['species_id'],
).configure_mark(
    opacity=0.05,
).properties(
    height=384,
)

* The chart title and the axis labels can be set:

In [None]:
alt.Chart(surveys_complete).mark_point().encode(
    x=alt.X('hindfoot_length').title('Hindfoot length (mm)'),
    y=alt.Y('weight').scale(type='log', base=2).title('Weight (g)'),
    color=alt.Color('species_id').title('Species ID'),
    tooltip=['species_id'],
).configure_mark(
    opacity=0.05,
).properties(
    height=384,
    title='Weight by the hindfoot length',
)

### Exercise - Enrich the bar chart
Modify the chart from the previous exercise by
encoding the `sex` field to a specific color scale:
* The `'sex'` field must be encoded to the `color` channel.
  The `.scale()` method can then associate domain values `'F'`
  and `'M'` to colors `'orange'` and `'green'`, respectively.
  See [an example here](https://altair-viz.github.io/user_guide/customization.html#color-domain-and-range).
* In the `tooltip` channel, add `'sex'` at the beginning of the list.
* Activate the `xOffset` channel and see what it does to the bar chart.

(4 min.)

In [None]:
alt.Chart(surveys_complete).mark_bar().encode(
    x=alt.X('plot_id').type('ordinal'),
    y=alt.Y('count()'),
    color=alt.Color('sex').scale(
        domain=['F', 'M'],
        range=['orange', 'green'],
    ),
    xOffset='sex',
    tooltip=['sex', 'count()'],
).properties(
    width=480,  # Fix the chart width (pixels)
)

## Key points
* **Assigning data fields to encoding channels**:
  * `chart.encode(...)`
  * Encoding channels:
    * `x=alt.X('field_for_X')` and `y=alt.Y('field_for_Y')`
      * `.scale(type='log', base=2)`
      * `.title('Name for the X or Y axis')`
    * `color=alt.Color('field_name_for_colors')`
      * `.scale(domain=[...], range=['#114499', ...])`
      * `.title('Displayed text for field_name_for_colors')`
    * `tooltip=['field_name1', 'field_name2', 'field_name3', ...]`
    * `xOffset='field_name'`
* **Other properties of the chart**:
  * `chart.configure_mark(...)`
    * `opacity=0.05`
  * `chart.properties(...)`
    * `width=400`
    * `height=300`
    * `title='Whole figure title'`