# Set-up
Import altair

In [0]:
import altair as alt

Let's use datasets from [vega_datasets](https://github.com/vega/vega-datasets)

In [0]:
from vega_datasets import data
df = data.cars()
df.head()

# Scatterplot

Minimal scatterplot using `cars` dataset:

In [0]:
alt.Chart(df).mark_point().encode(
    x='Miles_per_Gallon',
    y='Horsepower')

With `circle` marks:

In [0]:
alt.Chart(df).mark_circle(opacity=0.5).encode(  # semi-transparent circles
    x='Miles_per_Gallon',
    y='Horsepower')

# Bar chart
Bar chart of car model counts per `Origin`:

In [0]:
alt.Chart(df).mark_bar().encode(
  x='Origin',
  y='count(Origin)')

Same chart, built from a dataframe of precomputed counts:

In [0]:
origin_counts = df[['Origin']].groupby('Origin').size().reset_index(name='counts')
origin_counts

alt.Chart(origin_counts).mark_bar().encode(
  x='Origin',
  y='counts')

With horizontal x-axis labels (`labelAngle=0`) and a fixed chart width:

In [0]:
alt.Chart(df, width=300).mark_bar().encode(
  x=alt.X('Origin', axis=alt.Axis(labelAngle=0)),
  y='count(Origin)')

# Histogram

In [0]:
alt.Chart(df).mark_bar().encode(
    alt.X("Miles_per_Gallon", bin=True),
    y='count()')

# Customizations
- Axis labels
- Transparency
- Title
- Tooltips
- Axis range
- Interactive

In [0]:
alt.Chart(df).mark_circle(opacity=0.5).encode(
  x=alt.X('Miles_per_Gallon', axis=alt.Axis(title='Miles per gallon'), scale=alt.Scale(zero=False)),
  y=alt.Y('Horsepower', axis=alt.Axis(title='Horsepower'), scale=alt.Scale(zero=False)),  
  color=alt.Color('Origin', legend=alt.Legend(title="Origin")),
  tooltip = ['Miles_per_Gallon', 'Horsepower']
  ).properties(
    title='Cars Data',
    width=300,
    height=180
  ).interactive()

# Exercises

## 😜 Exercise 1

Use `Altair` to create a scatterplot of `sepalLength` vs. `sepalWidth` for the `iris` dataset from `vega_datasets`, where the circles are colored by `species`. Set axis labels and a title. Use transparency and resize the figure to deal with overplotting.

## 😜 Exercise 2

Use `Altair` to create a scatterplot of `sepalLength` vs. `sepalWidth` for the `iris` dataset, where the circle size depends on `petalLength`. Set axis labels and a title. Use transparency and resize the figure to deal with overplotting.

Hints:
- Use `size='petalLength'` to encode the size of the circles

## 🤔 Exercise 3

Use `Altair` to create a scatterplot of `alt` vs. `ptime` for the `SMO-VOR-2015` dataset. Set axes labels and title. Use transparency and resize the figure to deal with overplotting. Make the plot interactive adding tooltips to show `flight`, `alt` and `ptime` information.

Hints:
- Change `ptime` to a datetime object with `df.ptime = pd.to_datetime(df.ptime)`
- Remove the limitation on the maximum dataset rows with:
```python
alt.data_transformers.disable_max_rows()
```

## 😜 Exercise 4

Use `Altair` to create a boxplot of `alt` per `month` for the `SMO-VOR-2015` dataset. Set axes labels and title. Resize the figure as needed.

Hints:
- Remove the limitation on the maximum dataset rows with:
```python
alt.data_transformers.disable_max_rows()
```
- Change `ptime` to a datetime object with `df.ptime = pd.to_datetime(df.ptime)`
- Use `sort=None` for the `month` encoding and order `month` with:
```python
months = ['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec']
df.month = pd.Categorical(df.month, ordered=True, categories=months)
```


## 🤔 Exercise 5

Use `Altair` to create a histogram of `alt` for the `SMO-VOR-2015` dataset. Set axis labels and a title. Resize the figure as needed.

Hints:
- Remove the limitation on the maximum dataset rows with:
```python
alt.data_transformers.disable_max_rows()
```
- Change `ptime` to a datetime object with `df.ptime = pd.to_datetime(df.ptime)`


## 😜 Exercise 6

Use `Altair` to create a histogram of `alt` for the `SMO-VOR-2015` dataset, faceted by `month`. Set axis labels and a title. Resize the figure as needed.

Hints:
- Remove the limitation on the maximum dataset rows with:
```python
alt.data_transformers.disable_max_rows()
```
- Use `sort=None` for the `month` encoding and order `month` with:
```python
months = ['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec']
df.month = pd.Categorical(df.month, ordered=True, categories=months)
```