# 01: Altair Continued

Answers to some questions about Altair.

In [None]:
import altair as alt
import pandas as pd
from vega_datasets import data

cars = data.cars()
bike = pd.read_csv('https://raw.githubusercontent.com/christophM/interpretable-ml-book/master/data/bike.csv')

## Can you sort by a column that is not encoded in the visualization?

Yes. Here's an example where columns `x` and `y` are encoded in the bar chart and the bars are sorted by column `z`, which is not encoded.

In [None]:
df = pd.DataFrame({
    'x': ['A', 'B', 'C', 'D'],
    'y': [4, 3, 2, 1],
    'z': [3, 1, 2, 4]
})

alt.Chart(df).mark_bar().encode(
    x=alt.X('x').sort(field='z'),
    y='y'
)

Here's a more complex example. We have a bar chart of the median number of bikes rented by month where we sort the months by average temperature.

In [None]:
alt.Chart(bike).mark_bar().encode(
    x=alt.X('mnth').sort(field='temp', op='average'),
    y='median(cnt)',
)
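One way to sanity-check the order the chart produces is to reproduce the aggregation in pandas. This is a minimal sketch on a small made-up frame (the `toy` values are hypothetical, not taken from the bike dataset); the same `groupby` should work on `bike` itself.

```python
import pandas as pd

# Hypothetical stand-in for the bike data: two rows per month.
toy = pd.DataFrame({
    'mnth': ['JAN', 'JAN', 'JUL', 'JUL', 'APR', 'APR'],
    'temp': [1.0, 3.0, 25.0, 27.0, 12.0, 14.0],
})

# Average temperature per month, sorted ascending to match the chart's sort.
month_order = toy.groupby('mnth')['temp'].mean().sort_values().index.tolist()
print(month_order)  # ['JAN', 'APR', 'JUL']
```

A list like `month_order` could also be passed directly as an explicit sort order, for example `alt.X('mnth', sort=month_order)`.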

## Can you change the background color of a chart?

Yes, you can set the background color of the entire chart.

In [None]:
alt.Chart(bike).mark_point().encode(
    x=alt.X('temp').axis(gridColor='white'),
    y=alt.Y('hum').axis(gridColor='white'),
).properties(
    background='#EEEEEE'
)

Or you can set the background color of just the data area, which Altair calls the chart's view.

In [None]:
alt.Chart(bike).mark_point().encode(
    x=alt.X('temp').axis(gridColor='white'),
    y=alt.Y('hum').axis(gridColor='white'),
).configure_view(
    fill='#EEEEEE'
)

You can also choose from a set of predefined themes.

In [None]:
alt.themes

In [None]:
alt.themes.enable('default')

plot = alt.Chart(bike).mark_point().encode(
    x='temp',
    y='hum',
    color='season',
)

plot

In [None]:
alt.themes.enable('dark')
plot

In [None]:
alt.themes.enable('fivethirtyeight')
plot

In [None]:
alt.themes.enable('ggplot2')
plot

In [None]:
alt.themes.enable('latimes')
plot

In [None]:
alt.themes.enable('opaque')
plot

In [None]:
alt.themes.enable('quartz')
plot

In [None]:
alt.themes.enable('urbaninstitute')
plot

In [None]:
alt.themes.enable('vox')
plot

In [None]:
alt.themes.enable('default')

## Can you concatenate an arbitrary number of charts?

If you have a list of plots, then you can use `alt.hconcat` or `alt.vconcat` to concatenate them. For layering, there's `alt.layer`.

In [None]:
plots = [plot] * 3
plots

In [None]:
alt.hconcat(*plots)

In [None]:
alt.vconcat(*plots)

## Can you specify the amount of padding between concatenated charts?

Yes, we can set the `spacing` property for `alt.vconcat` or `alt.hconcat`.

In [None]:
alt.hconcat(plot, plot, spacing=0)

In [None]:
alt.hconcat(plot, plot, spacing=100)

## Why does the chart below look fine in Altair version 4.1.0 but not 4.2.0?

In [None]:
base = alt.Chart(bike)

scatter = base.mark_point().encode(
    x='temp',
    y='hum'
)

right_ticks = base.mark_tick().encode(
    y=alt.Y('hum').axis(None),
    opacity=alt.value(0.2)
)

top_ticks = base.mark_tick(opacity=0.2).encode(
    x=alt.X('temp').axis(None)
)

chart = top_ticks & (scatter | right_ticks)

chart

In Altair v4.1.0, the output of `print(chart.to_json())` is:

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.8.1.json",
  "config": {
    "view": {
      "continuousHeight": 300,
      "continuousWidth": 400
    }
  },
  "data": {
    "format": {
      "type": "json"
    },
    "url": "data/altair-data-609d2befa370d2384ff50d4edb9c0801.json"
  },
  "vconcat": [
    {
      "encoding": {
        "x": {
          "axis": null,
          "field": "temp",
          "type": "quantitative"
        }
      },
      "mark": {
        "opacity": 0.2,
        "type": "tick"
      }
    },
    {
      "hconcat": [
        {
          "encoding": {
            "x": {
              "field": "temp",
              "type": "quantitative"
            },
            "y": {
              "field": "hum",
              "type": "quantitative"
            }
          },
          "mark": "point"
        },
        {
          "encoding": {
            "opacity": {
              "value": 0.2
            },
            "y": {
              "axis": null,
              "field": "hum",
              "type": "quantitative"
            }
          },
          "mark": "tick"
        }
      ]
    }
  ]
}
```

In Altair v4.2.0, the output of `print(chart.to_json())` is:

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.17.0.json",
  "config": {
    "view": {
      "continuousHeight": 300,
      "continuousWidth": 400
    }
  },
  "data": {
    "format": {
      "type": "json"
    },
    "url": "data/altair-data-0ac8c8d045e15b64cb25e05e9ad46beb.json"
  },
  "vconcat": [
    {
      "encoding": {
        "x": {
          "axis": null,
          "field": "temp",
          "type": "quantitative"
        }
      },
      "mark": {
        "opacity": 0.2,
        "type": "tick"
      }
    },
    {
      "hconcat": [
        {
          "encoding": {
            "x": {
              "field": "temp",
              "type": "quantitative"
            },
            "y": {
              "field": "hum",
              "type": "quantitative"
            }
          },
          "mark": "point"
        },
        {
          "encoding": {
            "opacity": {
              "value": 0.2
            },
            "y": {
              "axis": null,
              "field": "hum",
              "type": "quantitative"
            }
          },
          "mark": "tick"
        }
      ]
    }
  ]
}
```

The only difference between the two specifications is the version of Vega-Lite that is used (v4.8.1 vs. v4.17.0). I believe this is related to this [Vega-Lite issue](https://github.com/vega/vega-lite/issues/6209). The above specification works in Vega-Lite v5, but the most recently released version of Altair does not use Vega-Lite v5.

I haven't been able to make this chart look right in Altair v4.2.0 by adjusting the dimensions. Experimenting with this [example](https://altair-viz.github.io/gallery/scatter_marginal_hist.html) shows that the sizing is correct when the top marginal chart has a y-scale. A workaround is to add a column of 0's to the dataset and use it for the y encoding of the top marginal chart, which gives that chart a y-scale.

In [None]:
base = alt.Chart(bike).transform_calculate(zeros='0')

scatter = base.mark_point().encode(
    x='temp',
    y='hum'
)

right_ticks = base.mark_tick().encode(
    y=alt.Y('hum').axis(None),
    opacity=alt.value(0.2)
)

top_ticks = base.mark_tick(opacity=0.2).encode(
    x=alt.X('temp').axis(None),
    y=alt.Y('zeros:Q').axis(None)
).properties(height=20)

chart = top_ticks & (scatter | right_ticks)

chart

## Can you use shorthand elsewhere in the API?

For example, is there a way to specify that we want the midpoint of this diverging color scale to be the median count other than by calculating it in pandas, like below? Based on this [open Vega-Lite GitHub issue](https://github.com/vega/vega-lite/issues/8020), this does not seem possible.

In [None]:
alt.Chart(bike).mark_circle().encode(
    x='temp',
    y='hum',
    color=alt.Color('cnt').scale(scheme='redblue', domainMid=bike['cnt'].median(), reverse=True)
)

## Can you specify custom continuous color schemes?

Yes, you can pass an array of colors and it will interpolate between them.

In [None]:
# https://gist.github.com/jscarto/6cc7f547bb7d5d9acda51e5c15256b01
blue_fluorite = ['#291b32', '#2a1b34', '#2b1b34', '#2d1c36', '#2f1c38', '#301c39', '#301d3a', '#321d3b', '#331d3d', '#351d3f', '#351e40', '#371e41', '#381e43', '#3a1e45', '#3b1f45', '#3c1f46', '#3e1f48', '#3f1f4a', '#401f4c', '#42204d', '#43204e', '#44204f', '#462051', '#472052', '#482054', '#4a2056', '#4a2157', '#4c2158', '#4e215a', '#4f215b', '#50215d', '#52215e', '#532160', '#552162', '#552263', '#562264', '#582265', '#592267', '#5b2268', '#5c226b', '#5e226c', '#5f226e', '#60226f', '#622271', '#632272', '#642274', '#662276', '#672277', '#692278', '#6a227a', '#6c227b', '#6e227d', '#6e237e', '#6f247f', '#702480', '#712581', '#722681', '#732683', '#742783', '#752884', '#762985', '#772987', '#792a87', '#792b88', '#7a2c89', '#7b2c8a', '#7c2d8a', '#7d2d8c', '#7e2e8d', '#7f2f8d', '#80308e', '#813190', '#823191', '#833292', '#843292', '#863393', '#863494', '#873595', '#893596', '#8a3697', '#8b3798', '#8b3899', '#8c389a', '#8e399b', '#8e3a9c', '#8f3b9c', '#8f3d9d', '#8f3e9e', '#903f9e', '#90419e', '#90439f', '#9044a0', '#9046a0', '#9047a1', '#9049a1', '#914aa2', '#914ca2', '#914ca3', '#914ea3', '#9150a4', '#9151a5', '#9153a5', '#9154a6', '#9156a6', '#9157a7', '#9258a7', '#9259a8', '#925aa8', '#925ba9', '#925da9', '#925faa', '#9260ab', '#9260ab', '#9263ac', '#9264ac', '#9265ad', '#9266ae', '#9268ae', '#9269ae', '#926aaf', '#926bb0', '#926cb0', '#926eb1', '#926fb1', '#9270b2', '#9271b2', '#9273b3', '#9274b3', '#9275b4', '#9277b5', '#9277b5', '#9278b6', '#927ab6', '#927bb7', '#927cb7', '#927eb8', '#927fb8', '#9280b9', '#9281ba', '#9282ba', '#9284bb', '#9285bb', '#9285bc', '#9187bc', '#9188bd', '#918abd', '#918bbe', '#918cbf', '#918dbf', '#918ec0', '#918fc0', '#9191c1', '#9092c2', '#9094c2', '#9094c2', '#9095c3', '#9096c3', '#8f99c4', '#8f9ac5', '#8f9ac5', '#8f9bc6', '#8f9cc6', '#8f9dc7', '#8e9fc8', '#8ea0c8', '#8ea2c9', '#8ea3c9', '#8da5ca', '#8da5ca', '#8da6cb', '#8da7cb', '#8ca9cc', '#8caacc', '#8caccd', '#8bacce', '#8badce', '#8baecf', '#8ab0d0', '#8ab2d0', '#8ab2d1', 
'#8ab4d1', '#89b4d1', '#89b5d2', '#89b7d2', '#88b8d3', '#88bad4', '#87bad4', '#87bbd5', '#86bdd6', '#86bed6', '#86c0d7', '#85c0d7', '#85c1d8', '#84c3d8', '#84c4d9', '#83c5d9', '#83c6da', '#82c8da', '#82c8db', '#81cadc', '#81cbdc', '#80ccdd', '#81cddd', '#84cfdd', '#85cfdd', '#87d0dd', '#8ad0de', '#8dd1de', '#8fd2de', '#90d2de', '#92d4de', '#95d5de', '#97d5de', '#98d6de', '#9bd7de', '#9dd7df', '#a0d8df', '#a1d9df', '#a2dadf', '#a5dadf', '#a7dbdf', '#aadcdf', '#abdddf', '#acdde0', '#afdfe0', '#b1dfe0', '#b3e0e0', '#b4e1e0', '#b7e2e0', '#bae2e1', '#bae3e1', '#bee3e2', '#c0e4e3', '#c1e5e3', '#c4e6e3', '#c6e6e4', '#c8e7e4', '#cbe7e5', '#cde8e5', '#cee9e6', '#d2e9e7', '#d3eae7', '#d5eae7', '#d8ebe8', '#d9ece8', '#dcece9', '#deedea', '#dfeeea', '#e2eeea', '#e5efeb', '#e6f0eb', '#e9f0ec', '#ebf1ed', '#ecf2ed', '#eff3ee', '#f1f3ee']

In [None]:
alt.Chart(bike).mark_circle().encode(
    x='temp',
    y='hum',
    color=alt.Color('cnt').scale(range=blue_fluorite)
)
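To make "interpolate between them" concrete, here is a rough pure-Python sketch of piecewise-linear interpolation between evenly spaced hex color stops in RGB space. It is an illustration of the idea only: the helper names `hex_to_rgb` and `interpolate_hex` are hypothetical, and the interpolation Vega-Lite actually performs may use a different method or color space.

```python
def hex_to_rgb(color):
    """Convert '#rrggbb' to an (r, g, b) tuple of ints."""
    color = color.lstrip('#')
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def interpolate_hex(stops, t):
    """Return the color at position t in [0, 1] along evenly spaced stops."""
    if not 0 <= t <= 1:
        raise ValueError('t must be between 0 and 1')
    if len(stops) == 1:
        return stops[0]
    # Locate the pair of neighboring stops that t falls between.
    scaled = t * (len(stops) - 1)
    i = min(int(scaled), len(stops) - 2)
    frac = scaled - i
    lo, hi = hex_to_rgb(stops[i]), hex_to_rgb(stops[i + 1])
    # Blend each RGB channel linearly.
    rgb = [round(a + (b - a) * frac) for a, b in zip(lo, hi)]
    return '#{:02x}{:02x}{:02x}'.format(*rgb)


stops = ['#000000', '#646464', '#c8c8c8']
print(interpolate_hex(stops, 0.25))  # '#323232'
print(interpolate_hex(stops, 0.75))  # '#969696'
```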

## Can you add jittering to points, like in a beeswarm plot?

For example, can you add some jittering to the circles in the plot below so that they do not overlap as much?

In [None]:
alt.Chart(cars).mark_circle().encode(
    x='Horsepower',
    y='Cylinders:O',
    color=alt.Color('Miles_per_Gallon').scale(scheme='viridis')
)

We can do this by adding a column to the dataset that contains a random number. This can be done in pandas or with [transform_calculate](https://altair-viz.github.io/user_guide/transform/calculate.html). Then we can use that column in an offset encoding in our plot. Here are a [couple](https://altair-viz.github.io/user_guide/marks/point.html#dot-plot-with-jittering) of [examples](https://altair-viz.github.io/gallery/strip_plot_jitter.html).

In [None]:
alt.Chart(cars).mark_circle().encode(
    x='Horsepower',
    y=alt.Y('Cylinders:O').scale(paddingInner=0.25),
    yOffset='jitter:Q',
    color=alt.Color('Miles_per_Gallon').scale(scheme='viridis')
).transform_calculate(
    jitter='random()'
).properties(
    height=alt.Step(50)
)
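The pandas route could look roughly like the sketch below. It uses a small made-up frame (`cars_small`, with hypothetical values) in place of the cars dataset and a seeded NumPy generator so the jitter is reproducible.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the cars dataset.
cars_small = pd.DataFrame({
    'Horsepower': [130, 165, 150, 95],
    'Cylinders': [8, 8, 8, 4],
})

# Seeded generator so the same jitter is produced on every run.
rng = np.random.default_rng(0)

# One uniform random number in [0, 1) per row.
cars_small['jitter'] = rng.random(len(cars_small))

cars_small
```

With the column stored in the DataFrame, the chart can encode `yOffset='jitter:Q'` without the `transform_calculate` step, and the jitter stays the same each time the chart is re-rendered.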

Here's an alternative approach that uses faceting.

In [None]:
alt.Chart(cars).mark_circle().encode(
    x='Horsepower',
    row=alt.Row('Cylinders', spacing=0),
    y=alt.Y('jitter:Q', axis=None),
    color=alt.Color('Miles_per_Gallon').scale(scheme='viridis')
).properties(
    height=50
).transform_calculate(
    jitter='random()'
).configure_legend(
    # make the legend taller
    gradientLength=200
)