# Altair Intro: Scale

In this session, we will take a closer look at the concept of scale in Altair. A scale controls _how_ data values are mapped to the visual channels specified by an encoding. In Altair, we can specify a scale along with each encoding specification. Let's start with a simple example.

In [1]:
import altair as alt
import pandas as pd

source = pd.DataFrame({"category": [1, 2, 3, 4, 5, 6], "value": [4, 6, 10, 3, 7, 8],
                       "quality": ["standard", "good", "excellent", "standard", "good", "excellent"]})
source

|   | category | value | quality |
|---|----------|-------|-----------|
| 0 | 1 | 4 | standard |
| 1 | 2 | 6 | good |
| 2 | 3 | 10 | excellent |
| 3 | 4 | 3 | standard |
| 4 | 5 | 7 | good |
| 5 | 6 | 8 | excellent |


In this toy dataset, we again have 3 columns: category, value and quality.  Let's create a simple scatter plot first, where we want to encode "category" as the x coordinate and "value" as the y coordinate.

In [2]:
base = alt.Chart(source).mark_point(size=400, filled=True)\
                        .encode(x="category", y="value")
base

## Default scale

This is simple enough. The keyword arguments for the `.encode` method, such as `x="category"`, are actually shorthand. The full form is `x=alt.X("category")`, where the class `alt.X` is an example of a channel schema. Each channel has its own schema class. See a full list of channel schemas [here](https://altair-viz.github.io/user_guide/encodings/channels.html). A scale can be specified as part of the channel schema.

In [3]:
default_scale = base.encode(x=alt.X("category", scale=alt.Scale()), 
                            y=alt.Y("value", scale=alt.Scale()))
default_scale

In the example above, we have specified a scale for each of the X and Y channels. Because the scales we specified are just the default scale, the resulting plot looks exactly the same as before.  Let's try to change some scale settings next.

## Scale: zero

In [4]:
no_zero = base.encode(x=alt.X("category", scale=alt.Scale(zero=False)), 
                      y=alt.Y("value", scale=alt.Scale(zero=False)))
no_zero

In the example above, we set `zero` to `False` in both scales. The `zero` scale setting determines whether the scale domain must include the value zero. Because `zero` is `False`, the x axis now starts from 1.0 (the minimum of the "category" column) and the y axis now starts from 3 (the minimum of the "value" column).
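
One rough way to think about the `zero` setting is as a rule for choosing the scale domain from the data. The sketch below is a simplified model and not Altair's actual implementation: the helper `linear_domain` is hypothetical, and it ignores any rounding of the axis limits to "nice" values.

```python
def linear_domain(values, zero=True):
    """Simplified model: data extent, optionally widened to include zero."""
    lo, hi = min(values), max(values)
    if zero:
        # Extend the extent so that zero falls inside the domain.
        lo, hi = min(lo, 0), max(hi, 0)
    return lo, hi

values = [4, 6, 10, 3, 7, 8]  # the "value" column from the toy dataset

print(linear_domain(values, zero=True))   # (0, 10)
print(linear_domain(values, zero=False))  # (3, 10)
```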

## Scale: type

So far, we have been using a linear scale. It is also common in visualization to use a log scale. This can be done with the `type` scale setting.

In [5]:
log_scale = base.encode(
    x=alt.X("category"),
    y=alt.Y("value", scale=alt.Scale(type="log")))
log_scale

In the example above, the y axis now uses a log scale. Altair supports many common scale types, including "linear", "log", "pow", and "sqrt". See the lists of supported [continuous scales](https://vega.github.io/vega-lite/docs/scale.html#continuous-scales), [discrete scales](https://vega.github.io/vega-lite/docs/scale.html#discrete) and [discretizing scales](https://vega.github.io/vega-lite/docs/scale.html#discretizing) for more information. Let's quickly try the power scale.

In [6]:
square_scale = base.encode(
    x=alt.X("category"), 
    y=alt.Y("value", scale=alt.Scale(type="pow", exponent=2)))
square_scale

By setting `exponent` to 2, the y position now grows with the square of the value, so larger values are spread further apart than smaller ones.
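
To build some intuition for how the scale type changes positions, the sketch below models a continuous scale as "transform the value, then interpolate linearly between the transformed domain ends". This is a simplified illustration rather than Altair's code: the helper `normalized_position` and the domain `(1, 100)` are assumptions chosen to make the effect easy to see.

```python
import math

def normalized_position(value, domain, transform):
    """Return where `value` falls between the domain ends (0 = start, 1 = end)."""
    lo, hi = (transform(d) for d in domain)
    return (transform(value) - lo) / (hi - lo)

# One transform per scale type discussed above.
transforms = {
    "linear": lambda v: v,
    "log": math.log10,
    "pow (exponent=2)": lambda v: v ** 2,
}

domain = (1, 100)  # hypothetical domain, positive so the log is defined
for name, transform in transforms.items():
    position = normalized_position(10, domain, transform)
    print(f"{name}: {position:.3f}")
```

With this domain, the value 10 lands exactly halfway along the axis under the log transform, close to the start under the linear transform, and even closer to the start under the squared transform.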

## Scale: domain

We have already seen that we can adjust the domain of the scale a bit using the `zero` setting.  We can also just set the domain directly by using the `domain` scale setting.

In [7]:
domain_scale = base.encode(
    x=alt.X("category", scale=alt.Scale(domain=[0, 10])),
    y=alt.Y("value", scale=alt.Scale(domain=[2, 11])))
domain_scale

## Scale: reverse

Sometimes it is useful to reverse the direction of a scale. For example, instead of having the y value increase from bottom to top, we may want it to decrease from bottom to top. This can be done by setting `reverse` to `True`.

In [8]:
reverse_scale = base.encode(
    x=alt.X("category", scale=alt.Scale(reverse=True)),
    y=alt.Y("value", scale=alt.Scale(reverse=True)))
reverse_scale
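
Conceptually, reversing a scale flips the output range while leaving the domain untouched. The sketch below illustrates this with a hypothetical `linear_scale` helper that maps a value to a pixel offset. The 300-pixel height is an assumption for the example, and the code is a simplified model, not Altair's implementation.

```python
def linear_scale(value, domain, pixel_range, reverse=False):
    """Map `value` from `domain` to `pixel_range`, optionally flipping the range."""
    d0, d1 = domain
    r0, r1 = pixel_range
    if reverse:
        r0, r1 = r1, r0  # swap the ends of the output range
    fraction = (value - d0) / (d1 - d0)
    return r0 + fraction * (r1 - r0)

# Screen y coordinates grow downwards, so a default y scale maps the
# smallest value to the bottom of the plot (pixel 300) and the largest to the top (pixel 0).
height = 300
default_y = linear_scale(3, domain=(0, 10), pixel_range=(height, 0))
reversed_y = linear_scale(3, domain=(0, 10), pixel_range=(height, 0), reverse=True)
print(round(default_y), round(reversed_y))
```

In this model, a value near the bottom of the default plot ends up the same distance from the top once the scale is reversed.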

## Scale: scheme

When working with the color channel, there is more than one way of mapping data to color. Exactly how data values are mapped to colors is determined by a color map, or color scheme. Altair provides a very reasonable default color scheme.

In [9]:
color_scale = base.encode(x="category", y="value",
                          color="quality")
color_scale

One can always modify the color scheme by changing the `scheme` scale setting. See [a list of supported color schemes](https://vega.github.io/vega/docs/schemes/) for more details.

In [10]:
color_scale = base.encode(
    x="category", y="value",
    color=alt.Color("quality", scale=alt.Scale(scheme="turbo")))
color_scale

## Scale: range

Sometimes, one may want to use a custom color scheme. This can be achieved using the `range` scale setting. In the example below, we want to map `excellent` quality to `red`, `good` to `green` and `standard` to `blue`. This can be done by explicitly specifying the domain and the range of the color mapping.

In [11]:
qualities = ["excellent", "good", "standard"]
colors = ["red", "green", "blue"]
color_scale = base.encode(
    x="category", y="value",
    color=alt.Color("quality", 
                    scale=alt.Scale(domain=qualities, range=colors)))
color_scale

## Summary

In this session, we explored a few common scale settings in Altair. Scales are extremely useful for fine-tuning a visualization. You can find out more about Altair's scale settings by reading [the API reference](https://altair-viz.github.io/user_guide/generated/core/altair.Scale.html#altair.Scale).