# Encoding Channels

With a basic framework of _data types_, _marks_, and _encoding channels_, we can concisely create a wide variety of visualizations. In the previous notebook you were exposed to attribute data types. A visualization represents data using a collection of _graphical marks_ (bars, lines, points, etc.). The attributes of a mark (such as its position, shape, size, or color) serve as _channels_ through which we can encode underlying data values. In other words, channels are a way to change the appearance of marks based on the data's attributes. In this notebook, we will use the `mark_point` graphical mark and different channels to create visual encodings.

When discussing data items, the word __attribute__ is used to signify the data that describes the item. In the context of visualizations, we will use the words __field__ and __attribute__ interchangeably.
At the heart of Altair is the use of *encodings* that bind data fields (with a given data type) to available encoding *channels* of a chosen *mark* type.

## Tax Expense Data

In [None]:
import pandas as pd
import altair as alt

In [None]:
data = pd.read_csv("tax_expense.csv")

In [None]:
data.head()


In [None]:
data.shape

Using pandas we can create a summary of each attribute. For the quantitative attributes, we will include the minimum and maximum values. For the others, we will just get a sense of the unique values that exist.

In [None]:
data.agg(
    {
        "corp": ["unique"],
        "art": ["min", "max"],
        "tax_rate": ["min", "max"],
        "tax_expense": ["min", "max"],
        "earning": ["min", "max"],
    }
)

## X

The `x` encoding channel sets a mark's horizontal position (x-coordinate). In addition, default choices of axis and title are made automatically. In the chart below, the choice of a quantitative data type results in a continuous linear axis scale:

In [None]:
#write code

## Y

The `y` encoding channel sets a mark's vertical position (y-coordinate). Here we encode the `art` field on the `y` channel.

In [None]:
#write code

What happens to the chart above if you specify `art` as a nominal data type?

In [None]:
#write code

## Size

The `size` encoding channel sets a mark's size or extent. The meaning of the channel can vary based on the mark type. For `point` marks, the `size` channel maps to the pixel area of the plotting symbol, such that the diameter of the point matches the square root of the size value.

Let's augment our scatter plot by encoding earnings (`earning`) on the `size` channel. As a result, the chart now also includes a legend for interpreting the size values.
By using the `size` channel to encode an additional quantitative attribute, we have moved away from a standard **scatter plot** to the lesser-known **bubble plot**.

In [None]:
#write code

In some cases we might be unsatisfied with the default size range. To provide a customized span of sizes, set the `range` parameter of the `scale` attribute to an array indicating the smallest and largest sizes. Because `size` controls the area of a point mark, these values are areas in square pixels. Here we update the size encoding to range from 0 (for zero values) to 1,000 (for the maximum value in the scale domain):

In [None]:
#write code

## Color

The `color` encoding channel sets a mark's color. The style of color encoding is highly dependent on the data type: nominal data will default to a multi-hued qualitative color scheme, whereas ordinal and quantitative data will use perceptually ordered color gradients.

Here, we encode the `tax_rate` field using the `color` channel and a nominal (`N`) data type, resulting in a distinct hue for each distinct `tax_rate` value.

In [None]:
#write code


But `tax_rate` is a quantitative value, so let's switch the data type from nominal (`N`) to quantitative (`Q`).

In [None]:
#write code

Our goal, however, is to use purples and oranges like the NYT's chart. There is a range of named color schemes to choose from (we will talk about this more in lecture).

In [None]:
#write code

In [None]:
alt.Chart(data).mark_circle().encode(
    # Horizontal position: tax rate, with a minimal axis drawn along the top
    x=alt.X('tax_rate',
            axis=alt.Axis(orient='top', domain=False, ticks=False)),
    # Circle area: earnings, with the size range set to [0, 1000]
    size=alt.Size('earning',
                  scale=alt.Scale(range=[0, 1000]),
                  legend=alt.Legend(orient='bottom', direction='horizontal')),
    # Fill color: tax rate, drawn with the purple-orange scheme
    color=alt.Color('tax_rate:Q',
                    scale=alt.Scale(scheme='purpleorange'),
                    legend=alt.Legend(orient='bottom', direction='horizontal')),
    # Fields shown when hovering over a circle
    tooltip=[
        alt.Tooltip('corp:N'),
        alt.Tooltip('earning:Q'),
        alt.Tooltip('tax_rate:Q'),
    ],
).properties(
    width=300,
    height=80
).configure_axis(
    grid=False
).configure_view(
    strokeWidth=0
)