# "Altair"
> "Subheader"

- author: Christopher Thiemann
- toc: true
- branch: master
- badges: true
- comments: true
- categories: [python, plotting ]
- hide: true
- search_exclude: true


In [30]:
#hide
import warnings

import numpy as np
import scipy as sp
import sklearn
import statsmodels.api as sm
from statsmodels.formula.api import ols

import altair as alt
from vega_datasets import data
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import seaborn as sns
sns.set_context("poster")
sns.set(rc={'figure.figsize': (16, 9.)})
sns.set_style("whitegrid")

import pandas as pd
pd.set_option("display.max_rows", 120)
pd.set_option("display.max_columns", 120)



## Altair

### Encodings

With marks I can specify which kind of plot I want to generate: `mark_point` gives a scatter plot, `mark_bar` gives a bar plot. Encodings then map columns of the DataFrame to visual channels such as `x`, `y`, and `color`.

## Recipe

In [7]:
df = sns.load_dataset('car_crashes')
df.head()


| | total | speeding | alcohol | not_distracted | no_previous | ins_premium | ins_losses | abbrev |
|---|---|---|---|---|---|---|---|---|
| 0 | 18.8 | 7.332 | 5.64 | 18.048 | 15.04 | 784.55 | 145.08 | AL |
| 1 | 18.1 | 7.421 | 4.525 | 16.29 | 17.014 | 1053.48 | 133.93 | AK |
| 2 | 18.6 | 6.51 | 5.208 | 15.624 | 17.856 | 899.47 | 110.35 | AZ |
| 3 | 22.4 | 4.032 | 5.824 | 21.056 | 21.28 | 827.34 | 142.39 | AR |
| 4 | 12.0 | 4.2 | 3.36 | 10.92 | 10.68 | 878.41 | 165.63 | CA |


In [8]:
(alt.Chart(df)
  .mark_point(color='black')  # the kind of chart; extra arguments set fixed mark properties that do not depend on the data
  .encode(x='total', y='speeding', tooltip=['abbrev', 'not_distracted'])  # encode maps columns of df to channels of the chart
  .properties(width=600, height=300)  # extra properties, independent of the mark and the encodings
  .interactive())

In [31]:
!pip install palmerpenguins 



In [32]:
from palmerpenguins import load_penguins
penguins = load_penguins()
penguins

| | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year |
|---|---|---|---|---|---|---|---|---|
| 0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | male | 2007 |
| 1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | female | 2007 |
| 2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | female | 2007 |
| 3 | Adelie | Torgersen | NaN | NaN | NaN | NaN | NaN | 2007 |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | female | 2007 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 339 | Chinstrap | Dream | 55.8 | 19.8 | 207.0 | 4000.0 | male | 2009 |
| 340 | Chinstrap | Dream | 43.5 | 18.1 | 202.0 | 3400.0 | female | 2009 |
| 341 | Chinstrap | Dream | 49.6 | 18.2 | 193.0 | 3775.0 | male | 2009 |
| 342 | Chinstrap | Dream | 50.8 | 19.0 | 210.0 | 4100.0 | male | 2009 |


## Basic Plots

### Scatterplot

In [16]:
(alt.Chart(penguins)
  .mark_point(color='green')  # fixed mark color; the color encoding below takes precedence over it
  .encode(x='bill_length_mm', y='bill_depth_mm', color='species')  # encode maps columns of penguins to channels of the chart
  .properties(width=600, height=300)  # extra properties, independent of the mark and the encodings
  )

### Bar Chart

In [19]:
alt.Chart(penguins).mark_bar().encode(
    x='species',
    y='bill_length_mm'
)

In [26]:
alt.Chart(penguins).mark_text().encode(
    x='bill_length_mm',
    y='bill_depth_mm',
    color='species',
    text='species',
).interactive()

In [27]:
chart = alt.Chart(penguins).mark_point().encode(
    y='bill_depth_mm',
    color='species:N'
).interactive()

chart.encode(x='bill_length_mm') | chart.encode(x='flipper_length_mm')

In [29]:
brush = alt.selection_interval()

alt.Chart(penguins).mark_point().encode(
    alt.X(alt.repeat('column'), type='quantitative'),
    alt.Y(alt.repeat('row'), type='quantitative'),
    color=alt.condition(brush, 'species:N', alt.value('gray'))
).add_selection(
    brush
).properties(
    width=250,
    height=250,
).repeat(
    row=['bill_depth_mm', 'bill_length_mm'],
    column=['flipper_length_mm', 'body_mass_g']
)

In [35]:
brush = alt.selection_interval()  # same interval selection as in the previous cell

points = alt.Chart(penguins).mark_point().encode(
    x='flipper_length_mm:Q',
    y='body_mass_g:Q',
    color=alt.condition(brush, 'species:N', alt.value('lightgray'))
).add_selection(
    brush
)

bars = alt.Chart(penguins).mark_bar().encode(
    y='species:N',
    color='species:N',
    x='count(species):Q'
).transform_filter(
    brush
)

points & bars

## Helper Functions

## Plot for the Blog Post

## Sources

- Hello This is a markdown page {% cite signaltrain %}

- https://twitter.com/eitanlees/status/1234608793109061633
- https://towardsdatascience.com/how-to-create-interactive-and-elegant-plot-with-altair-8dd87a890f2a
- https://calmcode.io/altair/recipe.html

## References

{% bibliography --cited %}