# Information Visualization, Transformations Part 2
Licia He, Eytan Adar, Sereen Kallerackal, Dallas Card

School of Information, University of Michigan

## Plan
1. Long and Wide Data Transforms
2. Multiple charts

In [None]:
# imports we will use

import altair as alt
import pandas as pd
from vega_datasets import data as vega_data

# grab the data and clean it a bit
movies_url = vega_data.movies.url
movies = pd.read_json(movies_url)
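# regex=False treats '(' and ')' as literal characters on every pandas version
movies.columns = (
    movies.columns.str.strip()
    .str.replace(' ', '_', regex=False)
    .str.replace('(', '', regex=False)
    .str.replace(')', '', regex=False)
)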

In [None]:
movies.sample(2)

## Long versus Wide Data

### Our Goal:

![stacked barchart](https://raw.githubusercontent.com/eytanadar/si649public/master/lab5/assets/demo/stacked.png)


There are two common conventions for storing data in a dataframe, sometimes called long-form and wide-form. Both are sensible patterns for storing data in a tabular format; briefly, the difference is this:

* long-form data has one row per observation, with metadata recorded within the table as values. For example: 


| item | key  | value |
|-------|-------|-------|
| 1     | key A | 2     |
| 1     | key B | 6     |
| 2     | key A | 4     |
| 2     | key B | 8     | 

If the data is in long form, Altair makes the stacked bar *easy*:

```Python
alt.Chart(data).mark_bar().encode(
    x = alt.X('item'),
    y = alt.Y('value:Q'),
    color = alt.Color('key:N')
)
```

But what you sometimes have is...

* wide-form data has one row per independent variable, with metadata recorded in the row and column labels. For example:

| item | key A | key B |
|-------|-------|-------|
| 1     | 2     | 6     |
| 2     | 4     | 8     |


In [None]:
# 1.1 focus on US_Gross and Worldwide_Gross
movies_wide=movies.loc[:,["Title","US_Gross","Worldwide_Gross"]].iloc[:10,:]
movies_wide

It's hard to make a stacked bar from here, because each measure sits in its own column and there is no single field to map to color.

![stacked barchart](https://raw.githubusercontent.com/eytanadar/si649public/master/lab5/assets/demo/stacked.png)

In [None]:
#1.2 make stacked bar chart (the hard way)

#make Worldwide Gross bar chart



In [None]:
# 1.3 make Total bar chart



In [None]:
# 1.4 layering them together 



The problem is that with more classes of data this layering becomes tedious. We also don't get quite the right colors because the two charts don't know what the other has picked. Thus, depending on the task we may need to switch from the "wide" format to the "long" format (see [more here](https://altair-viz.github.io/user_guide/data.html#long-form-vs-wide-form-data)). 

To get the data back into the long form, we'll use the pandas operation [melt](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.melt.html).
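As a small sketch of what `melt` does, here it is applied to the toy wide-form table from earlier (not the movies data). The names `wide_df` and `long_df` are ours; `id_vars`, `value_vars`, `var_name`, and `value_name` are the standard `melt` parameters.

```python
import pandas as pd

# The toy wide-form table: one row per item, one column per key.
wide_df = pd.DataFrame({"item": [1, 2], "key A": [2, 4], "key B": [6, 8]})

# Keep "item" as the identifier; turn the two key columns into
# (key, value) pairs, one row per pair.
long_df = wide_df.melt(
    id_vars="item",
    value_vars=["key A", "key B"],
    var_name="key",
    value_name="value",
)

print(long_df)
```

The result has four rows and the columns `item`, `key`, and `value`, which is the long-form table shown at the start of this section (with rows grouped by key rather than by item).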

In [None]:
# 1.5 melt (the pandas way to get long form)


Here's the stacked bar chart in one short command:

In [None]:
# 1.6 make one bar chart


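Altair's **fold transform** does the same reshaping inside the chart specification, so the DataFrame itself can stay wide. (The reverse direction, long to wide, is handled by the pivot transform.)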

Using fold to convert wide to long 

In [None]:
# 1.7 fold (the Altair way)


### Converting long form to wide form using Pandas 
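One way to go in this direction is `DataFrame.pivot`, sketched here on the toy long-form table from earlier (the names `long_df` and `wide_df` are ours). This is an illustration under those assumptions, not necessarily the approach intended for the cells below.

```python
import pandas as pd

# The toy long-form table: one row per (item, key) observation.
long_df = pd.DataFrame({
    "item": [1, 1, 2, 2],
    "key": ["key A", "key B", "key A", "key B"],
    "value": [2, 6, 4, 8],
})

# Each distinct value of "key" becomes its own column.
wide_df = long_df.pivot(index="item", columns="key", values="value").reset_index()

# pivot labels the column axis "key"; drop that label for a plain table.
wide_df.columns.name = None

print(wide_df)
```

Note that `pivot` raises an error if an (item, key) pair occurs more than once; `pivot_table` with an aggregation function handles that case.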

In [None]:
# 1.8 


In [None]:
# 1.9




## Compound and Faceted Charts

### Our Goal
![small multiples](https://raw.githubusercontent.com/eytanadar/si649public/master/lab5/assets/demo/smallmultiples.png)

In [None]:
# 2.1 give it a try...


In [None]:
# 2.2 put it together


## Repeated Charts

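[Repeated Charts](https://altair-viz.github.io/user_guide/compound_charts.html#repeated-charts) are an alternative to building each panel separately: one chart specification is repeated, and each panel binds a different field to the chosen encoding channel.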

In [None]:
# 2.3 Use repeated charts




## Faceted Charts

### Our Goal
![Facets](https://raw.githubusercontent.com/eytanadar/si649public/master/lab5/assets/demo/facet.png)

In [None]:
# 2.4 We'd repeat this 4 times, and have to change the color each time (or write a loop)


An easier way to do this is through [Faceted Charts](https://altair-viz.github.io/user_guide/compound_charts.html#faceted-charts), which split one chart into panels by the values of a field. Facets also support fairly sophisticated designs, such as this [Ridgeline Chart](https://altair-viz.github.io/gallery/ridgeline_plot.html?highlight=configure_facet).

In [None]:
# 2.5 Faceted Charts 



In [None]:
# can we see all of them nicely?

