# Bar chart

## Setup

- We disable Altair's default row limit so that we can plot DataFrames with more than 5000 rows: `alt.data_transformers.disable_max_rows()`

In [None]:
import pandas as pd
import altair as alt

alt.data_transformers.disable_max_rows()

We also suppress warnings of the category `FutureWarning` so that they do not clutter the notebook output:

In [None]:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
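
To see what this filter does, here is a minimal sketch using only the standard library. The function `noisy` and the warning messages are made up for illustration; the point is that a `FutureWarning` is dropped while other categories still get through.

```python
import warnings


def noisy():
    # Hypothetical function that emits a FutureWarning, as libraries sometimes do
    warnings.warn("this behaviour will change", FutureWarning)
    return 1


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning by default
    warnings.simplefilter(action="ignore", category=FutureWarning)  # same call as above
    noisy()
    warnings.warn("still shown", UserWarning)

categories = [w.category.__name__ for w in caught]
print(categories)  # ['UserWarning']
```

Only the `FutureWarning` is filtered out, so other warnings remain visible.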

## Data

### Import data

In [None]:
ROOT = "https://raw.githubusercontent.com/kirenz/datasets/master/"
DATA = "loans.csv"

df = pd.read_csv(ROOT + DATA)

### Data structure

Display the DataFrame by evaluating `df` as the last expression in a cell:

In [None]:
df

Show a summary of the columns, their data types, and their non-null counts with `df.info()`:

In [None]:
df.info()

### Data corrections

Change the data type from `object` to `category` for the variables `homeownership` and `application_type` with `.astype("category")`:

In [None]:
# Change data type from object to category
df['homeownership'] = df['homeownership'].astype("category")
df['application_type'] = df['application_type'].astype("category")
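
As a quick check of what the conversion does, the following sketch applies the same call to a small toy DataFrame. The values are made up for illustration and are not taken from `loans.csv`.

```python
import pandas as pd

# Toy data, assumed for illustration only
toy = pd.DataFrame({"homeownership": ["RENT", "MORTGAGE", "OWN", "RENT"]})

# Same conversion as in the cell above
toy["homeownership"] = toy["homeownership"].astype("category")

converted_dtype = str(toy["homeownership"].dtype)
categories = list(toy["homeownership"].cat.categories)

print(converted_dtype)  # category
print(categories)       # ['MORTGAGE', 'OWN', 'RENT']
```

After the conversion, the column stores each distinct label once and exposes the labels through the `.cat` accessor.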

### Variable lists

Next, we select the relevant variables we want to use (this will ease the plotting process).

Here, we only use the variable `homeownership`:

In [None]:
# make a list of variables you want to use
var_list = ['homeownership']

# create a new dataframe called source with only var_list
source = df[var_list]
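
Note that indexing with a list of names returns a DataFrame, while indexing with a single name returns a Series. The sketch below illustrates the difference on a toy DataFrame. The values and the `loan_amount` column are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-in for the loans data; values are made up
toy_df = pd.DataFrame({
    "homeownership": ["RENT", "MORTGAGE", "OWN"],
    "loan_amount": [5000, 12000, 8000],
})

var_list = ["homeownership"]

as_frame = toy_df[var_list]          # list of names -> DataFrame
as_series = toy_df["homeownership"]  # single name -> Series

print(type(as_frame).__name__, list(as_frame.columns))  # DataFrame ['homeownership']
print(type(as_series).__name__)                         # Series
```

Selecting with `var_list` therefore keeps `source` a DataFrame, which is the form passed to `alt.Chart` in the cells below.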

## Analysis

We start our analysis with Altair, a declarative statistical visualization library for Python, based on Vega and Vega-Lite.

Altair charts work out of the box in Jupyter Notebook, as long as there is a **web connection** to load the required JavaScript libraries.

The basic pattern of the Altair API looks like this (`foo` is a placeholder for the mark type):


**a**lt.**C**hart().**m**ark_foo().**e**ncode() 


*You can remember the order of the method calls with the acronym "**a.C.m.e**"*

```python
alt.Chart(DATAFRAME).mark_PLOT().encode(
    x=alt.X('VARIABLE'),
    y=alt.Y('VARIABLE')
)
```

Replace:

- DATAFRAME with your data (e.g., `source` or `df`)
- PLOT with the plot type of your choice (e.g., `bar` or `circle`)
- VARIABLE with the name of the variable you want to plot

### Standard bar chart

In [None]:
alt.Chart(source).mark_bar().encode(
    x=alt.X('homeownership'),
    y=alt.Y('count(homeownership)')
)

### Sorted bar chart

In [None]:
alt.Chart(source).mark_bar().encode(
    x=alt.X('homeownership',
            sort='-y'),  # sort bars by descending y value (the count)
    y=alt.Y('count(homeownership)')
)
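
To check the bar heights and their order independently of the chart, the same counts can be computed with pandas. This is a sketch on made-up toy data: `value_counts()` returns the counts in descending order by default, which should correspond to the bar order produced by `sort='-y'`.

```python
import pandas as pd

# Toy data, assumed for illustration only
toy_source = pd.DataFrame({
    "homeownership": ["RENT", "MORTGAGE", "MORTGAGE", "OWN", "RENT", "MORTGAGE"]
})

# Count rows per category; the result is sorted from largest to smallest
counts = toy_source["homeownership"].value_counts()

bar_order = counts.index.tolist()
bar_heights = counts.tolist()

print(bar_order)    # ['MORTGAGE', 'RENT', 'OWN']
print(bar_heights)  # [3, 2, 1]
```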

### Bar chart with properties

In [None]:
alt.Chart(source).mark_bar().encode(
    x=alt.X('homeownership',
            sort='-y'),
    y=alt.Y('count(homeownership)')
).properties(  # set the chart title and size (in pixels)
    title='This is a simple bar chart',
    width=300,
    height=150
)

### Bar chart with custom axes

In [None]:
alt.Chart(source).mark_bar().encode(
    x=alt.X('homeownership', 
            sort='-y',
            axis=alt.Axis(title="Homeownership", # title of x axis
                          labelAngle=0, # angle of x axis text
                          titleAnchor="start")), # adjustment of text
    y=alt.Y('count(homeownership)',
            axis=alt.Axis(title="Count",
                          titleAnchor="end"))
).properties(
    title='This is a bar chart with custom axes',
    width=300,
    height=150
)

### Bar chart with custom axes and title

In [None]:
alt.Chart(source).mark_bar().encode(
    x=alt.X('homeownership',
            sort='-y',
            axis=alt.Axis(title="Homeownership", 
                          labelAngle=0,
                          titleAnchor="start")),
    y=alt.Y('count(homeownership)',
            axis=alt.Axis(title="Count",
                          titleAnchor="end"))
).properties(
    title='This is a bar chart with custom axes and title',
    width=300,
    height=150
).configure_title( # custom title
    fontSize=16,
    font='Arial',
    color='black',
    anchor='start'
)