# 03 - Bar Charts - Script

Welcome! In this course, we use NumPy and pandas to store our data, and Matplotlib and seaborn to create our visualizations. We write and run all of our code in Jupyter notebooks.

In [None]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import seaborn as sns

## Introduction to the data set

The diamonds dataset consists of 53,940 rows and 10 columns (features), which include both quantitative and qualitative features.

In [None]:
df = pd.read_csv('../data/diamonds.csv')
df.shape

In [None]:
df.head(5)
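
One way to check which features are quantitative and which are qualitative is to split the columns by dtype. The sketch below uses a small made-up DataFrame (`toy`, with hypothetical values) rather than the real file, so it runs on its own; the same two `select_dtypes` calls should work on `df`, assuming the numeric columns were parsed as numbers.

```python
import pandas as pd

# Hypothetical stand-in for a few diamonds columns (values are made up).
toy = pd.DataFrame({
    'carat': [0.30, 0.45, 0.70],
    'cut': ['Ideal', 'Premium', 'Good'],
    'price': [400, 900, 2500],
})

# Numeric columns are the quantitative features ...
quantitative = toy.select_dtypes(include='number').columns.tolist()
# ... and everything else (here, strings) is qualitative.
qualitative = toy.select_dtypes(exclude='number').columns.tolist()

print(quantitative)
print(qualitative)
```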

Let's start by creating a bar chart of the diamond `cut` variable using seaborn.

For a count plot, we only need to set `data` and one of `x` or `y`. Seaborn computes the bar heights itself by counting the rows in each category.

## Count plot
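
A count plot draws one bar per category, with a height equal to the number of rows in that category. As a rough sketch of the numbers behind the bars, the same tally can be computed with pandas `value_counts`. The `toy` DataFrame here is a made-up stand-in for the `cut` column, not the real data.

```python
import pandas as pd

# Hypothetical toy data standing in for the diamonds `cut` column.
toy = pd.DataFrame(
    {'cut': ['Ideal', 'Premium', 'Ideal', 'Good', 'Ideal', 'Premium']}
)

# Rows per category: these are the heights a count plot would draw.
counts = toy['cut'].value_counts()
print(counts)
```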

In [None]:
sns.countplot(data=df, x='cut');

Depending on the seaborn version, each bar may be drawn in a different color, but the cut is already encoded by position on the x-axis. Unless we have a good reason to use color, it's better to plot every bar in a single color to avoid distraction.

There are several ways to set the color. One is to pick a color from seaborn's `color_palette` function, which returns the current default palette when called without arguments. We will use the default palette here.

In [None]:
sns.color_palette()

In [None]:
color = sns.color_palette()[0]
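
To make it clearer what `color` holds, here is a small sketch that inspects the palette. The palette behaves like a list of RGB tuples with components between 0 and 1. Converting the first entry to a hex string with Matplotlib's `to_hex` is an optional extra step, not something the rest of this script relies on. The exact color printed depends on the active seaborn theme.

```python
import seaborn as sns
from matplotlib.colors import to_hex

palette = sns.color_palette()   # current default palette
first_color = palette[0]        # an (r, g, b) tuple, each value in 0..1

print(len(palette), first_color)
print(to_hex(first_color))      # same color as a hex string
```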

In [None]:
sns.countplot(data=df, x='cut', color=color);

The revised plot looks like this. Much cleaner. 

Another way is to name the color we want explicitly. Here we select _Tableau Blue_.

In [None]:
sns.countplot(data=df, x='cut', color='tab:blue');

## Ordering the plot

One thing we might want to do with the plot now is sort the bars in decreasing order of frequency. The `order` parameter of the `countplot` function does this.

This parameter takes a list giving the order in which the bars should be plotted. Rather than typing the list by hand, we can obtain it programmatically with `value_counts`, which sorts categories from most to least frequent by default.

In [None]:
order = df['cut'].value_counts().index

In [None]:
sns.countplot(data=df, x='cut', color='tab:blue', order=order);
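
To see why the index of `value_counts` gives a usable ordering, here is a minimal sketch on a made-up Series (`toy` is hypothetical, not the real data). It also shows the `ascending=True` option, in case the least frequent category should come first.

```python
import pandas as pd

# Hypothetical toy data standing in for the diamonds `cut` column.
toy = pd.Series(
    ['Ideal', 'Good', 'Ideal', 'Premium', 'Ideal', 'Premium'], name='cut'
)

counts = toy.value_counts()   # sorted from most to least frequent
order = counts.index          # category labels in that same order
print(list(order))

# Reverse ordering: least frequent category first.
ascending_order = toy.value_counts(ascending=True).index
print(list(ascending_order))
```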

## Rotating axis labels and bars

We can use matplotlib's `xticks` function to rotate the x-axis labels, here by 15 degrees.

Alternatively, we can create a horizontal bar chart by passing the variable as `y` instead of `x`.

In [None]:
sns.countplot(data=df, x='cut', color='tab:blue', order=order)
plt.xticks(rotation=15);

In [None]:
sns.countplot(data=df, y='cut', color='tab:blue', order=order);
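
As an alternative to `plt.xticks`, the labels can also be rotated through the Axes object that `countplot` returns. The sketch below uses a made-up DataFrame (`toy`) and Matplotlib's `tick_params` with `labelrotation`. It is one possible variant, not the approach used in the cells above.

```python
import pandas as pd
import seaborn as sns

# Hypothetical toy data standing in for the diamonds `cut` column.
toy = pd.DataFrame(
    {'cut': ['Ideal', 'Very Good', 'Ideal', 'Premium', 'Very Good', 'Ideal']}
)
order = toy['cut'].value_counts().index

# countplot returns the Matplotlib Axes it drew on.
ax = sns.countplot(data=toy, x='cut', color='tab:blue', order=order)

# Rotate only the x-axis tick labels by 15 degrees.
ax.tick_params(axis='x', labelrotation=15)
```

Working with the returned Axes keeps the change tied to one specific plot, which can help when a notebook cell creates more than one figure.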