# Bar Graphs 

#### TEAM MEMBERS - Haritima Manchanda, Ishaan Thakker, Anna Wu

## Before we get into coding, some questions that you should consider -

1. **What is the visualization and what is it used for?**

    Bar graphs display numerical quantities on one axis and categorical variables on the other, letting you compare a value (such as a count or an average) across the different categories. Bar charts can be used to visualize a time series as well as purely categorical data.

    1. Can be used to compare categories of data 
    2. Can show large changes in data over time, in contrast to line graphs which often show smaller changes in data
    
2. **What are some mistakes people make with the visualization?**

    1. Scaling mistakes, such as a y axis that does not start at 0
    2. Using bar charts for highly connected data (should use line graphs instead)
    3. Too many data points
    4. Stacking rates, ratios, and averages

    For more information go to: [Common Mistakes and General Rules for Making the Most of the Horizontal Bar Chart](https://charting-ahead.corsairs.network/what-not-to-bar-chart-a-visual-primer-cdb55a926d34)

#### In this tutorial, we'll take a look at how to plot a Bar Plot in Seaborn.

Seaborn is one of the most widely used data visualization libraries in Python and is built on top of Matplotlib. It offers a simple, intuitive, yet highly customizable API for data visualization.

# Plot a Bar Graph in Seaborn

Plotting a Bar Plot in Seaborn is as easy as calling the barplot() function from the seaborn module (imported as sns) and passing in the categorical and continuous variables that we'd like to visualize:

1. Import Seaborn.
2. Load a dataset from Seaborn, which ships with a good collection of example datasets.
3. Plot the bar graph using the seaborn.barplot() function.

In [1]:
import seaborn as sns
import matplotlib.pyplot as plt

# Load the Titanic dataset that ships with seaborn
# (downloaded on first use, so this needs an internet connection)
titanic_dataset = sns.load_dataset('titanic')
print(titanic_dataset.head(5))

In [None]:
# Example 1

df = sns.load_dataset('titanic')
 
# who v/s fare barplot 
sns.barplot(x = 'who',
            y = 'fare',
            data = df, color = "Green")
 
# Show the plot
plt.show()
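
By default, `barplot()` draws one bar per category, with the bar height equal to the mean of the numeric column for that category. The sketch below checks this on a small made-up table (the `toy` DataFrame and its values are illustrative assumptions, not Titanic data) by printing the pandas `groupby` means next to the plot.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up data for illustration only (not the Titanic dataset)
toy = pd.DataFrame({
    "who":  ["man", "man", "woman", "woman", "child", "child"],
    "fare": [10.0, 20.0, 30.0, 50.0, 15.0, 25.0],
})

# The numbers we expect the bars to show: mean fare per category
means = toy.groupby("who")["fare"].mean()
print(means)

# barplot() aggregates each category with the mean by default
fig, ax = plt.subplots()
sns.barplot(x="who", y="fare", data=toy, ax=ax)
plt.show()
```

The printed means should match the bar heights, which is a quick way to sanity-check any bar plot.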

# Plot Grouped Bar Chart

Grouping bars in a plot is a common operation. Say you want to compare a quantity, such as the survival rate of passengers, but split it by a second criterion.

The example below visualizes the survival rate of passengers in each class (first, second and third), split by sex.

To group bars together, we use the **hue** argument.

In [None]:
# Example 2

df = sns.load_dataset('titanic')
sns.barplot(x = 'class',
            y = 'survived',
            hue = 'sex',
            data = df,
            palette = "Blues")
plt.show()
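
One way to think about **hue** is as a second grouping key: each bar is the mean for one (x category, hue level) pair. The sketch below builds that table with pandas on made-up data (the `toy` DataFrame is an assumption for illustration) and then draws the matching grouped bar plot.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up data for illustration only (not the Titanic dataset)
toy = pd.DataFrame({
    "class":    ["First"] * 4 + ["Third"] * 4,
    "sex":      ["female", "female", "male", "male"] * 2,
    "survived": [1, 1, 1, 0, 1, 0, 0, 0],
})

# One mean per (class, sex) pair:
# rows are the x-axis groups, columns are the hue levels
rates = toy.groupby(["class", "sex"])["survived"].mean().unstack()
print(rates)

fig, ax = plt.subplots()
sns.barplot(x="class", y="survived", hue="sex", data=toy, ax=ax)
plt.show()
```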

#### Try Yourself Problem 1: 

Plot a bar graph on the 'titanic' dataset (who vs fare) based on the median. (Use hue = 'class')
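
As a hint, the statistic that `barplot()` draws is controlled by its `estimator` argument. The sketch below, on made-up data rather than the Titanic dataset, passes `np.median` so that each bar shows the median instead of the default mean.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up data for illustration only; the 90.0 is a deliberate outlier
toy = pd.DataFrame({
    "who":  ["man"] * 3 + ["woman"] * 3,
    "fare": [10.0, 20.0, 90.0, 30.0, 50.0, 70.0],
})

# Median fare per category, computed with pandas for comparison
medians = toy.groupby("who")["fare"].median()
print(medians)

# estimator=np.median makes each bar show the median, not the mean
fig, ax = plt.subplots()
sns.barplot(x="who", y="fare", data=toy, estimator=np.median, ax=ax)
plt.show()
```

The outlier pulls the mean for "man" well above the median, so the two estimators give visibly different bars.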

#### Try Yourself Problem 2: 

Plot a bar graph on the 'titanic' dataset with x = "sex", y ="survived"

# You can also create a bar plot from your own dataset. Try the following problem...
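
A minimal sketch of the idea, assuming your data starts out as plain Python lists: wrap the lists in a pandas DataFrame and pass the column names to `barplot()`. The column names and values here are hypothetical.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical data: any two equal-length lists will do
data = pd.DataFrame({
    "fruit": ["apple", "banana", "cherry"],
    "count": [4, 7, 2],
})

# One observation per category, so each bar is simply that value
fig, ax = plt.subplots()
sns.barplot(x="fruit", y="count", data=data, ax=ax)
plt.show()
```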

#### Try Yourself Problem 3:
Plot a bar graph with ['A', 'B', 'C'] on x axis and [1,5,3] on y axis

# Plot a Horizontal Bar Plot

To plot a Bar Plot horizontally instead of vertically, we can simply switch the places of the x and y variables.

The example below plots the who vs fare graph, grouped by class, horizontally.

In [None]:
# Example 3

df = sns.load_dataset('titanic')
sns.barplot(x = 'fare',
            y = 'who',
            hue = 'class',
            data = df,
            palette = "Blues")
plt.show()
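
To see that swapping x and y changes only the orientation and not the values, the sketch below draws the same made-up data (an illustrative `toy` DataFrame, not the Titanic dataset) both ways, side by side.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up data for illustration only
toy = pd.DataFrame({
    "who":  ["man", "man", "woman", "woman", "child", "child"],
    "fare": [10.0, 20.0, 30.0, 50.0, 15.0, 25.0],
})

fig, (ax_v, ax_h) = plt.subplots(1, 2, figsize=(8, 3))

# Vertical: categories on x, numbers on y
sns.barplot(x="who", y="fare", data=toy, ax=ax_v)

# Horizontal: numbers on x, categories on y
sns.barplot(x="fare", y="who", data=toy, ax=ax_h)

plt.show()
```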

#### Try Yourself Problem 4:

Plot a bar graph (who vs fare) horizontally, and group the data by "embark_town"

# Ordering Grouped Bars in a Bar Plot 

You can change the order of the bars from the default order. 

This is done via the **order** argument, which accepts a list of the values and the order you'd like to put them in.

For example, so far seaborn has ordered the classes from first to third. What if we'd like to do it the other way around?

In [None]:
# Example 4

df = sns.load_dataset('titanic')
sns.barplot(x = 'class',
            y = 'survived',
            hue = 'sex',
            order = ["Third", "Second", "First"],
            data = df,
            palette = "Reds")
plt.show()
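
The same idea on a small made-up table (an illustrative assumption, not the Titanic data), where the survival rates are chosen so that the effect of **order** is easy to read off the plot:

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up data: survival rates of 0.75, 0.50 and 0.25 per class
toy = pd.DataFrame({
    "class":    ["First"] * 4 + ["Second"] * 4 + ["Third"] * 4,
    "survived": [1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0],
})

# Categories are drawn left to right in exactly this order
reversed_order = ["Third", "Second", "First"]

fig, ax = plt.subplots()
sns.barplot(x="class", y="survived", order=reversed_order, data=toy, ax=ax)
plt.show()
```

Note that **order** only affects the categories on the axis. To reorder the levels of a **hue** variable, `barplot()` has a separate `hue_order` argument that works the same way.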

# Change Confidence Interval

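You can also easily change how the error bars are drawn by setting the **ci** argument, which controls the confidence interval. (In seaborn 0.12 and later, ci is deprecated in favor of the errorbar argument, which accepts the same None and "sd" values.)

For example, you can turn the error bars off by setting it to None, show the standard deviation instead of a confidence interval by setting it to "sd", or put caps on the error bars for aesthetic purposes by setting capsize.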

Let's play around with the confidence interval attribute a bit:

In [None]:
# Example 5

df = sns.load_dataset('titanic')
sns.barplot(x = 'fare',
            y = 'who',
            hue = 'class',
            data = df,
            palette = "Blues", ci = 0)
plt.show()

With ci = 0, the error bars from the earlier plots no longer appear.

#### Try Yourself Problem 5:
Plot a bar graph (class vs survived), with hue = "sex". The order should be reversed and no error bars should be displayed.

In [None]:
# Example 6

# Or, we could use standard deviation for the error bars and set a cap size:
df = sns.load_dataset('titanic')
sns.barplot(x = "class", y = "survived", hue = "who", ci = "sd", capsize = 0.1, data = df)
plt.show()
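
Because seaborn 0.12 renamed the error bar setting from `ci` to `errorbar`, the examples above may print deprecation warnings on newer installations. The sketch below is one possible way to write code that runs on both; `error_bar_kwargs` is a hypothetical helper written for this tutorial, not part of seaborn, and the data is made up.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt


def error_bar_kwargs(kind):
    """Hypothetical helper: pick the keyword that controls error bars
    for the installed seaborn (errorbar on 0.12+, ci on older versions)."""
    major, minor = (int(part) for part in sns.__version__.split(".")[:2])
    if (major, minor) >= (0, 12):
        return {"errorbar": kind}
    return {"ci": kind}


# Made-up data for illustration only
toy = pd.DataFrame({
    "class":    ["First"] * 4 + ["Third"] * 4,
    "survived": [1, 1, 1, 0, 1, 0, 0, 0],
})

fig, (ax_none, ax_sd) = plt.subplots(1, 2, figsize=(8, 3))

# Left: no error bars at all
sns.barplot(x="class", y="survived", data=toy, ax=ax_none,
            **error_bar_kwargs(None))

# Right: error bars showing the standard deviation, with small caps
sns.barplot(x="class", y="survived", data=toy, capsize=0.1, ax=ax_sd,
            **error_bar_kwargs("sd"))

plt.show()
```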

1. **What are some alternative visualizations?**
    1. In seaborn, there are several ways to visualize a relationship involving categorical data. It provides a number of axes-level functions for plotting categorical data in different ways, plus a figure-level interface, **catplot()**, that gives unified higher-level access to them.
    2. A line graph is another option. (When you have data with many small changes over time, you should use a line graph.)
  
2. **What are some major variations of this visualization?**
    1. Horizontal Bar Graph
    2. Stacked Bar Chart
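
As a rough illustration of the figure-level interface mentioned above, the sketch below draws a bar plot through `catplot()` with `kind="bar"`. Unlike `barplot()`, it returns a FacetGrid and can split the data into one panel per value of a column via `col`. The data is made up for illustration.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up data for illustration only (not the Titanic dataset)
toy = pd.DataFrame({
    "class":    (["First"] * 2 + ["Third"] * 2) * 2,
    "sex":      ["female"] * 4 + ["male"] * 4,
    "survived": [1, 1, 1, 0, 1, 0, 0, 0],
})

# kind="bar" selects the bar plot; col="sex" draws one panel per sex
g = sns.catplot(x="class", y="survived", col="sex", kind="bar", data=toy)
plt.show()
```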




# Conclusion

In this tutorial, we've gone over several ways to plot a Bar Plot using Seaborn. We started with simple plots and horizontal plots, and then continued to customize them.
We've covered how to change the colors of the bars, group them together, order them and change the confidence interval.
 

##### Try Yourself Problem 6:

##### Fill in the blanks:

1. Bar graphs display (           ) quantities on one axis and (             ) variables on the other.
2. To group bars together, we use the (               )  argument.
3. To change the order of the bars, we use the (             ) argument.
4. The confidence interval can be changed by using the (           ) argument.
5. To remove the error bars, we need to set the confidence interval to (           ) or (            ).
6. To adjust the capsize, we use the (           ) argument.
7. To adjust the colors in the above example, we used (          ) and (              ) arguments.
8. When you have data with many small changes over time, you should use a (             ) graph.