# Data Visualizations in Python
![image3.png](attachment:image3.png)

For data visualizations in Python, we have a variety of libraries and graphs to explore. We will discuss different types of bar charts and their customization using three popular Python libraries: matplotlib, seaborn, and plotly. Let’s dive in.

# 1. Bar Chart
### Introduction
A bar chart (also known as a bar graph or column chart) is used to represent data that can be divided into different categories. Each bar represents a distinct category and the height of each bar represents the count or frequency of data points in that category. Bar charts can be represented vertically as well as horizontally.
### Features of Bar Chart
The main features of the Bar Chart are:
- Used for categorical data only; for the frequency distribution of continuous data, histograms are used instead.
- The primary variable (X-axis) needs to be categorical, and the secondary variable (Y-axis) needs to be numeric, since it determines the length of each bar.
- Used for comparing the categories of a feature side by side.
- Used for Univariate Analysis.
- Used for Time-Series analysis.

## Using Matplotlib
Matplotlib is a widely used library for creating a range of visualizations in Python, such as bar charts, scatter plots, histograms, and heatmaps, with customization options like changing colors, line styles, and marker styles. It works seamlessly with NumPy and pandas and handles large datasets well. Let’s apply it for visualization and observe its behavior.

## Python Code and Visualizations of Bar Chart
First, import the matplotlib library and plot a bar chart of the income distribution. The ‘income’ column has two values: ‘<=50K’ and ‘>50K’.

In [None]:
import matplotlib.pyplot as plt #Import matplotlib

plt.figure(figsize=(4,4))
df_data = df['income'].value_counts() #to extract counts of different income values
df_data.plot(kind='bar',figsize=(4,4), color = ['blue', 'red']) #code to plot bar chart
plt.xlabel('Income')
plt.ylabel('Total Count')
plt.title('Frequency Distribution of Income')
plt.show()


![image9.png](attachment:image9.png)</br>
There are 24264 rows of data with income ‘<=50K’ and 7683 rows of data with income ‘>50K’. Here, ‘value_counts’ is first used to extract the count of each income value, and then the bar chart is plotted from those counts. The color parameter gives each of the two bars its own color.
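To make the `value_counts` step concrete, here is a minimal sketch on a toy Series. The data is made up for illustration and is not the dataset used in this article; `income` is a stand-in for `df['income']`.

```python
import pandas as pd

# Toy stand-in for df['income']; the values are invented for illustration
income = pd.Series(['<=50K', '<=50K', '>50K', '<=50K', '>50K'], name='income')

# value_counts returns one entry per category, with the largest count first
counts = income.value_counts()
print(counts)
```

The result is itself a Series indexed by category, which is why it can be passed straight to `.plot(kind='bar')`.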

Let’s see the frequency distribution of the ‘education’ feature.


In [None]:
df_data = df['education'].value_counts()
df_data.plot(kind='bar',figsize=(6,6)) #bar chart plotted
plt.xticks(rotation=90) #for readability, xticks are rotated by 90 degrees
plt.xlabel('Education')
plt.ylabel('Total Count')
plt.title('Frequency Distribution of Education')
plt.show()


![image10.png](attachment:image10.png)</br>
The bar chart is plotted for ‘education’ and shows the number of adults at each education level. We can also draw this graph horizontally using ‘barh’ (code shown below).


In [None]:
df_data = df['education'].value_counts()
df_data.plot(kind='barh',figsize=(6,6)) #barh is used for horizontal bar graph.
plt.xlabel('Total Count')
plt.ylabel('Education')
plt.title('Frequency Distribution of Education')
plt.show()


![image11.png](attachment:image11.png)</br>
A horizontal bar graph does not require the xticks to be rotated and is easy to read. It is a good way to display long label names without tilting them.
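One detail worth knowing for horizontal charts: Matplotlib draws the first row at the bottom. The sketch below, on made-up toy data, sorts the counts in ascending order so that the longest bar ends up on top. The names `education`, `counts`, and `widths` are illustrative only.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy stand-in for df['education']; values are invented for illustration
education = pd.Series(['HS-grad', 'Bachelors', 'HS-grad',
                       'Masters', 'HS-grad', 'Bachelors'])

# Ascending sort: barh places the first row at the bottom,
# so the most frequent category is drawn at the top
counts = education.value_counts().sort_values()

fig, ax = plt.subplots(figsize=(6, 3))
counts.plot(kind='barh', ax=ax)
ax.set_xlabel('Total Count')
ax.set_ylabel('Education')
ax.set_title('Frequency Distribution of Education')

# Bar lengths in drawing order, bottom to top
widths = [bar.get_width() for bar in ax.patches]
```

In a notebook the figure is displayed automatically; in a script, call `plt.show()` at the end.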
## Observations
- Data can be analyzed easily with Matplotlib.
- It needs 5-6 lines of code to create a bar chart.
- Complex visualizations in Matplotlib need a lot of code, which makes them time-consuming to implement.
## Using Seaborn
The Seaborn library is built on top of Matplotlib and provides a higher-level interface for creating visualizations. It also offers several additional visualizations, such as pair plots, box plots, and violin plots.
## Python Code and Visualizations of Bar Chart
First, import the seaborn library and plot the same bar chart of the income distribution that we drew with matplotlib, so that the two can be compared.

In [None]:
import seaborn as sns #import seaborn library

plt.figure(figsize=(4,4))
sns.countplot(x='income', data=df) #to plot bar chart
plt.xticks(rotation=90)
plt.xlabel('Income')
plt.ylabel('Total Count')
plt.title('Income Distribution')
plt.show()

![image12.png](attachment:image12.png)</br>
In Seaborn, ‘countplot’ is used to draw a bar chart of a frequency distribution. This one command both counts the values and plots the bars, whereas with matplotlib the job took two lines of code. Depending on the Seaborn version, it may also give different colors to the different income levels implicitly.

Let’s plot a bar graph and a horizontal bar chart for the different levels of education.

In [None]:
sns.countplot(x='education', color='b', data=df)
plt.xticks(rotation=90)
plt.xlabel("Education")
plt.ylabel("Total count")
plt.title('Frequency distribution by Education')
plt.show()

![image13.png](attachment:image13.png)

In [None]:
sns.countplot(y='education', data=df, color='g', order=df['education'].value_counts().index)
plt.xlabel("Total count")
plt.ylabel("Education")
plt.title('Frequency distribution by Education')
plt.show()

![image14.png](attachment:image14.png)</br>
We can see that the bar charts are created in a single line of code. However, Seaborn does not sort the bars by count on its own. The ‘order’ parameter is passed to ‘countplot’ to define the order of the bars. The bar color can also be changed using the ‘color’ parameter.
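The ordering trick can be sketched in isolation on toy data (invented for illustration; `df_toy` and `order` are hypothetical names). The index of `value_counts()` lists the categories from most to least frequent, and that list is what gets passed to `order`.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Toy stand-in for the article's DataFrame; values are invented
df_toy = pd.DataFrame({'education': ['Bachelors', 'HS-grad', 'HS-grad',
                                     'Masters', 'HS-grad', 'Bachelors']})

# Categories from most to least frequent
order = df_toy['education'].value_counts().index.tolist()

fig, ax = plt.subplots(figsize=(5, 3))
sns.countplot(y='education', data=df_toy, order=order, color='g', ax=ax)

# Bar lengths as drawn on the axes
widths = [bar.get_width() for bar in ax.patches]
```

Without `order`, the bars would follow the order in which the categories appear in the data.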
## Observations
- Seaborn requires fewer lines of code to draw a bar chart.
- It is easy to implement, less complex, and can be combined with matplotlib code because it is built on top of matplotlib.
- The ‘color’ parameter sets the bar color.
- Sorting bars by their count needs to be implemented explicitly in Seaborn using the ‘order’ parameter, unlike the matplotlib examples, where ‘value_counts’ already returns the counts sorted.
## Using Plotly
The Plotly Python library is an interactive, open-source plotting library that offers over 40 unique charts for different kinds of visualization. It is built on top of the Plotly JavaScript library that enables Python users to create beautiful interactive web-based visualizations. The Plotly Python library is sometimes referred to as "plotly.py" to differentiate it from the JavaScript library.

## Python Code and Visualizations of Bar Chart
First, import the plotly library and then start plotting bar charts for the income and education distributions. We will use Plotly Express, a high-level API of Plotly, to draw simple, interactive plots.


In [None]:
import plotly.express as px #import library
fig = px.histogram(df, x='income', title='Income Distribution')
fig.show()

![image15.png](attachment:image15.png)</br>
As we can see, the Bar graph is created in a single line of Python code. No additional details for labels are required in Plotly. The title of the chart is also given as a parameter.

Note: We are using ‘px.histogram’ here to create bar graphs. Histograms are normally used for numerical data, but in Plotly they also work as aggregated bar charts for categorical data. ‘px.bar’ can be used as well, but it does not aggregate: on raw data it draws one segment per row, so it does not do well with huge data and the bar colors look very light.

Let’s plot a bar chart for the frequency distribution of ‘education’ on the X-axis, with the bars colored by the numeric ‘education.num’ column.

In [None]:
fig = px.histogram(df, x="education", color='education.num',
    title="Numeric 'education.num' values represents continuous color").update_xaxes(categoryorder='total ascending')
fig.show()

![image16.png](attachment:image16.png)</br>
Let us plot a Horizontal Bar Chart using plotly.

In [None]:
fig = px.histogram(df, y='education', title='Frequency Distribution for education').update_yaxes(categoryorder='total ascending')
fig.show()

![image17.png](attachment:image17.png)</br>
We have easily drawn vertical and horizontal bar charts, including sorted bars, with Plotly. When the cursor hovers over the chart, Plotly displays the feature value and its count, as shown in the horizontal bar chart.
## Observations
- Plotly is user-friendly and interactive to use.
- Very little code is required.
- X and Y labels and xtick angles do not need to be specified in the code. For long feature names, Plotly implicitly tilts them to 45 degrees.
- On Mouse hover, the value name and its count are displayed.
- A continuous color scheme can be easily implemented with Plotly for the numeric data.

![tb1.PNG](attachment:tb1.PNG)

# 2. Grouped Bar Chart
## Introduction
A grouped bar chart or clustered bar chart is an extension to a bar chart that plots numerical values for different levels of two categorical variables instead of one. Bars are grouped by position for levels of one categorical variable, and color indicates the levels of the second categorical variable within each group. Grouped Bar Charts can be represented vertically as well as horizontally.
## Features of Grouped Bar Chart
The main features of the Grouped Bar Chart are:
- It displays the frequency distribution of one categorical variable, grouped by the levels of a second categorical variable.
- For the height of bars, either value counts of the first variable or a third variable can be used.
- Used for meaningful comparison of data sets of two or more categorical features simultaneously.
- Used for Bivariate and Multivariate analysis.
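The idea behind a grouped bar chart can be sketched as a two-way count table: one row per level of the first variable, one column per level of the second. The example below uses made-up toy data and hypothetical names (`df_toy`, `table`), and lets pandas draw one colored bar per column within each row group.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy data, invented for illustration only
df_toy = pd.DataFrame({
    'education': ['Bachelors', 'Bachelors', 'HS-grad',
                  'HS-grad', 'HS-grad', 'Masters'],
    'income':    ['<=50K', '>50K', '<=50K', '<=50K', '>50K', '>50K'],
})

# Rows: education levels, columns: income levels, cells: counts
# fill_value=0 covers combinations that never occur in the data
table = df_toy.groupby(['education', 'income']).size().unstack(fill_value=0)
print(table)

# Each row becomes a group of bars, each column a color
fig, ax = plt.subplots(figsize=(5, 3))
table.plot(kind='bar', ax=ax)
ax.set_ylabel('Total count')

# Heights of the first column's bars ('<=50K'), one per education level
first_group_heights = [bar.get_height() for bar in ax.containers[0]]
```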

## Using Matplotlib
Let’s plot a Grouped Bar Chart with education (X-axis) as the first categorical variable and compare the total count of adults (Y-axis) for the two possible levels of income (‘<=50K’ and ‘>50K’), each drawn as bars of a different color.
## Python Code and Visualizations of Grouped Bar Chart


In [None]:
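import numpy as np #needed for np.arange

#count adults per education level for each income group
counts_low = df[df['income']=='<=50K']['education'].value_counts()
counts_high = df[df['income']=='>50K']['education'].value_counts()

#align both groups on the same sorted labels; a level missing from one group gets 0
lst = sorted(set(counts_low.index) | set(counts_high.index))
df_data1 = counts_low.reindex(lst, fill_value=0)
df_data2 = counts_high.reindex(lst, fill_value=0)

X = np.arange(len(lst)) #one position per education level
fig, ax = plt.subplots()
ax.bar(X + 0.00, df_data1, color = 'b', width = 0.25, label='<=50K')
ax.bar(X + 0.25, df_data2, color = 'y', width = 0.25, label='>50K')
ax.set_xticks(X + 0.125) #center each tick between the two bars
ax.set_xticklabels(lst, rotation=90)
plt.xlabel("Education")
plt.ylabel("Total count")
plt.title('Grouped Bar chart of Income distribution by Education level and Income')
ax.legend()
plt.show()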

![image18.png](attachment:image18.png)</br>
From the above plot, we can conclude that adults with Masters, Doctorate, and Prof-school degrees have a high chance of earning ‘>50K’.
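Raw counts depend on how many adults fall into each education level, so a reading about "chances" is easier to check with within-level shares. A minimal sketch on invented toy data, assuming the same two column names, uses `pd.crosstab` with `normalize='index'` so that each row sums to 1.

```python
import pandas as pd

# Toy data, invented for illustration only
df_toy = pd.DataFrame({
    'education': ['Bachelors', 'Bachelors', 'HS-grad',
                  'HS-grad', 'HS-grad', 'Masters'],
    'income':    ['<=50K', '>50K', '<=50K', '<=50K', '>50K', '>50K'],
})

# Share of each income level within every education level (rows sum to 1)
shares = pd.crosstab(df_toy['education'], df_toy['income'], normalize='index')
print(shares)
```

Plotting `shares` instead of the counts would compare education levels on an equal footing regardless of group size.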
## Observations
- Matplotlib needs data filtering and manipulation, which leads to more lines of code to implement grouped bar charts.
- It needs more time to implement.
- Beautiful and interactive Visualizations
![tb2.PNG](attachment:tb2.PNG)</br>
## Using Seaborn
The same Grouped Bar Chart is plotted using the Seaborn library.
## Python Code and Visualizations of Grouped Bar Chart

In [None]:
import seaborn as sns
sns.countplot(x='education', hue='income', data=df)
plt.xticks(rotation=90)
plt.xlabel("Education")
plt.ylabel("Total count")
plt.title('Grouped Bar Chart for Income distribution by Education level and income')
plt.show()

![image19.png](attachment:image19.png)</br>
## Observations
- The same Grouped Bar Chart is plotted with a few lines of code in Seaborn, and no data filtering is required, unlike in Matplotlib.
- Beautiful and Interactive Visualizations.
- Takes less time to implement.

![tb3.PNG](attachment:tb3.PNG)</br>
## Using Plotly
A Grouped Bar Chart is plotted using the Plotly library.
## Python Code and Visualizations of Grouped Bar Chart


In [None]:
fig = px.histogram(df, x='education',color="income",barmode='group',title="Grouped Bar Chart of Income Distribution for education")
fig.show()

![image20.png](attachment:image20.png)</br>
The same Grouped Bar Chart is created in Plotly with very little code and less complexity, and it produces a great visualization.
## Observations
- Single-line code and quick implementation
- Very beautiful and Interactive Visualizations.
- Good for huge dataset visualizations
- In the color parameter, income value is passed which is a categorical variable, so a discrete color scheme is applied based on levels of income.
![tb4.PNG](attachment:tb4.PNG)