# Beginner's Guide to Various Data Visualization Techniques

This notebook covers the following plots:
* Bar chart
* Pie chart
* Stacked area chart
* Line chart
* Histogram
* Scatter plot
* Regression plot
* Bar and line chart

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
import warnings 
warnings.filterwarnings("ignore")

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()

# Bar Chart

A bar plot (or bar chart) represents categorical data with rectangular bars whose lengths or heights are proportional to the values they represent. 
Bars can be drawn horizontally or vertically. 
A bar chart makes it easy to compare discrete categories. 

One axis of the plot shows the categories being compared, while the other axis shows the measured value for each category.

In [None]:
Data= pd.read_csv("/kaggle/input/eda-datsets/bar_chart_data.csv")

In [None]:
Data

In [None]:
plt.figure(figsize=(9,6))
plt.bar(x=Data["Brand"],height=Data["Cars Listings"],color="midnightblue")
plt.xticks(rotation=45)
plt.yticks(fontsize=12)
plt.title("Car listings by Brand",fontsize=16,fontweight='bold')
plt.ylabel("Car Listings", fontsize=14)
plt.show()
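The text above mentions that bars can also be drawn horizontally. Here is a minimal sketch of that variant using `ax.barh`. The three brands and their counts are made up for illustration and do not come from the notebook's dataset.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical listings counts, made up for illustration only
sample = pd.DataFrame({
    "Brand": ["Brand A", "Brand B", "Brand C"],
    "Cars Listings": [120, 45, 80],
})

# barh draws the first row at the bottom, so sorting in ascending order
# puts the longest bar at the top of the chart
sample = sample.sort_values("Cars Listings")

fig, ax = plt.subplots(figsize=(9, 6))
# For horizontal bars the categories go on y and the bar lengths go in width
bars = ax.barh(y=sample["Brand"], width=sample["Cars Listings"], color="midnightblue")
ax.set_xlabel("Car Listings", fontsize=14)
ax.set_title("Car listings by Brand", fontsize=16, fontweight="bold")
```

Horizontal bars are often easier to read when the category names are long, because the labels no longer need to be rotated. Call `plt.show()` afterwards, as in the cell above, if the figure is not displayed automatically.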

# Pie Chart

A pie chart is a circular statistical plot that can display only one series of data. The whole circle represents 100% of the data, and the area of each slice is proportional to that category's share of the total. 

In [None]:
pie_data=pd.read_csv("/kaggle/input/eda-datsets/pie_chart_data.csv")

In [None]:
pie_data

In [None]:
sns.set_palette('colorblind')
plt.figure(figsize=(10,8))
plt.pie(pie_data["Number of Cars"],
   labels=pie_data["Engine Fuel Type"],
    autopct='%.2f%%',
    textprops={"size": "x-large",
               "rotation": 30})
plt.legend()
plt.title("Cars by fuel type", fontsize=16, fontweight="bold")
plt.show()
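The percentages that `autopct` prints on each slice are simply each value divided by the total. A small sketch of that arithmetic, using made-up counts rather than the values in `pie_chart_data.csv`:

```python
import pandas as pd

# Hypothetical counts per fuel type, made up for illustration only
cars = pd.Series({"Diesel": 50, "Petrol": 30, "Gas": 20}, name="Number of Cars")

# Each slice's share is its value divided by the total, times 100
share_pct = cars / cars.sum() * 100

for fuel, pct in share_pct.items():
    # Same format string as the autopct argument used above
    print(fuel, "%.2f%%" % pct)
```

Computing the shares by hand like this is a quick way to check that the labels on the chart match the data.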

# Stacked Area Chart

A stacked area chart places several series on top of one another. Each series starts where the one below it ends rather than at zero, so the top edge of the chart shows the combined total of all the groups.

In [None]:
stack_data=pd.read_csv("/kaggle/input/eda-datsets/stacked_area_chart_data.csv")

In [None]:
stack_data.head()

In [None]:
plt.figure(figsize=(12,6))
sns.set_style("white")
labels=["Gas","Petrol","Diesel"]
plt.stackplot(stack_data["Year"],
              stack_data["Gas"],
              stack_data["Petrol"],
              stack_data["Diesel"],
              edgecolor='none',
              colors=["#FFB6C1","#FF00FF","#9400D3"])
plt.xticks(stack_data["Year"], rotation=90)
plt.title("Popularity of engine fuel types(1982-2016)", fontsize=14, weight="bold")
plt.legend(labels=labels, loc='upper left')
sns.despine()
plt.show()
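To see how the stacking works, the sketch below computes the lower and upper edge of each layer with a running total, which is the same idea `plt.stackplot` uses with its default zero baseline. The numbers are made up for illustration.

```python
import numpy as np

# Hypothetical yearly counts, one row per fuel type (Gas, Petrol, Diesel)
layers = np.array([
    [1, 2, 3],
    [4, 4, 4],
    [2, 1, 0],
])

# Upper edge of each layer = running total of all layers up to and including it
upper_edges = np.cumsum(layers, axis=0)
# Lower edge of each layer = upper edge of the layer beneath it (zero for the first)
lower_edges = upper_edges - layers

# The last row of upper_edges is the combined total per year
print(upper_edges[-1])
```

Because every layer sits on top of the previous one, only the bottom layer can be read directly against the Y-axis. The others show their size as the gap between two edges.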

# Line Chart

A line chart shows how one variable (Y) changes with another (X), most often time, by connecting the data points with a line.

In [None]:
line_chart=pd.read_csv("/kaggle/input/eda-datsets/line_chart_data.csv")

In [None]:
line_chart

In [None]:
line_chart.info()

First, convert the Date column from the object (string) dtype to datetime.

In [None]:
line_chart["date"]=pd.to_datetime(line_chart["Date"])

In [None]:
line_chart["date"]

Plotting the whole dataset:

In [None]:
plt.figure(figsize=(20,8))
plt.plot(line_chart["date"],line_chart["GSPC500"])
plt.plot(line_chart["date"],line_chart["FTSE100"])
plt.title("S&P vs FTSE returns (2000-2010)",fontsize=14,weight="bold")
# One label per plotted line, in the order the lines were drawn
plt.legend(labels=["S&P 500","FTSE 100"], loc="upper left")
plt.show()

With ten years of data the lines are too dense to read much from this chart. So now what?

Let's plot a smaller sample instead: the second half of 2009.

In [None]:
df_2009= line_chart[(line_chart.date>="2009-07-01") & (line_chart.date<="2009-12-31")]

In [None]:
df_2009

In [None]:
plt.figure(figsize=(20,8))
plt.plot(df_2009["date"],df_2009["GSPC500"],color="midnightblue")
plt.plot(df_2009["date"],df_2009["FTSE100"],color="crimson")
plt.title("S&P vs FTSE returns (Jul-Dec 2009)",fontsize=14,weight="bold")
# One label per plotted line, in the order the lines were drawn
plt.legend(labels=["S&P 500","FTSE 100"], loc="upper left",fontsize="large")
plt.show()

Hurray! We can now read the chart properly and analyse it further.
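Another way to take the same kind of date window is to make the datetime column the index and slice it with `.loc`. This is only a sketch on made-up rows. The column names follow the notebook, but the values are hypothetical.

```python
import pandas as pd

# Hypothetical rows, made up for illustration only
sample = pd.DataFrame({
    "Date": ["2009-05-01", "2009-07-01", "2009-10-01", "2010-01-01"],
    "GSPC500": [0.01, 0.02, -0.01, 0.03],
})
sample["date"] = pd.to_datetime(sample["Date"])

# With a sorted DatetimeIndex, .loc accepts date strings and includes both endpoints
indexed = sample.set_index("date").sort_index()
second_half_2009 = indexed.loc["2009-07-01":"2009-12-31"]

print(len(second_half_2009))
```

The boolean mask used in the notebook and this index slice select the same rows. The index version is shorter when you filter by date often.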

# Histogram

A histogram shows the distribution of a numerical variable by grouping its values into intervals called bins. It looks like a bar plot in which the X-axis shows the bin ranges and the Y-axis shows the frequency, that is, how many observations fall into each bin.

In [None]:
hist_data=pd.read_csv("/kaggle/input/eda-datsets/histogram_data.csv")

In [None]:
hist_data

In [None]:
plt.figure(figsize=(9,6))
sns.set_style("white")
plt.hist(hist_data["Price"], bins=8, color="#008080")
plt.title("Distribution of Real Estate Prices", fontsize=14, weight="bold")
plt.xlabel("Price")
sns.despine()
plt.show()
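To make the idea of bins concrete, the sketch below counts values per bin with `np.histogram`, which performs the same binning that `plt.hist` draws. The prices are made up for illustration.

```python
import numpy as np

# Hypothetical prices, made up for illustration only
prices = np.array([100, 120, 130, 200, 210, 300, 310, 320, 400, 500])

# Split the range from min to max into 4 equal-width bins and count values in each
counts, edges = np.histogram(prices, bins=4)

print(edges)   # 5 edges define 4 bins
print(counts)  # number of prices falling into each bin
```

Every bin includes its left edge and excludes its right edge, except the last bin, which includes both. Changing `bins` changes the shape of the histogram, so it is worth trying a few values.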

# Scatter Plot

A scatter plot draws one dot per observation to show the relationship between two variables. It is widely used to see how a change in one variable is associated with a change in the other.

In [None]:
scatter_data= pd.read_csv("/kaggle/input/eda-datsets/scatter_data.csv")

In [None]:
scatter_data

In [None]:
plt.figure(figsize=(9,6))
scatter=plt.scatter(scatter_data["Area (ft.)"],
            scatter_data["Price"],
           alpha=0.6,
           c=scatter_data["Building Type"],
           cmap='viridis')
plt.title("Area V/S Price", fontsize=14, weight="bold")
plt.xlabel("Area")
plt.ylabel("Price")
plt.legend(*scatter.legend_elements(), loc="upper left", title="Building Type")
plt.show()
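The `c` argument in the cell above needs numbers (or colors) to map onto the colormap. If a category column were stored as text, which is only an assumption here since the dataset's dtypes are not shown, one option would be to turn the labels into integer codes first. The labels below are hypothetical.

```python
import pandas as pd

# Hypothetical text labels, made up for illustration only
building_type = pd.Series(["House", "Flat", "House", "Studio", "Flat"])

# factorize assigns an integer code to each distinct label, in order of first appearance
codes, uniques = pd.factorize(building_type)

print(codes)          # integer codes that could be passed to c=
print(list(uniques))  # the label each code stands for
```

The `uniques` values can then be used as legend labels so the reader sees the original names instead of the codes.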

# Regression Plot

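Regression plots, as the name suggests, fit a regression line between two variables and help visualize their linear relationship.

We can draw regression plots with two seaborn functions:
1. sns.regplot()
2. sns.lmplot()

regplot() draws onto a single Matplotlib axes, while lmplot() is a figure-level function that creates its own figure, which is why its size is set with height rather than plt.figure().

Let's look at both in detail.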

In [None]:
regression_data=pd.read_csv("/kaggle/input/eda-datsets/scatter_plot_ii.csv")

In [None]:
regression_data.head()

In [None]:
plt.figure(figsize=(9,6))
sns.regplot(x="Budget",
            y="Sales",
            data= regression_data,
            scatter_kws={'color':'k'},
            line_kws={'color':'red'})
plt.title("Effect of Budget on Sales", fontsize=14, weight="bold")
plt.show()

In [None]:
regression_data_2= pd.read_csv("/kaggle/input/eda-datsets/scatter_data.csv")

In [None]:
regression_data_2

In [None]:
sns.lmplot(x="Area (ft.)",
          y="Price",
          data=regression_data_2,
          scatter_kws={'color':'k'},
          line_kws={'color':'red'},
          height=6)
sns.set()
plt.title("Effect of Area on Price", fontsize=14, weight="bold")
plt.show()
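By default, the straight line in both plots is an ordinary least-squares fit. The sketch below computes such a line directly with `np.polyfit` so the slope and intercept can be inspected. The budget and sales numbers are made up and are not taken from `scatter_plot_ii.csv`.

```python
import numpy as np

# Hypothetical data, made up for illustration only
budget = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sales = np.array([3.1, 4.9, 7.2, 8.8, 11.0])

# Fit a degree-1 polynomial: sales is approximately slope * budget + intercept
slope, intercept = np.polyfit(budget, sales, deg=1)

# Points on the fitted line, one per budget value
predicted = slope * budget + intercept

print(round(slope, 2), round(intercept, 2))
```

The slope tells you how much sales change, on average, for each extra unit of budget. The plots show this visually but do not print the number.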

# Bar and Line Chart

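This chart combines a bar plot and a line chart in one figure. The bars and the line share the X-axis, and the line gets its own Y-axis on the right, created with twinx().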

In [None]:
combined_data= pd.read_csv("/kaggle/input/eda-datsets/bar_line_chart_data.csv")

In [None]:
combined_data.head()

In [None]:
from matplotlib.ticker import PercentFormatter

In [None]:
sns.set_style("white")
fig, ax= plt.subplots(figsize=(10,7))
ax.bar(combined_data["Year"],
      combined_data["Participants"],
      color='k')
ax.set_ylabel("Number of Participants", fontsize=14, weight="bold")
ax1= ax.twinx()
ax1.set_ylim(0,1)
ax1.yaxis.set_major_formatter(PercentFormatter(xmax=1.0))
ax1.plot(combined_data["Year"],
      combined_data["Python Users"],
      color='#b60000',
      marker='D')
ax1.set_ylabel("Python Users", fontsize=14, weight="bold")
ax.set_title("KD Nuggets Survey Python Users", fontsize=14, weight="bold")
plt.show()
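The right-hand axis above holds fractions between 0 and 1, and `PercentFormatter(xmax=1.0)` tells Matplotlib that 1.0 corresponds to 100%. A small sketch of what the formatter does to a single value (the value 0.25 is just an example):

```python
from matplotlib.ticker import PercentFormatter

# xmax is the data value that should be shown as 100%
formatter = PercentFormatter(xmax=1.0)

# format_pct(value, display_range): display_range is the visible span of the axis,
# which the formatter uses to decide how many decimals to show
label = formatter.format_pct(0.25, 1.0)

print(label)
```

If the column stored percentages on a 0 to 100 scale instead, `xmax=100` would be the matching setting.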

Happy Learning!

I hope you found this useful. If you get stuck, leave a comment!

Suggestions are always welcome!
As a beginner, I would appreciate it if you could read and review my notebook. Please upvote if you like my work; it will boost my confidence.

Thank You.