# Data Visualization

* Data Visualization is all about viewing or visualizing data in the form of graphical plots, charts, figures, and animations.

* Data Visualization is an effective way of communicating information to others.

* Data Visualization is one of the steps in Data Science.

# Effective Data Visualization

* An effective visual can be created from data when you realize that you are actually telling a story.

* Follow these steps to create effective visualizations:

* Don't bother much about the tool used for creating the visuals.
* Define what to communicate before you look at data.
* Identify the right chart or plot, which suits your story.
* Create the visual and verify if it is aligned with your story.

# Data Visualization with Python

* There are many Python libraries used for Data Visualization.

* A few popular libraries are:

* matplotlib: It is the most widely used Python Data Visualization library.

* seaborn: It is used for generating informative statistical graphics. It is dependent on matplotlib.

* bokeh: It is used for generating interactive plots, which can be accessed as JSON, HTML objects, or interactive web applications.

### matplotlib is one of the earliest data visualization libraries in Python and is widely used.

##### In this course you will learn:

* Usage of matplotlib library in creating basic plots such as Line plot, Scatter Plot, etc.

* Creating multiple plots in a single figure.

* Customizing plots using various styles.

## Installing Matplotlib

* matplotlib is a third-party library and is not part of the standard Python library.

* You can install it with the pip utility, as shown below.

pip install matplotlib

* matplotlib also comes preinstalled with distributions such as Anaconda and WinPython.

## Loading matplotlib

* matplotlib is loaded using import, as shown below.

In [None]:
import matplotlib

* You can find the installed version of matplotlib with the command below.

In [None]:
print(matplotlib.__version__)

* If matplotlib is already installed and you want to upgrade it, run the command below at the command-line prompt.

pip install --upgrade matplotlib

## About Matplotlib

* In matplotlib, everything is organized in a hierarchy.

* At the top level is the matplotlib.pyplot module.

* pyplot is used only for a few activities, such as figure creation.

* Through the created figures, one or more axes/subplot objects are created.

* The axes objects are further used for doing many plotting actions.

* In the next topic, you will learn the anatomy of a figure.
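
The hierarchy described above can be sketched in a few lines. This is only a minimal illustration; the `Agg` backend and the sample data are assumptions, chosen so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# pyplot is used here only to create the figure.
fig = plt.figure()

# The figure creates the Axes object.
ax = fig.add_subplot(1, 1, 1)

# The Axes object performs the actual plotting (sample data is illustrative).
lines = ax.plot([1, 2, 3], [2, 4, 6])

# Each level keeps a reference to its neighbours in the hierarchy.
print(ax.figure is fig)
print(ax in fig.axes)
print(len(lines))
```
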

# Parts of a Matplotlib Figure

In this topic, you will get introduced to essential parts of a Matplotlib Figure.

* Figure: Whole area chosen for plotting.
* Axes: Area where data is plotted.
* Axis: Number-line like objects, which define graph limits.
* Artist: Every element on the figure is an artist.
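
As a rough check of the list above, the following sketch builds one figure and tests whether each part is an Artist. The dictionary keys are just descriptive names chosen for this example, and the `Agg` backend is an assumption so that no display is required.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend
import matplotlib.pyplot as plt
from matplotlib.artist import Artist

fig = plt.figure()
ax = fig.add_subplot(111)
(line,) = ax.plot([0, 1], [0, 1])

# Collect the main parts of the figure under illustrative names.
parts = {
    'Figure': fig,
    'Axes': ax,
    'X-Axis': ax.xaxis,
    'Y-Axis': ax.yaxis,
    'Line': line,
}

# Report the class of each part and whether it is an Artist.
for name, obj in parts.items():
    print(name, type(obj).__name__, isinstance(obj, Artist))
```
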

# Figure

* Figure refers to the whole area or page on which everything is drawn.

* It includes Axes, Axis, and other Artist elements.

### Creating a Figure
* A figure is created using the figure function of the pyplot module, as shown below.

In [None]:
import matplotlib.pyplot as plt
fig = plt.figure()

* Executing the above code doesn't display any figure.
* You must explicitly tell pyplot to display it.
* NOTE: The code snippets shown in this course assume that you have imported pyplot as plt.

# Viewing a Figure
* show method can be used to view the created figure as shown below.

In [None]:
fig = plt.figure()
plt.show()
Output
<matplotlib.figure.Figure at 0x185417f0>

* The output simply shows the figure object.
* You will be able to view a picture only when a figure contains at least one Axes element.

# Axes
* An Axes is the region of the figure, available for plotting data.

* An Axes object is associated with only one Figure.

* A Figure can contain one or more Axes elements.

* An Axes contains two Axis objects in case of 2D plots and three Axis objects in case of 3D plots.



# Creating an Axes
* An Axes can be added to a figure using the add_subplot method.
### Syntax
* add_subplot(nrows, ncols, index)

* When all three argument values are less than 10, they can be combined and passed as a single three-digit number.

* Hence, add_subplot(1, 1, 1) and add_subplot(111) are equivalent.

* The code below generates a figure with one Axes, ax.


In [None]:
fig = plt.figure()
ax = fig.add_subplot(111)
plt.show()
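
To illustrate the three-digit shorthand on a grid larger than 1x1, the sketch below places an Axes in the third cell of a 2x2 grid using both call forms and compares the resulting positions. The `Agg` backend and the variable names are assumptions made for this example.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend
import matplotlib.pyplot as plt

# Cells are counted row by row, so index 3 in a 2x2 grid is the bottom-left cell.
fig_a = plt.figure()
ax_a = fig_a.add_subplot(2, 2, 3)

# Same request written with the three-digit shorthand.
fig_b = plt.figure()
ax_b = fig_b.add_subplot(223)

# Compare (left, bottom, width, height) of both Axes in figure coordinates.
bounds_a = ax_a.get_position().bounds
bounds_b = ax_b.get_position().bounds
same_place = all(abs(a - b) < 1e-9 for a, b in zip(bounds_a, bounds_b))
print(same_place)
```
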

# Adjusting Figure Size
* The default width and height of a figure are 6.4 and 4.8 inches respectively (640x480 pixels at the default 100 dpi).

* You can change the size of a figure using figsize argument.

* For example, the expression fig = plt.figure(figsize=(8,6)) generates a figure having 8 inches width and 6 inches height.
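
The effect of figsize can be checked by reading the size back from the figure. The sketch below is a small illustration under the assumption of the `Agg` backend; the second size (10 by 4 inches) is an arbitrary value chosen for the example.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 6))

# Size in inches, as requested through figsize.
width_in, height_in = fig.get_size_inches()
print(width_in, height_in)

# Pixel dimensions of a saved image are inches multiplied by dpi.
dpi = fig.get_dpi()
print(width_in * dpi, height_in * dpi)

# The size of an existing figure can also be changed later.
fig.set_size_inches(10, 4)
print(fig.get_size_inches())
```
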

# Setting Title and Axis Labels
### set method can be used on created axes, ax, to set various parameters such as xlabel, ylabel and title.

In [None]:
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title='My First Plot',
      xlabel='X-Axis', ylabel='Y-Axis',
      xlim=(0, 5), ylim=(0,10))
plt.show()

# Setting Title and Axis Labels
### Setting an attribute can also be done with functions of the form set_<parameter_name>, as shown in below code.


In [None]:
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set_title("My First Plot")
ax.set_xlabel("X-Axis"); ax.set_ylabel('Y-Axis')
ax.set_xlim([0,5]); ax.set_ylim([0,10])
plt.show()
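
The two styles shown so far should leave an Axes in the same state. The following sketch applies each style to a separate Axes and compares the values read back with the matching get_<parameter_name> methods. The helper `summary` is a hypothetical name introduced only for this example, and the `Agg` backend is an assumption.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 6))
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)

# Style 1: a single call to set with keyword arguments.
ax1.set(title='My First Plot',
        xlabel='X-Axis', ylabel='Y-Axis',
        xlim=(0, 5), ylim=(0, 10))

# Style 2: one set_<parameter_name> call per attribute.
ax2.set_title('My First Plot')
ax2.set_xlabel('X-Axis'); ax2.set_ylabel('Y-Axis')
ax2.set_xlim([0, 5]); ax2.set_ylim([0, 10])

def summary(ax):
    """Read the attributes back with the corresponding getters."""
    return (ax.get_title(), ax.get_xlabel(), ax.get_ylabel(),
            ax.get_xlim(), ax.get_ylim())

print(summary(ax1) == summary(ax2))
```
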

# Plotting Data
* plot is one of the functions used for plotting data points.

* plot function is called on the created axes object, ax, as shown in below code snippet.


In [None]:
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title='My First Plot',
      xlabel='X-Axis', ylabel='Y-Axis',
      xlim=(0, 5), ylim=(0,10))
x = [1, 2, 3, 4]; y = [2, 4, 6, 8]
ax.plot(x, y)
plt.show()

# Plotting Data
* Plotting data or setting attributes can also be done by calling functions such as plot and title directly on plt.

* This plots the data on the Axes that is currently active.

* However, explicit is better than implicit. Hence, prefer the former style, which calls methods on the Axes object.

In [None]:

fig = plt.figure(figsize=(8,6))

x = [1, 2, 3, 4]; y = [2, 4, 6, 8]
plt.plot(x, y)
plt.title('My First Plot')
plt.xlabel('X-Axis'); plt.ylabel('Y-Axis')
plt.xlim(0,5); plt.ylim(0,10)
plt.show()
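
The difference between the implicit and explicit styles becomes visible once a figure holds more than one Axes. The sketch below is a minimal illustration, assuming the `Agg` backend and made-up data: the pyplot call lands on whichever Axes is current, while the method call targets a specific Axes.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 4))
ax1 = fig.add_subplot(1, 2, 1)
ax2 = fig.add_subplot(1, 2, 2)   # the most recently added Axes becomes current

# Implicit style: pyplot draws on the current Axes, which is ax2 here.
plt.plot([1, 2, 3], [1, 4, 9])

# Explicit style: the target Axes is named in the call.
ax1.plot([1, 2, 3], [3, 2, 1])

print(plt.gca() is ax2)
print(len(ax1.lines), len(ax2.lines))
```
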

# Adding a Legend
* legend function is called on axes object ax to produce a legend.

* The legend uses the label, provided to a line drawn using plot as shown in below code.


In [None]:
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title='My First Plot',
      xlabel='X-Axis', ylabel='Y-Axis',
      xlim=(0, 5), ylim=(0,10))
x = [1, 2, 3, 4]; y = [2, 4, 6, 8]
ax.plot(x, y, label='linear-growth')
ax.legend()
plt.show()
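
A legend lists only the lines that carry a label. The sketch below extends the example with a second, hypothetical series ('quadratic-growth') and one unlabeled line, then reads the legend entries back. The extra data, the `loc` value, and the `Agg` backend are assumptions made for illustration.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111)
x = [1, 2, 3, 4]

ax.plot(x, [2, 4, 6, 8], label='linear-growth')
ax.plot(x, [1, 4, 9, 16], label='quadratic-growth')  # hypothetical second series
ax.plot(x, [5, 5, 5, 5])                             # no label, so no legend entry

# loc chooses the corner in which the legend is drawn.
legend = ax.legend(loc='upper left')

entries = [text.get_text() for text in legend.get_texts()]
print(entries)
```
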

In [1]:
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
plt.plot([10, 12, 14, 16])
plt.show()

<Figure size 640x480 with 1 Axes>

In [None]:
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.gridspec as gridspec
#Write your code here
def test_my_first_plot():
    fig = plt.figure(figsize=(8,6))
    ax = fig.add_subplot(111)
    t = [5, 10, 15, 20, 25]
    d = [25, 50, 75, 100, 125]
    ax.set(title='Time vs Distance Covered',
          xlabel='time (seconds)', ylabel='distance (meters)',
          xlim=(0, 30), ylim=(0,130))
    plt.plot(t, d, label='d = 5t')
    plt.legend()
    plt.savefig("scatter.png")
test_my_first_plot()



# Types of Plots
### Data can be presented using many different types of plots.

### In this topic, you will learn how to draw the following plots using matplotlib.

* Line plot
* Scatter plot
* Bar plot
* Pie plot
* Histogram
* Box plot

# Line Plot
*  Line Plot is used to visualize a trend in data.

* Line Plot is also used to compare two variables.

* Line Plots are simple and effective in communicating.

* plot function is used for drawing Line plots.

## Syntax
plot(x, y)
#### 'x' , 'y' : Data values representing two variables.

# Line Plot Example
* The example below plots the average daily temperature for January 2018.
* The temperature is recorded on every Monday and Friday. It shows an increasing trend in temperature.

In [None]:
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title='Avg. Daily Temperature in Jan 2018',
      xlabel='Day', ylabel='Temperature (in deg)',
      xlim=(0, 30), ylim=(25, 35))
days = [1, 5, 8, 12, 15, 19, 22, 26, 29]
temp = [29.3, 30.1, 30.4, 31.5, 32.3, 32.6, 31.8, 32.4, 32.7]
ax.plot(days, temp)
plt.show()

# Common Parameters of 'plot' Function
* color: Sets the color of the line.

* linestyle: Sets the line style, e.g., solid, dashed, etc.

* linewidth: Sets the thickness of a line.

* marker: Chooses a marker for data points, e.g., circle, triangle, etc.

* markersize: Sets the size of the chosen marker.

* label: Names the line, which will come in legend.

# Setting 'plot' Parameters
* To customize a line, pass the required parameters as arguments to the plot function.
* A green dashed line of width 3 can be generated using the following expression.

In [None]:
ax.plot(days, temp, color='green', linestyle='--', linewidth=3)

# Marking Data Points
* Data points are made visible using marker argument.

* The expression below plots a green line with data points marked as circles.

In [None]:
ax.plot(days, temp, color='green', marker='o')
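
The plot function returns the line objects it creates, so the parameters discussed above can be read back or changed afterwards. The sketch below combines several parameters in one call; the `markersize` value and the `Agg` backend are assumptions chosen for this example.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend
import matplotlib.pyplot as plt

days = [1, 5, 8, 12, 15, 19, 22, 26, 29]
temp = [29.3, 30.1, 30.4, 31.5, 32.3, 32.6, 31.8, 32.4, 32.7]

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111)

# plot returns a list of Line2D objects; one line is drawn here.
(line,) = ax.plot(days, temp, color='green', linestyle='--',
                  linewidth=3, marker='o', markersize=8)

# Read the settings back from the line object.
print(line.get_color(), line.get_linestyle(), line.get_linewidth())
print(line.get_marker(), line.get_markersize())

# Properties can also be changed after the line has been created.
line.set_linestyle(':')
print(line.get_linestyle())
```
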

# Plotting Multiple Lines
* Using plot function multiple times is one of the ways to draw multiple lines.

* Two lines representing temperatures of two different locations are plotted using below code.

In [None]:
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title='Avg. Daily Temperature of Jan 2018',
      xlabel='Day', ylabel='Temperature (in deg)',
      xlim=(0, 30), ylim=(25, 35))
days = [1, 5, 8, 12, 15, 19, 22, 26, 29]
location1_temp = [29.3, 30.1, 30.4, 31.5, 32.3, 32.6, 31.8, 32.4, 32.7]
location2_temp = [26.4, 26.8, 26.1, 26.4, 27.5, 27.3, 26.9, 26.8, 27.0]
ax.plot(days, location1_temp, color='green', marker='o', linewidth=3)
ax.plot(days, location2_temp, color='red', marker='o', linewidth=3)
plt.show()

# Scatter Plot
* Scatter plot is very similar to Line Plot.

* Scatter Plot is used for showing how one variable is related with another.

* Scatter Plot consists of data points. If the points fall close to a straight line, the two variables are strongly linearly correlated.

* scatter function is used for drawing scatter plots.

# Syntax

In [None]:
scatter(x, y)
# 'x', 'y' : Data values representing two variables.

# Scatter Plot with Scatter Function
* scatter plot only marks the data points with the chosen marker.

* The below example displays the average temperature of a day, corresponding to every Monday and Friday of Jan 2018.

In [None]:

fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title='Avg. Daily Temperature of Jan 2018',
      xlabel='Day', ylabel='Temperature (in deg)',
      xlim=(0, 30), ylim=(25, 35))
days = [1, 5, 8, 12, 15, 19, 22, 26, 29]
temp = [29.3, 30.1, 30.4, 31.5, 32.3, 32.6, 31.8, 32.4, 32.7]
ax.scatter(days, temp)
plt.show()
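
The visual impression of a linear spread can be backed by a number. One common choice, used here as an assumption rather than something the course prescribes, is the Pearson correlation coefficient computed with NumPy; values close to 1 or -1 indicate a strong linear relationship.

```python
import numpy as np

days = [1, 5, 8, 12, 15, 19, 22, 26, 29]
temp = [29.3, 30.1, 30.4, 31.5, 32.3, 32.6, 31.8, 32.4, 32.7]

# corrcoef returns a 2x2 correlation matrix; the off-diagonal entry
# is the correlation between the two variables.
r = np.corrcoef(days, temp)[0, 1]
print(round(r, 2))
```
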

# Common Parameters of 'scatter'
* c: Sets color of markers.

* s: Sets size of markers.

* marker: Selects a marker, e.g., circle, triangle, etc.

* edgecolor: Sets the color of lines on edges of markers.

# Setting 'scatter' Parameters
## Parameters c and s can take a single value or a list of values.
* When a list is passed, recent matplotlib releases expect either one value or one value per data point; older releases repeated a shorter list.
* The below example plots green colored circles of size 60, with black edges.

In [None]:
ax.scatter(days, temp, marker='o', c=['green'], s=[60], edgecolor='black')

# Scatter Plot Using 'plot'
* plot function can also create a scatter plot when linestyle is set to none, and a marker is chosen, as shown in below code.

In [None]:
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title='Avg. Daily Temperature of Jan 2018',
      xlabel='Day', ylabel='Temperature (in deg)',
      xlim=(0, 30), ylim=(25, 35))
days = [1, 5, 8, 12, 15, 19, 22, 26, 29]
temp = [29.3, 30.1, 30.4, 31.5, 32.3, 32.6, 31.8, 32.4, 32.7]
ax.plot(days, temp, marker='o', linestyle='none')
plt.show()

# Hands-on

In [None]:
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.gridspec as gridspec
from matplotlib.testing.decorators import image_comparison
#Write your code here
def test_sine_wave_plot():
    fig = plt.figure(figsize=(12,3))
    ax = fig.add_subplot(111)
    t = np.linspace(0.0,2.0,200)
    v = np.sin(2.5*np.pi*t)
    ax.set(title='Sine Wave', xlabel='Time (seconds)', ylabel='Voltage (mv)', xlim = (0,2), ylim=(-1,1))
    plt.xticks([0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0])
    plt.yticks([-1, 0, 1])
    plt.plot(t, v, label='sin(t)',color = 'red')
    plt.legend()
    ax.grid(color='black', linestyle='--',alpha=0.5)
    plt.grid(True)
    plt.savefig("sinewave.png")
test_sine_wave_plot()
def test_multi_curve_plot():
    fig = plt.figure(figsize=(12,3))
    ax = fig.add_subplot(111)
    x = np.linspace(0.0,5.0,20)
    y1 = x
    y2 = x**2
    y3 = x**3

    ax.set(title='Linear, Quadratic, & Cubic Equations',
          xlabel='X', ylabel='f(x)')
    ax.plot(x, y1, label='y = x',marker = 'o',color = 'red')
    ax.plot(x, y2, label='y = x**2',marker = "s",color = 'green')
    ax.plot(x, y3, label='y = x**3',marker = "v",color = 'blue')
    plt.legend()
    plt.savefig("multicurve.png")
test_multi_curve_plot()
def test_scatter_plot():
    fig = plt.figure(figsize=(12,3))
    ax = fig.add_subplot(111)
    s = [50, 60, 55, 50, 70, 65, 75, 65, 80, 90, 93, 95]
    months = [1,2,3,4,5,6,7,8,9,10,11,12]
    ax.set(title="Cars Sold by Company 'X' in 2017",
          xlabel='Months', ylabel='No. of Cars Sold',xlim=(0, 13), ylim=(20,100))
    ax.scatter(months, s,marker = 'o',color = 'red')
    plt.xticks([1, 3, 5, 7, 9,11])
    ax.set_xticklabels(['Jan', 'Mar', 'May', 'Jul', 'Sep','Nov'])
    plt.savefig("scatter.png")
test_scatter_plot()
    

# Bar Plot
### Bar Plot is commonly used for comparing categories.

* It is also used to compare categories over a short period of time.

* bar and barh are used for plotting vertical and horizontal bar plots respectively.

## Syntax

In [3]:
bar(x,height)
# 'x' : x coordinates of bars.
# 'height' : List of heights of each bar.
barh(y, width)
# 'y' : y coordinates of bars
# 'width' : List of widths.


# Bar Plot Using 'bar'
* The below example plots the average sales of a company, in first three quarters of 2017.
* The code also sets the ticks on X-Axis and labels them.

In [None]:
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title='Avg. Quarterly Sales',
      xlabel='Quarter', ylabel='Sales (in millions)')
quarters = [1, 2, 3]
sales_2017 = [25782, 35783, 36133]
ax.bar(quarters, sales_2017)
ax.set_xticks(quarters)
ax.set_xticklabels(['Q1-2017', 'Q2-2017', 'Q3-2017'])
plt.show()

# Common Parameters of 'bar'
* color: Sets the color of bars.
* edgecolor: Sets the color of the border line of bars.
* width: Sets the width of bars
* align: Aligns the bars w.r.t x-coordinates
* label: Sets label to a bar, appearing in legend.

# Setting Parameters of 'bar'
* The width of bars can be adjusted with width, color with color, edge color with edgecolor parameters.
* Red color bars with black edges can be drawn using the below expression.

In [None]:
ax.bar(quarters, sales_2017, color='red', width=0.6, edgecolor='black')

# Plotting Multiple Groups
* Vertical bar plots are used for comparing more than one category at a time.
* The example below compares a company's sales in the first three quarters of 2016 and 2017.

In [None]:
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title='Avg. Quarterly Sales',
      xlabel='Quarter', ylabel='Sales (in millions)')
quarters = [1, 2, 3]
x1_index = [0.8, 1.8, 2.8]; x2_index = [1.2, 2.2, 3.2]
sales_2016 = [28831, 30762, 32178]; sales_2017 = [25782, 35783, 36133]
ax.bar(x1_index, sales_2016, color='yellow', width=0.4, edgecolor='black', label='2016')
ax.bar(x2_index, sales_2017, color='red', width=0.4, edgecolor='black', label='2017')
ax.set_xticks(quarters)
ax.set_xticklabels(['Q1', 'Q2', 'Q3'])
ax.legend()
plt.show()
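
The hand-written x positions in the example above (x1_index and x2_index) can also be computed, which scales better when more groups are added. The sketch below is one possible way to do this; the dictionary layout and the offset formula are assumptions made for this example, not part of the course code, and the `Agg` backend is assumed.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

# Same sales figures as above, keyed by year (illustrative layout).
sales = {'2016': [28831, 30762, 32178],
         '2017': [25782, 35783, 36133]}
quarters = np.array([1, 2, 3])
width = 0.4
n_groups = len(sales)

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111)

positions = {}
for i, (year, values) in enumerate(sales.items()):
    # Shift each group so that the cluster of bars is centred on its tick.
    offset = (i - (n_groups - 1) / 2) * width
    positions[year] = quarters + offset
    ax.bar(positions[year], values, width=width,
           edgecolor='black', label=year)

ax.set_xticks(quarters)
ax.set_xticklabels(['Q1', 'Q2', 'Q3'])
ax.legend()
print(positions)
```

With two groups and a width of 0.4, the computed positions match the hard-coded lists used above.
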

# Barplot Using 'barh'
* barh draws the bars horizontally, as the example below shows.
* height parameter is used to adjust the height of each bar.
* Horizontal bar plots are used while comparing values of one category at a time.

In [None]:
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title='Avg. Quarterly Sales',
      xlabel='Sales (in millions)', ylabel='Quarter')
quarters = [1, 2, 3]
sales_2017 = [25782, 35783, 36133]
ax.barh(quarters, sales_2017, height=0.6, color='red')
ax.set_yticks(quarters)
ax.set_yticklabels(['Q1-2017', 'Q2-2017', 'Q3-2017'])
plt.show()

# Pie Plot
* Pie plot is effective in showing the proportion of categories.
* It is best suited for comparing a small number of categories.
* In general, Pie Plot is used to highlight proportion of one or a group of categories.
## Syntax

In [None]:
pie(x)
# 'x' : sizes of portions, passed either as a fraction or a number.

# Pie Plot Using 'pie'
* The pie chart below displays a company's sales in the first three quarters of 2017.

In [None]:
fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(111)
ax.set(title='Avg. Quarterly Sales')
sales_2017 = [25782, 35783, 36133]
ax.pie(sales_2017)
plt.show()

# Common Parameters of 'pie'
* colors: Sets the colors of portions.
* labels: Sets the labels of portions.
* startangle: Sets the start angle at which portion drawing starts.
* autopct: Sets the format in which the percentage of each portion is displayed.

# Setting Parameters of 'pie'
* Labels and percentage of portions are drawn with below code snippet.

In [None]:
fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(111)
ax.set(title='Avg. Quarterly Sales')
sales_2017 = [25782, 35783, 36133]
quarters = ['Q1-2017', 'Q2-2017', 'Q3-2017']
ax.pie(sales_2017, labels=quarters, startangle=90, autopct='%1.1f%%')
plt.show()
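
The percentages drawn by autopct are each value's share of the total. The sketch below recomputes those shares by hand and compares them with the text that pie places on the chart. The variable names `shown` and `expected` are introduced only for this example, and the `Agg` backend is an assumption.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend
import matplotlib.pyplot as plt

sales_2017 = [25782, 35783, 36133]
quarters = ['Q1-2017', 'Q2-2017', 'Q3-2017']

fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)

# With autopct set, pie returns the wedges, the label texts,
# and the percentage texts.
wedges, texts, autotexts = ax.pie(sales_2017, labels=quarters,
                                  startangle=90, autopct='%1.1f%%')

# Percentages as drawn on the chart.
shown = [t.get_text() for t in autotexts]

# The same percentages computed by hand: value / total * 100.
total = sum(sales_2017)
expected = ['%1.1f%%' % (100 * v / total) for v in sales_2017]

print(shown)
print(expected)
```
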

# Histogram
* Histogram is used to visualize the spread of data of a distribution.

* hist function is used to plot a histogram.

## Syntax

In [None]:
hist(x)
# 'x' : Data values of a single variable.

# Histogram Using 'hist'
* The below example simulates 1000 percentage values from a normal distribution with mean 60 and standard deviation 10.
* Then the histogram of percentage values is plotted.

In [None]:
import numpy as np
np.random.seed(100)
x = 60 + 10*np.random.randn(1000)
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title="Distribution of Student's Percentage",
      ylabel='Count', xlabel='Percentage')
ax.hist(x)
plt.show()

# Common Parameters of 'hist'
* color: Sets the color of bars.
* bins: Sets the number of bins to be used.
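* density: When set to True, the histogram is normalized to show a probability density (total area of 1) instead of counts. Older matplotlib releases called this parameter normed.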

# Setting Parameters of 'hist'
* You can also create more bins and normalize the bar heights with density=True, as in the following example.

In [None]:
import numpy as np
np.random.seed(100)
x = 60 + 10*np.random.randn(1000)
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title="Distribution of Student's Percentage",
      ylabel='Proportion', xlabel='Percentage')
ax.hist(x, color='blue', bins=30, density=True)
plt.show()
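
The hist function returns the bar heights and the bin edges, which makes the effect of density=True easy to verify. The sketch below is a small check under the assumption of the `Agg` backend: the bar heights multiplied by the bin widths should add up to 1.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(100)
x = 60 + 10 * np.random.randn(1000)

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111)

# hist returns the bar heights, the bin edges, and the drawn patches.
heights, edges, patches = ax.hist(x, color='blue', bins=30, density=True)

# With density=True, the total area of the bars is 1.
area = np.sum(heights * np.diff(edges))
print(len(heights), len(edges))
print(area)
```
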

# Box Plots
* Box plots are also used to visualize the spread of data.
* Box plots are used to compare distributions.
* Box plots can also be used to detect outliers.
## Syntax

In [None]:
boxplot(x)
# 'x' : list of values or list of list of values.

# Boxplot Using 'boxplot'
* The example below draws a box plot of percentages obtained by 1000 students of a class.

In [None]:
import numpy as np
np.random.seed(100)
x = 50 + 10*np.random.randn(1000)
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title="Box plot of Student's Percentage",
      xlabel='Class', ylabel='Percentage')
ax.boxplot(x)
plt.show()
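
To see how a box plot detects outliers, the sketch below recomputes the limits by hand. It assumes matplotlib's default whisker setting of 1.5 times the interquartile range (IQR) and the `Agg` backend; points beyond those limits are drawn as individual "flier" markers.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(100)
x = 50 + 10 * np.random.randn(1000)

# Quartiles and interquartile range computed by hand.
q1, median, q3 = np.percentile(x, [25, 50, 75])
iqr = q3 - q1

# Assumed default rule: whiskers reach at most 1.5 * IQR beyond the box.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = x[(x < lower) | (x > upper)]

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111)

# boxplot returns a dictionary of the artists it draws.
bp = ax.boxplot(x)
fliers = bp['fliers'][0].get_ydata()

print(len(outliers), len(fliers))
```
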

# Common Parameters of 'boxplot'
* labels: Sets the labels for box plots (renamed to tick_labels in matplotlib 3.9).

* notch: Set to True to draw notches around the median.

* bootstrap: Number of bootstrap resamples used to estimate the confidence interval around the median when notches are drawn.

* vert: Set to False to plot box plots horizontally.

# Setting Parameters of 'boxplot'
* Box plot of Student Percentages can be redrawn by setting notch, bootstrap and labels using the below-shown expression.

In [None]:
ax.boxplot(x, labels=['A'], notch=True, bootstrap=10000)

# Plotting Multiple Boxplots
* List of data values can be passed as an argument for plotting multiple box plots as shown in below code snippet.

In [None]:
import numpy as np
np.random.seed(100)
x = 50 + 10*np.random.randn(1000)
y = 70 + 25*np.random.randn(1000)
z = 30 + 5*np.random.randn(1000)
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title="Box plot of Student's Percentage",
      xlabel='Class', ylabel='Percentage')
ax.boxplot([x, y, z], labels=['A', 'B', 'C'], notch=True, bootstrap=10000)
plt.show()

# Plotting Boxplots Horizontally
* Box plots are plotted horizontally by setting vert to False, as shown in the below code snippet.

In [None]:
ax.set(title="Box plot of Student's Percentage",
      xlabel='Percentage', ylabel='Class')
ax.boxplot([x, y, z], labels=['A', 'B', 'C'], vert=False, notch=True, bootstrap=10000)

# Practice

In [None]:
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.gridspec as gridspec
#Write your code here
def test_barplot_of_iris_sepal_length():
    import matplotlib.pyplot as plt
    import numpy as np
    fig = plt.figure(figsize = (8,6))
    ax = fig.add_subplot(111)
    species = ['setosa', 'versicolor', 'virginica']
    index = [0.2, 1.2, 2.2]
    sepal_len = [5.01, 5.94, 6.59]
    ax.set(title = "Mean Sepal Length of Iris Species",
    xlabel = "Species" ,ylabel = "Sepal Length (cm)",
    xlim = (0,3),ylim = (0,7))
    ax.bar(index,sepal_len,width = 0.5,color = "red",edgecolor = "black",label = "Sepal Length")
    plt.xticks([0.45,1.45,2.45])
    ax.set_xticklabels(['setosa', 'versicolor', 'virginica'])
    ax.legend()
    plt.savefig("bar_iris_sepal.png")
test_barplot_of_iris_sepal_length()
def test_barplot_of_iris_measurements():
    import matplotlib.pyplot as plt
    import numpy as np
    fig = plt.figure(figsize = (8,6))
    ax = fig.add_subplot(111)
    sepal_len = [5.01, 5.94, 6.59]
    sepal_wd = [3.42, 2.77, 2.97]
    petal_len = [1.46, 4.26, 5.55]
    petal_wd = [0.24, 1.33, 2.03]
    species = ['setosa', 'versicolor', 'virginica']
    species_index1 = [0.7, 1.7, 2.7]
    species_index2 = [0.9, 1.9, 2.9]
    species_index3 = [1.1, 2.1, 3.1]
    species_index4 = [1.3, 2.3, 3.3]
    ax.set(title = "Mean Measurements of Iris Species",xlabel = "Species",ylabel = "Iris Measurements (cm)",
           xlim = (0.5,3.7),ylim = (0,10))
    ax.bar(species_index1, sepal_len, color='c', width=0.2, edgecolor='black', label='Sepal Length')
    ax.bar(species_index2, sepal_wd, color='m', width=0.2, edgecolor='black', label='Sepal Width')
    ax.bar(species_index3, petal_len, color='y', width=0.2, edgecolor='black', label='Petal Length')
    ax.bar(species_index4, petal_wd, color='orange', width=0.2, edgecolor='black', label='Petal Width')
    ax.set_xticks([1.1,2.1,3.1])
    ax.set_xticklabels(['setosa', 'versicolor','virginica'])
    ax.legend()
    plt.savefig("bar_iris_measure.png")
test_barplot_of_iris_measurements()
def test_hbarplot_of_iris_petal_length():
    import numpy as np
    import matplotlib.pyplot as plt
    fig = plt.figure(figsize = (12,5))
    ax = fig.add_subplot(111)
    species = ['setosa', 'versicolor', 'virginica']
    index = [0.2, 1.2, 2.2]
    petal_len = [1.46, 4.26, 5.55]
    ax.set(title = "Mean Petal Length of Iris Species",ylabel = "Species",xlabel = "Petal Length (cm)")
    ax.barh(index, petal_len, color='c', height = 0.5, edgecolor='black', label='Petal Length')
    ax.set_yticks([0.45, 1.45,2.45])
    ax.set_yticklabels(['setosa', 'versicolor','virginica'])
    #ax.legend()
    plt.savefig("bar_iris_petal.png")
test_hbarplot_of_iris_petal_length()
    

In [None]:
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.gridspec as gridspec
#Write your code here
def test_hist_of_a_sample_normal_distribution():
    import numpy as np
    fig = plt.figure(figsize = (8,6))
    ax = fig.add_subplot(111)
    np.random.seed(100)
    x1 = 25+3*np.random.randn(1000)
    ax.set(title = "Histogram of a Single Dataset",xlabel = "x1",ylabel = "Bin Count")
    ax.hist(x1,bins = 30)
    plt.savefig("histogram_normal.png")
test_hist_of_a_sample_normal_distribution()
# Renamed so that it no longer shadows the histogram test defined above.
def test_boxplot_of_multiple_normal_distributions():
    fig = plt.figure(figsize = (8,6))
    ax = fig.add_subplot(111)
    np.random.seed(100)
    x1 = 25+3.0*np.random.randn(1000)
    x2 = 35+5.0*np.random.randn(1000)
    x3 = 55+10.0*np.random.randn(1000)
    x4 = 45+3.0*np.random.randn(1000)
    ax.set(title = "Box plot of Multiple Datasets",xlabel = "Dataset",ylabel = "Value")
    # patch_artist expects a boolean, not the string "True".
    ax.boxplot([x1, x2, x3, x4], labels=['X1', 'X2', 'X3', 'X4'], notch=True,patch_artist = True,sym="+")
    plt.savefig("box_distribution.png")
test_boxplot_of_multiple_normal_distributions()

    


# Matplotlib Styles
* matplotlib.pyplot comes with a lot of styles. Based on the chosen style, the display of figure changes.

* You can view various styles available in pyplot by running the following commands.

In [None]:
import matplotlib.pyplot as plt
print(plt.style.available)
Output
['seaborn-darkgrid', 'fivethirtyeight', ...]

# Using a Style
* A specific style can be invoked with either of the two expressions shown below.
* Using the latter expression with the with keyword is recommended, because the style then applies only inside the with block.

In [None]:
plt.style.use('ggplot')
# or

plt.style.context('ggplot')

# Using a Style
* The example below uses the ggplot style.

In [None]:
with plt.style.context('ggplot'):
    fig = plt.figure(figsize=(8,6))
    ax = fig.add_subplot(111)
    ax.set(title='Avg. Daily Temperature of Jan 2018',
      xlabel='Day', ylabel='Temperature (in deg)',
      xlim=(0, 30), ylim=(25, 35))
    days = [1, 5, 8, 12, 15, 19, 22, 26, 29]
    temp = [29.3, 30.1, 30.4, 31.5, 32.3, 32.6, 31.8, 32.4, 32.7]
    ax.plot(days, temp, color='green', linestyle='--', linewidth=3)
    plt.show()
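
The reason for preferring the with form can be demonstrated by inspecting a setting before, inside, and after the block. The sketch below looks at the 'axes.grid' setting, which the ggplot style switches on; the choice of this particular key and the `Agg` backend are assumptions made for illustration.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend
import matplotlib.pyplot as plt

# Value of the setting before any style is applied.
before = plt.rcParams['axes.grid']

with plt.style.context('ggplot'):
    # Inside the block the ggplot settings are active.
    inside = plt.rcParams['axes.grid']

# After the block the previous settings are restored.
after = plt.rcParams['axes.grid']

print(before, inside, after)
```

In contrast, plt.style.use('ggplot') keeps the style active for the rest of the session.
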

# Composing Styles
* Multiple style sheets can be used together in matplotlib.

* This provides the flexibility to compose two style sheets, such as one for customizing colors and another for customizing element sizes.

In [None]:
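# Note: matplotlib 3.6 and later name the seaborn styles with a prefix, e.g. 'seaborn-v0_8-poster'.
with plt.style.context(['dark_background', 'seaborn-poster']):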
   ....
   ....

# Creating a Custom Style
* A style sheet is a text file having extension .mplstyle.
* All custom style sheets are placed in a folder, stylelib, present in the config directory of matplotlib.

* Use the expression below to find the config folder.

In [None]:
import matplotlib
print(matplotlib.get_configdir())

# Creating a Custom Style
* Now, create a file mystyle.mplstyle with the contents shown below and save it in the folder <matplotlib_configdir>/stylelib/.

In [None]:
axes.titlesize : 24
axes.labelsize : 20
lines.linewidth : 8
lines.markersize : 10
xtick.labelsize : 16
ytick.labelsize : 16

### Reload the style library with the following expression.

In [None]:
matplotlib.style.reload_library()
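
A style sheet can also be tried out without touching the config directory, because a style may be referenced by the path to its file. The sketch below writes the same settings to a temporary folder and applies them by path. The temporary location is an assumption made so the example is self-contained; it is not the stylelib workflow described above, and the `Agg` backend is assumed.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend
import matplotlib.pyplot as plt

# Same settings as the mystyle.mplstyle file shown above.
style_text = """\
axes.titlesize : 24
axes.labelsize : 20
lines.linewidth : 8
lines.markersize : 10
xtick.labelsize : 16
ytick.labelsize : 16
"""

with tempfile.TemporaryDirectory() as style_dir:
    # Write the style sheet to a temporary file (illustrative location).
    style_path = os.path.join(style_dir, 'mystyle.mplstyle')
    with open(style_path, 'w') as f:
        f.write(style_text)

    # A style can be referenced by file path instead of by name.
    with plt.style.context(style_path):
        fig = plt.figure()
        ax = fig.add_subplot(111)
        (line,) = ax.plot([1, 2, 3], [1, 4, 9])
        applied_width = line.get_linewidth()

print(applied_width)
```
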

# Using a Custom Style
* After reloading the style library, a custom style can be used just like the built-in styles.
* The code snippet below uses mystyle along with dark_background.

In [None]:
with plt.style.context(['dark_background', 'mystyle']):
   ....
   ....

# matplotlibrc file
* matplotlib uses all the settings specified in matplotlibrc file.
* These settings are known as rc settings or rc parameters.
* For customization, rc settings can be altered in the file or interactively.
* The location of the active matplotlibrc file can be found with the expression below.

In [None]:
import matplotlib
matplotlib.matplotlib_fname()

# Matplotlib rcParams
* All rc settings, present in matplotlibrc file are stored in a dictionary named matplotlib.rcParams.

* Any settings can be changed by editing values of this dictionary.

* For example, if you want to change linewidth and color, the following expressions can be used.

In [None]:
import matplotlib as mpl
mpl.rcParams['lines.linewidth'] = 2
mpl.rcParams['lines.color'] = 'r'
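
Editing rcParams directly changes the settings for the rest of the session. When a change should apply only to one plot, the rc_context manager is an option. The sketch below is a minimal illustration; the chosen settings and the `Agg` backend are assumptions made for this example.

```python
import matplotlib as mpl
mpl.use('Agg')  # assumption: non-interactive backend
import matplotlib.pyplot as plt

default_width = mpl.rcParams['lines.linewidth']

# Settings passed to rc_context apply only inside the with block.
with mpl.rc_context({'lines.linewidth': 2, 'lines.linestyle': '--'}):
    fig = plt.figure()
    ax = fig.add_subplot(111)
    (line,) = ax.plot([1, 2, 3], [2, 4, 6])

# The line picked up the temporary settings when it was created.
print(line.get_linewidth(), line.get_linestyle())

# Outside the block the original setting is back in place.
print(mpl.rcParams['lines.linewidth'] == default_width)
```
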

# Practice

In [None]:
#implement different types of graphs available in python
#bargraph
import matplotlib.pyplot as plt 
def test_generate_plot_with_style1():

    # Write your functionality below
    fig = plt.figure(figsize=(8,6))
    ax = fig.add_subplot(111)
    
    sepal_len = [5.01, 5.94, 6.59]
    sepal_wd = [3.42, 2.77, 2.97]
    petal_len = [1.46, 4.26, 5.55]
    petal_wd = [0.24, 1.33, 2.03]
    species = ['setosa', 'versicolor', 'virginica']
    species_index1 = [0.7, 1.7, 2.7]
    species_index2 = [0.9, 1.9, 2.9]
    species_index3 = [1.1, 2.1, 3.1]
    species_index4 = [1.3, 2.3, 3.3]
    ax.bar(species_index1, sepal_len,width=0.2,color='c',edgecolor='black',label='Sepal Length')
    ax.bar(species_index2, sepal_wd,width=0.2,color='m',edgecolor='black',label='Sepal Width')
    ax.bar(species_index3, petal_len,width=0.2,color='y',edgecolor='black',label='Petal Length')
    ax.bar(species_index4, petal_wd,width=0.2,color='orange',edgecolor='black',label='Petal Width')
    
    # The leftover single-series red bars from the earlier exercise are dropped here,
    # because they were drawn over the grouped bars.
    ax.set(title='Mean Measurements of Iris Species',
    xlabel='Species', ylabel='Iris Measurements (cm)',xlim=(0.5, 3.7), ylim=(0,10))
    ax.set_xticks([1.1, 2.1, 3.1])
    ax.set_xticklabels(['setosa', 'versicolor', 'virginica'])
    ax.legend()
    plt.savefig("plotstyle1.png")
test_generate_plot_with_style1()
def test_generate_plot_with_style2():
    # Note: matplotlib 3.6 and later name this style 'seaborn-v0_8-colorblind'.
    with plt.style.context([ 'seaborn-colorblind']):
        fig = plt.figure(figsize=(8,6))
        ax = fig.add_subplot(111)
        
        sepal_len = [5.01, 5.94, 6.59]
        sepal_wd = [3.42, 2.77, 2.97]
        petal_len = [1.46, 4.26, 5.55]
        petal_wd = [0.24, 1.33, 2.03]
        species = ['setosa', 'versicolor', 'virginica']
        species_index1 = [0.7, 1.7, 2.7]
        species_index2 = [0.9, 1.9, 2.9]
        species_index3 = [1.1, 2.1, 3.1]
        species_index4 = [1.3, 2.3, 3.3]
        ax.bar(species_index1, sepal_len,width=0.2,color='c',edgecolor='black',label='Sepal Length')
        ax.bar(species_index2, sepal_wd,width=0.2,color='m',edgecolor='black',label='Sepal Width')
        ax.bar(species_index3, petal_len,width=0.2,color='y',edgecolor='black',label='Petal Length')
        ax.bar(species_index4, petal_wd,width=0.2,color='orange',edgecolor='black',label='Petal Width')
        
        # The leftover single-series red bars from the earlier exercise are dropped here,
        # because they were drawn over the grouped bars.
        ax.set(title='Mean Measurements of Iris Species',
        xlabel='Species', ylabel='Iris Measurements (cm)',xlim=(0.5, 3.7), ylim=(0,10))
        ax.set_xticks([1.1, 2.1, 3.1])
        ax.set_xticklabels(['setosa', 'versicolor', 'virginica'])
        ax.legend()
        plt.savefig("plotstyle2.png")
test_generate_plot_with_style2()
    # Write your functionality below
def test_generate_plot_with_style3():
    with plt.style.context([ 'grayscale']):
        fig = plt.figure(figsize=(8,6))
        ax = fig.add_subplot(111)
        
        sepal_len = [5.01, 5.94, 6.59]
        sepal_wd = [3.42, 2.77, 2.97]
        petal_len = [1.46, 4.26, 5.55]
        petal_wd = [0.24, 1.33, 2.03]
        species = ['setosa', 'versicolor', 'virginica']
        species_index1 = [0.7, 1.7, 2.7]
        species_index2 = [0.9, 1.9, 2.9]
        species_index3 = [1.1, 2.1, 3.1]
        species_index4 = [1.3, 2.3, 3.3]
        ax.bar(species_index1, sepal_len,width=0.2,color='c',edgecolor='black',label='Sepal Length')
        ax.bar(species_index2, sepal_wd,width=0.2,color='m',edgecolor='black',label='Sepal Width')
        ax.bar(species_index3, petal_len,width=0.2,color='y',edgecolor='black',label='Petal Length')
        ax.bar(species_index4, petal_wd,width=0.2,color='orange',edgecolor='black',label='Petal Width')
        
        ax.set(title='Mean Measurements of Iris Species',
               xlabel='Species', ylabel='Iris Measurements (cm)',
               xlim=(0.5, 3.7), ylim=(0, 10))

        # Place one tick under each species group and label it.
        quarters = [1.1, 2.1, 3.1]
        ax.set_xticks(quarters)
        ax.set_xticklabels(['setosa', 'versicolor', 'virginica'])
        ax.legend()
        plt.savefig("plotstyle3.png")
test_generate_plot_with_style3()



# Creating Subplots
## Till now, you have seen how to create a single plot in a figure.

* In this topic, you will see how to create multiple plots in a single figure.

* subplot is one of the functions used to create subplots.

### Syntax

In [None]:
subplot(nrows, ncols, index)
# 'index' is the position in a virtual grid with 'nrows' and 'ncols'
# 'index' number varies from 1 to `nrows*ncols`.
* subplot creates an Axes object at the index position and returns it.

# Example of Using 'subplot'

In [None]:
fig = plt.figure(figsize=(10,8))
axes1 = plt.subplot(2, 2, 1, title='Plot1')
axes2 = plt.subplot(2, 2, 2, title='Plot2')
axes3 = plt.subplot(2, 2, 3, title='Plot3')
axes4 = plt.subplot(2, 2, 4, title='Plot4')
plt.show()

* The above shown code creates a figure with four subplots, having two rows and two columns.
* The third argument, index, varies from 1 to 4, and the respective subplots are drawn in row-major order: left to right along the first row, then along the second.
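
The row-major numbering described above can be checked with a small sketch. It assumes a 2 x 3 grid (chosen only for illustration) and reads each subplot's grid position back from its `SubplotSpec`; the `Agg` backend is selected so the code runs without a display. It also shows the three-digit shorthand, where `subplot(231)` is equivalent to `subplot(2, 3, 1)`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

nrows, ncols = 2, 3  # illustrative grid size, not taken from the example above
fig = plt.figure(figsize=(9, 6))

positions = {}
for index in range(1, nrows * ncols + 1):
    ax = plt.subplot(nrows, ncols, index, title=f"index={index}")
    spec = ax.get_subplotspec()
    # rowspan/colspan are ranges over the zero-based grid rows/columns covered
    positions[index] = (spec.rowspan.start, spec.colspan.start)

# Row-major order: indices 1..ncols fill row 0, the next ncols fill row 1
for index, (row, col) in positions.items():
    assert (row, col) == divmod(index - 1, ncols)

# Three-digit shorthand: 231 means nrows=2, ncols=3, index=1
shorthand_spec = plt.subplot(231).get_subplotspec()
shorthand_position = (shorthand_spec.rowspan.start, shorthand_spec.colspan.start)
```

The shorthand only works when `nrows`, `ncols`, and `index` are all single digits.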

# Example 2 of 'subplot'
* Now let's try to create a figure with three subplots, where the first subplot spans all columns of the first row.

In [None]:
fig = plt.figure(figsize=(10,8))
axes1 = plt.subplot(2, 2, (1,2), title='Plot1')
axes1.set_xticks([]); axes1.set_yticks([])
axes2 = plt.subplot(2, 2, 3, title='Plot2')
axes2.set_xticks([]); axes2.set_yticks([])
axes3 = plt.subplot(2, 2, 4, title='Plot3')
axes3.set_xticks([]); axes3.set_yticks([])
plt.show()

* The above code also removes all ticks from the x and y axes by passing empty lists to set_xticks and set_yticks.
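
As a possible shortening of the example above, the ticks can be cleared in one loop over `fig.axes`, which lists the Axes of a figure in creation order. This is only a sketch of an alternative style; the `Agg` backend and the `tick_counts` variable are additions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(10, 8))
plt.subplot(2, 2, (1, 2), title='Plot1')  # spans both columns of the first row
plt.subplot(2, 2, 3, title='Plot2')
plt.subplot(2, 2, 4, title='Plot3')

# Clear the ticks of every Axes in the figure in a single pass
for ax in fig.axes:
    ax.set(xticks=[], yticks=[])

# Number of remaining x and y ticks for each subplot (illustrative check)
tick_counts = [(len(ax.get_xticks()), len(ax.get_yticks())) for ax in fig.axes]
```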

# Subplots Using 'GridSpec'
* GridSpec class of matplotlib.gridspec can also be used to create Subplots.
* Initially, a grid with given number of rows and columns is set up.
* Later, while creating a subplot, the rows and columns of the grid spanned by that subplot are selected by indexing or slicing the GridSpec object, and the result is passed to the subplot function.
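
A minimal sketch of how slicing a GridSpec selects grid cells is given below. The names `grid`, `top_row`, `bottom_right`, and the helper `cells` are hypothetical and used only for illustration; the cells are read from the `rowspan` and `colspan` ranges of the resulting `SubplotSpec`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

fig = plt.figure(figsize=(6, 4))
grid = gridspec.GridSpec(2, 2)  # 2 rows x 2 columns

top_row = grid[0, :]        # row 0, all columns
bottom_right = grid[1, -1]  # row 1, last column

def cells(spec):
    """Return the zero-based (row, col) cells covered by a SubplotSpec."""
    return [(r, c) for r in spec.rowspan for c in spec.colspan]

top_cells = cells(top_row)
bottom_right_cells = cells(bottom_right)

# The selected portion is passed to subplot to create the Axes
ax = plt.subplot(top_row)
```

GridSpec indexing follows the usual zero-based Python conventions, unlike the one-based index argument of subplot.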

# Example of Using Gridspec
* The below example recreates the previous figure using GridSpec.
* A GridSpec object, gd is created with two rows and two columns.
* Then a selected grid portion is passed as an argument to subplot.

In [None]:
import matplotlib.gridspec as gridspec
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(10,8))
gd = gridspec.GridSpec(2,2)
axes1 = plt.subplot(gd[0,:],title='Plot1')
axes1.set_xticks([]); axes1.set_yticks([])
axes2 = plt.subplot(gd[1,0], title='Plot2')
axes2.set_xticks([]); axes2.set_yticks([])
axes3 = plt.subplot(gd[1,-1], title='Plot3')
axes3.set_xticks([]); axes3.set_yticks([])
plt.show()

# Creating a Complex Layout
* The below example creates a complex layout.
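* It generates four plots on a 3 x 3 grid: Plot1 spans the first row, Plot2 spans the first two columns of the second row, Plot3 spans the last column of the second and third rows, and Plot4 spans the first two columns of the third row.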

In [None]:
fig = plt.figure(figsize=(12,10))
gd = gridspec.GridSpec(3,3)
axes1 = plt.subplot(gd[0,:],title='Plot1')
axes1.set_xticks([]); axes1.set_yticks([])
axes2 = plt.subplot(gd[1,:-1], title='Plot2')
axes2.set_xticks([]); axes2.set_yticks([])
axes3 = plt.subplot(gd[1:, 2], title='Plot3')
axes3.set_xticks([]); axes3.set_yticks([])
axes4 = plt.subplot(gd[2, :-1], title='Plot4')
axes4.set_xticks([]); axes4.set_yticks([])
plt.show()
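
When a layout becomes this involved, it can help to confirm that the slices tile the grid without gaps or overlaps. The sketch below is one way to do that, assuming the same four slices as the example above; the `layout` dictionary and the `coverage` array are illustrative names.

```python
import numpy as np
import matplotlib.gridspec as gridspec

gd = gridspec.GridSpec(3, 3)

# The same four grid portions used in the complex layout
layout = {
    'Plot1': gd[0, :],
    'Plot2': gd[1, :-1],
    'Plot3': gd[1:, 2],
    'Plot4': gd[2, :-1],
}

# Count how many subplots claim each cell of the 3 x 3 grid
coverage = np.zeros((3, 3), dtype=int)
for name, spec in layout.items():
    rows = slice(spec.rowspan.start, spec.rowspan.stop)
    cols = slice(spec.colspan.start, spec.colspan.stop)
    coverage[rows, cols] += 1

print(coverage)
```

A cell with a count of 0 would be left empty in the figure, and a count above 1 would mean two subplots overlap there.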

# Practice

In [None]:
import matplotlib
#matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.gridspec as gridspec
#Write your code here

def test_generate_figure1():
  t = np.arange(0.0, 5.0, 0.01)
  s1 = np.sin(2*np.pi*t)
  s2 = np.sin(4*np.pi*t)
  fig = plt.figure(figsize=(8,6))
  axes1 = plt.subplot(211)
  axes1.set_title('Sin(2*pi*x)')
  axes1.plot(t,s1)
  axes2 = plt.subplot(212,sharex=axes1,sharey=axes1)
  axes2.set_title('Sin(4*pi*x)')
  axes2.plot(t,s2)
  # Save only after both curves are drawn, so the image contains both plots.
  fig.savefig('testfigure1.png')
  plt.show()
test_generate_figure1()

def test_generate_figure2():
  np.random.seed(1000)
  x = np.random.rand(10)
  y = np.random.rand(10)
  z = np.sqrt(x**2 + y**2)
  fig = plt.figure(figsize=(8,6))
  axes1 = plt.subplot(221)
  axes1.set_title('Scatter plot with Upper Triangle Markers')
  axes1.scatter(x,y,s=[80],c=z,marker='^')
  axes1.set(xticks=[0.0, 0.4, 0.8, 1.2],yticks=[-0.2, 0.2, 0.6, 1.0 ])
  axes2 = plt.subplot(222)
  axes2.set_title('Scatter plot with Plus Markers')
  axes2.scatter(x,y,s=[80],c=z,marker='+')
  axes2.set(xticks=[0.0, 0.4, 0.8, 1.2],yticks=[-0.2, 0.2, 0.6, 1.0 ])
  axes3 = plt.subplot(223)
  axes3.set_title('Scatter plot with Circle Markers')
  axes3.scatter(x,y,s=[80],c=z,marker='o')
  axes3.set(xticks=[0.0, 0.4, 0.8, 1.2],yticks=[-0.2, 0.2, 0.6, 1.0 ])
  axes4 = plt.subplot(224)
  axes4.set_title('Scatter plot with Diamond Markers')
  axes4.scatter(x,y,s=[80],c=z,marker='d')
  axes4.set(xticks=[0.0, 0.4, 0.8, 1.2],yticks=[-0.2, 0.2, 0.6, 1.0 ])
  plt.tight_layout()
  fig.savefig('testfigure2.png')
  plt.show()
test_generate_figure2()

def test_generate_figure3():
  x = np.arange(1,101)
  y1 = x
  y2 = x**2
  y3 = x**3
  fig = plt.figure(figsize=(8,6))
  g = gridspec.GridSpec(2,2)
  axes1 = plt.subplot(g[0,0])
  axes1.set(title="y = x")
  axes1.plot(x,y1)
  axes2 = plt.subplot(g[1,0])
  axes2.set(title="y = x**2")
  axes2.plot(x,y2)
  axes3 = plt.subplot(g[:,1])
  axes3.set(title="y = x**3")
  axes3.plot(x,y3)
  plt.tight_layout()
  fig.savefig('testfigure3.png')
  plt.show()
test_generate_figure3()  