# `pandas` - Data Visualization with `iris`

__Contents__:
1. Setup
1. Histogram
1. Scatter plot
1. Parallel coordinates plot

### Reference
- https://pandas.pydata.org/pandas-docs/stable/visualization.html

## 1. Setup

Load libraries.

In [6]:
import pandas as pd
import matplotlib.pyplot as plt
pd.__version__

Peek at the iris data.

In [8]:
%sh head /dbfs/mnt/datalab-datasets/file-samples/iris.csv

Load the iris data from the CSV file and store the resulting dataframe in `iris_df`.

In [10]:
iris_df = pd.read_csv('/dbfs/mnt/datalab-datasets/file-samples/iris.csv')
iris_df.head()
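
The path above only exists on the original cluster, so here is a minimal sketch of the same `pd.read_csv` call that runs anywhere. It reads from an in-memory string instead of a file. The column names `SepalWidth` and `PetalWidth` and the four rows are assumptions made for illustration, not the contents of the actual `iris.csv`.

```python
import io

import pandas as pd

# Hypothetical stand-in for iris.csv: the header and rows below are
# assumptions for illustration, not the contents of the real file.
csv_text = """SepalLength,SepalWidth,PetalLength,PetalWidth,Name
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

# read_csv accepts a file path or any file-like object.
sample_df = pd.read_csv(io.StringIO(csv_text))

# Show the first two rows, as head() does for the real file.
print(sample_df.head(2))
```

Swapping `io.StringIO(csv_text)` for a real path gives the same call as in the cell above.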

Check missing values and data types.

In [12]:
iris_df.info()

The iris data looks fine.
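
For a dataset that is less clean, the same check can be made explicit by counting missing values per column. The sketch below uses a small hypothetical dataframe with one deliberately missing value; it is not the iris data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one deliberately missing measurement.
sample_df = pd.DataFrame({
    "SepalLength": [5.1, 4.9, np.nan, 6.3],
    "PetalLength": [1.4, 1.4, 4.7, 6.0],
    "Name": ["Iris-setosa", "Iris-setosa", "Iris-versicolor", "Iris-virginica"],
})

# Count missing values in each column.
missing_per_column = sample_df.isna().sum()
print(missing_per_column)

# Inspect the inferred data type of each column.
print(sample_df.dtypes)

# One simple remedy: drop every row that contains a missing value.
cleaned_df = sample_df.dropna()
```

Whether dropping rows is appropriate depends on the analysis; it is shown here only as the simplest option.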

## 2. Histogram

Create a frequency histogram of the numeric columns using the `DataFrame.plot.hist()` method. By default the columns are overlaid on shared axes; pass `stacked=True` to stack them. Use `plt.show()` to display the figure.

In [16]:
iris_df.plot.hist()
display(plt.show())
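
As a variation, the histogram can be stacked explicitly and given a fixed number of bins. This is a minimal sketch on a small hypothetical dataframe; the values, the bin count, and the axis label are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical numeric sample, not the real iris measurements.
sample_df = pd.DataFrame({
    "SepalLength": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "PetalLength": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
})

# stacked=True piles the columns on top of each other instead of overlaying them;
# bins controls how many intervals the value range is split into.
ax = sample_df.plot.hist(stacked=True, bins=5, alpha=0.8)
ax.set_xlabel("Measurement (cm)")

plt.savefig("iris_hist_sketch.png")  # hypothetical output file name
```

In a notebook, `plt.show()` would take the place of `plt.savefig(...)`.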

## 3. Scatter plot

Create a scatter plot using the `DataFrame.plot.scatter()` method, with `SepalLength` on the x-axis and `PetalLength` on the y-axis. Set the axis labels using the `plt.xlabel()` and `plt.ylabel()` functions.

In [19]:
iris_df.plot.scatter(x='SepalLength', y='PetalLength')
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Petal Length (cm)')
display(plt.show())
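
The scatter plot can also encode a third variable as color. The sketch below assumes a hypothetical numeric column `SpeciesCode` derived from `Name`, and sets the labels on the returned `Axes` object rather than through `plt`. The data values are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample, not the real iris file.
sample_df = pd.DataFrame({
    "SepalLength": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "PetalLength": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
    "Name": ["Iris-setosa", "Iris-setosa", "Iris-versicolor",
             "Iris-versicolor", "Iris-virginica", "Iris-virginica"],
})

# SpeciesCode is a hypothetical helper column: one integer per species.
sample_df["SpeciesCode"] = sample_df["Name"].astype("category").cat.codes

# c names the column used for color; colormap maps its values to colors.
ax = sample_df.plot.scatter(
    x="SepalLength", y="PetalLength", c="SpeciesCode", colormap="viridis"
)
ax.set_xlabel("Sepal Length (cm)")
ax.set_ylabel("Petal Length (cm)")

plt.savefig("iris_scatter_sketch.png")  # hypothetical output file name
```

Setting labels on `ax` affects only that plot, which helps once a figure holds more than one set of axes.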

## 4. Parallel coordinates plot

Parallel coordinates is a common technique for visualizing multidimensional data (data with many variables), such as the iris dataset. It requires one additional import: `parallel_coordinates` from the `pandas.plotting` module (older pandas releases exposed it under `pandas.tools.plotting`).

- Introduction to parallel coordinates: https://en.wikipedia.org/wiki/Parallel_coordinates

Create a parallel coordinates plot using the `parallel_coordinates` function. The rows of `iris_df` are grouped, and colored, by the value of the `Name` column.

In [23]:

from pandas.plotting import parallel_coordinates
parallel_coordinates(iris_df, 'Name')
plt.xlabel("")
plt.ylabel("")
display(plt.show())
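
To show how the function treats its inputs, here is a minimal self-contained sketch on a small hypothetical dataframe. Each row becomes one line across the numeric columns, and the class column determines the color. The column names `SepalWidth` and `PetalWidth` and all values are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import parallel_coordinates

# Hypothetical sample, not the real iris file.
sample_df = pd.DataFrame({
    "SepalLength": [5.1, 4.9, 7.0, 6.3],
    "SepalWidth": [3.5, 3.0, 3.2, 3.3],
    "PetalLength": [1.4, 1.4, 4.7, 6.0],
    "PetalWidth": [0.2, 0.2, 1.4, 2.5],
    "Name": ["Iris-setosa", "Iris-setosa", "Iris-versicolor", "Iris-virginica"],
})

# The second argument names the class column; every other column becomes
# one vertical axis, and each row is drawn as a line across those axes.
ax = parallel_coordinates(sample_df, "Name", colormap="viridis")

plt.savefig("iris_parallel_sketch.png")  # hypothetical output file name
```

Lines of the same class that stay close together across the axes suggest that the class is well separated on those variables.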

This notebook introduced three basic data visualization techniques (histogram, scatter plot, and parallel coordinates plot) using `pandas` with the iris dataset.