# Fisher’s Iris data set 

The Iris flower data set, also known as Fisher’s Iris data set, was published by the British statistician and biologist Ronald Fisher in 1936 ([Wikipedia](https://en.wikipedia.org/wiki/Iris_flower_data_set)).

The data set consists of 150 records, 50 for each of the three Iris species: Iris setosa, Iris versicolor and Iris virginica. Four attributes were measured for each record: the length and the width of the sepals and of the petals, in centimeters.
![title](Images/flowers.png)
(Sources: [1](https://commons.wikimedia.org/wiki/Category:Iris_setosa#/media/File:Irissetosa1.jpg), [2](https://en.wikipedia.org/wiki/Iris_flower_data_set#/media/File:Iris_versicolor_3.jpg), [3](https://en.wikipedia.org/wiki/Iris_flower_data_set#/media/File:Iris_virginica.jpg), Licenses: Public Domain)

<br>

## Saving original data set
***

In [1]:
from urllib.request import urlretrieve
import pandas as pd

# URL of the data file. Idea taken from https://gist.github.com/curran/a08a1080b88344b0c8a7
iris = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
# Save a local copy. The file name 'iris.data' is an arbitrary choice;
# without a file name, urlretrieve only writes to a temporary file.
urlretrieve(iris, 'iris.data')

# Read the local copy into a DataFrame. The file has no header row, so assign column names.
df = pd.read_csv('iris.data', sep=',', names=["sepal_length", "sepal_width", "petal_length", "petal_width", "class"])
# Print the first 10 rows of the DataFrame
print(df.head(10))

   sepal_length  sepal_width  petal_length  petal_width        class
0           5.1          3.5           1.4          0.2  Iris-setosa
1           4.9          3.0           1.4          0.2  Iris-setosa
2           4.7          3.2           1.3          0.2  Iris-setosa
3           4.6          3.1           1.5          0.2  Iris-setosa
4           5.0          3.6           1.4          0.2  Iris-setosa
5           5.4          3.9           1.7          0.4  Iris-setosa
6           4.6          3.4           1.4          0.3  Iris-setosa
7           5.0          3.4           1.5          0.2  Iris-setosa
8           4.4          2.9           1.4          0.2  Iris-setosa
9           4.9          3.1           1.5          0.1  Iris-setosa


In [2]:
#Checking that whole data set has been imported (150 rows)
len(df)

150
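
The row count alone does not confirm that each species contributes the same number of records. A minimal sketch of such a check follows. It runs on a small hand-typed stand-in (two rows per species, in the same comma-separated layout as the data file) so that it works without a download; `SAMPLE`, `sample_df` and `records_per_class` are names introduced here for illustration only.

```python
import io

import pandas as pd

# Hand-typed stand-in in the same layout as iris.data
# (two rows per species; illustrative only, not the full file).
SAMPLE = """5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
"""

COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width", "class"]


def records_per_class(frame):
    """Return a {class label: number of rows} dictionary."""
    return frame["class"].value_counts().to_dict()


# The stand-in has no header row either, so column names are passed explicitly.
sample_df = pd.read_csv(io.StringIO(SAMPLE), names=COLUMNS)
counts = records_per_class(sample_df)
print(counts)
```

Applied to the full DataFrame, `records_per_class(df)` would be expected to report 50 rows for each of the three species.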

<br>

## Looking at data

***
Before we start analysing the data set, let's take a closer look at the data using pandas functionality.

We can use the <span style="color:SlateBlue; font-weight:bold;">columns</span> attribute to show the column labels of the DataFrame. <br>
Next, the <span style="color:SlateBlue; font-weight:bold;">index</span> attribute shows RangeIndex(start, stop, step); this index was assigned automatically when the csv file was read and df was created.<br>
The <span style="color:SlateBlue; font-weight:bold;">ndim</span> attribute shows the number of axes/dimensions of the data set.<br>
The <span style="color:SlateBlue; font-weight:bold;">shape</span> attribute is a tuple holding the number of rows (at index 0) and the number of columns (at index 1) of the data set.<br>
The <span style="color:SlateBlue; font-weight:bold;">size</span> attribute shows the total number of elements in the DataFrame (150 rows x 5 columns = 750).<br>
The <span style="color:SlateBlue; font-weight:bold;">dtypes</span> attribute shows the data type of each column of the DataFrame.

In [3]:
print("The column labels of the iris DataFrame are: ", *df.columns, sep = "   ")
print("The index of the DataFrame is: ", df.index, "\n")
print(f"The iris DataFrame has {df.ndim} dimensions")
print(f"The iris data set has {df.shape[0]} rows and {df.shape[1]} columns")
print(f"The iris DataFrame has {df.size} elements in total", "\n")
print("The data types of iris DataFrame are as follows:")
print(df.dtypes)

The column labels of the iris DataFrame are:    sepal_length   sepal_width   petal_length   petal_width   class
The index of the DataFrame is:  RangeIndex(start=0, stop=150, step=1) 

The iris DataFrame has 2 dimensions
The iris data set has 150 rows and 5 columns
The iris DataFrame has 750 elements in total 

The data types of iris DataFrame are as follows:
sepal_length    float64
sepal_width     float64
petal_length    float64
petal_width     float64
class            object
dtype: object
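
The attributes above are related: `size` is the product of the two entries of `shape`, and `ndim` is the length of `shape`. The following is a minimal sketch of that relationship on a tiny made-up frame; `toy` and `describe_layout` are hypothetical names used only for this illustration, and the values in `toy` are not taken from the iris data.

```python
import pandas as pd


def describe_layout(frame):
    """Collect the basic layout attributes of a DataFrame in a dictionary."""
    rows, cols = frame.shape
    return {"ndim": frame.ndim, "rows": rows, "cols": cols, "size": frame.size}


# Made-up 3 x 2 frame, used only to illustrate the attributes.
toy = pd.DataFrame({"length": [1.0, 2.0, 3.0], "label": ["a", "b", "c"]})

layout = describe_layout(toy)
print(layout)

# size is rows * columns, and ndim is the number of entries in shape
print(layout["size"] == layout["rows"] * layout["cols"])
print(layout["ndim"] == len(toy.shape))
```

For the iris DataFrame the same relationship gives 150 x 5 = 750 elements, matching the output above.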
