### Import required modules

In [165]:
import pandas as pd

### Load data

The Iris dataset source: Fisher, R. (1936). Iris [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C56C76.

Since the dataset is structured using comma-separated values, we can use the [pandas.read_csv](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html) function to load it for analysis.

In [166]:
# Pass the file name to the pandas read_csv() function
# Specifying the separator is optional here because a comma is the read_csv() default
# The file has no header row, as confirmed by checking the original data source
# Column names are assigned manually based on the iris.names metadata file
iris_df = pd.read_csv("iris.data", sep=',', names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class'])

# Use the pandas head() function to show the first 5 rows and get an idea of the dataset structure
iris_df.head(5)

|   | sepal_length | sepal_width | petal_length | petal_width | class |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |
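
To illustrate how `names` interacts with a file that has no header row, here is a minimal sketch that is not part of the original notebook. It assumes an in-memory `io.StringIO` sample holding the five rows shown above instead of the real `iris.data` file; `sample`, `columns` and `sample_df` are illustrative names.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for iris.data: five data rows, no header row
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "4.7,3.2,1.3,0.2,Iris-setosa\n"
    "4.6,3.1,1.5,0.2,Iris-setosa\n"
    "5.0,3.6,1.4,0.2,Iris-setosa\n"
)
columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class']

# When names= is passed, pandas treats the first line as data rather than a header
sample_df = pd.read_csv(sample, names=columns)

# Check that no row was consumed as a header and that measurements parsed as numbers
print(sample_df.shape)
print(sample_df.dtypes)
```

Checking `shape` and `dtypes` straight after loading is a cheap way to confirm that the first data row was not swallowed as a header and that the four measurement columns were read as floats.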


### Features Summary

Since the dataset contains four numerical features and one categorical feature (`class`), we analyse each feature separately using the appropriate statistical measures.

For the numerical features, we summarise the data using [pandas.DataFrame.mean](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.mean.html) and related pandas methods, calculating the mean, median, standard deviation, minimum and maximum values.

For the categorical feature, we identify the number of unique classes and the count of values in each class.
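
As a compact cross-check of the per-feature cells in this section, the same statistics can be computed in one call. The sketch below is an alternative, not the notebook's method: it assumes a five-row sample (`sample_df`, built from the rows shown in the `head` output) and uses `DataFrame.agg`, `Series.nunique` and `Series.value_counts`; `numeric_cols`, `numeric_summary`, `n_classes` and `class_counts` are illustrative names.

```python
import pandas as pd

# Hypothetical five-row sample; the notebook's iris_df would be used in practice
sample_df = pd.DataFrame({
    'sepal_length': [5.1, 4.9, 4.7, 4.6, 5.0],
    'sepal_width': [3.5, 3.0, 3.2, 3.1, 3.6],
    'petal_length': [1.4, 1.4, 1.3, 1.5, 1.4],
    'petal_width': [0.2, 0.2, 0.2, 0.2, 0.2],
    'class': ['Iris-setosa'] * 5,
})

numeric_cols = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']

# One row per statistic, one column per numerical feature
numeric_summary = sample_df[numeric_cols].agg(['mean', 'median', 'std', 'min', 'max'])

# Number of distinct classes and the number of rows in each class
n_classes = sample_df['class'].nunique()
class_counts = sample_df['class'].value_counts()

print(numeric_summary)
print(n_classes)
print(class_counts)
```

The resulting table has one row per statistic and one column per feature, which makes it easy to compare against the individual values computed cell by cell.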

In [167]:
# Sepal length - Statistics Summary
# Use min_val/max_val so the built-in min() and max() functions are not shadowed
mean = iris_df['sepal_length'].mean()
median = iris_df['sepal_length'].median()
st_dev = iris_df['sepal_length'].std()
min_val = iris_df['sepal_length'].min()
max_val = iris_df['sepal_length'].max()

sepal_len_txt = (f'Sepal length\n\n'
                 f'Mean:\t\t{mean}\n'
                 f'Median:\t\t{median}\n'
                 f'Std Dev:\t{st_dev}\n'
                 f'Min:\t\t{min_val}\n'
                 f'Max:\t\t{max_val}')

In [168]:
# Sepal width - Statistics Summary
# Use min_val/max_val so the built-in min() and max() functions are not shadowed
mean = iris_df['sepal_width'].mean()
median = iris_df['sepal_width'].median()
st_dev = iris_df['sepal_width'].std()
min_val = iris_df['sepal_width'].min()
max_val = iris_df['sepal_width'].max()

sepal_wid_txt = (f'Sepal width\n\n'
                 f'Mean:\t\t{mean}\n'
                 f'Median:\t\t{median}\n'
                 f'Std Dev:\t{st_dev}\n'
                 f'Min:\t\t{min_val}\n'
                 f'Max:\t\t{max_val}')

In [169]:
# Petal length - Statistics Summary
# Use min_val/max_val so the built-in min() and max() functions are not shadowed
mean = iris_df['petal_length'].mean()
median = iris_df['petal_length'].median()
st_dev = iris_df['petal_length'].std()
min_val = iris_df['petal_length'].min()
max_val = iris_df['petal_length'].max()

petal_len_txt = (f'Petal length\n\n'
                 f'Mean:\t\t{mean}\n'
                 f'Median:\t\t{median}\n'
                 f'Std Dev:\t{st_dev}\n'
                 f'Min:\t\t{min_val}\n'
                 f'Max:\t\t{max_val}')

In [170]:
# Petal width - Statistics Summary
# Use min_val/max_val so the built-in min() and max() functions are not shadowed
mean = iris_df['petal_width'].mean()
median = iris_df['petal_width'].median()
st_dev = iris_df['petal_width'].std()
min_val = iris_df['petal_width'].min()
max_val = iris_df['petal_width'].max()

petal_wid_txt = (f'Petal width\n\n'
                 f'Mean:\t\t{mean}\n'
                 f'Median:\t\t{median}\n'
                 f'Std Dev:\t{st_dev}\n'
                 f'Min:\t\t{min_val}\n'
                 f'Max:\t\t{max_val}')

In [175]:
# Class - Statistics Summary
unique_class = iris_df['class'].unique()
val_count = iris_df['class'].value_counts()

class_txt = (f'Class\n\n'
                  f'Unique values:\t{unique_class}\n'
                  f'Value counts:\n{val_count}\n')

In [176]:
# Write every feature summary to one text file, separated by blank lines
filename = 'summary.txt'

with open(filename, 'w') as f:
    f.write(sepal_len_txt)
    f.write('\n\n\n')
    f.write(sepal_wid_txt)
    f.write('\n\n\n')
    f.write(petal_len_txt)
    f.write('\n\n\n')
    f.write(petal_wid_txt)
    f.write('\n\n\n')
    f.write(class_txt)
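
The four numerical cells follow the same pattern, so the summary text could also be built with a small helper and a loop. The sketch below is one possible refactoring under stated assumptions, not the notebook's own code: `summarise_numeric` is a hypothetical helper, `sample_df` is a two-column sample taken from the rows shown earlier, values are rounded to two decimals (the notebook writes them unrounded), and the output goes to a temporary directory rather than `summary.txt`.

```python
import os
import tempfile

import pandas as pd


def summarise_numeric(series, title):
    """Hypothetical helper: build the text summary for one numerical column."""
    return (f'{title}\n\n'
            f'Mean:\t\t{series.mean():.2f}\n'
            f'Median:\t\t{series.median():.2f}\n'
            f'Std Dev:\t{series.std():.2f}\n'
            f'Min:\t\t{series.min():.2f}\n'
            f'Max:\t\t{series.max():.2f}')


# Illustrative two-column sample; the notebook's iris_df would be used in practice
sample_df = pd.DataFrame({
    'sepal_length': [5.1, 4.9, 4.7, 4.6, 5.0],
    'sepal_width': [3.5, 3.0, 3.2, 3.1, 3.6],
})

# Map each column name to the title used in its summary block
titles = {'sepal_length': 'Sepal length', 'sepal_width': 'Sepal width'}
sections = [summarise_numeric(sample_df[col], title) for col, title in titles.items()]

# Write to a temporary directory so the sketch does not overwrite summary.txt
out_path = os.path.join(tempfile.mkdtemp(), 'summary_sketch.txt')
with open(out_path, 'w') as f:
    f.write('\n\n\n'.join(sections))

# Read the file back to confirm what was written
with open(out_path) as f:
    content = f.read()
print(content)
```

Joining the sections with `'\n\n\n'` reproduces the separator used in the cell above while avoiding one `write` call per feature.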

