# Required Imports for This Notebook
- **Pandas**: Used for data handling, exploration, and manipulation.
- **Scikit-learn**: Provides tools for loading datasets and applying machine learning techniques.

In [19]:
import pandas as pd
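import sklearn as skl
import sklearn.datasets  # load the submodule explicitly so skl.datasets resolves on every scikit-learn version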

# Task 1: Source the Data Set

## Importing the Iris Dataset
We will import the Iris dataset from the `sklearn.datasets` module using the `load_iris()` function.

### Understanding `load_iris()`
- The `load_iris()` function returns a dictionary-like object called a **Bunch**.
- The **Bunch** contains attributes that allow access to both the data and metadata of the dataset.
- The dataset consists of four **numerical features** (sepal length, sepal width, petal length, petal width) and three **target classes** representing the species (setosa, versicolor, virginica).
- Full details are available in the [Scikit-learn documentation](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_iris.html).
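As a rough illustration of the points above, the following sketch inspects the returned Bunch directly. The variable names `iris` and `iris_frame` are chosen here for illustration and are not part of the notebook; the `as_frame=True` option assumes scikit-learn 0.23 or later.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# A Bunch supports both dictionary-style and attribute-style access.
print(sorted(iris.keys()))
print(iris["feature_names"] == iris.feature_names)

# The feature matrix has one row per sample and one column per feature;
# the target array holds one integer class label per sample.
print(iris.data.shape)    # (150, 4)
print(iris.target.shape)  # (150,)

# Optional alternative (assumes scikit-learn >= 0.23): request pandas objects
# directly. The resulting frame holds the four features plus a "target" column.
iris_frame = load_iris(as_frame=True).frame
print(iris_frame.shape)   # (150, 5)
```

Whether to build the DataFrame by hand or use `as_frame=True` is largely a matter of preference; building it by hand makes each step explicit.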


In [20]:
# Load the iris dataset
iris_dataset = skl.datasets.load_iris()

# Task 2: Explore the Data Structure

In this task, we examined the structure of the Iris dataset by performing the following steps:

- **Printed the shape of the dataset** to determine the number of samples (rows) and columns (the four features plus the added `target` column).
- **Displayed the first 5 rows** to get an initial view of the data.
- **Displayed the last 5 rows** to check the end of the dataset.
- **Listed the feature names** to understand the measured attributes (sepal and petal dimensions).
- **Listed the target class names** to identify the species classifications.

In [21]:
# Convert the dataset into a Pandas DataFrame
iris_dataframe = pd.DataFrame(iris_dataset.data, columns=iris_dataset.feature_names)

# Add the target column to the DataFrame
iris_dataframe["target"] = iris_dataset.target

# 1. Print the shape of the dataset
print("Shape of the dataset:", iris_dataframe.shape)

# 2. Print the first 5 rows of the dataset
print("First 5 rows of the dataset:")
display(iris_dataframe.head())  # Use display() in Jupyter for better formatting

# 3. Print the last 5 rows of the dataset
print("Last 5 rows of the dataset:")
display(iris_dataframe.tail())

# 4. Print the feature names (column names)
print("Feature Names:", iris_dataset.feature_names)

# 5. Print the target class names (species)
print("Target Classes:", iris_dataset.target_names)

Shape of the dataset: (150, 5)
First 5 rows of the dataset:


|   | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | target |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | 0 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | 0 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | 0 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | 0 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | 0 |


Last 5 rows of the dataset:


|   | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | target |
|---|---|---|---|---|---|
| 145 | 6.7 | 3.0 | 5.2 | 2.3 | 2 |
| 146 | 6.3 | 2.5 | 5.0 | 1.9 | 2 |
| 147 | 6.5 | 3.0 | 5.2 | 2.0 | 2 |
| 148 | 6.2 | 3.4 | 5.4 | 2.3 | 2 |
| 149 | 5.9 | 3.0 | 5.1 | 1.8 | 2 |


Feature Names: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
Target Classes: ['setosa' 'versicolor' 'virginica']
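The `target` column stores integer codes (0, 1, 2) that index into the target class names printed above. One possible way to make the rows easier to read is to map those codes to species names, as in the sketch below. The `species` column and the variable names `df` and `species_counts` are hypothetical additions for illustration, not part of the notebook.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["target"] = iris.target

# Map each integer code to its class name. "species" is a hypothetical
# column name introduced only for this example.
df["species"] = pd.Categorical.from_codes(iris.target, categories=iris.target_names)

# Count the samples per species to check how the classes are distributed.
species_counts = df["species"].value_counts()
print(species_counts)

# Compare the code and the name for the first and last rows.
print(df[["target", "species"]].iloc[[0, -1]])
```

Keeping both `target` and `species` preserves the numeric labels for modelling while giving a readable label for exploration.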


# Task 3: The Title Goes here

# Task 4: The Title Goes here