# Tasks

## Task 1: Source the Data Set

### What the load_iris() function returns.

The `load_iris()` function comes from the `sklearn.datasets` module.

When you call `load_iris()`, it returns a `Bunch` object: a dictionary-like container that holds the Iris data set.

Inside this object, you can find:

 - Data (`iris.data`): A table of numbers with 150 rows and 4 columns. Each row is a flower, and the numbers are its measurements in centimetres (sepal length, sepal width, petal length, petal width).

 - Target (`iris.target`): The type of each flower, stored as an integer code (0 = setosa, 1 = versicolor, 2 = virginica).

 - Feature names (`iris.feature_names`): The names of the things we measure (like "sepal width (cm)", "petal length (cm)", etc.).

 - Target names (`iris.target_names`): The names of the flower types.

 - Other info: Some extra details (like the description text in `iris.DESCR`).

In short:
   `load_iris()` gives you everything you need to work with the Iris flower data for learning and practice.
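
To make the list above concrete, here is a minimal sketch that inspects the returned object. It only uses fields of the object that `load_iris()` returns; the variable name `first_code` is ours and is used just for illustration.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# The returned object is a dictionary-like "Bunch": each field can be read
# as an attribute (iris.data) or as a key (iris["data"]).
print(type(iris).__name__)
print(sorted(iris.keys()))

# target stores integer codes; target_names turns a code into a flower name.
first_code = iris.target[0]
print(first_code, "->", iris.target_names[first_code])
```

Reading a field as an attribute or as a key gives the same data, so you can use whichever style you find easier to read.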

### Import the Iris data set from the sklearn.datasets module.

In [None]:
# First, we import the function

from sklearn.datasets import load_iris

# Then, we load the data

iris = load_iris()

# We can see what's inside

print(iris.data)        # Shows the numbers (features)

print(iris.target)      # Shows the flower types (0, 1, 2)

print(iris.feature_names) # Shows the names of each feature

print(iris.target_names)  # Shows the names of flower types



## Task 2: Explore the Data Structure

### Print and explain the shape of the data set, the first and last 5 rows of the data, the feature names, and the target classes.

Here is the Python code for this:

In [None]:
from sklearn.datasets import load_iris

# Load the data
iris = load_iris()

# Shape of the data
print("Shape of data:", iris.data.shape)

# First 5 rows
print("First 5 rows:\n", iris.data[:5])

# Last 5 rows
print("Last 5 rows:\n", iris.data[-5:])

# Feature names
print("Feature names:", iris.feature_names)

# Target classes (flower names)
print("Target classes:", iris.target_names)

Let's explain each part:

 - Shape of data:
 `iris.data.shape` tells us how big the table is.
 The result is (150, 4), which means 150 flowers and 4 features (measurements) for each flower.
 
 - First 5 rows:
 `iris.data[:5]` shows the data for the first 5 flowers.
 Each flower has 4 numbers (sepal length, sepal width, petal length, petal width).

 - Last 5 rows:
 `iris.data[-5:]` shows the data for the last 5 flowers.
 The rows are ordered by flower type, so the first 5 rows are all setosa and the last 5 rows are all virginica.

 - Feature names:
 `iris.feature_names` shows the names of the measurements.
 For example: "sepal length (cm)", "sepal width (cm)", etc.

 - Target classes:
 `iris.target_names` tells us the real names of the flowers.
 For example: "setosa", "versicolor", "virginica".

Short example:
The code prints output like this:

 Shape of data: (150, 4)

 Feature names: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']

 Target classes: ['setosa' 'versicolor' 'virginica']

  - 150 flowers.

  - Each flower has 4 measurements.

  - 3 types of flowers (50 flowers of each type).
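
These numbers can be checked directly. The sketch below is one possible way to do it, assuming NumPy is available (it is installed together with scikit-learn); the variable names `n_samples`, `n_features`, `codes`, and `counts` are ours.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()

# Unpack the shape into the number of flowers and the number of measurements.
n_samples, n_features = iris.data.shape
print("Flowers:", n_samples)
print("Measurements per flower:", n_features)

# Count how many flowers belong to each class code (0, 1, 2).
codes, counts = np.unique(iris.target, return_counts=True)
for code, count in zip(codes, counts):
    print(iris.target_names[code], count)
```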

## Task 3: Summarize the Data

#### To summarize the data, we can use the following Python code:

In [None]:
import pandas as pd
from sklearn.datasets import load_iris

# Load the Iris data
iris = load_iris()

# Create a pandas DataFrame
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)

# Now calculate and display the statistics

print("Mean:\n", df.mean())
print("\nMinimum:\n", df.min())
print("\nMaximum:\n", df.max())
print("\nStandard Deviation:\n", df.std())
print("\nMedian:\n", df.median())

#### Explanation:

 - df.mean() → calculates the average value for each feature.

 - df.min() → finds the smallest number in each column.
  
 - df.max() → finds the biggest number in each column.

 - df.std() → calculates the sample standard deviation of each column (how spread out the numbers are around the mean).

 - df.median() → finds the middle value of each column when its numbers are sorted (unlike the mean, it is not pulled up or down by a few extreme values).
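
The five statistics above can also be collected into a single table. This is a sketch of one alternative, not a replacement for the code above: it uses `DataFrame.agg` for the combined table and adds a `species` column (a name we chose for illustration) to compare the averages of the three flower types.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)

# One table: one row per statistic, one column per feature.
summary = df.agg(["mean", "min", "max", "std", "median"])
print(summary)

# Assumption for illustration: add the flower name of each row as 'species',
# then compute the mean of every feature separately for each flower type.
df["species"] = iris.target_names[iris.target]
per_species = df.groupby("species").mean()
print(per_species)
```

Grouping by flower type is useful because an average over all 150 flowers can hide large differences between the types.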

## Task 4: Visualize Features

#### Plot histograms for each feature using `matplotlib`.

We plot one histogram for each feature (each column in our dataset).

We use Matplotlib, a Python library for drawing plots and charts.

For example:

In [None]:
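import matplotlib.pyplot as plt  # We import Matplotlib
import pandas as pd
from sklearn.datasets import load_iris

# Build the DataFrame 'data' from the Iris data set
iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)
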
# For each column (feature) in your data:
for column in data.columns:
    plt.figure(figsize=(6, 4))  # Create a new figure (size 6x4)
    plt.hist(data[column], bins=20, color='skyblue', edgecolor='black')  # Draw a histogram
    plt.title(f'Histogram of {column}')  # Add a title
    plt.xlabel(column)  # Label the x-axis
    plt.ylabel('Frequency')  # Label the y-axis
    plt.grid(True)  # Add a grid
    plt.show()  # Show the plot

In [None]:
# 1. Import the needed libraries
import pandas as pd      # To read the data
import matplotlib.pyplot as plt   # To make the histograms

# 2. Read the CSV file (change "data.csv" to your file name)
data = pd.read_csv("data.csv")

# 3. Look at the first 5 rows (optional, just to see the data)
print(data.head())

# 4. Make a histogram for each numeric column (feature) in the dataset
#    (text columns, such as a flower name, are skipped)
for column in data.select_dtypes(include="number").columns:
    plt.figure()  # Start a new figure (new window for the plot)

    # Make the histogram
    plt.hist(data[column], bins=20, color='skyblue', edgecolor='black')  

    # Add a title for the histogram
    plt.title(f"Histogram of {column}")  

    # Add labels for X and Y axes
    plt.xlabel(column)        # X-axis = the feature name
    plt.ylabel("Frequency")   # Y-axis = how many times a value appears

    # Show the histogram
    plt.show()
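
The loops above open one figure per feature. As an alternative sketch, the four Iris histograms can be drawn side by side in a single figure. The helper name `plot_feature_histograms` is hypothetical, and the 2x2 grid assumes the DataFrame has exactly four columns, as the Iris features do.

```python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import load_iris


def plot_feature_histograms(df, bins=20):
    """Draw one histogram per column of df in a 2x2 grid (assumes 4 columns)."""
    fig, axes = plt.subplots(2, 2, figsize=(10, 8))
    # axes is a 2x2 array; ravel() flattens it so we can pair it with columns.
    for ax, column in zip(axes.ravel(), df.columns):
        ax.hist(df[column], bins=bins, color="skyblue", edgecolor="black")
        ax.set_title(f"Histogram of {column}")
        ax.set_xlabel(column)
        ax.set_ylabel("Frequency")
    fig.tight_layout()  # Keep titles and axis labels from overlapping
    return fig


iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
fig = plot_feature_histograms(df)
```

In a notebook the figure is displayed when the cell finishes; in a script, call `plt.show()` after the last line. Placing the plots in one figure makes it easier to compare the shapes of the four distributions.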


## END