# Iris Dataset Practice

This notebook will guide you through pandas basics using the famous Iris dataset. 

**Instructions:**
1. Read each question carefully
2. Try to write your own solution in the "Your Code" cell
3. Run the "Solution" cell to see the answer
4. Compare your solution with the provided one

Let's get started!

In [2]:
# Import required libraries
import pandas as pd

# Load the dataset
df = pd.read_csv('./data/Iris/Iris.csv')

# Display first few rows
df.head()

|   | Id | SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm | Species |
|---|---|---|---|---|---|---|
| 0 | 1 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 2 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 3 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |


## Question 1: Basic Dataset Information

**Task:** Get the basic information about the dataset including:
- Number of rows and columns
- Column names
- Data types
- Memory usage

In [3]:
# Your code here
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Id             150 non-null    int64  
 1   SepalLengthCm  150 non-null    float64
 2   SepalWidthCm   150 non-null    float64
 3   PetalLengthCm  150 non-null    float64
 4   PetalWidthCm   150 non-null    float64
 5   Species        150 non-null    object 
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB
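`df.info()` reports all four items in one call. When only one of them is needed, each can also be queried on its own. The snippet below is a minimal sketch: `sample` is a hand-built stand-in (an assumption, so the code runs without the CSV file) made of the first three rows of the `df.head()` output above.

```python
import pandas as pd

# Hypothetical stand-in for df: three rows copied from the head() output
sample = pd.DataFrame({
    "Id": [1, 2, 3],
    "SepalLengthCm": [5.1, 4.9, 4.7],
    "SepalWidthCm": [3.5, 3.0, 3.2],
    "PetalLengthCm": [1.4, 1.4, 1.3],
    "PetalWidthCm": [0.2, 0.2, 0.2],
    "Species": ["Iris-setosa"] * 3,
})

n_rows, n_cols = sample.shape                      # (rows, columns)
column_names = sample.columns.tolist()             # list of column labels
dtypes = sample.dtypes                             # one dtype per column
mem_bytes = sample.memory_usage(deep=True).sum()   # total bytes, string contents included

print(n_rows, n_cols)
print(column_names)
print(dtypes)
print(mem_bytes)
```

On the real data the same attributes would be read from `df` instead of `sample`.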


## Question 2: Descriptive Statistics

**Task:** Get descriptive statistics for all numerical columns including:
- Count
- Mean
- Standard deviation
- Min/max values
- Quartiles

In [4]:
# Your code here
df.describe()

|   | Id | SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm |
|---|---|---|---|---|---|
| count | 150.0 | 150.0 | 150.0 | 150.0 | 150.0 |
| mean | 75.5 | 5.843333 | 3.054 | 3.758667 | 1.198667 |
| std | 43.445368 | 0.828066 | 0.433594 | 1.76442 | 0.763161 |
| min | 1.0 | 4.3 | 2.0 | 1.0 | 0.1 |
| 25% | 38.25 | 5.1 | 2.8 | 1.6 | 0.3 |
| 50% | 75.5 | 5.8 | 3.0 | 4.35 | 1.3 |
| 75% | 112.75 | 6.4 | 3.3 | 5.1 | 1.8 |
| max | 150.0 | 7.9 | 4.4 | 6.9 | 2.5 |


## Question 3: Check for Missing Values

**Task:** Check if there are any missing values in the dataset.

In [5]:
# Your code here
df.isnull().sum()

Id               0
SepalLengthCm    0
SepalWidthCm     0
PetalLengthCm    0
PetalWidthCm     0
Species          0
dtype: int64
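The Iris data has no gaps, so every count above is zero. To see what the same check looks like when values are missing, here is a small sketch on a hypothetical toy frame (not taken from the dataset) with one deliberately missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing sepal length
toy = pd.DataFrame({
    "SepalLengthCm": [5.1, np.nan, 4.7],
    "Species": ["Iris-setosa", "Iris-setosa", "Iris-setosa"],
})

missing_per_column = toy.isnull().sum()              # NaN count per column
has_missing = bool(toy.isnull().values.any())        # True if any cell is missing
rows_with_missing = toy[toy.isnull().any(axis=1)]    # rows with at least one NaN

print(missing_per_column)
print(has_missing)
print(rows_with_missing)
```

The single boolean is handy as a quick yes/no answer, while the per-column counts show where the gaps are.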

## Question 4: Unique Species

**Task:** Find all unique species in the dataset and count how many of each species there are.

In [10]:
# Your code here
df["Species"].unique()

array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'], dtype=object)
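`unique()` lists the distinct labels but does not count them, which is the second half of the task. `value_counts()` is one way to cover that part. The sketch below uses a hypothetical toy frame so it runs on its own; on the full data the call would presumably be `df["Species"].value_counts()`.

```python
import pandas as pd

# Hypothetical toy frame; the label counts here are made up for illustration
toy = pd.DataFrame({
    "Species": [
        "Iris-setosa", "Iris-setosa",
        "Iris-versicolor",
        "Iris-virginica", "Iris-virginica", "Iris-virginica",
    ]
})

counts = toy["Species"].value_counts()   # rows per label, most frequent first
n_species = toy["Species"].nunique()     # number of distinct labels

print(counts)
print(n_species)
```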

## Question 5: Filtering Data

**Task:** Filter the dataset to show only:
- Setosa species with sepal length greater than 5.0
- Versicolor species with petal width less than 1.5

In [13]:
# Your code here
print(df[(df["Species"] == "Iris-setosa") & (df["SepalLengthCm"] > 5.0)])

df[(df["Species"] == "Iris-versicolor") & (df["PetalWidthCm"] < 1.5)]

    Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
0    1            5.1           3.5            1.4           0.2  Iris-setosa
5    6            5.4           3.9            1.7           0.4  Iris-setosa
10  11            5.4           3.7            1.5           0.2  Iris-setosa
14  15            5.8           4.0            1.2           0.2  Iris-setosa
15  16            5.7           4.4            1.5           0.4  Iris-setosa
16  17            5.4           3.9            1.3           0.4  Iris-setosa
17  18            5.1           3.5            1.4           0.3  Iris-setosa
18  19            5.7           3.8            1.7           0.3  Iris-setosa
19  20            5.1           3.8            1.5           0.3  Iris-setosa
20  21            5.4           3.4            1.7           0.2  Iris-setosa
21  22            5.1           3.7            1.5           0.4  Iris-setosa
23  24            5.1           3.3            1.7           0.5  Iris-setosa

|   | Id | SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm | Species |
|---|---|---|---|---|---|---|
| 50 | 51 | 7.0 | 3.2 | 4.7 | 1.4 | Iris-versicolor |
| 53 | 54 | 5.5 | 2.3 | 4.0 | 1.3 | Iris-versicolor |
| 55 | 56 | 5.7 | 2.8 | 4.5 | 1.3 | Iris-versicolor |
| 57 | 58 | 4.9 | 2.4 | 3.3 | 1.0 | Iris-versicolor |
| 58 | 59 | 6.6 | 2.9 | 4.6 | 1.3 | Iris-versicolor |
| 59 | 60 | 5.2 | 2.7 | 3.9 | 1.4 | Iris-versicolor |
| 60 | 61 | 5.0 | 2.0 | 3.5 | 1.0 | Iris-versicolor |
| 62 | 63 | 6.0 | 2.2 | 4.0 | 1.0 | Iris-versicolor |
| 63 | 64 | 6.1 | 2.9 | 4.7 | 1.4 | Iris-versicolor |
| 64 | 65 | 5.6 | 2.9 | 3.6 | 1.3 | Iris-versicolor |
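Both filters combine two boolean conditions with `&`. Each comparison needs its own parentheses because `&` binds more tightly than `==` and `>`. As a sketch of two alternative spellings, the snippet below uses named masks and `DataFrame.query` on a toy frame; its first four rows are copied from the outputs above (index 0, 1, 50, 53) and the last row is made up.

```python
import pandas as pd

# First four rows copied from the outputs above; the last row is hypothetical
toy = pd.DataFrame({
    "SepalLengthCm": [5.1, 4.9, 7.0, 5.5, 6.0],
    "PetalWidthCm": [0.2, 0.2, 1.4, 1.3, 1.6],
    "Species": [
        "Iris-setosa", "Iris-setosa",
        "Iris-versicolor", "Iris-versicolor", "Iris-versicolor",
    ],
})

# Named masks keep each condition readable and avoid nested parentheses
is_setosa = toy["Species"] == "Iris-setosa"
long_sepal = toy["SepalLengthCm"] > 5.0
setosa_long = toy.loc[is_setosa & long_sepal]

# The same kind of filter expressed as a query string
versicolor_narrow = toy.query("Species == 'Iris-versicolor' and PetalWidthCm < 1.5")

print(setosa_long)
print(versicolor_narrow)
```

Which spelling to prefer is largely a matter of taste; the masks are easier to reuse, while `query` reads closer to the task description.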


## Question 6: Grouping and Aggregation

**Task:** Calculate the following for each species:
- Average sepal length
- Maximum petal width
- Minimum sepal width

In [17]:
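# Your code here
# Note: these values are computed over the whole dataset, not per species
print(df["SepalLengthCm"].mean())  # average sepal length
print(df["PetalWidthCm"].max())    # maximum petal width
print(df["SepalWidthCm"].min())    # minimum sepal width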

5.843333333333334
2.5
2.0
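The three numbers above describe all 150 rows at once, whereas the task asks for one value per species. One way to get that is `groupby` with named aggregation. The following is a minimal sketch on a hypothetical four-row toy frame (the values are made up, and the output column names such as `avg_sepal_length` are illustrative choices).

```python
import pandas as pd

# Hypothetical toy frame: two rows for each of two species
toy = pd.DataFrame({
    "SepalLengthCm": [5.0, 5.2, 6.0, 6.4],
    "SepalWidthCm": [3.4, 3.0, 2.8, 2.2],
    "PetalWidthCm": [0.2, 0.4, 1.3, 1.5],
    "Species": [
        "Iris-setosa", "Iris-setosa",
        "Iris-versicolor", "Iris-versicolor",
    ],
})

# One row per species, one column per requested statistic
per_species = toy.groupby("Species").agg(
    avg_sepal_length=("SepalLengthCm", "mean"),
    max_petal_width=("PetalWidthCm", "max"),
    min_sepal_width=("SepalWidthCm", "min"),
)

print(per_species)
```

Applied to the real data, the same call on `df` should return one row for each of the three species.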


## Question 7: Adding New Columns

**Task:** Create a new column called 'petal_area' that calculates the area of the petal (length × width).

In [18]:
# Your code here
df["petal_area"] = df["PetalLengthCm"] * df["PetalWidthCm"]
df.head()

|   | Id | SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm | Species | petal_area |
|---|---|---|---|---|---|---|---|
| 0 | 1 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa | 0.28 |
| 1 | 2 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa | 0.28 |
| 2 | 3 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa | 0.26 |
| 3 | 4 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa | 0.3 |
| 4 | 5 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa | 0.28 |


## Question 8: Sorting Data

**Task:** Sort the dataset by:
1. Sepal length in descending order
2. Petal width in ascending order (for same sepal length values)

In [20]:
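# Your code here
# Note: each column is sorted on its own here; the rows are not reordered together
print(df["SepalLengthCm"].sort_values(ascending=False).head(5))  # five largest sepal lengths
print(df["PetalWidthCm"].sort_values(ascending=True).head(5))    # five smallest petal widths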

131    7.9
122    7.7
118    7.7
117    7.7
135    7.7
Name: SepalLengthCm, dtype: float64
12    0.1
13    0.1
9     0.1
32    0.1
34    0.1
Name: PetalWidthCm, dtype: float64
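The output above sorts two columns independently, so it does not show petal width being used as a tie-breaker among equal sepal lengths. Sorting whole rows by both keys can be done in a single `sort_values` call with a list of columns and a matching list of directions. This is a sketch on a hypothetical toy frame with made-up values.

```python
import pandas as pd

# Hypothetical toy frame with a tie on sepal length (rows 0 and 2)
toy = pd.DataFrame({
    "SepalLengthCm": [7.7, 7.9, 7.7, 5.0],
    "PetalWidthCm": [2.3, 2.0, 2.0, 0.2],
})

# Primary key: sepal length descending; tie-breaker: petal width ascending
sorted_toy = toy.sort_values(
    by=["SepalLengthCm", "PetalWidthCm"],
    ascending=[False, True],
)

print(sorted_toy)
```

Among the two rows with sepal length 7.7, the one with the smaller petal width comes first.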


## Congratulations! 🎉

You've completed the basic pandas exercises with the Iris dataset!

**What you've practiced:**
- Loading and inspecting datasets
- Computing descriptive statistics
- Checking for missing values
- Listing unique values
- Filtering and selecting data
- Grouping and aggregation
- Adding new columns
- Sorting data

**Next Steps:**
1. Try these operations on a different dataset
2. Explore more advanced pandas operations
3. Practice with the Titanic dataset (next in our series)

Keep practicing and happy coding! 🐼