# Demonstration

For this demonstration, we use the popular Iris dataset. Its simplicity and small size (150 rows) make it a common teaching dataset and a good fit for demonstrating data manipulation with Pandas. Each row describes one iris flower: the length and width of its sepals and petals, plus its species.

You can download the Iris dataset from the UCI Machine Learning Repository. The raw data file is available at:

https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data

First, let's load the dataset into a Pandas DataFrame.

### Setup
Install Pandas if it is not already available (the install command below is left commented out), then import it:

In [1]:
# pip install pandas

In [2]:
import pandas as pd

### Load Data

In [3]:
# The raw file has no header row, so the column names are supplied explicitly
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
column_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
iris = pd.read_csv(url, header=None, names=column_names)
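
To check how `read_csv` treats a headerless file without touching the network, the sketch below parses a small hand-typed sample in the same comma-separated format. The sample rows and the names `sample` and `sample_df` are illustrative assumptions, not part of the original notebook.

```python
import io

import pandas as pd

# Hand-typed sample in the same headerless format as the downloaded file (illustrative only).
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
)
column_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']

# header=None says the first line is data; names= supplies the column labels.
sample_df = pd.read_csv(sample, header=None, names=column_names)

print(sample_df.shape)   # rows, columns
print(sample_df.dtypes)  # the four measurements parse as floats
```

Without `names=` (or `header=None`), the first data row would be consumed as column labels, so one flower would silently disappear from the DataFrame.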

### Data Inspection
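Inspect the first few rows of the DataFrame. The outputs in this walkthrough were captured after the Data Transformation step below had been run, so they already include the `petal_area` column: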

In [14]:
iris.head()

|   | sepal_length | sepal_width | petal_length | petal_width | species | petal_area |
|---|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa | 0.28 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa | 0.28 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa | 0.26 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa | 0.30 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa | 0.28 |


View summary statistics for the numeric columns of the DataFrame:

In [13]:
iris.describe()

|   | sepal_length | sepal_width | petal_length | petal_width | petal_area |
|---|---|---|---|---|---|
| count | 150.0 | 150.0 | 150.0 | 150.0 | 150.0 |
| mean | 5.843333 | 3.054 | 3.758667 | 1.198667 | 5.793133 |
| std | 0.828066 | 0.433594 | 1.76442 | 0.763161 | 4.713499 |
| min | 4.3 | 2.0 | 1.0 | 0.1 | 0.11 |
| 25% | 5.1 | 2.8 | 1.6 | 0.3 | 0.42 |
| 50% | 5.8 | 3.0 | 4.35 | 1.3 | 5.615 |
| 75% | 6.4 | 3.3 | 5.1 | 1.8 | 9.69 |
| max | 7.9 | 4.4 | 6.9 | 2.5 | 15.87 |
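
By default `describe()` summarises only the numeric columns, which is why `species` is absent from the summary. One possible way to inspect the text column as well is sketched below on a small hand-typed sample; the DataFrame `df` and its rows are illustrative assumptions.

```python
import pandas as pd

# Small illustrative sample with the same kind of columns as the Iris DataFrame.
df = pd.DataFrame({
    'sepal_length': [5.1, 4.9, 7.0, 6.3],
    'sepal_width': [3.5, 3.0, 3.2, 3.3],
    'species': ['Iris-setosa', 'Iris-setosa', 'Iris-versicolor', 'Iris-virginica'],
})

# Default behaviour: numeric columns only.
numeric_summary = df.describe()

# Count how many rows belong to each species.
counts = df['species'].value_counts()

# include='all' adds count/unique/top/freq rows for non-numeric columns.
full_summary = df.describe(include='all')

print(counts)
print(full_summary)
```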


### Data Cleaning
Check for missing values:

In [15]:
iris.isnull().sum()

sepal_length    0
sepal_width     0
petal_length    0
petal_width     0
species         0
petal_area      0
dtype: int64

(Note: The UCI Iris file has no missing values, as the zero counts above confirm, but it's good practice to check before any analysis.)
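
For datasets where the check does find gaps, two common responses are to drop the affected rows or to fill them in. The sketch below shows both on a hand-made sample with one deliberately missing value; the data and the mean-imputation choice are assumptions for illustration, not steps from the original notebook.

```python
import numpy as np
import pandas as pd

# Illustrative sample with one missing measurement.
df = pd.DataFrame({
    'sepal_length': [5.1, 4.9, np.nan, 4.6],
    'species': ['Iris-setosa'] * 4,
})

# Same check as in the walkthrough: missing values per column.
missing_counts = df.isnull().sum()

# Option 1: drop every row that contains a missing value.
dropped = df.dropna()

# Option 2: fill the gap with the column mean (one simple strategy among many).
filled = df.fillna({'sepal_length': df['sepal_length'].mean()})

print(missing_counts)
print(len(dropped), "rows remain after dropna()")
print(filled)
```

Which option is appropriate depends on how much data is missing and why; dropping rows is the simplest but discards the other measurements in those rows.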

### Data Transformation
Create a new column, `petal_area`, as the product of petal length and petal width (a rough proxy for the area of a petal):

In [7]:
iris['petal_area'] = iris['petal_length'] * iris['petal_width']
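
The assignment above modifies the DataFrame in place. If the original should stay untouched, `DataFrame.assign` returns a new DataFrame with the extra column instead. A minimal sketch on a hand-typed sample (the names `df` and `with_area` are illustrative):

```python
import pandas as pd

# Illustrative sample of petal measurements.
df = pd.DataFrame({
    'petal_length': [1.4, 4.7, 6.0],
    'petal_width': [0.2, 1.4, 2.5],
})

# assign() leaves df unchanged and returns a copy with the new column.
with_area = df.assign(petal_area=lambda d: d['petal_length'] * d['petal_width'])

print(with_area)
print('petal_area' in df.columns)  # the original has no new column
```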

### Data Filtering
Select only the Iris-setosa species:

In [11]:
setosa = iris[iris['species'] == 'Iris-setosa']
setosa.head()

|   | sepal_length | sepal_width | petal_length | petal_width | species | petal_area |
|---|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa | 0.28 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa | 0.28 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa | 0.26 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa | 0.30 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa | 0.28 |
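
Boolean filters like the one above can be combined. The sketch below, on a hand-typed sample, selects setosa rows with a sepal width above 3.1 in two equivalent ways; the threshold and the variable names are assumptions chosen for illustration.

```python
import pandas as pd

# Illustrative sample rows.
df = pd.DataFrame({
    'sepal_width': [3.5, 3.0, 3.2, 3.2],
    'species': ['Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-versicolor'],
})

# Combine conditions with & (and) or | (or); each condition needs its own parentheses.
mask = (df['species'] == 'Iris-setosa') & (df['sepal_width'] > 3.1)
wide_setosa = df[mask]

# The same filter written as a query string.
same_rows = df.query("species == 'Iris-setosa' and sepal_width > 3.1")

print(wide_setosa)
```

The parentheses matter because `&` binds more tightly than `==` and `>` in Python; leaving them out typically raises an error or produces the wrong mask.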


### Data Aggregation
Calculate the mean sepal width for each species:

In [12]:
species_sepal_width_mean = iris.groupby('species')['sepal_width'].mean()
species_sepal_width_mean

species
Iris-setosa        3.418
Iris-versicolor    2.770
Iris-virginica     2.974
Name: sepal_width, dtype: float64
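
The same `groupby` pattern extends to several statistics at once via `agg`. A minimal sketch on a hand-typed sample (the rows and the name `summary` are illustrative, so the numbers differ from the full-dataset means above):

```python
import pandas as pd

# Illustrative sample: two rows per species.
df = pd.DataFrame({
    'sepal_width': [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    'species': [
        'Iris-setosa', 'Iris-setosa',
        'Iris-versicolor', 'Iris-versicolor',
        'Iris-virginica', 'Iris-virginica',
    ],
})

# One row per species, one column per requested statistic.
summary = df.groupby('species')['sepal_width'].agg(['mean', 'min', 'max'])

print(summary)
```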

### Exporting Data
Export the modified DataFrame to a new CSV file:

In [10]:
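# index=False leaves out the row index, so reloading the file adds no 'Unnamed: 0' column
iris.to_csv('iris_modified.csv', index=False)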
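
Whether the row index is written to the file depends on the `index` argument of `to_csv`. The sketch below round-trips a small hand-typed DataFrame through in-memory buffers (used here instead of real files, as an assumption to keep the example self-contained) to show the difference.

```python
import io

import pandas as pd

# Illustrative two-row sample.
df = pd.DataFrame({
    'sepal_length': [5.1, 4.9],
    'species': ['Iris-setosa', 'Iris-setosa'],
})

# Default: the row index is written as an unnamed first column.
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
reloaded_with_index = pd.read_csv(with_index)

# index=False: only the data columns are written.
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
reloaded_without_index = pd.read_csv(without_index)

print(list(reloaded_with_index.columns))     # ['Unnamed: 0', 'sepal_length', 'species']
print(list(reloaded_without_index.columns))  # ['sepal_length', 'species']
```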

This walkthrough covers the basic Pandas workflow on the Iris dataset: loading, inspecting, cleaning, transforming, filtering, aggregating, and exporting data.

The same steps are a good starting point for most data manipulation and analysis tasks.