# Introduction to Pandas

## Course Overview

This notebook introduces Pandas, a Python library for data manipulation and analysis. We'll cover:
- What is Pandas?
- Series and DataFrame basics
- Data loading and inspection
- Data selection and filtering
- Data manipulation and transformation
- Basic statistical operations
- Grouping and aggregation

Pandas is a Python library for working with data sets. It provides functions for analyzing, cleaning, exploring, and manipulating data. The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". The library was created by Wes McKinney in 2008.


## 1. Getting Started with Pandas

First, let's import Pandas and check its version:

In [1]:
import pandas as pd
print(pd.__version__)

2.2.3


## 2. Creating Pandas Series

A Series is a one-dimensional labeled array that can hold any data type:

In [3]:
# Creating a Series from a list
fruits = pd.Series(['Apple', 'Banana', 'Cherry'])


# Creating a Series with custom index
prices = pd.Series([1.5, 0.75, 2.0], index=['Apple', 'Banana', 'Cherry'])
prices

Apple     1.50
Banana    0.75
Cherry    2.00
dtype: float64
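Because a Series carries labels, it supports both dictionary-style lookup and element-wise arithmetic. The following is a minimal sketch that rebuilds the `prices` Series so it runs on its own; the 10% discount is a made-up figure used only for illustration.

```python
import pandas as pd

# Same data as the `prices` Series above, rebuilt so this block is self-contained
prices = pd.Series([1.5, 0.75, 2.0], index=['Apple', 'Banana', 'Cherry'])

# Label-based lookup: the index behaves much like dictionary keys
banana_price = prices['Banana']

# Vectorized arithmetic applies to every element at once
# (the 10% discount is a hypothetical value for illustration)
discounted = prices * 0.9

# A comparison produces a boolean mask that can filter the Series
over_one = prices[prices > 1.0]

print(banana_price)
print(discounted)
print(over_one)
```

The index labels are preserved through the arithmetic and the filter, which is the main practical difference between a Series and a plain Python list.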

## 3. Creating DataFrames

DataFrames are two-dimensional labeled data structures with columns of potentially different types:

In [4]:
# Creating a DataFrame from a dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'San Francisco', 'Chicago']
}
df = pd.DataFrame(data)
df

      Name  Age           City
0    Alice   25       New York
1      Bob   30  San Francisco
2  Charlie   35        Chicago
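A dictionary of columns is only one way to build a DataFrame. As a small sketch, the same table can be built from a list of row dictionaries, and each column can then be pulled out as a Series. The names `records`, `people`, and `ages` are chosen here for illustration.

```python
import pandas as pd

# Each dictionary is one row; the keys become column names
records = [
    {'Name': 'Alice', 'Age': 25, 'City': 'New York'},
    {'Name': 'Bob', 'Age': 30, 'City': 'San Francisco'},
    {'Name': 'Charlie', 'Age': 35, 'City': 'Chicago'},
]
people = pd.DataFrame(records)

# shape is (rows, columns); dtypes shows that each column has its own type
print(people.shape)
print(people.dtypes)

# Selecting one column returns a Series that shares the DataFrame's index
ages = people['Age']
print(type(ages).__name__)
```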


## 4. Loading Data

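Pandas can read data from many sources, including CSV, Excel, and JSON files and SQL databases. The cell below shows the CSV pattern (commented out because it needs a real file), followed by a small in-memory sample dataset used in the rest of the notebook: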

In [6]:
# Reading a CSV file
# Note: Replace 'your_file.csv' with an actual file path
# df = pd.read_csv('your_file.csv')
# print(df.head())

# Example with sample data
sample_data = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Tablet'],
    'Price': [1000, 500, 300],
    'Stock': [50, 100, 75]
})
sample_data




   Product  Price  Stock
0   Laptop   1000     50
1    Phone    500    100
2   Tablet    300     75
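To try `pd.read_csv` without a file on disk, one option is to wrap CSV text in `io.StringIO`, which `read_csv` accepts in place of a path. This is a sketch under that assumption; the CSV text mirrors the sample data above, and `csv_text` and `loaded` are illustrative names.

```python
import io
import pandas as pd

# CSV content held in a string; in practice this would be a file on disk
csv_text = """Product,Price,Stock
Laptop,1000,50
Phone,500,100
Tablet,300,75
"""

# read_csv accepts a file path or any file-like object such as StringIO
loaded = pd.read_csv(io.StringIO(csv_text))

# head() shows the first rows (up to five by default)
print(loaded.head())

# Column types are inferred from the text
print(loaded.dtypes)
```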


## 5. Data Inspection

Useful methods to understand your data:

In [None]:
# Basic information about the DataFrame
# (info() prints its report directly and returns None, so it is not wrapped in print)
sample_data.info()

# Statistical summary
print(sample_data.describe())

             Price  Stock
count     3.000000    3.0
mean    600.000000   75.0
std     360.555128   25.0
min     300.000000   50.0
25%     400.000000   62.5
50%     500.000000   75.0
75%     750.000000   87.5
max    1000.000000  100.0
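Beyond `info()` and `describe()`, a few lighter-weight checks are often enough for a first look. The sketch below rebuilds `sample_data` so it runs on its own; `missing_per_column` is an illustrative name.

```python
import pandas as pd

sample_data = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Tablet'],
    'Price': [1000, 500, 300],
    'Stock': [50, 100, 75]
})

# Number of rows and columns
print(sample_data.shape)

# First two rows only
first_two = sample_data.head(2)
print(first_two)

# Data type of each column
print(sample_data.dtypes)

# Count of missing values per column (all zero for this dataset)
missing_per_column = sample_data.isna().sum()
print(missing_per_column)
```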


## 6. Data Selection and Filtering

In [19]:
# Selecting a single column (returns a Series)
# sample_data['Price']

# Filtering rows with a boolean condition
affordable_products = sample_data[sample_data['Price'] <= 500]
print("Affordable Products (Price <= 500):")
affordable_products

Affordable Products (Price <= 500):


   Product  Price  Stock
1    Phone    500    100
2   Tablet    300     75
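The single-condition filter above extends naturally to several conditions. As a sketch, conditions are combined with `&` (and) or `|` (or), each wrapped in parentheses; `.loc` can filter rows and pick columns in one step, and `isin` matches against a list of values. The thresholds and variable names below are illustrative.

```python
import pandas as pd

sample_data = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Tablet'],
    'Price': [1000, 500, 300],
    'Stock': [50, 100, 75]
})

# Combine conditions with & ; the parentheses around each condition are required
mask = (sample_data['Price'] <= 500) & (sample_data['Stock'] >= 100)
in_budget_and_stocked = sample_data[mask]

# .loc filters rows and selects columns in a single expression
pricier = sample_data.loc[sample_data['Price'] > 400, ['Product', 'Price']]

# isin keeps rows whose value appears in the given list
selected = sample_data[sample_data['Product'].isin(['Laptop', 'Tablet'])]

print(in_budget_and_stocked)
print(pricier)
print(selected)
```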


## 7. Data Manipulation

In [None]:
# Adding a new column
sample_data['Total_Value'] = sample_data['Price'] * sample_data['Stock']
#print(sample_data)

# Sorting
sorted_by_price = sample_data.sort_values('Price', ascending=True)
print("Sorted by Price:")
sorted_by_price

Sorted by Price:


   Product  Price  Stock  Total_Value
2   Tablet    300     75        22500
1    Phone    500    100        50000
0   Laptop   1000     50        50000
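Assigning to `sample_data['Total_Value']` modifies the DataFrame in place. If the original should stay untouched, one alternative is `assign`, which returns a new DataFrame. The sketch below also sorts by two keys at once; `with_value` and `ranked` are illustrative names.

```python
import pandas as pd

sample_data = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Tablet'],
    'Price': [1000, 500, 300],
    'Stock': [50, 100, 75]
})

# assign returns a copy with the extra column; sample_data itself is unchanged
with_value = sample_data.assign(
    Total_Value=sample_data['Price'] * sample_data['Stock']
)

# Sort by Total_Value descending, breaking ties by Price ascending
ranked = with_value.sort_values(
    ['Total_Value', 'Price'], ascending=[False, True]
)

print(ranked)
```

Laptop and Phone tie on `Total_Value` (50000 each), so the second sort key decides their order.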


## 8. Statistical Operations

In [24]:
# Basic statistical methods
print("Mean Price:", sample_data['Price'].mean())
print("Max Stock:", sample_data['Stock'].max())
print("Total Value:", sample_data['Total_Value'].sum())

Mean Price: 600.0
Max Stock: 100
Total Value: 122500
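Several statistics can be computed in one call with `agg`, and column arithmetic makes derived figures such as a stock-weighted average price straightforward. This is a sketch on the same sample data; `summary` and `weighted_avg_price` are illustrative names.

```python
import pandas as pd

sample_data = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Tablet'],
    'Price': [1000, 500, 300],
    'Stock': [50, 100, 75]
})

# One row per statistic, one column per numeric field
summary = sample_data[['Price', 'Stock']].agg(['mean', 'median', 'min', 'max'])
print(summary)

# Average price weighted by units in stock:
# sum(Price * Stock) / sum(Stock)
weighted_avg_price = (
    (sample_data['Price'] * sample_data['Stock']).sum()
    / sample_data['Stock'].sum()
)
print(round(weighted_avg_price, 2))
```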


## 9. Grouping and Aggregation

Let's demonstrate grouping with a more complex dataset:

In [29]:
sales_data = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Laptop', 'Phone', 'Tablet', 'Tablet'],
    'Region': ['North', 'North', 'South', 'South', 'East', 'West'],
    'Sales': [100, 150, 120, 200, 80, 90]
})


# Grouping and aggregation
grouped_sales = sales_data.groupby('Product')['Sales'].sum()
print("Total Sales by Product:")
grouped_sales


Total Sales by Product:


Product
Laptop    220
Phone     350
Tablet    170
Name: Sales, dtype: int64
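A group can be summarized with more than one function at a time, and grouping by a different column only requires changing the key. The sketch below reuses the same `sales_data`; `by_product` and `by_region` are illustrative names.

```python
import pandas as pd

sales_data = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Laptop', 'Phone', 'Tablet', 'Tablet'],
    'Region': ['North', 'North', 'South', 'South', 'East', 'West'],
    'Sales': [100, 150, 120, 200, 80, 90]
})

# Several aggregations per product in one call
by_product = sales_data.groupby('Product')['Sales'].agg(['sum', 'mean', 'count'])
print(by_product)

# as_index=False keeps the grouping key as a regular column
by_region = sales_data.groupby('Region', as_index=False)['Sales'].sum()
print(by_region)
```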

## 10. Next Steps and Further Learning

This notebook covered the basics of Pandas. To continue learning:
- Explore more advanced selection methods like `.loc` and `.iloc`
- Learn about handling missing data
- Study advanced grouping and pivoting techniques
- Practice with real-world datasets
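As a starting point for the first two items in the list above, here is a minimal sketch of `.loc`, `.iloc`, and basic missing-data handling. The `inventory` DataFrame and its missing values are invented for illustration, and filling with the column mean or zero is just one possible strategy.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps; np.nan marks a missing value
inventory = pd.DataFrame(
    {'Price': [1000, 500, np.nan], 'Stock': [50, np.nan, 75]},
    index=['Laptop', 'Phone', 'Tablet']
)

# .loc selects by label, .iloc by integer position
phone_price = inventory.loc['Phone', 'Price']
first_row = inventory.iloc[0]

# Count missing values per column
missing_counts = inventory.isna().sum()

# Option 1: fill gaps (column mean for Price, zero for Stock)
filled = inventory.fillna({'Price': inventory['Price'].mean(), 'Stock': 0})

# Option 2: drop any row that has a missing value
complete_rows = inventory.dropna()

print(missing_counts)
print(filled)
print(complete_rows)
```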

Recommended resources:
- Pandas official documentation
- Online tutorials and courses
- Data science books and resources