# 🐼 Pandas for Data Analysis
pandas is a Python library for data manipulation and analysis. Its two core data structures, `Series` (a one-dimensional labeled array) and `DataFrame` (a two-dimensional labeled table), make working with structured data intuitive.
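
To make the relationship between the two structures concrete, here is a minimal sketch. The variable names `ages` and `people` are illustrative only and do not appear elsewhere in this notebook.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
ages = pd.Series([25, 30, 35], name="Age")

# A DataFrame is a two-dimensional table; each of its columns is a Series.
people = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]})

print(type(people["Age"]).__name__)  # Series
print(people.shape)                  # (3, 2)
```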

## 📥 Importing Pandas

In [1]:
import pandas as pd
print(pd.__version__)

2.2.3


## 📊 Creating DataFrames

In [2]:
# From dictionary
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
df

      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35


## 🔍 Inspecting Data

In [3]:
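df.head()      # First five rows by default (not displayed: only a cell's last expression is shown)
df.info()      # Prints column names, non-null counts, dtypes, and memory usage
df.describe()  # Summary statistics for the numeric columns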

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Name    3 non-null      object
 1   Age     3 non-null      int64 
dtypes: int64(1), object(1)
memory usage: 180.0+ bytes


        Age
count   3.0
mean   30.0
std     5.0
min    25.0
25%    27.5
50%    30.0
75%    32.5
max    35.0
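
A few other attributes are often useful when inspecting a DataFrame. The sketch below rebuilds the same small DataFrame so it runs on its own; `summary_all` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]})

print(df.shape)          # (number of rows, number of columns)
print(df.dtypes)         # data type of each column
print(list(df.columns))  # column names

# describe() skips non-numeric columns unless include="all" is passed.
summary_all = df.describe(include="all")
print(summary_all)
```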


## 📌 Indexing and Selecting Data

In [4]:
df['Name']           # Single column, returned as a Series
df[['Name', 'Age']]  # Multiple columns, returned as a DataFrame
df.loc[0]            # Row by index label
df.iloc[0]           # Row by integer position (only this last result is displayed)

Name    Alice
Age        25
Name: 0, dtype: object
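
With the default integer index, `loc[0]` and `iloc[0]` return the same row, so the difference is easy to miss. The sketch below uses a hypothetical DataFrame indexed by name so that labels and positions differ.

```python
import pandas as pd

# Hypothetical example: names as the index, so labels are not integers.
people = pd.DataFrame(
    {"Age": [25, 30, 35]},
    index=["Alice", "Bob", "Charlie"],
)

by_label = people.loc["Bob", "Age"]  # select by index label and column name
by_position = people.iloc[1, 0]      # select by row and column position

# Label slices include the end label; position slices exclude the end.
label_slice = people.loc["Alice":"Bob"]  # 2 rows
position_slice = people.iloc[0:1]        # 1 row

print(by_label, by_position)
print(len(label_slice), len(position_slice))
```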

## 🔄 Filtering and Boolean Indexing

In [5]:
df[df['Age'] > 28]  # People older than 28

      Name  Age
1      Bob   30
2  Charlie   35
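
Filters can also be combined. The sketch below shows three common variations on the same small DataFrame; the thresholds and names chosen are arbitrary.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]})

# Combine conditions with & (and) or | (or); wrap each condition in parentheses.
between = df[(df["Age"] > 28) & (df["Age"] < 33)]

# isin() tests membership in a list of values.
selected = df[df["Name"].isin(["Alice", "Charlie"])]

# query() expresses the same kind of filter as a string.
older = df.query("Age > 28")

print(between)
print(selected)
print(older)
```

Python's `and` and `or` keywords do not work element-wise on a Series, which is why the `&` and `|` operators are used here.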


## ➕ Adding and Removing Columns

In [6]:
df['Country'] = ['UK', 'USA', 'Canada']  # Add a column in place
df.drop('Age', axis=1)  # Returns a copy without 'Age'; df itself is unchanged

      Name  Country
0    Alice       UK
1      Bob      USA
2  Charlie   Canada
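
Because `drop` returns a new DataFrame, the result has to be assigned to a name to be kept. A minimal sketch follows; `without_age`, `with_next`, and the `AgeNextYear` column are made-up names for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]})
df["Country"] = ["UK", "USA", "Canada"]

# drop() returns a new DataFrame; the original keeps all of its columns.
without_age = df.drop(columns="Age")

# assign() also returns a new DataFrame, here with a derived column added.
with_next = df.assign(AgeNextYear=df["Age"] + 1)

print(list(df.columns))           # ['Name', 'Age', 'Country']
print(list(without_age.columns))  # ['Name', 'Country']
```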


## 🔁 Handling Missing Values

In [7]:
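df.loc[1, 'Country'] = None  # Insert a missing value
df.fillna('Unknown')  # Returns a copy with missing values replaced; df is unchanged
df.dropna()  # Returns a copy without rows that contain missing values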

      Name  Age  Country
0    Alice   25       UK
2  Charlie   35   Canada
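
Neither `fillna` nor `dropna` changes `df` unless the result is assigned. The sketch below counts missing values first and then keeps each result under its own illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Country": ["UK", None, "Canada"],
})

# Count missing values per column.
missing_counts = df.isna().sum()

# Fill only the Country column and keep the result.
filled = df.fillna({"Country": "Unknown"})

# Drop a row only when its Country is missing.
complete = df.dropna(subset=["Country"])

print(missing_counts)
print(filled)
print(complete)
```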


## 🧮 Grouping and Aggregation

In [8]:
df.groupby('Country').size()  # Rows per country; Bob's missing Country is excluded by default

Country
Canada    1
UK        1
dtype: int64
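
The output above has no row for Bob because `groupby` leaves out rows whose group key is missing. A short sketch of how to keep them, and how to aggregate a column per group, assuming the same data as above:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Country": ["UK", None, "Canada"],
})

# By default, rows whose group key is missing are left out.
sizes = df.groupby("Country").size()

# dropna=False keeps them as a separate NaN group.
sizes_with_missing = df.groupby("Country", dropna=False).size()

# Aggregate a numeric column per group.
mean_age = df.groupby("Country")["Age"].mean()

print(sizes)
print(sizes_with_missing)
print(mean_age)
```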

## 📤 Reading and Writing Files

In [9]:
# df.to_csv('mydata.csv', index=False)
# df = pd.read_csv('mydata.csv')
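
The round trip can be tried without touching the disk. The sketch below writes to an in-memory buffer with `io.StringIO` instead of the `mydata.csv` file named above, and shows what `index=False` prevents.

```python
import io
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]})

# Write to an in-memory text buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False omits the row labels

# Rewind and read the CSV text back into a new DataFrame.
buffer.seek(0)
restored = pd.read_csv(buffer)

# Without index=False, the row labels are written as an unnamed first column,
# which read_csv loads back as "Unnamed: 0".
with_index = pd.read_csv(io.StringIO(df.to_csv()))

print(list(restored.columns))
print(list(with_index.columns))
```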

## ✅ Practice Challenge
**Q:** Create a DataFrame with 5 rows of fictional student data including columns `Name`, `Score`, and `Passed`, then calculate the average score.

In [10]:
# Example solution
import pandas as pd
students = pd.DataFrame({
    'Name': ['Tom', 'Jerry', 'Anna', 'Lucy', 'Jack'],
    'Score': [85, 92, 78, 90, 88],
    'Passed': [True, True, True, True, True]
})
students['Score'].mean()

np.float64(86.6)