# 🐼 Pandas DataFrames Tutorial
This notebook is a **guide to the basics of Pandas DataFrames**. It includes:
- Theory and definitions
- Text diagrams of the DataFrame layout
- Working code examples
- Exercises for practice

---

## 1️⃣ What is a DataFrame?
- A **DataFrame** is a two-dimensional, labeled data structure in Pandas.
- It looks like a **table with rows and columns** (like an Excel sheet or a SQL table).
- Each column in a DataFrame is a **Series**, and all columns share the same row index.

### Structure
```
    Name  Age      City  Salary
0   John   26  New York   56000
1   Anna   34     Paris   79000
2  Linda   29    London   65000
3  Peter   42    Berlin   85000
```


### Diagram
```
DataFrame = { column name → Series }

Columns:    Name  | Age | City     | Salary
Index → 0:  John  | 26  | New York | 56000
Index → 1:  Anna  | 34  | Paris    | 79000
Index → 2:  Linda | 29  | London   | 65000
Index → 3:  Peter | 42  | Berlin   | 85000
```
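
The "column name → Series" picture can be checked directly. The following is a minimal sketch, assuming a small two-column table; the names `df_demo`, `age`, and `rebuilt` are illustrative and not part of the notebook's later cells.

```python
import pandas as pd

# Illustrative two-column table (df_demo is a made-up name for this sketch)
df_demo = pd.DataFrame({
    'Name': ['John', 'Anna', 'Linda', 'Peter'],
    'Age': [26, 34, 29, 42],
})

# Selecting a single column returns a Series
age = df_demo['Age']
print(type(age).__name__)

# The Series carries the same row index as the DataFrame
print(age.index.equals(df_demo.index))

# A DataFrame can be reassembled from Series that share an index
rebuilt = pd.DataFrame({'Name': df_demo['Name'], 'Age': age})
print(rebuilt.equals(df_demo))
```

Because every column shares one index, selecting a row pulls the matching entry out of each Series.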


## 2️⃣ Creating DataFrames
DataFrames can be created in multiple ways:
- From lists of lists
- From dictionaries
- From NumPy arrays

In [None]:
import pandas as pd

# From a list of lists
data_list = [
    ['John', 26, 'New York', 56000],
    ['Anna', 34, 'Paris', 79000],
    ['Linda', 29, 'London', 65000],
    ['Peter', 42, 'Berlin', 85000]
]

columns = ['Name','Age','City','Salary']
df = pd.DataFrame(data_list, columns=columns)
print(df)

In [None]:
# From dictionary
data_dict = {
    'Name': ['John','Anna','Linda','Peter'],
    'Age': [26,34,29,42],
    'City': ['New York','Paris','London','Berlin'],
    'Salary': [56000,79000,65000,85000]
}

pd.DataFrame(data_dict)
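
The list above also mentions NumPy arrays. A minimal sketch of that third route follows, assuming only the numeric columns are needed (a 2D array holds a single dtype, so mixing names and numbers in one array would turn everything into strings). The names `data_array` and `df_numeric` are illustrative.

```python
import numpy as np
import pandas as pd

# A 2D array has one dtype, so this sketch keeps to the numeric columns
data_array = np.array([
    [26, 56000],
    [34, 79000],
    [29, 65000],
    [42, 85000],
])

# Each inner row becomes a DataFrame row; column names are supplied separately
df_numeric = pd.DataFrame(data_array, columns=['Age', 'Salary'])
print(df_numeric)
print(df_numeric.dtypes)
```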

## 3️⃣ Accessing Data
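- **Columns**: Use `df['col']`, which returns a Series.
- **Rows**: Use `df.loc[]` (label-based) or `df.iloc[]` (position-based). With the default integer index used here, labels and positions coincide.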

In [None]:
# Column access (returns a Series)
print(df['Name'])

# Row access with loc: the row whose index label is 1
print(df.loc[1])

# Row access with iloc: the third row by position
print(df.iloc[2])
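
The difference between `loc` and `iloc` only shows up when the index labels are not simply 0, 1, 2, and so on. Below is a minimal sketch, assuming a hypothetical table `df_people` with labels 10, 20, 30.

```python
import pandas as pd

# Hypothetical table with non-default index labels
df_people = pd.DataFrame(
    {'Name': ['John', 'Anna', 'Linda'], 'Age': [26, 34, 29]},
    index=[10, 20, 30],
)

# loc looks up the row whose index label is 20 (Anna)
by_label = df_people.loc[20]

# iloc looks up the row at position 2, i.e. the third row (Linda)
by_position = df_people.iloc[2]

print(by_label['Name'], by_position['Name'])

# loc also accepts a column label to pull out a single value
print(df_people.loc[30, 'Age'])
```

Here `df_people.loc[2]` would raise a `KeyError`, because no row carries the label 2.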

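## 4️⃣ Adding and Dropping Columns
Assigning to a new column name adds a column. `drop()` returns a new DataFrame and leaves the original unchanged unless `inplace=True` is passed.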

In [None]:
# Adding a column (the list needs one value per row)
df['Designation'] = ['Doctor','Engineer','Coder','Engineer']
print(df)

# Dropping a column: drop() returns a new DataFrame, df itself is unchanged
print(df.drop('Designation', axis=1))

# Dropping permanently: inplace=True modifies df itself
df.drop('Designation', axis=1, inplace=True)
print(df)
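
A new column can also be computed from an existing one, and a common alternative to `inplace=True` is to keep the DataFrame that `drop()` returns. The sketch below assumes a hypothetical table `df_staff` and a made-up `Bonus` column; it uses the `columns=` keyword of `drop()`, which is equivalent to passing `axis=1`.

```python
import pandas as pd

# Hypothetical table for this sketch
df_staff = pd.DataFrame({
    'Name': ['John', 'Anna'],
    'Salary': [56000, 79000],
})

# Derive a new column from an existing one (Bonus is a made-up example)
df_staff['Bonus'] = df_staff['Salary'] * 0.10

# drop() returns a new DataFrame; assigning the result keeps it
trimmed = df_staff.drop(columns=['Bonus'])

print(list(df_staff.columns))  # the original still has Bonus
print(list(trimmed.columns))   # the returned copy does not
```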

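## 5️⃣ Basic Operations on DataFrames
`describe()` summarizes the numeric columns, a list of column names selects several columns at once, and a boolean condition inside `df[...]` keeps only the rows where it is true.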

In [None]:
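# Summary statistics (numeric columns only by default)
print(df.describe())

# Selecting multiple columns (note the double brackets: a list of names)
print(df[['Name','Salary']])

# Filtering rows with a boolean condition
print(df[df['Salary'] > 60000])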
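
Filters can be combined. The following is a minimal sketch, assuming a hypothetical table `df_emp` with the same values as the examples above; `mask` is an illustrative name.

```python
import pandas as pd

# Hypothetical table mirroring the tutorial data
df_emp = pd.DataFrame({
    'Name': ['John', 'Anna', 'Linda', 'Peter'],
    'Age': [26, 34, 29, 42],
    'Salary': [56000, 79000, 65000, 85000],
})

# Each condition needs its own parentheses; use & (and), | (or), ~ (not)
mask = (df_emp['Salary'] > 60000) & (df_emp['Age'] < 40)
print(df_emp[mask])

# Sort the filtered rows by salary, highest first
print(df_emp[mask].sort_values('Salary', ascending=False))
```

Python's `and` and `or` keywords do not work element-wise on Series, which is why the `&` and `|` operators are used here.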

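## 6️⃣ Handling Missing Data
Pandas marks missing numeric values as `NaN`. `isnull()` detects them and `fillna()` replaces them.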

In [None]:
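import numpy as np

# Salary is an integer column; cast it to float first so it can hold NaN
# (newer pandas versions warn when an assignment forces a dtype change)
df['Salary'] = df['Salary'].astype(float)

# Insert a missing value
df.loc[2, 'Salary'] = np.nan
print(df)

# Detect NaN (True marks a missing cell)
print(df.isnull())

# Fill NaN with the mean of the remaining salaries
df['Salary'] = df['Salary'].fillna(df['Salary'].mean())
print(df)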
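
Filling is not the only option. The sketch below shows two related calls, counting missing values per column and dropping incomplete rows; `df_gaps`, `missing_counts`, and `complete_rows` are illustrative names.

```python
import numpy as np
import pandas as pd

# Hypothetical table with one missing salary
df_gaps = pd.DataFrame({
    'Name': ['John', 'Anna', 'Linda'],
    'Salary': [56000.0, np.nan, 65000.0],
})

# Count missing values per column (isna() is an alias of isnull())
missing_counts = df_gaps.isna().sum()
print(missing_counts)

# Drop rows where Salary is missing; df_gaps itself is left unchanged
complete_rows = df_gaps.dropna(subset=['Salary'])
print(complete_rows)
```

Whether to fill or drop depends on the data: filling with the mean keeps every row but invents a value, while dropping keeps only observed values at the cost of losing rows.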

## 7️⃣ Hands-on Exercises ✍️
1. Create a DataFrame of 5 students with the columns Name, Age, Grade, and City.
2. Access the 'Grade' column, then access the 3rd row.
3. Add a new column 'Scholarship' with True/False values.
4. Drop the 'City' column permanently.
5. Set one value in your DataFrame to `NaN`, then detect it and handle it (fill or drop).

---
✅ This concludes the **Pandas DataFrame tutorial**.