### **Module 4: Data Manipulation with Pandas**

---

### **Introduction to Pandas**

Pandas is a Python library for working with tabular data. Its central structure is the **dataframe** (`DataFrame` in code), which is similar to an Excel spreadsheet or a SQL table. Pandas is widely used for data analysis and manipulation.

Key features of Pandas:
- **Dataframes and Series**: Dataframes store 2D data (rows and columns), while Series are 1D labeled arrays, such as a single column of a dataframe.
- **Data Manipulation**: Add, remove, or modify rows and columns with short expressions.
- **Descriptive Statistics**: Quickly calculate means, sums, counts, and more.
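
As a rough illustration of the features listed above, the sketch below builds one Series and one dataframe and computes a few summary values. The names `scores` and `table` and all of the numbers are made up for this example.

```python
import pandas as pd

# Illustrative values only, not real statistics
scores = pd.Series([10, 20, 30], name="Score")                      # 1D
table = pd.DataFrame({"Score": [10, 20, 30], "Bonus": [1, 2, 3]})   # 2D

series_dims = scores.ndim            # number of dimensions of the Series
table_dims = table.ndim              # number of dimensions of the dataframe

mean_score = table["Score"].mean()   # average of the Score column
total_bonus = table["Bonus"].sum()   # sum of the Bonus column
row_count = table["Score"].count()   # number of non-missing values

print("Series dimensions:", series_dims)
print("Dataframe dimensions:", table_dims)
print("Mean score:", mean_score)
print("Total bonus:", total_bonus)
print("Row count:", row_count)
```

Selecting a single column, as in `table["Score"]`, returns a Series, which is why the same statistics methods work on both.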

---

### **Working with Pandas Dataframes**

#### **Creating a Pandas Dataframe**
You can create a dataframe using Python dictionaries or lists.


In [None]:
import pandas as pd  # Import the Pandas library

# Create a dataframe from a dictionary
player_data = {
    "Name": ["LeBron James", "Stephen Curry", "Kevin Durant"],
    "Team": ["Lakers", "Warriors", "Suns"],
    "Points Per Game": [27.2, 24.6, 26.9],
}
df = pd.DataFrame(player_data)
print(df)
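
The cell above uses a dictionary. A dataframe can also be built from a list of rows, which is the other option mentioned. This is a minimal sketch using the same player data; the variable names `rows` and `df_from_lists` are chosen for this example only.

```python
import pandas as pd

# Each inner list is one row; column names are supplied separately
rows = [
    ["LeBron James", "Lakers", 27.2],
    ["Stephen Curry", "Warriors", 24.6],
    ["Kevin Durant", "Suns", 26.9],
]
df_from_lists = pd.DataFrame(rows, columns=["Name", "Team", "Points Per Game"])
print(df_from_lists)
```

The dictionary form groups values by column, while the list form groups them by row. Which one reads better usually depends on how the data arrives.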

#### **Adding Data to a Dataframe**
You can add a new column by assigning a list with one value per row, and add new rows by concatenating another dataframe with `pd.concat`.

In [None]:
# Add a new column for player heights
df["Height (cm)"] = [206, 191, 208]
print("Dataframe with heights:\n", df, sep="")  # sep="" avoids a stray space before the table

# Add a new row
new_player = pd.DataFrame(
    {"Name": ["Giannis Antetokounmpo"], "Team": ["Bucks"], "Points Per Game": [29.9], "Height (cm)": [211]}
)
df = pd.concat([df, new_player], ignore_index=True)
print("Updated dataframe:\n", df, sep="")
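
Columns can also be derived from existing ones or removed again. The sketch below is one possible way to do this; it rebuilds a small dataframe so that it runs on its own, and the names `heights`, `trimmed`, and the `Height (m)` column are assumptions made for this example.

```python
import pandas as pd

# Small standalone dataframe using the heights from the cell above
heights = pd.DataFrame(
    {
        "Name": ["LeBron James", "Stephen Curry", "Kevin Durant"],
        "Height (cm)": [206, 191, 208],
    }
)

# Derive a new column from an existing one (centimetres to metres)
heights["Height (m)"] = heights["Height (cm)"] / 100

# drop() returns a new dataframe; `heights` itself is left unchanged
trimmed = heights.drop(columns=["Height (cm)"])

print(heights)
print(trimmed)
```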


#### **Extracting Data from a Dataframe**
You can access specific columns by name, rows by position with `.iloc`, or individual values by combining a row condition and a column name with `.loc`.

In [None]:
# Access a single column
print("Names of players:\n", df["Name"], sep="")

# Access a single row by position (0 is the first row)
print("First player's data:\n", df.iloc[0], sep="")

# Access a specific value: filter rows by name, pick the column, take the first match
lebron_ppg = df.loc[df["Name"] == "LeBron James", "Points Per Game"].iloc[0]
print("LeBron's points per game:", lebron_ppg)
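
The same ideas extend to selecting several rows at once. The following sketch rebuilds the player dataframe so it is self-contained, then shows a boolean filter, a positional slice, and a single-cell lookup. The threshold of 25 and the variable names are arbitrary choices for this example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["LeBron James", "Stephen Curry", "Kevin Durant"],
        "Team": ["Lakers", "Warriors", "Suns"],
        "Points Per Game": [27.2, 24.6, 26.9],
    }
)

# Keep only the rows where the condition is True
high_scorers = df[df["Points Per Game"] > 25]

# Take the first two rows by position (the end of the slice is excluded)
first_two = df.iloc[0:2]

# Look up one cell by row label and column name
curry_team = df.at[1, "Team"]

print(high_scorers)
print(first_two)
print("Team of the player in row 1:", curry_team)
```

`.at` is intended for reading or writing a single value, whereas `.loc` and `.iloc` can also return whole rows or blocks.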


### **Your Turn: Exercises**

1. Create a dataframe with the following data for three sports:
   - **Columns**: `Sport`, `Players Per Team`, `Is Team Sport`
   - **Data**:  
     - `Basketball`, `5`, `True`  
     - `Tennis`, `1`, `False`  
     - `Soccer`, `11`, `True`

2. Add a new column to the dataframe from Exercise 1 named `Average Duration (minutes)` with the values `[48, 90, 90]`.

3. Extract the `Sport` and `Average Duration (minutes)` columns from the dataframe.
