# Data Analysis with Pandas

In this lesson, you will learn how to use the Pandas library for data analysis in JupyterLab. You will explore how to load datasets, manipulate data, and perform basic analysis to extract meaningful insights.

## Learning Objectives
- Load data into Pandas DataFrames.
- Perform basic data manipulation techniques.
- Analyze data to extract insights.

## Why This Matters

Pandas is a Python library for manipulating and analyzing tabular data. Knowing how to work with DataFrames is essential for data science and analytics, because it lets you load, filter, and summarize structured data in a few lines of code.

## Key Concept: DataFrames

DataFrames are two-dimensional, size-mutable, potentially heterogeneous tabular data structures with labeled axes (rows and columns). They are similar to SQL tables or Excel spreadsheets.

```python
# Example: Creating a DataFrame from a Dictionary
import pandas as pd

data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
print(df)
```
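
To make the "labeled axes" and "potentially heterogeneous" parts of the definition more concrete, the following minimal sketch inspects the same two-row DataFrame. It uses only the standard `shape`, `columns`, `index`, and `dtypes` attributes. The dtype names printed can differ between pandas versions, so no output is shown for them.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# shape is a (rows, columns) tuple
print(df.shape)

# The two labeled axes: column labels and the row index
print(list(df.columns))
print(list(df.index))

# Each column carries its own dtype, so text and numbers can sit side by side
print(df.dtypes)
```

Because no index was passed in, pandas assigns the default integer row labels starting at 0.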

### Micro-Exercise 1: Create a DataFrame

**Exercise:** Create a DataFrame from a dictionary with your own data.

**Starter Code:**
```python
import pandas as pd

data = {'Column1': ['Value1', 'Value2'], 'Column2': [1, 2]}
df = pd.DataFrame(data)
print(df)
```

```python
# Micro-Exercise 1 Starter Code
import pandas as pd

data = {'Column1': ['Value1', 'Value2'], 'Column2': [1, 2]}
df = pd.DataFrame(data)
print(df)
```

## Key Concept: Data Manipulation

Data manipulation involves transforming data into a format that is suitable for analysis. This includes filtering, grouping, and aggregating data.

```python
# Example: Filtering Data in a DataFrame
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)

# Filtering rows where Age is greater than 28
filtered_df = df[df['Age'] > 28]
print(filtered_df)
```
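
Grouping and aggregating, the other two operations named above, can be sketched in a similar way. The `City` column and the extra row are hypothetical values invented for this illustration; they are not part of the lesson's data.

```python
import pandas as pd

# Hypothetical data: 'City' and the fourth row are invented for illustration
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'Dana'],
    'City': ['Paris', 'Paris', 'Oslo', 'Oslo'],
    'Age': [25, 30, 35, 45],
}
people = pd.DataFrame(data)

# Group rows by City, then aggregate Age within each group
age_by_city = people.groupby('City')['Age'].agg(['mean', 'max'])
print(age_by_city)
```

The result has one row per city, with the group label as the index and one column per aggregation.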

### Micro-Exercise 2: Filter Data

**Exercise:** Using the `df` from the filtering example above, keep only the rows where `Age` is greater than 28.

**Starter Code:**
```python
filtered_df = df[df['Age'] > 28]
print(filtered_df)
```

```python
# Micro-Exercise 2 Starter Code
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)

# Filtering rows where Age is greater than 28
filtered_df = df[df['Age'] > 28]
print(filtered_df)
```

## Examples

### Example 1: Creating a DataFrame from a Dictionary
This example demonstrates how to create a DataFrame using a dictionary where keys are column names and values are lists of column data.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
print(df)
```

### Example 2: Filtering Data in a DataFrame
This example shows how to filter rows in a DataFrame based on a condition. It reuses the `df` created in Example 1, so only Bob (age 30) passes the `Age > 26` test.

```python
filtered_df = df[df['Age'] > 26]
print(filtered_df)
```

## Main Exercise
**Description:** Load a dataset, perform filtering and grouping operations, and summarize your findings in a markdown cell.

**Starter Code:**
```python
import pandas as pd

df = pd.read_csv('your_dataset.csv')
# Perform your analysis here
```

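```python
# Main Exercise Starter Code
import pandas as pd

# Replace the placeholder file name with the path to your own CSV file
df = pd.read_csv('your_dataset.csv')

# Example analysis: group by a column and calculate the mean of each numeric column.
# numeric_only=True skips text columns, which would otherwise raise an error in pandas 2.0 and later.
summary = df.groupby('column_name').mean(numeric_only=True)
print(summary)
```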
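
Since `your_dataset.csv` and `column_name` are placeholders, the starter code cannot run as written. The following sketch shows the same load, filter, and group workflow on a small in-memory CSV so it can be run without any file. The column names and values are hypothetical and exist only for this illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for 'your_dataset.csv'
csv_text = """department,salary,years
Sales,50000,2
Sales,60000,5
Engineering,80000,3
Engineering,90000,7
"""

# read_csv accepts a file path or a file-like object such as StringIO
sample = pd.read_csv(io.StringIO(csv_text))

# Filter: keep rows with more than 2 years
experienced = sample[sample['years'] > 2]

# Group and aggregate: mean salary per department among the filtered rows
summary = experienced.groupby('department')['salary'].mean()
print(summary)
```

With a real file, only the `read_csv` argument and the column names would need to change.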

## Common Mistakes
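- Not checking data types (`df.dtypes`) before analysis. Numbers stored as text cannot be averaged or compared numerically.
- Forgetting to handle missing values in the dataset. Most pandas aggregations skip `NaN` by default, so gaps can distort a result without raising an error.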
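
Both mistakes can be caught with a few quick checks before any analysis. The sketch below is one possible approach, not the only one; the data is hypothetical, with `Age` deliberately stored as text and one value missing.

```python
import pandas as pd

# Hypothetical data: Age arrives as text and one value is missing
raw = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                    'Age': ['25', None, '35']})

# 1. Check the data types: Age is not numeric yet
print(raw.dtypes)

# 2. Convert Age to numbers; values that cannot be parsed become NaN
raw['Age'] = pd.to_numeric(raw['Age'], errors='coerce')

# 3. Count missing values per column
print(raw.isna().sum())

# 4. Handle them: either drop incomplete rows or fill with a chosen value
dropped = raw.dropna(subset=['Age'])
filled = raw.fillna({'Age': raw['Age'].median()})
print(filled)
```

Whether to drop or fill depends on the dataset; filling with the median is used here only as an example.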

## Recap
In this lesson, you learned how to use Pandas for data analysis: loading data into DataFrames and applying basic manipulation techniques such as filtering and grouping. Next, you can explore more advanced data analysis techniques and visualizations.