# Pandas Queries and Functions

This notebook covers basic, intermediate, and advanced queries and functions in Pandas, along with common interview questions.

## Basic Pandas Queries and Functions

Pandas is a powerful data manipulation library in Python. Here are some fundamental queries and functions essential for beginners:

- **Creating a DataFrame**:
```python
import pandas as pd
data = {'Name': ['John', 'Alice', 'Bob'], 'Age': [28, 24, 30]}
df = pd.DataFrame(data)
```

- **Accessing Data**:
  - Retrieve the first few rows: `df.head()`
  - Access specific columns: `df['Name']`

- **Filtering Data**:
  - Using boolean indexing:
  ```python
  filtered_df = df[df['Age'] > 25]
  ```

- **Handling Missing Values**:
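  - Drop rows with missing values: `df.dropna()`
  - Fill missing values: `df.fillna(value=0)`
  - Both return a new DataFrame and leave `df` unchanged, so assign the result (for example, `df = df.dropna()`) to keep it.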
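
The missing-value calls above can be sketched on a small example. The `people` DataFrame and its contents are hypothetical and exist only for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing age
people = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'Age': [28, np.nan, 30],
})

missing_per_column = people.isna().sum()  # number of missing values per column
dropped = people.dropna()                 # new DataFrame without incomplete rows
filled = people.fillna(value=0)           # new DataFrame with gaps replaced by 0
```

Neither `dropna()` nor `fillna()` modifies `people` here; each returns a new DataFrame.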

## Intermediate Pandas Queries and Functions

As you progress, you can utilize more complex operations:

- **Merging DataFrames**:
```python
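df1 = pd.DataFrame({'key': ['A', 'B'], 'value': [1, 2]})
df2 = pd.DataFrame({'key': ['A', 'B'], 'value': [3, 4]})
# Inner join on 'key'; the overlapping 'value' columns are
# returned as 'value_x' (from df1) and 'value_y' (from df2)
merged_df = pd.merge(df1, df2, on='key')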
```

- **Group By Operations**:
```python
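# Average age per name. Select the numeric column explicitly: the string
# column 'Name' cannot be averaged, and recent pandas versions raise an error
grouped = df.groupby('Name')['Age'].mean()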
```

- **Sorting Data**:
```python
sorted_df = df.sort_values(by='Age', ascending=False)
```

- **Applying Functions**:
```python
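# apply() calls the function on each element; for simple arithmetic,
# the vectorized form df['Age'] + 1 gives the same result
df['Age'] = df['Age'].apply(lambda x: x + 1)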
```

## Advanced Pandas Queries and Functions

For advanced use cases, Pandas offers more specialized features:

- **Pivot Tables**:
```python
# Uses df1 from the merge example, which has the 'key' and 'value' columns
pivot_table = df1.pivot_table(values='value', index='key', aggfunc='sum')
```

- **MultiIndex DataFrames**:

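A MultiIndex gives a DataFrame two or more index levels, so hierarchical data (for example, values keyed by both region and year) can be stored and selected in a single table.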
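
As a minimal sketch of a MultiIndex DataFrame, the example below builds a two-level index and selects from it. The `sales` data, the level names `region` and `year`, and the `revenue` column are all hypothetical:

```python
import pandas as pd

# Hypothetical sales figures keyed by (region, year)
index = pd.MultiIndex.from_tuples(
    [('East', 2020), ('East', 2021), ('West', 2020), ('West', 2021)],
    names=['region', 'year'],
)
sales = pd.DataFrame({'revenue': [100, 120, 90, 95]}, index=index)

# All rows for one outer-level key; the result is indexed by 'year'
east = sales.loc['East']

# A single cell, addressed by a full (region, year) tuple
west_2021 = sales.loc[('West', 2021), 'revenue']

# Aggregate across regions by grouping on one index level
by_year = sales.groupby(level='year')['revenue'].sum()
```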

- **Time Series Analysis**:
```python
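import numpy as np  # needed for np.random.randn
# 100 consecutive daily timestamps starting on January 1, 2020
date_range = pd.date_range(start='1/1/2020', periods=100)
time_series_df = pd.DataFrame(data={'date': date_range, 'value': np.random.randn(100)})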
```

- **Custom Aggregation Functions**:
You can define custom aggregation functions to apply during group operations.
```python
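def custom_agg(x):
    # Range of each group: largest value minus smallest
    return x.max() - x.min()

# Assumes df has a 'Category' column (the example df above does not define one)
# and a numeric 'Age' column to aggregate
result = df.groupby('Category')['Age'].agg(custom_agg)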
```

## Common Interview Questions

- **What is the difference between `merge` and `join`?**
`merge` joins on columns by default (and can also join on indices via `left_index` and `right_index`), while `DataFrame.join` joins on the index by default.
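
To make the contrast concrete, here is a small sketch that produces the same pairing of rows both ways. The `left` and `right` tables and their column names are hypothetical:

```python
import pandas as pd

# Hypothetical tables sharing a 'key' column
left = pd.DataFrame({'key': ['A', 'B'], 'left_val': [1, 2]})
right = pd.DataFrame({'key': ['A', 'B'], 'right_val': [3, 4]})

# merge: match rows on the 'key' column
merged = pd.merge(left, right, on='key')

# join: match rows on the index, so move 'key' into the index first
joined = left.set_index('key').join(right.set_index('key'))
```

In `merged`, `key` remains an ordinary column; in `joined`, it is the index.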

- **How do you handle missing data in a DataFrame?**
Use `isna()` to detect missing values, `dropna()` to remove the rows or columns that contain them, and `fillna()` to replace them with a chosen value.

- **Explain the use of the `groupby` function.**
`groupby` is used to split the data into groups based on some criteria and perform operations on these groups.

- **How can you filter a DataFrame using the query method?**
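Pass the condition as a string expression, for example `filtered_df = df.query("Age > 25")`, which selects the same rows as `df[df['Age'] > 25]`.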

- **What are some common aggregation functions you can use with groupby?**
Built-in aggregations such as `mean()`, `sum()`, and `count()` can be applied, as can custom functions passed to `agg()`.
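
As an illustration of the last answer, the sketch below applies several built-in aggregations and one custom function in a single `agg()` call. The `scores` data, the column names `Category` and `Score`, and the helper `value_range` are hypothetical:

```python
import pandas as pd

# Hypothetical data; 'Category' and 'Score' are illustrative column names
scores = pd.DataFrame({
    'Category': ['x', 'x', 'y', 'y', 'y'],
    'Score': [10, 20, 5, 15, 25],
})

def value_range(s):
    """Custom aggregation: largest value minus smallest in the group."""
    return s.max() - s.min()

# Named aggregation: each keyword becomes a column in the result
summary = scores.groupby('Category')['Score'].agg(
    mean='mean',
    total='sum',
    n='count',
    spread=value_range,
)
```

The result has one row per category and one column per keyword passed to `agg()`.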