<a href="https://colab.research.google.com/github/ddaeducation/DataAnalyst/blob/main/Filtering_Columns_and_Rows_in_Pandas.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Filtering Columns and Rows in Pandas

### Learning Objectives
1. Understand how to filter rows and columns in a Pandas DataFrame.
2. Learn to apply conditions to filter data.
3. Gain familiarity with indexing and selecting data in Pandas.

### Introduction
Pandas is a powerful data manipulation library in Python that provides data structures like Series and DataFrames. Filtering data is a common operation in data analysis, allowing you to focus on specific subsets of your data based on certain conditions. This can be particularly useful in data science for cleaning and preparing data for analysis.

### Methods for Filtering
To filter columns and rows in a Pandas DataFrame, you can use various methods such as boolean indexing, the `.loc[]` and `.iloc[]` accessors, and the `.query()` method. Below is an example of how to filter data using a sample DataFrame.

#### Example Data
Let's create a sample DataFrame to demonstrate filtering:


In [None]:
import pandas as pd

# Sample data
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
    'Salary': [70000, 80000, 60000, 90000, 75000]
}
# Create DataFrame
df = pd.DataFrame(data)
# Display the DataFrame
df

| | Name | Age | City | Salary |
|---|---|---|---|---|
| 0 | Alice | 24 | New York | 70000 |
| 1 | Bob | 27 | Los Angeles | 80000 |
| 2 | Charlie | 22 | Chicago | 60000 |
| 3 | David | 32 | Houston | 90000 |
| 4 | Eva | 29 | Phoenix | 75000 |



### Filtering Examples
1. **Filter Rows by Condition**: Select rows where Age is greater than 25.
   

In [None]:
# Keep only the rows where Age is greater than 25
df[df['Age'] > 25]


| | Name | Age | City | Salary |
|---|---|---|---|---|
| 1 | Bob | 27 | Los Angeles | 80000 |
| 3 | David | 32 | Houston | 90000 |
| 4 | Eva | 29 | Phoenix | 75000 |
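
To see why this works, it can help to split the expression into two steps. The sketch below rebuilds the sample DataFrame so that it runs on its own; the variable names `mask`, `older`, and `same_rows` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
    'Salary': [70000, 80000, 60000, 90000, 75000],
})

# Step 1: the comparison produces a boolean Series, one value per row
mask = df['Age'] > 25
print(mask.tolist())  # [False, True, False, True, True]

# Step 2: indexing with the mask keeps only the rows marked True
older = df[mask]
print(older['Name'].tolist())  # ['Bob', 'David', 'Eva']

# .loc accepts the same mask and returns the same rows
same_rows = df.loc[mask]
print(older.equals(same_rows))  # True
```

Note that the filtered result keeps the original index labels (1, 3, 4) rather than renumbering from 0.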



2. **Filter Specific Columns**: Select only the 'Name' and 'Salary' columns.
   

In [None]:
# Pass a list of column names to keep only those columns
df[['Name', 'Salary']]


| | Name | Salary |
|---|---|---|
| 0 | Alice | 70000 |
| 1 | Bob | 80000 |
| 2 | Charlie | 60000 |
| 3 | David | 90000 |
| 4 | Eva | 75000 |
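
Column selection can also be combined with row filtering. The following sketch, which again rebuilds the sample data so that it is self-contained, contrasts single and double brackets and shows the `.loc[]` and `.iloc[]` accessors mentioned earlier. Names such as `subset` and `by_position` are placeholders chosen for illustration.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
    'Salary': [70000, 80000, 60000, 90000, 75000],
})

# Single brackets with one label return a Series
names_series = df['Name']
# Double brackets (a list of labels) return a DataFrame
names_frame = df[['Name']]
print(type(names_series).__name__)  # Series
print(type(names_frame).__name__)   # DataFrame

# .loc selects rows and columns by label in one step:
# rows where Age > 25, columns 'Name' and 'Salary'
subset = df.loc[df['Age'] > 25, ['Name', 'Salary']]
print(subset['Name'].tolist())  # ['Bob', 'David', 'Eva']

# .iloc selects by integer position: all rows, columns 0 and 3
by_position = df.iloc[:, [0, 3]]
print(by_position.columns.tolist())  # ['Name', 'Salary']
```

In short, `.loc[]` works with labels and boolean masks, while `.iloc[]` works with integer positions only.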



3. **Filter Rows with Multiple Conditions**: Select rows where Age is greater than 25 and Salary is less than 80000.
   

In [None]:
# Combine conditions with & (AND); wrap each condition in parentheses
df[(df['Age'] > 25) & (df['Salary'] < 80000)]


| | Name | Age | City | Salary |
|---|---|---|---|---|
| 4 | Eva | 29 | Phoenix | 75000 |
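
Other ways of combining conditions follow the same pattern. As a rough sketch (the thresholds, city list, and variable names here are made up for illustration), `|` expresses OR, `~` negates a condition, `.isin()` tests membership in a list, and `.query()` accepts the condition as a string:

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
    'Salary': [70000, 80000, 60000, 90000, 75000],
})

# AND: both conditions must hold
both = df[(df['Age'] > 25) & (df['Salary'] < 80000)]
print(both['Name'].tolist())  # ['Eva']

# OR: at least one condition must hold
either = df[(df['Age'] > 30) | (df['Salary'] < 65000)]
print(either['Name'].tolist())  # ['Charlie', 'David']

# NOT: invert a condition with ~
not_chicago = df[~(df['City'] == 'Chicago')]
print(not_chicago['Name'].tolist())  # ['Alice', 'Bob', 'David', 'Eva']

# Membership: keep rows whose City appears in a list
selected = df[df['City'].isin(['Houston', 'Phoenix'])]
print(selected['Name'].tolist())  # ['David', 'Eva']

# .query() expresses the AND filter as a string
via_query = df.query('Age > 25 and Salary < 80000')
print(via_query.equals(both))  # True
```

Each condition is wrapped in parentheses because `&` and `|` bind more tightly than comparison operators in Python. Outside of a `.query()` string, the keywords `and` and `or` do not work element-wise on a Series and raise a `ValueError`.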



### Practice Questions
The following 10 questions test your understanding of filtering in Pandas. Unless stated otherwise, they refer to the sample DataFrame `df` created earlier.

**Question 1**: What is the purpose of filtering data in a DataFrame?

**Question 2**: How can you filter rows in a DataFrame where the 'City' is 'Chicago'?

**Question 3**: Write a command to filter the DataFrame to show only the names of individuals older than 25.


**Question 4**: How would you select the 'Age' and 'City' columns from the DataFrame?

**Question 5**: What is the difference between `.loc[]` and `.iloc[]` in Pandas?

**Question 6**: Write a command to filter the DataFrame for individuals with a salary greater than 70000 and living in 'New York'.

**Question 7**: How can you use the `.query()` method to filter for individuals younger than 30?

**Question 8**: What will be the output of `df[df['Salary'] < 70000]`?

**Question 9**: Write a command to filter the DataFrame to show only the rows where the 'Name' starts with 'A'.

**Question 10**: How can you reset the index of a filtered DataFrame?

Together, these examples and questions cover the core techniques for filtering rows and columns in Pandas: boolean indexing, column selection, and combining multiple conditions.