# Week 6: Data Filtering and Conditional Selections


## Objectives:
This week, you will:
1. Learn how to filter datasets based on conditions.
2. Use conditional selections to extract subsets of data.
3. Practice combining multiple conditions for more advanced filtering.




## 1. Introduction to Data Filtering
Filtering data allows you to extract subsets of data that meet specific conditions. For example, you may want to extract rows where visitor counts are above a certain threshold or where revenue is below a certain value.

### Common Methods:
- **Filtering with boolean indexing**: Use conditions to filter data based on column values.
- **Combining multiple conditions**: Use logical operators to filter data based on multiple conditions.

Let's start by loading a sample dataset and filtering data based on visitor counts.


In [1]:

# Import pandas and load a sample dataset
import pandas as pd

# Sample dataset
data = {
    'Location': ['Park A', 'Museum B', 'Beach C', 'Park A', 'Museum B'],
    'Visitors': [200, 150, 100, 300, 180],
    'Revenue': [1000, 750, 500, 1500, 900]
}

df = pd.DataFrame(data)

# Filter rows where Visitors are greater than 150
df_filtered = df[df['Visitors'] > 150]

# Show the result
df_filtered


|   | Location | Visitors | Revenue |
|---|----------|----------|---------|
| 0 | Park A   | 200      | 1000    |
| 3 | Park A   | 300      | 1500    |
| 4 | Museum B | 180      | 900     |
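
To see why boolean indexing works, it can help to look at the intermediate result of the comparison. The sketch below rebuilds the sample dataset so that it runs on its own; the names `mask` and `visitors_only` are illustrative choices, not part of the original notebook.

```python
import pandas as pd

# Rebuild the sample dataset so this cell runs on its own
df = pd.DataFrame({
    'Location': ['Park A', 'Museum B', 'Beach C', 'Park A', 'Museum B'],
    'Visitors': [200, 150, 100, 300, 180],
    'Revenue': [1000, 750, 500, 1500, 900]
})

# The comparison returns a boolean Series with one True/False per row
mask = df['Visitors'] > 150
print(mask.tolist())  # [True, False, False, True, True]

# Indexing with the mask keeps only the rows marked True
df_filtered = df[mask]

# .loc accepts the same mask and can also select columns in the same step
visitors_only = df.loc[mask, ['Location', 'Visitors']]
```

Note that the filtered rows keep their original index labels (0, 3, 4). If a fresh 0-based index is preferred, `df_filtered.reset_index(drop=True)` is one way to get it.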



## 2. Filtering with Multiple Conditions
In many cases, you need to filter data on more than one condition at once. For example, you may want rows where visitor counts are above one threshold **and** revenue is above another. In pandas, conditions are combined with the AND operator (`&`), and each condition must be wrapped in its own parentheses.

### Example:
Let's filter the data to show only rows where visitor counts are greater than 150 and revenue is greater than 800.


In [2]:

# Filter rows where Visitors > 150 and Revenue > 800
df_filtered_multi = df[(df['Visitors'] > 150) & (df['Revenue'] > 800)]

# Show the result
df_filtered_multi


|   | Location | Visitors | Revenue |
|---|----------|----------|---------|
| 0 | Park A   | 200      | 1000    |
| 3 | Park A   | 300      | 1500    |
| 4 | Museum B | 180      | 900     |
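
Two details of the AND filter are easy to trip over: each condition needs its own parentheses, and Python's `and` keyword cannot be used in place of `&`. The sketch below illustrates both, and also shows `DataFrame.query` as an alternative way to write the same filter. The variable names (`both`, `used_and_ok`, `both_query`) are illustrative only.

```python
import pandas as pd

# Rebuild the sample dataset so this cell runs on its own
df = pd.DataFrame({
    'Location': ['Park A', 'Museum B', 'Beach C', 'Park A', 'Museum B'],
    'Visitors': [200, 150, 100, 300, 180],
    'Revenue': [1000, 750, 500, 1500, 900]
})

# Parentheses are required because & binds more tightly than > in Python
both = df[(df['Visitors'] > 150) & (df['Revenue'] > 800)]

# Python's `and` does not work element-wise on a Series; it raises ValueError
try:
    df[(df['Visitors'] > 150) and (df['Revenue'] > 800)]
    used_and_ok = True
except ValueError:
    used_and_ok = False

# DataFrame.query expresses the same filter as a string expression
both_query = df.query('Visitors > 150 and Revenue > 800')
```

Whether to use boolean indexing or `query` is largely a matter of taste; `query` can be easier to read when a filter has many conditions.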



## 3. Filtering with OR Conditions
Sometimes, you may want to filter data where **either** of two conditions is true. This can be done using the OR operator (`|`).

### Example:
Let's filter the data to show rows where visitor counts are greater than 250 **or** revenue is below 800.


In [3]:

# Filter rows where Visitors > 250 or Revenue < 800
df_filtered_or = df[(df['Visitors'] > 250) | (df['Revenue'] < 800)]

# Show the result
df_filtered_or


|   | Location | Visitors | Revenue |
|---|----------|----------|---------|
| 1 | Museum B | 150      | 750     |
| 2 | Beach C  | 100      | 500     |
| 3 | Park A   | 300      | 1500    |
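
A related tool, not covered in the original cells, is the NOT operator (`~`), which flips every value in a boolean mask. As a minimal sketch under that assumption, the example below selects the rows that match the OR filter and then the rows that match neither condition. The names `either_mask`, `either`, and `neither` are illustrative.

```python
import pandas as pd

# Rebuild the sample dataset so this cell runs on its own
df = pd.DataFrame({
    'Location': ['Park A', 'Museum B', 'Beach C', 'Park A', 'Museum B'],
    'Visitors': [200, 150, 100, 300, 180],
    'Revenue': [1000, 750, 500, 1500, 900]
})

# Mask that is True where at least one of the two conditions holds
either_mask = (df['Visitors'] > 250) | (df['Revenue'] < 800)

# Rows matching the OR filter
either = df[either_mask]

# ~ negates the mask, so this keeps the rows where neither condition holds
neither = df[~either_mask]
```

Together, `either` and `neither` cover every row of `df` exactly once, which can be a quick sanity check on a filter.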



## 4. Filtering with String Values
You can also filter data based on string values. For example, you may want to filter data based on specific locations.

### Example:
Let's filter the data to show only rows where the location is 'Park A'.


In [4]:

# Filter rows where Location is 'Park A'
df_filtered_string = df[df['Location'] == 'Park A']

# Show the result
df_filtered_string


|   | Location | Visitors | Revenue |
|---|----------|----------|---------|
| 0 | Park A   | 200      | 1000    |
| 3 | Park A   | 300      | 1500    |
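
Exact equality is the simplest string filter, but real data often needs looser matching. The sketch below, which goes slightly beyond the original example, uses the pandas `.str` accessor to match on a prefix and to compare without regard to case. The names `parks` and `park_a_any_case` are illustrative.

```python
import pandas as pd

# Rebuild the sample dataset so this cell runs on its own
df = pd.DataFrame({
    'Location': ['Park A', 'Museum B', 'Beach C', 'Park A', 'Museum B'],
    'Visitors': [200, 150, 100, 300, 180],
    'Revenue': [1000, 750, 500, 1500, 900]
})

# Keep rows whose Location starts with 'Park'
parks = df[df['Location'].str.startswith('Park')]

# Lowercase the column before comparing to make the match case-insensitive
park_a_any_case = df[df['Location'].str.lower() == 'park a']
```

Both filters return the same two rows here, because the sample data contains only one location that begins with "Park".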



## 5. Combining Multiple String Conditions
Just as with numeric conditions, you can combine multiple string conditions using the AND (`&`) or OR (`|`) operators.

### Example:
Let's filter the data to show rows where the location is either 'Park A' or 'Museum B'.


In [5]:

# Filter rows where Location is 'Park A' or 'Museum B'
df_filtered_multi_string = df[(df['Location'] == 'Park A') | (df['Location'] == 'Museum B')]

# Show the result
df_filtered_multi_string


|   | Location | Visitors | Revenue |
|---|----------|----------|---------|
| 0 | Park A   | 200      | 1000    |
| 1 | Museum B | 150      | 750     |
| 3 | Park A   | 300      | 1500    |
| 4 | Museum B | 180      | 900     |
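
When the list of accepted values grows, chaining `==` comparisons with `|` becomes hard to read. One common alternative is `Series.isin`, sketched below on the same data. The names `wanted`, `df_isin`, and `df_not_wanted` are illustrative.

```python
import pandas as pd

# Rebuild the sample dataset so this cell runs on its own
df = pd.DataFrame({
    'Location': ['Park A', 'Museum B', 'Beach C', 'Park A', 'Museum B'],
    'Visitors': [200, 150, 100, 300, 180],
    'Revenue': [1000, 750, 500, 1500, 900]
})

# Values to keep
wanted = ['Park A', 'Museum B']

# isin returns True for rows whose Location appears in the list
df_isin = df[df['Location'].isin(wanted)]

# Negating the mask keeps every location that is not in the list
df_not_wanted = df[~df['Location'].isin(wanted)]
```

The `isin` version selects the same rows as the two-condition OR filter, and adding a third location only requires extending the list.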



## 6. Summary
This week, you learned how to:
1. Filter data based on conditions using boolean indexing.
2. Combine multiple conditions to filter data more effectively.
3. Filter data based on string values and combine string conditions.

### Homework:
- Practice filtering data from a new dataset using the techniques you learned this week.
- Experiment with filtering based on both numeric and string conditions for deeper insights.

