Disclaimer: The code below is explained from the author's perspective. Feel free to point out any mistakes you find.

In [8]:
import pandas as pd

In [2]:
data = [['Anika', 20,"Dhaka","female","Student"],
       ['Raj', 22,"Rajshahi","male","Teacher"],
       ['Saarah', 27,"Dhaka","female","Engineer"],
       ['Zahra', 29,"Chittagong","female","Economist"],
       ['Abrar', 25,"Rajshahi","male","Doctor"]]
df = pd.DataFrame(data, columns=['Name', 'Age','City',"Gender",'Profession'])
print(df)

     Name  Age        City  Gender Profession
0   Anika   20       Dhaka  female    Student
1     Raj   22    Rajshahi    male    Teacher
2  Saarah   27       Dhaka  female   Engineer
3   Zahra   29  Chittagong  female  Economist
4   Abrar   25    Rajshahi    male     Doctor


# Previous questions

The questions were: 
1. Information of people whose age > 20 
2. Information of people who live in Rajshahi
3. Information of people whose gender is male 

### Information of people whose age > 20 

To find the people whose age is greater than 20, first compare the "Age" column with 20. The comparison returns a boolean Series with one True or False value per row.

In [3]:
df["Age"] > 20

0    False
1     True
2     True
3     True
4     True
Name: Age, dtype: bool
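
Because True counts as 1 and False as 0, a boolean Series like the one above can also be summed to count the matching rows. The sketch below is not part of the original notebook. It rebuilds only the "Name" and "Age" columns of the sample data, and the names `mask`, `n_matching` and `matching_labels` are chosen here purely for illustration.

```python
import pandas as pd

# Rebuild the two columns of the sample data that this example needs.
df = pd.DataFrame({
    "Name": ["Anika", "Raj", "Saarah", "Zahra", "Abrar"],
    "Age": [20, 22, 27, 29, 25],
})

mask = df["Age"] > 20                        # boolean Series, one value per row
n_matching = int(mask.sum())                 # True counts as 1, False as 0
matching_labels = mask[mask].index.tolist()  # index labels where the mask is True

print(n_matching)        # 4
print(matching_labels)   # [1, 2, 3, 4]
```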

To keep only the rows where the condition is True (rows where it is False are dropped), use the boolean Series to index the DataFrame:

In [4]:
df[df["Age"] > 20]

     Name  Age        City  Gender Profession
1     Raj   22    Rajshahi    male    Teacher
2  Saarah   27       Dhaka  female   Engineer
3   Zahra   29  Chittagong  female  Economist
4   Abrar   25    Rajshahi    male     Doctor

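The filtered rows keep their original index labels (1 to 4). To renumber the rows from 0, the index has to be reset. By default, reset_index() keeps the old index as a new column named "index".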

In [5]:
df[df["Age"] > 20].reset_index()

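|   | index | Name   | Age | City       | Gender | Profession |
|---|-------|--------|-----|------------|--------|------------|
| 0 | 1     | Raj    | 22  | Rajshahi   | male   | Teacher    |
| 1 | 2     | Saarah | 27  | Dhaka      | female | Engineer   |
| 2 | 3     | Zahra  | 29  | Chittagong | female | Economist  |
| 3 | 4     | Abrar  | 25  | Rajshahi   | male   | Doctor     |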


To discard the old index instead of keeping it as a column, pass drop=True:

In [6]:
df[df["Age"] > 20].reset_index(drop=True)

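|   | Name   | Age | City       | Gender | Profession |
|---|--------|-----|------------|--------|------------|
| 0 | Raj    | 22  | Rajshahi   | male   | Teacher    |
| 1 | Saarah | 27  | Dhaka      | female | Engineer   |
| 2 | Zahra  | 29  | Chittagong | female | Economist  |
| 3 | Abrar  | 25  | Rajshahi   | male   | Doctor     |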


### Information of people who live in Rajshahi

Now, the rows of the people whose city is Rajshahi have to be printed.

In [9]:
df["city"] == "Rajshahi"

KeyError: 'city'

Make sure the column name is typed correctly; column names are case-sensitive. Here, the *KeyError* means that no column named 'city' exists in the DataFrame df (the actual column is 'City').
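
One way to avoid this kind of error is to look at the available column labels before filtering. The sketch below assumes a reduced two-column version of the sample data. `find_column` is a hypothetical helper written for this example, not a pandas function.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Anika", "Raj"], "City": ["Dhaka", "Rajshahi"]})

print(df.columns.tolist())             # ['Name', 'City']

has_lowercase = "city" in df.columns   # False: labels are case-sensitive
has_exact = "City" in df.columns       # True


def find_column(frame, wanted):
    """Hypothetical helper: return the real label, ignoring upper/lower case."""
    matches = [col for col in frame.columns if col.lower() == wanted.lower()]
    return matches[0] if matches else None


print(find_column(df, "city"))         # City
```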

In [10]:
# Boolean values show which index in column 'City' matches 'Rajshahi'
df["City"] == "Rajshahi"

0    False
1     True
2    False
3    False
4     True
Name: City, dtype: bool

In [11]:
# Will return the rows containing indexes 1 and 4 (as true values seen above)
df[df["City"] == "Rajshahi"]

|   | Name  | Age | City     | Gender | Profession |
|---|-------|-----|----------|--------|------------|
| 1 | Raj   | 22  | Rajshahi | male   | Teacher    |
| 4 | Abrar | 25  | Rajshahi | male   | Doctor     |


### Information of people whose gender is male

In [12]:
df[df["Gender"]=="male"].reset_index(drop=True)

|   | Name  | Age | City     | Gender | Profession |
|---|-------|-----|----------|--------|------------|
| 0 | Raj   | 22  | Rajshahi | male   | Teacher    |
| 1 | Abrar | 25  | Rajshahi | male   | Doctor     |


# More Questions

1. The name of the city where Zahra lives
2. Information of people whose age is greater than 25 and who live in Dhaka
3. Information of people whose age is greater than 25 and less than 30
4. Information of people living in Dhaka or Chittagong

## Finding the values of a specific column based on a condition

### 1. The name of the city where Zahra lives

In [13]:
df

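|   | Name   | Age | City       | Gender | Profession |
|---|--------|-----|------------|--------|------------|
| 0 | Anika  | 20  | Dhaka      | female | Student    |
| 1 | Raj    | 22  | Rajshahi   | male   | Teacher    |
| 2 | Saarah | 27  | Dhaka      | female | Engineer   |
| 3 | Zahra  | 29  | Chittagong | female | Economist  |
| 4 | Abrar  | 25  | Rajshahi   | male   | Doctor     |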


To find the name of the city where Zahra lives, only the "City" column is needed; the other columns are not. First filter the rows where "Name" is "Zahra", then select the "City" column:

In [14]:
df[df["Name"]=="Zahra"]["City"]

3    Chittagong
Name: City, dtype: object

A single column of a DataFrame is a pandas *Series*. Since the Series above contains only one element, we can use the pandas *Series.item()* function to return that single element, *Chittagong*. More details in [2].

In [15]:
df[df["Name"]=="Zahra"]["City"].item()

'Chittagong'
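
The same lookup can also be written with *.loc*, which takes the row condition and the column label in a single step instead of two chained selections. This is a sketch of an alternative, not the notebook's own method. It rebuilds only the "Name" and "City" columns, and the variable names are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Anika", "Raj", "Saarah", "Zahra", "Abrar"],
    "City": ["Dhaka", "Rajshahi", "Dhaka", "Chittagong", "Rajshahi"],
})

# Rows are chosen by the boolean condition, the column by its label.
city_series = df.loc[df["Name"] == "Zahra", "City"]
zahra_city = city_series.item()   # valid because exactly one row matches

print(zahra_city)                 # Chittagong
```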

However, *Series.item()* cannot be used if the *Series* object contains more than one element. Below is a Series with the names of the people living in Dhaka:

In [16]:
df[df["City"]=="Dhaka"]["Name"]

0     Anika
2    Saarah
Name: Name, dtype: object

Now, applying the *Series.item()* function gives the following error:

In [17]:
df[df["City"]=="Dhaka"]["Name"].item()

ValueError: can only convert an array of size 1 to a Python scalar

The ValueError is raised because the Series holds more than one element (size > 1), so there is no single value to return.
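
When a filter may match any number of rows, the values can be read without *Series.item()*. The sketch below shows a few possible options on a reduced copy of the sample data. The variable names are illustrative, and which option fits depends on what should happen when several rows match.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Anika", "Raj", "Saarah", "Zahra", "Abrar"],
    "City": ["Dhaka", "Rajshahi", "Dhaka", "Chittagong", "Rajshahi"],
})

names = df.loc[df["City"] == "Dhaka", "Name"]

names_list = names.tolist()   # every match as a plain Python list
first_name = names.iloc[0]    # only the first match, selected by position
# Call item() only when exactly one element is present; otherwise fall back to None.
only_name = names.item() if len(names) == 1 else None

print(names_list)   # ['Anika', 'Saarah']
print(first_name)   # Anika
print(only_name)    # None
```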

## Dataframe Filtering for more than one condition

Filtering on more than one condition relies on truth tables. Let's review the *and* and *or* operations.

### AND operation

 <center>Truth Table for *AND* operation for two inputs:</center>

| a   | b   | Output |
|-----|-----|--------|
| 0   | 0   | 0      |
| 0   | 1   | 0      |
| 1   | 0   | 0      |
| 1   | 1   | 1      |

The output is 0 if either input, or both, is 0, and it is 1 only if both inputs are 1. In a DataFrame, 0 corresponds to False and 1 to True.
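
In pandas, the *and* of two boolean Series is written with the & operator and is applied row by row. The sketch below feeds the four input combinations of the truth table above into two small Series, named `a` and `b` after the table columns.

```python
import pandas as pd

# The four rows of the AND truth table, with 0 written as False and 1 as True.
a = pd.Series([False, False, True, True])
b = pd.Series([False, True, False, True])

result = a & b   # element-wise AND

print(result.tolist())   # [False, False, False, True]
```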

#### 2. Information of people whose age is greater than 25 and who live in Dhaka

Now, the dataframe is

In [18]:
df

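|   | Name   | Age | City       | Gender | Profession |
|---|--------|-----|------------|--------|------------|
| 0 | Anika  | 20  | Dhaka      | female | Student    |
| 1 | Raj    | 22  | Rajshahi   | male   | Teacher    |
| 2 | Saarah | 27  | Dhaka      | female | Engineer   |
| 3 | Zahra  | 29  | Chittagong | female | Economist  |
| 4 | Abrar  | 25  | Rajshahi   | male   | Doctor     |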


In [19]:
# people older than 25
df["Age"]>25

0    False
1    False
2     True
3     True
4    False
Name: Age, dtype: bool

In [20]:
# people who live in Dhaka
df["City"]=="Dhaka"

0     True
1    False
2     True
3    False
4    False
Name: City, dtype: bool

<center>Now, the truth table for people whose (age > 25) and (lives in = Dhaka)</center>

| index | Age > 25 | City == "Dhaka" | Output |
|-------|----------|-----------------|--------|
| 0     | 0        | 1               | 0      |
| 1     | 0        | 0               | 0      |
| 2     | 1        | 1               | 1      |
| 3     | 1        | 0               | 0      |
| 4     | 0        | 0               | 0      |

Thus, only the row with index 2 is returned, since it is the only row that satisfies both conditions of the *and* operation. The code below determines, step by step, the information of people whose age is greater than 25 and who live in Dhaka:

In [22]:
query1 = df["Age"]>25
query2 = df["City"]=="Dhaka"

In [23]:
df[query1 & query2]

|   | Name   | Age | City  | Gender | Profession |
|---|--------|-----|-------|--------|------------|
| 2 | Saarah | 27  | Dhaka | female | Engineer   |


The code can also be written as:

In [24]:
df[(df["Age"]>25) & (df["City"]=="Dhaka")]

|   | Name   | Age | City  | Gender | Profession |
|---|--------|-----|-------|--------|------------|
| 2 | Saarah | 27  | Dhaka | female | Engineer   |


Make sure not to miss the brackets around each condition! In Python, the & operator is evaluated before > and ==, so without the brackets an error is raised:

In [25]:
df[df["Age"]>25 & df["City"]=="Dhaka"]

TypeError: Cannot perform 'rand_' with a dtyped [object] array and scalar of type [bool]
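
The error comes from operator precedence: & binds more tightly than comparison operators, so the expression above first tries to compute 25 & df["City"]. The plain-integer sketch below shows the same parsing rule without pandas. The numbers are chosen only for illustration.

```python
# Without brackets, Python reads  3 > 1 & 2 == 2  as  3 > (1 & 2) == 2.
# 1 & 2 is 0, so this becomes the chained comparison  3 > 0 == 2,
# which means (3 > 0) and (0 == 2).
without_brackets = 3 > 1 & 2 == 2

# With brackets, each comparison is evaluated first, then combined with &.
with_brackets = (3 > 1) & (2 == 2)

print(without_brackets)   # False
print(with_brackets)      # True
```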

A similar exercise:

#### 3. Information of people whose age is greater than 25 and less than 30 (Do It Yourself)

To filter the dataframe here, two conditions are needed: age > 25 and age < 30.

In [26]:
# Code for running the above condition
df[(df["Age"]>25) & (df["Age"]<30)]

|   | Name   | Age | City       | Gender | Profession |
|---|--------|-----|------------|--------|------------|
| 2 | Saarah | 27  | Dhaka      | female | Engineer   |
| 3 | Zahra  | 29  | Chittagong | female | Economist  |
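
A range check on a single column can also be expressed with *Series.between()*. The sketch below is an alternative to the two-condition filter, not a replacement for it. It assumes pandas 1.3 or newer, where the `inclusive` argument accepts the string "neither" to exclude both end points.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Anika", "Raj", "Saarah", "Zahra", "Abrar"],
    "Age": [20, 22, 27, 29, 25],
})

# Equivalent to (df["Age"] > 25) & (df["Age"] < 30).
# inclusive="neither" leaves out 25 and 30 themselves (assumes pandas >= 1.3).
in_range = df["Age"].between(25, 30, inclusive="neither")
result = df[in_range]

print(result["Name"].tolist())   # ['Saarah', 'Zahra']
```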


### OR Operation

The *OR* operation returns *True* if any of the inputs is *True*.

 <center>Truth Table for *OR* operation for three inputs:</center>


| a   | b   | c  | Output |
|-----|-----|----|--------|
| 0   | 0   | 0  | 0      |
| 0   | 0   | 1  | 1      |
| 0   | 1   | 0  | 1      |
| 0   | 1   | 1  | 1      |
| 1   | 0   | 0  | 1      |
| 1   | 0   | 1  | 1      |
| 1   | 1   | 0  | 1      |
| 1   | 1   | 1  | 1      |
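
The three-input table above can be reproduced with a few lines of plain Python. This sketch uses `itertools.product` to list every combination of 0 and 1. The variable name `rows` is chosen for illustration.

```python
from itertools import product

# Every combination of three inputs, in the same order as the table above.
rows = [(a, b, c, a | b | c) for a, b, c in product([0, 1], repeat=3)]

for a, b, c, output in rows:
    print(a, b, c, output)

# Only the first row (0, 0, 0) produces 0; every other row produces 1.
```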


#### 4. Information of people living in Dhaka or Chittagong

Now, the dataframe is

In [27]:
df

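|   | Name   | Age | City       | Gender | Profession |
|---|--------|-----|------------|--------|------------|
| 0 | Anika  | 20  | Dhaka      | female | Student    |
| 1 | Raj    | 22  | Rajshahi   | male   | Teacher    |
| 2 | Saarah | 27  | Dhaka      | female | Engineer   |
| 3 | Zahra  | 29  | Chittagong | female | Economist  |
| 4 | Abrar  | 25  | Rajshahi   | male   | Doctor     |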


In [28]:
# people living in Dhaka
df["City"] == "Dhaka"

0     True
1    False
2     True
3    False
4    False
Name: City, dtype: bool

In [29]:
# people living in Chittagong
df["City"] == "Chittagong"

0    False
1    False
2    False
3     True
4    False
Name: City, dtype: bool

<center>Now, the truth table for people whose (lives in = Dhaka) or (lives in = Chittagong)</center>

index | City(lives in Dhaka) | City(lives in Chittagong)| Output |
------|----------------------|--------------------------|--------|
0     | 1                    | 0                        | 1      |
1     | 0                    | 0                        | 0      |
2     | 1                    | 0                        | 1      |
3     | 0                    | 1                        | 1      |
4     | 0                    | 0                        | 0      |

So, the rows with index 0, 2 and 3 will be returned, where the truth value is *1* (*True*).

In [30]:
query1 = df["City"] == "Dhaka"
query2 = df["City"] == "Chittagong"
df[query1 | query2] 

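|   | Name   | Age | City       | Gender | Profession |
|---|--------|-----|------------|--------|------------|
| 0 | Anika  | 20  | Dhaka      | female | Student    |
| 2 | Saarah | 27  | Dhaka      | female | Engineer   |
| 3 | Zahra  | 29  | Chittagong | female | Economist  |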


The code can also be written as:

In [None]:
df[(df["City"] == "Dhaka") | (df["City"] == "Chittagong")]
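
When several *or* conditions test the same column against different values, *Series.isin()* is a shorter way to write the filter. The sketch below rebuilds only the "Name" and "City" columns of the sample data. The list `wanted_cities` is a name chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Anika", "Raj", "Saarah", "Zahra", "Abrar"],
    "City": ["Dhaka", "Rajshahi", "Dhaka", "Chittagong", "Rajshahi"],
})

wanted_cities = ["Dhaka", "Chittagong"]

# True for every row whose city appears in the list.
result = df[df["City"].isin(wanted_cities)]

print(result["Name"].tolist())   # ['Anika', 'Saarah', 'Zahra']
```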

## Information of people whose age is greater than 20, or who live in Rajshahi, or who are male

In [36]:
df[(df["Age"] > 20) | (df["City"] == "Rajshahi") | (df["Gender"] == "male") ]

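|   | Name   | Age | City       | Gender | Profession |
|---|--------|-----|------------|--------|------------|
| 1 | Raj    | 22  | Rajshahi   | male   | Teacher    |
| 2 | Saarah | 27  | Dhaka      | female | Engineer   |
| 3 | Zahra  | 29  | Chittagong | female | Economist  |
| 4 | Abrar  | 25  | Rajshahi   | male   | Doctor     |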


## Information of men who live in Rajshahi and are older than 20

In [37]:
df[(df["Age"] > 20) & (df["City"] == "Rajshahi") &(df["Gender"] == "male") ]

|   | Name  | Age | City     | Gender | Profession |
|---|-------|-----|----------|--------|------------|
| 1 | Raj   | 22  | Rajshahi | male   | Teacher    |
| 4 | Abrar | 25  | Rajshahi | male   | Doctor     |
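
As the number of conditions grows, it can be easier to keep them in a list and combine them in one step. The sketch below is one possible approach using `functools.reduce` from the standard library. The names `conditions`, `all_match` and `any_match` are chosen for illustration.

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame({
    "Name": ["Anika", "Raj", "Saarah", "Zahra", "Abrar"],
    "Age": [20, 22, 27, 29, 25],
    "City": ["Dhaka", "Rajshahi", "Dhaka", "Chittagong", "Rajshahi"],
    "Gender": ["female", "male", "female", "female", "male"],
})

conditions = [
    df["Age"] > 20,
    df["City"] == "Rajshahi",
    df["Gender"] == "male",
]

all_match = reduce(operator.and_, conditions)   # cond1 & cond2 & cond3
any_match = reduce(operator.or_, conditions)    # cond1 | cond2 | cond3

print(df[all_match]["Name"].tolist())   # ['Raj', 'Abrar']
print(df[any_match]["Name"].tolist())   # ['Raj', 'Saarah', 'Zahra', 'Abrar']
```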


# New Task

Now, following task 7, the three conditions from the three previous questions are combined. Print two dataframes containing:
1. Information of people whose age is greater than 20, or who live in Rajshahi, or who are male
2. Information of men who live in Rajshahi and are older than 20

# Useful Links

1. https://www.geeksforgeeks.org/ways-to-filter-pandas-dataframe-by-column-values/
2. https://www.geeksforgeeks.org/python-pandas-series-item/
3. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.filter.html
4. https://datagy.io/filter-pandas/