## Selection and Filtering in a DataFrame

1. Selecting specific columns

2. Filtering rows on a specific condition

3. Combining multiple conditions

4. OR operator

### 1. Selecting specific columns

Select one or more columns to analyze. The result is either:

a. a `Series` (a single column)

b. a `DataFrame` (multiple columns of data)

`Syntax (for a single column):` column = df["Column Name"]

`Syntax (for multiple columns):` subset = df[ ["Column 1", "Column 2", "Column 3", ...] ]

In [6]:
import pandas as pd

data = {
    "Name": ["Alex", "Finch", "Root", "Williamson", "Inzamam", "Ponting", "Kohli"],
    "Age": [23, 21, 34, 32, 29, 45, 32],
    "Salary": [32000, 50000, 58000, 21000, 88000, 43000, 34000],
    "Performance Score": [45, 68, 90, 79, 53, 75, 90]
}

df = pd.DataFrame(data)
print(f"Original Data:\n{df}")

names = df["Name"]
salary = df["Salary"]

print("\nSingle Column (Series):")
print(f"\nNames column: \n{names}")
print(f"\nSalary column: \n{salary}")

Original Data:
         Name  Age  Salary  Performance Score
0        Alex   23   32000                 45
1       Finch   21   50000                 68
2        Root   34   58000                 90
3  Williamson   32   21000                 79
4     Inzamam   29   88000                 53
5     Ponting   45   43000                 75
6       Kohli   32   34000                 90

Single Column (Series):

Names column: 
0          Alex
1         Finch
2          Root
3    Williamson
4       Inzamam
5       Ponting
6         Kohli
Name: Name, dtype: object

Salary column: 
0    32000
1    50000
2    58000
3    21000
4    88000
5    43000
6    34000
Name: Salary, dtype: int64


In [11]:
print("Multiple columns (data frames):\n")

mul_col = df[ ["Name", "Salary", "Performance Score"] ]
print(mul_col)

Multiple columns (data frames):

         Name  Salary  Performance Score
0        Alex   32000                 45
1       Finch   50000                 68
2        Root   58000                 90
3  Williamson   21000                 79
4     Inzamam   88000                 53
5     Ponting   43000                 75
6       Kohli   34000                 90
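
The difference between the two syntaxes comes down to the brackets: a single label returns a `Series`, while a list of labels returns a `DataFrame`, even when the list holds only one label. The sketch below illustrates this on a shortened copy of the data; the variable names `single` and `one_col_frame` are placeholders chosen for this example.

```python
import pandas as pd

# Shortened copy of the example data, rebuilt so the snippet runs on its own
df = pd.DataFrame({
    "Name": ["Alex", "Finch", "Root"],
    "Salary": [32000, 50000, 58000],
})

single = df["Name"]           # single label -> Series
one_col_frame = df[["Name"]]  # list with one label -> DataFrame

print(type(single).__name__)         # Series
print(type(one_col_frame).__name__)  # DataFrame
print(one_col_frame.shape)           # (3, 1)
```

Keeping the result as a `DataFrame` can be convenient when later code expects two-dimensional data.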


### 2. Filtering rows on a specific condition

Filter rows by a single condition, `e.g.` keep only the employees whose salary is `less than` 50,000.

`Syntax:` filtered_rows = df[ df["Column Name"] < value ]

In [14]:
higher_sal = df[df["Salary"] > 35000]
print(f"Employees with salary higher than 35K: \n\n{higher_sal}")

Employees with salary higher than 35K: 

      Name  Age  Salary  Performance Score
1    Finch   21   50000                 68
2     Root   34   58000                 90
4  Inzamam   29   88000                 53
5  Ponting   45   43000                 75
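
The condition inside the brackets is itself a boolean `Series` with one `True`/`False` value per row, and only the rows marked `True` are kept. The sketch below makes that intermediate step visible for the "salary less than 50,000" case; the names `mask` and `low_sal` are placeholders chosen for this example.

```python
import pandas as pd

# Same names and salaries as the example data, rebuilt so the snippet runs on its own
df = pd.DataFrame({
    "Name": ["Alex", "Finch", "Root", "Williamson", "Inzamam", "Ponting", "Kohli"],
    "Salary": [32000, 50000, 58000, 21000, 88000, 43000, 34000],
})

# Step 1: the comparison produces a boolean Series (the "mask")
mask = df["Salary"] < 50000
print(mask.tolist())  # [True, False, False, True, False, True, True]

# Step 2: indexing with the mask keeps only the rows where it is True
low_sal = df[mask]
print(low_sal["Name"].tolist())  # ['Alex', 'Williamson', 'Ponting', 'Kohli']
```

Note that Finch, with a salary of exactly 50,000, is excluded because `<` is a strict comparison.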


### 3. Combining multiple conditions

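Filter rows on several conditions that must all be true, `e.g.` the employees whose salary is `more than` 50,000 but `less than` 80,000.

Conditions are combined using the **`&`** (AND) operator. Wrap each condition in parentheses, because `&` binds more tightly than comparison operators such as `<` and `>`.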

`Syntax:` filtered_rows = df[ ( df["Column Name"] > value ) & ( df["Column Name"] < value ) ]

In [15]:
# filtering employees with age < 30 and salary > 35K

multiple_filter = df[ (df["Age"] < 30) & (df["Salary"] > 35000) ]

print(f"Filtered rows based on multiple filter: \n\n{multiple_filter}")

Filtered rows based on multiple filter: 

      Name  Age  Salary  Performance Score
1    Finch   21   50000                 68
4  Inzamam   29   88000                 53
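
Both conditions can also refer to the same column, which gives a range check. A minimal sketch of the "more than 50,000 but less than 80,000" case follows; the name `mid_sal` is a placeholder chosen for this example.

```python
import pandas as pd

# Same names and salaries as the example data, rebuilt so the snippet runs on its own
df = pd.DataFrame({
    "Name": ["Alex", "Finch", "Root", "Williamson", "Inzamam", "Ponting", "Kohli"],
    "Salary": [32000, 50000, 58000, 21000, 88000, 43000, 34000],
})

# Each condition sits in its own parentheses; & keeps rows where both are True
mid_sal = df[(df["Salary"] > 50000) & (df["Salary"] < 80000)]

print(mid_sal["Name"].tolist())  # ['Root']
```

Only Root qualifies: Finch's 50,000 fails the strict `>` test and Inzamam's 88,000 fails the `<` test.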


### 4. OR Operator

Use OR to keep the rows where at least one of the given conditions is true.

Conditions are given using the **`|`** operator.

`Syntax:` filtered_rows = df[ ( df["Column Name"] > value ) | ( df["Column Name"] < value ) ]

In [16]:
# filtering employees with age > 30 or performance score > 90

filtered_row = df[ (df["Age"] > 30) | (df["Performance Score"] > 90) ]

print(f"Filtered rows on the basis of OR operator: \n{filtered_row}")

Filtered rows on the basis of OR operator: 
         Name  Age  Salary  Performance Score
2        Root   34   58000                 90
3  Williamson   32   21000                 79
5     Ponting   45   43000                 75
6       Kohli   32   34000                 90
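
To see which condition is responsible for each row, the two masks can be built separately before combining them. The sketch below does this for the same filter; `age_mask`, `score_mask`, and `either` are placeholder names chosen for this example.

```python
import pandas as pd

# Same names, ages and scores as the example data, rebuilt so the snippet runs on its own
df = pd.DataFrame({
    "Name": ["Alex", "Finch", "Root", "Williamson", "Inzamam", "Ponting", "Kohli"],
    "Age": [23, 21, 34, 32, 29, 45, 32],
    "Performance Score": [45, 68, 90, 79, 53, 75, 90],
})

age_mask = df["Age"] > 30                   # True for Root, Williamson, Ponting, Kohli
score_mask = df["Performance Score"] > 90   # strict comparison: a score of exactly 90 does not match

# | keeps a row when at least one of the two masks is True
either = age_mask | score_mask

print(df[either]["Name"].tolist())  # ['Root', 'Williamson', 'Ponting', 'Kohli']
print(score_mask.any())             # False
```

In this data set no score is above 90, so every row in the result is there because of the age condition alone.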
