<a href="https://colab.research.google.com/github/TAbdullah-T/T5-SAD/blob/Data-Analysis/selection_indexing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Introduction
This notebook demonstrates various selection and indexing techniques in pandas to handle data efficiently. We use a traffic dataset for this example.

In [None]:
from google.colab import drive

drive.mount('/gdrive')
!ln -s "/gdrive/My Drive/T5_SAD/Week_1/Day_2" "/content/Day_2"

Mounted at /gdrive


### Importing Libraries
Here, we import pandas, a powerful library for data manipulation and analysis.

In [None]:
import pandas as pd

### Loading Data
We load the data from a CSV file into a pandas DataFrame. A DataFrame is a 2-dimensional labeled data structure.

In [None]:
df = pd.read_csv('/content/Day_2/Datasets/Traffic.csv')

### Viewing Data
Using the `head()` method, we can view the first rows of the DataFrame (five by default) to understand its structure and the data it contains.

In [None]:
df.head()

Unnamed: 0,Time,Date,Day of the week,CarCount,BikeCount,BusCount,TruckCount,Total,Traffic Situation
0,12:00:00 AM,10,Tuesday,13,2,2,24,41,normal
1,12:15:00 AM,10,Tuesday,14,1,1,36,52,normal
2,12:30:00 AM,10,Tuesday,10,2,2,32,46,normal
3,12:45:00 AM,10,Tuesday,10,2,2,36,50,normal
4,1:00:00 AM,10,Tuesday,11,2,1,34,48,normal


### Accessing Column Data
To access a single column, index the DataFrame with the column's label. The result is a pandas Series.

In [None]:
df['Day of the week']

0        Tuesday
1        Tuesday
2        Tuesday
3        Tuesday
4        Tuesday
          ...   
5947    Thursday
5948    Thursday
5949    Thursday
5950    Thursday
5951    Thursday
Name: Day of the week, Length: 5952, dtype: object

### Selecting Multiple Columns
You can select multiple columns from a DataFrame by passing a list of column names.

In [None]:
df[['Time', 'Date', 'Day of the week']]

Unnamed: 0,Time,Date,Day of the week
0,12:00:00 AM,10,Tuesday
1,12:15:00 AM,10,Tuesday
2,12:30:00 AM,10,Tuesday
3,12:45:00 AM,10,Tuesday
4,1:00:00 AM,10,Tuesday
...,...,...,...
5947,10:45:00 PM,9,Thursday
5948,11:00:00 PM,9,Thursday
5949,11:15:00 PM,9,Thursday
5950,11:30:00 PM,9,Thursday
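
The difference between the two bracket forms can be sketched on a small, hypothetical stand-in for the traffic data (the `sample` frame below is an assumption for illustration, not the real file). A single label returns a Series, while a list of labels returns a DataFrame, even when the list holds only one name.

```python
import pandas as pd

# Hypothetical three-row stand-in for the traffic dataset
sample = pd.DataFrame({
    "Time": ["12:00:00 AM", "12:15:00 AM", "12:30:00 AM"],
    "Date": [10, 10, 10],
    "Day of the week": ["Tuesday", "Tuesday", "Tuesday"],
})

one_col = sample["Date"]              # single label -> Series
one_col_frame = sample[["Date"]]      # one-element list -> DataFrame
two_cols = sample[["Time", "Date"]]   # list of labels -> DataFrame, in the order given

print(type(one_col).__name__, type(one_col_frame).__name__)
print(list(two_cols.columns))
```

The column order of the result follows the order of the list, so this form is also a simple way to reorder columns.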


### Using iloc for Index-Based Selection
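`iloc` is the position-based indexer: it selects rows and columns by integer position, starting at 0, regardless of the index labels. A single integer returns one row as a Series; a slice such as `5:10` returns a DataFrame and excludes the stop position.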

In [None]:
df.iloc[5]

Day of the week         Tuesday
Time                 1:15:00 AM
Date                         10
CarCount                     15
BikeCount                     1
BusCount                      1
TruckCount                   39
Total                        56
Traffic Situation        normal
Name: 5, dtype: object

In [None]:
df.iloc[5:10]

Unnamed: 0,Day of the week,Time,Date,CarCount,BikeCount,BusCount,TruckCount,Total,Traffic Situation
5,Tuesday,1:15:00 AM,10,15,1,1,39,56,normal
6,Tuesday,1:30:00 AM,10,14,2,2,27,45,normal
7,Tuesday,1:45:00 AM,10,13,2,1,20,36,normal
8,Tuesday,2:00:00 AM,10,7,0,0,26,33,normal
9,Tuesday,2:15:00 AM,10,13,0,0,34,47,normal
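
To make the "position, not label" point concrete, here is a minimal sketch on a hypothetical frame whose index deliberately does not start at 0. The counts are copied from the first five rows shown earlier; the index values 100 to 104 are an assumption chosen for illustration.

```python
import pandas as pd

sample = pd.DataFrame(
    {
        "CarCount": [13, 14, 10, 10, 11],
        "BikeCount": [2, 1, 2, 2, 2],
        "Total": [41, 52, 46, 50, 48],
    },
    index=[100, 101, 102, 103, 104],  # hypothetical labels, deliberately not 0..4
)

row = sample.iloc[1]            # second row by position, returned as a Series
block = sample.iloc[1:3, 0:2]   # rows at positions 1-2, columns at positions 0-1
last = sample.iloc[-1]          # negative positions count from the end
cell = sample.iloc[0, 2]        # single value: first row, third column

print(block)
```

Because `iloc` ignores labels, `sample.iloc[1]` returns the row labeled 101 here, not a row labeled 1.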


### Using loc for Label-Based Selection
The `loc` indexer selects by label. Unlike `iloc`, a label slice such as `1:2` includes both endpoints. The second example sets a two-level index (a MultiIndex) and selects rows by a tuple of labels.

In [None]:
df.loc[1:2, ['Date', 'Time']]


Unnamed: 0,Date,Time
1,10,12:15:00 AM
2,10,12:30:00 AM
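
The slicing difference between `loc` and `iloc` is easy to miss when the index is the default 0, 1, 2, ... range. A small sketch, using a hypothetical four-row `sample` frame:

```python
import pandas as pd

# Hypothetical stand-in with the default integer index
sample = pd.DataFrame({
    "Time": ["12:00:00 AM", "12:15:00 AM", "12:30:00 AM", "12:45:00 AM"],
    "Date": [10, 10, 10, 10],
})

by_label = sample.loc[1:2, ["Date", "Time"]]  # labels 1 and 2: both endpoints included
by_position = sample.iloc[1:2]                # position 1 only: the stop is excluded

print(len(by_label), len(by_position))
```

The same slice `1:2` therefore yields two rows with `loc` and one row with `iloc`.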


In [None]:
df = df.set_index(['Day of the week', 'Time'])
df.loc[('Monday', '12:00:00 AM')] # pay attention to



Day of the week,Time,Date,CarCount,BikeCount,BusCount,TruckCount,Total,Traffic Situation
Monday,12:00:00 AM,16,11,2,2,39,54,normal
Monday,12:00:00 AM,23,15,1,0,37,53,normal
Monday,12:00:00 AM,30,14,2,0,33,49,normal
Monday,12:00:00 AM,6,9,2,0,27,38,normal
Monday,12:00:00 AM,16,18,4,1,28,51,normal
Monday,12:00:00 AM,23,13,2,1,17,33,low
Monday,12:00:00 AM,30,13,3,1,12,29,low
Monday,12:00:00 AM,6,11,3,0,26,40,normal
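
As a rough sketch of tuple-based selection on a two-level index, the example below uses a hypothetical four-row frame (the `sample` data is invented for illustration). Sorting the index after `set_index` is an optional step added here; pandas generally performs MultiIndex lookups more efficiently on a sorted index.

```python
import pandas as pd

# Hypothetical stand-in for the traffic data
sample = pd.DataFrame({
    "Day of the week": ["Monday", "Monday", "Tuesday", "Monday"],
    "Time": ["12:00:00 AM", "12:15:00 AM", "12:00:00 AM", "12:00:00 AM"],
    "Total": [54, 40, 41, 53],
})

indexed = sample.set_index(["Day of the week", "Time"]).sort_index()

# Full key: a tuple with one label per index level; the trailing ":" selects
# all columns and makes it explicit that the tuple is a row key
monday_midnight = indexed.loc[("Monday", "12:00:00 AM"), :]

# Partial key: only the first level, which returns every Monday row
monday = indexed.loc["Monday"]

print(monday_midnight)
```

Because the index is not unique, the full key returns every matching row rather than a single one, which is why the output above lists several Monday midnight readings.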


### Resetting Index
`reset_index()` moves the index levels back into ordinary columns and restores the default integer index. This simplifies data handling after operations such as `set_index` that create a hierarchical index.

In [None]:
df = df.reset_index()

In [None]:
df.head()

Unnamed: 0,level_0,index,Day of the week,Time,Date,CarCount,BikeCount,BusCount,TruckCount,Total,Traffic Situation
0,0,0,Tuesday,12:00:00 AM,10,13,2,2,24,41,normal
1,1,1,Tuesday,12:15:00 AM,10,14,1,1,36,52,normal
2,2,2,Tuesday,12:30:00 AM,10,10,2,2,32,46,normal
3,3,3,Tuesday,12:45:00 AM,10,10,2,2,36,50,normal
4,4,4,Tuesday,1:00:00 AM,10,11,2,1,34,48,normal
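
The sketch below, on a hypothetical two-row frame, shows what `reset_index` does to column names. One plausible reading of the `level_0` and `index` columns in the output above is that the reset cell was run more than once; that is an assumption, but the same column names can be reproduced as follows.

```python
import pandas as pd

# Hypothetical stand-in for the traffic data
sample = pd.DataFrame({
    "Day of the week": ["Tuesday", "Tuesday"],
    "Time": ["12:00:00 AM", "12:15:00 AM"],
    "Total": [41, 52],
})

# Index levels become ordinary columns again
indexed = sample.set_index(["Day of the week", "Time"])
restored = indexed.reset_index()
print(list(restored.columns))

# Resetting a default index inserts it as a column named "index" ...
once = sample.reset_index()
# ... and resetting again adds "level_0", because "index" is already taken
twice = once.reset_index()
print(list(twice.columns))

# drop=True discards the old index instead of keeping it as a column
clean = sample.reset_index(drop=True)
```

Passing `drop=True` is a simple way to avoid accumulating helper columns when a cell may be re-run.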


### Conditional Selection Using loc
We can select the rows that satisfy a condition by indexing with a boolean mask. The `df[mask]` form used below returns the same rows as `df.loc[mask]`.

In [None]:
df[df['Total'] > 250]

Unnamed: 0,Day of the week,Time,Date,CarCount,BikeCount,BusCount,TruckCount,Total,Traffic Situation
3303,Friday,9:45:00 AM,13,171,59,27,3,260,heavy
3307,Friday,10:45:00 AM,13,178,68,28,4,278,heavy
3308,Friday,11:00:00 AM,13,169,70,20,3,262,heavy
3311,Friday,11:45:00 AM,13,179,46,27,3,255,heavy
3315,Friday,12:45:00 PM,13,174,67,30,1,272,heavy
3318,Friday,1:30:00 PM,13,175,63,19,3,260,heavy
3319,Friday,1:45:00 PM,13,172,62,20,2,256,heavy
3321,Friday,2:15:00 PM,13,172,65,18,2,257,heavy
3974,Friday,9:30:00 AM,20,170,62,18,2,252,heavy
3976,Friday,10:00:00 AM,20,166,63,22,1,252,heavy
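
A minimal sketch of mask-based selection, using a hypothetical four-row frame. It shows that the plain-bracket and `loc` forms agree, and how conditions can be combined with `&`; each condition needs its own parentheses because `&` binds more tightly than the comparison operators.

```python
import pandas as pd

# Hypothetical stand-in for the traffic data
sample = pd.DataFrame({
    "Day of the week": ["Friday", "Friday", "Tuesday", "Friday"],
    "Total": [260, 278, 41, 120],
    "Traffic Situation": ["heavy", "heavy", "normal", "normal"],
})

mask = sample["Total"] > 250   # boolean Series, one value per row
rows = sample.loc[mask]        # explicit loc form
same = sample[mask]            # shorthand that returns the same rows

# Combine conditions with & (and) or | (or); loc can also pick columns at the same time
friday_busy = sample.loc[
    (sample["Total"] > 100) & (sample["Day of the week"] == "Friday"),
    ["Total", "Traffic Situation"],
]

print(friday_busy)
```

The `loc` form is useful when rows and columns need to be narrowed in one step, as in `friday_busy`.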


### Conditional Selection Using query
The `query` method filters rows with a condition written as a string expression. It returns the same rows as the boolean-mask selection in the previous section and is often more readable for compound conditions.

In [None]:
df.query('Total > 250')

Unnamed: 0,Day of the week,Time,Date,CarCount,BikeCount,BusCount,TruckCount,Total,Traffic Situation
3303,Friday,9:45:00 AM,13,171,59,27,3,260,heavy
3307,Friday,10:45:00 AM,13,178,68,28,4,278,heavy
3308,Friday,11:00:00 AM,13,169,70,20,3,262,heavy
3311,Friday,11:45:00 AM,13,179,46,27,3,255,heavy
3315,Friday,12:45:00 PM,13,174,67,30,1,272,heavy
3318,Friday,1:30:00 PM,13,175,63,19,3,260,heavy
3319,Friday,1:45:00 PM,13,172,62,20,2,256,heavy
3321,Friday,2:15:00 PM,13,172,65,18,2,257,heavy
3974,Friday,9:30:00 AM,20,170,62,18,2,252,heavy
3976,Friday,10:00:00 AM,20,166,63,22,1,252,heavy
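
Two `query` features are worth a short sketch, again on a hypothetical frame: `@name` refers to a Python variable in the surrounding scope, and backticks quote column names that contain spaces, such as `Traffic Situation`. The `threshold` variable is an assumption introduced for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the traffic data
sample = pd.DataFrame({
    "Day of the week": ["Friday", "Friday", "Tuesday", "Friday"],
    "Total": [260, 278, 41, 120],
    "Traffic Situation": ["heavy", "heavy", "normal", "normal"],
})

threshold = 250  # hypothetical cutoff, referenced inside the query with @

above = sample.query("Total > @threshold")

# Backticks quote a column name that is not a valid Python identifier
very_heavy = sample.query("`Traffic Situation` == 'heavy' and Total > 270")

print(very_heavy)
```

For a single simple condition the two styles are interchangeable; `query` tends to pay off once several conditions are chained.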
