# **Data Selection & Indexing**

In [1]:
import pandas as pd

## **1. Basic Selection Techniques**

This section covers the most common ways to access data in a DataFrame or Series. The advanced techniques in later sections build directly on them.

### 🔸 1. **Selecting Columns using `[]` (Bracket Notation)**

Bracket notation is the simplest way to select columns: a single label returns a Series, and a list of labels returns a DataFrame.

#### ✅ **Syntax**

```python
df['column_name']        # Returns a Series
df[['col1', 'col2']]     # Returns a DataFrame
```

In [3]:
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'London', 'Paris']
}

df = pd.DataFrame(data)
df

|   | Name    | Age | City     |
| - | ------- | --- | -------- |
| 0 | Alice   | 25  | New York |
| 1 | Bob     | 30  | London   |
| 2 | Charlie | 35  | Paris    |


In [4]:
df['Name']

0      Alice
1        Bob
2    Charlie
Name: Name, dtype: object

In [5]:
df[['Name', 'City']]

|   | Name    | City     |
| - | ------- | -------- |
| 0 | Alice   | New York |
| 1 | Bob     | London   |
| 2 | Charlie | Paris    |
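
To make the difference between single and double brackets concrete, here is a minimal sketch that rebuilds the same toy `df` and checks what each form returns. The variable names (`single`, `one_col_frame`, `two_cols`) are only illustrative.

```python
import pandas as pd

# Same toy data as in the example above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'London', 'Paris'],
})

single = df['Name']               # one label -> Series
one_col_frame = df[['Name']]      # list with one label -> one-column DataFrame
two_cols = df[['Name', 'City']]   # list with two labels -> DataFrame

print(type(single).__name__)         # Series
print(type(one_col_frame).__name__)  # DataFrame
print(two_cols.shape)                # (3, 2)
```

Note that a list with a single label still returns a DataFrame, which is handy when later code expects two-dimensional data.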


#### ✅ **Real-world Example**

Imagine you're working with customer data from an e-commerce platform:

```python
# Selecting only important customer info
df[['CustomerID', 'Name', 'Email']]
```

#### ⚠️ **Common Errors**

```python
df['Name', 'Age']  # ❌ Raises KeyError: pandas looks for one column labelled ('Name', 'Age')
```

Use double brackets, `df[['Name', 'Age']]`, to select multiple columns.
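
The sketch below reproduces this mistake on a small illustrative DataFrame and catches the resulting `KeyError`, then shows the list-based form that works. The `raised` flag and the `subset` name are just for demonstration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
})

try:
    df['Name', 'Age']   # treated as ONE key: the tuple ('Name', 'Age')
    raised = False
except KeyError:
    raised = True

print(raised)  # True

subset = df[['Name', 'Age']]   # a list of labels selects both columns
print(list(subset.columns))    # ['Name', 'Age']
```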


### 🔸 2. **Selecting Rows using `.loc[]` and `.iloc[]`**

We'll cover these in depth in the next section, but here's a quick intro:

* `.loc[]` is **label-based**: it selects by index label (row name). Think **location**.
* `.iloc[]` is **integer position-based**: it selects by position, counting from 0. Think **integer location**.

In [6]:
df.loc[0] # Row whose index label is 0 (the first row here)

Name       Alice
Age           25
City    New York
Name: 0, dtype: object

In [7]:
df.iloc[0] # First row by position

Name       Alice
Age           25
City    New York
Name: 0, dtype: object
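
With the default index (0, 1, 2, ...) labels and positions coincide, so `.loc[0]` and `.iloc[0]` return the same row. The following sketch uses made-up indexes (string labels and a shuffled integer index, both assumptions for illustration) to show where the two diverge.

```python
import pandas as pd

# Hypothetical string labels instead of the default 0, 1, 2
df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
    index=['a', 'b', 'c'],
)

by_label = df.loc['b']      # row whose index label is 'b'
by_position = df.iloc[1]    # second row, whatever its label is

print(by_label['Name'], by_position['Name'])  # Bob Bob

# With a shuffled integer index, label 0 and position 0 are different rows
shuffled = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie']}, index=[2, 0, 1])

print(shuffled.loc[0, 'Name'])   # Bob   (label 0)
print(shuffled.iloc[0, 0])       # Alice (position 0)
```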

### 🔸 3. **Slicing Rows and Columns**

#### ✅ **Syntax**

```python
df[start:stop]              # Slice rows using standard slicing (position-based)
df.iloc[start:stop, :]      # Slice rows explicitly
df.loc[:, 'col1':'col3']    # Slice columns by label
```

In [8]:
df

|   | Name    | Age | City     |
| - | ------- | --- | -------- |
| 0 | Alice   | 25  | New York |
| 1 | Bob     | 30  | London   |
| 2 | Charlie | 35  | Paris    |


In [9]:
df[0:2] # First 2 rows

|   | Name  | Age | City     |
| - | ----- | --- | -------- |
| 0 | Alice | 25  | New York |
| 1 | Bob   | 30  | London   |


In [10]:
df.iloc[1:3, 0:2] # Row 1 to 2, columns 0 to 1

|   | Name    | Age |
| - | ------- | --- |
| 1 | Bob     | 30  |
| 2 | Charlie | 35  |


In [12]:
df.loc[:, 'Name':'City'] # All rows, columns Name through City (both ends included)

|   | Name    | Age | City     |
| - | ------- | --- | -------- |
| 0 | Alice   | 25  | New York |
| 1 | Bob     | 30  | London   |
| 2 | Charlie | 35  | Paris    |
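
One detail worth checking for yourself: position-based slices (`.iloc[]` and plain `df[start:stop]`) exclude the stop, while label-based slices with `.loc[]` include it. A small sketch on the same toy data, with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'London', 'Paris'],
})

pos = df.iloc[0:2]               # positions 0 and 1; stop is excluded
lab = df.loc[0:2]                # labels 0, 1 and 2; stop is included
cols = df.loc[:, 'Name':'Age']   # column label slice; 'Age' is included

print(len(pos))             # 2
print(len(lab))             # 3
print(list(cols.columns))   # ['Name', 'Age']
```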


#### ✅ **Real-world Example**

Selecting a range of customer records:

```python
# First 100 customer records
df[0:100]
```

### 🔸 4. **Boolean Indexing**

Boolean indexing keeps only the rows for which a condition evaluates to `True`.

#### ✅ **Syntax**

```python
df[df['Age'] > 30]
```

In [15]:
df[df['City'] == 'London']

|   | Name | Age | City   |
| - | ---- | --- | ------ |
| 1 | Bob  | 30  | London |


#### ✅ **Real-world Example**

Filter customers from a specific country:

```python
# Customers from India
df[df['Country'] == 'India']
```

You can also combine conditions with `&` (and) or `|` (or); wrap each condition in parentheses:

In [17]:
df[(df['Age'] > 25) & (df['City'] == 'London')]

|   | Name | Age | City   |
| - | ---- | --- | ------ |
| 1 | Bob  | 30  | London |
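
Under the hood, a condition such as `df['Age'] > 25` is just a boolean Series with one value per row, and `df[...]` keeps the rows where it is `True`. The sketch below prints such a mask and tries a few other ways to build one (`|`, `~`, and `.isin()`); the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'London', 'Paris'],
})

mask = df['Age'] > 25    # boolean Series, one value per row
print(mask.tolist())     # [False, True, True]

either = df[(df['Age'] > 30) | (df['City'] == 'London')]   # OR
not_london = df[~(df['City'] == 'London')]                 # NOT
in_cities = df[df['City'].isin(['Paris', 'London'])]       # membership test

print(either['Name'].tolist())       # ['Bob', 'Charlie']
print(not_london['Name'].tolist())   # ['Alice', 'Charlie']
print(in_cities['Name'].tolist())    # ['Bob', 'Charlie']
```

The parentheses matter because `&` and `|` bind more tightly than comparison operators such as `>` and `==` in Python.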


### 🔸 5. **`.at[]` and `.iat[]` for Fast Access**

These are **fast scalar accessors**: they read or write exactly one cell.

* `.at[]` – label-based
* `.iat[]` – integer position-based

#### ✅ **Syntax**

```python
df.at[row_label, 'column_name']
df.iat[row_index, column_index]
```


In [20]:
df.at[0, 'Name']

'Alice'

In [21]:
df.iat[0, 1]

25
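
As a quick check, the sketch below confirms that `.at[]` returns the same value as `.loc[]` for a single cell, and then writes through `.at[]` and `.iat[]` on a copy so the original `df` stays unchanged. Working on a copy is just a choice for this demonstration, and the new values are made up.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'London', 'Paris'],
})

# Reading: .at[] and .loc[] give the same scalar for one cell
same_value = df.at[1, 'Name'] == df.loc[1, 'Name']
print(same_value)   # True

# Writing: update single cells on a copy
updated = df.copy()
updated.at[2, 'City'] = 'Berlin'   # by label: row 2, column 'City'
updated.iat[0, 1] = 26             # by position: row 0, column 1 ('Age')

print(updated.at[2, 'City'])   # Berlin
print(updated.iat[0, 1])       # 26
print(df.at[2, 'City'])        # Paris (original untouched)
```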

#### ✅ **Real-world Example**

Quickly update a cell:

```python
df.at[2, 'City'] = 'Berlin'  # Set City for the row labelled 2
```

## ✅ Summary Table: Basic Selection Techniques

| Technique              | Syntax                 | Returns   | Notes              |
| ---------------------- | ---------------------- | --------- | ------------------ |
| Column selection       | `df['col']`            | Series    | One column         |
| Multi-column selection | `df[['col1', 'col2']]` | DataFrame | Multiple columns   |
| Row selection (label)  | `df.loc[row_label]`    | Series    | Next section       |
| Row selection (pos)    | `df.iloc[row_index]`   | Series    | Next section       |
| Row slicing            | `df[start:stop]`       | DataFrame | Position-based     |
| Boolean indexing       | `df[cond]`             | DataFrame | Powerful filtering |
| Scalar access (label)  | `df.at[row, col]`      | Scalar    | Fast               |
| Scalar access (pos)    | `df.iat[row, col]`     | Scalar    | Fast               |

