# **Data Selection & Indexing**

In [2]:
import pandas as pd

## 3. **Label-based Selection in pandas**

Label-based selection lets you **retrieve rows and columns using labels (names)** instead of integer positions. It is especially useful when working with **named indexes, time series, and real-world datasets**, where row positions can shift but labels stay meaningful.

We’ll focus on two main tools:

1. `df.loc[]` – Label-based access (rows and columns)
2. `df.at[]` – Fast scalar access by label (single cell)


## 🔹 10. `df.loc[]` – Label-Based Selection

### ✅ **Purpose**

Select rows, columns, or subsets using **index/column labels** (not positions).


### ✅ **Syntax**

```python
df.loc[row_label, column_label]
df.loc[row_label]                # Select row
df.loc[:, column_label]         # Select column
df.loc[start_label:end_label]   # Label-based slicing
```
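
The syntax forms above can be tried on a small frame. The following is a minimal sketch, assuming a hypothetical customer table that mirrors the sample data used in this section; it shows the column-oriented forms (`df.loc[:, ...]`) alongside a single-row lookup.

```python
import pandas as pd

# Hypothetical sample frame, mirroring the customer data used in this section
df = pd.DataFrame(
    {
        'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['NY', 'London', 'Paris', 'Berlin'],
    },
    index=['C001', 'C002', 'C003', 'C004'],
)

# All rows, one column label: returns a Series
ages = df.loc[:, 'Age']

# All rows, a label-based column slice: returns a DataFrame
# (both 'Name' and 'Age' are included)
name_age = df.loc[:, 'Name':'Age']

# A single row label: returns a Series indexed by the column names
row = df.loc['C003']
```

Passing a single label reduces the result by one dimension (DataFrame to Series), while a slice or list keeps it as a DataFrame.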


### ✅ **Create Sample Data**

In [3]:
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['NY', 'London', 'Paris', 'Berlin']
}
df = pd.DataFrame(data, index=['C001', 'C002', 'C003', 'C004'])

df

|      | Name    | Age | City   |
| ---- | ------- | --- | ------ |
| C001 | Alice   | 25  | NY     |
| C002 | Bob     | 30  | London |
| C003 | Charlie | 35  | Paris  |
| C004 | David   | 40  | Berlin |


### ✅ **Use Cases & Examples**

#### 🔸 Select a Single Row by Index Label

In [4]:
df.loc['C002']

Name       Bob
Age         30
City    London
Name: C002, dtype: object

In [5]:
# Selecting multiple rows by passing a list of labels
df.loc[['C002', 'C004']]

|      | Name  | Age | City   |
| ---- | ----- | --- | ------ |
| C002 | Bob   | 30  | London |
| C004 | David | 40  | Berlin |


In [6]:
# Selecting a single value by row label and column label
df.loc['C002', 'Age']

30

In [7]:
# Selecting multiple rows and a single column
df.loc[['C002', 'C004'], 'Age']

C002    30
C004    40
Name: Age, dtype: int64

In [8]:
# Selecting a row slice and a single column
df.loc['C001': 'C003', 'Age']

C001    25
C002    30
C003    35
Name: Age, dtype: int64

In [9]:
# Selecting a row slice and multiple columns
df.loc['C001': 'C003', ['Age', 'City']]

|      | Age | City   |
| ---- | --- | ------ |
| C001 | 25  | NY     |
| C002 | 30  | London |
| C003 | 35  | Paris  |


> 🔸 Unlike `iloc`, a `loc` slice **includes the stop label** (`'C003'` is included above)
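
The difference in slice endpoints is easy to check side by side. This is a small sketch, assuming a trimmed-down version of the sample frame with only the `Age` column:

```python
import pandas as pd

# Trimmed-down version of the sample data (assumed here for illustration)
df = pd.DataFrame(
    {'Age': [25, 30, 35, 40]},
    index=['C001', 'C002', 'C003', 'C004'],
)

# Label slice: both endpoints are included, so this returns 3 rows
by_label = df.loc['C001':'C003', 'Age']

# Position slice: the stop position is excluded, so this returns 2 rows
by_position = df.iloc[0:2, 0]

print(len(by_label), len(by_position))
```

The inclusive stop exists because, with labels, there is no general way to say "the label just after `'C003'`" the way `stop + 1` works for integer positions.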

### ✅ **Boolean Filtering with `loc`**

In [10]:
df['Age'] > 30

C001    False
C002    False
C003     True
C004     True
Name: Age, dtype: bool

In [11]:
df.loc[df['Age'] > 30]

|      | Name    | Age | City   |
| ---- | ------- | --- | ------ |
| C003 | Charlie | 35  | Paris  |
| C004 | David   | 40  | Berlin |


In [12]:
df.loc[df['City'] == 'London', ['Name', 'Age']]

|      | Name | Age |
| ---- | ---- | --- |
| C002 | Bob  | 30  |
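
Boolean masks can also be combined before being passed to `loc`. The sketch below assumes the same sample data as this section; note that pandas uses `&` and `|` (not `and`/`or`) and that each condition needs its own parentheses.

```python
import pandas as pd

# Same sample data as in this section, rebuilt so the block is self-contained
df = pd.DataFrame(
    {
        'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['NY', 'London', 'Paris', 'Berlin'],
    },
    index=['C001', 'C002', 'C003', 'C004'],
)

# Combine two conditions element-wise with & (logical AND)
mask = (df['Age'] > 25) & (df['City'] != 'Paris')
result = df.loc[mask, ['Name', 'City']]

# isin() builds a mask that matches any of several values
in_cities = df.loc[df['City'].isin(['NY', 'Berlin']), 'Name']
```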


### ✅ **Assign Values with `loc`**

In [13]:
df.loc['C002', 'City'] = 'Manchester'
df

|      | Name    | Age | City       |
| ---- | ------- | --- | ---------- |
| C001 | Alice   | 25  | NY         |
| C002 | Bob     | 30  | Manchester |
| C003 | Charlie | 35  | Paris      |
| C004 | David   | 40  | Berlin     |
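
Assignment with `loc` is not limited to a single cell. As a rough sketch on a fresh copy of the sample data (the replacement values `'Lyon'` and the `'C005'` row are made up for illustration), a boolean mask can update several rows at once, and assigning to a label that does not exist yet appends a new row:

```python
import pandas as pd

# Fresh copy of the sample data so the block is self-contained
df = pd.DataFrame(
    {
        'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['NY', 'London', 'Paris', 'Berlin'],
    },
    index=['C001', 'C002', 'C003', 'C004'],
)

# Update every row that matches a condition
df.loc[df['Age'] > 30, 'Age'] += 1

# Update one column for the rows matching a different condition
df.loc[df['City'] == 'Paris', 'City'] = 'Lyon'

# Assigning to a new row label enlarges the DataFrame (hypothetical customer)
df.loc['C005'] = ['Eve', 28, 'Rome']
```

Doing the row and column selection in one `loc` call, rather than chaining `df[mask]['Age'] = ...`, ensures the assignment is applied to the original DataFrame.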


### ✅ **Real-world Examples**

| Scenario                                 | Use `loc` for...                             |
| ---------------------------------------- | -------------------------------------------- |
| 🔹 Selecting a customer by ID            | `df.loc['C002']`                             |
| 🔹 Filtering employees over 40           | `df.loc[df['Age'] > 40]`                     |
| 🔹 Updating product price by ID          | `df.loc['P101', 'Price'] = 299.99`           |
| 🔹 Extracting user emails from Bangalore | `df.loc[df['City'] == 'Bangalore', 'Email']` |


## 🔹 11. `df.at[]` – Fast Scalar Access by Label

### ✅ **Purpose**

Access **a single cell** by row/column **label** (fastest for scalar access).


### ✅ **Syntax**

```python
df.at[row_label, column_label]
```


In [14]:
df

|      | Name    | Age | City       |
| ---- | ------- | --- | ---------- |
| C001 | Alice   | 25  | NY         |
| C002 | Bob     | 30  | Manchester |
| C003 | Charlie | 35  | Paris      |
| C004 | David   | 40  | Berlin     |


In [15]:
df.at['C001', 'City']

'NY'

In [16]:
# Updating a cell
df.at['C001', 'City'] = 'New York'
df

|      | Name    | Age | City       |
| ---- | ------- | --- | ---------- |
| C001 | Alice   | 25  | New York   |
| C002 | Bob     | 30  | Manchester |
| C003 | Charlie | 35  | Paris      |
| C004 | David   | 40  | Berlin     |


### ✅ **Performance**

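* `at[]` is generally faster than `loc[]` for reading or updating a **single cell**, because it skips the logic `loc[]` needs to handle lists, slices, and boolean masks
* Prefer `at[]` when you need exactly one value and are not working with slices or multiple rows/columns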
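
One way to check the speed difference on your own machine is a quick `timeit` comparison. This is only a rough sketch on the small sample frame; the actual numbers depend on the pandas version, hardware, and data, so none are shown here.

```python
import timeit

import pandas as pd

# Small sample frame, assumed here for illustration
df = pd.DataFrame(
    {
        'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
    },
    index=['C001', 'C002', 'C003', 'C004'],
)

n = 2_000  # number of repeated lookups per method

# Time the same scalar lookup through loc[] and through at[]
t_loc = timeit.timeit(lambda: df.loc['C002', 'Age'], number=n)
t_at = timeit.timeit(lambda: df.at['C002', 'Age'], number=n)

print(f"loc: {t_loc:.4f}s  at: {t_at:.4f}s  ({n} lookups each)")
```

The gap only matters when scalar access happens many times, for example inside a loop; for one-off lookups either accessor is fine.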


### ✅ **Real-world Examples**

| Scenario                     | Use `at[]` for...                  |
| ---------------------------- | ---------------------------------- |
| ✅ Updating a specific salary | `df.at['E1002', 'Salary'] = 85000` |
| ✅ Reading a specific email   | `df.at['U001', 'Email']`           |


## ⚠️ Key Differences: `loc[]` vs `at[]`

| Feature       | `.loc[]`                                | `.at[]`                           |
| ------------- | --------------------------------------- | --------------------------------- |
| Access type   | Label-based, multiple or single         | Label-based, **single cell only** |
| Row input     | Single label, list, slice, boolean mask | Single label                      |
| Column input  | Single label, list, slice, boolean mask | Single column label               |
| Performance   | Slower (flexible)                       | Fastest (scalar only)             |
| Can set value | ✅                                       | ✅                                 |
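
The "single cell only" restriction can be seen directly. In the sketch below (sample data assumed as before), `loc[]` accepts a list of row labels while `at[]` rejects it; the exact exception type raised by `at[]` may vary across pandas versions, so the code catches a broad `Exception` and just prints its name. Reading a missing label raises `KeyError`.

```python
import pandas as pd

# Small sample frame, assumed here for illustration
df = pd.DataFrame(
    {'Age': [25, 30, 35, 40]},
    index=['C001', 'C002', 'C003', 'C004'],
)

# loc[] accepts a list of row labels and returns a Series
ages = df.loc[['C001', 'C002'], 'Age']

# at[] expects exactly one row label and one column label
try:
    df.at[['C001', 'C002'], 'Age']
    at_accepts_list = True
except Exception as exc:  # exception type differs between pandas versions
    at_accepts_list = False
    print(type(exc).__name__)

# Reading a label that does not exist raises KeyError
missing_label_raises = False
try:
    df.at['C999', 'Age']
except KeyError:
    missing_label_raises = True
```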

---

## 🧠 Quick Summary of Label-based Selection

| Task                       | Code Example                               |
| -------------------------- | ------------------------------------------ |
| Select row by label        | `df.loc['C001']`                           |
| Select value at row/col    | `df.loc['C001', 'City']`                   |
| Select multiple rows/cols  | `df.loc[['C001','C002'], ['Name','City']]` |
| Filter rows by condition   | `df.loc[df['Age'] > 30]`                   |
| Single cell access (fast)  | `df.at['C001', 'Age']`                     |
| Assign a value             | `df.loc['C001', 'Age'] = 28`               |
| Assign fast to single cell | `df.at['C001', 'Age'] = 28`                |

