# **Data Selection & Indexing**

In [1]:
import pandas as pd

## 4. **Integer-based Selection in pandas**

This section covers how to select rows and columns based on their **integer positions** — just like using zero-based indexing in regular Python lists or NumPy arrays.

There are two key tools:

1. `df.iloc[]` – Select one or more values using integer positions **(integer location)**
2. `df.iat[]` – Fast access to a single value using integer positions **(integer at)**

## 🔹 12. `df.iloc[]` – Integer Position-Based Selection

### ✅ **Purpose**

Access rows and columns using **zero-based integer indexing**, not labels.
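
The difference between positions and labels only becomes visible when the index is not the default `0, 1, 2, ...`. The sketch below uses a hypothetical `people` frame (not part of the original notebook) with index labels `10, 20, 30` to show that `iloc[0]` means "first row" while `loc[0]` looks for a label that does not exist.

```python
import pandas as pd

# Hypothetical frame whose index labels are deliberately not 0, 1, 2, ...
people = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]},
    index=[10, 20, 30],
)

by_position = people.iloc[0]   # first row, whatever its label is
by_label = people.loc[10]      # row whose index label is 10

print(by_position["Name"])     # Alice
print(by_label["Name"])        # Alice

# Label 0 does not exist here, so loc raises KeyError where iloc[0] succeeds.
try:
    people.loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False

print(label_zero_found)        # False
```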


### ✅ **Syntax**

```python
df.iloc[row_position]             # Single row
df.iloc[start:stop]               # Row slice
df.iloc[row_pos, col_pos]         # Single cell
df.iloc[[row1, row2], [col1, col2]]  # Specific rows & columns
```


In [2]:
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['NY', 'London', 'Paris', 'Berlin']
}
df = pd.DataFrame(data)

df

      Name  Age    City
0    Alice   25      NY
1      Bob   30  London
2  Charlie   35   Paris
3    David   40  Berlin


In [None]:
# Select a Row by Integer Position
df.iloc[1]

Name       Bob
Age         30
City    London
Name: 1, dtype: object

In [None]:
# Select Multiple Rows
df.iloc[[1, 3]]

    Name  Age    City
1    Bob   30  London
3  David   40  Berlin


In [None]:
# Select a Cell by Position
df.iloc[1, 2]

'London'

In [None]:
# Select all rows of a single column (position 2 is City)
df.iloc[:, 2]

0        NY
1    London
2     Paris
3    Berlin
Name: City, dtype: object

In [11]:
df.iloc[[1,3], [0, 2]]

    Name    City
1    Bob  London
3  David  Berlin


In [None]:
df.iloc[1:3, 0:2]
# Selects rows at positions 1–2 and columns at positions 0–1; the stop positions (3 and 2) are excluded.

      Name  Age
1      Bob   30
2  Charlie   35
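
Two slicing details are easy to miss: `iloc` slices exclude the stop position while `loc` slices include the stop label, and out-of-range positions are tolerated in slices but not in single lookups. The following is a minimal sketch on a copy of the sample data; the name `df_demo` is only used here to keep the example self-contained.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["NY", "London", "Paris", "Berlin"],
})

by_position = df_demo.iloc[1:3]   # positions 1 and 2 (stop excluded)
by_label = df_demo.loc[1:3]       # labels 1, 2 and 3 (stop included)
print(len(by_position), len(by_label))   # 2 3

# A slice that runs past the end is truncated, as with Python lists.
past_end = df_demo.iloc[2:100]
print(len(past_end))                     # 2

# A single out-of-range position raises IndexError instead.
try:
    df_demo.iloc[100]
    out_of_range_ok = True
except IndexError:
    out_of_range_ok = False

print(out_of_range_ok)                   # False
```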


### ✅ **Real-World Examples**

| Scenario                                                  | Example              |
| --------------------------------------------------------- | -------------------- |
| ✅ Get first 100 rows                                      | `df.iloc[0:100]`     |
| ✅ Skip the first data row (e.g., a header read as data)   | `df.iloc[1:]`        |
| ✅ Select name and age for first two records               | `df.iloc[0:2, 0:2]`  |
| ✅ Pick specific rows by position (e.g., sampled indexes)  | `df.iloc[[2, 5, 9]]` |
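
The sampling scenario can be sketched as follows: draw row positions with NumPy, then pass them to `iloc`. The `records` frame, its columns, and the seed are made up for illustration. pandas also provides `DataFrame.sample`, which is usually the more direct choice when the positions themselves are not needed.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 20 rows with a default RangeIndex.
records = pd.DataFrame({"id": range(100, 120), "score": range(20)})

rng = np.random.default_rng(seed=0)   # the seed value is arbitrary
positions = rng.choice(len(records), size=3, replace=False)

# Sort the positions so the sample keeps the original row order.
sample = records.iloc[np.sort(positions)]
print(sample)
```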

### 🔸 Negative Indexing (Like Python Lists)

In [13]:
df

      Name  Age    City
0    Alice   25      NY
1      Bob   30  London
2  Charlie   35   Paris
3    David   40  Berlin


In [15]:
df.iloc[-1]

Name     David
Age         40
City    Berlin
Name: 3, dtype: object

In [17]:
# Last two rows; last two columns in reverse order (step -1, stopping before position -3)
df.iloc[-2:, :-3:-1]

     City  Age
2   Paris   35
3  Berlin   40
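
The column slice `:-3:-1` is dense, so it can help to take it apart. This sketch rebuilds the sample data under the illustrative name `df_demo` and checks that the negative slice matches an explicit list of positions.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["NY", "London", "Paris", "Berlin"],
})

# Step -1 walks backwards from the last column and stops before position -3,
# so it visits positions -1 (City) and -2 (Age).
reversed_cols = df_demo.iloc[:, :-3:-1]
print(list(reversed_cols.columns))   # ['City', 'Age']

combined = df_demo.iloc[-2:, :-3:-1]
explicit = df_demo.iloc[[2, 3], [2, 1]]   # same rows and columns, spelled out

print(combined.values.tolist())      # [['Paris', 35], ['Berlin', 40]]
```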


## 🔹 13. `df.iat[]` – Fast Scalar Access by Integer Index

### ✅ **Purpose**

Access a **single value quickly** using row and column positions (like `iloc` but scalar-only).


### ✅ **Syntax**

```python
df.iat[row_position, column_position]
```


In [19]:
df

      Name  Age    City
0    Alice   25      NY
1      Bob   30  London
2  Charlie   35   Paris
3    David   40  Berlin


In [21]:
df.iat[1, 2]

'London'

In [23]:
df.iat[2, 0]

'Charlie'

In [25]:
df.iat[0, 1] = 26

df

      Name  Age    City
0    Alice   26      NY
1      Bob   30  London
2  Charlie   35   Paris
3    David   40  Berlin


### ✅ **Real-World Examples**

| Scenario                               | Use `iat` for...                                                |
| -------------------------------------- | --------------------------------------------------------------- |
| ✅ Update salary of a specific employee | `df.iat[4, 3] = 95000`                                          |
| ✅ Fetch customer email from a cell     | `df.iat[2, 5]`                                                  |
| ✅ Fast update inside loop              | Looping through rows and changing one value per row using `iat` |
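
The loop scenario might look like the sketch below. The `staff` frame and the flat raise of 1000 are invented for illustration. The vectorized version is included for comparison, since a whole-column update normally does not need a loop at all.

```python
import pandas as pd

# Hypothetical data; Salary is the column at position 1.
staff = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Salary": [50000, 60000, 70000],
})

# Row-by-row update with iat: one cell read and one cell write per iteration.
looped = staff.copy()
for row_pos in range(len(looped)):
    looped.iat[row_pos, 1] = looped.iat[row_pos, 1] + 1000

# Equivalent vectorized update of the whole column.
vectorized = staff.copy()
vectorized["Salary"] = vectorized["Salary"] + 1000

print(looped["Salary"].tolist())       # [51000, 61000, 71000]
print(vectorized["Salary"].tolist())   # [51000, 61000, 71000]
```

`iat` earns its place when each row needs logic that cannot easily be expressed as a column operation.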

---

## 🔄 `iloc[]` vs `iat[]`

| Feature     | `iloc[]`                                                 | `iat[]`                                 |
| ----------- | -------------------------------------------------------- | --------------------------------------- |
| Input       | Integer positions: single, list, slice, or boolean array | Exactly one row and one column integer  |
| Returns     | Scalar, Series, or DataFrame, depending on the indexer   | Scalar                                  |
| Performance | General-purpose; more overhead per call                  | Lowest overhead for a single cell       |
| Usage       | Multiple row/col selection                               | Single cell                             |
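
The "Returns" row is easiest to see by checking types directly. This is a small sketch on a copy of the sample data; the name `df_demo` is used only to keep it self-contained.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["NY", "London", "Paris", "Berlin"],
})

row = df_demo.iloc[1]              # single integer -> Series
one_row_frame = df_demo.iloc[[1]]  # list with one integer -> DataFrame
cell = df_demo.iloc[1, 2]          # two integers -> scalar
fast_cell = df_demo.iat[1, 2]      # iat always returns a scalar

print(type(row).__name__)            # Series
print(type(one_row_frame).__name__)  # DataFrame
print(cell, fast_cell)               # London London
```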

---

## ✅ Quick Summary Table

| Task                    | Code Example              |
| ----------------------- | ------------------------- |
| Get 2nd row             | `df.iloc[1]`              |
| Get 3rd row, 2nd column | `df.iloc[2, 1]`           |
| Get last 2 rows         | `df.iloc[-2:]`            |
| Get value (fast)        | `df.iat[2, 2]`            |
| Set value (fast)        | `df.iat[1, 0] = 'Robert'` |
| Get first 100 rows      | `df.iloc[:100]`           |


## ⚠️ Best Practices

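* Use `iloc[]` when position matters (e.g., a CSV loaded without a header row, so the columns have no meaningful names).
* Use `iat[]` to read or write one cell at a time, such as inside a loop that cannot be avoided; for whole-column updates, vectorized operations are usually faster than any loop.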
* Avoid `iloc` with label-based logic — prefer `loc` for that.
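
When code knows a column by name but still wants positional access, `Index.get_loc` translates the label into a position. The sketch below shows this on a copy of the sample data (the name `df_demo` is illustrative), alongside `at`, the label-based counterpart of `iat`.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["NY", "London", "Paris", "Berlin"],
})

# Translate the column label into its integer position.
age_pos = df_demo.columns.get_loc("Age")
print(age_pos)                       # 1

second_age = df_demo.iat[1, age_pos]   # positional scalar access
same_by_label = df_demo.at[1, "Age"]   # label-based scalar access

print(second_age, same_by_label)     # 30 30
```

This avoids hard-coding a column position that would silently point at the wrong column if the column order changes.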

