## 1 Setup

### 1.1 Import Libraries

In [1]:
import numpy as np
import pandas as pd

print("Pandas version:", pd.__version__)

Pandas version: 2.3.0


### 1.2 Import Data

#### McKinsey GDP Dataset

In [2]:
mk_df = pd.read_csv("data/mckinsey.csv")
mk_df.head(3)

Unnamed: 0,country,year,population,continent,life_exp,gdp_cap
0,Afghanistan,1952,8425333,Asia,28.801,779.445314
1,Afghanistan,1957,9240934,Asia,30.332,820.85303
2,Afghanistan,1962,10267083,Asia,31.997,853.10071


#### Employees Dataset

In [3]:
employee_data = {
    "Name": ["Alex", "Ajax", "Jane", "John", "Anna"],
    "Age": [31, 31, 28, 35, 40],
    "Role": ["Senior SD", "Associate Architect", "Junior SD", "Architect", "V.P."],
    "DOJ": ["01-06-2021", "01-01-2025", "01-03-2023", "01-12-2022", "01-08-2000"],
}

emp_df = pd.DataFrame(data=employee_data)
emp_df

Unnamed: 0,Name,Age,Role,DOJ
0,Alex,31,Senior SD,01-06-2021
1,Ajax,31,Associate Architect,01-01-2025
2,Jane,28,Junior SD,01-03-2023
3,John,35,Architect,01-12-2022
4,Anna,40,V.P.,01-08-2000


### 1.3 Update Index

Before accessing rows by index, replace the default index of each DataFrame so that the explicit row index (the labels) is clearly different from the implicit row index (the 0-based positions).

#### McKinsey `DataFrame`

##### Old Explicit Indices

For the `mk_df` DataFrame, old explicit row indices range from 0 to 1703 as seen below.

In [4]:
mk_df.index.values

array([   0,    1,    2, ..., 1701, 1702, 1703], shape=(1704,))

In [5]:
mk_df.index = range(1, len(mk_df) + 1)

##### New Explicit Indices

For the `mk_df` DataFrame, new explicit row indices range from 1 to 1704 as seen below.

In [6]:
mk_df.index.values

array([   1,    2,    3, ..., 1702, 1703, 1704], shape=(1704,))

#### Employee `DataFrame`

Explicit row indices can also be strings, so `emp_df` gets custom string labels.

##### Old Explicit Indices

In [7]:
emp_df.index.values

array([0, 1, 2, 3, 4])

In [8]:
emp_df.index = ["E01", "E02", "E03", "E04", "E05"]

##### New Explicit Indices

In [9]:
emp_df.index.values

array(['E01', 'E02', 'E03', 'E04', 'E05'], dtype=object)
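The distinction set up above can be sketched on a small frame. `demo_df` is a hypothetical three-row frame used only for illustration; it is not part of the notebook's data.

```python
import pandas as pd

# Hypothetical three-row frame, used only for illustration.
demo_df = pd.DataFrame({"Name": ["Alex", "Ajax", "Jane"]})

# Replace the default 0..n-1 labels with custom string labels.
demo_df.index = ["E01", "E02", "E03"]

# Explicit index: the labels stored in demo_df.index.
explicit_labels = demo_df.index.tolist()

# Implicit index: the 0-based position of each row, unchanged by relabeling.
implicit_positions = [demo_df.index.get_loc(label) for label in demo_df.index]

print(explicit_labels)
print(implicit_positions)
```

Relabeling changes only the explicit index; the positions still run from 0 to `len(demo_df) - 1`.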

## 2 Implicit vs Explicit Index

### 2.1 The Problem

#### Case #1: Using Single Index

In [10]:
try:
    mk_df[1]
except KeyError as err:
    print("KeyError:", err)

KeyError: 1


In [11]:
try:
    emp_df["E05"]
except KeyError as err:
    print("KeyError:", err)

KeyError: 'E05'


#### Case #2: Using Multiple Index

In [12]:
try:
    mk_df[1, 2, 3]
except KeyError as err:
    print("KeyError:", err)

KeyError: (1, 2, 3)


In [13]:
try:
    emp_df["E03", "E04", "E05"]
except KeyError as err:
    print("KeyError:", err)

KeyError: ('E03', 'E04', 'E05')


#### Case #3: Using Slicing

In [14]:
mk_df[100:115:5]

Unnamed: 0,country,year,population,continent,life_exp,gdp_cap
101,Bangladesh,1972,70759295,Asia,45.252,630.233627
106,Bangladesh,1997,123315288,Asia,59.412,972.770035
111,Belgium,1962,9218400,Europe,70.25,10991.20676


In [15]:
emp_df["E03":"E05"]

Unnamed: 0,Name,Age,Role,DOJ
E03,Jane,28,Junior SD,01-03-2023
E04,John,35,Architect,01-12-2022
E05,Anna,40,V.P.,01-08-2000


### 2.2 Solution
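The cases above suggest that plain `[]` treats a scalar or tuple key as a column label, while a slice acts on rows (by position for integers, by label for strings). A minimal sketch of the unambiguous alternative, using a hypothetical `demo_df`: `loc` addresses rows by explicit label and `iloc` by implicit position.

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
demo_df = pd.DataFrame(
    {"Name": ["Alex", "Ajax", "Jane"], "Age": [31, 31, 28]},
    index=["E01", "E02", "E03"],
)

# A scalar key inside [] is matched against column labels, not row labels.
try:
    demo_df["E02"]
    scalar_lookup_failed = False
except KeyError:
    scalar_lookup_failed = True

# loc uses the explicit label; iloc uses the implicit position.
by_label = demo_df.loc["E02"]
by_position = demo_df.iloc[1]

print(scalar_lookup_failed)
print(by_label.equals(by_position))
```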

## 3 Access Rows

### 3.1 Using `loc`

#### Case #1: Using Single Index

In [16]:
mk_df.loc[1]

country       Afghanistan
year                 1952
population        8425333
continent            Asia
life_exp           28.801
gdp_cap        779.445314
Name: 1, dtype: object

In [17]:
emp_df.loc["E05"]

Name          Anna
Age             40
Role          V.P.
DOJ     01-08-2000
Name: E05, dtype: object

#### Case #2: Using Multiple Index

In [18]:
mk_df.loc[[1, 2, 3]]

Unnamed: 0,country,year,population,continent,life_exp,gdp_cap
1,Afghanistan,1952,8425333,Asia,28.801,779.445314
2,Afghanistan,1957,9240934,Asia,30.332,820.85303
3,Afghanistan,1962,10267083,Asia,31.997,853.10071


In [19]:
emp_df.loc[["E03", "E04", "E05"]]

Unnamed: 0,Name,Age,Role,DOJ
E03,Jane,28,Junior SD,01-03-2023
E04,John,35,Architect,01-12-2022
E05,Anna,40,V.P.,01-08-2000


#### Case #3: Using Slicing

In [20]:
mk_df.loc[100:115:5]

Unnamed: 0,country,year,population,continent,life_exp,gdp_cap
100,Bangladesh,1967,62821884,Asia,43.453,721.186086
105,Bangladesh,1992,113704579,Asia,56.018,837.810164
110,Belgium,1957,8989111,Europe,69.24,9714.960623
115,Belgium,1982,9856303,Europe,73.93,20979.84589


In [21]:
emp_df.loc["E03":"E05"]

Unnamed: 0,Name,Age,Role,DOJ
E03,Jane,28,Junior SD,01-03-2023
E04,John,35,Architect,01-12-2022
E05,Anna,40,V.P.,01-08-2000
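Two properties of `loc` are visible in the outputs above: label slices include both endpoints, and every label in a list must exist. A small sketch on a hypothetical `demo_df`:

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
demo_df = pd.DataFrame(
    {"Age": [31, 31, 28, 35]},
    index=["E01", "E02", "E03", "E04"],
)

# Label slices include both the start and the end label.
sliced_labels = demo_df.loc["E02":"E04"].index.tolist()
print(sliced_labels)

# A list that contains a label missing from the index raises KeyError.
try:
    demo_df.loc[["E01", "E99"]]
    missing_label_raised = False
except KeyError:
    missing_label_raised = True
print(missing_label_raised)
```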


### 3.2 Using `iloc`

1. `iloc` works on the implicit (positional) index, so it accepts negative values, which count from the end.
2. The end index is exclusive when slicing.
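Both points can be checked on a small hypothetical frame (`demo_df` is illustrative only):

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
demo_df = pd.DataFrame(
    {"Age": [31, 31, 28, 35, 40]},
    index=["E01", "E02", "E03", "E04", "E05"],
)

# Negative positions count from the end: -1 is the last row.
last_row_label = demo_df.iloc[-1].name

# The end position is exclusive: 0:3 returns positions 0, 1 and 2.
first_three_labels = demo_df.iloc[0:3].index.tolist()

print(last_row_label)
print(first_three_labels)
```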

#### Case #1: Using Single Index

In [22]:
mk_df.iloc[0]

country       Afghanistan
year                 1952
population        8425333
continent            Asia
life_exp           28.801
gdp_cap        779.445314
Name: 1, dtype: object

In [23]:
emp_df.iloc[4]

Name          Anna
Age             40
Role          V.P.
DOJ     01-08-2000
Name: E05, dtype: object

#### Case #2: Using Multiple Index

In [24]:
mk_df.iloc[[0, 10, 1703]]

Unnamed: 0,country,year,population,continent,life_exp,gdp_cap
1,Afghanistan,1952,8425333,Asia,28.801,779.445314
11,Afghanistan,2002,25268405,Asia,42.129,726.734055
1704,Zimbabwe,2007,12311143,Africa,43.487,469.709298


In [25]:
emp_df.iloc[[4, 1, 3]]

Unnamed: 0,Name,Age,Role,DOJ
E05,Anna,40,V.P.,01-08-2000
E02,Ajax,31,Associate Architect,01-01-2025
E04,John,35,Architect,01-12-2022


#### Case #3: Using Slicing

In [26]:
mk_df.iloc[100:115:5]

Unnamed: 0,country,year,population,continent,life_exp,gdp_cap
101,Bangladesh,1972,70759295,Asia,45.252,630.233627
106,Bangladesh,1997,123315288,Asia,59.412,972.770035
111,Belgium,1962,9218400,Europe,70.25,10991.20676


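With `iloc`, slices act on the implicit index and the end index is exclusive, which is why the rows above carry labels 101, 106 and 111 rather than 100, 105, 110 and 115.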

In [27]:
emp_df.iloc[0:3]

Unnamed: 0,Name,Age,Role,DOJ
E01,Alex,31,Senior SD,01-06-2021
E02,Ajax,31,Associate Architect,01-01-2025
E03,Jane,28,Junior SD,01-03-2023


In [28]:
mk_df.iloc[1:10:2]

Unnamed: 0,country,year,population,continent,life_exp,gdp_cap
2,Afghanistan,1957,9240934,Asia,30.332,820.85303
4,Afghanistan,1967,11537966,Asia,34.02,836.197138
6,Afghanistan,1977,14880372,Asia,38.438,786.11336
8,Afghanistan,1987,13867957,Asia,40.822,852.395945
10,Afghanistan,1997,22227415,Asia,41.763,635.341351


In [29]:
mk_df.iloc[-1:-10:-2]

Unnamed: 0,country,year,population,continent,life_exp,gdp_cap
1704,Zimbabwe,2007,12311143,Africa,43.487,469.709298
1702,Zimbabwe,1997,11404948,Africa,46.809,792.44996
1700,Zimbabwe,1987,9216418,Africa,62.351,706.157306
1698,Zimbabwe,1977,6642107,Africa,57.674,685.587682
1696,Zimbabwe,1967,4995432,Africa,53.995,569.795071


### 3.3 Conclusion
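As a compact summary of sections 3.1 and 3.2, the sketch below applies the same slice through `loc` and `iloc` to a hypothetical frame whose labels start at 1, mirroring the relabeled `mk_df`:

```python
import pandas as pd

# Hypothetical frame with labels 1..4, used only for illustration.
demo_df = pd.DataFrame({"year": [1952, 1957, 1962, 1967]}, index=range(1, 5))

# loc: explicit labels, end label included.
loc_labels = demo_df.loc[1:3].index.tolist()

# iloc: implicit positions, end position excluded.
iloc_labels = demo_df.iloc[1:3].index.tolist()

print(loc_labels)
print(iloc_labels)
```

The same `1:3` selects three rows through `loc` and two through `iloc`, so the accessor should be chosen according to whether the numbers are meant as labels or as positions.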

## 4 Access Columns

### 4.1 Using Labels

#### Case #1: Using Single Index

##### Example #1

In [30]:
emp_df["Name"]

E01    Alex
E02    Ajax
E03    Jane
E04    John
E05    Anna
Name: Name, dtype: object

The resulting Series keeps the explicit row index, so a row label can be used to pick a single value.

In [31]:
emp_df["Name"]["E05"]

'Anna'

##### Example #2

The same works with integer labels: `1702` here is an explicit row label, not a position.

In [32]:
mk_df["continent"][1702]

'Africa'
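The chained lookups above first select a column and then a row. A hedged sketch of single-step equivalents on a hypothetical `demo_df`; `loc[row, column]` and `at[row, column]` are standard pandas accessors, and the single-step form is generally preferred when assigning values, since chained indexing may operate on a copy.

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
demo_df = pd.DataFrame(
    {"Name": ["Alex", "Ajax"], "Age": [31, 31]},
    index=["E01", "E02"],
)

chained = demo_df["Name"]["E02"]          # column first, then row label
single_step = demo_df.loc["E02", "Name"]  # row label and column label together
scalar_access = demo_df.at["E02", "Name"]  # scalar-only accessor

print(chained, single_step, scalar_access)
```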

#### Case #2: Using Multiple Index
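A minimal sketch for this case, assuming the usual pandas behaviour that a list of column labels inside `[]` returns a DataFrame, while a single label returns a Series. `demo_df` is hypothetical.

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
demo_df = pd.DataFrame(
    {"Name": ["Alex", "Ajax"], "Age": [31, 31], "Role": ["Senior SD", "Architect"]},
    index=["E01", "E02"],
)

# A list of labels selects several columns and returns a DataFrame.
subset = demo_df[["Name", "Role"]]

# A single label returns a Series; a one-element list returns a DataFrame.
as_series = demo_df["Name"]
as_frame = demo_df[["Name"]]

print(subset.columns.tolist())
print(type(as_series).__name__, type(as_frame).__name__)
```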

#### Case #3: Using Slicing
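As section 2.1 showed, a slice inside `[]` selects rows, not columns. A sketch on a hypothetical `demo_df` that contrasts this with a column slice written through `loc`:

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
demo_df = pd.DataFrame(
    {"Name": ["Alex", "Ajax", "Jane"], "Age": [31, 31, 28], "Role": ["A", "B", "C"]},
    index=["E01", "E02", "E03"],
)

# A slice inside [] acts on rows.
row_slice = demo_df["E01":"E02"]

# To slice columns by label, use loc with a full row slice.
col_slice = demo_df.loc[:, "Name":"Age"]

print(row_slice.index.tolist())
print(col_slice.columns.tolist())
```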

### 4.2 Using `loc`

#### Case #1: Using single column name

In [33]:
# temp.loc["a"]

In [34]:
# temp.loc[["a"]]

#### Case #2: Using multiple column name

In [35]:
# temp.loc[["a", "c"]]
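The commented cells refer to a `temp` frame that is not defined in this notebook. The sketch below assumes a hypothetical `temp` with columns `a`, `b` and `c`. Note that `temp.loc["a"]` on its own would look for a row labeled `a`; to select columns with `loc`, the row slot has to be filled, for example with `:`.

```python
import pandas as pd

# Hypothetical frame standing in for the undefined `temp`.
temp = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

single = temp.loc[:, "a"]           # one label: returns a Series
single_frame = temp.loc[:, ["a"]]   # one-element list: returns a DataFrame
multiple = temp.loc[:, ["a", "c"]]  # several labels: returns a DataFrame

print(type(single).__name__)
print(single_frame.columns.tolist())
print(multiple.columns.tolist())
```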

#### Case #3: Using slicing on column names

In [36]:
mk_df.loc[:2, "year":"continent"]

Unnamed: 0,year,population,continent
1,1952,8425333,Asia
2,1957,9240934,Asia


In [37]:
mk_df.loc[0:3, "country":"population"]

Unnamed: 0,country,year,population
1,Afghanistan,1952,8425333
2,Afghanistan,1957,9240934
3,Afghanistan,1962,10267083


In [38]:
mk_df.loc[6:3:-1, "country"::2]

Unnamed: 0,country,population,life_exp
6,Afghanistan,14880372,38.438
5,Afghanistan,13079460,36.088
4,Afghanistan,11537966,34.02
3,Afghanistan,10267083,31.997
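The last cell combines a reversed row slice with a stepped column slice. A smaller sketch of the same pattern on a hypothetical `demo_df` with labels 1 to 6 and columns `a` to `e`:

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
values = list(range(6))
demo_df = pd.DataFrame(
    {"a": values, "b": values, "c": values, "d": values, "e": values},
    index=range(1, 7),
)

# Rows: walk the labels from 5 down to 2, both ends included.
# Columns: start at "a" and take every second column.
result = demo_df.loc[5:2:-1, "a"::2]

print(result.index.tolist())
print(result.columns.tolist())
```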


In [39]:
# df.loc[9::-3]

### 4.3 Using `iloc`

#### Case #1: Using Single Index
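A minimal sketch for this case on a hypothetical `demo_df`: with `iloc`, the second slot takes a column position, and passing positions in both slots returns a single cell.

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
demo_df = pd.DataFrame(
    {"country": ["Afghanistan", "Belgium"], "year": [1952, 1957]},
    index=[1, 2],
)

# All rows, column at position 1: returns the "year" column as a Series.
second_column = demo_df.iloc[:, 1]

# Row position 0, column position 1: returns one cell.
cell = demo_df.iloc[0, 1]

print(second_column.name)
print(cell)
```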

#### Case #2: Using Multiple Index

In [40]:
mk_df.iloc[[0, 1, 2], [1, 3]]

Unnamed: 0,year,continent
1,1952,Asia
2,1957,Asia
3,1962,Asia


#### Case #3: Using Slicing

In [41]:
# df.iloc[:20]

In [42]:
# pd.DataFrame(df, columns=['country', 'year'])
# df.iloc[:, 0:2]

In [43]:
mk_df.iloc[:4, :3]

Unnamed: 0,country,year,population
1,Afghanistan,1952,8425333
2,Afghanistan,1957,9240934
3,Afghanistan,1962,10267083
4,Afghanistan,1967,11537966
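When only column names are known, their positions can be looked up before calling `iloc`. The sketch below uses `Index.get_indexer` on a hypothetical `demo_df`; it is one possible way to bridge labels and positions, not the only one.

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
demo_df = pd.DataFrame(
    {
        "country": ["Afghanistan", "Belgium"],
        "year": [1952, 1957],
        "population": [8425333, 8989111],
    },
    index=[1, 2],
)

# Translate column labels into implicit positions.
positions = demo_df.columns.get_indexer(["country", "population"])

# Use the positions with iloc.
subset = demo_df.iloc[:, positions]

print(positions.tolist())
print(subset.columns.tolist())
```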


## 5 Access Index

## 6 Insert Rows

In [44]:
emp_df

Unnamed: 0,Name,Age,Role,DOJ
E01,Alex,31,Senior SD,01-06-2021
E02,Ajax,31,Associate Architect,01-01-2025
E03,Jane,28,Junior SD,01-03-2023
E04,John,35,Architect,01-12-2022
E05,Anna,40,V.P.,01-08-2000


### 6.1 Using `loc`

In [45]:
emp_df.loc["E06"] = ["Bob", 30, "Junior SD", "01-07-2024"]
emp_df

Unnamed: 0,Name,Age,Role,DOJ
E01,Alex,31,Senior SD,01-06-2021
E02,Ajax,31,Associate Architect,01-01-2025
E03,Jane,28,Junior SD,01-03-2023
E04,John,35,Architect,01-12-2022
E05,Anna,40,V.P.,01-08-2000
E06,Bob,30,Junior SD,01-07-2024


Hence, it is possible to add a new row using `loc`: assigning to a label that does not exist yet appends that row.

### 6.2 Using `iloc`

##### Example #1

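Fetch the implicit (positional) indices. `np.where` returns the positions of truthy labels, which covers every row here because no label is 0 or an empty string.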

In [46]:
# [mk_df.index.get_loc(exp_idx) for exp_idx in mk_df.index]
np.where(mk_df.index.values)

(array([   0,    1,    2, ..., 1701, 1702, 1703], shape=(1704,)),)

In [47]:
try:
    mk_df.iloc[1704] = ["Zimbabwe", 2008, 12311143, "Africa", 43.487, 469.709298]
except IndexError as err:
    print("IndexError:", err)

IndexError: iloc cannot enlarge its target object


##### Example #2

Fetch the implicit (positional) indices of `emp_df`, which now has six rows.

In [48]:
# [emp_df.index.get_loc(exp_idx) for exp_idx in emp_df.index]
np.where(emp_df.index.values)

(array([0, 1, 2, 3, 4, 5]),)

In [49]:
try:
    emp_df.iloc[6] = ["Kavin", 30, "Junior SD", "01-07-2025"]
except IndexError as err:
    print("IndexError:", err)

IndexError: iloc cannot enlarge its target object


Hence, it is **NOT** possible to add a new row using `iloc`: a position beyond the last row does not exist, and `iloc` cannot enlarge the DataFrame.
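The two results can be reproduced side by side on a small hypothetical `demo_df`. The sketch also shows `pd.concat` as an alternative for appending rows; that function is not used elsewhere in this notebook and is included here as an assumption about what may be convenient for adding several rows at once.

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
demo_df = pd.DataFrame(
    {"Name": ["Alex", "Ajax"], "Age": [31, 31]},
    index=["E01", "E02"],
)

# loc: assigning to a new label appends a row.
demo_df.loc["E03"] = ["Jane", 28]

# iloc: assigning to a position past the end raises IndexError.
try:
    demo_df.iloc[3] = ["John", 35]
    iloc_enlarged = True
except IndexError:
    iloc_enlarged = False

# Alternative: build the new rows as a frame and concatenate.
new_rows = pd.DataFrame({"Name": ["John"], "Age": [35]}, index=["E04"])
combined = pd.concat([demo_df, new_rows])

print(demo_df.index.tolist())
print(iloc_enlarged)
print(combined.index.tolist())
```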