# Pandas DataFrame Index

Every Pandas DataFrame has an **index**, which labels the rows.  
While the index is often numeric (0, 1, 2, …), it **does not have to be**.

This notebook covers:
- What a DataFrame index is
- Using custom index values
- Sorting by index
- Resetting the index
- Renaming the old index column after resetting



## Setup

In [1]:
import pandas as pd

## Example DataFrame


In [2]:
baseball_df = pd.DataFrame({
    "wins": [76, 83, 74, 71, 86],
    "losses": [86, 79, 88, 91, 76]
})

baseball_df

| | wins | losses |
|---|---|---|
| 0 | 76 | 86 |
| 1 | 83 | 79 |
| 2 | 74 | 88 |
| 3 | 71 | 91 |
| 4 | 86 | 76 |


## The Index Is Not Just Row Numbers

By default, Pandas assigns numeric row labels.  
However, the index can contain other values, such as strings.
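To see the default labels directly, you can inspect the `index` attribute. The following is a minimal sketch using a shortened copy of the example data; the names `df` and `default_index` are illustrative and not part of the notebook.

```python
import pandas as pd

# A small stand-in DataFrame created without an explicit index
df = pd.DataFrame({"wins": [76, 83, 74], "losses": [86, 79, 88]})

# With no index supplied, pandas uses a RangeIndex starting at 0
default_index = df.index
print(default_index)        # RangeIndex(start=0, stop=3, step=1)
print(list(default_index))  # [0, 1, 2]
```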


### Assigning a Custom Index


In [3]:
baseball_df.index = ["Bucs", "Reds", "Cubs", "Cards", "Brewers"]
baseball_df

| | wins | losses |
|---|---|---|
| Bucs | 76 | 86 |
| Reds | 83 | 79 |
| Cubs | 74 | 88 |
| Cards | 71 | 91 |
| Brewers | 86 | 76 |


Here, the index now represents team nicknames instead of numeric row IDs.
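One practical consequence of meaningful labels is that rows can be looked up by name with `.loc`. The sketch below assumes a shortened stand-in DataFrame called `teams` (a hypothetical name) and passes the labels through the constructor's `index=` argument rather than assigning them afterwards.

```python
import pandas as pd

# Stand-in data; the labels are supplied when the DataFrame is built
teams = pd.DataFrame(
    {"wins": [76, 83, 74], "losses": [86, 79, 88]},
    index=["Bucs", "Reds", "Cubs"],
)

# Select a whole row by its index label (returns a Series)
reds_row = teams.loc["Reds"]

# Select a single value by row label and column name
reds_wins = teams.loc["Reds", "wins"]
print(reds_wins)  # 83
```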

## Sorting by Index

You can sort a DataFrame by its index using `sort_index()`. As discussed in the previous notebooks, strings sort in **lexicographical order**, which is alphabetical when the labels share the same capitalization (uppercase letters sort before lowercase ones).


In [4]:
baseball_df.sort_index()

| | wins | losses |
|---|---|---|
| Brewers | 86 | 76 |
| Bucs | 76 | 86 |
| Cards | 71 | 91 |
| Cubs | 74 | 88 |
| Reds | 83 | 79 |
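`sort_index()` also accepts `ascending=False` for reverse order, and like most DataFrame methods it returns a new object rather than reordering the original. A minimal sketch, assuming a stand-in DataFrame named `teams` (hypothetical, with only the `wins` column):

```python
import pandas as pd

# Stand-in data with the same team labels as above
teams = pd.DataFrame(
    {"wins": [76, 83, 74, 71, 86]},
    index=["Bucs", "Reds", "Cubs", "Cards", "Brewers"],
)

# Sort the labels in reverse lexicographical order
descending = teams.sort_index(ascending=False)
print(list(descending.index))  # ['Reds', 'Cubs', 'Cards', 'Bucs', 'Brewers']

# The original DataFrame keeps its original row order
print(list(teams.index))       # ['Bucs', 'Reds', 'Cubs', 'Cards', 'Brewers']
```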


## Resetting the Index

Sometimes you'd like to simply drop the index and replace it with the default, a range of integers starting, in true Python fashion, at 0. To do this, use `reset_index(drop=True)`. Note that this *returns* a new DataFrame; the original is modified in place only if you also pass `inplace=True`, as in `reset_index(drop=True, inplace=True)`. See the [`reset_index` documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reset_index.html) for details.

Below, we assign the DataFrame with the reset index to a new variable.

In [5]:
df_default_index = baseball_df.reset_index(drop=True)
df_default_index

| | wins | losses |
|---|---|---|
| 0 | 76 | 86 |
| 1 | 83 | 79 |
| 2 | 74 | 88 |
| 3 | 71 | 91 |
| 4 | 86 | 76 |
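To confirm that `reset_index(drop=True)` leaves the original untouched, you can compare the two indexes afterwards. This is a minimal sketch with a shortened stand-in DataFrame; `teams` and `renumbered` are illustrative names.

```python
import pandas as pd

# Stand-in data with string labels
teams = pd.DataFrame(
    {"wins": [76, 83, 74], "losses": [86, 79, 88]},
    index=["Bucs", "Reds", "Cubs"],
)

# drop=True discards the old labels instead of keeping them as a column
renumbered = teams.reset_index(drop=True)

print(list(renumbered.index))    # [0, 1, 2]
print(list(renumbered.columns))  # ['wins', 'losses']
print(list(teams.index))         # ['Bucs', 'Reds', 'Cubs'] (unchanged)
```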


Sometimes you want to turn the index back into a regular column.  
Use `reset_index()` with no arguments to do this. Note: when the index has no name, Pandas names the resulting column `"index"` by default, which is not very descriptive and can be confused with the actual index.

In [6]:
baseball_df.reset_index()

| | index | wins | losses |
|---|---|---|---|
| 0 | Bucs | 76 | 86 |
| 1 | Reds | 83 | 79 |
| 2 | Cubs | 74 | 88 |
| 3 | Cards | 71 | 91 |
| 4 | Brewers | 86 | 76 |


## Renaming the Old Index Column

It’s good practice to rename the `"index"` column to something that describes what it holds.

In [7]:
baseball_with_nicknames = baseball_df.reset_index().rename(columns={"index": "nickname"})
baseball_with_nicknames

| | nickname | wins | losses |
|---|---|---|---|
| 0 | Bucs | 76 | 86 |
| 1 | Reds | 83 | 79 |
| 2 | Cubs | 74 | 88 |
| 3 | Cards | 71 | 91 |
| 4 | Brewers | 86 | 76 |
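An alternative, shown here only as a sketch, is to name the index *before* resetting it, since `reset_index()` uses the index name as the new column name. The example uses `rename_axis()` on a shortened stand-in DataFrame; `teams` and `named` are hypothetical names.

```python
import pandas as pd

# Stand-in data with an unnamed string index
teams = pd.DataFrame(
    {"wins": [76, 83, 74], "losses": [86, 79, 88]},
    index=["Bucs", "Reds", "Cubs"],
)

# Name the index first; reset_index() then uses that name for the column
named = teams.rename_axis("nickname").reset_index()

print(list(named.columns))  # ['nickname', 'wins', 'losses']
```

Both approaches produce the same columns; which one reads better is largely a matter of taste.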


## Writing Operations Across Multiple Lines

This is not specific to indexes, but for readability you can split a chain of operations across several lines by ending each line with a backslash (`\`).


In [None]:
baseball_df \
    .reset_index() \
    .rename(columns={"index": "nickname"})
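Another option, which PEP 8 generally prefers over backslashes, is to wrap the whole chain in parentheses so that Python continues the expression implicitly. The sketch below compares the two styles on a shortened stand-in DataFrame named `teams` (a hypothetical name).

```python
import pandas as pd

# Stand-in data with string labels
teams = pd.DataFrame(
    {"wins": [76, 83, 74], "losses": [86, 79, 88]},
    index=["Bucs", "Reds", "Cubs"],
)

# Backslash continuation: nothing may follow the backslash on a line
with_backslashes = teams \
    .reset_index() \
    .rename(columns={"index": "nickname"})

# Parenthesized continuation: no backslashes needed, comments are allowed
with_parentheses = (
    teams
    .reset_index()                          # old labels become a column
    .rename(columns={"index": "nickname"})  # give that column a clear name
)

print(with_backslashes.equals(with_parentheses))  # True
```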

## Practice Exercise

Example DataFrame

In [8]:
# Custom DataFrame for Practice Exercises

team_stats = pd.DataFrame({
    "wins": [100, 100, 50, 0],
    "losses": [2, 10, 10, 100],
    "Drivers": ["Max Verstappen", "Charles Leclrec", "Lewis Hamilton", "George Russell"],
})

team_stats


| | wins | losses | Drivers |
|---|---|---|---|
| 0 | 100 | 2 | Max Verstappen |
| 1 | 100 | 10 | Charles Leclrec |
| 2 | 50 | 10 | Lewis Hamilton |
| 3 | 0 | 100 | George Russell |


Question 1. Perform the following operations in a **single chained expression** using line continuation (`\`).
  - Assign the following team nicknames as the index:
    - `["Red Bull", "Mercedes", "Mercedes", "Mercedes"]`

  - Sort the DataFrame by its index.

  - Reset the index so that the team names become a regular column.

  - Rename the default `"index"` column to `"team_name"`.

In [10]:
team_stats.index = ["Red Bull", "Mercedes", "Mercedes", "Mercedes"]
team_stats


| | wins | losses | Drivers |
|---|---|---|---|
| Red Bull | 100 | 2 | Max Verstappen |
| Mercedes | 100 | 10 | Charles Leclrec |
| Mercedes | 50 | 10 | Lewis Hamilton |
| Mercedes | 0 | 100 | George Russell |


In [11]:
team_stats \
    .sort_index() \
    .reset_index() \
    .rename(columns={"index": "team_name"})

| | team_name | wins | losses | Drivers |
|---|---|---|---|---|
| 0 | Mercedes | 100 | 10 | Charles Leclrec |
| 1 | Mercedes | 50 | 10 | Lewis Hamilton |
| 2 | Mercedes | 0 | 100 | George Russell |
| 3 | Red Bull | 100 | 2 | Max Verstappen |
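The solution above assigns the index in a separate statement, which also modifies `team_stats` itself. If you want every step, including the index assignment, in one chain, one possible approach is `set_axis()`, which returns a new DataFrame with the given labels. The sketch below assumes a recent pandas version and uses a trimmed stand-in DataFrame named `stats`; the `kind="mergesort"` argument is an addition here, used to keep rows with duplicate labels in their original relative order.

```python
import pandas as pd

# Stand-in for the practice data (driver names omitted for brevity)
stats = pd.DataFrame({
    "wins": [100, 100, 50, 0],
    "losses": [2, 10, 10, 100],
})

result = stats \
    .set_axis(["Red Bull", "Mercedes", "Mercedes", "Mercedes"], axis=0) \
    .sort_index(kind="mergesort") \
    .reset_index() \
    .rename(columns={"index": "team_name"})

print(list(result["team_name"]))  # ['Mercedes', 'Mercedes', 'Mercedes', 'Red Bull']
print(list(stats.index))          # [0, 1, 2, 3] (original left unchanged)
```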


In [12]:
team_stats

| | wins | losses | Drivers |
|---|---|---|---|
| Red Bull | 100 | 2 | Max Verstappen |
| Mercedes | 100 | 10 | Charles Leclrec |
| Mercedes | 50 | 10 | Lewis Hamilton |
| Mercedes | 0 | 100 | George Russell |
