

# Sorting Pandas DataFrames

In this notebook, we will learn how to sort Pandas DataFrames using one or more columns.

This notebook covers:
- Sorting by column values
- Resetting the index after sorting
- Sorting in descending order
- Understanding when sorting modifies the original DataFrame




## Setup

In [1]:
import pandas as pd

## Example DataFrame


In [2]:
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "Diana"],
    "score": [88, 92, 85, 90],
    "age": [24, 22, 25, 23]
})
df

,name,score,age
0,Alice,88,24
1,Bob,92,22
2,Charlie,85,25
3,Diana,90,23


## Sorting by a Column

Use `df.sort_values()` to sort a DataFrame by one or more columns.

Example: sort by the `score` column.


In [3]:
df.sort_values(["score"])

,name,score,age
2,Charlie,85,25
0,Alice,88,24
3,Diana,90,23
1,Bob,92,22


Notice that the sorted result keeps the index labels from the original DataFrame. To renumber the rows starting from 0, pass `ignore_index=True`.


In [4]:
df.sort_values(["score"], ignore_index=True)

,name,score,age
0,Charlie,85,25
1,Alice,88,24
2,Diana,90,23
3,Bob,92,22
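
The same renumbering can also be done after the fact with `reset_index(drop=True)`. The following is a minimal sketch that compares the two approaches on the example DataFrame; the variable names `by_flag` and `by_reset` are chosen here for illustration only, and `ignore_index` assumes pandas 1.0 or later.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "Diana"],
    "score": [88, 92, 85, 90],
    "age": [24, 22, 25, 23],
})

# Option 1: let sort_values renumber the rows itself.
by_flag = df.sort_values(["score"], ignore_index=True)

# Option 2: sort first, then discard the old index labels and renumber.
by_reset = df.sort_values(["score"]).reset_index(drop=True)

# Both results should hold the same rows with the same 0..3 index.
print(by_flag.equals(by_reset))
```

Without `drop=True`, `reset_index()` would keep the old labels as an extra column named `index`, which is usually not wanted after a sort.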


## Sorting in Descending Order

To sort from largest to smallest, set `ascending=False`.


In [5]:
df.sort_values(["score"], ascending=False, ignore_index=True)

,name,score,age
0,Bob,92,22
1,Diana,90,23
2,Alice,88,24
3,Charlie,85,25


## Keeping the Sorted DataFrame

By default, `sort_values()` returns a new, sorted DataFrame and does **not** modify the original.

There are two ways to keep the sorted result:
1. Assign the sorted DataFrame to a new variable.
2. Change the original DataFrame by setting the `inplace` parameter of `sort_values` to `True`.


### Assigning the Sorted DataFrame to a New Variable
This is often the preferred method, as it keeps the sorted DataFrame separate from the original.

In [10]:
df

,name,score,age
0,Alice,88,24
1,Bob,92,22
2,Charlie,85,25
3,Diana,90,23


In [6]:
df.sort_values(["score"], ignore_index=True)
print(df)  # Displays the original (unsorted) DataFrame: the sorted result above was not saved.

      name  score  age
0    Alice     88   24
1      Bob     92   22
2  Charlie     85   25
3    Diana     90   23


In [7]:
sorted_df = df.sort_values(["score"])
sorted_df

,name,score,age
2,Charlie,85,25
0,Alice,88,24
3,Diana,90,23
1,Bob,92,22


### Modifying the Original DataFrame with `inplace=True`

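If you want to update the original DataFrame directly, use `inplace=True`. Note: with `inplace=True`, `sort_values()` reorders `df` itself and returns `None`. The previous row order is then no longer stored in `df`; to get it back you need a copy made beforehand, a re-sort by the original index labels, or a reload of the data.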


In [None]:
df.sort_values(["score"], inplace=True)
df
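
A common pitfall with `inplace=True` is assigning the return value, which is `None`. The sketch below illustrates this on the example DataFrame and shows two ways to keep the original order reachable. The names `original`, `result`, and `restored` are illustrative, and the `sort_index()` step assumes the original index labels were kept (that is, `ignore_index=True` was not used).

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "Diana"],
    "score": [88, 92, 85, 90],
    "age": [24, 22, 25, 23],
})

# Keep an independent copy in case the original order is needed later.
original = df.copy()

# With inplace=True the method returns None, so its result should not be assigned.
result = df.sort_values(["score"], inplace=True)

print(result)                      # None
print(df["name"].tolist())         # ['Charlie', 'Alice', 'Diana', 'Bob']
print(original["name"].tolist())   # ['Alice', 'Bob', 'Charlie', 'Diana']

# Because the index labels 0..3 travelled with the rows, sorting by the
# index brings back the original row order.
restored = df.sort_index()
print(restored["name"].tolist())   # ['Alice', 'Bob', 'Charlie', 'Diana']
```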

## Sorting DataFrames by String Columns

Pandas can sort DataFrames by **string (text) columns** in the same way as numeric columns.  
By default, strings are sorted in **lexicographical order**, which is case-sensitive: uppercase letters sort before lowercase letters.


In [11]:
df.sort_values(["name"])

,name,score,age
0,Alice,88,24
1,Bob,92,22
2,Charlie,85,25
3,Diana,90,23
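
The names in the example DataFrame are all capitalized, so case does not affect the result above. The sketch below uses a small hypothetical DataFrame, `people`, with mixed capitalization to show the default order next to a case-insensitive one. It assumes pandas 1.1 or later, where `sort_values` accepts a `key` function that is applied to each sort column.

```python
import pandas as pd

# Hypothetical data with mixed capitalization (not part of the notebook's df).
people = pd.DataFrame({"name": ["bob", "Alice", "diana", "Charlie"]})

# Default order: uppercase letters sort before lowercase letters.
default_order = people.sort_values(["name"])
print(default_order["name"].tolist())    # ['Alice', 'Charlie', 'bob', 'diana']

# Case-insensitive order: compare lowercased values, keep the original text.
caseless_order = people.sort_values(["name"], key=lambda col: col.str.lower())
print(caseless_order["name"].tolist())   # ['Alice', 'bob', 'Charlie', 'diana']
```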


## Practice Exercise 1

1. Sort the DataFrame by `age` in descending order.

In [13]:
df.sort_values(["age"], ascending=False)

,name,score,age
2,Charlie,85,25
0,Alice,88,24
3,Diana,90,23
1,Bob,92,22


## Practice Exercise 2

Try sorting by two columns (e.g., `score` first, then `age`). Hint: the `sort_values` method takes a list of column names.

In [14]:
df.sort_values(["score", "age"])

,name,score,age
2,Charlie,85,25
0,Alice,88,24
3,Diana,90,23
1,Bob,92,22


In [16]:
df.sort_values(["score", "age"], ascending=False)

,name,score,age
1,Bob,92,22
3,Diana,90,23
0,Alice,88,24
2,Charlie,85,25
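
Every score in the example DataFrame is unique, so a second sort column never gets to break a tie. The sketch below uses a hypothetical DataFrame, `ties`, in which scores repeat, to show how the second column orders rows within each score and how `ascending` can take one value per column.

```python
import pandas as pd

# Hypothetical data: two pairs of rows share the same score.
ties = pd.DataFrame({
    "name": ["Eve", "Frank", "Grace", "Heidi"],
    "score": [90, 85, 90, 85],
    "age": [30, 21, 25, 27],
})

# Sort by score (ascending); rows with equal scores are ordered by age.
both_ascending = ties.sort_values(["score", "age"])
print(both_ascending["name"].tolist())   # ['Frank', 'Heidi', 'Grace', 'Eve']

# One direction per column: highest score first, youngest first within a score.
mixed = ties.sort_values(["score", "age"], ascending=[False, True])
print(mixed["name"].tolist())            # ['Grace', 'Eve', 'Frank', 'Heidi']
```

The list passed to `ascending` must have the same length as the list of column names.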
