Pandas Sort

Sorting is a fundamental operation in data manipulation and analysis that involves arranging data in a specific order.

Sorting is crucial for tasks such as organizing data for better readability, identifying patterns, making comparisons, and facilitating further analysis.

Sort DataFrame in Pandas
In Pandas, we can use the sort_values() method to sort a DataFrame by the values in one of its columns. For example,

In [2]:
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [28, 22, 25]}
df = pd.DataFrame(data)

# sort DataFrame by Age in ascending order
sorted_df = df.sort_values(by='Age', ascending=True)

print(sorted_df.to_string(index=False))

   Name  Age
    Bob   22
Charlie   25
  Alice   28
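
The example above sorts in ascending order. As a minimal sketch of the reverse case, the snippet below passes ascending=False to sort from oldest to youngest. It also uses ignore_index=True, which assumes pandas 1.0 or later, to relabel the sorted rows from 0. The names desc_df and reset_df are chosen only for this illustration.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [28, 22, 25]}
df = pd.DataFrame(data)

# sort DataFrame by Age in descending order
desc_df = df.sort_values(by='Age', ascending=False)

# same sort, but relabel the rows 0, 1, 2 (assumes pandas >= 1.0)
reset_df = df.sort_values(by='Age', ascending=False, ignore_index=True)

print(desc_df['Name'].tolist())   # ['Alice', 'Charlie', 'Bob']
print(desc_df.index.tolist())     # [0, 2, 1] (original labels travel with the rows)
print(reset_df.index.tolist())    # [0, 1, 2]

# sort_values() returns a new DataFrame, so df keeps its original order
print(df['Name'].tolist())        # ['Alice', 'Bob', 'Charlie']
```

Because sort_values() returns a new DataFrame by default, the result has to be assigned to a variable, as is done here, for the sorted order to be kept.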


Sort Pandas DataFrame by Multiple Columns
We can also sort a DataFrame by multiple columns in Pandas. The rows are sorted by the first column listed, and each subsequent column is used only to break ties among rows that have equal values in the earlier columns.

To sort by multiple columns, we pass the column names as a list to the by parameter of the sort_values() method. To use a different sort direction for each column, we pass a list of booleans of the same length to the ascending parameter. Here's how we do it.

In [3]:
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 22, 30, 22],
        'Score': [85, 90, 75, 80]}

df = pd.DataFrame(data)

# 1. Sort DataFrame by 'Age' and then by 'Score' (Both in ascending order)
df1 = df.sort_values(by=['Age', 'Score'])

print("Sorting by 'Age' (ascending) and then by 'Score' (ascending):\n")
print(df1.to_string(index=False))

print()
# 2. Sort DataFrame by 'Age' in ascending order, and then by 'Score' in descending order
df2 = df.sort_values(by=['Age', 'Score'], ascending=[True, False])

print("Sorting by 'Age' (ascending) and then by 'Score' (descending):\n")
print(df2.to_string(index=False))

Sorting by 'Age' (ascending) and then by 'Score' (ascending):

   Name  Age  Score
  David   22     80
    Bob   22     90
  Alice   25     85
Charlie   30     75

Sorting by 'Age' (ascending) and then by 'Score' (descending):

   Name  Age  Score
    Bob   22     90
  David   22     80
  Alice   25     85
Charlie   30     75
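
To illustrate how the order of the columns in the by list sets the sorting priority, the sketch below sorts the same data twice with the two columns swapped. The variable names by_age_first and by_score_first are chosen only for this illustration.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 22, 30, 22],
        'Score': [85, 90, 75, 80]}
df = pd.DataFrame(data)

# 'Age' has priority; 'Score' only breaks the tie between Bob and David
by_age_first = df.sort_values(by=['Age', 'Score'])

# 'Score' has priority; every score is unique, so 'Age' is never consulted
by_score_first = df.sort_values(by=['Score', 'Age'])

print(by_age_first['Name'].tolist())    # ['David', 'Bob', 'Alice', 'Charlie']
print(by_score_first['Name'].tolist())  # ['Charlie', 'David', 'Alice', 'Bob']
```

Swapping the two column names changes the result because the later column matters only where the earlier one has duplicate values.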


Sort Pandas DataFrame Using sort_index()
We can also sort a DataFrame by its index in Pandas using the sort_index() method.

The sort_index() method sorts a DataFrame or Series by its index labels rather than by its values. This is useful for organizing data in a logical order, keeping the row order consistent, and speeding up label-based lookups and slicing, which can be faster on a sorted index.

In [4]:
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [28, 22, 25]}
# create a DataFrame with a non-sequential index
df = pd.DataFrame(data, index=[2, 0, 1])

print("Original DataFrame:")
print(df.to_string(index=True))
print("\n")

# sort DataFrame by index in ascending order
sorted_df = df.sort_index()

print("Sorted DataFrame by index:")
print(sorted_df.to_string(index=True))

Original DataFrame:
      Name  Age
2    Alice   28
0      Bob   22
1  Charlie   25


Sorted DataFrame by index:
      Name  Age
0      Bob   22
1  Charlie   25
2    Alice   28
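
Since sort_index() works on both DataFrames and Series and accepts an ascending parameter, a short sketch of those variations may help. It reuses the data from the example above; the names desc_df and ages are chosen only for this illustration, and is_monotonic_increasing is used here simply as one way to check whether an index is already sorted.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [28, 22, 25]}
df = pd.DataFrame(data, index=[2, 0, 1])

# sort DataFrame by index in descending order
desc_df = df.sort_index(ascending=False)
print(desc_df.index.tolist())    # [2, 1, 0]
print(desc_df['Name'].tolist())  # ['Alice', 'Charlie', 'Bob']

# sort_index() also works on a Series, such as a single column
ages = df['Age'].sort_index()
print(ages.tolist())             # [22, 25, 28]

# check whether an index is in ascending order
print(df.index.is_monotonic_increasing)               # False
print(df.sort_index().index.is_monotonic_increasing)  # True
```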
