
---

🔄 Data Transformation

After cleaning your data, the next step is reshaping, reformatting, and reordering it to prepare for analysis. Pandas offers powerful tools to support this transformation process.

🔢 Sorting & Ranking

Sort by Values
You can sort a DataFrame by one or more columns with sort_values, for example by age or salary, in ascending (the default) or descending order. When several columns are given, rows are ordered by the first column and ties are broken by the next.
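
As a minimal sketch of multi-column sorting with different directions per column, the example below uses a small hand-built frame (a few rows copied from the film data in this notebook, not the full data2.csv). The variable names are illustrative only.

```python
import pandas as pd

# Small hand-built sample; values copied from the film data shown in this notebook.
films = pd.DataFrame({
    "Film": ["Dangal", "Padmaavat", "Andhadhun", "Stree", "War"],
    "Year": [2016, 2018, 2018, 2018, 2019],
    "IMDb": [8.4, 7.0, 8.3, 7.5, 6.5],
})

# Newest films first; within the same year, highest rating first.
newest_best = films.sort_values(["Year", "IMDb"], ascending=[False, False])

# Mixed directions: oldest year first, but best-rated first within each year.
by_year_best = films.sort_values(["Year", "IMDb"], ascending=[True, False])

print(by_year_best["Film"].tolist())
```

Passing a list to ascending gives one direction per sort column, in the same order as the column list.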

Reset Index
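After filtering, sorting, or reshaping, the index labels may no longer run 0, 1, 2, and so on. Resetting the index restores a sequential index. By default the old labels are kept as a new column named "index"; pass drop=True to discard them. reset_index returns a new DataFrame, so assign the result back if you want to keep it.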
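
The difference between keeping and dropping the old labels can be sketched on a three-row sample (hand-built here for illustration; the names kept and dropped are not from the notebook):

```python
import pandas as pd

films = pd.DataFrame({"Film": ["Pathaan", "Dangal", "War"],
                      "Year": [2023, 2016, 2019]})

# Sorting keeps the original labels, so the index is now out of order: 1, 2, 0.
by_year = films.sort_values("Year")

# Old labels are preserved in a new column called "index".
kept = by_year.reset_index()

# Old labels are discarded; only the fresh 0..n-1 index remains.
dropped = by_year.reset_index(drop=True)

# Neither call changed by_year itself: reset_index returns a new DataFrame.
print(by_year.index.tolist(), dropped.index.tolist())
```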

Sort by Index
Sorting rows by index restores a consistent order, which is especially helpful after deletions, merges, or concatenations.
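
A small sketch of the concatenation case, assuming two hand-built pieces whose index labels arrive out of order (part_a and part_b are hypothetical names used only here):

```python
import pandas as pd

# Two fragments of a larger table, each carrying its original row labels.
part_a = pd.DataFrame({"Film": ["Stree", "War"]}, index=[6, 7])
part_b = pd.DataFrame({"Film": ["Pathaan", "Dangal"]}, index=[0, 2])

# concat stacks the pieces in the order given, so labels read 6, 7, 0, 2.
combined = pd.concat([part_a, part_b])

# sort_index reorders the rows by label; ascending=False reverses the order.
ordered = combined.sort_index()
reverse = combined.sort_index(ascending=False)

print(ordered.index.tolist())
```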

Ranking
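Assigning ranks to numeric data helps in scoring or comparative analysis. By default (method='average'), tied values each receive the average of the positions they occupy, which can produce fractional ranks such as 6.5. With method='dense', tied values share the same whole-number rank and the next distinct value gets the next integer, so the sequence has no gaps.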
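
To make the tie handling concrete, here is a sketch on five ratings taken from the film data, two of which are tied at 7.0. The 'min' method is included only for comparison; it is not used elsewhere in this notebook.

```python
import pandas as pd

ratings = pd.Series(
    [8.4, 7.0, 7.5, 7.0, 6.5],
    index=["Dangal", "Padmaavat", "Stree", "Good Newwz", "War"],
)

# Default method="average": the two 7.0 films occupy positions 3 and 4, so both get 3.5.
average = ratings.rank(ascending=False)

# method="dense": ties share a rank and the next value gets the next integer (no gap).
dense = ratings.rank(ascending=False, method="dense")

# method="min": ties share the lowest position; the next rank skips ahead.
minimum = ratings.rank(ascending=False, method="min")

print(pd.DataFrame({"average": average, "dense": dense, "min": minimum}))
```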

🏷 Renaming Columns & Index

Renaming helps make your dataset clearer and more readable. You can rename specific columns or indices or replace all column headers at once to reflect more descriptive names.
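
Both approaches can be sketched on a two-row sample (hand-built; the new names "first", "actor", "film", and "imdb_rating" are arbitrary choices for illustration):

```python
import pandas as pd

films = pd.DataFrame({
    "Actor": ["Aamir Khan", "Hrithik Roshan"],
    "Film": ["Dangal", "War"],
    "IMDb": [8.4, 6.5],
})

# Rename selected labels with a mapping; anything not listed is left unchanged.
renamed = films.rename(columns={"Actor": "Actor Name"}, index={0: "first"})

# Replace every column header at once; the list length must match the column count.
relabelled = films.copy()
relabelled.columns = ["actor", "film", "imdb_rating"]

print(renamed.columns.tolist(), relabelled.columns.tolist())
```

rename returns a new DataFrame unless inplace=True is passed, so films keeps its original headers here.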

📊 Changing Column Order

Rearranging columns allows you to highlight important fields. You can bring certain columns (like 'Name' or 'City') to the front or redefine the entire order for clarity.
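
One way to do this, shown here as a sketch on a hand-built sample, is to select the columns with a list in the order you want. The helper name front is hypothetical.

```python
import pandas as pd

films = pd.DataFrame({
    "Actor": ["Aamir Khan", "Hrithik Roshan"],
    "Film": ["Dangal", "War"],
    "Year": [2016, 2019],
    "IMDb": [8.4, 6.5],
})

# Bring chosen columns to the front and keep the rest in their existing order.
front = ["Film", "IMDb"]
reordered = films[front + [c for c in films.columns if c not in front]]

# Or spell out the complete order explicitly.
explicit = films[["Year", "Film", "Actor", "IMDb"]]

print(reordered.columns.tolist())
```

Note that both sides of the + are lists; adding a bare string to a list raises a TypeError.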

✅ Summary

Use sorting to organize data meaningfully. Apply ranking to understand relative standing. Rename columns or indexes for clarity. Reorder columns to highlight what matters most.

---




In [2]:
import pandas as pd
df = pd.read_csv("data2.csv")
df

Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
0,Shah Rukh Khan,Pathaan,2023,Action,1050,7.2
1,Salman Khan,Tiger Zinda Hai,2017,Action,565,6.0
2,Aamir Khan,Dangal,2016,Biography,2024,8.4
3,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6
4,Ranveer Singh,Padmaavat,2018,Historical,585,7.0
5,Ayushmann Khurrana,Andhadhun,2018,Thriller,111,8.3
6,Rajkummar Rao,Stree,2018,Horror Comedy,180,7.5
7,Hrithik Roshan,War,2019,Action,475,6.5
8,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0
9,Kartik Aaryan,Bhool Bhulaiyaa 2,2022,Horror Comedy,266,5.9


In [3]:
df.sort_values("Year")

Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
2,Aamir Khan,Dangal,2016,Biography,2024,8.4
1,Salman Khan,Tiger Zinda Hai,2017,Action,565,6.0
10,Varun Dhawan,Badrinath Ki Dulhania,2017,Romantic Comedy,201,6.1
4,Ranveer Singh,Padmaavat,2018,Historical,585,7.0
5,Ayushmann Khurrana,Andhadhun,2018,Thriller,111,8.3
6,Rajkummar Rao,Stree,2018,Horror Comedy,180,7.5
7,Hrithik Roshan,War,2019,Action,475,6.5
8,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0
11,Vicky Kaushal,Uri: The Surgical Strike,2019,Action,342,8.2
3,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6


In [5]:
df.sort_values(["Year","IMDb"])


Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
2,Aamir Khan,Dangal,2016,Biography,2024,8.4
1,Salman Khan,Tiger Zinda Hai,2017,Action,565,6.0
10,Varun Dhawan,Badrinath Ki Dulhania,2017,Romantic Comedy,201,6.1
4,Ranveer Singh,Padmaavat,2018,Historical,585,7.0
6,Rajkummar Rao,Stree,2018,Horror Comedy,180,7.5
5,Ayushmann Khurrana,Andhadhun,2018,Thriller,111,8.3
7,Hrithik Roshan,War,2019,Action,475,6.5
8,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0
11,Vicky Kaushal,Uri: The Surgical Strike,2019,Action,342,8.2
3,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6


In [20]:
df2 = df.sort_values(["Year","IMDb"]).copy()
df2

Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
2,Aamir Khan,Dangal,2016,Biography,2024,8.4
1,Salman Khan,Tiger Zinda Hai,2017,Action,565,6.0
10,Varun Dhawan,Badrinath Ki Dulhania,2017,Romantic Comedy,201,6.1
4,Ranveer Singh,Padmaavat,2018,Historical,585,7.0
6,Rajkummar Rao,Stree,2018,Horror Comedy,180,7.5
5,Ayushmann Khurrana,Andhadhun,2018,Thriller,111,8.3
7,Hrithik Roshan,War,2019,Action,475,6.5
8,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0
11,Vicky Kaushal,Uri: The Surgical Strike,2019,Action,342,8.2
3,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6


In [16]:
df2.reset_index()


Unnamed: 0,index,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
0,2,Aamir Khan,Dangal,2016,Biography,2024,8.4
1,1,Salman Khan,Tiger Zinda Hai,2017,Action,565,6.0
2,10,Varun Dhawan,Badrinath Ki Dulhania,2017,Romantic Comedy,201,6.1
3,4,Ranveer Singh,Padmaavat,2018,Historical,585,7.0
4,6,Rajkummar Rao,Stree,2018,Horror Comedy,180,7.5
5,5,Ayushmann Khurrana,Andhadhun,2018,Thriller,111,8.3
6,7,Hrithik Roshan,War,2019,Action,475,6.5
7,8,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0
8,11,Vicky Kaushal,Uri: The Surgical Strike,2019,Action,342,8.2
9,3,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6


In [21]:
df2.reset_index(drop = True)

Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
0,Aamir Khan,Dangal,2016,Biography,2024,8.4
1,Salman Khan,Tiger Zinda Hai,2017,Action,565,6.0
2,Varun Dhawan,Badrinath Ki Dulhania,2017,Romantic Comedy,201,6.1
3,Ranveer Singh,Padmaavat,2018,Historical,585,7.0
4,Rajkummar Rao,Stree,2018,Horror Comedy,180,7.5
5,Ayushmann Khurrana,Andhadhun,2018,Thriller,111,8.3
6,Hrithik Roshan,War,2019,Action,475,6.5
7,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0
8,Vicky Kaushal,Uri: The Surgical Strike,2019,Action,342,8.2
9,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6


In [22]:
df2

Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
2,Aamir Khan,Dangal,2016,Biography,2024,8.4
1,Salman Khan,Tiger Zinda Hai,2017,Action,565,6.0
10,Varun Dhawan,Badrinath Ki Dulhania,2017,Romantic Comedy,201,6.1
4,Ranveer Singh,Padmaavat,2018,Historical,585,7.0
6,Rajkummar Rao,Stree,2018,Horror Comedy,180,7.5
5,Ayushmann Khurrana,Andhadhun,2018,Thriller,111,8.3
7,Hrithik Roshan,War,2019,Action,475,6.5
8,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0
11,Vicky Kaushal,Uri: The Surgical Strike,2019,Action,342,8.2
3,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6


In [40]:
# reset_index returns a new DataFrame, so assign the result back to keep it
df2 = df2.reset_index(drop = True)
df2

Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
0,Aamir Khan,Dangal,2016,Biography,2024,8.4
1,Salman Khan,Tiger Zinda Hai,2017,Action,565,6.0
2,Varun Dhawan,Badrinath Ki Dulhania,2017,Romantic Comedy,201,6.1
3,Ranveer Singh,Padmaavat,2018,Historical,585,7.0
4,Rajkummar Rao,Stree,2018,Horror Comedy,180,7.5
5,Ayushmann Khurrana,Andhadhun,2018,Thriller,111,8.3
6,Hrithik Roshan,War,2019,Action,475,6.5
7,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0
8,Vicky Kaushal,Uri: The Surgical Strike,2019,Action,342,8.2
9,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6


In [41]:
df.sort_index()

Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
0,Shah Rukh Khan,Pathaan,2023,Action,1050,7.2
1,Salman Khan,Tiger Zinda Hai,2017,Action,565,6.0
2,Aamir Khan,Dangal,2016,Biography,2024,8.4
3,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6
4,Ranveer Singh,Padmaavat,2018,Historical,585,7.0
5,Ayushmann Khurrana,Andhadhun,2018,Thriller,111,8.3
6,Rajkummar Rao,Stree,2018,Horror Comedy,180,7.5
7,Hrithik Roshan,War,2019,Action,475,6.5
8,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0
9,Kartik Aaryan,Bhool Bhulaiyaa 2,2022,Horror Comedy,266,5.9


In [42]:
df2["Rank"]=df2["IMDb"].rank(ascending = False,method = "dense")
df2


Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb,Rank
0,Aamir Khan,Dangal,2016,Biography,2024,8.4,1.0
1,Salman Khan,Tiger Zinda Hai,2017,Action,565,6.0,9.0
2,Varun Dhawan,Badrinath Ki Dulhania,2017,Romantic Comedy,201,6.1,8.0
3,Ranveer Singh,Padmaavat,2018,Historical,585,7.0,6.0
4,Rajkummar Rao,Stree,2018,Horror Comedy,180,7.5,4.0
5,Ayushmann Khurrana,Andhadhun,2018,Thriller,111,8.3,2.0
6,Hrithik Roshan,War,2019,Action,475,6.5,7.0
7,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0,6.0
8,Vicky Kaushal,Uri: The Surgical Strike,2019,Action,342,8.2,3.0
9,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6,11.0


In [46]:
df2.rename(columns ={"Actor":"Actor Name"},inplace = True)
df2

Unnamed: 0,Actor Name,Film,Year,Genre,BoxOffice(INR Crore),IMDb,Rank
0,Aamir Khan,Dangal,2016,Biography,2024,8.4,1.0
1,Salman Khan,Tiger Zinda Hai,2017,Action,565,6.0,9.0
2,Varun Dhawan,Badrinath Ki Dulhania,2017,Romantic Comedy,201,6.1,8.0
3,Ranveer Singh,Padmaavat,2018,Historical,585,7.0,6.0
4,Rajkummar Rao,Stree,2018,Horror Comedy,180,7.5,4.0
5,Ayushmann Khurrana,Andhadhun,2018,Thriller,111,8.3,2.0
6,Hrithik Roshan,War,2019,Action,475,6.5,7.0
7,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0,6.0
8,Vicky Kaushal,Uri: The Surgical Strike,2019,Action,342,8.2,3.0
9,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6,11.0


In [51]:
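# The column to move must be wrapped in a list: "Name" + [...] mixes str and list
# and raises TypeError. df has no "Name" column, so "Film" is brought forward here.
cols = ["Film"] + [c for c in df.columns if c != "Film"]
df[cols]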