# Pandas tip #12: Ranking the stars in Pandas
A Pandas method that is sometimes forgotten is .rank(). As the name suggests, it ranks the data, by default starting from the lowest value (rank 1) up to the highest.

When we rank top artists by votes, the best artist is the one with the most votes. By default, however, the artist with the fewest votes would get rank 1, so we need to pass ascending=False to invert the order.
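
As a small illustration of the effect of ascending=False, here is a minimal sketch. The vote counts are made up for this example and are not part of the data generated later in this tip.

```python
import pandas as pd

# Hypothetical vote counts, invented purely for illustration.
votes = pd.Series(
    [250, 900, 480],
    index=['Rick Astley', 'Bananarama', 'David Hasselhof'],
)

default_rank = votes.rank()                    # fewest votes gets rank 1
descending_rank = votes.rank(ascending=False)  # most votes gets rank 1

print(pd.DataFrame({
    'votes': votes,
    'default': default_rank,
    'descending': descending_rank,
}))
```

With the default settings Bananarama (the most votes) ends up last, while ascending=False puts them in first place.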

When there are ties, i.e. two artists have an identical number of votes, .rank() averages their ranks. If two artists have identical votes and occupy places 2 and 3 in the ranking, both artists get rank (2 + 3) / 2 = 2.5. To change this behaviour, we can replace the default method='average' with, for example, 'min' or 'dense'. Both methods give the tied artists the minimum rank (2), but with 'dense' the next artist (if there is one) gets rank 3 instead of 4.
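
The tie described above can be reproduced with a short sketch. The four artist labels and their vote counts are hypothetical and chosen so that two artists share places 2 and 3.

```python
import pandas as pd

# Hypothetical votes: artists B and C are tied for places 2 and 3.
votes = pd.Series([900, 700, 700, 400], index=['A', 'B', 'C', 'D'])

average = votes.rank(ascending=False)                # default method='average'
minimum = votes.rank(ascending=False, method='min')
dense = votes.rank(ascending=False, method='dense')

print(pd.DataFrame({
    'votes': votes,
    'average': average,
    'min': minimum,
    'dense': dense,
}))
```

The tied artists get 2.5 with 'average' and 2 with both 'min' and 'dense'. The last artist gets rank 4 with 'average' and 'min', but rank 3 with 'dense'.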

The example shows one of many possibilities: how to order artists by how often they are ranked 1st, 2nd, and 3rd over the years.

Quite a bit is possible with .rank(), and it is yet another tool to put in your Pandas toolbox!

Let's generate some random data:

In [None]:
import numpy as np
import pandas as pd
from itertools import product

rng = np.random.default_rng(42)
artists = ['Rick Astley', 'Bananarama', 'David Hasselhof']
years = range(2010, 2021)

df = pd.DataFrame([{
    'name': row[1],
    'year': row[0],
    'votes': rng.integers(100, 1000)
} for row in product(years, artists)])

Let's first rank the artists per year:

In [None]:
df['rank'] = (df
    .groupby('year')['votes']
    .rank(ascending=False, method='dense')
    .astype(int)
)

To find the rank each artist held most often, and in how many years:

In [None]:
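(df
    .groupby(['name', 'rank'])['year']
    .count()                 # number of years each artist held each rank
    .sort_values()           # ascending, so the largest count per artist comes last
    .groupby(level=0)        # regroup by artist name
    .tail(1)                 # keep each artist's most frequent rank
    .sort_index(level=1)     # order the result by rank
)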
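
Another of the many possibilities is a full table of counts per artist and rank. The sketch below uses pd.crosstab on a small hand-made table of yearly ranks. The data is hypothetical and stands in for the ranked df above, so that the example can run on its own.

```python
import pandas as pd

# Small hand-made table of yearly ranks (hypothetical data).
ranked = pd.DataFrame({
    'name': ['Rick Astley', 'Bananarama', 'David Hasselhof'] * 3,
    'year': [2010] * 3 + [2011] * 3 + [2012] * 3,
    'rank': [1, 2, 3, 1, 3, 2, 2, 1, 3],
})

# Rows are artists, columns are ranks, cells count the number of years.
counts = pd.crosstab(ranked['name'], ranked['rank'])
print(counts)
```

Each row shows how many years an artist finished 1st, 2nd, and 3rd, which makes it easy to compare artists side by side.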

The difference between the methods:

In [None]:
numbers = [1, 2, 2, 2, 3, 4, 4, 5]
pd.DataFrame({
    'numbers': numbers,
    'base': pd.Series(numbers).rank(),  # default method='average'
    'min': pd.Series(numbers).rank(method='min'),
    'dense': pd.Series(numbers).rank(method='dense'),
})

If you have any questions, comments, or requests, feel free to [contact me on LinkedIn](https://linkedin.com/in/dennisbakhuis).