# Public LB Progression Over Time

Here is a way to show that at least you were up there in the gold/silver/bronze zones (or top 50%, or at least higher than you ended up) at some point in time…

It turns out to be very easy to reconstruct the public leaderboard at any point in time using a few lines of Pandas. For newbies/those who haven’t noticed, all submissions that improve on the public LB are available [here](https://www.kaggle.com/c/8540/publicleaderboarddata.zip), linked from the [leaderboard page](https://www.kaggle.com/c/talkingdata-adtracking-fraud-detection/leaderboard) - every row in that CSV creates a new public LB ordering. (The file is also available via @inversion's [data set](https://www.kaggle.com/inversion/talkingdata-leaderboard-data).) If you select the rows up to a time cutoff, group by team, take the max submission score for each team (higher is better for this competition's AUC metric), and rank the result, you end up with a snapshot of the public LB as of that cutoff. I’ve put a kernel here to make it easy…
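To make the cutoff / group / max / rank recipe concrete, here is a minimal sketch on a toy table. The column names match the leaderboard CSV, but the rows are invented, `snapshot` is a hypothetical helper name, and `method='min'` is my own choice for handling ties (the real leaderboard may break ties differently).

```python
import pandas as pd

# Toy stand-in for the leaderboard CSV: same column names, invented rows.
subs = pd.DataFrame({
    'TeamName': ['A', 'B', 'A', 'C', 'B'],
    'SubmissionDate': pd.to_datetime(
        ['2018-03-10', '2018-03-11', '2018-03-15', '2018-03-20', '2018-04-01']),
    'Score': [0.950, 0.960, 0.970, 0.965, 0.980],
})

def snapshot(subs, cutoff):
    """Public LB ranks as of `cutoff` (1 = best; assumes higher score is better)."""
    seen = subs.loc[subs.SubmissionDate <= pd.to_datetime(cutoff)]
    best = seen.groupby('TeamName').Score.max()  # each team's best score so far
    return best.rank(ascending=False, method='min').sort_values()

early = snapshot(subs, '2018-03-31')  # before B's late improvement
late = snapshot(subs, '2018-04-30')   # after it
print(early)
print(late)
```

In the toy data, team B sits last at the end of March and jumps to first after its April submission, which is exactly the kind of movement the snapshots are meant to expose.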

You can fork this Notebook and change the last cell, or download it and run it locally.

I’ve only demoed my own team here, and I think it would be nicer if people stuck to sharing their own journey, rather than using this to point out other people who mysteriously join & leap into the medals in the dying moments >:S

```python
%matplotlib inline
import numpy as np
import pandas as pd
```

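```python
# Load every submission that improved a team's public LB score
s = pd.read_csv('../input/talkingdata-adtracking-fraud-detection-publicleaderboard.csv', parse_dates=['SubmissionDate'])
cut = pd.to_datetime('2018-05-08')
s = s.loc[s.SubmissionDate <= cut].copy()  # omit post-competition subs; copy so adding a column is safe
s['doy'] = s.SubmissionDate.dt.dayofyear  # day of year; assumes all subs fall within one calendar year
doy2date = s.groupby('doy').SubmissionDate.max().dt.date.to_dict()  # map each day of year back to a date
days = np.unique(s.doy.values)  # sorted days with at least one submission
```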

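```python
def leaderboard(doy):
    # Best score per team up to and including this day, ranked with 1 = best.
    # Tied scores share an averaged rank (the pandas default).
    return s.loc[s.doy <= doy].groupby('TeamName').Score.max().rank(ascending=False)

def leaderboard_rank(doy, team):
    # NaN if the team has not appeared on the LB yet
    return leaderboard(doy).get(team, np.nan)

def chart_public_lb(team):
    # One rank per day that had submissions, keyed by calendar date
    ser = pd.Series({doy2date[doy]: leaderboard_rank(doy, team) for doy in days})
    p = ser.plot(figsize=(12, 5))
    p.set_title(team + ' - TalkingData AdTracking Fraud Detection Challenge - Public LB Rank')
    p.invert_yaxis()  # better (lower) ranks towards the top
    p.grid(True)
    p.set_ylim(top=0)
    return p
```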

```python
chart_public_lb('James Trotman')  # change this to your team name
```
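If you would rather quote a single number than read it off the chart, the same daily snapshots can be scanned for a team's best position. This is only a rough sketch on invented toy data: `best_rank` is a hypothetical helper, not part of the notebook, and `method='min'` is again my assumption about how to treat ties.

```python
import pandas as pd

# Toy stand-in for the leaderboard CSV: same column names, invented rows.
subs = pd.DataFrame({
    'TeamName': ['A', 'B', 'A', 'C', 'B'],
    'SubmissionDate': pd.to_datetime(
        ['2018-03-10', '2018-03-11', '2018-03-15', '2018-03-20', '2018-04-01']),
    'Score': [0.950, 0.960, 0.970, 0.965, 0.980],
})

def best_rank(subs, team):
    """Return (day, rank) of the team's best daily public LB rank, or None if never ranked."""
    subs = subs.assign(day=subs.SubmissionDate.dt.normalize())
    ranks = {}
    for day in subs.day.drop_duplicates().sort_values():
        # Leaderboard snapshot at the end of this day
        board = (subs.loc[subs.day <= day]
                     .groupby('TeamName').Score.max()
                     .rank(ascending=False, method='min'))
        if team in board.index:
            ranks[day] = board[team]
    if not ranks:
        return None
    ser = pd.Series(ranks)
    return ser.idxmin(), ser.min()  # earliest day on which the best rank was held

peak = best_rank(subs, 'C')
print(peak)
```

On the real data you would pass the filtered submissions frame `s` and your own team name instead of the toy table.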