<h1 style='background:#FFFFFF; border:0; color:black'><center>How to Analyze Leaderboard Easily?</center></h1>

* In competitions, we frequently check the leaderboard to confirm our rank.
* Leaderboard information becomes even more important in competitions that count towards tiers.

* This notebook introduces a way to get information about the public leaderboard.
* The approach is very simple, so you can apply it to other competitions.

### If you find this useful or interesting, please feel free to upvote!


In [None]:
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Library

* In this notebook, we get the information by reading a JSON file, so we import "json".
* In addition, import some data visualization libraries if you want to visualize the leaderboard data.

In [None]:
import os
import sys
import numpy as np
import pandas as pd
import json
import plotly.express as px
import plotly.graph_objects as go

# Read JSON Files

* You can easily read JSON files with the json.load() function.
* First, we read a JSON file that contains the public leaderboard information for the [Cassava Leaf Disease Classification competition](https://www.kaggle.com/c/cassava-leaf-disease-classification).

In [None]:
!wget 'https://www.kaggle.com/c/cassava-leaf-disease-classification/leaderboard.json?includeBeforeUser=true&includeAfterUser=false' -O cassava_leaderboard.json

* The JSON file that contains the public leaderboard data is located at the path below. "https://www.kaggle.com/c/competition_name/leaderboard.json?includeBeforeUser=true&includeAfterUser=false"
* Replace the "competition_name" part with the name of a specific competition as it appears in the competition's URL.
* (e.g. Cassava Leaf Disease Classification -> cassava-leaf-disease-classification)
* (e.g. Tabular Playground Series - Jan 2021 -> tabular-playground-series-jan-2021)
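
* As a minimal sketch, the URL pattern above can be wrapped in a small helper. The function name `leaderboard_url` is hypothetical and not part of any library; it only fills the competition name into the path shown above.

```python
from urllib.parse import urlencode


def leaderboard_url(competition_name: str) -> str:
    """Build the public leaderboard JSON URL for a competition name (hypothetical helper)."""
    # Same query parameters as in the URL pattern described above.
    query = urlencode({"includeBeforeUser": "true", "includeAfterUser": "false"})
    return f"https://www.kaggle.com/c/{competition_name}/leaderboard.json?{query}"


cassava_url = leaderboard_url("cassava-leaf-disease-classification")
print(cassava_url)
```

* The resulting string can then be passed to wget in the same way as the hard-coded URL.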

### Then, let's load the JSON file!

In [None]:
with open("cassava_leaderboard.json") as f:
    cassava_jsn = json.load(f)

# What kind of information can we get from the data?

* The JSON file we just read contains various information about the public leaderboard.
* As an example, let's look at only the data for the 1st-place team.

In [None]:
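# Use the same keys as the cells below; the first entry is the 1st-place team.
(cassava_jsn['beforeUser']+cassava_jsn['afterUser'])[0]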

* We can see that various pieces of information are listed.
* In addition to the rank and score of each team, you can also check whether the team is in the medal zone, the number of submissions (entries), and so on.

* Since up to 5 members can belong to one team, we can also get information such as each member's profile URL and tier.
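
* To see which fields an entry actually holds, one option is to walk through its keys. The sketch below uses a made-up `sample_entry`; the keys "teamMembers", "displayName" and "tier" are assumptions for illustration only, so check the real output of the cell above for the actual names.

```python
def describe_entry(entry, indent=0):
    """List each key of an entry with the type of its value (hypothetical helper)."""
    lines = []
    pad = " " * indent
    for key, value in entry.items():
        if isinstance(value, list) and value and isinstance(value[0], dict):
            # Nested list of records (e.g. team members): describe the first one.
            lines.append(f"{pad}{key}: list of {len(value)} item(s)")
            lines.extend(describe_entry(value[0], indent + 2))
        else:
            lines.append(f"{pad}{key}: {type(value).__name__}")
    return lines


# Made-up entry; nested key names are assumptions, not the confirmed schema.
sample_entry = {
    "teamName": "Example Team",
    "score": "0.912",
    "medal": "gold",
    "teamMembers": [{"displayName": "alice", "tier": "master"}],
}

entry_lines = describe_entry(sample_entry)
print("\n".join(entry_lines))
```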

# Get only specific information

* How do we retrieve only specific information (e.g. the score of each team)?
* It is very easy to do that!

In [None]:
for user in cassava_jsn['beforeUser']+cassava_jsn['afterUser']:
    if user['medal'] == "gold":
        print(user['score'])
    else:
        break

* The code above prints the scores of the teams in the gold medal zone.
* We can easily retrieve specific information by using a for loop.
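
* The same kind of loop can collect other information. As a small sketch, the code below counts how many teams hold each medal. `sample_entries` is made-up data, and the medal values other than "gold" are assumptions about what the real file contains.

```python
from collections import Counter

# Made-up entries that mimic the 'teamName', 'score' and 'medal' fields used above.
sample_entries = [
    {"teamName": "Team A", "score": "0.912", "medal": "gold"},
    {"teamName": "Team B", "score": "0.910", "medal": "gold"},
    {"teamName": "Team C", "score": "0.905", "medal": "silver"},
    {"teamName": "Team D", "score": "0.900", "medal": "bronze"},
]

# Count the entries per medal value.
medal_counts = Counter(entry["medal"] for entry in sample_entries)
for medal, count in medal_counts.items():
    print(medal, count)
```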

# Application Example: Visualize the Leaderboard

* Visualizing the information with plotting libraries makes the situation easier to understand.

In [None]:
teams = []
scores = []
# The loop assumes entries are ordered by rank, so it stops at the first non-gold team.
for user in cassava_jsn['beforeUser']+cassava_jsn['afterUser']:
    if user['medal'] == "gold":
        # Scores are stored as strings; convert them so they are plotted as numbers.
        scores.append(float(user['score']))
        teams.append(user['teamName'])
    else:
        break

score_df = pd.DataFrame({"Team":teams,"Public Score":scores})
# Reverse the rows so that the 1st-place team appears at the top of the bar chart.
fig = px.bar(score_df.iloc[::-1], x='Public Score', y='Team',
              color='Public Score',height=700, title='Gold Medal Zone (Cassava Leaf Disease Classification)',text='Public Score')
fig.show()

* The graph above shows the names and scores of teams in the gold medal zone.
* The data in "user['score']" is stored as a character string, so to compare scores in a graph you have to convert it with float().
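
* A short illustration of why the conversion matters: strings are compared character by character, not by numeric value. The values in `raw_scores` are made up for the example.

```python
# Made-up score strings, in the same string form as user['score'].
raw_scores = ["9.5", "10.5", "100.0"]

# Sorting the strings directly compares them character by character.
sorted_as_text = sorted(raw_scores)

# Converting with float() first gives the numeric order.
sorted_as_numbers = sorted(float(s) for s in raw_scores)

print(sorted_as_text)     # ['10.5', '100.0', '9.5']
print(sorted_as_numbers)  # [9.5, 10.5, 100.0]
```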

### We can easily do the same thing in other competitions!
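
* Since the same loop is repeated for every competition, it could be moved into one function. This is only a sketch: `gold_zone` is a hypothetical helper name, and `sample_leaderboard` is made-up data with the same 'beforeUser' / 'afterUser' layout used in this notebook.

```python
def gold_zone(leaderboard):
    """Return (team name, score) pairs for the teams in the gold medal zone."""
    rows = []
    for entry in leaderboard["beforeUser"] + leaderboard["afterUser"]:
        if entry["medal"] != "gold":
            # Stop at the first team outside the gold zone (assumes rank order).
            break
        rows.append((entry["teamName"], float(entry["score"])))
    return rows


# Made-up data for illustration only.
sample_leaderboard = {
    "beforeUser": [
        {"teamName": "Team A", "score": "0.912", "medal": "gold"},
        {"teamName": "Team B", "score": "0.910", "medal": "gold"},
    ],
    "afterUser": [
        {"teamName": "Team C", "score": "0.905", "medal": "silver"},
    ],
}

gold_rows = gold_zone(sample_leaderboard)
print(gold_rows)
```

* The returned pairs can be turned into a table with pd.DataFrame(gold_rows, columns=["Team", "Public Score"]) and plotted with px.bar in the same way as before.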

In [None]:
!wget 'https://www.kaggle.com/c/hubmap-kidney-segmentation/leaderboard.json?includeBeforeUser=true&includeAfterUser=false' -O hubmap_leaderboard.json

In [None]:
with open("hubmap_leaderboard.json") as h:
    hubmap_jsn = json.load(h)

In [None]:
hubmap_teams = []
hubmap_scores = []
for user in hubmap_jsn['beforeUser']+hubmap_jsn['afterUser']:
    if user['medal'] == "gold":
        user['score'] = float(user['score'])
        hubmap_scores.append(user['score'])
        hubmap_teams.append(user['teamName'])
    else:
        break

score_df = pd.DataFrame({"Team":hubmap_teams,"Public Score":hubmap_scores})
fig = px.bar(score_df.iloc[::-1], x='Public Score', y='Team',
              color='Public Score',height=700, title='Gold Medal Zone (HuBMAP - Hacking the Kidney)',text='Public Score')
fig.show()

In [None]:
!wget 'https://www.kaggle.com/c/rfcx-species-audio-detection/leaderboard.json?includeBeforeUser=true&includeAfterUser=false' -O rfcx_leaderboard.json

In [None]:
with open("rfcx_leaderboard.json") as c:
    rfcx_jsn = json.load(c)

In [None]:
rfcx_teams = []
rfcx_scores = []
for user in rfcx_jsn['beforeUser']+rfcx_jsn['afterUser']:
    if user['medal'] == "gold":
        user['score'] = float(user['score'])
        rfcx_scores.append(user['score'])
        rfcx_teams.append(user['teamName'])
    else:
        break

score_df = pd.DataFrame({"Team":rfcx_teams,"Public Score":rfcx_scores})
fig = px.bar(score_df.iloc[::-1], x='Public Score', y='Team',
              color='Public Score',height=700, title='Gold Medal Zone (Rainforest Connection Species Audio Detection)',text='Public Score')
fig.show()

In [None]:
!wget 'https://www.kaggle.com/c/jane-street-market-prediction/leaderboard.json?includeBeforeUser=true&includeAfterUser=false' -O jane_leaderboard.json

In [None]:
with open("jane_leaderboard.json") as j:
    jane_jsn = json.load(j)

In [None]:
jane_teams = []
jane_scores = []
for user in jane_jsn['beforeUser']+jane_jsn['afterUser']:
    if user['medal'] == "gold":
        user['score'] = float(user['score'])
        jane_scores.append(user['score'])
        jane_teams.append(user['teamName'])
    else:
        break

score_df = pd.DataFrame({"Team":jane_teams,"Public Score":jane_scores})
fig = px.bar(score_df.iloc[::-1], x='Public Score', y='Team',
              color='Public Score',height=700, title='Gold Medal Zone (Jane Street Market Prediction)',text='Public Score')
fig.show()

# Can we get information on competitions that have already ended?

* So far, we've got information on active competitions, but can we do the same thing on completed competitions?
* The answer is **Yes**!
* As an example, we get information on Mechanisms of Action (MoA) Prediction (ended December 1st 2020).

In [None]:
!wget 'https://www.kaggle.com/c/lish-moa/leaderboard.json?includeBeforeUser=true&includeAfterUser=false' -O moa_leaderboard.json

In [None]:
with open("moa_leaderboard.json") as m:
    moa_jsn = json.load(m)

In [None]:
moa_teams = []
moa_scores = []
for user in moa_jsn['beforeUser']+moa_jsn['afterUser']:
    if user['medal'] == "gold":
        user['score'] = float(user['score'])
        moa_scores.append(user['score'])
        moa_teams.append(user['teamName'])
    else:
        break

score_df = pd.DataFrame({"Team":moa_teams,"Public Score":moa_scores})
fig = px.bar(score_df.iloc[::-1], x='Public Score', y='Team',
              color='Public Score',height=700, title='Gold Medal Zone (Mechanisms of Action (MoA) Prediction)',text='Public Score')
fig.show()

* In this competition, the private leaderboard has already been announced, but only the public leaderboard information can be obtained this way.

## Thank you for reading to the end!
## If you like, feel free to upvote!