Mini Datathon

This datathon platform is fully developed in Python using Streamlit, in only 115 lines of code!

As the name suggests, it is designed for small datathons, and the scripts are easy to understand.

Installation

Clone the repo onto your server.

git clone mini_datathon; cd mini_datathon

Usage

You need 5 simple steps to set up your mini hackathon:

1. Modify the password of the admin user in users.csv.
2. Add the participants to users.csv.
3. Modify the load_target and evaluate functions in main.py according to your needs (see Example).
4. Edit templates.py to change the content of the different pages (Markdown format).
5. Run the command streamlit run main.py.

Please do not forget to notify the participants that the submission file needs to be a CSV whose rows are in the same order as the test set they were given, and that it must contain a column named predictions.
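The submission requirements above can be checked before scoring. The following is a minimal sketch, not the code used in main.py: the function name read_predictions and the expected_rows parameter are hypothetical, and it uses only the standard library. It cannot verify row order, which remains the participant's responsibility.

```python
import csv
import io


def read_predictions(csv_text, expected_rows):
    """Return the 'predictions' column as floats, or raise ValueError."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # The header row must contain a column named 'predictions'.
    if reader.fieldnames is None or "predictions" not in reader.fieldnames:
        raise ValueError("submission must contain a 'predictions' column")
    try:
        predictions = [float(row["predictions"]) for row in reader]
    except (TypeError, ValueError) as exc:
        raise ValueError("every prediction must be a number") from exc
    # One prediction per test row is required.
    if len(predictions) != expected_rows:
        raise ValueError(
            f"expected {expected_rows} rows, got {len(predictions)}"
        )
    return predictions
```

A call such as read_predictions(uploaded_text, expected_rows=len(y_test)) would then either return the list of predictions or raise a ValueError whose message can be shown to the participant.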

Example

An example version of the code is deployed on Heroku here: web app

In the current version, the step #3 functions are implemented for the UCI SECOM imbalanced dataset (binary classification), and submissions are evaluated by the PR-AUC score:

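import pandas as pd
import streamlit as st
from sklearn.metrics import average_precision_score

# train_test_sampling is not shown here; it is assumed to be a dict of
# keyword arguments for DataFrame.sample that selects the training rows.
@st.cache
def load_target():
    labels = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/secom/secom_labels.data',
                         header=None, sep=' ', names=['target', 'time'])
    y_train = labels.sample(**train_test_sampling)
    # The test target is every row that was not sampled for training.
    y_test = labels.loc[~labels.index.isin(y_train.index), 'target']
    return y_test

def evaluate(y_true, y_pred):
    # PR-AUC, computed as the average precision score.
    return average_precision_score(y_true, y_pred, average='micro')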
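To adapt step 3 to another task, only the body of evaluate has to change. As a hedged illustration for a regression datathon, the sketch below returns the negative root mean squared error. The negation is an assumption made here: because the leaderboard keeps each user's maximum score, a lower-is-better metric has to be flipped so that higher still means better. It is written in pure Python to stay self-contained.

```python
import math


def evaluate(y_true, y_pred):
    """Negative RMSE, so that a higher score is still a better score."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return -math.sqrt(sum(squared_errors) / len(squared_errors))
```

A perfect submission scores 0, and every other submission scores below 0.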

Behind the scenes

Databases

The platform only needs to persist 2 components:

The leaderboard

The leaderboard is a CSV file that is updated every time a user submits predictions. The file contains 2 columns:

id: the login of the user
score: the maximum score of the user

There is only 1 row per user, since only the maximum score is saved.

By default, a benchmark score is pushed to the leaderboard:

id	score
benchmark	0.6

For more details, please refer to the script leaderboard.py.
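The update rule described in this section (one row per user, holding the maximum score) can be sketched as follows. This is an illustration under assumptions, not the contents of leaderboard.py: the function name update_leaderboard is hypothetical, and the file is assumed to be comma-separated with an id,score header.

```python
import csv
import os


def update_leaderboard(path, user_id, score):
    """Keep one row per user, holding that user's best score so far."""
    scores = {}
    if os.path.exists(path):
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                scores[row["id"]] = float(row["score"])
    # Only replace the stored score when the new one is higher.
    scores[user_id] = max(score, scores.get(user_id, float("-inf")))
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "score"])
        # Write the highest score first, as a leaderboard is usually shown.
        ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
        for uid, best in ranked:
            writer.writerow([uid, best])
    return scores
```

Calling update_leaderboard("leaderboard.csv", "benchmark", 0.6) on a missing file would create it with the default benchmark row.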

The users

Like the leaderboard, the user list is a CSV file. It is defined by the admin of the competition and contains 2 columns:

login
password

A default user is provided so that you can start playing with the platform:

login	password
admin	password

To add new participants, simply add rows to the existing users.csv file.

For more details, please refer to the script users.py.
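A login check against such a file can be sketched as below. This is a hypothetical helper, not the contents of users.py: the name check_credentials is an assumption, and so is a comma-separated file with a login,password header holding plain-text passwords as in the default shown above.

```python
import csv
import hmac


def check_credentials(path, login, password):
    """Return True if (login, password) matches a row of the users file."""
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row["login"] == login:
                # compare_digest avoids leaking information through timing.
                return hmac.compare_digest(
                    row["password"].encode(), password.encode()
                )
    # Unknown login.
    return False
```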

Next steps

support private and public leaderboards, as on kaggle.com
store encrypted passwords in users.csv
allow login via OAuth
define user permissions
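As one possible direction for the password item above, the sketch below stores a salted PBKDF2 hash instead of the plain-text password. Note that this is hashing rather than reversible encryption, which is the usual choice for stored passwords. The function names, the salt$digest storage format, and the iteration count are all assumptions for illustration and are not part of the current code.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=200_000):
    """Return 'salt_hex$digest_hex', suitable for a single CSV cell."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"{salt.hex()}${digest.hex()}"


def verify_password(password, stored, iterations=200_000):
    """Recompute the hash with the stored salt and compare digests."""
    salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), iterations
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)
```

With this scheme, the password column of users.csv would hold the output of hash_password, and the login check would call verify_password instead of comparing strings directly.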

License

This project is released under the MIT License (see the LICENSE file).

Credits

We could not find an easy implementation for our yearly internal hackathon at Intel. The idea originally came from my dear DevOps coworker Elhay Efrat, and I took the responsibility of developing it.

This version is not the one used at Intel.

If you like this project, let me know by buying me a coffee :)
