Kaggle

A list of personal contributions to the Kaggle community.

Herculaneum papyri: the kind of challenge you can tackle on the Kaggle platform.

Note

Kaggle datasets and notebooks are also archived in this repository.

Datasets

Paris flood dataset. A tabular dataset that provides 125+ years of daily maximum water level measurements (1900–February 2026) from hydrometric stations monitoring the Seine River at Paris Austerlitz. The data was collected via the official Hub'Eau API and is formatted as a clean, chronologically ordered CSV file of 53,332 records with minimal gaps and no artificial interpolation.
- Related project: Fluctuat nec mergitur
Fallout: New Vegas, the dataset. A narrative corpus of text entries, mostly dialogue mined from the game, tagged with speaker and other metadata. With 59,415 entries, this dataset enables NLP and game design researchers to unlock their post-apocalyptic storytelling potential and other advanced language modeling perks.
- Related project: Fallout New Vegas papyrus
French rare words lexicon: a CSV of 9,351 uncommon French words annotated with gender, lemma, phonological transcription, frequency index, and dictionary definitions, intended for NLP, lexicography, and language‑learning research.
- Related project: dailyword
Toulouse public libraries loans dataset: a CSV that contains annual circulation statistics and bibliographic metadata for print, music and movie items from the Toulouse public library collection. It is designed for analyzing lending trends, collection management, patron behavior, and media usage.
- Related project: Toulouse biblio chronicle
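The Paris flood dataset above is described as a chronologically ordered daily series with minimal gaps. One way to check such a claim is to list the calendar days that have no record. The sketch below is only an illustration: the column names `date` and `max_water_level_m` and the sample values are made up, and the real CSV may be laid out differently.

```python
import csv
import io
from datetime import date, timedelta

# Made-up sample rows; column names and values are assumptions,
# not taken from the actual dataset.
SAMPLE_CSV = """date,max_water_level_m
2000-01-01,1.0
2000-01-02,1.2
2000-01-04,1.1
2000-01-05,0.9
"""


def find_missing_days(csv_text, date_column="date"):
    """Return the calendar days absent from a chronologically ordered daily series."""
    reader = csv.DictReader(io.StringIO(csv_text))
    days = [date.fromisoformat(row[date_column]) for row in reader]
    missing = []
    for previous, current in zip(days, days[1:]):
        gap = (current - previous).days
        # A gap larger than one day means the days in between have no record.
        missing.extend(previous + timedelta(days=offset) for offset in range(1, gap))
    return missing


missing_days = find_missing_days(SAMPLE_CSV)
print(missing_days)
```

Listing the gaps rather than filling them keeps the series free of interpolated values, which matches the dataset's stated approach.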

Notebooks

Paris flood dataset, the designer notebook: an exploratory data analysis of the Paris flood dataset that relies on creative visual design to produce striking charts. It is a follow-up to the dataswag submission to the Hackaviz competition.
Fallout: New Vegas data exploration notebook: an NLP-tailored playground that explores and transforms raw game text through systematic exploratory data analysis and preprocessing.
This EDA guide walks through an analysis of the Toulouse public library loans dataset. It includes basic statistics, outlier detection, and column‑level analytics for fields like title, author, media type, and audience. It also includes summary reports about popular authors, media types and audience trends plus an interactive Jupyter widget for exploratory visualization.
A hands‑on exploration of the unusual/rare French words dataset presenting a curated set of computational‑linguistics methods for data prep, feature‑level EDA (word form, gender, frequency, definitions), lexical and semantic analyses (affixes, n‑grams), and interactive visualizations. It is intended as a practical toolkit rather than an exhaustive study. The project is designed to be reproducible and extensible, and it invites user experimentation and feedback.
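The affix and n‑gram analyses mentioned for the rare French words notebook can be approximated with a few lines of standard Python. This is a minimal sketch, not the notebook's actual code: the helper names `char_ngrams` and `suffix_counts` are hypothetical, and the word list is a small hand-picked illustration that is not taken from the dataset.

```python
from collections import Counter


def char_ngrams(word, n):
    """Return every run of n consecutive characters in a word."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def suffix_counts(words, length=3):
    """Count how often each word ending of the given length occurs."""
    return Counter(word[-length:] for word in words if len(word) >= length)


# Illustrative words only; the real lexicon is not loaded here.
words = ["abscons", "acrimonie", "parcimonie", "céladon"]

trigrams = char_ngrams("abscons", 3)
endings = suffix_counts(words, length=5)

print(trigrams)
print(endings.most_common(1))
```

Counting fixed-length endings is a crude stand-in for real suffix analysis, but it is often enough to surface recurring word endings before applying a proper morphological tool.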

Competitions

Kaggle Digit Recognizer: Playing with some underused deep learning techniques, ObsCure MNIST ranked 7th (out of 1000+ participants) with 99.94% accuracy.
Predicting road accidents: A chance to learn more about residuals, ensembling and blending models. It was a very close competition; final rank 328/4082.
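For readers unfamiliar with blending, the idea is to combine the predictions of several models with weights tuned on held-out data. The sketch below is a generic illustration, not the approach used in the competition entry: the RMSE metric, the grid search over a single weight, and all sample numbers are assumptions.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def blend(pred_a, pred_b, weight):
    """Weighted average of two prediction lists; weight applies to pred_a."""
    return [weight * a + (1 - weight) * b for a, b in zip(pred_a, pred_b)]


def best_blend_weight(y_true, pred_a, pred_b, steps=20):
    """Grid-search the weight in [0, 1] that minimizes RMSE on held-out data."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda w: rmse(y_true, blend(pred_a, pred_b, w)))


# Made-up hold-out targets and predictions from two hypothetical models.
y_true = [0.2, 0.5, 0.8, 0.4]
pred_a = [0.3, 0.6, 0.9, 0.5]  # residuals are all negative (model overshoots)
pred_b = [0.1, 0.4, 0.7, 0.3]  # residuals are all positive (model undershoots)

weight = best_blend_weight(y_true, pred_a, pred_b)
print(weight, rmse(y_true, blend(pred_a, pred_b, weight)))
```

Blending helps most when the models' residuals point in different directions, as in this toy case where the two errors cancel out.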

Ahh.. xkcd
