upura/scipy-japan-2020-kaggle-tutorial

Quick Start Guide of Kaggle: Machine Learning Competitions with Python

This repository contains the materials and source code for the tutorial “Quick Start Guide of Kaggle: Machine Learning Competitions with Python” (Pythonで機械学習コンペティション「Kaggle」をはじめよう), given at SciPy Japan 2020 on October 30.

Abstract

On Kaggle, a high-profile machine learning competition platform, data scientists from all over the world use Python to build machine learning models. In this hands-on tutorial, you'll learn the basics of machine learning and Kaggle by running notebook-style source code. The objective is to help participants learn how to compete on Kaggle, and learn from it, using Python. The speaker has won first place in a Kaggle competition, hosted a Kaggle Days Tokyo competition, and published a technical book for beginners.

Table of Contents

09:00-09:35 Introduction

What are machine learning and Kaggle?

09:35-09:40 Short break

09:40-10:45 Practice 1: From participation to submission

  1. Participation in a competition
  2. How to use the Python environment on Kaggle
  3. Loading packages
  4. Loading datasets
  5. Feature engineering
  6. Training and prediction of machine learning algorithms
  7. Submission to the leaderboard
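
Steps 3 to 7 above can be sketched roughly as follows. This is a minimal illustration and not the tutorial's notebook code: the tiny in-memory table, its column names (`id`, `age`, `sex`, `target`), the `make_features` helper, the choice of logistic regression, and the `submission.csv` file name are all assumptions made for the example. In a real competition you would typically read the provided files with `pd.read_csv` and follow the submission format that the competition specifies.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Steps 3-4: load packages and datasets.
# A tiny made-up table stands in for the competition's train/test files.
train = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "age": [22.0, 38.0, np.nan, 35.0, 54.0, 2.0],
    "sex": ["male", "female", "female", "female", "male", "male"],
    "target": [0, 1, 1, 1, 0, 0],
})
test = pd.DataFrame({
    "id": [7, 8],
    "age": [27.0, np.nan],
    "sex": ["female", "male"],
})


def make_features(df: pd.DataFrame, age_fill: float) -> pd.DataFrame:
    """Step 5: encode the categorical column and fill missing ages."""
    features = pd.DataFrame(index=df.index)
    features["sex"] = df["sex"].map({"male": 0, "female": 1})
    features["age"] = df["age"].fillna(age_fill)
    return features


# Compute the fill value on the training data only, then reuse it for test.
age_median = train["age"].median()
X_train = make_features(train, age_median)
y_train = train["target"]
X_test = make_features(test, age_median)

# Step 6: train a model and predict on the test set.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Step 7: write the predictions in a two-column submission file.
submission = pd.DataFrame({"id": test["id"], "target": pred})
submission.to_csv("submission.csv", index=False)
```

The resulting CSV is the kind of file you would upload from the competition page; the exact column names and row count must match the sample submission that the competition provides.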

10:45-11:00 Long break

11:00-11:45 Practice 2: How to boost your score

  1. Exploratory data analysis
  2. Adding hypothesis-based features
  3. Switching machine learning algorithms
  4. Hyperparameter tuning
  5. The importance of validation
  6. Ensembling
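
As a rough sketch of items 3, 5, and 6 above, the following compares two algorithms with cross-validation and then averages their predicted probabilities as a simple ensemble. The synthetic dataset from `make_classification`, the two models, the five-fold split, and the 0.5 threshold are assumptions chosen for illustration; the tutorial notebooks may use different data, models, and settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for a competition training set (an assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Switching algorithms: keep candidates in a dict so they share one loop.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Validation: every score below comes from data the model did not train on.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {name: [] for name in models}
scores["ensemble"] = []

for train_idx, valid_idx in cv.split(X, y):
    X_tr, X_va = X[train_idx], X[valid_idx]
    y_tr, y_va = y[train_idx], y[valid_idx]

    fold_probas = []
    for name, model in models.items():
        model.fit(X_tr, y_tr)  # refitting discards the previous fold's fit
        proba = model.predict_proba(X_va)[:, 1]
        fold_probas.append(proba)
        scores[name].append(accuracy_score(y_va, (proba >= 0.5).astype(int)))

    # Ensembling: average the predicted probabilities of all models.
    mean_proba = sum(fold_probas) / len(fold_probas)
    scores["ensemble"].append(
        accuracy_score(y_va, (mean_proba >= 0.5).astype(int))
    )

mean_scores = {name: sum(vals) / len(vals) for name, vals in scores.items()}
for name, score in mean_scores.items():
    print(f"{name}: {score:.3f}")
```

Comparing the mean validation scores gives a local estimate of which change is likely to help before spending a leaderboard submission on it. An averaged ensemble is not guaranteed to beat its best member, which is one reason to check it against the same validation folds.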

11:45-11:50 Short break

11:50-12:30 Conclusion

Wrap up & future resources

Notebook URLs

  1. Start: https://www.kaggle.com/sishihara/python-kaggle-start-book-ch02-01
  2. Benchmark: https://www.kaggle.com/sishihara/python-kaggle-start-book-ch02-02
  3. Exploratory data analysis: https://www.kaggle.com/sishihara/python-kaggle-start-book-ch02-03
  4. Adding hypothesis-based features: https://www.kaggle.com/sishihara/python-kaggle-start-book-ch02-04
  5. Switching machine learning algorithms: https://www.kaggle.com/sishihara/python-kaggle-start-book-ch02-05
  6. Hyperparameter tuning: https://www.kaggle.com/sishihara/python-kaggle-start-book-ch02-06
  7. The importance of validation: https://www.kaggle.com/sishihara/python-kaggle-start-book-ch02-07
  8. Ensembling: https://www.kaggle.com/sishihara/python-kaggle-start-book-ch02-08

All notebooks are taken from: https://github.com/upura/python-kaggle-start-book

Slides

You can see the slides here. To click the URLs in the slides, please download the PDF.

Prerequisites

Log in to the Kaggle website. If you don’t have an account, please create one. Choose your Kaggle ID carefully: unlike your display name, it can’t be changed after registration. For example, my Kaggle ID is ‘sishihara’, which can’t be changed, and my display name is ‘u++’, which can be changed on the profile page.

About

https://www.scipyjapan.scipy.org/
