Kaggle has run several educational 5-Day Challenges on different topics. Each challenge consists of five short exercises designed to give you hands-on practice with a different data science technique. This notebook collects links to the exercises for each challenge so you can work through them at your own pace. 

___
## [5-Day Data Challenge](https://www.kaggle.com/rtatman/the-5-day-data-challenge)

* **Topic:** Getting started with data science
* **Level:** Beginner
* **Language:** Python and R
* **Daily tasks:**
    * Day 1: Reading data into a kernel
    * Day 2: Plot a Numeric Variable with a Histogram
    * Day 3: Perform a t-test
    * Day 4: Visualize categorical data with a bar chart
    * Day 5: Using a Chi-Square Test

New to data science? Need a quick refresher? This five-day challenge will give you the guidance and support you need to kick-start your data science journey.

By the time you finish this challenge, you will be able to:

* Read in and summarize data
* Visualize both numeric and categorical data
* Decide when and how to use two foundational statistical tests (the t-test and the chi-squared test)

All the material for this challenge [is in one notebook](https://www.kaggle.com/rtatman/the-5-day-data-challenge).
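
To give a flavor of the two tests, here is a minimal Python sketch that computes both test statistics by hand with only the standard library. The function names `welch_t` and `chi_square` are hypothetical and the numbers are made up for illustration. The challenge notebook may well use a statistics library instead. The sketch also assumes Welch's variant of the t-test (unequal variances) and stops at the statistic without computing a p-value.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """t statistic for two independent samples (Welch's unequal-variance form)."""
    # Standard error of the difference between the two sample means
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def chi_square(observed):
    """Pearson chi-squared statistic for a contingency table (a list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat


# Made-up numeric samples for two groups (t-test: compare the means)
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.1, 4.3, 4.8, 4.0, 4.6]
print("t statistic:", welch_t(group_a, group_b))

# Made-up counts for two categorical variables (chi-squared: test association)
table = [[10, 20], [20, 10]]
print("chi-squared statistic:", chi_square(table))
```

A t-test compares a numeric variable across two groups, while a chi-squared test looks at the relationship between two categorical variables. Both functions assume every group has at least two values and that no row or column of the table sums to zero.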

_____

## [5-Day Data Challenge: Regression](https://www.kaggle.com/rtatman/the-5-day-regression-challenge)

* **Topic:** Regression
* **Level:** Intermediate (should already be familiar with R)
* **Language:** R
* **Daily tasks:**
    * [Day 1: Learn about different types of regression (Poisson, linear and logistic) and when to use them](https://www.kaggle.com/rtatman/regression-challenge-day-1)
    * [Day 2: Learn how to fit & evaluate a model with diagnostic plots](https://www.kaggle.com/rtatman/regression-challenge-day-2)
    * [Day 3: Learn how to read and understand models](https://www.kaggle.com/rtatman/regression-challenge-day-3)
    * [Day 4: Learn how to fit & interpret a multiple regression model](https://www.kaggle.com/rtatman/regression-challenge-day-4)
    * [Day 5: Learn how to use Elastic Net to select input variables](https://www.kaggle.com/rtatman/regression-challenge-day-5)

By the time you finish this challenge, you’ll understand how and when to implement three foundational regression techniques. Each day we will cover one aspect of regression analysis in depth:

  - How to pick the right regression technique for your data
  - How to use diagnostic plots to check your model
  - How to interpret and communicate your model
  - How to visualize your model
  - How to compare models & select variables

We’ll work with real datasets to help develop an intuitive understanding of how each type of model works and how to interpret the results.
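
As a rough illustration of picking a regression technique from the kind of outcome you are modeling, here is a small Python sketch. The challenge itself is in R, so this is not the notebooks' code. The function name `suggest_regression` is hypothetical, and the rule of thumb is deliberately simplified: binary outcomes suggest logistic regression, non-negative counts suggest Poisson regression, and everything else falls back to linear regression.

```python
def suggest_regression(outcome):
    """Suggest a regression family from the values of the outcome variable.

    A simplified rule of thumb, not a substitute for looking at your data.
    """
    values = set(outcome)
    if values <= {0, 1}:
        # Only two possible outcomes (yes/no, 0/1)
        return "logistic"
    if all(isinstance(v, int) and v >= 0 for v in values):
        # Whole, non-negative numbers: counts of events
        return "poisson"
    # Continuous measurements
    return "linear"


print(suggest_regression([0, 1, 1, 0]))      # binary outcome
print(suggest_regression([3, 0, 7, 2]))      # count outcome
print(suggest_regression([1.5, 2.7, -0.3]))  # continuous outcome
```

In practice you would still check the fit with diagnostic plots, since the type of the outcome variable is only a starting point.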

_____

## [SQL Scavenger Hunt](https://www.kaggle.com/rtatman/sql-scavenger-hunt-handbook/) (not a 5-Day Challenge, but follows a similar format)

* **Topic:** How to query data in SQL
* **Level:** Beginner
* **Language:** Python and SQL
* **Daily tasks:**
     - [Before you start: How to use SQL on Kaggle](https://www.kaggle.com/rtatman/sql-scavenger-hunt-handbook/)
     - [Day 1: SELECT FROM](https://www.kaggle.com/rtatman/sql-scavenger-hunt-day-1/)
     - [Day 2: GROUP BY](https://www.kaggle.com/rtatman/sql-scavenger-hunt-day-2/) 
     - [Day 3: ORDER BY & Dates](https://www.kaggle.com/rtatman/sql-scavenger-hunt-day-3/)
     - [Day 4: WITH & AS](https://www.kaggle.com/rtatman/sql-scavenger-hunt-day-4/)
     - [Day 5: JOIN](https://www.kaggle.com/rtatman/sql-scavenger-hunt-day-5/)

In our SQL Scavenger Hunt, you’ll learn how to use SQL to get data from BigQuery databases. Each day you’ll learn about a core SQL technique and practice using it to get the data you need to answer real-world questions like:

* How many GitHub users made more than ten commits on January 1, 2015?
* Which five cities had the highest air pollution last week?

You’ll also learn best practices for working with BIG datasets.

SQL (short for “Structured Query Language”) is the primary way to get data out of relational databases. It’s also the third most popular software tool for data science, right after Python and R, and a key skill for aspiring data scientists to develop.
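
Here is a minimal sketch of the five daily techniques combined in one query, run from Python. It uses an in-memory SQLite database from the standard library in place of BigQuery, so the connection code differs from what the notebooks do. The `users` and `commits` tables and their contents are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (an assumption)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE commits (user_id INTEGER, committed_on TEXT);
    INSERT INTO users VALUES (1, 'ada'), (2, 'grace'), (3, 'linus');
    INSERT INTO commits VALUES
        (1, '2015-01-01'), (1, '2015-01-01'), (1, '2015-01-02'),
        (2, '2015-01-01'), (3, '2015-01-03');
""")

query = """
    WITH daily AS (                         -- WITH ... AS: name a subquery
        SELECT user_id, COUNT(*) AS n       -- SELECT ... FROM: pick columns
        FROM commits
        WHERE committed_on = '2015-01-01'
        GROUP BY user_id                    -- GROUP BY: one row per user
    )
    SELECT users.name, daily.n
    FROM daily
    JOIN users ON users.id = daily.user_id  -- JOIN: attach user names
    ORDER BY daily.n DESC, users.name       -- ORDER BY: most commits first
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('ada', 2), ('grace', 1)]
```

The query counts each user's commits on January 1, 2015 and lists the users from most to fewest commits. BigQuery's SQL dialect and date handling differ in places, so treat this as a shape to recognize rather than code to paste.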

### [This challenge is also available as a Learn track](https://www.kaggle.com/learn/sql)
_____

## Python: [5-Day Data Challenge: Data Cleaning](https://www.kaggle.com/rtatman/data-cleaning-challenge-handling-missing-values)

* **Topic:** Data cleaning
* **Level:** Beginner to intermediate (should already be familiar with Python)
* **Language:** Python
* **Daily tasks:**
    * [Day 1: Handling missing values](https://www.kaggle.com/rtatman/data-cleaning-challenge-handling-missing-values)
    * [Day 2: Scaling and normalization](https://www.kaggle.com/rtatman/data-cleaning-challenge-scale-and-normalize-data)
    * [Day 3: Parsing dates](https://www.kaggle.com/rtatman/data-cleaning-challenge-parsing-dates/)
    * [Day 4: Character encodings](https://www.kaggle.com/rtatman/data-cleaning-challenge-character-encodings/)
    * [Day 5: Inconsistent Data Entry](https://www.kaggle.com/rtatman/data-cleaning-challenge-inconsistent-data-entry/)

Data cleaning is a key part of data science, but it can be deeply frustrating. Why are some of your text fields garbled? What should you do about those missing values? Why aren’t your dates formatted correctly? How can you quickly clean up inconsistent data entry? In this five-day challenge, you'll learn why you've run into these problems and, more importantly, how to fix them!

In this challenge, we’ll learn how to tackle some of the most common data cleaning problems so you can get to actually analyzing your data faster. We’ll work through five hands-on exercises with real, messy data and answer some of your most commonly asked data cleaning questions.
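
To make the five topics concrete, here is a small sketch that touches each one using only the Python standard library. The records, the `clean` helper, and the date format are all assumptions made up for illustration. The notebooks work with real datasets and may use other libraries.

```python
from datetime import datetime

# Hypothetical raw records, invented for illustration
raw = [
    {"city": " Seattle", "visits": "12", "date": "3/2/07"},
    {"city": "seattle ", "visits": "", "date": "3/22/07"},
    {"city": "Tacoma", "visits": "30", "date": "4/1/07"},
]


def clean(record):
    """Return a cleaned copy of one record."""
    visits = record["visits"].strip()
    return {
        # Inconsistent data entry: trim whitespace and unify case
        "city": record["city"].strip().lower(),
        # Missing values: keep an explicit None instead of an empty string
        "visits": int(visits) if visits else None,
        # Parsing dates: assumes month/day/two-digit-year
        "date": datetime.strptime(record["date"], "%m/%d/%y").date(),
    }


cleaned = [clean(r) for r in raw]

# Scaling: min-max scale the non-missing visit counts to the range 0 to 1
known = [r["visits"] for r in cleaned if r["visits"] is not None]
lo, hi = min(known), max(known)
scaled = [(v - lo) / (hi - lo) for v in known]

# Character encodings: bytes only become text once you decode them
# with the encoding they were written in (Latin-1 here, by construction)
latin1_bytes = "caf\u00e9".encode("latin-1")
text = latin1_bytes.decode("latin-1")

print(cleaned)
print(scaled)
print(text)
```

Each step is a simplification. For example, whether to drop, keep, or fill in a missing value depends on why it is missing, and the min-max scaling above assumes the values are not all identical.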

_____

## R: [5-Day Data Challenge: Data Cleaning](https://www.kaggle.com/rtatman/data-cleaning-challenge-json-txt-and-xls/)

* **Topic:** Data cleaning
* **Level:** Beginner to intermediate (should already be familiar with R)
* **Language:** R
* **Daily tasks:**
    * [Day 1: Reading in common data file formats: .json, .txt and .xlsx](https://www.kaggle.com/rtatman/data-cleaning-challenge-json-txt-and-xls/)
    * [Day 2: Filling in missing values](https://www.kaggle.com/rtatman/data-cleaning-challenge-imputing-missing-values/)
    * [Day 3: Identifying & handling outliers](https://www.kaggle.com/rtatman/data-cleaning-challenge-outliers/)
    * [Day 4: Removing duplicate records](https://www.kaggle.com/rtatman/data-cleaning-challenge-deduplication/)
    * [Day 5: Cleaning numbers (percentages, money, dates and times)](https://www.kaggle.com/rtatman/data-cleaning-challenge-cleaning-numeric-columns/)

Data cleaning is a necessary part of data science, but it can be deeply frustrating. What are you supposed to do with this .json file? How can you handle all these missing values in your data? Is there a fast way to get rid of duplicate entries? In this challenge, we’ll learn how to solve some common data cleaning problems.

This challenge is in R and covers different topics from the earlier Python version of the Data Cleaning 5-Day Challenge, so even if you did the last challenge, you’ll discover some new tips and tricks! Here’s what we’ll be covering: