# Extreme Gradient Boosting with XGBoost

Do you know the basics of supervised learning and want to use state-of-the-art models on real-world datasets? 

Gradient boosting is currently one of the most popular techniques for efficient modeling of tabular datasets of all sizes. 

XGBoost is a very fast, scalable implementation of gradient boosting. Models built with XGBoost regularly win online data science competitions and are used at scale across different industries.

In this course, you'll learn how to use this powerful library alongside pandas and scikit-learn to build and tune supervised learning models. 

You'll work with real-world datasets to solve classification and regression problems.

## 1. Classification with XGBoost

This chapter introduces the fundamental idea behind XGBoost: boosted learners. Once you understand how XGBoost works, you'll apply it to a common classification problem in industry: predicting whether a customer will stop being a customer at some point in the future (customer churn).

    1.1 Welcome to the course!
    1.2 Which of these is a classification problem?
    1.3 Which of these is a binary classification problem?
    1.4 Introducing XGBoost
    1.5 XGBoost: Fit/Predict
    1.6 What is a decision tree?
    1.7 Decision trees
    1.8 What is Boosting?
    1.9 Measuring accuracy
    1.10 Measuring AUC
    1.11 When should I use XGBoost?
    1.12 Using XGBoost

## 2. Regression with XGBoost

After a brief review of supervised regression, you'll apply XGBoost to the regression task of predicting house prices in Ames, Iowa. You'll learn about the two kinds of base learners that XGBoost can use as its weak learners, and review how to evaluate the quality of your regression models.

    2.1 Regression review
    2.2 Which of these is a regression problem?
    2.3 Objective (loss) functions and base learners
    2.4 Decision trees as base learners
    2.5 Linear base learners
    2.6 Evaluating model quality
    2.7 Regularization and base learners in XGBoost
    2.8 Using regularization in XGBoost
    2.9 Visualizing individual XGBoost trees
    2.10 Visualizing feature importances: What features are most important in my dataset

## 3. Fine-tuning your XGBoost model

This chapter will teach you how to make your XGBoost models as performant as possible. You'll learn about the variety of parameters that can be adjusted to alter the behavior of XGBoost and how to tune them efficiently so that you can supercharge the performance of your models.

    3.1 Why tune your model?
    3.2 When is tuning your model a bad idea?
    3.3 Tuning the number of boosting rounds
    3.4 Automated boosting round selection using early_stopping
    3.5 Overview of XGBoost's hyperparameters
    3.6 Tuning eta
    3.7 Tuning max_depth
    3.8 Tuning colsample_bytree
    3.9 Review of grid search and random search
    3.10 Grid search with XGBoost
    3.11 Random search with XGBoost
    3.12 Limits of grid search and random search
    3.13 When should you use grid search and random search?

## 4. Using XGBoost in pipelines

Take your XGBoost skills to the next level by incorporating your models into two end-to-end machine learning pipelines. 
You'll learn how to tune the most important XGBoost hyperparameters efficiently within a pipeline, and get an introduction to some more advanced preprocessing techniques.

    4.1 Review of pipelines using sklearn
    4.2 Exploratory data analysis
    4.3 Encoding categorical columns I: LabelEncoder
    4.4 Encoding categorical columns II: OneHotEncoder
    4.5 Encoding categorical columns III: DictVectorizer
    4.6 Preprocessing within a pipeline
    4.7 Incorporating XGBoost into pipelines
    4.8 Cross-validating your XGBoost model
    4.9 Kidney disease case study I: Categorical Imputer
    4.10 Kidney disease case study II: Feature Union
    4.11 Kidney disease case study III: Full pipeline
    4.12 Tuning XGBoost hyperparameters
    4.13 Bringing it all together
    4.14 Final Thoughts

# Additional material

- DataCamp course: https://learn.datacamp.com/courses/extreme-gradient-boosting-with-xgboost
- XGBoost documentation: https://xgboost.readthedocs.io/en/latest/
- sklearn.tree.DecisionTreeClassifier documentation: https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html#sklearn.tree.DecisionTreeClassifier
- scikit-learn model evaluation metrics (scoring parameter): https://scikit-learn.org/stable/modules/model_evaluation.html#scoring-parameter