## Polynomial features

- Linear regression can capture nonlinear patterns when polynomial features are added
- Polynomial features are new features created by raising existing features to a power or by multiplying existing features together
- For example, from a feature x we can create x^2 by squaring x, x^3 by cubing x, and so on
- This is useful when the relationship between the features and the target is nonlinear

In scikit-learn, degree-3 polynomial features are created with PolynomialFeatures(degree=3) from sklearn.preprocessing
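As a minimal sketch of why this helps, the example below fits a plain linear regression and a degree-3 polynomial regression to the same data. The synthetic cubic dataset, the variable names, and the use of make_pipeline are assumptions chosen for illustration, not part of the original notes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (an assumption for illustration): y depends on x cubically, plus noise
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X[:, 0] ** 3 - X[:, 0] + rng.normal(scale=0.5, size=200)

# Plain linear regression on the raw feature x
linear = LinearRegression().fit(X, y)

# Linear regression on the expanded features 1, x, x^2, x^3
poly = make_pipeline(PolynomialFeatures(degree=3), LinearRegression()).fit(X, y)

# R^2 on the training data for both models
r2_linear = linear.score(X, y)
r2_poly = poly.score(X, y)
print(f"R^2 linear:   {r2_linear:.3f}")
print(f"R^2 degree-3: {r2_poly:.3f}")
```

The model is still linear in its coefficients; only the inputs have been expanded, which is why ordinary LinearRegression can be reused unchanged.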

- PolynomialFeatures(degree=d) transforms an array containing n features into an array containing (n+d)!/(d!n!) features, counting the constant (bias) column of 1s
- For example, if we have two features a and b, PolynomialFeatures(degree=2) will create the following (2+2)!/(2!2!) = 6 features: 1, a, b, a^2, ab, b^2



# **Ensemble methods**
---

Ensemble methods combine multiple machine learning models to create more powerful models


## Various Types: Voting & Stacking

Stacking is a more advanced ensemble method that trains a model to combine the predictions of several other models
--> Base learners make the individual predictions and a meta learner combines them; in scikit-learn's StackingClassifier, the default meta learner is a logistic regression model
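A minimal stacking sketch with scikit-learn's StackingClassifier is shown below. The synthetic dataset, the choice of a decision tree and k-nearest neighbors as base learners, and all hyperparameters are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (illustrative only)
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Base learners: two different kinds of model
base_learners = [
    ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ("knn", KNeighborsClassifier()),
]

# Meta learner: logistic regression, passed explicitly here for clarity
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
)
stack.fit(X_train, y_train)

stack_acc = stack.score(X_test, y_test)
print(f"Stacking test accuracy: {stack_acc:.3f}")
```

StackingClassifier trains the meta learner on cross-validated predictions of the base learners, which helps keep the meta learner from simply memorizing base learners that have overfit.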

Voting is a simple ensemble method in which several models are trained and each one predicts the class of a data point. The final prediction is either the class chosen by the majority of the models (hard voting) or the class with the highest average predicted probability (soft voting)
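The sketch below illustrates hard voting with scikit-learn's VotingClassifier and compares its output with a majority vote computed by hand. The three models, the synthetic dataset, and the variable names are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data with labels 0 and 1 (illustrative only)
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

voter = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="hard",  # each model casts one vote per data point
)
voter.fit(X_train, y_train)

# Collect each fitted model's vote and compute the majority by hand:
# with three voters and labels 0/1, class 1 wins when at least two vote for it
votes = np.array([est.predict(X_test) for est in voter.estimators_])
manual_majority = (votes.sum(axis=0) >= 2).astype(int)

ensemble_pred = voter.predict(X_test)
print("Matches manual majority vote:", np.array_equal(manual_majority, ensemble_pred))
```

Passing voting="soft" instead would average the predicted probabilities, which requires every model to support predict_proba.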

## Similar Type: Tree Based Models

Ensembles of a similar type combine many models of the same kind. Here they are all tree-based models, which use decision trees to make predictions

Bootstrap Sampling: The process of creating a new dataset by sampling with replacement from the original dataset, so the same sample can appear more than once. One bootstrap sample is drawn per model, so the number of bootstrap samples equals the number of models in the ensemble.
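Bootstrap sampling can be sketched in a few lines of NumPy. The toy dataset of ten integers and the choice of three models are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "dataset" of 10 samples, identified by their index
data = np.arange(10)
n_models = 3  # one bootstrap sample per model in the ensemble

# Sample with replacement; each bootstrap sample has the same size as the original
bootstrap_samples = [
    rng.choice(data, size=data.size, replace=True) for _ in range(n_models)
]

for i, sample in enumerate(bootstrap_samples):
    # Sorting makes the duplicated and the missing samples easy to spot
    print(f"Bootstrap sample {i}: {np.sort(sample)}")
```

Because sampling is done with replacement, each bootstrap sample typically repeats some original samples and leaves others out, so each model sees a slightly different dataset.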

## Bagging Methods

Bagging: Bootstrap Aggregating, a way to **decrease the variance** of a model by training many models on bootstrap samples of the original dataset and then averaging the predictions
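A minimal bagging sketch using scikit-learn's BaggingClassifier with decision trees is given below. The dataset, the number of trees, and the comparison with a single tree are assumptions for illustration; the relative scores will vary with the data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (illustrative only)
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A single, fully grown decision tree: a high-variance model
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# 50 trees, each trained on its own bootstrap sample of the training data
bag = BaggingClassifier(
    DecisionTreeClassifier(random_state=0),
    n_estimators=50,
    bootstrap=True,
    random_state=0,
).fit(X_train, y_train)

tree_acc = tree.score(X_test, y_test)
bag_acc = bag.score(X_test, y_test)
print(f"Single tree test accuracy: {tree_acc:.3f}")
print(f"Bagged trees test accuracy: {bag_acc:.3f}")
```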

### Random Forest

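The concept of Bagging is implemented in Random Forest, an ensemble method that uses a collection of decision trees trained on bootstrap samples. Each tree also considers only a random subset of the features at each split, which makes the trees less correlated with one another. For classification, the final prediction combines the trees' outputs: the classic formulation uses a majority (hard) vote, while scikit-learn's RandomForestClassifier averages the predicted class probabilities of the trees.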

In [None]:
from sklearn.ensemble import RandomForestClassifier

# Create a random forest classifier. By convention, clf means 'classifier'
# n_estimators=2: the forest contains 2 trees (kept tiny here; the scikit-learn default is 100)
# random_state=0: fixes the random seed so results are reproducible
clf = RandomForestClassifier(n_estimators=2, random_state=0)
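The cell above only creates the classifier. A hedged sketch of how such a classifier might then be trained and evaluated follows; the synthetic dataset, the train/test split, and the use of 100 trees are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (illustrative only)
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train a forest of 100 trees on the training split
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Evaluate on the held-out split
rf_acc = clf.score(X_test, y_test)
print(f"Random forest test accuracy: {rf_acc:.3f}")

# The fitted trees and the impurity-based feature importances are available afterwards
print("Number of fitted trees:", len(clf.estimators_))
print("Most important feature index:", int(np.argmax(clf.feature_importances_)))
```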

## Boosting Methods

Boosting is an **iterative/sequential** method that attempts to reduce the bias of the combined models. Boosting methods train models sequentially, and each model attempts to correct the errors of the previous model.

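Boosting methods are more complex than bagging methods and often perform better on tabular data, but because each model keeps fitting the remaining errors they are also more likely to overfit the training data, especially with many iterations or noisy labels.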

Goal: combine many weak learners, all of the same model type, into one strong learner

***Weak learner***: A model that is slightly better than random guessing

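Weight concept: The weight of each data point **(a sample/row, not a feature)** indicates how important the data point is to the model, and data points with higher weights are given more attention by the next model. In AdaBoost, data points that the previous model misclassified receive higher weights.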
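The weight idea can be sketched as a single reweighting round in the style of discrete AdaBoost. This is a simplified illustration under stated assumptions (binary labels, a depth-1 tree as the weak learner, synthetic data with some label noise), not the exact code of any library.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary data with 10% label noise so a stump cannot be perfect (illustrative)
X, y = make_classification(n_samples=200, flip_y=0.1, random_state=0)
n = len(y)

# Start with equal weights on every data point
weights = np.full(n, 1.0 / n)

# Weak learner: a decision stump trained with the current sample weights
stump = DecisionTreeClassifier(max_depth=1, random_state=0)
stump.fit(X, y, sample_weight=weights)
miss = stump.predict(X) != y  # True where the stump is wrong

# Weighted error of the stump, clipped to avoid division by zero
err = np.clip(weights[miss].sum(), 1e-10, 1 - 1e-10)

# Model weight: larger when the stump's weighted error is smaller
alpha = 0.5 * np.log((1 - err) / err)

# Increase weights of misclassified points, decrease the rest, then renormalize
weights = weights * np.exp(np.where(miss, alpha, -alpha))
weights /= weights.sum()

print(f"Weighted error: {err:.3f}, alpha: {alpha:.3f}")
print(f"Weight of a misclassified point: {weights[miss][0]:.5f}")
print(f"Weight of a correctly classified point: {weights[~miss][0]:.5f}")
```

The next weak learner would be trained with these updated weights, so it concentrates on the points the first stump got wrong.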

Hyperparameter tuning in Boosting:
- n_estimators: The number of models to train sequentially
- shrinkage: The learning rate (learning_rate in scikit-learn), which scales how much each model contributes to the overall ensemble; smaller values usually need more models
- subsample: The fraction of data points to randomly sample for training each model
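As a sketch of how these hyperparameters appear in code, the example below sets them on scikit-learn's GradientBoostingClassifier. The specific values and the synthetic dataset are assumptions for illustration, not tuned recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (illustrative only)
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

gb = GradientBoostingClassifier(
    n_estimators=100,   # number of trees trained one after another
    learning_rate=0.1,  # shrinkage: scales each tree's contribution
    subsample=0.8,      # each tree sees a random 80% of the training rows
    random_state=0,
)
gb.fit(X_train, y_train)

gb_acc = gb.score(X_test, y_test)
print(f"Gradient boosting test accuracy: {gb_acc:.3f}")
```

In practice these values are usually tuned together, for example with GridSearchCV, because a lower learning rate typically calls for a larger n_estimators.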


Methods:
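- AdaBoost (Adaptive Boosting), in python: AdaBoostClassifier (scikit-learn)
- Gradient Boosting, in python: GradientBoostingClassifier (scikit-learn)
- Extreme Gradient Boosting, in python: XGBClassifier (XGBoost library), well-known for its speed and performance
- CatBoost, in python: CatBoostClassifier (CatBoost library)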
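The sketch below compares two boosting classifiers that ship with scikit-learn, AdaBoostClassifier and GradientBoostingClassifier, using cross-validation. XGBClassifier and CatBoostClassifier come from separate packages and are left out so the example runs with scikit-learn alone; the dataset and settings are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic binary classification data (illustrative only)
X, y = make_classification(n_samples=500, random_state=0)

models = {
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
    "GradientBoosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model
cv_scores = {}
for name, model in models.items():
    cv_scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {cv_scores[name]:.3f}")
```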

All of these methods typically use decision trees as their base learners