## Q1. What is boosting in machine learning?

Ans-Boosting is an ensemble modeling technique that attempts to build a strong classifier from a number of weak classifiers. It does so by training weak models in series. First, a model is built from the training data. Then a second model is built that tries to correct the errors of the first. This procedure continues, and models are added until either the complete training data set is predicted correctly or the maximum number of models has been added.

## Q2. What are the advantages and limitations of using boosting techniques?

Ans-Advantages of Boosting

Improved Accuracy – Boosting can improve accuracy by combining the predictions of several weak models, using a weighted sum for regression or a weighted vote for classification, so that the final model is more accurate than any single weak model.

Reduced Bias and Some Robustness to Overfitting – By reweighting the inputs that are classified wrongly, boosting reduces bias. With simple weak learners it often resists overfitting in practice, although it can still overfit noisy data.

Better handling of imbalanced data – Boosting can cope with imbalanced data by focusing more on the data points that are misclassified, which are often the minority-class points.

Some Interpretability – A boosted ensemble is harder to read than a single tree, but it breaks the decision process into many simple steps: each weak learner (for example a decision stump) is easy to inspect, and tree-based boosting models report feature importances.

Limitations of Boosting

Sensitivity to noise and outliers – Because misclassified points keep receiving more weight, mislabeled or extreme points can dominate later rounds.

Sequential training – Each model depends on the previous one, so training is harder to parallelize and can be slow when many estimators are used.

Need for tuning – The number of estimators and the learning rate must be chosen with care; too many rounds on noisy data can lead to overfitting.

## Q3. Explain how boosting works.

Ans-Boosting combines weak learners (also known as base learners) to form a strong rule.
To find a weak rule, we apply a base learning (ML) algorithm to a different distribution of the training data each time. Each application generates a new weak prediction rule. This is an iterative process: after many iterations, the boosting algorithm combines the weak rules into a single strong prediction rule.
The distribution for each round is chosen with the following steps:

Step 1: The base learner takes the whole training set and assigns equal weight, or attention, to each observation.

Step 2: If the first base learner makes prediction errors, we pay more attention to the observations it predicted wrongly. Then we apply the next base learner.

Step 3: Repeat Step 2 until the limit on the number of base learners is reached or the desired accuracy is achieved.

Finally, boosting combines the outputs of the weak learners into a strong learner, which improves the predictive power of the model. Throughout, boosting focuses on the examples that preceding weak rules misclassified or predicted with large errors.
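
The three steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: it assumes labels of +1 and -1, a single numeric feature, threshold rules as the weak learners, and the AdaBoost weighting scheme. The function names (`fit_stump`, `boost`, `predict`) and the toy data are made up for this example.

```python
import math

def stump_predict(x, threshold, polarity):
    # Weak rule: predict `polarity` when x >= threshold, otherwise -polarity.
    return polarity if x >= threshold else -polarity

def fit_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best threshold rule."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def boost(xs, ys, n_rounds):
    n = len(xs)
    weights = [1.0 / n] * n                      # Step 1: equal weights
    ensemble = []                                # (alpha, threshold, polarity)
    for _ in range(n_rounds):                    # Step 3: repeat up to the limit
        err, threshold, polarity = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # say of this weak rule
        ensemble.append((alpha, threshold, polarity))
        # Step 2: raise the weight of wrongly predicted points, lower the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    # Strong rule: sign of the alpha-weighted vote of all weak rules.
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data that no single threshold rule can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

for rounds in (1, 3):
    ensemble = boost(xs, ys, rounds)
    correct = sum(predict(ensemble, x) == y for x, y in zip(xs, ys))
    print(rounds, "round(s):", correct, "of", len(xs), "correct")
```

On this toy set a single threshold rule always misses some points, while a few reweighted rounds combine into a rule that fits the training data better.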


## Q4. What are the different types of boosting algorithms?

Ans-Types of Boosting Algorithms
The underlying engine of a boosting algorithm can be almost any learner, for example a decision stump or a margin-maximizing classification algorithm. Widely used boosting algorithms include:

AdaBoost (Adaptive Boosting)

Gradient Tree Boosting

XGBoost

## Q5. What are some common parameters in boosting algorithms?

Ans-learning_rate. This determines the impact of each tree on the final outcome. ...

n_estimators. The number of sequential trees to be modeled. ...

subsample. The fraction of observations to be selected for each tree.
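
To show where these three parameters enter the training loop, here is a small gradient-boosting sketch for regression in plain Python. It is an illustration under assumptions, not the code of any library: the weak learner is a one-split regression stump on a single feature, the loss is squared error, and the function names and toy data are hypothetical. Only the parameter names mirror the ones listed above.

```python
import random

def fit_regression_stump(xs, residuals):
    """Best single split on 1-D inputs: returns (threshold, left_mean, right_mean)."""
    best = None
    for threshold in sorted(set(xs))[1:]:        # skip the minimum so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def gradient_boost(xs, ys, n_estimators=50, learning_rate=0.1, subsample=1.0, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    base = sum(ys) / n                           # start from the mean target
    preds = [base] * n
    stumps = []
    for _ in range(n_estimators):                # n_estimators: number of sequential trees
        k = max(2, int(subsample * n))           # subsample: fraction of rows per tree
        idx = rng.sample(range(n), k)            # drawn without replacement
        residuals = [ys[i] - preds[i] for i in idx]
        t, left_mean, right_mean = fit_regression_stump([xs[i] for i in idx], residuals)
        stumps.append((t, left_mean, right_mean))
        # learning_rate: shrinks the contribution of each tree
        preds = [p + learning_rate * (left_mean if x < t else right_mean)
                 for p, x in zip(preds, xs)]
    return base, stumps, preds

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

# Hypothetical toy data: y = x^2 on distinct inputs.
xs = [i / 10 for i in range(40)]
ys = [x * x for x in xs]

for n in (5, 50):
    _, _, preds = gradient_boost(xs, ys, n_estimators=n)
    print(n, "trees, training MSE:", round(mse(ys, preds), 4))
```

A smaller `learning_rate` usually needs a larger `n_estimators` to reach the same training error, which is why the two are normally tuned together.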

## Q6. How do boosting algorithms combine weak learners to create a strong learner?

Ans-Boosting is an ensemble method that integrates multiple models (called weak learners) to produce a single, more accurate model (the strong learner).

Boosting trains the weak learners sequentially, each one trying to correct its predecessor. We first specify a weak model (e.g. a shallow decision tree or a simple regression model). Each new weak learner then concentrates on the examples the earlier ones got wrong, and the final prediction is a weighted vote (classification) or a weighted sum (regression) of the outputs of all the weak learners.

AdaBoost is a boosting algorithm that typically uses a decision tree with a single split as its weak learner. Gradient boosting and XGBoost are other examples.
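
The combination step for classification can be illustrated with a weighted vote. The sketch below assumes weak learners that return +1 or -1; the three rules and their weights are invented for illustration and are not taken from a trained model.

```python
def strong_predict(weak_learners, alphas, x):
    """Weighted vote: sign of the sum of alpha * weak prediction."""
    score = sum(a * h(x) for h, a in zip(weak_learners, alphas))
    return 1 if score >= 0 else -1

# Three hypothetical weak rules on a single number x.
weak_learners = [
    lambda x: 1 if x < 3 else -1,
    lambda x: 1 if x < 9 else -1,
    lambda x: 1 if x >= 6 else -1,
]
alphas = [0.42, 0.65, 0.75]   # illustrative weights; better rules get larger values

for x in (1, 4, 7, 9):
    votes = [h(x) for h in weak_learners]
    print(x, votes, strong_predict(weak_learners, alphas, x))
```

The weak rules disagree on every one of these inputs, and the weights decide which side of the vote wins.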

## Q7. Explain the concept of AdaBoost algorithm and its working.

Ans-There are many machine learning algorithms to choose from for a given problem statement. One of these algorithms for predictive modeling is AdaBoost.

AdaBoost, short for Adaptive Boosting, is an ensemble method in machine learning. The most common estimator used with AdaBoost is a decision tree with one level, that is, a tree with only one split. These trees are also called decision stumps.

The algorithm first builds a model that gives equal weight to all the data points. It then assigns higher weights to the points that were wrongly classified, so that those points get more importance in the next model. It keeps training models until the error is low enough or the maximum number of estimators is reached.

## Q8. What is the loss function used in AdaBoost algorithm?

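Ans-The error function that AdaBoost uses is an exponential loss function. First we find the product of the true label (+1 or -1) and the overall prediction of the ensemble for each training sample. Then we sum the exponentials of the negated products to compute the error at iteration m. A sample that the ensemble classifies correctly and confidently therefore contributes a value close to 0, while a misclassified sample contributes a value greater than 1.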
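
As a small sketch of this computation, the function below evaluates E = sum over samples of exp(-y_i * F(x_i)), assuming labels of +1 and -1 and real-valued ensemble scores. The function name and the sample values are made up for illustration.

```python
import math

def exponential_loss(y_true, scores):
    """Sum of exp(-y * F(x)) over all samples; labels must be +1 or -1."""
    return sum(math.exp(-y * f) for y, f in zip(y_true, scores))

y_true = [1, -1, 1]
scores = [2.0, -1.0, -0.5]    # the last sample is on the wrong side of zero

per_sample = [math.exp(-y * f) for y, f in zip(y_true, scores)]
print(per_sample)
print(exponential_loss(y_true, scores))
```

The misclassified third sample dominates the total, which is why the exponential loss pushes later weak learners toward the points that are currently wrong.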

## Q9. How does the AdaBoost algorithm update the weights of misclassified samples?

Ans-For misclassified records, the weight is increased, where Performance is the weight (amount of say) of the current weak learner:

New Sample Weight = Sample Weight * e^(Performance)

In our case Sample Weight = 1/5 and Performance = 0.693, so 1/5 * e^(0.693) = 0.399

For correctly classified records, we use the same formula with the performance value negated. This reduces the weight of the correctly classified records relative to the incorrectly classified ones. The formula is:

New Sample Weight = Sample Weight * e^(-Performance)

Putting in the values, 1/5 * e^(-0.693) = 0.100

The updated weight for all the records can be seen in the figure. The weights must sum to 1, but here the updated weights sum to 0.799. To bring the sum back to 1, every updated weight is divided by the total of the updated weights. For example, an updated weight of 0.399 becomes 0.399/0.799 = 0.50.

The value 0.50 is called the normalized weight. In the figure below, we can see all the normalized weights, and their sum is approximately 1.
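
The arithmetic above can be reproduced in a few lines. This sketch assumes what the numbers imply but the text does not state outright: five records, exactly one of them misclassified, and the usual AdaBoost formula Performance = 0.5 * ln((1 - total error) / total error). Variable names are illustrative.

```python
import math

n_samples = 5                                         # assumed: each record starts at 1/5
misclassified = [True, False, False, False, False]    # assumed: one record is wrong

weights = [1 / n_samples] * n_samples
total_error = sum(w for w, wrong in zip(weights, misclassified) if wrong)
performance = 0.5 * math.log((1 - total_error) / total_error)

# Increase the weight of the wrong record, decrease the others.
updated = [w * math.exp(performance if wrong else -performance)
           for w, wrong in zip(weights, misclassified)]

# Normalize so the weights sum to 1 again.
total = sum(updated)
normalized = [w / total for w in updated]

print("performance:", round(performance, 3))
print("updated:", [round(w, 3) for w in updated], "sum:", round(total, 3))
print("normalized:", [round(w, 3) for w in normalized])
```

Without rounding the performance to 0.693, the updated weights come out as 0.4 and 0.1 with a sum of 0.8, which matches the 0.399, 0.100 and 0.799 in the text up to rounding.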

## Q10. What is the effect of increasing the number of estimators in AdaBoost algorithm?

Ans-AdaBoost attributes (as exposed by AdaBoostClassifier in scikit-learn).
estimators_: The list of fitted base classifiers in the ensemble.

classes_: The class labels.

estimator_weights_: The weight assigned to each base estimator.

estimator_errors_: Classification error for each estimator in the boosted ensemble.

feature_importances_: Shows which columns (features) are more important than others.

Hyperparameter tuning with AdaBoost
Let us vary the parameters provided by the AdaBoost class and observe how the accuracy changes:

Explore the number of trees
An important hyperparameter for AdaBoost is n_estimators. Changing the number of base models, or weak learners, changes the accuracy of the model. The number of trees often needs to be high for the model to work well, frequently hundreds, if not thousands. Adding weak learners mainly reduces bias, moving the model from high bias toward low bias. Beyond some point the gains level off, while training time keeps growing and the model may start to overfit noisy data.

```python
# explore adaboost ensemble number of trees effect on performance
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import AdaBoostClassifier
from matplotlib import pyplot

# get the dataset
def get_dataset():
    X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=6)
    return X, y

# get a list of models to evaluate
def get_models():
    models = dict()
    # define number of trees to consider
    n_trees = [10, 50, 100, 500, 1000, 5000]
    for n in n_trees:
        models[str(n)] = AdaBoostClassifier(n_estimators=n)
    return models

# evaluate a given model using cross-validation
def evaluate_model(model, X, y):
    # define the evaluation procedure
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # evaluate the model and collect the results
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
    return scores

# define dataset
X, y = get_dataset()
# get the models to evaluate
models = get_models()
# evaluate the models and store results
results, names = list(), list()
for name, model in models.items():
    # evaluate the model
    scores = evaluate_model(model, X, y)
    # store the results
    results.append(scores)
    names.append(name)
    # summarize the performance along the way
    print('>%s %.3f (%.3f)' % (name, mean(scores), std(scores)))
# plot model performance for comparison
pyplot.boxplot(results, labels=names, showmeans=True)
pyplot.show()
```