AdaBoost, short for *adaptive boosting*, is a boosting ensemble method used both in classification and in regression problems. Its authors won the Gödel Prize for this work in 2003; the original paper is the first reference below, but a nice read on the general idea, by the same authors, is the paper in the second reference.
The idea is to fit a sequence of weak learners (a weak learner is one that performs just slightly better than random guessing) on repeatedly modified versions of the training set, then combine their predictions through a weighted majority vote.
In the first iteration, you give the same weight to all $$n$$ training samples; in the following iterations, the weights of the misclassified samples are increased, so that each new weak learner is forced to concentrate on the difficult cases.
In the current literature, AdaBoost with decision trees is considered a very strong off-the-shelf classifier.
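As a quick sketch of this in practice (assuming scikit-learn is available; its `AdaBoostClassifier` boosts depth-1 decision trees by default, and the dataset here is synthetic):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# A synthetic binary classification problem, purely for illustration
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Boost 100 weak learners (depth-1 decision trees, the scikit-learn default)
clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.2f}")
```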
This part follows the Wikipedia page. Say we have a binary classification problem, with class labels $$y_i \in \{-1, 1\}$$.
- Call $$h$$ a weak learner, whose output on sample $$x_i$$ is $$h(x_i) \in \{-1, 1\}$$.
- At iteration $$t$$, you'll have built a combination of weak learners into a strong learner $$H_t$$, so that

$$H_t(x_i) = H_{t-1}(x_i) + \alpha_t h_t(x_i) ,$$

where $$h_t$$ is the weak learner chosen at iteration $$t$$ and $$\alpha_t$$ its coefficient.
This way we have built a linear combination of weak learners over several iterations. The weights $$\alpha_t$$ are chosen so as to minimise the total exponential error

$$E = \sum_{i=1}^{n} e^{-y_i H_t(x_i)} = \sum_{i=1}^{n} e^{-y_i H_{t-1}(x_i)} \, e^{-y_i \alpha_t h_t(x_i)}$$

(note that the product $$y_i h_t(x_i)$$ in the argument of the exponential will be a 1 if the point is well classified and a -1 if not), which by posing

$$w_i^{(1)} = 1 , \qquad w_i^{(t)} = e^{-y_i H_{t-1}(x_i)} \ \text{for} \ t > 1 ,$$

becomes

$$E = \sum_{i=1}^{n} w_i^{(t)} e^{-y_i \alpha_t h_t(x_i)} .$$
We can now split the sum between the points which are well classified and those misclassified:

$$E = e^{-\alpha_t} \sum_{y_i = h_t(x_i)} w_i^{(t)} + e^{\alpha_t} \sum_{y_i \neq h_t(x_i)} w_i^{(t)} = e^{-\alpha_t} \sum_{i=1}^{n} w_i^{(t)} + \left( e^{\alpha_t} - e^{-\alpha_t} \right) \sum_{y_i \neq h_t(x_i)} w_i^{(t)} .$$

In the last expression above, the only part depending on the weak classifier is the second sum, so the weak classifier that minimises $$E$$ is the one minimising this sum, which means the one that minimises the total weight of the misclassified points, $$\sum_{y_i \neq h_t(x_i)} w_i^{(t)}$$.
If we derive $$E$$ with respect to $$\alpha_t$$ and set the derivative to zero, we obtain

$$\frac{dE}{d\alpha_t} = -e^{-\alpha_t} \sum_{y_i = h_t(x_i)} w_i^{(t)} + e^{\alpha_t} \sum_{y_i \neq h_t(x_i)} w_i^{(t)} = 0 \ \Rightarrow \ \alpha_t = \frac{1}{2} \ln \left( \frac{\sum_{y_i = h_t(x_i)} w_i^{(t)}}{\sum_{y_i \neq h_t(x_i)} w_i^{(t)}} \right) .$$
Now, the weighted error rate of the weak classifier is

$$\epsilon_t = \frac{\sum_{y_i \neq h_t(x_i)} w_i^{(t)}}{\sum_{i=1}^{n} w_i^{(t)}} ,$$

so it follows that we can write the $$\alpha_t$$ as

$$\alpha_t = \frac{1}{2} \ln \left( \frac{1 - \epsilon_t}{\epsilon_t} \right) ,$$

which is half the negative logit of the error rate: the smaller the error, the larger the weight given to the weak learner.
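A quick numeric check of this formula (illustrative only): $$\alpha_t$$ vanishes when the weak learner is no better than chance ($$\epsilon_t = 0.5$$) and grows as the error falls.

```python
import math

def alpha(eps):
    # alpha_t = 0.5 * ln((1 - eps_t) / eps_t)
    return 0.5 * math.log((1 - eps) / eps)

for eps in (0.5, 0.4, 0.2, 0.05):
    print(f"eps = {eps:.2f}  ->  alpha = {alpha(eps):.3f}")
```

For instance, $$\epsilon_t = 0.5$$ gives $$\alpha_t = 0$$ (the learner gets no say in the vote), while $$\epsilon_t = 0.2$$ gives $$\alpha_t = \tfrac{1}{2}\ln 4 \approx 0.693$$.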
To summarise then, at each iteration $$t$$ the AdaBoost algorithm consists of:

- Choose the weak classifier $$h_t$$ which minimises the weighted error $$E$$
- Use it to compute the classifier's weighted error rate $$\epsilon_t$$
- Use this to compute the weight $$\alpha_t$$
- Use this to update the boosted (strong) classifier $$H_t$$
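The steps above can be sketched end to end with decision stumps as weak learners; this is a minimal NumPy sketch (function names and structure are my own, not a reference implementation):

```python
import numpy as np

def fit_adaboost(X, y, n_rounds=20):
    """Fit AdaBoost with decision stumps; labels y must be in {-1, +1}.

    Returns a list of (feature, threshold, polarity, alpha) stumps.
    """
    n, d = X.shape
    w = np.ones(n) / n                   # start with uniform weights
    stumps = []
    for _ in range(n_rounds):
        best, best_err = None, np.inf
        # Exhaustively choose the stump minimising the weighted error
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = pol * np.where(X[:, j] < thr, -1, 1)
                    err = w[pred != y].sum()
                    if err < best_err:
                        best_err, best = err, (j, thr, pol)
        eps = max(best_err, 1e-10)       # clamp to avoid division by zero
        a = 0.5 * np.log((1 - eps) / eps)
        j, thr, pol = best
        pred = pol * np.where(X[:, j] < thr, -1, 1)
        # Reweight: increase the weight of the misclassified points
        w = w * np.exp(-a * y * pred)
        w /= w.sum()
        stumps.append((j, thr, pol, a))
    return stumps

def predict_adaboost(stumps, X):
    """Weighted majority vote of the stumps: the sign of H_t(x)."""
    H = np.zeros(len(X))
    for j, thr, pol, a in stumps:
        H += a * pol * np.where(X[:, j] < thr, -1, 1)
    return np.sign(H)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1, -1, 1, 1])
stumps = fit_adaboost(X, y, n_rounds=5)
print(predict_adaboost(stumps, X))
```

The exhaustive stump search plays the role of "choose the weak classifier which minimises the weighted error"; in practice one would use a library implementation rather than this quadratic search.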
Note that there exist several variants of the original AdaBoost.
- Y. Freund, R. E. Schapire, *A decision-theoretic generalization of on-line learning and an application to boosting*, Journal of Computer and System Sciences, 55(1), 1997
- Y. Freund, R. E. Schapire, *A short introduction to boosting*, Journal of the Japanese Society for Artificial Intelligence, 14(5), 1999