AdaBoost differs from random forests. With random forests we start from low-bias, high-variance models, and combining them in an ensemble lowers the variance. With boosting we start from high-bias, low-variance models (weak learners with 50-60% accuracy). The variance remains low, but the bias improves. We weight each weak learner:
$$
f(x) =\sum_{m=1}^M \alpha_m f_m(x)
$$
The weak learners ($f_m$) are decision trees of depth 1 (decision stumps), each of which splits the space in half.  Alternatively, we could use logistic regression.  They should train quickly, so that we can fit thousands of them.  The class labels are encoded as $-1$ and $1$.  Each base model is trained without resampling; instead we weight how important each sample is, with $w_1,\dots,w_N$ as the weights for the $N$ samples.  These weights change with each new classifier, so the classifier must accept weighted data.
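
As a minimal sketch of what "accepting weighted data" means, the snippet below fits a depth-1 tree twice on a toy data set, once with uniform weights and once with most of the weight on a single sample. The toy data and the variable names are assumptions made for illustration; the `sample_weight` argument of `fit` is part of scikit-learn's `DecisionTreeClassifier`.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical 1-D toy data with labels in {-1, +1}
X = np.arange(6, dtype=float).reshape(-1, 1)
y = np.array([1, 1, 1, -1, -1, 1])

# Uniform weights: the stump splits between x=2 and x=3,
# so the last sample (x=5, label +1) is misclassified.
w_uniform = np.ones(len(y)) / len(y)
stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w_uniform)
pred_uniform = stump.predict(X)

# Put half of the total weight on the last sample: the stump now
# gets that sample right, at the cost of the two -1 samples.
w_skewed = np.array([0.1, 0.1, 0.1, 0.1, 0.1, 0.5])
stump_skewed = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w_skewed)
pred_skewed = stump_skewed.predict(X)

print("uniform weights:", pred_uniform)
print("skewed weights: ", pred_skewed)
```

Shifting weight onto a sample changes which split the stump prefers, which is the mechanism the reweighting in boosting relies on.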

Initialize $w_i=1 \,\forall \, i$.  Then for $1\le m \le M$:

Fit $f_m$ with sample weights $w_i$.

Calculate the weighted error
$$
\epsilon_m= \frac{\sum_{i=1}^N w_i 1_{y_i \neq f_m(x_i)}}{\sum_{i=1}^N w_i}
$$
Set
$$
\alpha_m=\frac{1}{2} \log \left[ \frac{1-\epsilon_m}{\epsilon_m} \right]
$$
and calculate the new weights
$$
w_i\leftarrow w_i \exp \left( -\alpha_m y_i f_m(x_i) \right)
$$
then normalize
$$
w_i\leftarrow \frac{w_i}{\sum_{j=1}^N w_j}
$$
Add $\alpha_m f_m$ to the classifier.  After the last round, the predicted class is the sign of $f(x)$.
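
The loop above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not scikit-learn's own: the function names `adaboost_fit` and `adaboost_predict`, the clipping of $\epsilon_m$ to avoid a division by zero, and the synthetic data set are all choices made here for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def adaboost_fit(X, y, n_rounds=50):
    """Boost depth-1 trees; y must take values in {-1, +1}."""
    w = np.ones(len(y)) / len(y)           # uniform starting weights
    learners, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        # Weighted error rate epsilon_m
        eps = np.sum(w * (pred != y)) / np.sum(w)
        # Assumption: clip so the log below stays finite
        eps = np.clip(eps, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - eps) / eps)
        # Up-weight mistakes, down-weight correct samples, then normalize
        w = w * np.exp(-alpha * y * pred)
        w = w / np.sum(w)
        learners.append(stump)
        alphas.append(alpha)
    return learners, np.array(alphas)


def adaboost_predict(learners, alphas, X):
    """Sign of the weighted vote f(x) = sum_m alpha_m f_m(x)."""
    votes = sum(a * clf.predict(X) for a, clf in zip(alphas, learners))
    return np.where(votes >= 0, 1, -1)


# Hypothetical data: relabel the 0/1 classes as -1/+1
X, y01 = make_classification(n_samples=500, n_features=20, random_state=0)
y = 2 * y01 - 1

learners, alphas = adaboost_fit(X, y, n_rounds=50)
boosted_acc = np.mean(adaboost_predict(learners, alphas, X) == y)
stump_acc = np.mean(learners[0].predict(X) == y)
print(f"single stump training accuracy: {stump_acc:.3f}")
print(f"boosted training accuracy:      {boosted_acc:.3f}")
```

The first stump is fit with uniform weights, so its accuracy serves as a baseline for the boosted vote on the same training data.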

In more general additive modeling, we begin with a loss (cost) function $L(y,f(x))$.  Initialize $F_0(x)=0$, and for $1\le m \le M$:

Find the weight and model parameters that minimize the loss function:
$$
\alpha_m, \theta_m = \arg \min_{\alpha, \theta} \sum_{i=1}^N L \left( y_i, F_{m-1} (x_i) + \alpha f_m(x_i,\theta) \right)
$$
Then update the model:
$$
F_{m} (x)=F_{m-1} (x) + \alpha_m f_m(x,\theta_m)
$$
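
To connect this with the AdaBoost steps given earlier, here is a worked sketch. It assumes the exponential loss $L(y,F)=\exp(-yF)$ and weak learners with outputs in $\{-1,1\}$; the choice of loss is an assumption of this sketch and is not stated above. Writing $w_i=\exp(-y_i F_{m-1}(x_i))$, the objective for a fixed $f_m$ is
$$
\sum_{i=1}^N w_i \exp \left( -\alpha y_i f_m(x_i) \right) = e^{-\alpha} \sum_{y_i = f_m(x_i)} w_i + e^{\alpha} \sum_{y_i \neq f_m(x_i)} w_i \;\propto\; (1-\epsilon_m)\, e^{-\alpha} + \epsilon_m\, e^{\alpha}
$$
Setting the derivative with respect to $\alpha$ to zero gives
$$
\epsilon_m e^{\alpha} = (1-\epsilon_m)\, e^{-\alpha} \quad \Longrightarrow \quad \alpha_m=\frac{1}{2} \log \left[ \frac{1-\epsilon_m}{\epsilon_m} \right]
$$
and because $F_m = F_{m-1} + \alpha_m f_m$, the weights for the next round are
$$
\exp \left( -y_i F_m(x_i) \right) = w_i \exp \left( -\alpha_m y_i f_m(x_i) \right)
$$
which agrees with the $\alpha_m$ and the weight update used in the AdaBoost loop.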

Now we try the scikit-learn regression version of AdaBoost on a data set of the type used for our random forests and bagging.  The `score` method reports the coefficient of determination $R^2$ on the test set.

In [1]:
from sklearn.ensemble import AdaBoostRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Synthetic regression problem: 100 features, only 10 of them informative
X, Y = make_regression(n_samples=1000, n_features=100, n_informative=10)
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.30)

# Small ensemble: 10 boosted regressors
model = AdaBoostRegressor(n_estimators=10)
model.fit(x_train, y_train)
print(model.score(x_test, y_test))  # R^2 on the held-out 30%

# Large ensemble: up to 1000 boosted regressors
model = AdaBoostRegressor(n_estimators=1000)
model.fit(x_train, y_train)
print(model.score(x_test, y_test))

0.619019570569
0.778294370623
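
Rather than refitting for each ensemble size, one could track the test score after every boosting round with `staged_score`, which `AdaBoostRegressor` provides. The sketch below is a hedged variation on the cell above: the `random_state` values, the choice of 200 estimators, and the checkpoint list are assumptions added here, so its numbers will not match the output shown.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import train_test_split

# Same kind of data as above; random_state is an assumption for repeatability
X, Y = make_regression(n_samples=1000, n_features=100, n_informative=10,
                       random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.30, random_state=0)

model = AdaBoostRegressor(n_estimators=200, random_state=0)
model.fit(x_train, y_train)

# staged_score yields the test R^2 after each boosting iteration
scores = list(model.staged_score(x_test, y_test))

# Boosting can stop early, so only report checkpoints that were reached
for m in (1, 10, 50, len(scores)):
    if m <= len(scores):
        print(f"{m:4d} estimators: R^2 = {scores[m - 1]:.3f}")
```

The last staged value corresponds to the full fitted ensemble, so it should equal `model.score(x_test, y_test)`.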
