In [None]:
'''
AdaBoost Machine Learning:
Definition:
- AdaBoost, short for Adaptive Boosting, is an ensemble learning technique that combines
    multiple weak classifiers to create a strong classifier.
- It works by training weak classifiers sequentially on the training data,
    increasing the weights of misclassified instances so that subsequent classifiers focus more on difficult cases.
- The final model is a weighted vote of the individual classifiers, which typically improves accuracy and reduces bias.
Key Features:
- **Boosting Technique**: AdaBoost is a boosting algorithm that enhances the performance of weak classifiers.
- **Adaptive Weights**: It adjusts the weights of training instances based on the errors of previous classifiers, allowing it to focus on harder-to-classify examples.
- **Versatile**: Can be used with various weak classifiers, such as decision trees or linear models.
- **Sensitivity to Outliers**: Because AdaBoost keeps increasing the weights of misclassified instances, it can be sensitive to outliers and noisy labels, which may end up dominating later rounds.
- **Performance**: Often leads to improved accuracy and generalization compared to individual weak classifiers.

'''
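
For binary labels coded as -1 and +1, the weighted combination described in the definition is commonly written as below. The symbols are notation introduced here for illustration, not taken from the notebook: `h_t` is the t-th weak classifier, `\varepsilon_t` its weighted training error, `\alpha_t` its vote weight, and `T` the number of rounds.

```latex
H(x) = \operatorname{sign}\left( \sum_{t=1}^{T} \alpha_t \, h_t(x) \right),
\qquad
\alpha_t = \frac{1}{2} \ln \frac{1 - \varepsilon_t}{\varepsilon_t}
```

A classifier with weighted error below 0.5 gets a positive `\alpha_t`, and the weight grows as the error approaches zero.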

In [None]:
'''
How it works: Step-by-Step
1. **Initialization**: Start with equal weights for all training instances.
2. **Training Weak Classifiers**: Train a weak classifier on the weighted training data.
3. **Error Calculation**: Calculate the error rate of the weak classifier on the training data.
4. **Weight Adjustment**: Update the weights of the training instances:
   - Increase the weights of misclassified instances.
   - Decrease the weights of correctly classified instances.
5. **Classifier Weighting**: Assign a weight to the weak classifier based on its error rate (lower error means higher weight).
6. **Iteration**: Repeat steps 2-5 for a specified number of iterations or until a stopping criterion is met.
7. **Final Model**: Combine the weak classifiers into a final strong classifier using their assigned weights.
8. **Prediction**: For a new instance, each weak classifier votes, and the final prediction is based on the weighted votes of all classifiers.
9. **Evaluation**: Assess the performance of the final model using metrics like accuracy, precision, recall, or F1-score.

'''
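
The steps above can be sketched from scratch as follows. This is a minimal illustration of discrete AdaBoost, assuming binary labels coded as -1 and +1 and depth-1 decision trees (stumps) as weak classifiers. The function names `adaboost_fit` and `adaboost_predict` and the toy data are hypothetical, and this is not how scikit-learn implements it internally.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def adaboost_fit(X, y, n_rounds=20):
    """Fit discrete AdaBoost with decision stumps; y must contain -1 and +1."""
    n_samples = len(y)
    weights = np.full(n_samples, 1.0 / n_samples)  # step 1: equal weights
    stumps, alphas = [], []
    for _ in range(n_rounds):  # step 6: repeat for a fixed number of rounds
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=weights)  # step 2: train on weighted data
        pred = stump.predict(X)
        err = weights[pred != y].sum()  # step 3: weighted error (weights sum to 1)
        if err >= 0.5:  # no better than chance: stop early
            break
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * np.log((1.0 - err) / err)  # step 5: lower error, higher weight
        # step 4: y * pred is -1 for mistakes (weight goes up), +1 otherwise (goes down)
        weights = weights * np.exp(-alpha * y * pred)
        weights /= weights.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, np.array(alphas)


def adaboost_predict(X, stumps, alphas):
    """Steps 7-8: weighted vote of all weak classifiers."""
    scores = np.zeros(len(X))
    for stump, alpha in zip(stumps, alphas):
        scores += alpha * stump.predict(X)
    return np.where(scores >= 0, 1, -1)


# Toy data (assumption): a diagonal boundary that no single stump can match.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

stumps, alphas = adaboost_fit(X, y, n_rounds=20)
train_pred = adaboost_predict(X, stumps, alphas)
train_acc = np.mean(train_pred == y)  # step 9: evaluate (training data only here)
print(f"rounds used: {len(stumps)}, training accuracy: {train_acc:.3f}")
```

Each stump can only split on one feature, so the ensemble has to approximate the diagonal boundary by combining many axis-aligned splits through the weighted vote.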

In [None]:
'''
AdaBoost for Classification Steps:
1. **Data Preparation**: Load and preprocess the dataset, ensuring it is suitable for classification tasks.
2. **Model Initialization**: Initialize the AdaBoost classifier with a weak learner (e.g., decision tree).
3. **Training**: Fit the AdaBoost model to the training data, allowing it to learn from the weighted instances.
4. **Prediction**: Use the trained model to make predictions on the test data.
5. **Evaluation**: Evaluate the model's performance using metrics such as accuracy, precision, recall, or F1-score.
6. **Hyperparameter Tuning**: Optionally, perform hyperparameter tuning to optimize the model's performance.

Note: Here the weak learner, typically a decision tree, uses Gini impurity or entropy as its split criterion.

'''
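
The classification steps above might look like this with scikit-learn. This is a sketch on synthetic data from `make_classification`, which stands in for a real dataset. The hyperparameter values and the small search grid are illustrative assumptions, not tuned recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# 1. Data preparation: synthetic stand-in data (assumption, not the notebook's dataset)
X, y = make_classification(n_samples=500, n_features=10, n_informative=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# 2. Model initialization: a decision stump using the Gini criterion as weak learner.
# The weak learner is passed positionally because its keyword name differs
# between scikit-learn versions (base_estimator in older ones, estimator in newer ones).
stump = DecisionTreeClassifier(max_depth=1, criterion="gini")
clf = AdaBoostClassifier(stump, n_estimators=50, learning_rate=1.0, random_state=42)

# 3. Training and 4. prediction
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# 5. Evaluation
acc = accuracy_score(y_test, y_pred)
print(f"accuracy:  {acc:.3f}")
print(f"precision: {precision_score(y_test, y_pred):.3f}")
print(f"recall:    {recall_score(y_test, y_pred):.3f}")
print(f"F1-score:  {f1_score(y_test, y_pred):.3f}")

# 6. Optional hyperparameter tuning over a deliberately small grid
grid = GridSearchCV(
    AdaBoostClassifier(stump, random_state=42),
    param_grid={"n_estimators": [25, 50], "learning_rate": [0.5, 1.0]},
    cv=3,
    scoring="accuracy",
)
grid.fit(X_train, y_train)
print("best parameters:", grid.best_params_)
```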

In [None]:
'''
AdaBoost for Regression Steps:
1. **Data Preparation**: Load and preprocess the dataset, ensuring it is suitable for regression tasks.
2. **Model Initialization**: Initialize the AdaBoost regressor with a weak learner (e.g., decision tree).
3. **Training**: Fit the AdaBoost model to the training data, allowing it to learn from the weighted instances.
4. **Prediction**: Use the trained model to make predictions on the test data.
5. **Evaluation**: Evaluate the model's performance using metrics such as mean squared error (MSE), mean absolute error (MAE), or R-squared.
6. **Hyperparameter Tuning**: Optionally, perform hyperparameter tuning to optimize the model's performance.

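Note:
Here we use mean squared error as the split criterion for the weak learner, which is typically a decision tree.
The boosting step uses its own loss to reweight instances (scikit-learn's AdaBoostRegressor defaults to a linear loss).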
'''
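
The regression steps above could be sketched the same way. The data comes from `make_regression` as a stand-in, and the tree depth, number of estimators, and `loss="linear"` setting are assumptions chosen for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# 1. Data preparation: synthetic stand-in data (assumption, not the notebook's dataset)
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 2. Model initialization: a shallow regression tree as weak learner.
# The tree splits on squared error by default; `loss` controls how AdaBoost
# itself reweights instances after each round.
tree = DecisionTreeRegressor(max_depth=3)
reg = AdaBoostRegressor(tree, n_estimators=50, loss="linear", random_state=42)

# 3. Training and 4. prediction
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)

# 5. Evaluation
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MSE: {mse:.2f}")
print(f"MAE: {mae:.2f}")
print(f"R-squared: {r2:.3f}")
```

Hyperparameter tuning (step 6) can reuse `GridSearchCV` with a regression scorer such as `"neg_mean_squared_error"`.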

## AdaBoost Classification Implementation

```python

In [1]:
# Holiday Package Recommendation System
import numpy as np
import pandas as pd
from sklearn.ensemble import AdaBoostClassifier, AdaBoostRegressor
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, mean_squared_error, r2_score

In [2]:
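# Travel.xls is read with read_csv, so the file is assumed to hold comma-separated text despite its extension
df = pd.read_csv('Travel.xls')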
df.head()

Unnamed: 0,CustomerID,ProdTaken,Age,TypeofContact,CityTier,DurationOfPitch,Occupation,Gender,NumberOfPersonVisiting,NumberOfFollowups,ProductPitched,PreferredPropertyStar,MaritalStatus,NumberOfTrips,Passport,PitchSatisfactionScore,OwnCar,NumberOfChildrenVisiting,Designation,MonthlyIncome
0,200000,1,41.0,Self Enquiry,3,6.0,Salaried,Female,3,3.0,Deluxe,3.0,Single,1.0,1,2,1,0.0,Manager,20993.0
1,200001,0,49.0,Company Invited,1,14.0,Salaried,Male,3,4.0,Deluxe,4.0,Divorced,2.0,0,3,1,2.0,Manager,20130.0
2,200002,1,37.0,Self Enquiry,1,8.0,Free Lancer,Male,3,4.0,Basic,3.0,Single,7.0,1,3,0,0.0,Executive,17090.0
3,200003,0,33.0,Company Invited,1,9.0,Salaried,Female,2,3.0,Basic,3.0,Divorced,2.0,1,5,1,1.0,Executive,17909.0
4,200004,0,,Self Enquiry,1,8.0,Small Business,Male,2,3.0,Basic,4.0,Divorced,1.0,0,5,1,0.0,Executive,18468.0
