# Practical and Theoretical Exercises: SVMs, Decision Trees, and Ensemble Methods

This Jupyter Notebook provides hands-on exercises for Support Vector Machines (SVMs), Decision Trees, and Ensemble Methods (Bagging & Boosting) using Python and scikit-learn.

## Instructions:
1. Follow the provided examples to understand each concept.
2. Complete the exercises by filling in the missing code.
3. Answer the theoretical questions to deepen your understanding.
4. Run the code to check your answers.

## Theoretical Exercises
### Support Vector Machines (SVM)
1. What is the core idea behind the SVM?
2. What is a hyperplane?
3. What is a support vector?
4. What is the role of the kernel function in SVMs? Give some examples of kernel functions and describe when each one is typically used.
5. Explain how $c$-class problems (more than two classes) can be handled with SVMs.

### Decision Trees
1. How is a decision tree learned from data?
2. What is entropy in the context of decision trees? How can we compute it?  
3. What does information gain measure and how can it be computed?
4. What are two other common impurity measures besides entropy?
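
As a reference point for questions 2 and 3, the following is a minimal sketch of how entropy and information gain could be computed with NumPy for a single binary split. The helper names `entropy` and `information_gain` and the toy label array are assumptions made for illustration; they are not part of scikit-learn.

```python
import numpy as np


def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(parent) - weighted


# Toy example (hypothetical data): a perfectly balanced parent node
# that is split into two pure children.
parent = np.array([0, 0, 0, 0, 1, 1, 1, 1])
left, right = parent[:4], parent[4:]

print(entropy(parent))                        # 1.0 bit for a 50/50 class mix
print(information_gain(parent, left, right))  # 1.0: the split removes all uncertainty
```

A split that leaves the class proportions unchanged in both children would yield an information gain of zero.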

### Algorithm-Independent Methods
1. What is the main idea behind combining classifiers?
2. What is bagging, and how does it work?  
3. What is boosting, and how does AdaBoost work?

## Practical Tasks

In [None]:
# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier, GradientBoostingClassifier
from sklearn.metrics import classification_report
from sklearn.datasets import make_classification


### Task 1: Implement a function that generates the necessary data. After generating the features (X) and labels (y), split the data set into training and test sets.

In [None]:
# Generate synthetic dataset
def get_dataset():
    X, y = 'Todo'
    return 'Todo'
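
One possible way to approach Task 1 is sketched below, assuming `make_classification` for the synthetic data and `train_test_split` for the split. The function name `get_dataset_sketch` and every default value (sample count, feature count, test fraction, seed) are assumptions chosen for illustration, not requirements of the exercise.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split


def get_dataset_sketch(n_samples=1000, n_features=20, test_size=0.2, seed=42):
    """Hypothetical variant of get_dataset(); all defaults are assumptions."""
    # Binary classification problem with a few informative and redundant features
    X, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=5,
        n_redundant=2,
        random_state=seed,
    )
    # Stratify so that both splits keep roughly the same class proportions
    return train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )


X_train, X_test, y_train, y_test = get_dataset_sketch()
print(X_train.shape, X_test.shape)
```

Fixing `random_state` keeps the split reproducible, which makes the model comparisons in the later tasks easier to interpret.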

### Task 2: Generate a data set and train an SVM. How does your SVM perform on the corresponding test set?

In [None]:
# --- Support Vector Machines (SVM) ---
X_train, X_test, y_train, y_test = get_dataset()
# Todo: Data preprocessing?

# Initialize the model
svm_model = 'Todo'
# Train the model

# Test your model

# Extra: Try out different kernels
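
For comparison with your own solution, here is a minimal sketch of Task 2 under a few assumptions: the data are generated inline so the block runs on its own, features are standardized with `StandardScaler` inside a pipeline, and three kernels are tried with `C=1.0`. The variable names (`X_tr`, `svm_scores`, and so on) and all hyperparameters are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumed synthetic data; replace with the output of your own get_dataset()
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)

svm_scores = {}
for kernel in ("linear", "poly", "rbf"):
    # The scaler is fitted on the training data only, then applied to the test data
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel, C=1.0))
    model.fit(X_tr, y_tr)
    svm_scores[kernel] = model.score(X_te, y_te)  # mean accuracy on the test set
    print(f"{kernel}: test accuracy = {svm_scores[kernel]:.3f}")

# Per-class precision, recall, and F1 for the last model in the loop (RBF kernel)
print(classification_report(y_te, model.predict(X_te)))
```

SVMs are sensitive to feature scale, which is why the sketch standardizes the inputs; wrapping the scaler and the classifier in one pipeline avoids leaking test-set statistics into training.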

### Task 3: Train a Decision Tree on the data set you generated in Task 2. Compare its performance on the test set with that of the SVM.

In [None]:
# --- Decision Trees ---
# Initialize the model
dt_model = 'Todo'
# Train the model

# Test your model

# TODO: Experiment with different values of max_depth
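
The `max_depth` experiment could look roughly like the sketch below. It assumes inline synthetic data so that it runs independently, and the depth values, the seed, and the name `dt_scores` are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumed synthetic data; replace with the output of your own get_dataset()
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)

dt_scores = {}
for depth in (1, 3, 5, 10, None):  # None lets the tree grow until all leaves are pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=42)
    tree.fit(X_tr, y_tr)
    # Store (train accuracy, test accuracy) to make overfitting visible
    dt_scores[depth] = (tree.score(X_tr, y_tr), tree.score(X_te, y_te))
    print(f"max_depth={depth}: train={dt_scores[depth][0]:.3f}, test={dt_scores[depth][1]:.3f}")
```

Comparing training and test accuracy side by side shows where deeper trees start to memorize the training data instead of generalizing.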

### Task 4: Try out different Bagging and Boosting approaches with different estimators. Which setup works best for you?

In [None]:
# --- Bagging (Bootstrap Aggregating) ---
# Initialize the model
bagging_model = 'Todo'
# Train the model

# Test your model

# Extra: Try out different estimators

# --- Boosting (AdaBoost) ---
# Initialize the model
adaboost_model = 'Todo'
# Train the model

# Test your model

# Extra: Try out different estimators
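
A possible starting point for Task 4 is sketched below, assuming decision trees as base estimators for both ensembles: fully grown trees for bagging and depth-1 stumps for AdaBoost. The base estimator is passed positionally because its keyword name differs between scikit-learn versions (`base_estimator` in older releases, `estimator` in newer ones). All hyperparameters and the name `ensemble_scores` are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumed synthetic data; replace with the output of your own get_dataset()
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)

# Bagging: each tree is trained on a bootstrap sample; predictions are combined by voting
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=42)

# AdaBoost: weak learners (stumps) are trained in sequence, and each round
# gives more weight to the samples the previous learners misclassified
adaboost = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=50, random_state=42)

ensemble_scores = {}
for name, model in (("bagging", bagging), ("adaboost", adaboost)):
    model.fit(X_tr, y_tr)
    ensemble_scores[name] = model.score(X_te, y_te)  # mean accuracy on the test set
    print(f"{name}: test accuracy = {ensemble_scores[name]:.3f}")
```

Swapping in a different base estimator, for example an `SVC` for bagging, only requires changing the first argument, which makes it straightforward to compare setups.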