## Ensemble Techniques in Machine Learning (Beginner-Friendly)

### Q1. What is an ensemble technique in machine learning?
Ensemble techniques combine the predictions of multiple models to improve accuracy and reduce errors. Instead of relying on a single model, these methods aggregate a group of models, for example by voting or averaging, so that the mistakes of individual models tend to cancel out.
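
As a minimal sketch of the idea, the snippet below combines the outputs of three models by majority vote. The function name `majority_vote` and the labels are made up for illustration; real ensembles usually aggregate predictions from trained models in the same way.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the largest number of models."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical predictions from three models for the same input
votes = ["spam", "ham", "spam"]
print(majority_vote(votes))  # spam
```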

### Q2. Why are ensemble techniques used in machine learning?
They are used to increase accuracy, reduce overfitting, and make predictions more robust. Because different models tend to make different errors, combining them smooths out the mistakes of any single model.
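
The small simulation below illustrates why combining helps. It assumes each model's error is independent random noise, which is a simplification (errors of real models are usually correlated, so the gain is smaller in practice). All names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = 10.0
n_trials, n_models = 10_000, 25

# Each row is one trial; each column is one model's noisy prediction
predictions = true_value + rng.normal(0.0, 1.0, size=(n_trials, n_models))

single_model_var = predictions[:, 0].var()     # spread of one model alone
ensemble_var = predictions.mean(axis=1).var()  # spread of the averaged prediction

print(f"Variance of a single model:     {single_model_var:.3f}")
print(f"Variance of the 25-model mean:  {ensemble_var:.3f}")
```

Under the independence assumption, the averaged prediction varies far less from trial to trial than any single model does.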

### Q3. What is bagging?
Bagging (Bootstrap Aggregating) is an ensemble technique where multiple models are trained on different bootstrap samples of the data (random samples drawn with replacement), and their predictions are averaged (for regression) or voted on (for classification) to improve stability and accuracy.
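
Here is a rough sketch of bagging for regression using only NumPy. The base model (a degree-5 polynomial fit), the synthetic data, and the helper name `bagged_predict` are all assumptions chosen to keep the example short; in practice the base model is often a decision tree.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression data (illustrative only)
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, size=x.size)

def bagged_predict(x_train, y_train, x_new, n_models=50, degree=5):
    """Average the predictions of polynomial fits trained on bootstrap samples."""
    preds = []
    for _ in range(n_models):
        # Bootstrap sample: draw row indices with replacement
        idx = rng.integers(0, len(x_train), size=len(x_train))
        coeffs = np.polyfit(x_train[idx], y_train[idx], degree)
        preds.append(np.polyval(coeffs, x_new))
    # Aggregate: average across all models
    return np.mean(preds, axis=0)

x_new = np.array([0.25, 0.75])
print(bagged_predict(x, y, x_new))
```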

### Q4. What is boosting?
Boosting is an ensemble technique that builds models sequentially: each new model focuses on the errors made by the models before it, and the final prediction combines all of them, which makes it stronger than any single model in the sequence.
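
One way to make "learning from previous mistakes" concrete is a simplified form of gradient boosting for regression, sketched below: each round fits a one-split model (a "stump") to the current residuals and adds a fraction of it to the prediction. The helper names `fit_stump` and `boost`, the learning rate, and the synthetic data are assumptions for illustration, not a reference implementation.

```python
import numpy as np

def fit_stump(x, residuals):
    """Find the single threshold on x that minimises squared error."""
    best = None
    for t in np.unique(x)[:-1]:  # skip the max so the right side is never empty
        left, right = residuals[x <= t], residuals[x > t]
        err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or err < best[0]:
            best = (err, t, left.mean(), right.mean())
    return best[1:]  # threshold, left value, right value

def boost(x, y, n_rounds=50, learning_rate=0.1):
    """Sequentially fit stumps to the residuals of the current prediction."""
    prediction = np.full_like(y, y.mean(), dtype=float)
    stumps = []
    for _ in range(n_rounds):
        t, left_val, right_val = fit_stump(x, y - prediction)
        prediction += learning_rate * np.where(x <= t, left_val, right_val)
        stumps.append((t, left_val, right_val))
    return stumps, prediction

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 60)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, size=x.size)

stumps, fitted = boost(x, y)
mse_before = np.mean((y - y.mean()) ** 2)
mse_after = np.mean((y - fitted) ** 2)
print(f"Training MSE with the mean only: {mse_before:.3f}")
print(f"Training MSE after boosting:     {mse_after:.3f}")
```

Note that this only reports training error; a lower training error does not by itself show that the model generalises better.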

### Q5. What are the benefits of using ensemble techniques?
- Higher accuracy
- Reduced overfitting (especially with bagging)
- More stable predictions
- Good performance across many kinds of datasets

### Q6. Are ensemble techniques always better than individual models?
Not always. Ensembles add complexity, require more computation, and are harder to interpret. On many problems they are more accurate than a single model, but a well-tuned single model can match them, especially on small or simple datasets.

### Q7. How is the confidence interval calculated using bootstrap?
The bootstrap calculates confidence intervals by repeatedly resampling the data with replacement and computing the statistic of interest (like the mean) on each resample. The interval is then read off from the percentiles of these repeated computations.
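
Written as a formula, the percentile interval looks roughly like this. The symbols are notation introduced here for illustration: $B$ is the number of resamples, $\hat\theta^{*}_b$ is the statistic computed on resample $b$, $q_p$ is the $p$-th quantile, and $1-\alpha$ is the confidence level.

$$
\text{CI}_{1-\alpha} = \left[\, q_{\alpha/2}\left(\hat\theta^{*}_1, \dots, \hat\theta^{*}_B\right),\; q_{1-\alpha/2}\left(\hat\theta^{*}_1, \dots, \hat\theta^{*}_B\right) \right]
$$

For a 95% interval, $\alpha = 0.05$, so the bounds are the 2.5th and 97.5th percentiles of the resampled statistics.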

### Q8. How does bootstrap work, and what are the steps involved?
1. Draw a random sample from the dataset with replacement, the same size as the original dataset.
2. Calculate the desired statistic (mean, median, etc.) for that sample.
3. Repeat steps 1 and 2 many times (e.g., 1000 times), recording the statistic each time.
4. Compute the confidence interval using the percentiles of the recorded statistics.

### Q9. Bootstrap example: Estimating the 95% confidence interval for mean height
```python
import numpy as np

# Synthetic sample: 50 heights drawn from a normal distribution (mean 15 m, std 2 m)
np.random.seed(42)
data = np.random.normal(loc=15, scale=2, size=50)

def bootstrap_ci(data, num_samples=1000, confidence_level=95):
    """Percentile bootstrap confidence interval for the mean of `data`."""
    # Mean of each bootstrap resample (same size as data, drawn with replacement)
    sample_means = [
        np.mean(np.random.choice(data, size=len(data), replace=True))
        for _ in range(num_samples)
    ]
    # Cut equal tails from both ends, e.g. 2.5% each for a 95% interval
    tail = (100 - confidence_level) / 2
    lower_bound = np.percentile(sample_means, tail)
    upper_bound = np.percentile(sample_means, 100 - tail)
    return lower_bound, upper_bound

ci_lower, ci_upper = bootstrap_ci(data)
print(f"95% Confidence Interval for mean height: ({ci_lower:.2f}, {ci_upper:.2f}) meters")
```
