###[Q1.] What is an ensemble technique in machine learning?
#####[Ans]
An ensemble technique in machine learning combines the predictions of multiple individual models to produce a more robust and accurate final model. The goal is to leverage the strengths of each model while minimizing their weaknesses.

###[Q2.] Why are ensemble techniques used in machine learning?
#####[Ans]
Ensemble techniques are used because they:

1. Improve prediction accuracy.
2. Reduce the risk of overfitting by averaging out the variance of individual models.
3. Enhance model robustness by combining diverse models.
4. Help in handling complex datasets where individual models may fail.

###[Q3.] What is bagging?
#####[Ans]
Bagging (Bootstrap Aggregating) is an ensemble technique where:

1. Multiple models are trained on different bootstrapped (randomly sampled with replacement) subsets of the data.
2. The predictions of these models are aggregated (e.g., averaged for regression, majority voting for classification).

Example: Random Forest uses bagging with decision trees, and additionally considers only a random subset of features at each split.
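
The two bagging steps can be sketched in plain Python. This is a minimal illustration, not a production implementation: the helper names (`fit_stump`, `bagging_fit`, `bagging_predict`), the one-feature decision stump used as the base model, and the toy data are all assumptions made for this example.

```python
import random
from collections import Counter

def fit_stump(points):
    """Fit a 1-D decision stump: choose the threshold and orientation
    that make the fewest errors on the given (x, y) points."""
    best = None
    for threshold, _ in points:
        for left_label in (0, 1):
            errors = sum(
                (left_label if x <= threshold else 1 - left_label) != y
                for x, y in points
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left_label)
    _, threshold, left_label = best
    return lambda x: left_label if x <= threshold else 1 - left_label

def bagging_fit(points, n_models, rng):
    """Step 1: train each model on its own bootstrap sample."""
    models = []
    for _ in range(n_models):
        sample = rng.choices(points, k=len(points))  # sampling with replacement
        models.append(fit_stump(sample))
    return models

def bagging_predict(models, x):
    """Step 2: aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Toy data (made up): label is 1 when x > 5, with about 10% of labels flipped.
rng = random.Random(0)
train = []
for _ in range(200):
    x = rng.uniform(0, 10)
    y = int(x > 5)
    if rng.random() < 0.1:
        y = 1 - y
    train.append((x, y))

models = bagging_fit(train, n_models=25, rng=rng)
print(bagging_predict(models, 1.0), bagging_predict(models, 9.0))
```

An odd number of models is used so that a two-class majority vote cannot tie. For regression, the vote would be replaced by an average of the models' outputs.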

###[Q4.] What is boosting?
#####[Ans]
Boosting is an ensemble technique that:

- Trains models sequentially, where each model tries to correct the errors of its predecessor.
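- Focuses each new model on the cases the current ensemble gets wrong: AdaBoost assigns higher weights to misclassified samples, while Gradient Boosting fits each new model to the residual errors (negative gradients of the loss).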
- Combines the predictions of all models, often using weighted sums.

Examples: AdaBoost, Gradient Boosting.
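
As a rough sketch of the sequential reweighting idea, here is a small AdaBoost-style loop in plain Python. It assumes binary labels in {-1, +1}, a single numeric feature, and decision stumps as the weak learners; the function names and the toy data are invented for this example. Gradient Boosting works differently (it fits residuals instead of reweighting samples) and is not shown.

```python
import math

def stump_predict(stump, x):
    """A stump is (threshold, polarity): predict polarity if x <= threshold, else -polarity."""
    threshold, polarity = stump
    return polarity if x <= threshold else -polarity

def fit_weighted_stump(xs, ys, weights):
    """Return the stump with the lowest weighted training error, and that error."""
    best_stump, best_error = None, float("inf")
    for threshold in xs:
        for polarity in (-1, 1):
            stump = (threshold, polarity)
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if stump_predict(stump, x) != y
            )
            if error < best_error:
                best_stump, best_error = stump, error
    return best_stump, best_error

def adaboost_fit(xs, ys, n_rounds):
    n = len(xs)
    weights = [1.0 / n] * n   # start with uniform sample weights
    ensemble = []             # list of (alpha, stump) pairs
    for _ in range(n_rounds):
        stump, error = fit_weighted_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # weight of this model's vote
        ensemble.append((alpha, stump))
        # Raise the weights of misclassified samples, lower the rest, then renormalize.
        weights = [
            w * math.exp(-alpha * y * stump_predict(stump, x))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    """Combine all stumps with a weighted sum and take the sign."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy data (made up): the positive class is an interval, which no single stump can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [-1, -1, -1, 1, 1, 1, 1, -1, -1, -1]

ensemble = adaboost_fit(xs, ys, n_rounds=3)
predictions = [adaboost_predict(ensemble, x) for x in xs]
print(predictions == ys)
```

On this toy data the best single stump still misclassifies three points, while the weighted combination of three stumps fits the training labels.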

###[Q5.] What are the benefits of using ensemble techniques?
#####[Ans]
- Improved accuracy: Combining multiple models often yields better performance than individual models.
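- Reduced variance: Averaging several models smooths out the errors of any single model, which helps prevent overfitting.
- Robustness: Aggregated predictions are less sensitive to noise in the training data than a single model's predictions.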
- Adaptability: Effective for both classification and regression tasks.
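
The variance-reduction point can be made concrete with a standard identity. Assume, for illustration only, $M$ models $f_1, \dots, f_M$ whose predictions at a point $x$ each have variance $\sigma^2$ and pairwise correlation $\rho$ (these symbols are introduced here and do not appear elsewhere in these notes). The variance of their average is then:

$$
\operatorname{Var}\!\left(\frac{1}{M}\sum_{m=1}^{M} f_m(x)\right) = \rho\,\sigma^2 + \frac{1-\rho}{M}\,\sigma^2
$$

As $M$ grows, the second term shrinks toward zero, but the first term remains. This is why ensembles benefit most when the individual models are diverse (low $\rho$).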

###[Q6.] Are ensemble techniques always better than individual models?
#####[Ans]
Not always. Ensemble techniques:

- May not improve performance if the individual models are already near-optimal or make highly correlated errors.
- Can increase computational cost and complexity.
- Require careful tuning to avoid overfitting or underfitting.


###[Q7.] How is the confidence interval calculated using bootstrap?
#####[Ans]
Using bootstrap, the confidence interval is estimated by:

- Generating many bootstrap samples from the data (each drawn with replacement and the same size as the original sample).
- Calculating the statistic (e.g., mean) for each sample.
- Computing the percentiles (e.g., 2.5% and 97.5% for a 95% confidence interval) of the bootstrap statistics.
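
The percentile method described above might look like this using only the standard library. The `percentile` and `bootstrap_ci` helpers, their default arguments, and the data values are assumptions made for illustration.

```python
import random
import statistics

def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of an already sorted list."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

def bootstrap_ci(data, statistic=statistics.mean, n_resamples=5000,
                 confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the given statistic."""
    rng = random.Random(seed)
    # Compute the statistic on each bootstrap sample, then sort the results.
    stats = sorted(
        statistic(rng.choices(data, k=len(data))) for _ in range(n_resamples)
    )
    tail = (1 - confidence) / 2 * 100   # 2.5 for a 95% interval
    return percentile(stats, tail), percentile(stats, 100 - tail)

# Made-up measurements, for illustration only.
data = [12.1, 14.3, 15.0, 15.8, 13.9, 16.2, 14.7, 15.5, 13.2, 16.9]
low, high = bootstrap_ci(data)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

Passing a different function as `statistic` (for example `statistics.median`) gives an interval for that statistic instead.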

###[Q8.] How does bootstrap work and what are the steps involved in bootstrap?
#####[Ans]
Bootstrap is a resampling technique to estimate the sampling distribution of a statistic.

Steps:

1. Randomly sample the data with replacement to create a bootstrap sample.
2. Compute the desired statistic (e.g., mean, variance) for the sample.
3. Repeat steps 1–2 multiple times to build a distribution of the statistic.
4. Use the bootstrap distribution to estimate parameters, confidence intervals, or other metrics.
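
The four steps above can be traced in a short sketch. Here the statistic is the median and the final step estimates its standard error; the function name `bootstrap_distribution` and the data values are hypothetical.

```python
import random
import statistics

def bootstrap_distribution(data, statistic, n_resamples, rng):
    """Build the bootstrap distribution of a statistic."""
    distribution = []
    for _ in range(n_resamples):                   # step 3: repeat many times
        resample = rng.choices(data, k=len(data))  # step 1: sample with replacement
        distribution.append(statistic(resample))   # step 2: compute the statistic
    return distribution

# Made-up measurements, for illustration only.
data = [3.1, 4.7, 5.0, 5.2, 5.9, 6.3, 6.8, 7.4, 9.6]
rng = random.Random(1)
medians = bootstrap_distribution(data, statistics.median, 2000, rng)

# Step 4: the spread of the bootstrap distribution estimates the standard error.
standard_error = statistics.stdev(medians)
print(f"Sample median: {statistics.median(data)}, bootstrap SE: {standard_error:.3f}")
```

The median is used here because, unlike the mean, it has no simple closed-form standard error, which is the kind of situation where the bootstrap is most useful.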

###[Q9.] A researcher wants to estimate the mean height of a population of trees. They measure the height of a sample of 50 trees and obtain a mean height of 15 meters and a standard deviation of 2 meters. Use bootstrap to estimate the 95% confidence interval for the population mean height.
#####[Ans]

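```python
import numpy as np

# Summary statistics reported in the question
sample_size = 50
sample_mean = 15
sample_std = 2

# The raw measurements are not given, so simulate a stand-in sample from a
# normal distribution with the reported mean and standard deviation.
# (Assumption: heights are roughly normal. The simulated sample's own mean
# will differ slightly from 15.)
np.random.seed(42)
original_sample = np.random.normal(loc=sample_mean, scale=sample_std, size=sample_size)

num_bootstrap_samples = 10000
bootstrap_means = []

# Resample with replacement and record the mean of each bootstrap sample
for _ in range(num_bootstrap_samples):
    bootstrap_sample = np.random.choice(original_sample, size=sample_size, replace=True)
    bootstrap_means.append(np.mean(bootstrap_sample))

# Percentile method: the middle 95% of the bootstrap means
lower_bound = np.percentile(bootstrap_means, 2.5)
upper_bound = np.percentile(bootstrap_means, 97.5)

print(f"95% Confidence Interval for Population Mean Height: ({lower_bound:.2f}, {upper_bound:.2f})")
```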


95% Confidence Interval for Population Mean Height: (14.03, 15.06)
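
The interval printed above has its midpoint near 14.55 rather than 15, because the bootstrap resamples a simulated draw and not the researcher's actual measurements. As a cross-check that uses only the reported summary statistics, a normal-approximation interval can be computed directly. This is a sketch under the assumption that the sample mean is approximately normally distributed, and it is not a bootstrap result.

```python
from math import sqrt
from statistics import NormalDist

n, mean, sd = 50, 15.0, 2.0

z = NormalDist().inv_cdf(0.975)   # two-sided 95% critical value, about 1.96
margin = z * sd / sqrt(n)         # critical value times the standard error
low, high = mean - margin, mean + margin

print(f"Normal-approximation 95% CI: ({low:.2f}, {high:.2f})")
```

This gives roughly (14.45, 15.55). Its width is close to that of the bootstrap interval, but it is centred on the reported mean of 15 meters.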
