# Ensemble Techniques and Their Types-1

#### Q1. What is an ensemble technique in machine learning?

Ensemble techniques in machine learning involve combining predictions from multiple individual models to create a stronger, more robust model. These techniques leverage the wisdom of the crowd to improve predictive accuracy, generalization, and robustness.
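
The "wisdom of the crowd" effect can be put in numbers with a small sketch. It assumes a binary classification task, an odd number of models, and (a strong assumption that rarely holds exactly in practice) that the models make independent errors with the same accuracy. The function name `majority_vote_accuracy` is made up for this illustration.

```python
from math import comb

def majority_vote_accuracy(n_models: int, p_correct: float) -> float:
    """Probability that a strict majority of n independent models is correct."""
    majority = n_models // 2 + 1
    return sum(
        comb(n_models, k) * p_correct**k * (1 - p_correct) ** (n_models - k)
        for k in range(majority, n_models + 1)
    )

# Three independent models that are each right 70% of the time
print(round(majority_vote_accuracy(3, 0.7), 3))  # 0.784
```

Under these assumptions the vote beats any single model, and the gain grows as more independent models are added.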

#### Q2. Why are ensemble techniques used in machine learning?

Ensemble techniques are used in machine learning for several reasons:
* They can improve model performance by reducing variance (as bagging mainly does) or bias (as boosting mainly does).
* They enhance generalization and reduce overfitting.
* They handle complex relationships in data by combining multiple perspectives.
* They are effective in capturing different patterns in the data.
* They provide more robust predictions by reducing the impact of individual model errors.

#### Q3. What is bagging?

Bagging (Bootstrap Aggregating) is an ensemble technique that draws multiple bootstrap samples from the training data (random sampling with replacement) and trains a separate model on each one. The final prediction is typically the average of the individual predictions for regression, or a majority vote for classification.
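
A minimal sketch of bagging for regression, using only the standard library. The base model here is a 1-nearest-neighbour regressor, chosen only because it is short and high-variance; the helper names (`fit_1nn`, `bagging_fit`, `bagging_predict`) and the toy data are assumptions made for illustration, not part of any library.

```python
import random
from statistics import mean

def fit_1nn(xs, ys):
    """Return a 1-nearest-neighbour regressor (a deliberately high-variance base model)."""
    points = list(zip(xs, ys))
    def predict(x):
        return min(points, key=lambda p: abs(p[0] - x))[1]
    return predict

def bagging_fit(xs, ys, n_models=50, seed=0):
    """Train one base model per bootstrap sample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample indices with replacement
        models.append(fit_1nn([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Average the individual predictions (use a majority vote for classification)."""
    return mean(m(x) for m in models)

# Toy data (made up): y = 2x plus Gaussian noise
data_rng = random.Random(1)
xs = [i / 10 for i in range(50)]
ys = [2 * x + data_rng.gauss(0, 1) for x in xs]

models = bagging_fit(xs, ys)
print(bagging_predict(models, 2.5))
```

Because each bootstrap sample leaves out some points and repeats others, the individual models disagree, and averaging them smooths out much of that disagreement.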

#### Q4. What is boosting?

Boosting is an ensemble technique that trains multiple weak learners (usually simple models) sequentially and combines them into a strong learner. In AdaBoost-style boosting, instances misclassified by the previous models receive higher weight, so subsequent models focus on the difficult examples; gradient boosting achieves a similar effect by fitting each new model to the errors of the current ensemble.
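
The reweighting idea can be sketched with a small AdaBoost-style loop over one-dimensional decision stumps. This is one specific form of boosting, written with the standard library only; labels are assumed to be +1 or -1, and the function names and toy data are chosen for illustration.

```python
import math

def stump_predict(x, threshold, polarity):
    return polarity if x >= threshold else -polarity

def fit_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost_fit(xs, ys, n_rounds=3):
    n = len(xs)
    weights = [1 / n] * n                       # start with uniform weights
    ensemble = []
    for _ in range(n_rounds):
        err, threshold, polarity = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err) # vote strength of this stump
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified points, decrease the rest
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data: no single threshold separates the two classes
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost_fit(xs, ys, n_rounds=3)
print([adaboost_predict(ensemble, x) for x in xs])  # matches ys on this toy data
```

The best single stump misclassifies three of the ten points here, while the weighted vote of three stumps classifies all ten training points correctly.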

#### Q5. What are the benefits of using ensemble techniques?

The benefits of using ensemble techniques include:
* Improved predictive accuracy and generalization.
* Better handling of complex data patterns.
* Enhanced model robustness.
* Reduced overfitting.
* Improved model stability.
* Increased resistance to noisy data.

#### Q6. Are ensemble techniques always better than individual models?

While ensemble techniques often outperform individual models, they are not always guaranteed to do so. The effectiveness of an ensemble depends on various factors, including the quality of base models, diversity among the models, and the nature of the data. In some cases, well-tuned individual models may perform comparably or even better than ensembles.
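
The role of diversity can be made concrete with the standard formula for the variance of an average of correlated predictors. The sketch below assumes `n_models` equally weighted models that share the same variance `sigma2` and the same pairwise correlation `rho`; the function name is hypothetical.

```python
def ensemble_variance(sigma2: float, rho: float, n_models: int) -> float:
    """Variance of the mean of n predictors with common variance and pairwise correlation.

    Var(mean) = rho * sigma2 + (1 - rho) * sigma2 / n
    """
    return rho * sigma2 + (1 - rho) * sigma2 / n_models

print(ensemble_variance(1.0, 0.0, 10))  # uncorrelated models: variance drops to 0.1
print(ensemble_variance(1.0, 1.0, 10))  # identical models: variance stays at 1.0
```

When the base models are highly correlated, adding more of them buys almost nothing, which is one reason a well-tuned single model can match an ensemble.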

#### Q7. How is the confidence interval calculated using bootstrap?

To calculate a confidence interval using bootstrap, we perform the following steps:
1. Sample the data with replacement (bootstrapping) to create multiple resampled datasets.
2. Compute the statistic (e.g., mean) of interest for each resampled dataset.
3. Calculate the lower and upper percentiles (e.g., 2.5th and 97.5th percentiles for a 95% confidence interval) of the distribution of the statistic.
4. The confidence interval is defined by these lower and upper percentiles.
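
The four steps above can be sketched as a small helper. This is a minimal percentile-bootstrap implementation using only the standard library; the function name, its defaults, and the sample data are assumptions for illustration, and the percentiles are taken as simple order statistics without interpolation.

```python
import random
from statistics import mean

def bootstrap_percentile_ci(data, statistic=mean, n_resamples=2000,
                            confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for `statistic`."""
    rng = random.Random(seed)
    n = len(data)
    # Steps 1 and 2: resample with replacement and compute the statistic each time
    stats = sorted(statistic(rng.choices(data, k=n)) for _ in range(n_resamples))
    # Steps 3 and 4: read off the lower and upper percentiles
    alpha = 1 - confidence
    lower_idx = round((alpha / 2) * n_resamples)
    upper_idx = round((1 - alpha / 2) * n_resamples) - 1
    return stats[lower_idx], stats[upper_idx]

# Made-up measurements
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.8, 2.7, 4.4, 3.1, 3.6]
low, high = bootstrap_percentile_ci(data)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

Passing a different function as `statistic` (for example `statistics.median`) gives an interval for that statistic instead.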

#### Q8. How does bootstrap work, and what are the steps involved?

Bootstrap is a resampling technique used for estimating the sampling distribution of a statistic. The steps involved in bootstrap are as follows:
1. Randomly draw, with replacement, a sample of the same size as the original dataset. This creates a resampled dataset.
2. Calculate the statistic of interest (e.g., mean, variance) for the resampled dataset.
3. Repeat steps 1 and 2 a large number of times (typically thousands of iterations) to create a distribution of the statistic.
4. Use the distribution to estimate properties of the population, such as confidence intervals or standard errors.
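
As an illustration of step 4, the sketch below estimates a standard error from the bootstrap distribution and compares it with the textbook formula for the mean. It uses only the standard library; the helper name `bootstrap_distribution` and the data are made up for this example.

```python
import random
from statistics import mean, stdev

def bootstrap_distribution(data, statistic, n_resamples=2000, seed=0):
    """Return the statistic computed on each of `n_resamples` bootstrap samples."""
    rng = random.Random(seed)
    n = len(data)
    return [statistic(rng.choices(data, k=n)) for _ in range(n_resamples)]

# Made-up measurements
data = [12.1, 9.8, 11.4, 10.6, 13.2, 10.1, 11.9, 9.5, 12.7, 10.9, 11.2, 10.4]

boot_means = bootstrap_distribution(data, mean)
boot_se = stdev(boot_means)                    # spread of the bootstrap means
analytic_se = stdev(data) / len(data) ** 0.5   # s / sqrt(n), for comparison

print(f"bootstrap SE: {boot_se:.3f}, analytic SE: {analytic_se:.3f}")
```

For the mean the two estimates should be close. The practical value of the bootstrap is that the same code works for statistics with no simple standard-error formula, such as the median.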

#### Q9. A researcher wants to estimate the mean height of a population of trees. They measure the height of a sample of 50 trees and obtain a mean height of 15 meters and a standard deviation of 2 meters. Use bootstrap to estimate the 95% confidence interval for the population mean height.

In [1]:
import numpy as np

rng = np.random.default_rng(42)   # Seeded generator so the run is reproducible
# Only summary statistics are given, so simulate a stand-in sample of 50 heights
# (assumption: roughly normal) and rescale it to have mean 15 m and sample SD 2 m exactly
raw = rng.normal(size=50)
sample_heights = 15 + 2 * (raw - raw.mean()) / raw.std(ddof=1)
n_iterations = 10000                       # Number of bootstrap iterations
bs_means = np.zeros(n_iterations)          # Array to store the bootstrap sample means
for i in range(n_iterations):
    # Generate a bootstrap sample by drawing with replacement from the original sample
    bs_sample = rng.choice(sample_heights, size=len(sample_heights), replace=True)
    bs_means[i] = np.mean(bs_sample)       # Store the mean of this bootstrap sample
confidence_interval = np.percentile(bs_means, [2.5, 97.5])   # 2.5th and 97.5th percentiles
print(f"95% Confidence Interval for Mean Height: {confidence_interval}")

The exact endpoints depend on the simulated sample and the seed, but the interval should come out close to [14.45, 15.55] meters.
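
As a cross-check, the interval can also be computed directly from the summary statistics with the normal approximation, mean ± z · s/√n. This sketch assumes the sampling distribution of the mean is approximately normal and uses a z quantile rather than a t quantile, which would give a slightly wider interval for n = 50.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, sample_sd = 50, 15.0, 2.0

z = NormalDist().inv_cdf(0.975)          # about 1.96 for a 95% interval
margin = z * sample_sd / sqrt(n)         # z times the standard error of the mean
lower, upper = sample_mean - margin, sample_mean + margin

print(f"{lower:.2f} to {upper:.2f} meters")  # 14.45 to 15.55 meters
```

A bootstrap interval built from a sample with this mean and standard deviation should land close to these values.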
