# Q1. What is an ensemble technique in machine learning?
# An ensemble technique is a machine learning method that combines multiple models to improve overall performance.
# It aggregates the predictions of several base models, which often yields better predictions than any single base model.

# Q2. Why are ensemble techniques used in machine learning?
# Ensemble techniques are used to:
# - Improve accuracy by reducing variance (bagging) or bias (boosting).
# - Handle noisy data and imbalanced datasets more effectively.
# - Provide more stable and reliable predictions by leveraging multiple models.

# Q3. What is bagging?
# Bagging (Bootstrap Aggregating) trains multiple models, each on a bootstrap sample of the training data,
# i.e. a sample drawn with replacement. The predictions from all models are then aggregated
# (e.g., majority voting for classification, averaging for regression).
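
# Illustrative sketch of bagging for regression (not a definitive implementation).
# Assumptions: NumPy only; the toy sine data, the degree-5 polynomial base model,
# and the name bagged_predict are hypothetical choices made for this example.
import numpy as np

def bagged_predict(x_train, y_train, x_new, n_models=25, degree=5, seed=0):
    """Average the predictions of polynomial models fit on bootstrap samples."""
    rng = np.random.default_rng(seed)
    n = len(x_train)
    predictions = np.empty((n_models, len(x_new)))
    for m in range(n_models):
        idx = rng.integers(0, n, size=n)  # row indices sampled with replacement
        coeffs = np.polyfit(x_train[idx], y_train[idx], degree)  # fit one base model
        predictions[m] = np.polyval(coeffs, x_new)
    return predictions.mean(axis=0)  # aggregate by averaging (regression)

# Toy data: a noisy sine curve (made up for the demonstration)
bag_rng = np.random.default_rng(1)
bag_x = np.linspace(0, 3, 40)
bag_y = np.sin(bag_x) + bag_rng.normal(0, 0.2, size=bag_x.size)
bag_pred = bagged_predict(bag_x, bag_y, np.array([0.5, 1.5, 2.5]))
print("Bagged predictions at x = 0.5, 1.5, 2.5:", bag_pred)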

# Q4. What is boosting?
# Boosting trains multiple models sequentially, where each new model tries to correct the errors of the models before it.
# The final prediction is made by combining the predictions of all models, usually with a weighted sum.
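
# Illustrative sketch of boosting for regression with squared error.
# Each decision stump is fit to the residuals left by the stumps before it, and the
# final prediction is a weighted sum of all stumps (the weight is the learning rate).
# Assumptions: NumPy only; fit_stump, stump_predict, boost, the learning rate of 0.1
# and the toy data are hypothetical choices made for this example.
import numpy as np

def fit_stump(x, residual):
    """Return (threshold, left_value, right_value) minimising squared error."""
    best = None
    for t in np.unique(x)[:-1]:  # candidate thresholds; the right side is never empty
        left, right = residual[x <= t], residual[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]

def stump_predict(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

def boost(x, y, n_rounds=50, learning_rate=0.1):
    """Fit stumps sequentially on residuals; return the training predictions."""
    prediction = np.full_like(y, y.mean(), dtype=float)  # start from a constant model
    for _ in range(n_rounds):
        stump = fit_stump(x, y - prediction)  # target the current errors
        prediction += learning_rate * stump_predict(stump, x)
    return prediction

# Toy data: a noisy sine curve (made up for the demonstration)
boost_rng = np.random.default_rng(2)
boost_x = np.linspace(0, 3, 60)
boost_y = np.sin(boost_x) + boost_rng.normal(0, 0.2, size=boost_x.size)
boost_mse_before = np.mean((boost_y - boost_y.mean()) ** 2)
boost_mse_after = np.mean((boost_y - boost(boost_x, boost_y)) ** 2)
print(f"Training MSE before boosting: {boost_mse_before:.4f}, after: {boost_mse_after:.4f}")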

# Q5. What are the benefits of using ensemble techniques?
# Benefits of ensemble techniques include:
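# - Reduced variance (e.g., bagging) and reduced bias (e.g., boosting).
# - Better handling of overfitting and underfitting.
# - Improved model accuracy.
# - Greater robustness to noise and outliers, mainly for averaging methods such as bagging
#   (boosting can be sensitive to noisy labels).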

# Q6. Are ensemble techniques always better than individual models?
# Not always. While ensemble methods tend to improve performance by combining multiple models,
# they are computationally expensive and may not always outperform individual models, especially when those models are already strong.

# Q7. How is the confidence interval calculated using bootstrap?
# Confidence intervals are calculated by:
# 1. Resampling the data with replacement to generate multiple bootstrap samples.
# 2. Calculating the statistic of interest (e.g., mean, median) for each sample.
# 3. Sorting the statistics and taking the percentiles that bound the desired interval
#    (e.g., the 2.5th and 97.5th percentiles for a 95% confidence interval).

# Q8. How does bootstrap work and what are the steps involved in bootstrap?
# Steps involved in bootstrap:
# 1. Sample with replacement from the original dataset to create bootstrap samples.
# 2. Compute the statistic of interest for each bootstrap sample.
# 3. Repeat the process multiple times to create a distribution of the statistic.
# 4. Calculate the confidence interval by taking the appropriate percentiles of the distribution.
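
# The four steps above, sketched as a reusable helper (simple percentile method).
# Assumptions: NumPy only; bootstrap_ci, its parameter names and the demo values
# are hypothetical and made up for this illustration.
import numpy as np

def bootstrap_ci(data, statistic=np.mean, n_resamples=5000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for an arbitrary statistic."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    # Steps 1-3: resample with replacement and compute the statistic each time
    stats = np.array([
        statistic(rng.choice(data, size=data.size, replace=True))
        for _ in range(n_resamples)
    ])
    # Step 4: take symmetric percentiles of the bootstrap distribution
    alpha = (1.0 - confidence) / 2.0
    return np.percentile(stats, [100 * alpha, 100 * (1 - alpha)])

# Usage: a 95% interval for the median of ten made-up measurements
demo_values = np.array([2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6])
median_low, median_high = bootstrap_ci(demo_values, statistic=np.median)
print(f"Bootstrap 95% CI for the median: ({median_low:.2f}, {median_high:.2f})")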

# Q9. A researcher wants to estimate the mean height of a population of trees.
# The mean height of the sample is 15 meters, the standard deviation is 2 meters,
# and the sample size is 50. Use bootstrap to estimate the 95% confidence interval for the population mean height.

import numpy as np

# Given data
sample_mean = 15  # Mean height of the sample
sample_std = 2    # Standard deviation of the sample
sample_size = 50  # Number of trees in the sample

# Only summary statistics are given, so simulate a stand-in sample.
# Assumption: heights are roughly normally distributed.
rng = np.random.default_rng(42)  # fixed seed so the result is reproducible
sample_data = rng.normal(sample_mean, sample_std, sample_size)

# Rescale so the simulated sample has exactly the stated mean and standard deviation
sample_data = sample_mean + sample_std * (sample_data - sample_data.mean()) / sample_data.std(ddof=1)

# Bootstrap procedure
n_iterations = 10000  # Number of bootstrap iterations

# Draw all bootstrap samples at once (one per row) and compute the mean of each
bootstrap_samples = rng.choice(sample_data, size=(n_iterations, sample_size), replace=True)
bootstrap_means = bootstrap_samples.mean(axis=1)

# Calculate the 95% confidence interval
lower_bound = np.percentile(bootstrap_means, 2.5)
upper_bound = np.percentile(bootstrap_means, 97.5)

# Output the results, rounded to two decimals (heights are in meters)
print(f"Bootstrap 95% Confidence Interval for the Mean Height: ({lower_bound:.2f}, {upper_bound:.2f}) meters")
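
# Sanity check (illustrative): the normal-approximation interval mean +/- z * s / sqrt(n),
# computed directly from the summary statistics in Q9. A bootstrap interval for a sample
# of this size is expected to have a similar width.
# Assumptions: normal_approx_ci is a hypothetical helper name; z = 1.96 is the usual
# two-sided 95% critical value of the standard normal distribution.
import numpy as np

def normal_approx_ci(mean, std, n, z=1.96):
    """Confidence interval for the mean using the normal approximation."""
    margin = z * std / np.sqrt(n)  # z times the standard error of the mean
    return mean - margin, mean + margin

approx_low, approx_high = normal_approx_ci(15, 2, 50)
print(f"Normal-approximation 95% CI: ({approx_low:.2f}, {approx_high:.2f}) meters")  # (14.45, 15.55)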
