# Q1. What is Random Forest Regressor?
# A Random Forest Regressor is an ensemble learning method that uses multiple decision trees to make predictions for regression tasks. Each tree is built on a different subset of the training data and features, and the forest combines the trees' predictions into a single aggregated prediction. This approach aims to improve on the predictive performance and robustness of a single decision tree.

# Q2. How does Random Forest Regressor reduce the risk of overfitting?
# Random Forest Regressor reduces the risk of overfitting by:
# - Training multiple decision trees on different bootstrap samples (random subsets with replacement) of the training data, which introduces diversity among the trees.
# - Using random subsets of features for each split in the decision trees, which further diversifies the trees and prevents them from being overly dependent on any particular feature.
# The aggregation of predictions from multiple diverse trees helps to average out the errors and reduces the likelihood of overfitting to the training data.
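
# The two sources of randomness described above can be sketched by hand.
# This is a minimal illustration, not scikit-learn's internal implementation:
# the synthetic data from make_regression, the tree count, and every bag_* name
# are assumptions made for this example.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

bag_X, bag_y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
bag_rng = np.random.default_rng(0)

bag_trees = []
for i in range(25):
    # Bootstrap sample: draw as many row indices as there are rows, with replacement.
    idx = bag_rng.integers(0, len(bag_X), size=len(bag_X))
    # max_features="sqrt" makes each split consider only a random subset of features.
    tree = DecisionTreeRegressor(max_features="sqrt", random_state=i)
    tree.fit(bag_X[idx], bag_y[idx])
    bag_trees.append(tree)

# Aggregate by averaging the per-tree predictions.
bag_all_preds = np.stack([t.predict(bag_X) for t in bag_trees])
bag_pred = bag_all_preds.mean(axis=0)
print("Per-tree predictions:", bag_all_preds.shape, "-> averaged:", bag_pred.shape)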

# Q3. How does Random Forest Regressor aggregate the predictions of multiple decision trees?
# In a Random Forest Regressor, the predictions from multiple decision trees are aggregated by averaging their individual predictions. Each decision tree in the ensemble makes a prediction for a given input, and the final prediction of the random forest is the mean of these individual predictions.
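
# The averaging step can be checked against a fitted forest. A small sketch,
# assuming synthetic data and hypothetical avg_* names: the fitted trees are
# exposed in the estimators_ attribute, and the mean of their predictions is
# compared with what the forest itself returns.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

avg_X, avg_y = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=1)
avg_forest = RandomForestRegressor(n_estimators=20, random_state=1).fit(avg_X, avg_y)

# Average the individual trees' predictions manually.
avg_manual = np.mean([t.predict(avg_X) for t in avg_forest.estimators_], axis=0)
avg_matches = np.allclose(avg_manual, avg_forest.predict(avg_X))
print("Manual mean matches forest.predict:", avg_matches)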

# Q4. What are the hyperparameters of Random Forest Regressor?
# Some key hyperparameters of Random Forest Regressor include:
# - `n_estimators`: The number of decision trees in the forest.
# - `max_depth`: The maximum depth of each decision tree.
# - `min_samples_split`: The minimum number of samples required to split an internal node.
# - `min_samples_leaf`: The minimum number of samples required to be at a leaf node.
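# - `max_features`: The number (or fraction) of features to consider when looking for the best split. In scikit-learn 1.1 and later, the regressor defaults to 1.0 (all features), so a smaller value such as "sqrt" must be set explicitly to get random feature subsets.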
# - `bootstrap`: Whether bootstrap samples are used when building trees.
# - `random_state`: Seed that controls the bootstrap sampling and the feature selection at each split, so that results are reproducible.
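
# A short sketch of setting these hyperparameters explicitly. The values below
# are arbitrary choices for illustration, not recommended settings, and the
# hp_* names and synthetic data are assumptions made for this example.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

hp_X, hp_y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=2)
hp_forest = RandomForestRegressor(
    n_estimators=50,        # number of trees
    max_depth=6,            # cap on the depth of each tree
    min_samples_split=4,    # a node needs at least 4 samples to be split
    min_samples_leaf=2,     # every leaf keeps at least 2 samples
    max_features="sqrt",    # features considered per split
    bootstrap=True,         # fit each tree on a bootstrap sample
    random_state=2,         # reproducible results
)
hp_forest.fit(hp_X, hp_y)

hp_deepest = max(t.get_depth() for t in hp_forest.estimators_)
print("Trees:", len(hp_forest.estimators_), "deepest tree:", hp_deepest)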

# Q5. What is the difference between Random Forest Regressor and Decision Tree Regressor?
# - A Decision Tree Regressor uses a single decision tree to make predictions, which can lead to overfitting if the tree is too deep or if the data is noisy.
# - A Random Forest Regressor uses multiple decision trees trained on different subsets of the data and features, and aggregates their predictions to improve accuracy and robustness. This ensemble approach reduces the risk of overfitting and generally provides better performance than a single decision tree.
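
# The difference can be seen by fitting both models on the same noisy data and
# comparing train and test scores. A sketch on synthetic data with hypothetical
# cmp_* names; the exact numbers depend on the data and the seed.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

cmp_X, cmp_y = make_regression(n_samples=500, n_features=10, noise=20.0, random_state=3)
cmp_X_tr, cmp_X_te, cmp_y_tr, cmp_y_te = train_test_split(
    cmp_X, cmp_y, test_size=0.25, random_state=3
)

# An unconstrained single tree versus a forest of 100 trees.
cmp_tree = DecisionTreeRegressor(random_state=3).fit(cmp_X_tr, cmp_y_tr)
cmp_forest = RandomForestRegressor(n_estimators=100, random_state=3).fit(cmp_X_tr, cmp_y_tr)

# A large gap between train and test R^2 is a sign of overfitting.
for name, model in [("tree", cmp_tree), ("forest", cmp_forest)]:
    train_r2 = model.score(cmp_X_tr, cmp_y_tr)
    test_r2 = model.score(cmp_X_te, cmp_y_te)
    print(f"{name}: train R^2 = {train_r2:.3f}, test R^2 = {test_r2:.3f}")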

# Q6. What are the advantages and disadvantages of Random Forest Regressor?
# Advantages:
# - Reduces overfitting compared to individual decision trees.
# - Provides robust and accurate predictions.
# - Handles both numerical and categorical features (in scikit-learn, categorical features must first be encoded as numbers).
# - Works well with large datasets and high-dimensional data.

# Disadvantages:
# - Can be computationally intensive and memory-consuming, especially with a large number of trees.
# - The model is less interpretable than a single decision tree.
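
# On categorical features: scikit-learn's forest expects numeric input, so a
# categorical column is usually encoded first. A minimal sketch using one-hot
# encoding; the toy "size"/"city"/"price" data and the cat_* names are made up
# for illustration only.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import OneHotEncoder

cat_rng = np.random.default_rng(0)
cat_n = 200
cat_size = cat_rng.uniform(50, 150, size=cat_n)                   # numeric feature
cat_city = cat_rng.choice(["north", "south", "east"], size=cat_n)  # categorical feature
cat_offset = {"north": 10.0, "south": -5.0, "east": 0.0}
cat_price = (
    2.0 * cat_size
    + np.array([cat_offset[c] for c in cat_city])
    + cat_rng.normal(0.0, 1.0, size=cat_n)
)

# One-hot encode the categorical column, then stack it next to the numeric one.
cat_encoder = OneHotEncoder(handle_unknown="ignore")
cat_city_encoded = cat_encoder.fit_transform(cat_city.reshape(-1, 1)).toarray()
cat_X = np.column_stack([cat_size, cat_city_encoded])

cat_forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(cat_X, cat_price)
print("Encoded feature matrix shape:", cat_X.shape)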

# Q7. What is the output of Random Forest Regressor?
# The output of a Random Forest Regressor is a continuous value, which is the average of the predictions made by the individual decision trees in the ensemble for a given input.

# Q8. Can Random Forest Regressor be used for classification tasks?
# No, Random Forest Regressor is specifically designed for regression tasks where the output is a continuous value. However, the same Random Forest algorithm is available for classification as the Random Forest Classifier. In classification, the individual decision trees vote for a class label, and the final prediction is the class with the most support. Note that scikit-learn's RandomForestClassifier averages the trees' predicted class probabilities and picks the most probable class, rather than counting hard votes.
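
# How the classifier combines its trees can be checked in the same way as for
# regression. A sketch with hypothetical vote_* names: the per-tree class
# probabilities are averaged by hand and compared with the forest's own
# predict_proba, then the most probable class is taken as the label.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

vote_X, vote_y = load_iris(return_X_y=True)
vote_forest = RandomForestClassifier(n_estimators=25, random_state=0).fit(vote_X, vote_y)

# Average the class probabilities predicted by each tree.
vote_proba = np.mean([t.predict_proba(vote_X) for t in vote_forest.estimators_], axis=0)
# Map the most probable column back to the original class labels.
vote_labels = vote_forest.classes_[np.argmax(vote_proba, axis=1)]

vote_proba_matches = np.allclose(vote_proba, vote_forest.predict_proba(vote_X))
print("Averaged tree probabilities match predict_proba:", vote_proba_matches)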

# Example of implementing Random Forest Regressor in Python:

from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Load dataset. load_boston was removed in scikit-learn 1.2, so the bundled
# diabetes regression dataset is used here instead.
diabetes = load_diabetes()
X = diabetes.data
y = diabetes.target

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train Random Forest Regressor
rf_regressor = RandomForestRegressor(n_estimators=100, random_state=42)
rf_regressor.fit(X_train, y_train)

# Make predictions
y_pred = rf_regressor.predict(X_test)

# Evaluate model
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)

# Example of using Random Forest Classifier for classification tasks:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Load dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train Random Forest Classifier
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
rf_classifier.fit(X_train, y_train)

# Make predictions
y_pred = rf_classifier.predict(X_test)

# Evaluate model (for a classifier, score() returns the mean accuracy on the given test data)
accuracy = rf_classifier.score(X_test, y_test)
print("Accuracy:", accuracy)
