## Question-1 :What is the purpose of grid search cv in machine learning, and how does it work?

In [None]:
Grid Search CV (Cross-Validation) is a hyperparameter tuning technique used in machine learning to systematically search through a predefined hyperparameter grid and find the combination of hyperparameter values that optimizes the performance of a model. It is particularly useful for fine-tuning models and improving their generalization on unseen data. Here's an overview of the purpose and functioning of Grid Search CV:

Purpose of Grid Search CV:
Hyperparameter Tuning:

Machine learning models often have hyperparameters (settings chosen before training, such as regularization strength or kernel type, rather than learned from the data) that significantly impact performance.
Grid Search CV helps in finding the best combination of hyperparameter values by exhaustively searching through a predefined set of values.
Model Performance Optimization:

The primary goal is to optimize the performance of the model on a validation set or through cross-validation.
By systematically trying different hyperparameter combinations, Grid Search CV helps identify the set that leads to the best model performance.
How Grid Search CV Works:
Define Hyperparameter Grid:

Specify a hyperparameter grid, which is a dictionary or a list of dictionaries, where each dictionary contains hyperparameter names as keys and a list of values to be tried as values.
```python
# Note: SVC ignores 'gamma' when kernel='linear', so those combinations are redundant.
param_grid = {'C': [0.1, 1, 10, 100], 'kernel': ['linear', 'rbf'], 'gamma': [0.001, 0.01, 0.1, 1]}
```
Model and Scorer Selection:

Choose the machine learning model for which hyperparameters need tuning and define an evaluation metric (scorer) to measure model performance.
```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score  # used later for the test-set evaluation

model = SVC()
scorer = 'accuracy'  # name of a built-in scikit-learn scoring metric
```
Instantiate GridSearchCV:

Create an instance of the GridSearchCV class, passing the model, hyperparameter grid, scoring metric, and any desired cross-validation settings.
```python
grid_search = GridSearchCV(model, param_grid, scoring=scorer, cv=5)
```
Fit to Data:

Fit the GridSearchCV object to the training data. During this process, it performs an exhaustive search over the hyperparameter grid.
```python
# X_train, y_train: the training split, prepared beforehand
grid_search.fit(X_train, y_train)
```
Retrieve Best Hyperparameters:

After fitting, you can access the hyperparameters that yielded the highest mean cross-validated score according to the chosen scorer.
```python
best_params = grid_search.best_params_
```
Evaluate on Test Set:

Use the model with the best hyperparameters, which GridSearchCV refits on the full training set by default (`refit=True`), to make predictions on a test set and evaluate its performance.
```python
best_model = grid_search.best_estimator_
y_pred = best_model.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
```
Cross-Validation in Grid Search CV:
Grid Search CV uses cross-validation to assess each hyperparameter combination across different subsets of the training data.
The cv parameter specifies the cross-validation strategy; an integer sets the number of folds. A common choice is 5-fold or 10-fold cross-validation.
Result Analysis:
The final result of Grid Search CV includes the best hyperparameter values (`best_params_`) and the corresponding mean cross-validated score (`best_score_`).
Additional information, such as the mean and standard deviation of the score across folds for every combination, is available in `cv_results_`.
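
The per-combination results can be inspected directly. The following is a minimal sketch, assuming the Iris dataset and a small illustrative grid (neither comes from the answer above); it prints the mean and standard deviation of the cross-validated accuracy for each combination.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Illustrative data and grid (assumptions made for this sketch).
X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

grid_search = GridSearchCV(SVC(), param_grid, scoring="accuracy", cv=5)
grid_search.fit(X, y)

results = grid_search.cv_results_
# One entry per hyperparameter combination: 3 values of C x 2 kernels = 6.
for params, mean, std in zip(
    results["params"], results["mean_test_score"], results["std_test_score"]
):
    print(f"{params}: {mean:.3f} +/- {std:.3f}")

print("Best params:", grid_search.best_params_)
print("Best mean CV accuracy:", round(grid_search.best_score_, 3))
```

Looking at the spread across folds, and not only at the best mean, helps to judge whether the winning combination is clearly better or just within noise of its neighbours.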
Considerations:
Computational Cost:

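Grid Search CV can be computationally expensive, especially for large hyperparameter grids, because the number of model fits is the number of combinations multiplied by the number of folds. The grid above has 4 × 2 × 4 = 32 combinations, so 5-fold cross-validation requires 160 model fits. Techniques like Randomized Search CV are alternative approaches that sample a subset of hyperparameter combinations.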
Nested Cross-Validation:

To obtain a less biased estimate of generalization performance, it is recommended to use nested cross-validation, where an inner loop performs the grid search on each outer training fold, and an outer loop assesses the tuned model on data that played no part in the search.
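
Nested cross-validation can be sketched by passing a GridSearchCV object to `cross_val_score`: the search runs inside each outer training fold (inner loop), and the outer folds score the tuned model. The dataset, grid, and fold counts below are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

# Illustrative data and grid (assumptions made for this sketch).
X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1]}

# Inner loop: selects hyperparameters. Outer loop: estimates performance.
inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)

search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=inner_cv)

# Each outer fold refits the whole search on its own training portion,
# so the outer test fold never influences the hyperparameter choice.
outer_scores = cross_val_score(search, X, y, cv=outer_cv)

print("Outer-fold accuracies:", outer_scores.round(3))
print("Nested CV estimate:", round(outer_scores.mean(), 3))
```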
Grid Search CV is a systematic approach to hyperparameter tuning that helps automate the process of finding the optimal hyperparameters for a machine learning model, making the model more effective and better generalized to unseen data.







## Question-2: Describe the difference between Grid Search CV and Randomized Search CV, and when might you choose one over the other?

In [None]:
Both Grid Search CV and Randomized Search CV are hyperparameter tuning techniques used in machine learning to find the optimal set of hyperparameter values for a model. However, they differ in their search strategies:

Grid Search CV:
Search Strategy:

Grid Search CV performs an exhaustive search over a predefined hyperparameter grid.
It systematically evaluates all possible combinations of hyperparameter values within the specified grid.
Computationally Expensive:

As it explores all combinations, Grid Search CV can be computationally expensive, especially when dealing with a large search space.
Usage:

Grid Search CV is suitable when the hyperparameter search space is relatively small, and it is feasible to evaluate all combinations without excessive computational cost.
Regular Grid:

Hyperparameter values are selected from a regular grid, defined by a set of discrete values for each hyperparameter.
Result Interpretation:

Because every grid point is evaluated, Grid Search CV shows how performance varies across the whole grid, making it easier to interpret the relationships between different hyperparameters. Values that lie between grid points remain unexplored.
Randomized Search CV:
Search Strategy:

Randomized Search CV samples a specified number of hyperparameter combinations randomly from the hyperparameter space.
Instead of exploring all combinations, it focuses on a random subset.
Computationally Efficient:

Randomized Search CV is often more computationally efficient than Grid Search CV, especially when dealing with a large search space.
It covers a broad range of the hyperparameter space with a fixed budget of evaluations (the `n_iter` parameter in scikit-learn) without evaluating all possibilities.
Usage:

Randomized Search CV is suitable when the hyperparameter search space is large, and an exhaustive search is impractical within the available resources.
Continuous or Distributions:

Hyperparameter values can be sampled from continuous distributions, making it flexible for scenarios where hyperparameters have a wide range of potential values.
Result Interpretation:

While Randomized Search CV may not provide a comprehensive overview of the entire hyperparameter space, it can efficiently discover good-performing combinations with fewer evaluations.
When to Choose One over the Other:
Search Space Size:

Choose Grid Search CV when the hyperparameter search space is relatively small, and you can afford to evaluate all combinations.
Choose Randomized Search CV when the search space is large, and an exhaustive search is computationally expensive.
Computational Resources:

If computational resources are limited, Randomized Search CV may be a more practical choice as it allows you to explore a diverse set of hyperparameter combinations without evaluating all possibilities.
Exploration vs. Exhaustiveness:

Grid Search CV exhaustively explores the entire search space, providing a comprehensive view of hyperparameter relationships.
Randomized Search CV is more focused on efficient exploration, sampling random combinations to quickly identify promising regions of the hyperparameter space.
Continuous Hyperparameters:

Randomized Search CV is more suitable when dealing with hyperparameters that can take continuous values or when you want to explore a distribution of values.
Initial Hyperparameter Tuning:

For an initial hyperparameter tuning pass, where you want to quickly identify promising regions, Randomized Search CV might be preferred.
For a more fine-grained search, Grid Search CV can be applied in subsequent steps.
In summary, the choice between Grid Search CV and Randomized Search CV depends on the size of the hyperparameter search space, the available computational resources, and the level of detail desired in exploring hyperparameter combinations. Randomized Search CV is often a good choice when efficiency is crucial, while Grid Search CV may be preferable for smaller search spaces where a thorough exploration is feasible.
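
The difference in search strategy can be sketched side by side. This is a minimal example under assumed settings (Iris data, an SVC, 3-fold cross-validation, and a budget of 8 random draws); `loguniform` from SciPy is used to sample `C` and `gamma` from continuous ranges instead of a fixed list.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

# Illustrative dataset (an assumption made for this sketch).
X, y = load_iris(return_X_y=True)

# Grid search: every combination of the listed values is evaluated.
grid = GridSearchCV(
    SVC(),
    {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]},
    cv=3,
)
grid.fit(X, y)

# Randomized search: a fixed number of draws from continuous distributions.
rand = RandomizedSearchCV(
    SVC(),
    {"C": loguniform(1e-1, 1e2), "gamma": loguniform(1e-3, 1e0)},
    n_iter=8,
    cv=3,
    random_state=0,
)
rand.fit(X, y)

n_grid = len(grid.cv_results_["params"])  # 4 x 4 grid points
n_rand = len(rand.cv_results_["params"])  # exactly n_iter candidates
print(f"Grid search tried {n_grid} combinations, best: {grid.best_params_}")
print(f"Randomized search tried {n_rand} combinations, best: {rand.best_params_}")
```

The randomized search costs half as many fits here, and its candidates are not restricted to the grid values; whether it finds an equally good setting depends on the problem and the budget.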

## Question-3 :What is data leakage, and why is it a problem in machine learning? Provide an example.

In [None]:
Data leakage in machine learning refers to the unintentional or inappropriate use of information during the model training process that could lead to overly optimistic performance estimates or biased models. It occurs when information from the test set or future data is inadvertently used in the training phase, contaminating the learning process. Data leakage can seriously compromise the generalization ability of a model, as it may learn patterns that do not hold in real-world scenarios.

Reasons for Data Leakage:
Temporal Leakage:

Using future information to predict past events, which the model wouldn't have had access to at the time of prediction.
Data Preprocessing Errors:

Mistakenly applying feature scaling, imputation, or other preprocessing steps based on the entire dataset, including the test set.
Target Leakage:

Including features that are derived from, or only known after, the target variable. Such information is not available at prediction time, so it leads to artificially high performance during training and evaluation.
Example of Data Leakage:
Scenario: Predicting Stock Prices

Suppose you are building a machine learning model to predict stock prices using historical data. The dataset includes features like stock prices, trading volumes, and financial indicators. You split the dataset into training and testing sets, with the testing set representing future data.

Data Leakage Example:

Feature Engineering:

You define the target variable, "Next Day's Price Change," as the percentage change in stock prices from one day to the next.
Train-Test Split:

You split the dataset into training and testing sets, ensuring that the testing set represents future dates.
Feature Calculation:

You calculate "Next Day's Price Change" over the full dataset before splitting, so rows near the train/test boundary use closing prices from the test period.
```python
# Target computed on the full dataset before any split:
# rows at the boundary look into the test period
training_set['Next_Day_Price_Change'] = (training_set['Close'].shift(-1) - training_set['Close']) / training_set['Close']
```
Model Training:
You train a machine learning model to predict the "Next Day's Price Change" using historical features.
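```python
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor

# The last row has no next-day price, so its target is NaN; drop it before fitting
data = training_set.dropna(subset=['Next_Day_Price_Change'])

# Incorrect split that leads to data leakage: the default shuffle mixes
# later dates into the training set and earlier dates into the test set
X_train, X_test, y_train, y_test = train_test_split(data.drop('Next_Day_Price_Change', axis=1),
                                                    data['Next_Day_Price_Change'],
                                                    test_size=0.2,
                                                    random_state=42)

model = RandomForestRegressor()
model.fit(X_train, y_train)
```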
Problem:

The "Next Day's Price Change" is calculated over the full dataset before splitting, and the shuffled split places rows from later dates in the training set.
The model therefore learns from observations that lie in the future relative to many of the rows it is tested on, which is not possible in real-world use.
Consequence:

The model may appear to perform well on the held-out test set, but it is likely to perform poorly on new, unseen data, because both training and evaluation were contaminated by future information.
Preventive Measures:

Ensure that feature engineering, preprocessing, and target variable creation are done using only information available at the time of prediction.
Use proper time-based splitting to avoid temporal leakage.
In summary, data leakage can lead to overfitting and inaccurate assessments of model performance. It is essential to be vigilant about how information is used during the training process and to ensure that the model is making predictions based on the information available at the time of prediction.
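
A leakage-free version of the stock example can be sketched as follows. The prices are synthetic and the column names (`Return_Today`, `Target_Next_Return`) are hypothetical; the point is only that features use past information and the split is chronological rather than shuffled.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices; the values are made up for illustration.
rng = np.random.default_rng(0)
dates = pd.date_range("2023-01-02", periods=100, freq="B")
prices = pd.DataFrame(
    {"Close": 100 + rng.normal(0, 1, size=100).cumsum()}, index=dates
)

# Feature: today's return, which is known at prediction time.
prices["Return_Today"] = prices["Close"] / prices["Close"].shift(1) - 1

# Target: tomorrow's return. It looks ahead by design, because it is the
# quantity being predicted, but it must never be used as an input feature.
prices["Target_Next_Return"] = prices["Close"].shift(-1) / prices["Close"] - 1

# The first row has no feature and the last row has no target.
data = prices.dropna()

# Chronological split: every training date precedes every test date.
split = int(len(data) * 0.8)
train, test = data.iloc[:split], data.iloc[split:]

X_train, y_train = train[["Return_Today"]], train["Target_Next_Return"]
X_test, y_test = test[["Return_Today"]], test["Target_Next_Return"]

print("Train:", train.index.min().date(), "to", train.index.max().date())
print("Test: ", test.index.min().date(), "to", test.index.max().date())
```

For cross-validation on the same kind of data, scikit-learn's `TimeSeriesSplit` applies the same idea fold by fold, always training on earlier observations and validating on later ones.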



## Question-4 :How can you prevent data leakage when building a machine learning model?

In [None]:
Preventing data leakage is crucial to ensure the integrity and generalization ability of a machine learning model. Here are several strategies to help prevent data leakage during the model-building process:

1. Time-Based Splitting:
If your dataset has a temporal structure (e.g., time series data), use time-based splitting for training and testing.
Ensure that the training set precedes the testing set in time to avoid using future information in the training process.
2. Strict Cross-Validation:
Implement strict cross-validation procedures, such as forward chaining in time series data.
Avoid using information from future folds during the training of the current fold.
3. Feature Engineering:
Be cautious when creating new features based on information that may not be available at the time of prediction.
Features should be generated using only information available up to the point of prediction.
4. Target Variable Creation:
Ensure that the target variable is calculated based on information available at the time of prediction.
Do not use future information in creating the target variable.
5. Feature Scaling and Imputation:
Fit scalers and imputers on the training set only, then apply the fitted transformation to the testing set.
Do not compute statistics (mean, standard deviation) on the entire dataset, as this lets information from the testing set influence the training process.
6. Data Preprocessing:
Fit data preprocessing steps (e.g., normalization, encoding) on the training set and reuse the fitted steps to transform the testing set.
Avoid using information from the testing set during the preprocessing phase.
7. Use of External Data:
If using external data, ensure that it is aligned with the time frame of the training data.
Do not include external data that provides information from the future.
8. Awareness and Documentation:
Document the steps of your data preparation and modeling pipeline.
Be aware of the potential sources of data leakage and actively check for unintentional uses of future information.
9. Feature Selection and Model Evaluation:
Perform feature selection based only on information available at the time of prediction.
Evaluate the model's performance using metrics calculated on the testing set, and be cautious about using performance metrics from the training set for decision-making.
10. Pipeline Construction:
Use machine learning pipelines to encapsulate all the steps in the modeling process.
Ensure that every step of the pipeline is fitted on the training data only (within each cross-validation fold) and then applied unchanged to the validation and testing data.
11. Separate Development and Test Environments:
When working with sensitive or confidential data, ensure a clear separation between development and test environments.
Strictly control access to test datasets to prevent unintentional use of future information during development.
12. Monitoring and Auditing:
Regularly monitor and audit your modeling pipeline to detect any potential sources of data leakage.
Implement checks and logs to track the origin of features and target variables.
13. Education and Awareness:
Educate team members about the risks and consequences of data leakage.
Foster a culture of awareness and careful consideration when working with data.
By following these strategies, you can significantly reduce the risk of data leakage and build models that generalize well to new, unseen data. Consistent and careful practices during feature engineering, preprocessing, and model training are essential for maintaining the integrity of the modeling process.
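
The preprocessing and pipeline points can be illustrated with a short sketch. The data is synthetic and the choice of imputer, scaler, and classifier is an assumption; what matters is that the imputation and scaling statistics are learned from training data only, both inside each cross-validation fold and in the final fit.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with some missing values (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
X[rng.random(X.shape) < 0.05] = np.nan

# Split first, before any statistic is computed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
    ("clf", LogisticRegression()),
])

# Within cross-validation, the pipeline is refitted on each training fold,
# so the validation fold never contributes to the means or variances.
cv_scores = cross_val_score(pipe, X_train, y_train, cv=5)

# Final fit on the training set; the test set is only transformed.
pipe.fit(X_train, y_train)
test_accuracy = pipe.score(X_test, y_test)

print("CV accuracy:", cv_scores.mean().round(3))
print("Test accuracy:", round(test_accuracy, 3))
print("Imputer means (from training data only):",
      pipe.named_steps["impute"].statistics_.round(3))
```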







## Question-5 :What is a confusion matrix, and what does it tell you about the performance of a classification model?

In [None]:
A confusion matrix is a table used in classification to assess the performance of a machine learning model. It provides a detailed breakdown of the model's predictions compared to the actual class labels. The confusion matrix is particularly useful in evaluating the performance of binary and multiclass classification models. It helps in understanding the types of errors made by the model and various performance metrics can be derived from it.

A standard confusion matrix for a binary classification problem is structured as follows:

| | Actual Class 1 | Actual Class 0 |
|---|---|---|
| Predicted Class 1 | True Positive (TP) | False Positive (FP) |
| Predicted Class 0 | False Negative (FN) | True Negative (TN) |

Here's a breakdown of the terms in a binary confusion matrix:

True Positive (TP): Instances correctly predicted as positive by the model.
False Positive (FP): Instances incorrectly predicted as positive by the model when the true class is negative (Type I error).
False Negative (FN): Instances incorrectly predicted as negative by the model when the true class is positive (Type II error).
True Negative (TN): Instances correctly predicted as negative by the model.
Key Metrics Derived from a Confusion Matrix:
Accuracy:

\( \text{Accuracy} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{FP} + \text{FN} + \text{TN}} \)
Measures the overall correctness of the model.
Precision (Positive Predictive Value):

\( \text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}} \)
Measures the accuracy of positive predictions.
Recall (Sensitivity, True Positive Rate):

\( \text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}} \)
Measures the ability of the model to capture all relevant instances of the positive class.
Specificity (True Negative Rate):

\( \text{Specificity} = \frac{\text{TN}}{\text{TN} + \text{FP}} \)
Measures the ability of the model to correctly identify negative instances.
F1 Score (Harmonic Mean of Precision and Recall):

\( \text{F1 Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \)
Balances precision and recall, providing a single metric that considers both false positives and false negatives.
False Positive Rate (FPR):

\( \text{FPR} = \frac{\text{FP}}{\text{FP} + \text{TN}} \)
Measures the rate of false positives relative to the total number of actual negatives.
False Negative Rate (FNR):

\( \text{FNR} = \frac{\text{FN}}{\text{FN} + \text{TP}} \)
Measures the rate of false negatives relative to the total number of actual positives.
Use Cases:
Imbalanced Datasets:

In imbalanced datasets, where one class is more prevalent than the other, accuracy alone might be misleading. Precision, recall, and F1 score provide a more nuanced understanding of the model's performance.
Medical Diagnostics:

In medical diagnostics, false negatives (missed detections) may be more critical than false positives. In such cases, high recall is prioritized.
Fraud Detection:

In fraud detection, recall is often prioritized because missing a fraudulent transaction can have severe consequences. Precision still matters, since falsely flagging a non-fraudulent transaction inconveniences users, so the two are usually balanced against the cost of each error type.
In summary, a confusion matrix provides a detailed breakdown of a classification model's predictions and is a fundamental tool for evaluating its performance. Different metrics derived from the confusion matrix offer insights into different aspects of the model's behavior, allowing practitioners to make informed decisions based on the specific requirements of the problem at hand.
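
The metrics above can be computed directly from the four counts. The following is a small sketch in plain Python with made-up labels; the helper name `confusion_counts` is hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


# Made-up labels: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

metrics = {
    "accuracy": (tp + tn) / (tp + fp + fn + tn),
    "precision": tp / (tp + fp),
    "recall": tp / (tp + fn),
    "specificity": tn / (tn + fp),
    "fpr": fp / (fp + tn),
    "fnr": fn / (fn + tp),
}
metrics["f1"] = (
    2 * metrics["precision"] * metrics["recall"]
    / (metrics["precision"] + metrics["recall"])
)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When using scikit-learn instead, note that `sklearn.metrics.confusion_matrix` puts actual classes on the rows and predicted classes on the columns, which is the transpose of the table shown above.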

## Question-6 :Explain the difference between precision and recall in the context of a confusion matrix.

In [None]:
Precision and recall are performance metrics derived from a confusion matrix, particularly in the context of binary classification. They provide insights into different aspects of a model's performance, specifically focusing on the positive class. Here's an explanation of precision and recall:

Precision:
Precision, also known as Positive Predictive Value, is a measure of the accuracy of positive predictions made by the model. It answers the question: "Of all the instances predicted as positive, how many were truly positive?"

Precision is calculated using the following formula:

\( \text{Precision} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Positives (FP)}} \)

True Positives (TP): Instances correctly predicted as positive by the model.
False Positives (FP): Instances incorrectly predicted as positive by the model when the true class is negative (Type I error).
Interpretation:

High precision indicates that when the model predicts the positive class, it is likely to be correct.
Precision is crucial in scenarios where false positives are costly or have significant consequences.
Recall:
Recall, also known as Sensitivity or True Positive Rate, measures the ability of the model to capture all relevant instances of the positive class. It answers the question: "Of all the truly positive instances, how many were successfully predicted as positive by the model?"

Recall is calculated using the following formula:

\( \text{Recall} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Negatives (FN)}} \)

True Positives (TP): Instances correctly predicted as positive by the model.
False Negatives (FN): Instances incorrectly predicted as negative by the model when the true class is positive (Type II error).
Interpretation:

High recall indicates that the model is effective in capturing a large proportion of the positive instances.
Recall is crucial in scenarios where false negatives are costly or have significant consequences.
Precision-Recall Tradeoff:
There is often a tradeoff between precision and recall. Improving one may come at the expense of the other.
For example, increasing the threshold for predicting the positive class can lead to higher precision but lower recall, and vice versa.
Use Cases:
Imbalanced Datasets:

In imbalanced datasets, where one class is more prevalent than the other, precision and recall provide a more nuanced understanding of the model's performance than accuracy alone.
Medical Diagnostics:

In medical diagnostics, false negatives (missed detections) may be more critical than false positives. In such cases, high recall is prioritized.
Spam Detection:

In spam detection, precision is often more critical than recall because falsely classifying a non-spam email as spam (false positive) can inconvenience users.
In summary, precision and recall offer complementary insights into a model's performance, focusing on different aspects of its ability to predict the positive class. The choice between precision and recall depends on the specific requirements and priorities of the problem at hand.
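
The precision-recall tradeoff can be seen by sweeping the decision threshold over a set of predicted scores. The labels and scores below are invented for illustration, and `precision_recall_at` is a hypothetical helper.

```python
def precision_recall_at(threshold, labels, scores):
    """Precision and recall when predicting positive for score >= threshold."""
    preds = [int(s >= threshold) for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Invented example: 5 positives and 5 negatives with model scores.
labels = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.95, 0.85, 0.75, 0.65, 0.60, 0.45, 0.40, 0.30, 0.20, 0.10]

tradeoff = {t: precision_recall_at(t, labels, scores) for t in (0.3, 0.5, 0.7)}

for threshold, (precision, recall) in tradeoff.items():
    print(f"threshold={threshold}: precision={precision:.3f}, recall={recall:.3f}")
# threshold=0.3: precision=0.625, recall=1.000
# threshold=0.5: precision=0.800, recall=0.800
# threshold=0.7: precision=1.000, recall=0.600
```

In this example, raising the threshold removes false positives (precision rises) but also drops true positives (recall falls).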







## Question-7 :How can you interpret a confusion matrix to determine which types of errors your model is making?

In [None]:
Interpreting a confusion matrix is crucial for understanding the types of errors your model is making and gaining insights into its performance. A confusion matrix provides a detailed breakdown of the model's predictions compared to the actual class labels. Here's how you can interpret a confusion matrix:

Key Components of a Confusion Matrix:
Consider a binary classification confusion matrix:

| | Actual Class 1 | Actual Class 0 |
|---|---|---|
| Predicted Class 1 | True Positive (TP) | False Positive (FP) |
| Predicted Class 0 | False Negative (FN) | True Negative (TN) |

1. True Positives (TP):
Instances correctly predicted as positive by the model.
Interpretation: The model correctly identified these instances as belonging to the positive class.
2. False Positives (FP):
Instances incorrectly predicted as positive by the model when the true class is negative (Type I error).
Interpretation: The model falsely classified these instances as positive.
3. False Negatives (FN):
Instances incorrectly predicted as negative by the model when the true class is positive (Type II error).
Interpretation: The model missed these instances and incorrectly classified them as negative.
4. True Negatives (TN):
Instances correctly predicted as negative by the model.
Interpretation: The model correctly identified these instances as belonging to the negative class.
Insights into Model Errors:
Misclassifications:

Examine the off-diagonal elements (FP and FN).
Identify which class is more prone to misclassifications.
Precision and Recall:

Precision \( = \frac{\text{TP}}{\text{TP} + \text{FP}} \): Assesses the accuracy of positive predictions.
Recall \( = \frac{\text{TP}}{\text{TP} + \text{FN}} \): Measures the ability to capture positive instances.
A large gap between the two points to the dominant error type: low precision means many false positives, while low recall means many false negatives.
Accuracy:

Accuracy \( = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{FP} + \text{FN} + \text{TN}} \): Overall correctness of predictions.
High accuracy may not capture class-specific errors.
Specificity and Sensitivity:

Specificity \( = \frac{\text{TN}}{\text{TN} + \text{FP}} \): Measures the ability to correctly identify negative instances.
Sensitivity \( = \frac{\text{TP}}{\text{TP} + \text{FN}} \) (equivalent to recall): Measures the ability to capture positive instances.
Useful for evaluating the model's performance on each class separately.
Visualizations and Summary Metrics:
Heatmaps:

Use a heatmap to visually represent the confusion matrix, highlighting areas of high and low counts.
Precision-Recall Curve:

Plotting precision against recall for different threshold values provides a visual representation of the tradeoff between precision and recall.
F1 Score:

F1 Score \( = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \): Harmonic mean of precision and recall. Balances false positives and false negatives.
Area Under the ROC Curve (AUC-ROC):

ROC curves illustrate the tradeoff between sensitivity and specificity across different threshold values.
Use Cases:
Medical Diagnostics:

A false negative (FN) in a medical diagnosis could have severe consequences. High recall is crucial.
Spam Detection:

A false positive (FP) in spam detection might inconvenience users. High precision is prioritized.
Fraud Detection:

Balancing precision and recall is crucial. False negatives (missed fraud cases) and false positives (false alarms) have different implications.
Iterative Improvement:
Model Adjustment:

Adjust the model's threshold or hyperparameters based on the specific goals and constraints.
Feature Engineering:

Analyze misclassified instances and consider additional features or modifications to improve model performance.
Interpreting a confusion matrix allows you to go beyond accuracy and gain a nuanced understanding of a model's strengths and weaknesses.
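
The same reading of off-diagonal entries extends to multiclass problems. The sketch below uses invented labels for three classes and plain Python to find the most frequent confusion and the recall of each class.

```python
from collections import Counter

# Invented labels for a three-class problem.
y_true = ["cat", "cat", "cat", "dog", "dog", "dog", "dog", "bird", "bird", "bird"]
y_pred = ["cat", "dog", "cat", "dog", "cat", "cat", "dog", "bird", "bird", "cat"]

# Count every (actual, predicted) pair; this is the confusion matrix
# stored as a sparse dictionary.
pair_counts = Counter(zip(y_true, y_pred))

# Off-diagonal entries are the errors.
errors = Counter(
    {pair: n for pair, n in pair_counts.items() if pair[0] != pair[1]}
)
most_common_error, count = errors.most_common(1)[0]

# Per-class recall: correct predictions divided by actual instances.
actual_totals = Counter(y_true)
per_class_recall = {
    c: pair_counts[(c, c)] / actual_totals[c] for c in sorted(actual_totals)
}

print("Most frequent error (actual, predicted):", most_common_error, "x", count)
for cls, recall in per_class_recall.items():
    print(f"recall[{cls}] = {recall:.3f}")
```

Here the model most often predicts "cat" when the actual class is "dog", which is the kind of pattern that suggests where additional features or data might help.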


## Question-8 :What are some common metrics that can be derived from a confusion matrix, and how are they calculated?

In [None]:
Several common metrics can be derived from a confusion matrix to assess the performance of a classification model. These metrics provide insights into different aspects of the model's behavior, including accuracy, precision, recall, F1 score, and more. Here are some common metrics and their formulas:

1. Accuracy:
Measures the overall correctness of the model.
Formula: \( \text{Accuracy} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{FP} + \text{FN} + \text{TN}} \)
2. Precision (Positive Predictive Value):
Measures the accuracy of positive predictions.
Formula: \( \text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}} \)
3. Recall (Sensitivity, True Positive Rate):
Measures the ability of the model to capture all relevant instances of the positive class.
Formula: \( \text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}} \)
4. Specificity (True Negative Rate):
Measures the ability of the model to correctly identify negative instances.
Formula: \( \text{Specificity} = \frac{\text{TN}}{\text{TN} + \text{FP}} \)
5. F1 Score (Harmonic Mean of Precision and Recall):
Balances precision and recall, providing a single metric that considers both false positives and false negatives.
Formula: \( \text{F1 Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \)
6. False Positive Rate (FPR):
Measures the rate of false positives relative to the total number of actual negatives.
Formula: \( \text{FPR} = \frac{\text{FP}}{\text{FP} + \text{TN}} \)
7. False Negative Rate (FNR):
Measures the rate of false negatives relative to the total number of actual positives.
Formula: \( \text{FNR} = \frac{\text{FN}}{\text{FN} + \text{TP}} \)
8. Area Under the ROC Curve (AUC-ROC):
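ROC curves illustrate the tradeoff between sensitivity and specificity across different threshold values.
AUC-ROC is the area under the ROC curve, providing a single scalar value for model discrimination ability. Unlike the metrics above, it is computed from predicted scores across all thresholds rather than from a single confusion matrix.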
9. Area Under the Precision-Recall Curve (AUC-PR):
Precision-Recall curves show the tradeoff between precision and recall across different threshold values.
AUC-PR is the area under the Precision-Recall curve, indicating model performance in the context of imbalanced datasets.
10. Matthews Correlation Coefficient (MCC):
A correlation coefficient between the observed and predicted binary classifications.
Formula: \( \text{MCC} = \frac{\text{TP} \times \text{TN} - \text{FP} \times \text{FN}}{\sqrt{(\text{TP} + \text{FP})(\text{TP} + \text{FN})(\text{TN} + \text{FP})(\text{TN} + \text{FN})}} \)
11. Cohen's Kappa Coefficient:
Measures the agreement between the model's predictions and the actual labels, corrected for chance.
Formula: \( \text{Kappa} = \frac{\text{Observed Agreement} - \text{Expected Agreement}}{1 - \text{Expected Agreement}} \)
12. Balanced Accuracy:
The arithmetic mean of sensitivity and specificity, providing a balanced measure for imbalanced datasets.
Formula: \( \text{Balanced Accuracy} = \frac{\text{Sensitivity + Specificity}}{2} \)
13. Youden's J statistic:
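A summary of sensitivity and specificity at a given threshold, ranging from 0 (no better than chance) to 1 (perfect). It equals the vertical distance between the ROC curve and the chance line, and the threshold that maximizes it is often used as an operating point.
Formula: \( \text{Youden's J} = \text{Sensitivity} + \text{Specificity} - 1 \)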
These metrics help evaluate different aspects of a model's performance and are chosen based on the specific requirements and priorities of the problem at hand. It's important to consider the context of the application when selecting and interpreting these metrics.
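
The less common metrics (items 10 to 13) can be worked through from the four counts as well. The counts below are invented, and the expected agreement for Cohen's kappa is computed in the usual way from the marginal totals.

```python
import math

# Invented counts for illustration (100 instances in total).
tp, fp, fn, tn = 40, 10, 20, 30
n = tp + fp + fn + tn

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)

# Matthews correlation coefficient.
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

# Cohen's kappa: observed agreement versus agreement expected by chance,
# where chance agreement comes from the row and column marginals.
observed = (tp + tn) / n
expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
kappa = (observed - expected) / (1 - expected)

balanced_accuracy = (sensitivity + specificity) / 2
youden_j = sensitivity + specificity - 1

print(f"MCC               = {mcc:.4f}")
print(f"Cohen's kappa     = {kappa:.4f}")
print(f"Balanced accuracy = {balanced_accuracy:.4f}")
print(f"Youden's J        = {youden_j:.4f}")
```

For these counts the plain accuracy is 0.70, while MCC and kappa are both close to 0.4, which shows how chance-corrected measures give a more conservative picture than accuracy.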

## Question-9 :What is the relationship between the accuracy of a model and the values in its confusion matrix?