## Fine Tuning the Model

* **Bias Error**: Error produced by the model during the fitting (training) stage, caused by assumptions that are too simple for the data.
* **Variance Error**: The change in the model's predictions when it is fitted to different data sets.

1. If the bias error is low and the variance error is high, the model is an ***Overfitted Model***.
2. If the bias error is high and the variance error is low, the model is an ***Underfitted Model***.
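The two cases above can be illustrated with a small experiment. This is a minimal sketch, assuming synthetic data from a noisy sine curve and polynomial models of degree 1, 4 and 15; the data, the degrees and the helper names `make_data` and `mse` are illustrative choices, not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Synthetic data (an assumption for illustration): noisy sine curve."""
    x = rng.uniform(-1.0, 1.0, n)
    y = np.sin(3.0 * x) + rng.normal(0.0, 0.2, n)
    return x, y

def mse(coeffs, x, y):
    """Mean squared error of a polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

x_train, y_train = make_data(30)
x_test, y_test = make_data(200)

errors = {}
for degree in (1, 4, 15):
    coeffs = np.polyfit(x_train, y_train, degree)
    # Store (training error, test error) for each model complexity.
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(f"degree={degree:2d}  train MSE={errors[degree][0]:.4f}  "
          f"test MSE={errors[degree][1]:.4f}")
```

The degree-1 model has a high error even on the training data (high bias, underfitting). The degree-15 model has the lowest training error, and a gap between its training and test error is the usual sign of high variance (overfitting).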

How do we handle an overfitted model?
> **Regularization**:
> * Ridge
> * Lasso
> * Elastic Net

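SSE = $\sum (Y_a - Y_p)^2$, where $Y_a$ is the actual value and $Y_p$ is the predicted value.

For a model with one feature: SSE = $\sum (y - b_1 \times x_1 - b_0)^2$

Regularized cost = $\sum (y - b_1 \times x_1 - b_0)^2 + P$, where the penalty term $P$ is called the ***Regularization*** term.

Ridge cost = $\sum (y - b_1 \times x_1 - b_2 \times x_2 - b_0)^2 + \lambda (b_1^2 + b_2^2)$. Here $\lambda$ is a hyperparameter and the penalty is the squared $L_2$ norm of the coefficients, which defines ***Ridge Regularization***.

Lasso cost = $\sum (y - b_1 \times x_1 - b_2 \times x_2 - b_0)^2 + \lambda (|b_1| + |b_2|)$. Here $\lambda$ is a hyperparameter and the penalty is the $L_1$ norm of the coefficients, which defines ***Lasso Regularization***.

Elastic Net cost = $\sum (y - b_1 \times x_1 - b_2 \times x_2 - b_0)^2 + \lambda_2 (b_1^2 + b_2^2) + \lambda_1 (|b_1| + |b_2|)$, which combines the $L_2$ and $L_1$ penalties and defines ***Elastic Net Regularization***.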
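The cost functions above can be checked numerically. The following is a minimal sketch, assuming a tiny hand-made data set and arbitrary coefficient values; the function names (`sse`, `ridge_cost`, `lasso_cost`, `elastic_net_cost`) and the penalty weights `lam`, `lam1`, `lam2` are hypothetical names chosen for this example. As in the formulas, the intercept $b_0$ is not penalized.

```python
import numpy as np

def sse(X, y, b0, b):
    """Sum of squared errors for the linear model y_hat = X @ b + b0."""
    residuals = y - (X @ b + b0)
    return float(np.sum(residuals ** 2))

def ridge_cost(X, y, b0, b, lam):
    """SSE plus the squared L2 penalty on the coefficients."""
    return sse(X, y, b0, b) + lam * float(np.sum(b ** 2))

def lasso_cost(X, y, b0, b, lam):
    """SSE plus the L1 penalty on the coefficients."""
    return sse(X, y, b0, b) + lam * float(np.sum(np.abs(b)))

def elastic_net_cost(X, y, b0, b, lam1, lam2):
    """SSE plus both the L1 and the squared L2 penalties."""
    return (sse(X, y, b0, b)
            + lam1 * float(np.sum(np.abs(b)))
            + lam2 * float(np.sum(b ** 2)))

# Illustrative data and coefficients (assumptions, not fitted values).
X = np.array([[1.0, 2.0], [2.0, 0.0], [3.0, 1.0]])
y = np.array([5.0, 4.0, 8.0])
b0, b = 1.0, np.array([2.0, -1.0])

print(sse(X, y, b0, b))                         # 21.0
print(ridge_cost(X, y, b0, b, lam=0.5))         # 23.5
print(lasso_cost(X, y, b0, b, lam=0.5))         # 22.5
print(elastic_net_cost(X, y, b0, b, 0.5, 0.5))  # 25.0
```

For the same coefficients, each penalty only adds to the SSE, so a fitting procedure that minimizes the penalized cost is pushed toward smaller coefficients.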

### a) What is Machine Learning? State any two types of machine learning.

**Machine Learning** is a branch of artificial intelligence that focuses on building systems that can learn from and make decisions based on data. It involves algorithms that improve their performance as they are exposed to more data over time.

**Two types of machine learning**:
1. **Supervised Learning**: The algorithm is trained on labeled data, meaning the input comes with the correct output. The goal is to learn a mapping from inputs to outputs. Examples include classification and regression.
2. **Unsupervised Learning**: The algorithm is trained on unlabeled data and must find patterns and relationships within the data. Examples include clustering and association.

### b) How can you handle overfitting and underfitting?

**Overfitting** occurs when a model learns the training data too well, including noise and outliers, leading to poor generalization to new data. **Underfitting** occurs when a model is too simple to capture the underlying patterns in the data.

**Handling Overfitting**:
1. **Cross-Validation**: Use techniques like k-fold cross-validation to estimate how well the model generalizes to unseen data, so that overfitting is detected before the model is used.
2. **Regularization**: Apply regularization techniques like L1 (Lasso) or L2 (Ridge) to penalize large coefficients and reduce model complexity.

**Handling Underfitting**:
1. **Increase Model Complexity**: Use a more complex model that can capture the underlying patterns in the data.
2. **Feature Engineering**: Add more relevant features or transform existing features to provide the model with more information.
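As a rough illustration of how cross-validation exposes both problems, here is a minimal k-fold sketch written with NumPy only. The synthetic quadratic data, the polynomial degrees being compared, and the helper name `k_fold_mse` are assumptions made for this example.

```python
import numpy as np

def k_fold_mse(x, y, degree, k=5, seed=0):
    """Average validation MSE of a polynomial fit over k folds."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(x))
    folds = np.array_split(indices, k)
    scores = []
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit on k-1 folds, evaluate on the held-out fold.
        coeffs = np.polyfit(x[train_idx], y[train_idx], degree)
        pred = np.polyval(coeffs, x[val_idx])
        scores.append(np.mean((pred - y[val_idx]) ** 2))
    return float(np.mean(scores))

# Synthetic data (an assumption): a quadratic trend plus small noise.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 60)
y = 1.0 + 2.0 * x - 3.0 * x ** 2 + rng.normal(0.0, 0.1, 60)

cv_scores = {d: k_fold_mse(x, y, d) for d in (1, 2, 10)}
best_degree = min(cv_scores, key=cv_scores.get)
print(cv_scores, "best degree:", best_degree)
```

The straight line (degree 1) is too simple for the quadratic trend and scores poorly, which is underfitting. Comparing the scores of the more complex models shows whether the extra complexity helps or hurts on held-out data.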

### c) State the assumptions of the linear regression algorithm.

The assumptions of linear regression are:
1. **Linearity**: The relationship between the independent and dependent variables is linear.
2. **Independence**: The residuals (errors) are independent.
3. **Homoscedasticity**: The residuals have constant variance at every level of the independent variable.
4. **Normality**: The residuals of the model are normally distributed.

### d) If $y = 2x_1 + 12x_2 + 3x_3 + 5$ is the linear regression equation, then explain how the coefficients of $x_1$ and $x_2$ affect the value of $y$.

In the linear regression equation $y = 2x_1 + 12x_2 + 3x_3 + 5$:
- The coefficient of $x_1$ is 2, which means that for every one-unit increase in $x_1$, $y$ increases by 2 units, assuming all other variables remain constant.
- The coefficient of $x_2$ is 12, which means that for every one-unit increase in $x_2$, $y$ increases by 12 units, assuming all other variables remain constant.
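A quick numerical check of this interpretation: the sketch below evaluates the equation at an arbitrary starting point (all inputs equal to 1, chosen only for illustration) and then raises one input at a time by one unit.

```python
def predict(x1, x2, x3):
    """The regression equation from the question."""
    return 2 * x1 + 12 * x2 + 3 * x3 + 5

base = predict(1, 1, 1)                # 22
effect_x1 = predict(2, 1, 1) - base    # x1 raised by one unit
effect_x2 = predict(1, 2, 1) - base    # x2 raised by one unit

print(base, effect_x1, effect_x2)      # 22 2 12
```

Because the model is linear, the change in the output is the same whatever starting point is chosen.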

### e) Explain any two of the data preprocessing steps.

1. **Normalization/Standardization**: This step involves scaling the data to a standard range, typically between 0 and 1 (normalization) or to have a mean of 0 and a standard deviation of 1 (standardization). This helps in improving the performance and convergence speed of the learning algorithms.
2. **Handling Missing Values**: This step involves dealing with missing data points in the dataset. Techniques include removing rows with missing values, imputing missing values with mean, median, or mode, or using more advanced methods like K-Nearest Neighbors imputation.
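Both steps can be sketched in a few lines of NumPy. The small array below is made-up data, and mean imputation is just one of the options listed above. The missing value is filled first because the scaling formulas need complete columns.

```python
import numpy as np

# Made-up data with one missing value (np.nan) in the second column.
X = np.array([[1.0, 200.0],
              [2.0, np.nan],
              [3.0, 400.0],
              [4.0, 600.0]])

# Handling missing values: replace each NaN with its column mean.
col_means = np.nanmean(X, axis=0)
X_imputed = np.where(np.isnan(X), col_means, X)

# Normalization: rescale every column to the range [0, 1].
col_min = X_imputed.min(axis=0)
col_max = X_imputed.max(axis=0)
X_norm = (X_imputed - col_min) / (col_max - col_min)

# Standardization: rescale every column to mean 0 and standard deviation 1.
X_std = (X_imputed - X_imputed.mean(axis=0)) / X_imputed.std(axis=0)

print(X_imputed)
print(X_norm)
print(X_std)
```

In a real workflow the means, minima, maxima and standard deviations would be computed on the training set only and then reused for the test set.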


### 1a) State a few applications of Machine Learning

Machine Learning has a wide range of applications, including:
1. **Image and Speech Recognition**: Used in facial recognition systems and in voice assistants such as Siri and Alexa.
2. **Healthcare**: Predicting disease outbreaks, personalized treatment plans, and medical image analysis.
3. **Finance**: Fraud detection, algorithmic trading, and credit scoring.
4. **Marketing**: Customer segmentation, recommendation systems, and sentiment analysis.
5. **Autonomous Vehicles**: Self-driving cars use machine learning for object detection and decision-making.

### 2b) What is the difference between Classification and Regression problem?

- **Classification**: Involves predicting a categorical label. For example, determining whether an email is spam or not spam.
- **Regression**: Involves predicting a continuous value. For example, predicting the price of a house based on its features.

### 2c) Mention any two assumptions of linear regression algorithm.

1. **Linearity**: The relationship between the independent and dependent variables is linear.
2. **Homoscedasticity**: The residuals have constant variance at every level of the independent variable.

### 2d) How can you deal with multicollinearity?

Multicollinearity can be dealt with by:
1. **Removing Highly Correlated Predictors**: Identify and remove one of the highly correlated variables.
2. **Principal Component Analysis (PCA)**: Transform the correlated variables into a set of linearly uncorrelated variables.
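The first approach can be sketched as follows. This is a minimal example, assuming synthetic predictors in which `x2` is almost a scaled copy of `x1`, and an arbitrary correlation threshold of 0.9; both are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 2.0 * x1 + rng.normal(scale=0.05, size=n)  # nearly a scaled copy of x1
x3 = rng.normal(size=n)                         # unrelated predictor
X = np.column_stack([x1, x2, x3])

# Pairwise correlations between the predictor columns.
corr = np.corrcoef(X, rowvar=False)

threshold = 0.9  # arbitrary cut-off chosen for this sketch
to_drop = set()
for i in range(corr.shape[0]):
    for j in range(i + 1, corr.shape[1]):
        # Of each highly correlated pair, drop the later column.
        if abs(corr[i, j]) > threshold and i not in to_drop and j not in to_drop:
            to_drop.add(j)

keep = [c for c in range(X.shape[1]) if c not in to_drop]
X_reduced = X[:, keep]
print("kept columns:", keep)
```

Pairwise correlation only finds relationships between two predictors at a time; a predictor that is a combination of several others needs a different check, such as the variance inflation factor.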

### 2e) Explain some issues with Machine Learning.

Some issues with Machine Learning include:
1. **Overfitting**: The model performs well on training data but poorly on new, unseen data.
2. **Underfitting**: The model is too simple to capture the underlying patterns in the data.
3. **Data Quality**: Poor quality data can lead to inaccurate models.
4. **Bias and Fairness**: Models can inherit biases present in the training data, leading to unfair outcomes.

### 2a) Explain Forward Selection in brief.

**Forward Selection** is a feature selection technique where we start with no features and iteratively add the most significant feature at each step. The process continues until adding new features does not significantly improve the model's performance.
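The procedure can be sketched with ordinary least squares in NumPy. Note the assumptions: "significant improvement" is approximated here by a relative drop in the sum of squared errors (a 10% threshold, chosen arbitrarily) rather than by a statistical test, the data are synthetic, and `fit_sse` and `forward_selection` are hypothetical helper names.

```python
import numpy as np

def fit_sse(X, y, columns):
    """SSE of a least squares fit (with intercept) on the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in columns])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sum((y - A @ coef) ** 2))

def forward_selection(X, y, min_improvement=0.1):
    """Add one feature at a time while the SSE drops by at least min_improvement."""
    selected = []
    remaining = list(range(X.shape[1]))
    current_sse = fit_sse(X, y, selected)  # intercept-only model
    while remaining:
        # Try each remaining feature and keep the one with the lowest SSE.
        best_sse, best_col = min((fit_sse(X, y, selected + [c]), c) for c in remaining)
        if (current_sse - best_sse) / current_sse < min_improvement:
            break  # no candidate improves the model enough
        selected.append(best_col)
        remaining.remove(best_col)
        current_sse = best_sse
    return selected

# Synthetic data (an assumption): only columns 0 and 2 influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=200)

selected = forward_selection(X, y)
print("selected columns:", selected)
```

The features are returned in the order in which they were added, so the first entry is the single most useful predictor under this criterion.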

### 2b) If $y = 2x_1 + 12x_2 + 3x_3 + 5$ is the linear regression equation, then explain how the coefficients of $x_2$ and $x_3$ affect the value of $y$.

In the linear regression equation $y = 2x_1 + 12x_2 + 3x_3 + 5$:
- The coefficient of $x_2$ is 12, which means that for every one-unit increase in $x_2$, $y$ increases by 12 units, assuming all other variables remain constant.
- The coefficient of $x_3$ is 3, which means that for every one-unit increase in $x_3$, $y$ increases by 3 units, assuming all other variables remain constant.

### 2c) How can you handle overfitting and underfitting?

**Overfitting** can be handled by:
1. **Cross-Validation**: Use techniques like k-fold cross-validation to estimate how well the model generalizes to unseen data, so that overfitting is detected before the model is used.
2. **Regularization**: Apply regularization techniques like L1 (Lasso) or L2 (Ridge) to penalize large coefficients and reduce model complexity.

**Underfitting** can be handled by:
1. **Increase Model Complexity**: Use a more complex model that can capture the underlying patterns in the data.
2. **Feature Engineering**: Add more relevant features or transform existing features to provide the model with more information.

### 2d) Explain Gradient Descent in brief.

**Gradient Descent** is an optimization algorithm used to minimize the cost function in machine learning models. It iteratively adjusts the model parameters in the direction of steepest descent of the cost function until it converges to a minimum (a local minimum if the cost function is not convex). The steps are:
1. Initialize the parameters.
2. Calculate the gradient of the cost function with respect to the parameters.
3. Update the parameters by taking a step, scaled by the learning rate, in the direction opposite to the gradient.
4. Repeat until convergence.
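The four steps can be sketched for simple linear regression with a mean squared error cost. The learning rate, the iteration limit, the stopping tolerance and the tiny noise-free data set are all assumptions chosen so that the example converges quickly.

```python
import numpy as np

def gradient_descent(x, y, learning_rate=0.05, n_iters=5000, tol=1e-10):
    """Fit y ~ b0 + b1 * x by minimizing the mean squared error."""
    b0, b1 = 0.0, 0.0                       # step 1: initialize the parameters
    n = len(x)
    for _ in range(n_iters):
        error = (b0 + b1 * x) - y
        grad_b0 = 2.0 * np.sum(error) / n   # step 2: gradient w.r.t. b0
        grad_b1 = 2.0 * np.sum(error * x) / n   # and w.r.t. b1
        b0 -= learning_rate * grad_b0       # step 3: move against the gradient
        b1 -= learning_rate * grad_b1
        if max(abs(grad_b0), abs(grad_b1)) < tol:
            break                           # step 4: stop once converged
    return b0, b1

# Noise-free points on the line y = 1 + 2x (illustrative data).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x

b0, b1 = gradient_descent(x, y)
print(b0, b1)  # close to 1.0 and 2.0
```

A learning rate that is too large makes the updates overshoot and diverge, while one that is too small makes convergence slow, so in practice it is tuned like any other hyperparameter.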

### 2e) What is Lasso Regularization?

**Lasso Regularization** (Least Absolute Shrinkage and Selection Operator) is a regularization technique that adds a penalty proportional to the sum of the absolute values of the coefficients to the cost function. It helps in feature selection by shrinking some coefficients to exactly zero, effectively removing them from the model. The Lasso regression objective function is:

$$ \text{Minimize} \left( \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{p} | \beta_j | \right) $$

where $\lambda$ is the regularization parameter that controls the strength of the penalty.
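To see the feature selection effect, the objective can be minimized with coordinate descent and a soft-thresholding update. This is a minimal sketch under stated assumptions: the data are synthetic and centered so that no intercept is needed, the third feature has no effect on the target, and the value of `lam` and the helper names `soft_threshold` and `lasso_coordinate_descent` are chosen for illustration. It is one common way to solve the Lasso problem, not necessarily the one used by any particular library.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 if it is within the threshold."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iters=200):
    """Minimize sum((y - X @ beta)**2) + lam * sum(|beta|) one coefficient at a time."""
    n_features = X.shape[1]
    beta = np.zeros(n_features)
    col_sq = np.sum(X ** 2, axis=0)
    for _ in range(n_iters):
        for j in range(n_features):
            # Residual with feature j's current contribution added back.
            partial_residual = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ partial_residual
            beta[j] = soft_threshold(rho, lam / 2.0) / col_sq[j]
    return beta

# Synthetic, centered data (an assumption): the third feature is irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X = X - X.mean(axis=0)
y = 4.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=100)
y = y - y.mean()

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)   # no penalty, for comparison
beta_lasso = lasso_coordinate_descent(X, y, lam=20.0)
print("least squares:", beta_ols)
print("lasso:        ", beta_lasso)
```

The unpenalized fit gives the irrelevant feature a small nonzero coefficient, while the Lasso fit sets it to exactly zero and shrinks the remaining coefficients toward zero.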



In [None]:
# !pip install optuna

In [4]:
import pyforest
import optuna
from sklearn.model_selection import cross_val_score


In [5]:
data=pd.read_csv('../data-sets/IPL_IMB_data.csv')
