#### Q1. Explain the difference between simple linear regression and multiple linear regression. Provide an example of each.

##### Simple Linear Regression:

Simple linear regression involves modeling the relationship between a single independent variable (predictor) and a dependent variable (response).
The relationship is assumed to be linear, meaning that changes in the independent variable are associated with constant changes in the dependent variable.
The equation for simple linear regression is typically represented as:
Yi = β0 + β1Xi + εi

where
- Yi = Dependent variable,
- β0 = Constant/Intercept,
- β1 = Slope,
- Xi = Independent variable,
- εi = Error term

Example: Predicting a student's exam score (dependent variable) based on the number of hours they studied (independent variable). Here, the number of hours studied is the only predictor influencing the exam score.
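The exam-score example can be sketched in a few lines of plain Python. This is a minimal illustration, assuming made-up data; the function name `fit_simple_linear` and the `hours`/`scores` values are hypothetical and not from any real dataset. It uses the closed-form least-squares estimates: slope = covariance(x, y) / variance(x), intercept = mean(y) − slope × mean(x).

```python
def fit_simple_linear(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: hours studied and the resulting exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

b0, b1 = fit_simple_linear(hours, scores)
print(f"score = {b0:.2f} + {b1:.2f} * hours")
```

With this toy data the fitted slope is about 4.1, so each extra hour of study is associated with roughly 4.1 more points.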

##### Multiple Linear Regression:

Multiple linear regression is a technique to understand the relationship between a single dependent variable and multiple independent variables.

The formulation for multiple linear regression is similar to simple linear regression, except that instead of a single slope coefficient there is one coefficient (beta) for each independent variable. The formula is given as:

 Y = β0 + β1X1 + β2X2 + … + βpXp + ε
 
Example: Predicting house prices (dependent variable) based on multiple factors such as square footage, number of bedrooms, number of bathrooms, and neighborhood income. Here, square footage, number of bedrooms, number of bathrooms, and neighborhood income are all predictors influencing the house prices.
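As a rough sketch of how the coefficients can be estimated, the code below solves the normal equations (XᵀX)β = Xᵀy with a small Gaussian-elimination routine. Everything here is an assumption made for illustration: the helper names, the two predictors (the text lists four), and the synthetic prices, which are generated from known coefficients so the result can be checked. In practice a library such as NumPy, statsmodels, or scikit-learn would normally be used instead.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations apply to both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_linear(rows, y):
    """Return [b0, b1, ..., bp] that minimize the squared error."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 is the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical houses: (square footage in thousands, number of bedrooms).
houses = [(1.0, 2), (1.5, 3), (2.0, 3), (2.5, 4), (3.0, 4), (1.2, 2)]
# Synthetic prices (in $1000s) built from known coefficients 50, 100 and 20.
prices = [50 + 100 * sqft + 20 * beds for sqft, beds in houses]

coefs = fit_multiple_linear(houses, prices)
print([round(c, 4) for c in coefs])
```

Because the prices contain no noise, the fit recovers the coefficients used to generate them (up to floating-point error).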

#### Q2. Discuss the assumptions of linear regression. How can you check whether these assumptions hold in a given dataset?

1. Linearity: The relationship between the independent variables and the dependent variable should be linear. This means that the change in the dependent variable is proportional to the change in the independent variables.
Check: Plot the dependent variable against each independent variable. If the relationship appears to be linear, the assumption is likely met.

2. Independence: The observations should be independent of each other. There should be no correlation between the residuals (errors) of the model.
Check: Examine the residuals plot to ensure there are no patterns or trends. Additionally, check for autocorrelation in the residuals using statistical tests like the Durbin-Watson test.

3. Homoscedasticity: The variance of the residuals should be constant across all levels of the independent variables (homogeneity of variance).
Check: Plot the residuals against the predicted values or against each independent variable. The presence of a cone-shaped pattern or widening/narrowing spread of residuals indicates heteroscedasticity. Statistical tests like the Breusch-Pagan test or White test can also be used to formally test for homoscedasticity.

4. Normality of Residuals: The residuals should be normally distributed. This assumption is necessary for hypothesis testing and constructing confidence intervals.
Check: Create a histogram or a Q-Q plot of the residuals to visually inspect their distribution. Additionally, perform a statistical test for normality such as the Shapiro-Wilk test or the Kolmogorov-Smirnov test.

5. No Multicollinearity: The independent variables should not be highly correlated with each other. High multicollinearity can lead to unstable parameter estimates and inflated standard errors.
Check: Calculate the correlation matrix of the independent variables. If any pair of variables has a correlation coefficient close to 1 or -1, it indicates multicollinearity. Alternatively, use variance inflation factor (VIF) analysis to quantify multicollinearity.
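The Durbin-Watson check mentioned under the independence assumption is simple enough to compute by hand. The sketch below assumes the residuals are already available as a list in observation order; the residual values are made up for illustration. The statistic is the sum of squared successive differences divided by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. statsmodels also provides a `durbin_watson` function in `statsmodels.stats.stattools`.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals given in observation order."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical residuals that flip sign at every step (negative autocorrelation).
dw_alternating = durbin_watson([1, -1, 1, -1])

# Hypothetical residuals that stay on one side for a while (positive autocorrelation).
dw_trending = durbin_watson([1, 1, 1, -1, -1, -1])

print(dw_alternating, dw_trending)
```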

#### Q3. How do you interpret the slope and intercept in a linear regression model? Provide an example using a real-world scenario.

In a linear regression model of the form Yi = β0 + β1Xi + εi, the slope (β1) represents the expected change in the dependent variable (Y) for a one-unit increase in the independent variable (X). The intercept (β0) represents the expected value of the dependent variable when the independent variable is zero.

Here's how you can interpret the slope and intercept using a real-world scenario:

Scenario: Suppose we want to predict the price of a house (dependent variable, y) based on its size in square feet (independent variable, x). We fit a simple linear regression model to the data and obtain the following equation: Price = β0 + β1 × Size + ε.

Intercept (β0): The intercept represents the base price of a house when its size is zero. However, in most real-world scenarios, a house cannot have a size of zero. Therefore, the intercept may not have a meaningful interpretation in this context. It's important to consider the context of the problem when interpreting the intercept.

Slope (β1): The slope represents the average change in the price of a house for a one-unit increase in size (e.g., one square foot), assuming all other factors remain constant. For example, if the slope is 100, it means that, on average, the price of a house increases by $100 for every additional square foot of space.
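To make the interpretation concrete, here is a tiny worked example. The slope of 100 comes from the text above; the intercept of 50,000 is an assumed value chosen only for illustration, and `predict_price` is a hypothetical helper.

```python
def predict_price(size_sqft, intercept=50_000.0, slope=100.0):
    """Predicted price for a house of the given size (assumed coefficients)."""
    return intercept + slope * size_sqft


# One extra square foot changes the prediction by exactly the slope.
price_diff = predict_price(1501) - predict_price(1500)

# At size zero the prediction equals the intercept, which is an
# extrapolation with no practical meaning for real houses.
price_at_zero = predict_price(0)

print(price_diff, price_at_zero)
```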

#### Q4. Explain the concept of gradient descent. How is it used in machine learning?

- Gradient descent is one of the most commonly used iterative optimization algorithms for training machine learning and deep learning models. It finds a local minimum of a function.

- If we repeatedly step in the direction of the negative gradient (away from the gradient) of the function at the current point, we move toward a local minimum of that function.

- If we instead step in the direction of the positive gradient (toward the gradient) at the current point, we move toward a local maximum of that function. That procedure is known as gradient ascent.

![image.png](attachment:e3613aea-8abc-4309-aa23-584acabd21bb.png)

- Gradient descent is also known as steepest descent. Its main objective is to minimize the cost function iteratively. To achieve this, it repeats two steps:

1. Calculate the first-order derivative of the function to obtain the gradient (slope) at the current point.
2. Move in the direction opposite to the gradient, by a step equal to alpha times the gradient, where alpha is the learning rate. The learning rate is a tuning parameter of the optimization process that determines the length of the steps.
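The two steps can be sketched for a simple linear regression trained with a mean squared error cost. This is an illustrative sketch, not a production optimizer: the function name, the starting point (0, 0), the learning rate of 0.05, and the iteration count are all assumptions, and the data is synthetic (generated from y = 1 + 2x) so the result can be checked.

```python
def gradient_descent(x, y, learning_rate=0.05, n_iters=2000):
    """Fit y = b0 + b1 * x by minimizing mean squared error."""
    b0, b1 = 0.0, 0.0  # arbitrary starting point
    n = len(x)
    for _ in range(n_iters):
        # Step 1: compute the gradient of the cost at the current point.
        errors = [(b0 + b1 * xi) - yi for xi, yi in zip(x, y)]
        grad_b0 = (2 / n) * sum(errors)
        grad_b1 = (2 / n) * sum(e * xi for e, xi in zip(errors, x))
        # Step 2: move against the gradient, scaled by the learning rate.
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


# Synthetic data generated from y = 1 + 2x.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]

gd_b0, gd_b1 = gradient_descent(xs, ys)
print(round(gd_b0, 4), round(gd_b1, 4))
```

If the learning rate is too large the updates overshoot and diverge; if it is too small, convergence is slow. The value used here was picked to converge on this particular toy data.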

#### Q5. Describe the multiple linear regression model. How does it differ from simple linear regression?

The main difference between simple linear regression and multiple linear regression is the number of independent variables used in the model. In simple linear regression, we use one independent variable, while in multiple linear regression, we use two or more independent variables.

Another difference is the complexity of the model. Simple linear regression models are relatively simple and easy to interpret, as they involve only two variables. Multiple linear regression models, on the other hand, are more complex and require more computational power. They also require more careful interpretation, as the relationships between the independent variables and the dependent variable can be more difficult to understand.

Example
To illustrate the differences between simple linear regression and multiple linear regression, let’s consider an example. Suppose we want to predict a person’s score on a math test based on their study time and their IQ score. We collect data on study time (in hours) and IQ scores (on a scale of 0 to 100) for a sample of 50 students, along with their scores on a math test (out of 100). We can then use this data to create both a simple linear regression model and a multiple linear regression model.

First, let’s create a simple linear regression model. We can plot the data on a scatter plot to visualize the relationship between study time and math scores.

![image.png](attachment:6584eee6-80d9-49cc-936b-b1bf9cc637a5.png)

From the scatter plot, we can see that there appears to be a positive linear relationship between study time and math scores. We can then fit a linear regression line to the data to estimate the relationship between the two variables.

The equation for the simple linear regression model is:

Math Score = 32.55 + 1.89 x Study Time

This means that for every one-hour increase in study time, we expect the student’s math score to increase by 1.89 points, on average.

Now, let’s create a multiple linear regression model that includes both study time and IQ score as independent variables. The equation for the multiple linear regression model is:

Math Score = 17.62 + 1.68 x Study Time + 0.26 x IQ Score

![image.png](attachment:2325e975-62bb-4ffb-b013-be529bd3e977.png)

This means that for every one-hour increase in study time, we expect the student’s math score to increase by 1.68 points, on average, holding the IQ score constant. Similarly, for every one-point increase in IQ score, we expect the student’s math score to increase by 0.26 points, on average, holding study time constant.

#### Q6. Explain the concept of multicollinearity in multiple linear regression. How can you detect and address this issue?

Multicollinearity is a statistical phenomenon that occurs when two or more independent variables in a regression model are highly correlated with each other. In other words, multicollinearity indicates a strong linear relationship among the predictor variables. This can create challenges in the regression analysis because it becomes difficult to determine the individual effects of each independent variable on the dependent variable accurately.

Multicollinearity can lead to unstable and unreliable coefficient estimates, making it harder to interpret the results and draw meaningful conclusions from the model. It is essential to detect and address multicollinearity to ensure the validity and robustness of regression models.

In practical terms, one independent variable can be predicted from another. Common examples are pairs such as height and weight, household income and water consumption, mileage and the price of a car, or study time and leisure time.

Let me take a simple example from our everyday life to explain this. Colin loves watching television while munching on chips. The more television he watches, the more chips he eats, and the happier he gets!

Now, if we could quantify happiness and measure Colin’s happiness while he’s busy doing his favorite activity, which do you think would have a greater impact on his happiness? Having chips or watching television? That’s difficult to determine because the moment we try to measure Colin’s happiness from eating chips, he starts watching television. And the moment we try to measure his happiness from watching television, he starts eating chips.

Eating chips and watching television are highly correlated in the case of Colin, and we cannot individually determine the impact of individual activities on his happiness. This is the multicollinearity problem!

##### How to detect multicollinearity:

In Python, there are several ways to detect multicollinearity in a dataset, such as computing the Variance Inflation Factor (VIF) or calculating the correlation matrix of the independent variables. To address multicollinearity, techniques such as regularization (for example, ridge regression) or feature selection (keeping a subset of independent variables that are not highly correlated with each other) can be applied. We will focus on the most common detection method: VIF (Variance Inflation Factor).

VIF measures the strength of the correlation between one independent variable and the others. It is computed by taking a variable and regressing it against every other independent variable.

In other words, the VIF score of an independent variable represents how well the variable is explained by the other independent variables.

The R^2 value of that regression tells us how well an independent variable is described by the other independent variables. A high value of R^2 means that the variable is highly correlated with the other variables. This is captured by the VIF, VIF = 1 / (1 − R^2), as shown below:

![image.png](attachment:9c21c602-a305-4fb1-a1a1-4e1dffd8c545.png)

So, the closer the R^2 value to 1, the higher the value of VIF and the higher the multicollinearity with the particular independent variable.
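Here is a minimal sketch of the calculation for the special case of exactly two predictors. In that case the R^2 from regressing one predictor on the other equals their squared Pearson correlation, so VIF = 1 / (1 − r^2). The function names and the TV/chips numbers are made up to echo the Colin example. For more predictors, each R^2 comes from a multiple regression; statsmodels offers `variance_inflation_factor` in `statsmodels.stats.outliers_influence` for that.

```python
def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / (saa * sbb) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors (R^2 equals r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical data: hours of TV watched and bags of chips eaten.
tv_hours = [1, 2, 3, 4, 5]
chips = [2, 4, 5, 8, 10]
vif_correlated = vif_two_predictors(tv_hours, chips)

# A second hypothetical predictor that is uncorrelated with tv_hours.
unrelated = [2, 1, 3, 1, 2]
vif_uncorrelated = vif_two_predictors(tv_hours, unrelated)

print(round(vif_correlated, 2), round(vif_uncorrelated, 2))
```

A VIF of 1 means no correlation with the other predictors. A commonly quoted rule of thumb treats values above roughly 5 to 10 as a sign of problematic multicollinearity, though the exact cutoff is a judgment call.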

#### Q7. Describe the polynomial regression model. How is it different from linear regression?

Simple linear regression works well only when the relationship between the variables is linear. If the data is non-linear, a straight line cannot serve as a best-fit line, and simple regression analysis fails. The diagram below shows data with a non-linear relationship together with a linear regression fit, which performs poorly because it does not follow the true pattern. Polynomial regression addresses this problem by capturing the curvilinear relationship between the independent and dependent variables.

![image.png](attachment:cb89e062-a751-411c-81b6-3401be29e7ba.png)

Suppose we have a dataset where X is the independent variable and Y is the dependent variable. In the preprocessing stage, before feeding the data to a model, we convert the input variable into polynomial terms up to a chosen degree.

For example, if the input value is 35 and the degree of the polynomial is 2, we compute 35 to the power 0, 35 to the power 1, and 35 to the power 2. These terms let the model capture the non-linear relationship in the data.
The polynomial equation then looks like this:

$$y = a_0 + a_1 x_1 + a_2 x_1^2 + \dots + a_n x_1^n$$

The degree of the polynomial is a hyperparameter and must be chosen with care. A degree that is too high tends to overfit the data, while a degree that is too low tends to underfit, so we need to find a suitable value in between. Because the model is still linear in its coefficients, polynomial regression is usually fitted with the method of least squares, just like linear regression on the expanded terms. Under the conditions of the Gauss-Markov theorem, the least-squares estimator has the smallest variance among linear unbiased estimators of the coefficients.
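The feature expansion described above (35 with degree 2) can be sketched as follows. The helper names `polynomial_features` and `evaluate_polynomial`, and the coefficient values, are hypothetical and chosen only for illustration. scikit-learn provides a `PolynomialFeatures` transformer for the same purpose on real datasets.

```python
def polynomial_features(x, degree):
    """Expand a single input value into [x**0, x**1, ..., x**degree]."""
    return [x ** k for k in range(degree + 1)]


def evaluate_polynomial(coefs, x):
    """Compute a0 + a1*x + a2*x**2 + ... for coefficients [a0, a1, a2, ...]."""
    terms = polynomial_features(x, len(coefs) - 1)
    return sum(a * t for a, t in zip(coefs, terms))


# The example from the text: input 35, degree 2.
features_35 = polynomial_features(35, 2)

# Assumed coefficients a0=1, a1=2, a2=3, evaluated at x=2: 1 + 4 + 12.
y_at_2 = evaluate_polynomial([1, 2, 3], 2)

print(features_35, y_at_2)
```

Once the input has been expanded this way, the coefficients can be estimated with ordinary least squares on the expanded columns, which is why polynomial regression is often described as a special case of multiple linear regression.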

#### Q8. What are the advantages and disadvantages of polynomial regression compared to linear regression? In what situations would you prefer to use polynomial regression?

Polynomial regression is a type of regression analysis that models the relationship between the independent variable and the dependent variable as an nth-degree polynomial. Here are the advantages and disadvantages of polynomial regression compared to linear regression:

Advantages of Polynomial Regression:

Flexibility: Polynomial regression can capture non-linear relationships between variables better than linear regression. It can fit curves and patterns that linear regression cannot.

Higher Order Relationships: It can model complex relationships between variables by including higher-order terms (e.g., quadratic, cubic) in the regression equation.

Improved Fit: When the relationship between variables is non-linear, polynomial regression can provide a better fit to the data compared to linear regression.

Disadvantages of Polynomial Regression:

Overfitting: Polynomial regression models with higher-order terms can overfit the training data, meaning they capture noise or random fluctuations in the data rather than the underlying relationship. This can lead to poor generalization to new data.

Interpretability: Polynomial regression models can be more difficult to interpret compared to linear regression models, especially when higher-order terms are included.

Computational Complexity: As the degree of the polynomial increases, the computational complexity of fitting the model also increases. This can make polynomial regression slower and more computationally intensive, especially for large datasets or high-degree polynomials.

Situation for Using Polynomial Regression:

Polynomial regression is preferred over linear regression in situations where the relationship between the independent and dependent variables is non-linear or when a linear model cannot adequately capture the complexity of the data. Some scenarios where polynomial regression may be appropriate include:

- When there is evidence of a curved or non-linear relationship between variables based on exploratory data analysis.
- When the relationship between variables changes over different ranges of the independent variable.
- When the data follows a curved trend, such as growth or decay, that a low-degree polynomial can approximate over the observed range.
- When you need to model interactions between variables that cannot be captured by linear terms alone.