In [None]:
Simple linear regression and multiple linear regression both model the relationship between a dependent
variable and one or more independent variables. The key differences between the two are:

1. Number of Independent Variables:
    - Simple Linear Regression: A single independent variable is used to predict the dependent variable,
      and the relationship is modeled as a straight line.
    - Multiple Linear Regression: Two or more independent variables are used to predict the dependent
      variable, and the relationship is modeled as a linear combination of these variables.
2. Complexity of Model:
    - Simple Linear Regression: This is the simpler model because it involves only one independent
      variable. The model equation is Y = β0 + β1X, where Y is the dependent variable, X is the
      independent variable, β0 is the intercept, and β1 is the slope coefficient.
    - Multiple Linear Regression: This is a more complex model because it involves several independent
      variables. The model equation is Y = β0 + β1X1 + β2X2 + ... + βnXn, where Y is the dependent
      variable, X1, X2, ..., Xn are the independent variables, β0 is the intercept, and β1, β2, ..., βn
      are the coefficients of the independent variables.
3. Interpretation of Coefficients:
    - Simple Linear Regression: There is only one slope coefficient to interpret. It is the expected
      change in the dependent variable for a one-unit change in the independent variable.
    - Multiple Linear Regression: There are several coefficients to interpret. Each is the expected
      change in the dependent variable for a one-unit change in one independent variable while the
      other variables are held constant.
4. Assumptions:
    - Both models rely on the same core assumptions: linearity, independence of errors,
      homoscedasticity, and normality of residuals.
    - Multicollinearity, which occurs when independent variables are highly correlated with each other,
      is a concern only in multiple linear regression, since it cannot arise with a single predictor.
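
The two model equations above can be compared on a small example. The following is a minimal sketch, assuming synthetic noise-free data generated from Y = 3 + 2*X1 + 0.5*X2; the data, the helper name `fit_least_squares`, and the use of `np.linalg.lstsq` are illustrative choices, not part of the original answer.

```python
import numpy as np

# Synthetic, noise-free data (an assumption for illustration): y = 3 + 2*x1 + 0.5*x2
rng = np.random.default_rng(0)
x1 = rng.uniform(0, 10, size=50)
x2 = rng.uniform(0, 10, size=50)
y = 3.0 + 2.0 * x1 + 0.5 * x2

def fit_least_squares(features, target):
    """Return [intercept, coefficient(s)...] from an ordinary least-squares fit."""
    # Prepend a column of ones so the first fitted parameter is the intercept
    design = np.column_stack([np.ones(len(target)), features])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return beta

# Simple linear regression: Y = b0 + b1*X1 (x2 is left out of the model)
beta_simple = fit_least_squares(x1, y)

# Multiple linear regression: Y = b0 + b1*X1 + b2*X2
beta_multiple = fit_least_squares(np.column_stack([x1, x2]), y)

print("Simple   [b0, b1]:    ", beta_simple)
print("Multiple [b0, b1, b2]:", beta_multiple)
```

Because the data contain no noise, the multiple regression recovers the generating coefficients, while the simple regression has to absorb the omitted variable into its intercept and slope.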

In [None]:
Gradient Descent:
    1. Gradient descent is an optimization algorithm that minimizes a cost function by iteratively
       adjusting the parameters of the model.
    2. At each iteration the coefficients are updated in the direction of steepest descent of the cost
       function, i.e., along the negative gradient.
    3. Gradient descent can be used for both linear and non-linear regression models.
    4. It scales well to a large number of features or instances because it does not require the
       inversion of a matrix.
    5. It requires hyperparameter tuning, most importantly of the learning rate, to ensure convergence
       and avoid overshooting the minimum.
Normal Equation Method:
    1. The normal equation method is a closed-form solution that computes the coefficients of a linear
       regression model by directly solving the normal equations.
    2. It finds the parameters that minimize the cost function analytically.
    3. It gives the exact least-squares solution for the regression coefficients without iteration.
    4. It becomes computationally expensive when the number of features is large, because it involves
       inverting the matrix (X^T*X), whose size equals the number of features and whose inversion cost
       grows roughly cubically with that number.
    5. It fails when the matrix (X^T*X) is not invertible, for example under perfect multicollinearity,
       and it becomes numerically unstable when features are highly correlated.

In summary, the main differences between gradient descent and the normal equation method for estimating
regression coefficients are the iterative vs. analytical nature of the methods, computational efficiency,
and sensitivity to hyperparameters and data characteristics. Gradient descent is more flexible and
computationally efficient for large problems, while the normal equation method provides an exact solution
but may be computationally expensive with many features and is sensitive to collinearity.
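
The invertibility problem mentioned for the normal equation can be seen on a tiny example. This is a sketch under stated assumptions: the data are hypothetical, the third column is constructed as exactly twice the second, and `np.linalg.lstsq` is used as a fallback solver rather than anything prescribed above.

```python
import numpy as np

# Hypothetical data: the third column is exactly twice the second (perfect multicollinearity)
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones(5), x1, 2.0 * x1])
y = 1.0 + 2.0 * x1

# X^T X has the same rank as X; a rank below the number of columns means it is singular
rank = np.linalg.matrix_rank(X)
print("Columns:", X.shape[1], "Rank:", rank)

# lstsq returns a minimum-norm least-squares solution even when X^T X cannot be inverted
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
predictions = X @ theta
print("Coefficients:", theta)
print("Predictions: ", predictions)
```

The fitted values still match `y`, but the individual coefficients are no longer uniquely determined, which is why multicollinearity makes them hard to interpret.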

In [None]:
1. Assumption of Linearity: Linear regression assumes a linear relationship between the independent
   variables and the dependent variable. If this assumption is violated, the model may not capture the
   true relationship in the data.
2. Multicollinearity: Multicollinearity occurs when independent variables in the model are highly
   correlated with each other. This can lead to unstable coefficients and make it hard to interpret
   the individual effect of each variable on the dependent variable.
3. Heteroscedasticity: Heteroscedasticity refers to unequal variance of the errors across the range of
   the predictor variables. This violates the constant-variance assumption of linear regression. The
   coefficient estimates become inefficient, and their standard errors, confidence intervals, and
   hypothesis tests become unreliable.
4. Outliers: Outliers are data points that do not follow the general trend of the data. They can pull
   the estimated coefficients toward themselves and reduce the model's predictive accuracy.
5. Overfitting or Underfitting: Overfitting occurs when the model captures noise in the training data
   rather than the underlying relationship, leading to poor generalization to new data. Underfitting
   occurs when the model is too simple to capture the true relationship, resulting in low predictive
   accuracy.
6. Non-linear Relationships: If the true relationship between the independent and dependent variables
   is non-linear, a linear fit cannot capture it accurately unless the features are transformed.
7. Independence of Errors: Linear regression assumes that the errors (residuals) are independent of
   each other. Violations of this assumption, such as autocorrelation in time-series data, typically
   distort the standard errors and lead to incorrect inferences.
8. Limited Expressive Power: Linear regression is a simple and interpretable model, but it may lack the
   expressive power to capture complex relationships that more flexible models, such as polynomial
   regression or decision trees, can capture.

Addressing these issues involves careful data preprocessing, model selection, and diagnostics to ensure
that the linear regression model is appropriate for the data and provides reliable results.
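
As one illustration of the outlier issue (item 4), the sketch below fits a line to hypothetical data lying exactly on y = 2x + 1, then refits after replacing a single observation with an extreme value. The data and the choice of `np.polyfit` are assumptions made for the example.

```python
import numpy as np

# Hypothetical data lying exactly on the line y = 2x + 1
x = np.arange(10, dtype=float)
y_clean = 2.0 * x + 1.0

# Same data with the last observation replaced by an extreme value
y_outlier = y_clean.copy()
y_outlier[-1] = 100.0

# np.polyfit with deg=1 returns [slope, intercept] of the least-squares line
slope_clean, intercept_clean = np.polyfit(x, y_clean, deg=1)
slope_outlier, intercept_outlier = np.polyfit(x, y_outlier, deg=1)

print(f"Without outlier: slope={slope_clean:.3f}, intercept={intercept_clean:.3f}")
print(f"With outlier:    slope={slope_outlier:.3f}, intercept={intercept_outlier:.3f}")
```

A single point at the edge of the x range is enough to change the slope substantially, because least squares penalizes large residuals quadratically.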

In [None]:
Gradient Descent is an optimization algorithm commonly used to minimize the loss function in linear
regression models. In linear regression, the goal is to find the values of the coefficients (weights)
of the linear equation that best fit the data.
Here's how Gradient Descent works in the context of linear regression:
1. Initialize the coefficients: Start with some initial values for the coefficients of the linear equation.
2. Calculate the gradient: Compute the gradient of the loss function with respect to each coefficient.
    The gradient points in the direction of steepest increase of the loss, so moving against it reduces the loss.
3. Update the coefficients: Adjust the coefficients in the opposite direction of the gradient to move towards
    the minimum of the loss function. This is done iteratively using the following formula:
        new_coefficient = old_coefficient - learning_rate * gradient
    Here, the learning rate is a hyperparameter that controls the size of the step taken at each iteration.
4. Repeat steps 2 and 3: Continue calculating the gradient and updating the coefficients
    until convergence is reached, i.e., when the changes in the coefficients become very small or
    the loss stops decreasing.

By iteratively updating the coefficients in this way, Gradient Descent finds the best-fitting line,
the one that minimizes the error between the predicted values and the actual values in the training data.
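
The four steps above can be sketched as follows. This is a minimal batch gradient descent on the mean squared error, assuming hypothetical noise-free data; the function name, default learning rate, iteration cap, and tolerance are illustrative choices rather than recommended settings.

```python
import numpy as np

def gradient_descent(X, y, learning_rate=0.1, n_iterations=5000, tolerance=1e-10):
    """Fit linear-regression coefficients by batch gradient descent on the MSE.

    X must already contain a leading column of ones for the intercept.
    """
    n_samples, n_params = X.shape
    theta = np.zeros(n_params)                             # step 1: initialize
    for _ in range(n_iterations):
        residuals = X @ theta - y
        gradient = (2.0 / n_samples) * (X.T @ residuals)   # step 2: gradient of the MSE
        new_theta = theta - learning_rate * gradient       # step 3: update
        converged = np.max(np.abs(new_theta - theta)) < tolerance
        theta = new_theta
        if converged:                                      # step 4: stop when updates are tiny
            break
    return theta

# Hypothetical noise-free data: y = 4 + 3*x, with x in [0, 1]
x = np.linspace(0.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])
y = 4.0 + 3.0 * x

theta = gradient_descent(X, y)
print("Estimated [intercept, slope]:", theta)
```

The feature here already lies in [0, 1]; with features on much larger scales the same learning rate could diverge, which is why scaling and learning-rate tuning usually go together.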

In [1]:
import numpy as np
from sklearn.linear_model import LinearRegression

# Data
TV = np.array([127.44, 135.76, 130.14, 127.24, 121.18, 132.29, 121.88, 144.59, 148.18, 119.17]).reshape(-1, 1)
Radio = np.array([66.95, 56.75, 68.38, 74.05, 56.22, 64.40, 64.80, 64.31, 55.58, 73.82]).reshape(-1, 1)
Sales = np.array([716.54, 660.01, 658.56, 679.68, 632.94, 751.38, 691.32, 732.85, 691.15, 693.58])

# Fit the multiple regression model
X = np.hstack((TV, Radio))
model = LinearRegression().fit(X, Sales)

# Print the coefficients
print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)

# Predict sales for new data points
new_TV = np.array([140.0]).reshape(-1, 1)
new_Radio = np.array([60.0]).reshape(-1, 1)
new_X = np.hstack((new_TV, new_Radio))
predicted_sales = model.predict(new_X)
print("Predicted sales for new data points:", predicted_sales)

Coefficients: [2.08016218 2.76138544]
Intercept: 240.5616715061651
Predicted sales for new data points: [697.46750372]


In [2]:
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Given data
X = np.array([15, 23, 18, 23, 24, 22, 22, 19, 19, 16]).reshape(-1, 1)
Y = np.array([49, 63, 58, 60, 58, 61, 60, 63, 60, 52])

# Linear regression model
linear_model = LinearRegression().fit(X, Y)

# Polynomial regression model of degree 3
poly_features = PolynomialFeatures(degree=3)
X_poly = poly_features.fit_transform(X)
poly_model = LinearRegression().fit(X_poly, Y)

# Print coefficients of linear regression model
print("Linear Regression Model Coefficients:")
print("Intercept:", linear_model.intercept_)
print("Coefficient:", linear_model.coef_[0])

# Print coefficients of polynomial regression model
print("\nPolynomial Regression Model Coefficients:")
print("Intercept:", poly_model.intercept_)
print("Coefficients:", poly_model.coef_)

Linear Regression Model Coefficients:
Intercept: 37.463442069741276
Coefficient: 1.0416197975253096

Polynomial Regression Model Coefficients:
Intercept: -103.8179600958095
Coefficients: [ 0.00000000e+00  1.57135372e+01 -3.66067751e-01 -2.86182579e-04]


In [None]:
The normal equation method is a mathematical approach used to find the parameters (coefficients) of a multiple 
linear regression model directly, without requiring iterative optimization algorithms like gradient descent.
It provides a closed-form solution for the optimal parameters that minimize the sum of squared errors between 
the predicted values and the actual values in the training data.
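
Written out, the closed-form solution follows from setting the gradient of the squared-error objective to zero. The notation is an assumption chosen for this sketch: X is the design matrix (including a column of ones for the intercept), y is the vector of observed values, and θ is the coefficient vector.

```latex
J(\theta) = \lVert X\theta - y \rVert^2, \qquad
\nabla_\theta J(\theta) = 2\,X^\top (X\theta - y) = 0
\;\Longrightarrow\;
X^\top X\,\theta = X^\top y
\;\Longrightarrow\;
\hat{\theta} = (X^\top X)^{-1} X^\top y
```

The last step requires X^T X to be invertible.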

In [None]:
Advantages of the normal equation method:

- No need for feature scaling: Unlike gradient descent, the normal equation method does not require
  feature scaling, making it convenient when dealing with features on different scales.
- Direct solution: The normal equation provides a direct solution for the optimal parameters without
  the need for iteration.
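
The feature-scaling point can be checked numerically: a closed-form least-squares fit produces the same predictions whether or not the features are standardized. The sketch below assumes hypothetical housing-style data and uses `np.linalg.lstsq` as a stand-in for the closed-form solution; all names are illustrative.

```python
import numpy as np

# Hypothetical features on very different scales
rng = np.random.default_rng(42)
area = rng.uniform(500, 3000, size=30)
rooms = rng.integers(1, 6, size=30).astype(float)
price = 50_000 + 120.0 * area + 10_000.0 * rooms

def fit_predict(features, target):
    """Least-squares fit with an intercept column; returns the fitted values."""
    design = np.column_stack([np.ones(len(target)), features])
    theta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return design @ theta

raw = np.column_stack([area, rooms])
standardized = (raw - raw.mean(axis=0)) / raw.std(axis=0)

pred_raw = fit_predict(raw, price)
pred_scaled = fit_predict(standardized, price)
print("Max difference in predictions:", np.max(np.abs(pred_raw - pred_scaled)))
```

The coefficients themselves differ between the two fits, since they are expressed in different units, but the fitted values agree up to floating-point error.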

In [3]:
import numpy as np

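def normal_equation(X, y):
    """Estimate linear-regression parameters with the normal equation.

    X is an (n_samples, n_features) array and y is an (n_samples,) array.
    """
    # Add an intercept column only when X holds a single feature column
    if X.shape[1] == 1:
        X = np.hstack((np.ones((X.shape[0], 1)), X))

    # Solve (X^T X) theta = X^T y; np.linalg.solve avoids forming the explicit inverse
    theta = np.linalg.solve(X.T @ X, X.T @ y)

    return theta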

# Example usage
X = np.array([[1, 15], [1, 23], [1, 18], [1, 23], [1, 24], [1, 22], [1, 22], [1, 19], [1, 19], [1, 16]])
y = np.array([49, 63, 58, 60, 58, 61, 60, 63, 60, 52])
theta = normal_equation(X, y)
print("Parameters (coefficients) estimated by the normal equation method:")
print(theta)

Parameters (coefficients) estimated by the normal equation method:
[37.46344207  1.0416198 ]
