# Assignment

Question 1: What is Simple Linear Regression (SLR)? Explain its purpose.
   - Simple Linear Regression (SLR) is a statistical technique used to model the
linear relationship between one independent variable (X) and one dependent
variable (Y).

  - It is represented as:
        Y = β0 + β1X + ε

   - Purpose:
      - To predict the value of Y from X
      - To understand the relationship between two variables
      - To measure the effect of X on Y
      - To analyze trends and patterns in data
      - To support decision-making and forecasting


Question 2: What are the key assumptions of Simple Linear Regression?
   - The key assumptions of Simple Linear Regression are:

      1. Linearity: There is a linear relationship between the independent variable (X) and the dependent variable (Y).
      2. Independence: The observations are independent of each other.
      3. Homoscedasticity: The variance of the error terms is constant for all values of X.
      4. Normality: The error terms are normally distributed with a mean of zero.
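      5. Variation in X: The independent variable is not constant; it takes at least two distinct values across observations. (Multicollinearity itself is not a concern here, because the model has only one predictor.)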
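
Some of these assumptions can be checked informally from the residuals of a fitted line. The following is only a minimal sketch on synthetic data: the parameter values, the seed, and the idea of comparing the residual spread in the lower and upper halves of X are illustrative assumptions, not a formal diagnostic test.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # fixed seed so the run is repeatable

# Synthetic data (assumed values, for illustration only)
X = np.linspace(0, 10, 100)
Y = 3.0 + 2.0 * X + rng.normal(0.0, 1.0, size=X.size)

# Fit a straight line and compute the residuals
slope, intercept = np.polyfit(X, Y, deg=1)
residuals = Y - (intercept + slope * X)

# X must take more than one distinct value
x_varies = X.max() > X.min()

# Rough homoscedasticity check: residual spread for small X vs. large X
median_x = np.median(X)
low_spread = residuals[X <= median_x].std()
high_spread = residuals[X > median_x].std()

print("X varies:", x_varies)
print("Mean residual:", residuals.mean())
print("Residual spread (low X, high X):", low_spread, high_spread)
```

A similar spread in both halves is consistent with constant error variance; a plot of residuals against X is the more usual way to inspect this.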


Question 3: Write the mathematical equation for a simple linear regression model and explain each term.
   - The mathematical equation of a Simple Linear Regression model is:
        
        Y = β0 + β1X + ε

Explanation of each term:

  - Y  : Dependent variable (the output or response to be predicted)
  - X  : Independent variable (the input or predictor)
  - β0 : Intercept - the expected value of Y when X = 0
  - β1 : Slope - the expected change in Y for a one-unit increase in X
  - ε  : Error term - represents random error or unexplained variation
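
To make the roles of the terms concrete, the sketch below generates data from the equation itself. The values chosen for β0, β1 and the error standard deviation are hypothetical and serve only as an illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable

# Hypothetical true parameters, chosen only for illustration
beta0_true = 2.0  # intercept
beta1_true = 0.5  # slope

X = np.linspace(0, 10, 50)                           # independent variable
eps = rng.normal(loc=0.0, scale=1.0, size=X.size)    # error term ε
Y = beta0_true + beta1_true * X + eps                # dependent variable

# The systematic part is a straight line; ε is what remains around it
line = beta0_true + beta1_true * X
print("Mean of the simulated errors:", eps.mean())
print("Largest deviation from the line:", np.abs(Y - line).max())
```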


Question 4: Provide a real-world example where simple linear regression can be
applied.
   - Real-world example of Simple Linear Regression:

  - Simple Linear Regression can be applied to predict house prices based on house size.

  - Example:
    - X = Size of the house (in square feet)
    - Y = Price of the house

  - As the size of the house increases, the price generally increases.
Simple Linear Regression helps model this relationship and predict
the house price for a given house size.
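
As a rough illustration of this example, the sketch below fits a line to a handful of made-up house sizes and prices. The numbers are invented for demonstration and do not come from any real data set.

```python
import numpy as np

# Made-up example data (not real market data)
size_sqft = np.array([800, 1000, 1200, 1500, 1800], dtype=float)               # X
price = np.array([150_000, 190_000, 235_000, 290_000, 350_000], dtype=float)   # Y

# Fit price = intercept + slope * size
slope, intercept = np.polyfit(size_sqft, price, deg=1)

# Predict the price of a hypothetical 1300 sq ft house
predicted_price = intercept + slope * 1300.0

print(f"Estimated price per extra square foot: {slope:.2f}")
print(f"Predicted price for 1300 sq ft: {predicted_price:.0f}")
```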

Question 5: What is the method of least squares in linear regression?
   - The Method of Least Squares is a technique used in linear regression to
estimate the best-fitting line for a given set of data points.

  - It works by minimizing the sum of the squares of the errors (residuals)
between the observed values and the predicted values.

  - Residual = Actual value - Predicted value

  - Objective: Minimize  Σ (Yi - Ŷi)²

  - Where:
    - Yi  = Actual observed value
    - Ŷi  = Predicted value from the regression line

  - Purpose:
      - To find optimal values of β0 (intercept) and β1 (slope)
      - To ensure the regression line fits the data as closely as possible
      - To reduce overall prediction error
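
For simple linear regression the minimizing values have a closed form: β1 = Σ(Xi - X̄)(Yi - Ȳ) / Σ(Xi - X̄)² and β0 = Ȳ - β1·X̄. The sketch below applies these formulas to the small data set from Question 9 and shows that changing the slope away from the least-squares value increases the sum of squared residuals. The size of the perturbation is an arbitrary choice.

```python
import numpy as np

# Data set from Question 9
X = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 6], dtype=float)

# Closed-form least-squares estimates
x_mean, y_mean = X.mean(), y.mean()
beta1 = np.sum((X - x_mean) * (y - y_mean)) / np.sum((X - x_mean) ** 2)
beta0 = y_mean - beta1 * x_mean

def sse(b0, b1):
    """Sum of squared residuals for the line y = b0 + b1 * x."""
    return np.sum((y - (b0 + b1 * X)) ** 2)

best_sse = sse(beta0, beta1)
perturbed_sse = sse(beta0, beta1 + 0.1)  # any other slope fits worse

print("beta1:", beta1, "beta0:", beta0)
print("SSE at the least-squares line:", best_sse)
print("SSE with a perturbed slope:", perturbed_sse)
```

Up to floating-point rounding, the slope and intercept agree with the scikit-learn result of 0.8 and 1.8.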


Question 6: What is Logistic Regression? How does it differ from Linear Regression?
  - Logistic Regression is a statistical and machine learning technique used
for binary classification problems, where the dependent variable has
two possible outcomes (such as 0/1, Yes/No, True/False).
  - Instead of predicting continuous values, Logistic Regression predicts
the probability that an observation belongs to a particular class.
It uses the sigmoid (logistic) function to map predicted values between 0 and 1.

  - Sigmoid function: P(Y=1) = 1 / (1 + e^(-z)), where z = β0 + β1X is the linear combination of the input

  - Difference between Linear Regression and Logistic Regression:
      - 1. Output Type:
        - Linear Regression -> Continuous values
        - Logistic Regression -> Probabilities / Class labels
      - 2. Dependent Variable:
        - Linear Regression -> Numerical
        - Logistic Regression -> Categorical (Binary)
      - 3. Model Equation:
        - Linear Regression -> Straight line
        - Logistic Regression -> Sigmoid (S-shaped) curve
      - 4. Use Case:
        - Linear Regression -> Regression problems (predicting a continuous quantity)
        - Logistic Regression -> Classification problems
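
The sigmoid step can be sketched in a few lines. Here z is assumed to be the linear score β0 + β1X, and the coefficient values and the 0.5 decision threshold are hypothetical choices for illustration rather than fitted results.

```python
import numpy as np

def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients, not estimated from data
beta0, beta1 = -4.0, 1.0

X = np.array([0.0, 2.0, 4.0, 6.0, 8.0])
z = beta0 + beta1 * X            # linear score, can be any real number
p = sigmoid(z)                   # predicted probability that Y = 1
labels = (p >= 0.5).astype(int)  # class label using a 0.5 threshold

print("Probabilities:", p)
print("Class labels:", labels)
```

The linear score is unbounded, while the sigmoid output always stays between 0 and 1, which is why it can be read as a probability.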

Question 7: Name and briefly describe three common evaluation metrics for regression models.
  - Three common evaluation metrics for regression models are:
    - 1. Mean Squared Error (MSE): It measures the average of the squared differences between actual and predicted values. Larger errors are penalized more due to squaring.
      - Formula: MSE = (1/n) Σ (Yi - Ŷi)²

    - 2. Mean Absolute Error (MAE): It calculates the average of the absolute differences between actual and predicted values. Each error contributes in proportion to its size, so MAE is less sensitive to outliers than MSE.
      - Formula: MAE = (1/n) Σ |Yi - Ŷi|

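    - 3. R-squared (R²): It represents the proportion of variance in the dependent variable that is explained by the regression model. For a least-squares fit with an intercept, evaluated on the training data, its value lies between 0 and 1; on new data it can be negative.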
      - R² = 1 → Perfect fit
      - R² = 0 → No explanatory power
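
All three metrics are available in `sklearn.metrics`. The sketch below evaluates them for the Question 9 data, taking the predictions from the line Ŷ = 1.8 + 0.8X; using those rounded coefficients is an assumption made to keep the example short.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Data set from Question 9
X = np.array([1, 2, 3, 4, 5], dtype=float)
y_true = np.array([2, 4, 5, 4, 6], dtype=float)

# Predictions from the line Y-hat = 1.8 + 0.8 * X
y_pred = 1.8 + 0.8 * X

mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)

print(f"MSE: {mse:.3f}")
print(f"MAE: {mae:.3f}")
print(f"R^2: {r2:.3f}")
```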


Question 8: What is the purpose of the R-squared metric in regression analysis?
  - The purpose of the R-squared (R²) metric in regression analysis is to measure how well the regression model explains the variability of the dependent variable.

  - R-squared indicates the proportion of the total variation in the dependent variable that is explained by the independent variable(s).
  - Purpose:
    - To evaluate the goodness of fit of a regression model
    - To compare different regression models fitted to the same dependent variable
    - To understand the explanatory power of predictors


Question 9: Write Python code to fit a simple linear regression model using scikit-learn and print the slope and intercept.

```python
# Import required libraries
import numpy as np
from sklearn.linear_model import LinearRegression

# Sample data (Independent variable X and Dependent variable y)
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([2, 4, 5, 4, 6])

# Create and fit the model
model = LinearRegression()
model.fit(X, y)

# Print slope and intercept
print("Slope (β1):", model.coef_[0])
print("Intercept (β0):", model.intercept_)
```

Output:

```
Slope (β1): 0.8000000000000002
Intercept (β0): 1.7999999999999998
```


Question 10: How do you interpret the coefficients in a simple linear regression model?
  - In a Simple Linear Regression model, the coefficients explain the
relationship between the independent variable (X) and the dependent
variable (Y).
    - 1. Intercept (β0):
   It represents the value of the dependent variable (Y) when the
   independent variable (X) is zero.
    - 2. Slope (β1):
   It represents the average change in the dependent variable (Y)
   for a one-unit increase in the independent variable (X).
  - Interpretation:
    - A positive β1 indicates a positive relationship between X and Y
    - A negative β1 indicates a negative relationship between X and Y
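    - The magnitude of β1 gives the size of the expected change in Y per one-unit increase in X; it depends on the units of X and Y, so it does not by itself measure the strength of the relationship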
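
These interpretations can be checked numerically. The sketch below refits the Question 9 model and confirms that the prediction at X = 0 equals the intercept and that a one-unit step in X changes the prediction by the slope. The rescaling of X by a factor of 100 is a hypothetical change of units, added to show that the numeric size of β1 depends on how X is measured.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Data set from Question 9
X = np.array([1, 2, 3, 4, 5], dtype=float).reshape(-1, 1)
y = np.array([2, 4, 5, 4, 6], dtype=float)

model = LinearRegression().fit(X, y)

# Intercept: the prediction when X = 0
at_zero = model.predict(np.array([[0.0]]))[0]

# Slope: the change in the prediction for a one-unit increase in X
step = model.predict(np.array([[4.0]]))[0] - model.predict(np.array([[3.0]]))[0]

# Hypothetical change of units: X measured in units 100 times smaller
rescaled = LinearRegression().fit(X * 100, y)

print("Prediction at X = 0:", at_zero, "| intercept:", model.intercept_)
print("Change per unit of X:", step, "| slope:", model.coef_[0])
print("Slope after rescaling X by 100:", rescaled.coef_[0])
print("R^2 before and after rescaling:", model.score(X, y), rescaled.score(X * 100, y))
```

The slope shrinks by the same factor of 100 while R² is unchanged, so the fit is equally good in both unit systems.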
