#Question 1: What is Simple Linear Regression (SLR)? Explain its purpose.
 - Simple Linear Regression (SLR) is a basic statistical method used to model the relationship between one independent variable (X) and one dependent variable (Y).

 - It assumes this relationship is linear, meaning it can be represented with a straight-line equation:
  
 - Y = β0 + β1X
 - Where:

   - β₀ (Intercept) → The value of Y when X = 0

   - β₁ (Slope/Coefficient) → The change in Y for a one-unit increase in X

 - Purpose of Simple Linear Regression:
  - 1.Prediction:-
    - To predict the value of Y based on a given X.
    - Example: Predicting a student's exam score (Y) based on hours studied (X).

 - 2.Understanding Relationships:-
    - To measure whether changes in X are associated with changes in Y, and how strongly.
    - Example: Is higher advertising spend (X) associated with higher sales (Y)?  
 - 3.Trend Analysis:-
    - To observe how Y changes as X increases or decreases.  

#Question 2: What are the key assumptions of Simple Linear Regression?
 - Key Assumptions of Simple Linear Regression (SLR):
 - 1.Linearity :-
     - The relationship between the independent variable (X) and the dependent variable (Y) should be approximately a straight line. If the pattern is clearly curved, or there is no pattern at all, a straight line will fit the data poorly.
 - 2.Independence of Errors :-
     - The errors (difference between actual and predicted values) should not influence each other. Each prediction mistake should be independent of the next one.
 - 3.Homoscedasticity :-
     - The errors should have constant variance across all levels of X. In simple words, the spread of residuals should be uniform and not increase or decrease with X.    

  - 4.Normality of Errors :-
     - The residuals (errors) should follow a normal distribution—most errors should be close to zero, and very high/low errors should be rare.   

  - 5.No Multicollinearity :-
     - This isn’t a concern in Simple Linear Regression because there is only one independent variable. (This assumption is more relevant in Multiple Linear Regression.)  
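
The error-related assumptions are usually checked by computing the residuals and inspecting them. The sketch below is only illustrative: it reuses the five data points and the fitted values (slope 0.6, intercept 2.2) from the Question 9 example, and the helper name `residuals` is hypothetical. With so few points it cannot confirm any assumption; in practice the residuals would be plotted against X to look for curves, trends, or a changing spread.

```python
# Data and fitted coefficients taken from the Question 9 example.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

def residuals(x, y, b0, b1):
    """Return actual minus predicted value for every observation (hypothetical helper)."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

res = residuals(x, y, b0, b1)
for xi, ri in zip(x, res):
    print(f"X = {xi}: residual = {ri:+.2f}")

# For a least-squares line with an intercept, the residuals sum to (about) zero.
print("Sum of residuals is close to zero:", abs(sum(res)) < 1e-9)
```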

#Question 3: Write the mathematical equation for a simple linear regression model and explain each term.
- Mathematical Equation :-
- Y = β0 + β1X

- Y (Dependent Variable)
    - The outcome or value to predict.
    - Example: House price, salary, marks, etc.
- X (Independent Variable)
    - The input or predictor that influences Y.
    - Example: Size of the house, years of experience, hours studied.  

- β₀ (Beta Zero / Intercept)
    - The value of Y when X = 0.
    - It represents the starting point of the regression line.

- β₁ (Beta One / Slope / Coefficient)
    - It shows how much Y changes when X increases by 1 unit.
    - Example: If β₁ = 2, then for every 1 unit increase in X, Y increases by 2.


#Question 4: Provide a real-world example where simple linear regression can be applied.

 - A real-world example of simple linear regression is predicting a student's exam score based on the number of hours studied.

 - Independent Variable (X): Hours studied
 - Dependent Variable (Y): Exam score

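 - As the number of study hours increases, the exam score is also likely to increase. By applying simple linear regression, we can estimate this relationship from past data and use it to predict the score for a given number of study hours.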

 - Suppose the regression equation we found from past data is :-
   - Exam Score (Y) = 30 + 8 × (Hours Studied)

 - If a student studies for 5 hours, we can predict :-
   - Y = 30 + 8 × 5 = 30 + 40 = 70
 - Predicted Exam Score = 70 marks
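
The same calculation can be written as a small Python function. The coefficients 30 and 8 come from the example equation above; the function name `predict_score` is made up for illustration.

```python
def predict_score(hours_studied, intercept=30, slope=8):
    """Predicted exam score from the example equation Y = 30 + 8 * X."""
    return intercept + slope * hours_studied

print(predict_score(5))  # 30 + 8 * 5 = 70
```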

#Question 5: What is the method of least squares in linear regression?
 - The method of least squares is a technique used in linear regression to find the best-fitting line by minimizing the sum of the squared differences between the actual values (Y) and the predicted values ($\hat{Y}$).

 - It ensures that the line represents the data as accurately as possible by making the total squared errors as small as possible.

 - Regression equation :-
   - Y = β0 + β1X

 - where β0 and β1 are calculated using the least squares method.
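
For a single predictor, minimizing the sum of squared errors has a closed-form solution: $\hat{\beta}_1 = \sum (X_i - \bar{X})(Y_i - \bar{Y}) / \sum (X_i - \bar{X})^2$ and $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$. A minimal sketch in plain Python follows. The function name `least_squares_fit` is hypothetical, and the data are the five points used in the Question 9 example.

```python
def least_squares_fit(x, y):
    """Return (intercept, slope) that minimize the sum of squared errors."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    slope = s_xy / s_xx
    # The fitted line always passes through the point (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return intercept, slope

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = least_squares_fit(x, y)
print(f"Intercept: {b0:.2f}, Slope: {b1:.2f}")  # Intercept: 2.20, Slope: 0.60
```

These values agree with the slope and intercept that scikit-learn reports for the same data in Question 9.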

#Question 6: What is Logistic Regression? How does it differ from Linear Regression?

 - Logistic Regression :-
    - Logistic Regression is a statistical method used to predict a categorical outcome, usually binary (like Yes/No or 0/1), based on one or more independent variables.
 - It uses the logistic (sigmoid) function to convert the linear combination of inputs into a probability between 0 and 1.

 - Equation :-
   - P(Y=1) = 1 / (1 + e^(-(β0 + β1X)))
   - where P(Y=1) is the probability of the event occurring.

 - Difference from Linear Regression :-

 - 1.Output: Linear regression predicts a continuous value (e.g., salary, marks), while logistic regression predicts a categorical value (e.g., 0 or 1, Yes/No).  

 - 2.Prediction Type: Linear regression gives a direct numeric prediction, whereas logistic regression gives a probability, which is then converted into a class label.

 - 3.Equation: Linear regression uses Y = β0 + β1X directly, while logistic regression passes β0 + β1X through the sigmoid function to output a probability.
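
A short sketch of how logistic regression turns the linear part z = β0 + β1X into a probability with the sigmoid 1 / (1 + e^(-z)), and then into a class label. The function names, the coefficients (-4.0 and 1.0), and the 0.5 cut-off are assumptions chosen only for illustration; they are not fitted from any data.

```python
import math

def predict_probability(x, b0, b1):
    """P(Y=1): the sigmoid function applied to the linear part b0 + b1 * x."""
    z = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(x, b0, b1, threshold=0.5):
    """Convert the probability into a 0/1 label using a cut-off."""
    return 1 if predict_probability(x, b0, b1) >= threshold else 0

# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -4.0, 1.0
for x in [0, 4, 8]:
    p = predict_probability(x, b0, b1)
    print(f"x = {x}: P(Y=1) = {p:.3f}, class = {predict_class(x, b0, b1)}")
```

Unlike the straight line of linear regression, the output always stays between 0 and 1, whatever the value of X.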

#Question 7: Name and briefly describe three common evaluation metrics for regression models.

 - 1.Mean Absolute Error (MAE):-
   - MAE measures the average of the absolute differences between the actual values and the predicted values.
   - It tells us, on average, how much the predictions deviate from the actual values, without considering the direction of the error.
   - Lower MAE indicates a better model.

 - 2.Mean Squared Error (MSE):-
   - MSE calculates the average of the squared differences between actual and predicted values.
   - Squaring the errors gives more weight to larger errors, making it sensitive to outliers.
   - Lower MSE indicates a better fit.

 - 3.Root Mean Squared Error (RMSE):-
   - RMSE is the square root of MSE, bringing the error back to the same units as the target variable.
   - It is widely used because it penalizes large errors more than MAE and is easy to interpret.
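
The three metrics can be computed by hand in a few lines. This is a minimal sketch in plain Python; the helper names `mae`, `mse`, and `rmse` are hypothetical, and the predictions come from the line Y = 2.2 + 0.6X fitted in the Question 9 example. scikit-learn also provides `mean_absolute_error` and `mean_squared_error` in `sklearn.metrics`.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average size of the errors, ignoring their sign."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean Squared Error: average of the squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: square root of MSE, in the units of Y."""
    return math.sqrt(mse(actual, predicted))

actual = [2, 4, 5, 4, 5]
predicted = [2.2 + 0.6 * x for x in [1, 2, 3, 4, 5]]

print(f"MAE:  {mae(actual, predicted):.2f}")   # 0.64
print(f"MSE:  {mse(actual, predicted):.2f}")   # 0.48
print(f"RMSE: {rmse(actual, predicted):.2f}")  # 0.69
```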

#Question 8: What is the purpose of the R-squared metric in regression analysis?

- R-squared is a metric that measures how well the independent variable(s) explain the variation in the dependent variable.

- It represents the proportion of the total variance in the dependent variable (Y) that is explained by the regression model.

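 - Value range: 0 to 1 for a least-squares fit with an intercept, evaluated on the data it was fitted to
   - 0 → Model explains none of the variation
   - 1 → Model explains all the variation
   - On new data, R-squared can be negative if the model predicts worse than simply using the mean of Y.

 - Purpose: To indicate the goodness of fit of the regression model. A higher R-squared means the model explains more of the variation in Y.  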

 - R-squared tells us how much of the change in Y can be explained by X.
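
R-squared can be computed as 1 minus the ratio of the unexplained variation (sum of squared residuals) to the total variation of Y around its mean. A minimal sketch, assuming the data and fitted line (Y = 2.2 + 0.6X) from the Question 9 example; the function name `r_squared` is hypothetical. In scikit-learn, `model.score(X, Y)` on a fitted `LinearRegression` and `sklearn.metrics.r2_score` return the same quantity.

```python
def r_squared(actual, predicted):
    """Proportion of the variance in `actual` explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    # Variation left unexplained by the model.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total variation of Y around its mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

actual = [2, 4, 5, 4, 5]
predicted = [2.2 + 0.6 * x for x in [1, 2, 3, 4, 5]]
print(f"R-squared: {r_squared(actual, predicted):.2f}")  # 0.60
```

Here the line explains about 60% of the variation in Y.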



In [2]:
#Question 9: Write Python code to fit a simple linear regression model using scikit-learn and print the slope and intercept.
import numpy as np  # arrays for the input data

from sklearn.linear_model import LinearRegression  # ordinary least squares model

In [5]:
X = np.array([[1], [2], [3], [4], [5]])  # 2-D array: one row per sample, one column for the single feature

In [7]:
Y = np.array([2, 4, 5, 4, 5])  # 1-D array of target values, one per sample

In [8]:
X

array([[1],
       [2],
       [3],
       [4],
       [5]])

In [9]:
Y

array([2, 4, 5, 4, 5])

In [3]:
model = LinearRegression()

In [4]:
model

In [10]:
model.fit(X, Y)  # estimate the intercept and slope by least squares

In [15]:
print("Slope (β1):", model.coef_[0])
print("Intercept (β0):", model.intercept_)

Slope (β1): 0.6
Intercept (β0): 2.2


#Question 10: How do you interpret the coefficients in a simple linear regression model?

 - In a simple linear regression model :-
    - Y = β0 + β1X
 - Intercept (β0) :-
    - The value of Y when X = 0. It represents the starting point of the regression line.
 - Slope / Coefficient (β1) :-
    - The change in Y for a one-unit increase in X.
 - Positive slope → Y increases as X increases.
 - Negative slope → Y decreases as X increases.    