# Ordinary Least Squares (OLS)

> Ordinary least squares (OLS) linear regression is a statistical modeling technique that estimates the linear relationship between a dependent variable and one or more independent variables by minimizing the sum of squared differences between observed and predicted values.

## 🧠 Theory Behind It

* OLS minimizes the residual sum of squares $\lVert y - X\beta \rVert^2$, which has the closed-form solution:

  $$
  \hat{\beta} = (X^T X)^{-1} X^T y
  $$
* It assumes:

  * Linear relationship between the predictors and the response
  * No perfect multicollinearity (otherwise $X^T X$ is not invertible)
  * Homoscedasticity (equal variance of errors)
  * Errors are uncorrelated and normally distributed (for inference)
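The closed form above follows from setting the gradient of the residual sum of squares to zero. A short derivation, assuming $X$ already contains the intercept column and $X^T X$ is invertible (the symbol $S(\beta)$ is introduced here only for illustration):

$$
\begin{aligned}
S(\beta) &= (y - X\beta)^T (y - X\beta) = y^T y - 2\,\beta^T X^T y + \beta^T X^T X \beta \\
\nabla_\beta S(\beta) &= -2\,X^T y + 2\,X^T X \beta = 0 \\
X^T X \hat{\beta} &= X^T y \quad\Longrightarrow\quad \hat{\beta} = (X^T X)^{-1} X^T y
\end{aligned}
$$

The middle line is the system usually called the normal equations; the "no perfect multicollinearity" assumption is what guarantees it has a unique solution.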

In [5]:
import numpy as np

In [3]:
class OLS:
    def __init__(self):
        self.coefficients = None

    def fit(self, X, y):
        # Add intercept term (column of ones)
        X = np.c_[np.ones(X.shape[0]), X]
        # Solve the normal equations (XᵀX)β = Xᵀy; this is equivalent to
        # β = (XᵀX)^(-1)Xᵀy but avoids forming the explicit inverse
        self.coefficients = np.linalg.solve(X.T @ X, X.T @ y)

    def predict(self, X):
        X = np.c_[np.ones(X.shape[0]), X]
        return X @ self.coefficients

    def summary(self):
        print("Coefficients:")
        print(self.coefficients)

In [6]:
# Example data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 5, 4, 5])

model = OLS()
model.fit(X, y)
predictions = model.predict(X)

print("Predictions:", predictions)
model.summary()

Predictions: [2.8 3.4 4.  4.6 5.2]
Coefficients:
[2.2 0.6]
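Forming $X^T X$ squares the condition number of the problem, which can hurt accuracy when predictors are nearly collinear. As a rough sketch of an alternative, `np.linalg.lstsq` solves the least-squares problem directly from the design matrix. The function name `fit_ols_lstsq` and the `*_demo` variables are hypothetical and chosen only for this example; the data is the same toy data used above.

```python
import numpy as np

def fit_ols_lstsq(X, y):
    """Return [intercept, slope(s)] estimated with np.linalg.lstsq."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept
    X_design = np.column_stack([np.ones(X.shape[0]), X])
    # lstsq minimizes ||y - X_design @ beta||^2 without forming XᵀX
    coefficients, _residuals, _rank, _singular_values = np.linalg.lstsq(
        X_design, y, rcond=None
    )
    return coefficients

X_demo = np.array([[1], [2], [3], [4], [5]])
y_demo = np.array([2, 4, 5, 4, 5])
coef_demo = fit_ols_lstsq(X_demo, y_demo)
print("Coefficients:", coef_demo)
```

On well-conditioned data like this, the result should agree with the closed-form coefficients up to floating-point rounding.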


In [7]:
from sklearn.linear_model import LinearRegression

In [8]:
# Fit model
model = LinearRegression()
model.fit(X, y)

# Results
print("Intercept:", model.intercept_)
print("Coefficient(s):", model.coef_)
print("R^2 Score:", model.score(X, y))

# Prediction
y_pred = model.predict(X)
print("Predictions:", y_pred)

Intercept: 2.2
Coefficient(s): [0.6]
R^2 Score: 0.6000000000000001
Predictions: [2.8 3.4 4.  4.6 5.2]
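The R² score reported above can be reproduced by hand as $1 - SS_{res} / SS_{tot}$, where $SS_{res}$ is the sum of squared residuals and $SS_{tot}$ is the sum of squared deviations of $y$ from its mean. A minimal sketch, with the hypothetical helper `r_squared` and the predictions copied from the output above:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return 1.0 - ss_res / ss_tot

y_obs = np.array([2, 4, 5, 4, 5])
y_fit = np.array([2.8, 3.4, 4.0, 4.6, 5.2])
r2_manual = r_squared(y_obs, y_fit)
print("R^2 (manual):", r2_manual)
```

Here $SS_{res} = 2.4$ and $SS_{tot} = 6$, so the fitted line explains 60% of the variation in `y`.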


In [10]:
import statsmodels.api as sm

In [11]:
# Add a constant column for the intercept
# (stored under a new name so the original X is not overwritten)
X_const = sm.add_constant(X)

# Fit model
model = sm.OLS(y, X_const)
results = model.fit()

# Summary
print(results.summary())

                            OLS Regression Results                            
Dep. Variable:                      y   R-squared:                       0.600
Model:                            OLS   Adj. R-squared:                  0.467
Method:                 Least Squares   F-statistic:                     4.500
Date:                Wed, 06 Aug 2025   Prob (F-statistic):              0.124
Time:                        14:54:03   Log-Likelihood:                -5.2598
No. Observations:                   5   AIC:                             14.52
Df Residuals:                       3   BIC:                             13.74
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          2.2000      0.938      2.345      0.1

  warn("omni_normtest is not valid with less than 8 observations; %i "
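The `std err` and `t` columns of a nonrobust summary can be reproduced with NumPy, which makes it easier to see where they come from. The sketch below assumes the usual homoscedastic estimate $\widehat{\mathrm{Var}}(\hat{\beta}) = \hat{\sigma}^2 (X^T X)^{-1}$ with $\hat{\sigma}^2 = SS_{res} / (n - p)$; the helper `ols_standard_errors` and the `*_se` variable names are hypothetical.

```python
import numpy as np

def ols_standard_errors(X_design, y):
    """Return (coefficients, standard errors, t statistics) for an OLS fit.

    X_design must already include the intercept column.
    """
    X_design = np.asarray(X_design, dtype=float)
    y = np.asarray(y, dtype=float)
    n_obs, n_params = X_design.shape
    xtx_inv = np.linalg.inv(X_design.T @ X_design)
    beta = xtx_inv @ X_design.T @ y
    residuals = y - X_design @ beta
    # Unbiased error-variance estimate: SS_res / (n - p)
    sigma2 = residuals @ residuals / (n_obs - n_params)
    std_err = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta, std_err, beta / std_err

X_se = np.column_stack([np.ones(5), [1, 2, 3, 4, 5]])
y_se = np.array([2, 4, 5, 4, 5])
beta_se, std_err_se, t_se = ols_standard_errors(X_se, y_se)
print("coef:   ", np.round(beta_se, 4))
print("std err:", np.round(std_err_se, 3))
print("t:      ", np.round(t_se, 3))
```

The first entry of each array corresponds to the `const` row of the summary table. With only 5 observations and 3 residual degrees of freedom, these standard errors are wide and any inference should be read with caution.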


## References

- [Linear Regression](https://righteous-guardian-68f.notion.site/Linear-Regression-46ad6f94e3524ed6bf819e5a57aa2f39?source=copy_link)