# Improved Implementation of Stochastic Linear Regression
This version adds L2 regularization and efficiency improvements.\
It is recommended to check the cost after convergence.
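
For orientation, here is a minimal sketch of what stochastic gradient descent with an L2 penalty can look like. It is not the code in `algorithms.LinearRegression`; the function name `sgd_ridge_fit`, the per-sample update, and the choice to leave the intercept unregularized are all assumptions made for illustration.

```python
import numpy as np

def sgd_ridge_fit(X, y, epochs=100, alpha=0.01, lam=0.0001, seed=0):
    """Hypothetical sketch: per-sample SGD on squared error plus an L2 penalty."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    W = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        # Visit the samples in a fresh random order each epoch.
        for i in rng.permutation(n_samples):
            error = X[i] @ W + b - y[i]
            # Gradient of 0.5 * error**2 + 0.5 * lam * ||W||^2 for one sample.
            W -= alpha * (error * X[i] + lam * W)
            b -= alpha * error  # assumption: the intercept is not regularized
    return W, b

# Noise-free synthetic data with known coefficients.
rng = np.random.default_rng(42)
X_demo = rng.random((500, 2))
y_demo = 5.55 * X_demo[:, 0] + 11.22 * X_demo[:, 1] + 50
W_demo, b_demo = sgd_ridge_fit(X_demo, y_demo, epochs=200, alpha=0.05)
print(W_demo, b_demo)
```

With a small penalty the recovered weights should sit close to the true coefficients, pulled slightly toward zero by the regularization term.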

In [1]:
import numpy as np

In [14]:
from algorithms import LinearRegression

In [15]:
help(LinearRegression.fit)

Help on function fit in module algorithms.linear_regression:

fit(self, X, y, epochs=100, alpha=0.01, Lambda=0.0001, error_threshold=0.001, validation_size=0.2, output_limit=10)
    Fit the linear regression model to the given data.

    Parameters
    ----------
    epochs: int, default=100
        Number of complete passes through X

    alpha : float, default=0.01
        Constant Learning Rate

    Lambda : float, default=0.0001
        Rate for l2 Regularization

    error_threshold: float, default=0.001
        Threshold for validation cost (vCost) convergence

    validation_size: float, default=0.2
        Fraction of data held out for validation, 0 <= validation_size < 1

    output_limit : int, default=10
        Maximum number of progress lines to print

    Returns
    -------
    W : numpy.ndarray
        The optimized weights.
    b : numpy.longdouble
        The optimized intercept.
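
The `validation_size` and `error_threshold` parameters suggest a hold-out split combined with a stopping test on the validation cost. The sketch below shows one plausible version of that logic. The helper names `split_validation` and `stop_reason` are hypothetical, and the exact rule the library applies is not shown in the help text, so treat this as an illustration rather than a description of `fit`.

```python
import numpy as np

def split_validation(X, y, validation_size=0.2, seed=0):
    """Hypothetical helper: shuffle, then hold out a fraction for validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_val = round(len(X) * validation_size)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]

def stop_reason(prev_vcost, vcost, error_threshold=0.001):
    """Hypothetical rule: stop on a tiny improvement or on a worse validation cost."""
    improvement = prev_vcost - vcost
    if improvement < 0:
        return "degraded"
    if improvement < error_threshold:
        return "converged"
    return None

X_all = np.arange(20, dtype=float).reshape(10, 2)
y_all = np.arange(10, dtype=float)
X_tr, y_tr, X_val, y_val = split_validation(X_all, y_all, validation_size=0.2)
print(X_tr.shape, X_val.shape)

# Made-up validation costs, used only to exercise the stopping rule.
vcosts = [10.0, 4.0, 2.0, 1.9995]
for epoch in range(1, len(vcosts)):
    reason = stop_reason(vcosts[epoch - 1], vcosts[epoch])
    if reason is not None:
        print(f"Epoch {epoch} > stopped: {reason}")
        break
```

Returning a reason instead of a bare boolean makes it easy to tell a genuine plateau apart from a validation cost that started rising.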



## Usage

In [5]:
m = LinearRegression()
X = np.random.rand(1000, 2)
y = 5.55 * X[:, 0] + 11.22 * X[:, 1] + 50
m.fit(X, y, epochs=1000, alpha=0.2, error_threshold=0.00001, output_limit=10)

(0/10) > Epoch: 0 cost: 3368.22703666 vCost: 3300.97566839
(1/10) > Epoch: 100 cost: 0.46136860 vCost: 0.44527131
(2/10) > Epoch: 200 cost: 0.01807726 vCost: 0.01742713
(3/10) > Epoch: 300 cost: 0.00087910 vCost: 0.00084725
       * Epoch: 338 vCost: 0.00031620

Epoch 338 > vCost Converged with threshold 1e-05. Or performance degraded.


(array([ 5.526 , 11.1641]), 50.037216373611436)
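
One way to check the cost after convergence is to evaluate the returned weights on fresh data drawn from the same process. The sketch below assumes the cost of interest is plain mean squared error (whether the library's `cost` uses exactly this definition is an assumption), and `mse_cost` is a hypothetical helper, not part of `algorithms`.

```python
import numpy as np

def mse_cost(X, y, W, b):
    """Mean squared error of the linear model X @ W + b."""
    residual = X @ W + b - y
    return float(np.mean(residual ** 2))

# Fresh samples from the same noise-free target used in the cell above.
rng = np.random.default_rng(0)
X_new = rng.random((1000, 2))
y_new = 5.55 * X_new[:, 0] + 11.22 * X_new[:, 1] + 50

W_fit = np.array([5.526, 11.1641])  # weights printed above
b_fit = 50.037216373611436          # intercept printed above
W_true = np.array([5.55, 11.22])

print("fitted:", mse_cost(X_new, y_new, W_fit, b_fit))
print("true:  ", mse_cost(X_new, y_new, W_true, 50.0))
```

A fitted cost that stays small on unseen samples indicates the early stop did not cut training short in a harmful way.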

In [6]:
from sklearn.datasets import make_regression

In [7]:
X, y = make_regression(n_samples=1000, n_features=20, n_informative=19)
m = LinearRegression()
m.fit(X, y, epochs=100, alpha=0.5, error_threshold=0.01, output_limit=10)

(0/10) > Epoch: 0 cost: 55795.93931730 vCost: 66233.10665206
(1/10) > Epoch: 10 cost: 9.61374721 vCost: 13.63082837
(2/10) > Epoch: 20 cost: 0.01072405 vCost: 0.01418834
       * Epoch: 29 vCost: 0.00443370

Epoch 29 > vCost Converged with threshold 0.01. Or performance degraded.


(array([1.3169e+01, 6.1916e+01, 6.7425e-01, 4.0416e+01, 6.8453e+01,
        5.4340e+01, 9.7061e+01, 5.0039e+01, 3.3161e+01, 9.5036e+01,
        2.7155e-03, 8.1672e+01, 6.0559e+01, 1.5335e+01, 4.8932e+01,
        2.8922e+01, 2.3152e+01, 5.7106e+00, 6.9056e+00, 8.4192e+01]),
 0.0009759274420771402)
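
By default `make_regression` generates noise-free targets with zero bias, so an intercept near zero is expected here, and the single near-zero weight plausibly belongs to the one uninformative feature. Passing `coef=True` to `make_regression` also returns the ground-truth coefficients for a direct comparison. Another sanity check is a closed-form least-squares fit; the sketch below uses NumPy only, on synthetic data generated in the block itself, and `least_squares_fit` is a hypothetical helper name.

```python
import numpy as np

def least_squares_fit(X, y):
    """Closed-form least-squares weights and intercept via np.linalg.lstsq."""
    X_aug = np.column_stack([X, np.ones(len(X))])  # append a bias column
    solution, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
    return solution[:-1], solution[-1]

# Stand-in data: Gaussian features, known weights, no noise, zero bias.
rng = np.random.default_rng(1)
X_ref = rng.normal(size=(1000, 20))
true_W = rng.uniform(0, 100, size=20)
y_ref = X_ref @ true_W

W_ref, b_ref = least_squares_fit(X_ref, y_ref)
print(np.max(np.abs(W_ref - true_W)), b_ref)
```

Comparing the SGD weights against such a reference shows how much accuracy the early stop and the L2 penalty cost.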

In [8]:
from sklearn.datasets import fetch_california_housing

In [9]:
california = fetch_california_housing(as_frame=True)

In [10]:
X, y = california["data"], california["target"]
X, y = X.to_numpy(), y.to_numpy()

In [11]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X = scaler.fit_transform(X)
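
Standardization matters for gradient descent with a single learning rate, because features on very different scales make some weights update far faster than others. As a rough illustration of what `StandardScaler` computes, the sketch below subtracts each column's mean and divides by its population standard deviation. The function name `standardize` and the demo data are illustrative assumptions.

```python
import numpy as np

def standardize(X):
    """Zero-mean, unit-variance scaling per column (population std, ddof=0)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled instead of dividing by zero
    return (X - mean) / std, mean, std

# Two features on very different scales.
rng = np.random.default_rng(0)
X_raw = np.column_stack([rng.normal(5, 2, 1000), rng.normal(3000, 800, 1000)])
X_std, mean, std = standardize(X_raw)
print(X_std.mean(axis=0).round(6), X_std.std(axis=0).round(6))
```

The returned `mean` and `std` should be reused to transform any later data, so that new samples are scaled the same way as the training set.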

In [12]:
m = LinearRegression()
m.fit(X, y, epochs=25, alpha=0.0001, error_threshold=1/1000, validation_size=1/3, output_limit=10)

(0/10) > Epoch: 0 cost: 4.09654386 vCost: 6.61953464
(1/10) > Epoch: 2 cost: 2.52936788 vCost: 4.11320062
(2/10) > Epoch: 4 cost: 2.52735669 vCost: 4.05134147
(3/10) > Epoch: 6 cost: 2.52690437 vCost: 4.04713218
(4/10) > Epoch: 8 cost: 2.52615206 vCost: 4.04620072
(5/10) > Epoch: 10 cost: 2.52537939 vCost: 4.04547694
(6/10) > Epoch: 12 cost: 2.52460583 vCost: 4.04476660
(7/10) > Epoch: 14 cost: 2.52383262 vCost: 4.04405734
(8/10) > Epoch: 16 cost: 2.52305983 vCost: 4.04334834
       * Epoch: 16 vCost: 4.04334834

Epoch 16 > vCost Converged with threshold 0.001. Or performance degraded.


(array([0.0507, 0.8839, 0.4577, 0.2947, 0.6564, 0.8595, 0.0673, 0.1276]),
 1.9743460920631504)