# Improved Implementation for Stochastic Linear Regression
This version adds L2 regularization and efficiency improvements.\
Check the final cost after convergence to confirm the quality of the fit.

In [12]:
import numpy as np

In [13]:
from algorithms import LinearRegression

In [14]:
help(LinearRegression.fit)

Help on function fit in module algorithms.linear_regression:

fit(self, X, y, epochs=100, alpha=0.01, Lambda=0.0001, error_threshold=0.001, validation_size=0.2, output_limit=10)
    Fit the linear regression model to the given data.

    Parameters
    ----------
    epochs : int, default=100
        Number of complete iterations through X

    alpha : float, default=0.01
        Constant Learning Rate

    Lambda : float, default=0.0001
        Rate for l2 Regularization

    error_threshold : float, default=0.001
        Threshold for vCost convergence

    validation_size : float, default=0.2
        Fraction of data held out for validation, 0 <= validation_size < 1

    output_limit : int, default=10
        Number of iterations to show

    Returns
    -------
    W : numpy.ndarray
        The optimized weights.
    b : numpy.longdouble
        The optimized intercept.



## Usage

In [15]:
m = LinearRegression()
# Synthetic data: true weights are 5.55 and 11.22, true intercept is 50
X = np.random.rand(1000, 2)
y = 5.55 * X[:, 0] + 11.22 * X[:, 1] + 50
m.fit(X, y, epochs=1000, alpha=0.2, error_threshold=0.00001, output_limit=10)

(0/10) > Epoch: 0 cost: 3362.63558098 vCost: 3346.84083299
(1/10) > Epoch: 100 cost: 0.45472863 vCost: 0.53433772
(2/10) > Epoch: 200 cost: 0.01875996 vCost: 0.02214043
(3/10) > Epoch: 300 cost: 0.00095989 vCost: 0.00113396
       * Epoch: 349 vCost: 0.00032910

Epoch 349 > vCost Converged with threshold 1e-05. Or performance degraded.


(array([ 5.5329, 11.1661]), np.float64(50.03884796842025))
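
Since the true coefficients are known here, the returned weights can be checked directly on fresh data. The sketch below copies `W` and `b` from the output above and evaluates them on newly drawn points from the same generating function. The seed and the variable names `X_new` and `y_new` are arbitrary choices for illustration.

```python
import numpy as np

# Weights and intercept copied from the output above
W = np.array([5.5329, 11.1661])
b = 50.03884796842025

# Fresh samples from the same noise-free generating function
rng = np.random.default_rng(0)
X_new = rng.random((1000, 2))
y_new = 5.55 * X_new[:, 0] + 11.22 * X_new[:, 1] + 50

errors = X_new @ W + b - y_new
mse = np.mean(errors ** 2)
max_abs_err = np.max(np.abs(errors))
print(f"MSE on fresh data: {mse:.6f}, max abs error: {max_abs_err:.4f}")
```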

In [16]:
from sklearn.datasets import make_regression

In [17]:
# 19 of the 20 features are informative, so one fitted weight should be near zero
X, y = make_regression(n_samples=1000, n_features=20, n_informative=19)
m = LinearRegression()
m.fit(X, y, epochs=100, alpha=0.5, error_threshold=0.01, output_limit=10)

(0/10) > Epoch: 0 cost: 63303.45145403 vCost: 67262.72783378
(1/10) > Epoch: 10 cost: 6.26749439 vCost: 8.42540451
(2/10) > Epoch: 20 cost: 0.00822800 vCost: 0.00970340
       * Epoch: 28 vCost: 0.00480771

Epoch 28 > vCost Converged with threshold 0.01. Or performance degraded.


(array([ 4.5537e+01,  2.6763e+01,  1.8026e+01,  5.4543e+01,  3.0484e+01,
         6.5445e+01,  8.6173e+01,  7.6459e+01,  5.1189e+01, -7.1867e-04,
         5.4358e+01,  7.7115e+01,  4.8077e+01,  4.6832e+00,  9.7940e+01,
         7.2623e+01,  5.5266e+01,  5.4858e+01,  2.4516e+01,  5.9024e+01]),
 np.float64(0.001512764633114047))
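
The message "vCost Converged with threshold ... Or performance degraded" suggests that training stops once the validation cost no longer improves by at least `error_threshold`, or once it gets worse. The exact rule used by `fit` is not shown in this notebook. The sketch below is one plausible reading, and `first_stop_epoch` is a hypothetical helper, not part of the library.

```python
def first_stop_epoch(vcosts, error_threshold=0.001):
    """Return the first epoch whose improvement in validation cost falls
    below error_threshold (including negative improvement), or None."""
    for epoch in range(1, len(vcosts)):
        improvement = vcosts[epoch - 1] - vcosts[epoch]
        if improvement < error_threshold:
            return epoch
    return None

# Illustrative values only, not taken from a real run
plateau = [10.0, 5.0, 4.0, 3.9995]   # the last step improves by only 0.0005
degraded = [3.0, 2.0, 2.5]           # validation cost rises at the last step
print(first_stop_epoch(plateau), first_stop_epoch(degraded))
```

A single comparison covers both cases because a rise in validation cost is a negative improvement, which is always below a positive threshold.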

In [18]:
from sklearn.datasets import fetch_california_housing

In [19]:
california = fetch_california_housing(as_frame=True)

In [20]:
X, y = california["data"], california["target"]
X, y = X.to_numpy(), y.to_numpy()

In [21]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X = scaler.fit_transform(X)

In [22]:
m = LinearRegression()
m.fit(X, y, epochs=25, alpha=0.0001, error_threshold=1/1000, validation_size=1/3, output_limit=10)

(0/10) > Epoch: 0 cost: 3.47690903 vCost: 4.69563501
(1/10) > Epoch: 2 cost: 1.94832178 vCost: 2.30065256
(2/10) > Epoch: 4 cost: 1.93392478 vCost: 2.23136357
(3/10) > Epoch: 6 cost: 1.93258689 vCost: 2.22690204
(4/10) > Epoch: 8 cost: 1.93174274 vCost: 2.22623631
(5/10) > Epoch: 10 cost: 1.93092931 vCost: 2.22581156
(6/10) > Epoch: 12 cost: 1.93011845 vCost: 2.22540240
(7/10) > Epoch: 14 cost: 1.92930836 vCost: 2.22499444
(8/10) > Epoch: 16 cost: 1.92849894 vCost: 2.22458678
       * Epoch: 16 vCost: 2.22458678

Epoch 16 > vCost Converged with threshold 0.001. Or performance degraded.


(array([0.2235, 0.06  , 0.1813, 0.5564, 0.0276, 0.0043, 0.6579, 0.8448]),
 np.float64(1.8831042818732568))
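
When the true coefficients are unknown, as with real data, one way to check the cost after convergence is to compare it with a closed-form least-squares fit on the same data. The sketch below uses NumPy's `np.linalg.lstsq`. The helper `ols_baseline` is hypothetical, and it reports plain mean squared error, which may differ from the cost printed by `fit` if that cost is halved or includes the regularization term.

```python
import numpy as np

def ols_baseline(X, y):
    """Closed-form least-squares fit; returns (W, b, mse) as a reference point."""
    A = np.column_stack([X, np.ones(len(X))])   # append a column of ones for the intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    W, b = coef[:-1], coef[-1]
    mse = np.mean((X @ W + b - y) ** 2)
    return W, b, mse

# Self-contained demonstration on synthetic data
rng = np.random.default_rng(0)
X_demo = rng.random((200, 3))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5]) + 4.0
W_ref, b_ref, mse_ref = ols_baseline(X_demo, y_demo)
print(W_ref, b_ref, mse_ref)
```

Calling `ols_baseline(X, y)` on the scaled California data from the cells above gives a reference cost. If the SGD cost is far above it, the run probably stopped before reaching a good fit, and a larger `alpha` or a smaller `error_threshold` may help.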