#### Linear Regression

Linear regression is one of the most widely used statistical models for studying the relationship between a dependent variable (Y) and a given set of independent variables (X). The relationship is established by fitting the line (or, with several features, the hyperplane) that minimises the sum of squared residuals.
sklearn.linear_model.LinearRegression is the class used to implement linear regression.

Below is an implementation example:

In [8]:
import numpy as np
from sklearn.linear_model import LinearRegression

# Create the independent variable
X = np.array([[1, 1], [2, 2], [3, 3], [4, 4]])

# Set the values of the dependent variable
y = np.dot(X, np.array([1, 2])) + 3

# Create and fit the Linear Regression model
regr = LinearRegression(fit_intercept=True, copy_X=True, n_jobs=2)
regr.fit(X, y)

# Use the predict() method to make predictions using the Linear Regression model
prediction = regr.predict(np.array([[3, 5]]))
print(prediction)


[15.]


In [9]:
# To get the coefficient of determination (R^2) we use the score() method
regr.score(X,y)



1.0

In [10]:
# The estimated coefficients are available through the attribute named coef_:
regr.coef_

array([1.5, 1.5])

In [11]:
regr.intercept_

3.0000000000000018
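
The two columns of X in this example are identical, so the data only determine the sum of the two coefficients; that is why the fit reports [1.5, 1.5] rather than the [1, 2] used to generate y. The following is a minimal sketch that checks this and recomputes the coefficient of determination by hand, assuming the usual definition R^2 = 1 - SS_res / SS_tot. The names y_hat, ss_res, ss_tot and r2_manual are illustrative and not part of the original example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same data as above: both feature columns are identical
X = np.array([[1, 1], [2, 2], [3, 3], [4, 4]])
y = np.dot(X, np.array([1, 2])) + 3

regr = LinearRegression().fit(X, y)
y_hat = regr.predict(X)

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

# Only the sum of the two coefficients is identifiable here (1 + 2 = 3)
coef_sum = regr.coef_.sum()
print(r2_manual, regr.score(X, y), coef_sum)
```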

#### Logistic Regression

Logistic regression, despite its name, is a classification algorithm rather than a regression algorithm. Based on a given set of independent variables, it is used to estimate a discrete value (0 or 1, yes/no, true/false). It is also called logit or MaxEnt classifier.
It models the relationship between the categorical dependent variable and one or more independent variables by estimating the probability that an event occurs using the logistic function.
sklearn.linear_model.LogisticRegression is the class used to implement logistic regression.

In [14]:
# The following Python script provides a simple example of logistic regression on the iris dataset of scikit-learn:

from sklearn import linear_model
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
LRG = linear_model.LogisticRegression(random_state=0,solver='liblinear').fit(X, y)
LRG.score(X, y)

0.96

The output shows that the above logistic regression model reaches an accuracy of 96 percent on the same data it was trained on.
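
Accuracy measured on the training data tends to be optimistic. As a hedged sketch, the following holds out part of the iris data for evaluation and also shows the per-class probabilities that the logistic function produces. The 70/30 split, the default solver with max_iter=1000, and the variable names are assumptions made for illustration, not part of the original example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples; stratify keeps the class balance in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Accuracy on samples the model has not seen during fitting
test_accuracy = clf.score(X_test, y_test)

# One row per test sample, one column per class; each row sums to 1
proba = clf.predict_proba(X_test)
print(test_accuracy, proba[:3])
```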

### Ridge Regression
Ridge regression, or Tikhonov regularization, is a regularization technique that applies an L2 penalty. It modifies the loss function by adding a penalty (shrinkage quantity) equal to the sum of the squared coefficients, scaled by α:

$$\sum_{j=1}^{m}\Big(Y_j - W_0 - \sum_{i=1}^{n} W_i X_{ji}\Big)^2 + \alpha \sum_{i=1}^{n} W_i^2 = \text{loss\_function} + \alpha \sum_{i=1}^{n} W_i^2$$

sklearn.linear_model.Ridge is the class used to solve a regression model where the loss function is the linear least squares function and the regularization is L2.

In [15]:
# The following Python script provides a simple example of Ridge regression.
# We are using 15 samples and 10 features. The value of alpha is 0.5 in our case.
# The fit() method fits the model and the score() method computes its R^2 score.

from sklearn.linear_model import Ridge
import numpy as np
n_samples, n_features = 15, 10
rng = np.random.RandomState(0)
y = rng.randn(n_samples)
X = rng.randn(n_samples, n_features)
rdg = Ridge(alpha=0.5)
rdg.fit(X, y)
rdg.score(X,y)

0.7629498741931634

In [18]:
rdg.coef_

# Similarly, we can get the value of the intercept from the intercept_ attribute:
rdg.intercept_

0.5274865723969377
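
For a penalty of this form the minimiser can be written in closed form. The sketch below compares that closed form with the Ridge estimator on the same random data, assuming the intercept is left unpenalised (which is how Ridge behaves with its default fit_intercept=True), so the data are centred first. The names Xc, yc, w_closed and b_closed are illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge

n_samples, n_features = 15, 10
rng = np.random.RandomState(0)
y = rng.randn(n_samples)
X = rng.randn(n_samples, n_features)
alpha = 0.5

rdg = Ridge(alpha=alpha).fit(X, y)

# Centre the data so that the intercept drops out of the penalised problem
X_mean, y_mean = X.mean(axis=0), y.mean()
Xc, yc = X - X_mean, y - y_mean

# Closed form: w = (Xc^T Xc + alpha * I)^(-1) Xc^T yc
w_closed = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)

# Recover the intercept from the means
b_closed = y_mean - X_mean @ w_closed

print(np.allclose(w_closed, rdg.coef_), np.isclose(b_closed, rdg.intercept_))
```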

### Bayesian Ridge Regression

Bayesian regression provides a natural mechanism for coping with insufficient or poorly distributed data by formulating linear regression using probability distributions rather than point estimates. The output or response y is assumed to be drawn from a probability distribution rather than estimated as a single value.
Mathematically, to obtain a fully probabilistic model, the response y is assumed to be Gaussian distributed around $Xw$ as follows: $p(y \mid X, w, \alpha) = \mathcal{N}(y \mid Xw, \alpha)$
One of the most useful types of Bayesian regression is Bayesian Ridge regression, which estimates a probabilistic model of the regression problem. Here the prior for the coefficient $w$ is given by a spherical Gaussian as follows: $p(w \mid \lambda) = \mathcal{N}(w \mid 0, \lambda^{-1} I_p)$
The resulting model is called Bayesian Ridge Regression, and in scikit-learn the sklearn.linear_model.BayesianRidge class is used for it.

In [20]:
from sklearn import linear_model
X = [[0, 0], [1, 1], [2, 2], [3, 3]]
Y = [0, 1, 2, 3]
BayReg = linear_model.BayesianRidge()
BayReg.fit(X, Y)

In [21]:
BayReg.predict([[1,1]])

array([1.00000007])

In [24]:
BayReg.coef_

array([0.49999993, 0.49999993])

In [23]:
BayReg.intercept_

1.9999946720972162e-07
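
Because the model is probabilistic, a prediction can come with an uncertainty estimate. The following minimal sketch uses the return_std option of predict() on the same toy data; the variable names y_mean and y_std are illustrative, and the exact numbers may vary slightly between scikit-learn versions.

```python
from sklearn import linear_model

X = [[0, 0], [1, 1], [2, 2], [3, 3]]
Y = [0, 1, 2, 3]

BayReg = linear_model.BayesianRidge()
BayReg.fit(X, Y)

# return_std=True also returns the standard deviation of the predictive distribution
y_mean, y_std = BayReg.predict([[1, 1]], return_std=True)

# alpha_ and lambda_ are the estimated precisions of the noise and of the weights
print(y_mean, y_std, BayReg.alpha_, BayReg.lambda_)
```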

### LASSO (Least Absolute Shrinkage and Selection Operator)

LASSO is a regularisation technique that applies an L1 penalty. It modifies the loss function by adding a penalty (shrinkage quantity) equal to the sum of the absolute values of the coefficients, scaled by α:

$$\sum_{j=1}^{m}\Big(Y_j - W_0 - \sum_{i=1}^{n} W_i X_{ji}\Big)^2 + \alpha \sum_{i=1}^{n} |W_i| = \text{loss\_function} + \alpha \sum_{i=1}^{n} |W_i|$$

sklearn.linear_model.Lasso is a linear model, with an added regularisation term, used to estimate sparse coefficients.

In [25]:
from sklearn import linear_model
Lreg = linear_model.Lasso(alpha=0.5)
Lreg.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])

In [26]:
Lreg.coef_

array([0.25, 0.  ])

In [27]:
Lreg.intercept_

0.75

In [28]:
Lreg.n_iter_

2
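
To illustrate how the L1 penalty produces sparse coefficients, the sketch below fits Lasso with several alpha values and counts the non-zero coefficients. The synthetic data (8 features, of which only the first two influence y) and the alpha values are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Hypothetical data: only features 0 and 1 carry signal
rng = np.random.RandomState(0)
X = rng.randn(50, 8)
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.randn(50)

nonzero = {}
for alpha in (0.01, 0.5, 100.0):
    model = Lasso(alpha=alpha).fit(X, y)
    # Count the coefficients that the penalty has not driven to exactly zero
    nonzero[alpha] = int(np.count_nonzero(model.coef_))

print(nonzero)
```

With a sufficiently large alpha every coefficient is driven to zero and the model predicts only the mean of y.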

#### Multi-task LASSO

It fits multiple regression problems (also called tasks) jointly, forcing the selected features to be the same for all of them. Sklearn provides a linear model named MultiTaskLasso, trained with a mixed L1/L2-norm for regularisation, which estimates sparse coefficients for multiple regression problems jointly. Here the response y is a 2D array of shape (n_samples, n_tasks).
The parameters and attributes of MultiTaskLasso are similar to those of Lasso. The main difference is in the alpha parameter: in Lasso, alpha is a constant that multiplies the L1 norm, whereas in MultiTaskLasso it is a constant that multiplies the mixed L1/L2 term.
Also, unlike Lasso, MultiTaskLasso does not have a precompute parameter.

The following Python script uses the MultiTaskLasso linear model, which is fitted with coordinate descent:

In [29]:
from sklearn import linear_model
MTLReg = linear_model.MultiTaskLasso(alpha=0.5)
MTLReg.fit([[0,0], [1, 1], [2, 2]], [[0, 0],[1,1],[2,2]])
MTLReg.predict([[0,1]])

array([[0.53033009, 0.53033009]])

In [30]:
MTLReg.intercept_

array([0.53033009, 0.53033009])

We can get the number of iterations that were run to reach the specified tolerance with the following Python script:

In [31]:
MTLReg.n_iter_

2
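
The joint feature selection can be checked directly on coef_, which has shape (n_tasks, n_features): each feature should be either selected for every task or dropped for every task. The following is a minimal sketch on hypothetical synthetic data in which only the first two of six features matter; W_true, support and shared are illustrative names.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

# Hypothetical data: 2 tasks, 6 features, only features 0 and 1 carry signal
rng = np.random.RandomState(0)
X = rng.randn(50, 6)
W_true = np.zeros((6, 2))
W_true[0] = [2.0, -1.0]
W_true[1] = [-1.5, 0.5]
Y = X @ W_true + 0.1 * rng.randn(50, 2)

model = MultiTaskLasso(alpha=0.1).fit(X, Y)

# support[t, f] is True when feature f has a non-zero weight for task t
support = model.coef_ != 0

# Each column of support (one feature across all tasks) is all True or all False
shared = all(col.all() or not col.any() for col in support.T)
print(support, shared)
```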

### Elastic-Net

The Elastic-Net is a regularised regression method that linearly combines the L1 and L2 penalties of the Lasso and Ridge regression methods. It is useful when there are multiple correlated features. The difference between Lasso and Elastic-Net is that, given two correlated features, Lasso is likely to pick one of them at random while Elastic-Net is likely to pick both.
Sklearn provides a linear model named ElasticNet which is trained with both L1 and L2-norm regularisation of the coefficients. The advantage of this combination is that it allows learning a sparse model in which few of the weights are non-zero, as with Lasso, while still maintaining the regularisation properties of Ridge.
The objective function to minimise is:

$$\min_{w} \frac{1}{2 n_{\text{samples}}} \|Xw - y\|_2^2 + \alpha\rho\|w\|_1 + \frac{\alpha(1-\rho)}{2}\|w\|_2^2$$
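
In scikit-learn the mixing weight ρ of this objective is exposed as the l1_ratio parameter of ElasticNet, with a default of 0.5. As a small sketch of what it controls, setting l1_ratio=1.0 removes the L2 term, so the fit should coincide with Lasso for the same alpha. The toy data and the names enet_l1, enet_mixed and lasso are illustrative.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

X = [[0, 0], [1, 1], [2, 2]]
y = [0, 1, 2]
alpha = 0.5

# l1_ratio=1.0 keeps only the L1 penalty, which is the Lasso objective
enet_l1 = ElasticNet(alpha=alpha, l1_ratio=1.0).fit(X, y)
lasso = Lasso(alpha=alpha).fit(X, y)

# l1_ratio=0.5 (the default) mixes the L1 and L2 penalties equally
enet_mixed = ElasticNet(alpha=alpha, l1_ratio=0.5).fit(X, y)

print(enet_l1.coef_, lasso.coef_, enet_mixed.coef_)
```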

The following Python script uses the ElasticNet linear model, which is fitted with coordinate descent:

In [32]:
from sklearn import linear_model
ENreg = linear_model.ElasticNet(alpha=0.5,random_state=0)
ENreg.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])

In [33]:
ENreg.predict([[0,1]])


array([0.73681643])

In [34]:
ENreg.coef_

array([0.26318357, 0.26313923])

In [35]:
ENreg.intercept_

0.47367720941913904

In [36]:
ENreg.n_iter_

15

In [37]:
# alpha controls the overall strength of the penalty: a larger value shrinks the coefficients further.
# Let us see the same example with alpha = 1.
from sklearn import linear_model
ENreg = linear_model.ElasticNet(alpha=1,random_state=0)
ENreg.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])


In [38]:
ENreg.predict([[1,0]])

array([0.90909216])

In [39]:
ENreg.coef_

array([0.09091128, 0.09090784])

In [40]:
#Calculating intercept
ENreg.intercept_

0.818180878658411

In [41]:
#Calculating number of iterations
ENreg.n_iter_

10
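
Whether a given alpha is preferable depends on how the model is evaluated. As a minimal sketch, the following compares the two alpha values used above on the training data itself, recording the R^2 score and the total absolute size of the coefficients. On such a tiny noise-free dataset the stronger penalty shrinks the coefficients more and fits the training points less closely; on noisy real data the choice is usually made on held-out data instead. The dictionaries scores and coef_sizes are illustrative names.

```python
from sklearn.linear_model import ElasticNet

X = [[0, 0], [1, 1], [2, 2]]
y = [0, 1, 2]

scores = {}
coef_sizes = {}
for alpha in (0.5, 1.0):
    model = ElasticNet(alpha=alpha, random_state=0).fit(X, y)
    # R^2 on the training data
    scores[alpha] = model.score(X, y)
    # Sum of absolute coefficient values, a measure of how much shrinkage occurred
    coef_sizes[alpha] = float(abs(model.coef_).sum())

print(scores, coef_sizes)
```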

#### MultiTaskElasticNet

It is an Elastic-Net model that fits multiple regression problems (also called tasks) jointly, forcing the selected features to be the same for all of them. Sklearn provides a linear model named MultiTaskElasticNet, trained with a mixed L1/L2-norm and an L2-norm for regularisation, which estimates sparse coefficients for multiple regression problems jointly. Here the response y is a 2D array of shape (n_samples, n_tasks).

Implementation example:
To show the difference, we use the same data as in the Multi-task Lasso example:

In [42]:
from sklearn import linear_model
MTENReg = linear_model.MultiTaskElasticNet(alpha=0.5)
MTENReg.fit([[0,0], [1, 1], [2, 2]], [[0, 0],[1,1],[2,2]])

In [43]:
MTENReg.predict([[1,0]])

array([[0.69061776, 0.69061776]])

In [44]:
MTENReg.coef_

array([[0.30943437, 0.30938224],
       [0.30943437, 0.30938224]])

In [45]:
MTENReg.intercept_

array([0.38118338, 0.38118338])

In [46]:
MTENReg.n_iter_

15
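
MultiTaskElasticNet also accepts an l1_ratio parameter that balances the mixed L1/L2 term against the plain L2 term. As a hedged sketch of the relationship between the two multi-task models, setting l1_ratio=1.0 should reproduce the MultiTaskLasso fit on the same data. The variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import MultiTaskElasticNet, MultiTaskLasso

X = [[0, 0], [1, 1], [2, 2]]
Y = [[0, 0], [1, 1], [2, 2]]

# Default l1_ratio=0.5: both penalty terms are active
mt_enet = MultiTaskElasticNet(alpha=0.5).fit(X, Y)

# l1_ratio=1.0 keeps only the mixed L1/L2 term, as in MultiTaskLasso
mt_enet_l1 = MultiTaskElasticNet(alpha=0.5, l1_ratio=1.0).fit(X, Y)
mt_lasso = MultiTaskLasso(alpha=0.5).fit(X, Y)

print(mt_enet.coef_, mt_enet_l1.coef_, mt_lasso.coef_)
```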

### Polynomial Regression

Linear models trained on non-linear functions of the data generally maintain the fast performance of linear methods while being able to fit a much wider range of data. That is why such models are commonly used in machine learning.
For example, a simple linear regression can be extended by constructing polynomial features from the input features.
Mathematically, a standard linear regression model for 2-D data looks like this:

$$Y = w_0 + w_1 x_1 + w_2 x_2$$

Now, we can combine the features in second-order polynomials, and the model becomes:

$$Y = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_1 x_2 + w_4 x_1^2 + w_5 x_2^2$$

The above is still a linear model, because it is linear in the weights w. The resulting polynomial regression therefore belongs to the same class of linear models and can be solved in the same way.
To do so, scikit-learn provides a transformer named PolynomialFeatures, which transforms an input data matrix into a new data matrix of the given degree.

In [47]:
from sklearn.preprocessing import PolynomialFeatures
import numpy as np
# Input matrix with 4 samples and 2 features
X = np.arange(8).reshape(4, 2)
poly = PolynomialFeatures(degree=2)
poly.fit_transform(X)

array([[ 1.,  0.,  1.,  0.,  0.,  1.],
       [ 1.,  2.,  3.,  4.,  6.,  9.],
       [ 1.,  4.,  5., 16., 20., 25.],
       [ 1.,  6.,  7., 36., 42., 49.]])
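
The column order of this output can be confirmed by building the same matrix by hand. The sketch below assumes the degree-2 ordering for two features is 1, x0, x1, x0^2, x0*x1, x1^2, which matches the output shown above; the names x0, x1, manual and expanded are illustrative.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(8).reshape(4, 2)
x0, x1 = X[:, 0], X[:, 1]

# Build the degree-2 feature matrix column by column
manual = np.column_stack([np.ones(len(X)), x0, x1, x0 ** 2, x0 * x1, x1 ** 2])

expanded = PolynomialFeatures(degree=2).fit_transform(X)

# True when the hand-built columns match the transformer output
same = np.allclose(manual, expanded)
print(same)
```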

#### Streamlining using Pipeline Tools

In [50]:
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
import numpy as np
# Next, create a Pipeline that expands the input into polynomial features and then fits a linear model
Stream_model = Pipeline([('poly', PolynomialFeatures(degree=3)), ('linear', LinearRegression(fit_intercept=False))])
# Build sample data from a known cubic polynomial and fit the model
x = np.arange(5)
y = 3 - 2 * x + x ** 2 - x ** 3
Stream_model.fit(x[:, np.newaxis], y)
# Inspect the fitted coefficients, which recover those of the polynomial
Stream_model.named_steps['linear'].coef_

array([ 3., -2.,  1., -1.])
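
Once fitted, the pipeline applies the polynomial expansion automatically at prediction time, so raw inputs can be passed straight to predict(). The following minimal sketch evaluates the fitted model at two new points and compares the result with the generating polynomial; x_new, predicted and expected are illustrative names.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

x = np.arange(5)
y = 3 - 2 * x + x ** 2 - x ** 3

model = Pipeline([
    ('poly', PolynomialFeatures(degree=3)),
    ('linear', LinearRegression(fit_intercept=False)),
]).fit(x[:, np.newaxis], y)

# New inputs are given as a column, just like the training data
x_new = np.array([[5], [6]])
predicted = model.predict(x_new)

# Values of the generating polynomial at the same points
v = x_new.ravel()
expected = 3 - 2 * v + v ** 2 - v ** 3
print(predicted, expected)
```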
