**Non-Linear Regression**

*ŷ = w0 + w1·x1 + w2·x2 + ... + wn·xn => linear regression* and
*ŷ = w0 + w1·x1 + w2·x2 + w3·x1·x2 => non-linear*
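
The second model is non-linear in the inputs, but it is still linear in the weights, so ordinary least squares can fit it once the product x1·x2 is added as an extra column. A minimal sketch on synthetic data follows; the data, the "true" weights, and the name `demo_model` are made up for illustration and are not part of the advertising example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x1 = rng.uniform(0, 10, 200)
x2 = rng.uniform(0, 10, 200)

# Hypothetical ground truth with an interaction term and no noise
y = 1.0 + 2.0 * x1 + 3.0 * x2 + 0.5 * x1 * x2

# Add the product x1*x2 as a third feature column
features = np.column_stack([x1, x2, x1 * x2])
demo_model = LinearRegression().fit(features, y)

# Because the data is noise-free, the fitted weights should match the ones used above
print(demo_model.intercept_, demo_model.coef_)
```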

In [2]:
import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/bipulshahi/Dataset/main/Advertising.csv',
                 index_col = 0)

df.head()

Unnamed: 0,TV,radio,newspaper,sales
1,230.1,37.8,69.2,22.1
2,44.5,39.3,45.1,10.4
3,17.2,45.9,69.3,9.3
4,151.5,41.3,58.5,18.5
5,180.8,10.8,58.4,12.9


In [3]:
X = df[['TV','radio']].copy()   # explicit copy avoids pandas' SettingWithCopyWarning
X['TVR'] = X.TV * X.radio       # interaction term: TV * radio

Y = df['sales']


In [4]:
X.head()

Unnamed: 0,TV,radio,TVR
1,230.1,37.8,8697.78
2,44.5,39.3,1748.85
3,17.2,45.9,789.48
4,151.5,41.3,6256.95
5,180.8,10.8,1952.64


In [5]:
#Split into train & test
from sklearn.model_selection import train_test_split
xtrain,xtest,ytrain,ytest = train_test_split(X,Y,train_size=0.75)


#Import the algorithm from scikit-learn and train the model
from sklearn.linear_model import LinearRegression
model1 = LinearRegression()

model1.fit(xtrain,ytrain)

print("Trained coefficients=" , model1.coef_ , "intercept=" , model1.intercept_)

#Evaluate model performance
ytrainPred = model1.predict(xtrain)
ytestPred = model1.predict(xtest)

print("Train mean absolute error" , abs(ytrain - ytrainPred).mean())
print("Test mean absolute error" , abs(ytest - ytestPred).mean())

Trained coefficients= [0.02044436 0.03223112 0.00106685] intercept= 6.495246381127702
Train mean absolute error 0.7292025552153282
Test mean absolute error 0.5727590323913396
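
With the interaction term in the model, the effect of one extra unit of TV spend is no longer constant: it equals w_TV + w_TVR · radio. The sketch below plugs in the coefficients printed above; they will differ from run to run because the split is random, and `tv_effect` is a helper name introduced here for illustration only.

```python
# Coefficients copied from the output above, in the order [TV, radio, TVR]
w_tv, w_radio, w_tvr = 0.02044436, 0.03223112, 0.00106685

def tv_effect(radio):
    """Change in predicted sales per extra unit of TV spend at a given radio spend."""
    return w_tv + w_tvr * radio

# The TV effect grows as radio spend grows
for radio in (0, 10, 40):
    print(radio, round(tv_effect(radio), 4))
```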


**Non-Linear Regression of degree 2**

In [6]:
import numpy as np

n1 = np.random.randint(1,9,(5,1))
n1

array([[3],
       [2],
       [4],
       [5],
       [6]])

In [7]:
from sklearn.preprocessing import PolynomialFeatures
pol1 = PolynomialFeatures(degree=2,include_bias=False)

pol1.fit_transform(n1)

array([[ 3.,  9.],
       [ 2.,  4.],
       [ 4., 16.],
       [ 5., 25.],
       [ 6., 36.]])

In [8]:
pol2 = PolynomialFeatures(degree=3,include_bias=False)

pol2.fit_transform(n1)

array([[  3.,   9.,  27.],
       [  2.,   4.,   8.],
       [  4.,  16.,  64.],
       [  5.,  25., 125.],
       [  6.,  36., 216.]])

In [9]:
n2 = np.random.randint(1,9,(5,2))
n2

array([[4, 1],
       [1, 8],
       [1, 8],
       [4, 8],
       [6, 1]])

In [10]:
pol3 = PolynomialFeatures(degree=2,include_bias=False)

pol3.fit_transform(n2)

array([[ 4.,  1., 16.,  4.,  1.],
       [ 1.,  8.,  1.,  8., 64.],
       [ 1.,  8.,  1.,  8., 64.],
       [ 4.,  8., 16., 32., 64.],
       [ 6.,  1., 36.,  6.,  1.]])

In [11]:
pol4 = PolynomialFeatures(degree=3,include_bias=False)

pol4.fit_transform(n2)

array([[  4.,   1.,  16.,   4.,   1.,  64.,  16.,   4.,   1.],
       [  1.,   8.,   1.,   8.,  64.,   1.,   8.,  64., 512.],
       [  1.,   8.,   1.,   8.,  64.,   1.,   8.,  64., 512.],
       [  4.,   8.,  16.,  32.,  64.,  64., 128., 256., 512.],
       [  6.,   1.,  36.,   6.,   1., 216.,  36.,   6.,   1.]])
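
The number of columns grows quickly with the degree. For n inputs and degree d, with the bias column excluded, the count is C(n + d, d) - 1. A small check against the shapes above (`n_poly_features` is a hypothetical helper, not a scikit-learn function):

```python
from math import comb

def n_poly_features(n_inputs, degree):
    """Number of polynomial terms of total degree 1..degree (bias column excluded)."""
    return comb(n_inputs + degree, degree) - 1

print(n_poly_features(1, 2))  # 2 columns, as for pol1
print(n_poly_features(2, 2))  # 5 columns, as for pol3
print(n_poly_features(2, 3))  # 9 columns, as for pol4
```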

**Non-Linear Regression on advertising data**

In [12]:
df.head()

Unnamed: 0,TV,radio,newspaper,sales
1,230.1,37.8,69.2,22.1
2,44.5,39.3,45.1,10.4
3,17.2,45.9,69.3,9.3
4,151.5,41.3,58.5,18.5
5,180.8,10.8,58.4,12.9


In [13]:
X = df[['TV','radio']]
Y = df['sales']

In [14]:
from sklearn.model_selection import train_test_split
xtrain,xtest,ytrain,ytest = train_test_split(X,Y,train_size=0.75)

In [15]:
from sklearn.preprocessing import PolynomialFeatures
pol = PolynomialFeatures(degree = 2, include_bias=False)

pol.fit(xtrain)

xtrainPol = pol.transform(xtrain)
xtestPol = pol.transform(xtest)

In [16]:
print(xtrain.shape)
print(xtrainPol.shape)

(150, 2)
(150, 5)
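
The expansion and the regression can also be chained with `make_pipeline`, which keeps the polynomial step fitted on the training data only. The sketch below uses made-up data as a stand-in for the TV and radio columns; `X_demo`, `y_demo`, and `poly_model` are names introduced here, and the generating weights are arbitrary.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(42)
X_demo = rng.uniform(0, 100, size=(200, 2))   # stand-in for TV and radio
y_demo = (3 + 0.05 * X_demo[:, 0] + 0.02 * X_demo[:, 1]
          + 0.001 * X_demo[:, 0] * X_demo[:, 1]
          + rng.normal(0, 0.5, 200))          # arbitrary weights plus noise

xtr, xte, ytr, yte = train_test_split(X_demo, y_demo, train_size=0.75, random_state=0)

# Degree-2 expansion followed by ordinary least squares
poly_model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)
poly_model.fit(xtr, ytr)

print(poly_model.predict(xte)[:3])
```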


In [17]:
#Train the model using xtrainPol & ytrain

from sklearn.linear_model import LinearRegression
model2 = LinearRegression()
model2.fit(xtrainPol,ytrain)

print(model2.coef_)
#Predict on the train and test sets

ytrainPred = model2.predict(xtrainPol)
ytestPred = model2.predict(xtestPol)

[ 0.05254062  0.02123818 -0.00011457  0.00107359  0.00028045]


In [18]:
#Evaluate performance
print("Train mean absolute error" , abs(ytrain - ytrainPred).mean())
print("Test mean absolute error" , abs(ytest - ytestPred).mean())

Train mean absolute error 0.4080646244356727
Test mean absolute error 0.4760040379627665
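
The two sets of errors in this notebook come from two different random splits, so they are only roughly comparable. A like-for-like comparison would reuse one split for both models. The sketch below assumes a fixed `random_state` and made-up data in place of the advertising CSV; all variable names and generating weights are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
X_syn = rng.uniform(0, 50, size=(200, 2))
y_syn = (5 + 0.1 * X_syn[:, 0]
         + 0.002 * X_syn[:, 0] ** 2
         + 0.003 * X_syn[:, 0] * X_syn[:, 1]
         + rng.normal(0, 0.3, 200))           # synthetic target with curvature

# One split, shared by both models
xtr, xte, ytr, yte = train_test_split(X_syn, y_syn, train_size=0.75, random_state=0)

# Plain linear model on the raw inputs
plain = LinearRegression().fit(xtr, ytr)
mae_plain = mean_absolute_error(yte, plain.predict(xte))

# Linear model on degree-2 polynomial features
pol_cmp = PolynomialFeatures(degree=2, include_bias=False).fit(xtr)
quad = LinearRegression().fit(pol_cmp.transform(xtr), ytr)
mae_quad = mean_absolute_error(yte, quad.predict(pol_cmp.transform(xte)))

print("Test MAE, linear features:  ", mae_plain)
print("Test MAE, degree-2 features:", mae_quad)
```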
