## Computing the R2 Score of a Linear Regression Model
As mentioned in the preceding sections, the R2 score is an important measure of how well a regression model fits the data. In this exercise, we will create a linear regression model and then calculate its R2 score on a validation set.
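
For reference, the R2 score (also called the coefficient of determination) is usually defined as shown below. This is the standard textbook definition rather than something taken from the exercise itself; the symbols are assumptions introduced here: $y_i$ is an actual value, $\hat{y}_i$ is the corresponding prediction, $\bar{y}$ is the mean of the actual values, and $n$ is the number of samples.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

A score of 1 means the predictions match the actual values exactly, while a score of 0 means the model does no better than always predicting the mean.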

In [1]:
# import required libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

In [2]:
# column names to assign to the dataset
_headers = ['CIC0', 'SM1', 'GATS1i', 'NdsCH', 'Ndssc', 'MLOGP', 'response']
# read the semicolon-separated file straight from the repository
df = pd.read_csv('https://raw.githubusercontent.com/'
                 'PacktWorkshops/The-Data-Science-Workshop'
                 '/master/Chapter06/Dataset/qsar_fish_toxicity.csv',
                 names=_headers, sep=';')

In [3]:
df.head()

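|   | CIC0 | SM1 | GATS1i | NdsCH | Ndssc | MLOGP | response |
|---|------|-----|--------|-------|-------|-------|----------|
| 0 | 3.26 | 0.829 | 1.676 | 0 | 1 | 1.453 | 3.77 |
| 1 | 2.189 | 0.58 | 0.863 | 0 | 0 | 1.348 | 3.115 |
| 2 | 2.125 | 0.638 | 0.831 | 0 | 0 | 1.348 | 3.531 |
| 3 | 3.027 | 0.331 | 1.472 | 1 | 0 | 1.807 | 3.51 |
| 4 | 2.094 | 0.827 | 0.86 | 0 | 0 | 1.886 | 5.39 |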


In [7]:
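# separate the feature columns from the target column
features = df.drop('response', axis=1).values
labels = df[['response']].values
# hold out 20% of the rows for evaluation; train on the remaining 80%
X_train, X_eval, y_train, y_eval = train_test_split(features, labels, test_size=0.2, random_state=0)
# split the held-out rows into validation and test sets
# (no test_size is given, so scikit-learn's default of 0.25 applies)
X_val, X_test, y_val, y_test = train_test_split(X_eval, y_eval, random_state=0)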
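
The two calls to `train_test_split` above produce three subsets. The following is a minimal sketch of the resulting proportions, run on hypothetical stand-in arrays (`toy_features` and `toy_labels` are made up for illustration and are not the fish toxicity data). The second call passes no `test_size`, so scikit-learn's default of 0.25 is assumed to apply.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 100 rows, 6 feature columns, 1 label column
toy_features = np.arange(600).reshape(100, 6)
toy_labels = np.arange(100).reshape(100, 1)

# First split: 80% training, 20% held out for evaluation
X_train, X_eval, y_train, y_eval = train_test_split(
    toy_features, toy_labels, test_size=0.2, random_state=0)

# Second split: the held-out 20% is divided 75/25 (default test_size=0.25)
X_val, X_test, y_val, y_test = train_test_split(X_eval, y_eval, random_state=0)

split_sizes = {'train': len(X_train), 'val': len(X_val), 'test': len(X_test)}
print(split_sizes)  # {'train': 80, 'val': 15, 'test': 5}
```

In other words, roughly 80% of the rows are used for training, 15% for validation, and 5% are kept aside as a test set.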

In [8]:
model = LinearRegression()

In [9]:
model.fit(X_train, y_train)

LinearRegression()

In [10]:
# generate predictions for the validation set
y_pred = model.predict(X_val)

In [11]:
# for regressors, score() returns the R2 score on the given data
r2 = model.score(X_val, y_val)
print('R^2 score: {}'.format(r2))

R^2 score: 0.5623861754188691
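
To make the score less of a black box, here is a minimal sketch that computes R2 by hand as one minus the ratio of the residual sum of squares to the total sum of squares, and compares it with `sklearn.metrics.r2_score`. The helper name `manual_r2` and the toy values are hypothetical and exist only for illustration; they are not part of the exercise.

```python
import numpy as np
from sklearn.metrics import r2_score


def manual_r2(y_true, y_pred):
    """Compute R^2 = 1 - SS_res / SS_tot for 1-D or column-vector inputs."""
    y_true = np.asarray(y_true, dtype=float).reshape(-1)
    y_pred = np.asarray(y_pred, dtype=float).reshape(-1)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values, used only to illustrate the calculation
toy_actuals = np.array([3.0, 5.0, 7.0, 9.0])
toy_predicted = np.array([2.5, 5.5, 7.0, 9.5])

manual = manual_r2(toy_actuals, toy_predicted)
reference = r2_score(toy_actuals, toy_predicted)
print(manual, reference)  # both are 0.9625 up to floating-point rounding
```

With the variables from this exercise, `manual_r2(y_val, y_pred)` should agree with the value returned by `model.score(X_val, y_val)`.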


In [12]:
# place actual and predicted values side by side for comparison
_ys = pd.DataFrame(dict(actuals=y_val.reshape(-1), predicted=y_pred.reshape(-1)))
_ys.head()

|   | actuals | predicted |
|---|---------|-----------|
| 0 | 3.742 | 4.155885 |
| 1 | 6.143 | 6.398238 |
| 2 | 4.674 | 5.183181 |
| 3 | 4.865 | 3.771333 |
| 4 | 4.732 | 4.593059 |
