## Task: Fit a linear model with `dly=d_one_se`

In [1]:
import numpy as np
import pickle
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_squared_error
from sklearn.model_selection import train_test_split, KFold

In [2]:
with open('example_data_s1.pickle', 'rb') as fp:
    X,y = pickle.load(fp)

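tsamp = 0.05            # sampling period
nt, nneuron = X.shape   # number of time samples, number of neurons
nout = y.shape[1]       # number of output variables
ttotal = nt*tsamp       # total duration (nt samples of tsamp each)

# Reduced data set: keep only the first nred samples
nred = 6000
Xred = X[:nred]
yred = y[:nred]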

In [3]:
def create_dly_data(X,y,dly):
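    """
    Create delayed data.

    Row j of Xdly is [X[j+dly], X[j+dly-1], ..., X[j]] concatenated,
    and ydly[j] = y[j+dly], so each target is paired with the current
    feature row and the dly rows before it.
    """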
    n,p = X.shape
    Xdly = np.zeros((n-dly,(dly+1)*p))
    for i in range(dly+1):
        Xdly[:,i*p:(i+1)*p] = X[dly-i:n-i,:]
    ydly = y[dly:]
    
    return Xdly, ydly
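
To make the column layout of `create_dly_data` concrete, here is a minimal sketch on a toy array. The names `X_toy` and `y_toy` and their values are illustrative only and are not part of the dataset. Block `i` of the output columns holds the features delayed by `i` samples.

```python
import numpy as np

def create_dly_data(X, y, dly):
    """Same function as in the cell above, repeated so this sketch runs alone."""
    n, p = X.shape
    Xdly = np.zeros((n - dly, (dly + 1) * p))
    for i in range(dly + 1):
        Xdly[:, i*p:(i+1)*p] = X[dly-i:n-i, :]
    ydly = y[dly:]
    return Xdly, ydly

# Toy data: 5 time steps, 2 features, 1 output (illustrative values only)
X_toy = np.arange(10, dtype=float).reshape(5, 2)
y_toy = np.arange(5, dtype=float).reshape(5, 1)

X_toy_dly, y_toy_dly = create_dly_data(X_toy, y_toy, dly=2)

print(X_toy_dly.shape)  # (3, 6): 5-2 rows, (2+1)*2 columns
print(X_toy_dly[0])     # [4. 5. 2. 3. 0. 1.] -> X[2], X[1], X[0] side by side
print(y_toy_dly[:, 0])  # [2. 3. 4.] -> targets start at time index dly
```

The first `dly` samples are lost because they do not have a full history of past feature rows.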

In [4]:
dmax = 15
Xdly, ydly = create_dly_data(Xred,yred,dmax) 

Assign to `d_one_se` the value you found in your Colab notebook:

In [5]:
#grade (write your code in this cell and DO NOT DELETE THIS LINE)
d_one_se = 6

Now that we have selected a model order, we can fit that model to the
(reduced) data.

Use all rows of `Xdly` and `ydly` to fit a linear regression model with the
best delay according to the one-SE rule. `Xdly` was built with `dmax` delays,
so keep only its first `(d_one_se+1)*nneuron` columns, which hold
delays `0` through `d_one_se`.
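
Selecting columns works because of how `create_dly_data` orders its output: the leading blocks of a large-delay matrix are the same as a small-delay matrix with its earliest rows dropped. The check below is a sketch on synthetic random data; `X_demo`, `y_demo`, `d_big` and `d_small` are hypothetical names chosen for illustration.

```python
import numpy as np

def create_dly_data(X, y, dly):
    """Same function as defined earlier, repeated so this sketch runs alone."""
    n, p = X.shape
    Xdly = np.zeros((n - dly, (dly + 1) * p))
    for i in range(dly + 1):
        Xdly[:, i*p:(i+1)*p] = X[dly-i:n-i, :]
    ydly = y[dly:]
    return Xdly, ydly

# Synthetic data (assumption: random values stand in for the real recordings)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = rng.normal(size=(50, 2))

d_big, d_small = 5, 2
p = X_demo.shape[1]

X_big, y_big = create_dly_data(X_demo, y_demo, d_big)
X_small, y_small = create_dly_data(X_demo, y_demo, d_small)

# Leading (d_small+1)*p columns of the large-delay matrix
X_prefix = X_big[:, :(d_small + 1) * p]

# They equal the small-delay matrix without its first (d_big - d_small) rows
same_X = np.array_equal(X_prefix, X_small[d_big - d_small:])
same_y = np.array_equal(y_big, y_small[d_big - d_small:])
print(same_X, same_y)
```

One consequence is that a model fit on the column subset sees `dmax - d_one_se` fewer rows than a model fit on a matrix built directly with delay `d_one_se`.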


In [6]:
#grade (write your code in this cell and DO NOT DELETE THIS LINE)
# Xdly holds delays 0..dmax in consecutive blocks of nneuron columns,
# so the first nneuron*(d_one_se+1) columns cover delays 0..d_one_se.
X_dly_best = Xdly[:, :nneuron*(d_one_se+1)]

# Fit the linear regression model using the best delay
reg_best = LinearRegression().fit(X_dly_best, ydly)

Then, define a test set using data that was not used to train the model:


In [9]:
#grade (do not modify this cell)
# if d_one_se is the optimal model order, you can use
# for this workspace, we'll use a slightly different test set than the Colab notebook
Xts = X[nred+1:nred+1001+d_one_se]
yts = y[nred+1:nred+1001+d_one_se]
# and then use 
Xts_dly, yts_dly = create_dly_data(Xts,yts,d_one_se)
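
The extra `d_one_se` rows in the slice above compensate for the rows that `create_dly_data` drops. A quick sketch of the arithmetic, using the values of `nred` and `d_one_se` set earlier (the names `start`, `stop`, `n_raw` and `n_delayed` are illustrative):

```python
nred = 6000
d_one_se = 6

start = nred + 1                # first test index; index nred itself is unused
stop = nred + 1001 + d_one_se   # exclusive end of the slice

n_raw = stop - start            # rows in Xts and yts
n_delayed = n_raw - d_one_se    # rows left after create_dly_data

print(n_raw, n_delayed)         # 1006 1000
```

The training data ends at index `nred - 1`, so the test rows do not overlap the rows used for fitting.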

Use your fitted model to compute the R^2 score on this test set.

In [12]:
#grade (write your code in this cell and DO NOT DELETE THIS LINE)
# Xts_dly was built with delay d_one_se, so it already has exactly
# nneuron*(d_one_se+1) columns; the slice below keeps all of them.
Xts_dly_best = Xts_dly[:, :X.shape[1]*(d_one_se+1)]

# Predict the velocities on the test set using the fitted model
yhat = reg_best.predict(Xts_dly_best)

# Compute the R^2 score using the actual and predicted velocities
rsq = r2_score(yts_dly, yhat)
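
Because `y` has `nout` columns, the single number returned by `r2_score` is an aggregate over outputs. Assuming the default `multioutput='uniform_average'` setting, that aggregate is the plain mean of the per-column R^2 values. The NumPy sketch below reproduces that calculation on made-up numbers; the helper `r2_uniform_average` and the demo arrays are hypothetical and not part of the assignment.

```python
import numpy as np

def r2_uniform_average(y_true, y_pred):
    """R^2 for each output column, plus their equal-weight mean."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Residual and total sums of squares, computed column by column
    ss_res = np.sum((y_true - y_pred) ** 2, axis=0)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2, axis=0)
    r2_each = 1.0 - ss_res / ss_tot
    return r2_each.mean(), r2_each

# Made-up targets and predictions: column 0 is predicted perfectly,
# column 1 has one wrong value
y_true_demo = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
y_pred_demo = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 4.0]])

r2_avg, r2_each = r2_uniform_average(y_true_demo, y_pred_demo)
print(r2_each)  # [1.  0.5]
print(r2_avg)   # 0.75
```

Looking at the per-column values can show whether one output is predicted much worse than the others, which the averaged score hides.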