<h1>Single model submission template.

<p>Every model to be used must satisfy the API requirements of model_template.py and must pass all the tests, such as those in test_model_template.py.
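
As a rough illustration of what such an API check could look like, here is a minimal sketch. The contents of test_model_template.py are not shown in this notebook, so `TinyModel` and `check_model_api` are hypothetical names, and the assertions are only assumptions about what a conforming model should satisfy (one prediction per row, rolling predictions limited to the trailing rows, input data left unchanged).

```python
import numpy as np
import pandas as pd


class TinyModel:
    """Hypothetical stand-in exposing the three public methods of the template."""

    def __init__(self, name):
        self.name = name
        self.model = None   # least-squares coefficients after training
        self.max_lag = 1

    def _generate_features(self, market_data, news_data):
        # Work on a copy so the caller's frame is left untouched.
        return market_data[["open", "close"]].copy()

    def train(self, X, Y, verbose=False):
        feats = self._generate_features(X[0], X[1]).to_numpy()
        self.model, *_ = np.linalg.lstsq(feats, np.asarray(Y, dtype=float), rcond=None)

    def predict(self, X, verbose=False):
        if self.model is None:
            raise RuntimeError("model is not trained")
        return self._generate_features(X[0], X[1]).to_numpy() @ self.model

    def predict_rolling(self, historical_df, prediction_length, verbose=False):
        if self.model is None:
            raise RuntimeError("model is not trained")
        feats = self._generate_features(historical_df[0], historical_df[1])
        return feats.iloc[-prediction_length:].to_numpy() @ self.model


def check_model_api(model, market_df, target):
    """Assumed checks; the real test_model_template.py may differ."""
    columns_before = list(market_df.columns)
    model.train([market_df, None], target)
    full = model.predict([market_df, None])
    last = model.predict_rolling([market_df, None], 3)
    assert len(full) == len(market_df)                 # one prediction per row
    assert len(last) == 3                              # only the trailing rows
    assert list(market_df.columns) == columns_before   # input left unchanged
    return full, last


market = pd.DataFrame({"open": [1.0, 2.0, 3.0, 4.0, 5.0],
                       "close": [1.5, 2.5, 2.0, 4.5, 5.5]})
target = pd.Series([0.1, 0.2, -0.1, 0.3, 0.2])
full, last = check_model_api(TinyModel("tiny"), market, target)
```

The toy model uses no lagged features, so its rolling predictions coincide with the tail of the full predictions; a model with lagged features would only match when enough history is supplied.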

In [1]:
"""
This is a template for the API of models to be used in the stacking framework.
"""
from time import time, ctime
from sklearn.linear_model import LinearRegression
import pandas as pd

class model_example():
    """base class for the model

    this class is for a model (that can also be
    a combination of bagged models)
    The commonality of the bagged models is that
    they share the feature generation
    """

    def __init__(self, name):
        self.name  = name
        self.model = None
        self.type  = LinearRegression
        self.max_lag = 1 #max lagged values (needed for rolling predictions)
        print("\ninit model {}".format(self.name))

    def _generate_features(self, market_data, news_data, verbose=False):
        """
        Given the original market_data and news_data, generate new
        features without changing the original data.
        NOTE: data cleaning and preprocessing do not belong here;
        this method only does feature engineering.

        Args:
            market_data: pandas.DataFrame
            news_data: pandas.DataFrame (unused in this example)
        Returns:
            complete_features: pandas.DataFrame
        """
        start_time = time()
        if verbose: print("Starting features generation for model {}, {}".format(self.name, ctime()))

        complete_features = market_data.copy()
        complete_features['open+close'] = complete_features['open'] + complete_features['close']
        
        complete_features['lag_10_open_max'] = complete_features['open'].rolling(10, min_periods=1).max()
        self.max_lag = 10
        
        complete_features.drop(['time','assetCode','assetName'],axis=1,inplace=True)
        complete_features.fillna(0, inplace=True)

        if verbose: print("Finished features generation for model {}, TIME {}".format(self.name, time()-start_time))
        return complete_features

    def train(self, X, Y, verbose=False):
        """
        basic method to train a model with given data
        model will be inside self.model after training

        Args:
            X: [market_train_df, news_train_df]
            Y: [target]
            verbose: (bool)
        Returns:
            (optional) training_results
        """
        start_time = time()
        if verbose: print("Starting training for model {}, {}".format(self.name, ctime()))

        X_train = self._generate_features(X[0], X[1])
        if verbose: print("X_train shape {}".format(X_train.shape))
        self.model = LinearRegression()
        self.model.fit(X_train, Y)
        del X_train

        if verbose: print("Finished training for model {}, TIME {}".format(self.name, time()-start_time))


    def predict(self, X, verbose=False):
        """
        Given a block of X features, return a prediction for every row.

        Args:
            X: [market_train_df, news_train_df]
        Returns:
            y: numpy.ndarray of predictions, one per row
        """
        start_time = time()
        if verbose: print("Starting prediction for model {}, {}".format(self.name, ctime()))
        if self.model is None:
            # Raising a plain string is a TypeError in Python 3; raise an exception instance.
            raise RuntimeError("Error: model is not trained!")

        X_test = self._generate_features(X[0], X[1])
        if verbose: print("X_test shape {}".format(X_test.shape))
        y_test = self.model.predict(X_test)

        if verbose: print("Finished prediction for model {}, TIME {}".format(self.name, time()-start_time))
        return y_test


    def predict_rolling(self, historical_df, prediction_length, verbose=False):
        """
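        Predict on the most recent rows of historical_df; the earlier rows
        are used only to generate (lagged) features.
        To be used with the rolling prediction structure of the competition.

        Args:
            historical_df: [market_train_df, news_train_df]
            prediction_length: number of trailing rows to predict on, after
                features have been generated on the whole historical_df
        Returns:
            y: numpy.ndarray with prediction_length predictions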
        """
        start_time = time()
        if verbose: print("Starting rolled prediction for model {}, {}".format(self.name, ctime()))

        if self.model is None:
            raise RuntimeError("Error: model is not trained!")

        processed_historical_df = self._generate_features(historical_df[0], historical_df[1])
        X_test = processed_historical_df.iloc[-prediction_length:]
        if verbose: print("X_test shape {}".format(X_test.shape))
        y_test = self.model.predict(X_test)

        if verbose: print("Finished rolled prediction for model {}, TIME {}".format(self.name, time()-start_time))
        return y_test
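
To see why `predict_rolling` needs `max_lag` rows of history, here is a small sketch with a rolling maximum like the one in `_generate_features`. The series and the window length `MAX_LAG = 3` are made up for illustration, with one row per period; the point is only that the newest row gets the same feature value from the last `MAX_LAG` rows as from the full history, and a different value when the history is shorter.

```python
import pandas as pd

MAX_LAG = 3  # hypothetical window length for this sketch

opens = pd.Series([5.0, 9.0, 4.0, 3.0, 7.0, 2.0, 1.0], name="open")

# Rolling maximum computed once over the complete history.
full = opens.rolling(MAX_LAG, min_periods=1).max()

# Rolling maximum computed on only the last MAX_LAG rows.
tail = opens.iloc[-MAX_LAG:].rolling(MAX_LAG, min_periods=1).max()

# Rolling maximum computed on a history that is one row too short.
short = opens.iloc[-(MAX_LAG - 1):].rolling(MAX_LAG, min_periods=1).max()

# The newest row sees a complete window in the first two cases only.
same_with_enough_history = full.iloc[-1] == tail.iloc[-1]
same_with_short_history = full.iloc[-1] == short.iloc[-1]
```

Because `min_periods=1` never produces NaN, a history that is too short does not fail loudly: it just yields a different feature value, which is why the model records `max_lag`.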

<h1>Get data

In [2]:
from kaggle.competitions import twosigmanews
# You can only call make_env() once, so don't lose it!
env = twosigmanews.make_env()

Loading the data... This could take a minute.
Done!


In [3]:
(market_train_df, news_train_df) = env.get_training_data()

<h1>Data cleaning and preprocessing procedure

Data cleaning is applied to the whole dataset for every model. The only requirement is that ***NO NEW FEATURES can be added here***: the cleaned data must have the same columns as the input. New features must be added inside the feature generation section of the model.
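
One way to enforce this rule is a small guard around any cleaning function. The sketch below is only an illustration: `assert_no_new_columns`, `clip_volume` and `add_ratio` are hypothetical helpers, not part of the framework.

```python
import pandas as pd


def assert_no_new_columns(clean_fn, df):
    """Run clean_fn on a copy and fail if it returns columns the input lacked."""
    cleaned = clean_fn(df.copy())
    extra = set(cleaned.columns) - set(df.columns)
    if extra:
        raise AssertionError("cleaning added columns: {}".format(sorted(extra)))
    return cleaned


def clip_volume(df):
    # Hypothetical cleaning step: changes values but keeps the schema.
    df["volume"] = df["volume"].clip(lower=0)
    return df


def add_ratio(df):
    # Hypothetical offender: this belongs in feature generation, not in cleaning.
    df["close_to_open"] = df["close"] / df["open"]
    return df


raw = pd.DataFrame({"open": [10.0, 11.0],
                    "close": [10.5, 10.0],
                    "volume": [100.0, -5.0]})

cleaned = assert_no_new_columns(clip_volume, raw)   # passes

try:
    assert_no_new_columns(add_ratio, raw)
    rejected = False
except AssertionError:
    rejected = True                                  # new column detected
```

Temporary helper columns are still fine inside a cleaning function as long as they are dropped before returning, which is what `prepare_predictions` does.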

In [4]:
import pandas as pd
import numpy as np
def prepare_predictions(market_obs_df):
    market_obs_df['close_to_open'] =  np.abs(market_obs_df['close'] / market_obs_df['open'])
    market_obs_df['assetName_mean_open'] = market_obs_df.groupby('assetName')['open'].transform('mean')
    market_obs_df['assetName_mean_close'] = market_obs_df.groupby('assetName')['close'].transform('mean')

    # For rows where close/open is implausible (>= 2 or <= 0.5): if the open price is further
    # from this company's mean open than the close is from its mean close, replace the open
    # price with the mean; otherwise replace the close price.
    # Label-based .loc is used instead of positional .iloc because iterrows() yields index labels.
    outliers = (market_obs_df['close_to_open'] >= 2) | (market_obs_df['close_to_open'] <= 0.5)
    for i, row in market_obs_df.loc[outliers].iterrows():
        if np.abs(row['assetName_mean_open'] - row['open']) > np.abs(row['assetName_mean_close'] - row['close']):
            market_obs_df.loc[i, 'open'] = row['assetName_mean_open']
        else:
            market_obs_df.loc[i, 'close'] = row['assetName_mean_close']
            
    return market_obs_df.drop(['assetName_mean_open', 'assetName_mean_close','close_to_open'],axis=1)

In [5]:
market_train_df = prepare_predictions(market_train_df)
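# .copy() makes the filtered frame independent, so the later in-place edits do not act on a slice
market_train_df = market_train_df.loc[market_train_df['time'] >= '2010-01-01 22:00:00+0000'].copy()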

In [6]:
bottom, top = market_train_df.returnsOpenNextMktres10.quantile(0.001), market_train_df.returnsOpenNextMktres10.quantile(0.999)
market_train_df.returnsOpenNextMktres10 = market_train_df.returnsOpenNextMktres10.clip(bottom, top)
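
The cell above limits the influence of extreme target values by clipping them to the 0.1% and 99.9% quantiles. A toy version of the same idea, with made-up numbers and wider 10%/90% quantiles so that the effect is visible on eight values:

```python
import pandas as pd

returns = pd.Series([-5.0, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3, 8.0])

# Quantile bounds; the notebook uses 0.001 and 0.999 on the real target.
bottom, top = returns.quantile(0.1), returns.quantile(0.9)

# Values outside [bottom, top] are pulled to the bounds, the rest are untouched.
clipped = returns.clip(bottom, top)

n_changed = int((clipped != returns).sum())
```

Clipping keeps every row, unlike filtering the outliers out, so the target stays aligned with the feature rows.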

<h1>Initialize and train model

In [7]:
model = model_example('example_Linear_Regression')


init model example_Linear_Regression


In [8]:
target = market_train_df.returnsOpenNextMktres10
market_train_df.drop('returnsOpenNextMktres10', axis=1, inplace=True)

In [9]:
model.train([market_train_df, news_train_df], target, verbose=True)

Starting training for model example_Linear_Regression, Thu Dec 20 17:15:58 2018
X_train shape (2946739, 14)
Finished training for model example_Linear_Regression, TIME 3.053800582885742


<h1>Prediction loop

In [10]:
days = env.get_prediction_days()

In [13]:
# skip a prediction (for testing)
# env.predict(predictions_template_df)

In [14]:
from time import time, ctime

total_market_df = pd.DataFrame(columns=['time', 'assetCode', 'assetName', 'volume', 'close', 'open',
       'returnsClosePrevRaw1', 'returnsOpenPrevRaw1',
       'returnsClosePrevMktres1', 'returnsOpenPrevMktres1',
       'returnsClosePrevRaw10', 'returnsOpenPrevRaw10',
       'returnsClosePrevMktres10', 'returnsOpenPrevMktres10','period', 'universe'])

max_lag, days_count = model.max_lag, 0
for (market_obs_df, news_obs_df, predictions_template_df) in days:
    days_count += 1
    if not days_count%50: print(days_count)
        
    market_obs_df['period']   = days_count
    market_obs_df['universe'] = 1
    
    start_time = time()
    total_market_df = pd.concat([total_market_df, market_obs_df])
    #total_news_obs_df.append(news_obs_df)
        
    history_market_df = total_market_df[total_market_df['period'] > days_count - max_lag - 1].drop('period', axis=1)
    predictions = model.predict_rolling([history_market_df, None],
                                        len(predictions_template_df), verbose=True)
    
    predictions_template_df.confidenceValue = predictions.clip(-1, 1)
    env.predict(predictions_template_df)
    print("[{}] loop prediction {}, TIME {}".format(days_count, ctime(), time()-start_time))
print('Done!')

[1] created history dataframes Thu Dec 20 17:16:37 2018, TIME 0.008631229400634766
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:37 2018
X_test shape (1823, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.008206605911254883
[1] loop prediction Thu Dec 20 17:16:37 2018, TIME 0.016048908233642578
[2] created history dataframes Thu Dec 20 17:16:37 2018, TIME 0.008676290512084961
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:37 2018
X_test shape (1818, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01040506362915039
[2] loop prediction Thu Dec 20 17:16:37 2018, TIME 0.021320104598999023
[3] created history dataframes Thu Dec 20 17:16:37 2018, TIME 0.00991058349609375
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:38 2018
X_test shape (1821, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.007759571075439453


Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:38 2018
X_test shape (1813, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.014297723770141602
[24] loop prediction Thu Dec 20 17:16:38 2018, TIME 0.0331270694732666
[25] created history dataframes Thu Dec 20 17:16:39 2018, TIME 0.01477956771850586
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:39 2018
X_test shape (1811, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.013945341110229492
[25] loop prediction Thu Dec 20 17:16:39 2018, TIME 0.031862735748291016
[26] created history dataframes Thu Dec 20 17:16:39 2018, TIME 0.015562057495117188
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:39 2018
X_test shape (1804, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.013099431991577148
[26] loop prediction Thu Dec 20 17:16:39 2018, TIME 0.030048131942749023

X_test shape (1807, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.02248835563659668
[48] loop prediction Thu Dec 20 17:16:40 2018, TIME 0.06013965606689453
[49] created history dataframes Thu Dec 20 17:16:40 2018, TIME 0.03459620475769043
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:40 2018
X_test shape (1807, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.022391319274902344
[49] loop prediction Thu Dec 20 17:16:40 2018, TIME 0.06053519248962402
50
[50] created history dataframes Thu Dec 20 17:16:41 2018, TIME 0.03521394729614258
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:41 2018
X_test shape (1808, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.016019821166992188
[50] loop prediction Thu Dec 20 17:16:41 2018, TIME 0.10560727119445801
[51] created history dataframes Thu Dec 20 17:16:41 2018, TIME 0.03453207015991211

[72] created history dataframes Thu Dec 20 17:16:43 2018, TIME 0.04595136642456055
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:43 2018
X_test shape (1834, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.017379045486450195
[72] loop prediction Thu Dec 20 17:16:43 2018, TIME 0.06927180290222168
[73] created history dataframes Thu Dec 20 17:16:43 2018, TIME 0.04538846015930176
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:43 2018
X_test shape (1830, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01786947250366211
[73] loop prediction Thu Dec 20 17:16:43 2018, TIME 0.06702327728271484
[74] created history dataframes Thu Dec 20 17:16:43 2018, TIME 0.04543566703796387
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:43 2018
X_test shape (1833, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.017592191696166992

[95] created history dataframes Thu Dec 20 17:16:46 2018, TIME 0.05521106719970703
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:46 2018
X_test shape (1824, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.018196821212768555
[95] loop prediction Thu Dec 20 17:16:46 2018, TIME 0.07680630683898926
[96] created history dataframes Thu Dec 20 17:16:46 2018, TIME 0.0531764030456543
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:46 2018
X_test shape (1829, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.02824544906616211
[96] loop prediction Thu Dec 20 17:16:46 2018, TIME 0.08497738838195801
[97] created history dataframes Thu Dec 20 17:16:47 2018, TIME 0.05324077606201172
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:47 2018
X_test shape (1826, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015215396881103516


[118] created history dataframes Thu Dec 20 17:16:50 2018, TIME 0.06353092193603516
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:50 2018
X_test shape (1786, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015376806259155273
[118] loop prediction Thu Dec 20 17:16:50 2018, TIME 0.08227109909057617
[119] created history dataframes Thu Dec 20 17:16:50 2018, TIME 0.06416821479797363
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:50 2018
X_test shape (1792, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.017772436141967773
[119] loop prediction Thu Dec 20 17:16:50 2018, TIME 0.08453559875488281
[120] created history dataframes Thu Dec 20 17:16:50 2018, TIME 0.0647284984588623
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:50 2018
X_test shape (1794, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.0150630474090

[142] created history dataframes Thu Dec 20 17:16:54 2018, TIME 0.07457923889160156
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:54 2018
X_test shape (1807, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015716075897216797
[142] loop prediction Thu Dec 20 17:16:54 2018, TIME 0.09076523780822754
[143] created history dataframes Thu Dec 20 17:16:54 2018, TIME 0.07321810722351074
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:54 2018
X_test shape (1806, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015001773834228516
[143] loop prediction Thu Dec 20 17:16:54 2018, TIME 0.09311580657958984
[144] created history dataframes Thu Dec 20 17:16:54 2018, TIME 0.23477935791015625
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:54 2018
X_test shape (1800, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015285491943

Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:58 2018
X_test shape (1843, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.0151214599609375
[165] loop prediction Thu Dec 20 17:16:58 2018, TIME 0.09447836875915527
[166] created history dataframes Thu Dec 20 17:16:58 2018, TIME 0.08082103729248047
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:58 2018
X_test shape (1835, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01750326156616211
[166] loop prediction Thu Dec 20 17:16:58 2018, TIME 0.0981602668762207
[167] created history dataframes Thu Dec 20 17:16:58 2018, TIME 0.08065295219421387
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:16:58 2018
X_test shape (1832, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.020462751388549805
[167] loop prediction Thu Dec 20 17:16:58 2018, TIME 0.10187053680419922

X_test shape (1832, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.016837358474731445
[188] loop prediction Thu Dec 20 17:17:03 2018, TIME 0.09521079063415527
[189] created history dataframes Thu Dec 20 17:17:03 2018, TIME 0.09123539924621582
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:03 2018
X_test shape (1835, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.024194955825805664
[189] loop prediction Thu Dec 20 17:17:03 2018, TIME 0.11005306243896484
[190] created history dataframes Thu Dec 20 17:17:03 2018, TIME 0.10090208053588867
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:03 2018
X_test shape (1832, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01669168472290039
[190] loop prediction Thu Dec 20 17:17:03 2018, TIME 0.09647178649902344
[191] created history dataframes Thu Dec 20 17:17:03 2018, TIME 0.10390949249267578

Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:08 2018
X_test shape (1795, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.016781330108642578
[212] loop prediction Thu Dec 20 17:17:08 2018, TIME 0.1274409294128418
[213] created history dataframes Thu Dec 20 17:17:08 2018, TIME 0.1071634292602539
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:08 2018
X_test shape (1796, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.0162355899810791
[213] loop prediction Thu Dec 20 17:17:08 2018, TIME 0.10670137405395508
[214] created history dataframes Thu Dec 20 17:17:08 2018, TIME 0.1092827320098877
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:08 2018
X_test shape (1794, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015813112258911133
[214] loop prediction Thu Dec 20 17:17:08 2018, TIME 0.10443305969238281

Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:13 2018
X_test shape (1765, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01492166519165039
[235] loop prediction Thu Dec 20 17:17:13 2018, TIME 0.10878467559814453
[236] created history dataframes Thu Dec 20 17:17:13 2018, TIME 0.11616349220275879
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:13 2018
X_test shape (1767, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.017766475677490234
[236] loop prediction Thu Dec 20 17:17:13 2018, TIME 0.11218404769897461
[237] created history dataframes Thu Dec 20 17:17:14 2018, TIME 0.11649966239929199
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:14 2018
X_test shape (1765, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01572132110595703
[237] loop prediction Thu Dec 20 17:17:14 2018, TIME 0.11071252822875977

Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:19 2018
X_test shape (1750, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015309572219848633
[258] loop prediction Thu Dec 20 17:17:19 2018, TIME 0.11742258071899414
[259] created history dataframes Thu Dec 20 17:17:19 2018, TIME 0.13912582397460938
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:19 2018
X_test shape (1752, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015178680419921875
[259] loop prediction Thu Dec 20 17:17:19 2018, TIME 0.11743783950805664
[260] created history dataframes Thu Dec 20 17:17:19 2018, TIME 0.13556981086730957
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:20 2018
X_test shape (1757, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.014276504516601562
[260] loop prediction Thu Dec 20 17:17:20 2018, TIME 0.11546015739440918

Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:25 2018
X_test shape (1784, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015153169631958008
[281] loop prediction Thu Dec 20 17:17:25 2018, TIME 0.12563180923461914
[282] created history dataframes Thu Dec 20 17:17:25 2018, TIME 0.1476755142211914
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:26 2018
X_test shape (1785, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.020348310470581055
[282] loop prediction Thu Dec 20 17:17:26 2018, TIME 0.13016104698181152
[283] created history dataframes Thu Dec 20 17:17:26 2018, TIME 0.14607644081115723
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:26 2018
X_test shape (1788, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015484809875488281
[283] loop prediction Thu Dec 20 17:17:26 2018, TIME 0.12558746337890625

[305] created history dataframes Thu Dec 20 17:17:32 2018, TIME 0.15867090225219727
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:32 2018
X_test shape (1833, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01668548583984375
[305] loop prediction Thu Dec 20 17:17:32 2018, TIME 0.14358043670654297
[306] created history dataframes Thu Dec 20 17:17:33 2018, TIME 0.16000032424926758
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:33 2018
X_test shape (1833, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.014749765396118164
[306] loop prediction Thu Dec 20 17:17:33 2018, TIME 0.13388347625732422
[307] created history dataframes Thu Dec 20 17:17:33 2018, TIME 0.15765094757080078
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:33 2018
X_test shape (1840, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.0166301727294

[328] created history dataframes Thu Dec 20 17:17:39 2018, TIME 0.16916108131408691
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:40 2018
X_test shape (1853, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01639580726623535
[328] loop prediction Thu Dec 20 17:17:40 2018, TIME 0.14743852615356445
[329] created history dataframes Thu Dec 20 17:17:40 2018, TIME 0.17127609252929688
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:40 2018
X_test shape (1852, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015142440795898438
[329] loop prediction Thu Dec 20 17:17:40 2018, TIME 0.1397418975830078
[330] created history dataframes Thu Dec 20 17:17:40 2018, TIME 0.16963601112365723
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:40 2018
X_test shape (1851, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01581978797912

[351] created history dataframes Thu Dec 20 17:17:47 2018, TIME 0.1844782829284668
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:47 2018
X_test shape (1848, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015415668487548828
[351] loop prediction Thu Dec 20 17:17:47 2018, TIME 0.14817261695861816
[352] created history dataframes Thu Dec 20 17:17:47 2018, TIME 0.18047523498535156
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:48 2018
X_test shape (1849, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.016280651092529297
[352] loop prediction Thu Dec 20 17:17:48 2018, TIME 0.14778804779052734
[353] created history dataframes Thu Dec 20 17:17:48 2018, TIME 0.18015336990356445
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:48 2018
X_test shape (1848, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.0154094696044

[374] created history dataframes Thu Dec 20 17:17:55 2018, TIME 0.17917752265930176
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:56 2018
X_test shape (1804, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01714181900024414
[374] loop prediction Thu Dec 20 17:17:56 2018, TIME 0.16121768951416016
[375] created history dataframes Thu Dec 20 17:17:56 2018, TIME 0.19257354736328125
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:56 2018
X_test shape (1812, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01822805404663086
[375] loop prediction Thu Dec 20 17:17:56 2018, TIME 0.1625077724456787
[376] created history dataframes Thu Dec 20 17:17:56 2018, TIME 0.19152402877807617
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:17:56 2018
X_test shape (1811, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.014795064926147

[397] created history dataframes Thu Dec 20 17:18:04 2018, TIME 0.20357608795166016
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:04 2018
X_test shape (1806, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.018920183181762695
[397] loop prediction Thu Dec 20 17:18:04 2018, TIME 0.17104244232177734
[398] created history dataframes Thu Dec 20 17:18:05 2018, TIME 0.20667052268981934
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:05 2018
X_test shape (1800, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015654563903808594
[398] loop prediction Thu Dec 20 17:18:05 2018, TIME 0.16870880126953125
[399] created history dataframes Thu Dec 20 17:18:05 2018, TIME 0.1889972686767578
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:05 2018
X_test shape (1824, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.0157375335693

[420] created history dataframes Thu Dec 20 17:18:14 2018, TIME 0.21404743194580078
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:14 2018
X_test shape (1835, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01635122299194336
[420] loop prediction Thu Dec 20 17:18:14 2018, TIME 0.18314385414123535
[421] created history dataframes Thu Dec 20 17:18:14 2018, TIME 0.19540810585021973
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:14 2018
X_test shape (1832, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01732182502746582
[421] loop prediction Thu Dec 20 17:18:14 2018, TIME 0.17709660530090332
[422] created history dataframes Thu Dec 20 17:18:14 2018, TIME 0.19730830192565918
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:14 2018
X_test shape (1836, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01778817176818

[443] created history dataframes Thu Dec 20 17:18:23 2018, TIME 0.2233266830444336
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:23 2018
X_test shape (1835, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015669822692871094
[443] loop prediction Thu Dec 20 17:18:23 2018, TIME 0.18533802032470703
[444] created history dataframes Thu Dec 20 17:18:23 2018, TIME 0.22636032104492188
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:23 2018
X_test shape (1832, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015918254852294922
[444] loop prediction Thu Dec 20 17:18:23 2018, TIME 0.18601250648498535
[445] created history dataframes Thu Dec 20 17:18:24 2018, TIME 0.22053837776184082
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:24 2018
X_test shape (1834, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.0157151222229

[466] created history dataframes Thu Dec 20 17:18:33 2018, TIME 0.23040032386779785
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:33 2018
X_test shape (1795, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.021813154220581055
[466] loop prediction Thu Dec 20 17:18:33 2018, TIME 0.24949336051940918
[467] created history dataframes Thu Dec 20 17:18:33 2018, TIME 0.234724760055542
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:34 2018
X_test shape (1796, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015733003616333008
[467] loop prediction Thu Dec 20 17:18:34 2018, TIME 0.2462754249572754
[468] created history dataframes Thu Dec 20 17:18:34 2018, TIME 0.23710298538208008
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:34 2018
X_test shape (1794, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.016178369522094

[489] created history dataframes Thu Dec 20 17:18:44 2018, TIME 0.22882556915283203
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:45 2018
X_test shape (1765, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.016899824142456055
[489] loop prediction Thu Dec 20 17:18:45 2018, TIME 0.25714802742004395
[490] created history dataframes Thu Dec 20 17:18:45 2018, TIME 0.22368097305297852
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:45 2018
X_test shape (1767, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01408243179321289
[490] loop prediction Thu Dec 20 17:18:45 2018, TIME 0.25408220291137695
[491] created history dataframes Thu Dec 20 17:18:45 2018, TIME 0.22173237800598145
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:46 2018
X_test shape (1765, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.0137057304382

[512] created history dataframes Thu Dec 20 17:18:56 2018, TIME 0.24918341636657715
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:56 2018
X_test shape (1750, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.014865875244140625
[512] loop prediction Thu Dec 20 17:18:56 2018, TIME 0.26548337936401367
[513] created history dataframes Thu Dec 20 17:18:57 2018, TIME 0.22940587997436523
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:57 2018
X_test shape (1752, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.016319990158081055
[513] loop prediction Thu Dec 20 17:18:57 2018, TIME 0.2622947692871094
[514] created history dataframes Thu Dec 20 17:18:57 2018, TIME 0.22575759887695312
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:18:57 2018
X_test shape (1757, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.0213837623596

[535] created history dataframes Thu Dec 20 17:19:08 2018, TIME 0.2368457317352295
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:19:08 2018
X_test shape (1784, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.015866756439208984
[535] loop prediction Thu Dec 20 17:19:08 2018, TIME 0.27315807342529297
[536] created history dataframes Thu Dec 20 17:19:08 2018, TIME 0.24254703521728516
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:19:09 2018
X_test shape (1785, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.017181873321533203
[536] loop prediction Thu Dec 20 17:19:09 2018, TIME 0.27736926078796387
[537] created history dataframes Thu Dec 20 17:19:09 2018, TIME 0.26238059997558594
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:19:09 2018
X_test shape (1788, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.0166964530944

[558] created history dataframes Thu Dec 20 17:19:20 2018, TIME 0.25360774993896484
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:19:21 2018
X_test shape (1829, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.016235828399658203
[558] loop prediction Thu Dec 20 17:19:21 2018, TIME 0.2882118225097656
[559] created history dataframes Thu Dec 20 17:19:21 2018, TIME 0.2530062198638916
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:19:21 2018
X_test shape (1833, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.016541719436645508
[559] loop prediction Thu Dec 20 17:19:21 2018, TIME 0.286693811416626
[560] created history dataframes Thu Dec 20 17:19:21 2018, TIME 0.2729785442352295
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:19:22 2018
X_test shape (1833, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01572203636169433

[581] created history dataframes Thu Dec 20 17:19:33 2018, TIME 0.26470041275024414
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:19:33 2018
X_test shape (1856, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01614069938659668
[581] loop prediction Thu Dec 20 17:19:33 2018, TIME 0.29860377311706543
[582] created history dataframes Thu Dec 20 17:19:34 2018, TIME 0.2614939212799072
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:19:34 2018
X_test shape (1853, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01668572425842285
[582] loop prediction Thu Dec 20 17:19:34 2018, TIME 0.2961232662200928
[583] created history dataframes Thu Dec 20 17:19:34 2018, TIME 0.26618289947509766
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:19:35 2018
X_test shape (1852, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.0168430805206298

[604] created history dataframes Thu Dec 20 17:19:47 2018, TIME 0.29100608825683594
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:19:47 2018
X_test shape (1846, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.0187532901763916
[604] loop prediction Thu Dec 20 17:19:47 2018, TIME 0.3185288906097412
[605] created history dataframes Thu Dec 20 17:19:47 2018, TIME 0.27582240104675293
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:19:48 2018
X_test shape (1848, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.016654491424560547
[605] loop prediction Thu Dec 20 17:19:48 2018, TIME 0.31243252754211426
[606] created history dataframes Thu Dec 20 17:19:48 2018, TIME 0.2765767574310303
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:19:48 2018
X_test shape (1849, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.0150036811828613

[627] created history dataframes Thu Dec 20 17:20:01 2018, TIME 0.2868227958679199
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:20:01 2018
X_test shape (1812, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.013570547103881836
[627] loop prediction Thu Dec 20 17:20:01 2018, TIME 0.31665730476379395
[628] created history dataframes Thu Dec 20 17:20:01 2018, TIME 0.2870657444000244
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:20:02 2018
X_test shape (1804, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.014031648635864258
[628] loop prediction Thu Dec 20 17:20:02 2018, TIME 0.31415677070617676
[629] created history dataframes Thu Dec 20 17:20:02 2018, TIME 0.30220627784729004
Starting rolled prediction for model example_Linear_Regression, Thu Dec 20 17:20:02 2018
X_test shape (1812, 14)
Finished rolled prediction for model example_Linear_Regression, TIME 0.01454019546508

## **`write_submission_file`** function

Writes your predictions to a CSV file (`submission.csv`) in the current working directory.

In [None]:
env.write_submission_file()

In [None]:
# We've got a submission file!
import os
# endswith avoids matching names that merely contain '.csv' (e.g. 'x.csv.bak')
print([filename for filename in os.listdir('.') if filename.endswith('.csv')])

As the helper message indicates, calling `write_submission_file` does **not** by itself make a submission to the competition. It only tells the module to write the `submission.csv` file as part of the Kernel's output. To make a submission, you'll have to **Commit** your Kernel, find the generated `submission.csv` file in that Kernel Version's Output tab (note this is _outside_ of the Kernel Editor), then click "Submit to Competition". When we re-run your Kernel during Stage Two, we will run the Kernel Version (generated when you hit "Commit") linked to your chosen Submission.
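Before committing, it can be worth checking that the file written to the working directory looks the way you expect. The following is a minimal sketch, not part of the competition module: `check_submission` is a hypothetical helper, and the column names are an assumption about the submission format that you should replace with whatever your own `submission.csv` contains. It is demonstrated on a throwaway file so it runs anywhere.

```python
import csv
import os
import tempfile


def check_submission(path, expected_columns):
    """Return the number of data rows; raise if the file is missing or the header differs."""
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header != list(expected_columns):
            raise ValueError("unexpected header: {}".format(header))
        # Count the remaining lines, i.e. the prediction rows.
        return sum(1 for _ in reader)


# ASSUMPTION: these column names are illustrative, not taken from the notebook.
columns = ["time", "assetCode", "confidenceValue"]

# Write a small demo file instead of relying on a real submission.csv.
demo_path = os.path.join(tempfile.mkdtemp(), "submission.csv")
with open(demo_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(columns)
    writer.writerow(["2018-12-20", "AAPL.O", 0.5])
    writer.writerow(["2018-12-20", "MSFT.O", -0.25])

n_rows = check_submission(demo_path, columns)
print("rows in demo submission:", n_rows)
```

In the Kernel you would point `check_submission` at `'submission.csv'` after calling `env.write_submission_file()`.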

## Restart the Kernel to run your code again
To combat cheating, you are only allowed to call `make_env` or iterate through `get_prediction_days` once per Kernel run. However, while you're iterating on your model it's reasonable to try something out, change the model a bit, and try it again. Unfortunately, if you simply re-run the code, or even refresh the browser page, you'll still be in the same Kernel execution session as before, and the `twosigmanews` module will still throw errors. To get around this, you need to explicitly restart your Kernel execution session, which you can do by pressing the Restart button in the Kernel Editor's bottom Console tab:
![Restart button](https://i.imgur.com/hudu8jF.png)
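While experimenting, one way to avoid accidentally triggering the "called twice" error when a cell is re-run is to cache the environment object the first time it is created. The sketch below is only an illustration of that idea: `make_once` and `FakeEnvModule` are hypothetical names, and the fake module merely imitates the once-per-run restriction described above, since `twosigmanews` is only available inside the Kernel. Caching does not lift the restriction on iterating `get_prediction_days`; a restart is still needed for that.

```python
def make_once(cache, key, factory):
    """Return cache[key], calling factory() only if the key is not cached yet."""
    if key not in cache:
        cache[key] = factory()
    return cache[key]


class FakeEnvModule:
    """Hypothetical stand-in that raises on a second make_env call."""

    def __init__(self):
        self.calls = 0

    def make_env(self):
        self.calls += 1
        if self.calls > 1:
            raise RuntimeError("make_env may only be called once per run")
        return {"name": "fake-env"}


fake = FakeEnvModule()
session_cache = {}

# The first call creates the environment; the second reuses the cached object
# instead of calling make_env again.
env = make_once(session_cache, "env", fake.make_env)
env_again = make_once(session_cache, "env", fake.make_env)
print(env is env_again, fake.calls)
```

In a notebook, the cache dictionary should live in a cell that is not re-run, otherwise it is reset and the guard no longer helps.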