# Example Predictor: Linear Rollout Predictor

This example contains basic functionality for training and evaluating a linear predictor that rolls out predictions day-by-day.

First, a training data set is created from historical case and NPI (non-pharmaceutical intervention) data.

Second, a linear model is trained to predict future cases from prior case data along with prior and future NPI data.
The model is an off-the-shelf scikit-learn Lasso model. It uses a positive weight constraint, applied to negated NPI inputs, to enforce the assumption that stricter NPIs are negatively correlated with future cases.
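
The positive weight constraint can be illustrated on synthetic data. The sketch below is not part of the notebook's pipeline: the data, the variable names, and the chosen effect sizes are all made up for illustration. With `positive=True`, Lasso only allows non-negative coefficients, so negating an input column before fitting restricts that input to a non-positive effect on the target.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n = 500
prior_cases = rng.uniform(0, 100, size=n)               # synthetic past cases
npi_level = rng.integers(0, 4, size=n).astype(float)    # synthetic NPI level, 0..3

# Synthetic target: more prior cases -> more cases, stricter NPI -> fewer cases
future_cases = 0.8 * prior_cases - 5.0 * npi_level + rng.normal(0, 1, size=n)

# Negate the NPI column so a non-negative weight means "reduces cases"
X = np.column_stack([prior_cases, -npi_level])
sketch_model = Lasso(alpha=0.1, positive=True, max_iter=10000)
sketch_model.fit(X, future_cases)

cases_weight, npi_weight = sketch_model.coef_
print(cases_weight, npi_weight)
```

Both learned weights are non-negative, yet the second one acts on the negated NPI column, so a higher NPI level lowers the prediction.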

Third, a sample evaluation set is created, and the predictor is applied to this evaluation set to produce prediction results in the correct format.

## Training

In [1]:
import pickle
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

### Copy the data locally

In [2]:
# Main source for the training data
DATA_URL = 'https://raw.githubusercontent.com/OxCGRT/covid-policy-tracker/master/data/OxCGRT_latest.csv'
# Local file
DATA_FILE = 'data/OxCGRT_latest.csv'

In [3]:
import os
import urllib.request
if not os.path.exists('data'):
    os.mkdir('data')
urllib.request.urlretrieve(DATA_URL, DATA_FILE)

('data/OxCGRT_latest.csv', <http.client.HTTPMessage at 0x7f1a9f699b00>)

In [4]:
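# Load historical data from local file.
# Note: error_bad_lines was deprecated in pandas 1.3 and removed in 2.0;
# on newer versions, pass on_bad_lines="skip" instead.
df = pd.read_csv(DATA_FILE,
                 parse_dates=['Date'],
                 encoding="ISO-8859-1",
                 dtype={"RegionName": str,
                        "RegionCode": str},
                 error_bad_lines=False)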

In [5]:
df.columns

Index(['CountryName', 'CountryCode', 'RegionName', 'RegionCode',
       'Jurisdiction', 'Date', 'C1_School closing', 'C1_Flag',
       'C2_Workplace closing', 'C2_Flag', 'C3_Cancel public events', 'C3_Flag',
       'C4_Restrictions on gatherings', 'C4_Flag', 'C5_Close public transport',
       'C5_Flag', 'C6_Stay at home requirements', 'C6_Flag',
       'C7_Restrictions on internal movement', 'C7_Flag',
       'C8_International travel controls', 'E1_Income support', 'E1_Flag',
       'E2_Debt/contract relief', 'E3_Fiscal measures',
       'E4_International support', 'H1_Public information campaigns',
       'H1_Flag', 'H2_Testing policy', 'H3_Contact tracing',
       'H4_Emergency investment in healthcare', 'H5_Investment in vaccines',
       'H6_Facial Coverings', 'H6_Flag', 'M1_Wildcard', 'ConfirmedCases',
       'ConfirmedDeaths', 'StringencyIndex', 'StringencyIndexForDisplay',
       'StringencyLegacyIndex', 'StringencyLegacyIndexForDisplay',
       'GovernmentResponseIndex', 'Gove

In [6]:
# For testing, restrict training data to that before a hypothetical predictor submission date
HYPOTHETICAL_SUBMISSION_DATE = np.datetime64("2020-07-31")
df = df[df.Date <= HYPOTHETICAL_SUBMISSION_DATE]

In [7]:
# Add GeoID column that combines CountryName and RegionName for easier manipulation of data
df['GeoID'] = df['CountryName'] + '__' + df['RegionName'].astype(str)

In [8]:
# Add new cases column
df['NewCases'] = df.groupby('GeoID').ConfirmedCases.diff().fillna(0)
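
The effect of the per-region `diff()` can be checked on a toy frame (the numbers below are made up). The difference is taken within each `GeoID`, so the first row of every region has no predecessor and becomes 0 after `fillna(0)`.

```python
import pandas as pd

toy_cases_df = pd.DataFrame({
    "GeoID": ["A__nan", "A__nan", "A__nan", "B__nan", "B__nan"],
    "ConfirmedCases": [10.0, 15.0, 30.0, 100.0, 101.0],
})
# Differences are taken inside each group, never across the A/B boundary
toy_cases_df["NewCases"] = (toy_cases_df.groupby("GeoID")
                                        .ConfirmedCases.diff()
                                        .fillna(0))
print(toy_cases_df)
```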

In [9]:
# Keep only columns of interest
id_cols = ['CountryName',
           'RegionName',
           'GeoID',
           'Date']
cases_col = ['NewCases']
npi_cols = ['C1_School closing',
            'C2_Workplace closing',
            'C3_Cancel public events',
            'C4_Restrictions on gatherings',
            'C5_Close public transport',
            'C6_Stay at home requirements',
            'C7_Restrictions on internal movement',
            'C8_International travel controls',
            'H1_Public information campaigns',
            'H2_Testing policy',
            'H3_Contact tracing',
            'H6_Facial Coverings']
df = df[id_cols + cases_col + npi_cols]

In [10]:
# Fill any missing case values by interpolation and setting NaNs to 0
df.update(df.groupby('GeoID').NewCases.apply(
    lambda group: group.interpolate()).fillna(0))

In [11]:
# Fill any missing NPIs by assuming they are the same as previous day
for npi_col in npi_cols:
    df.update(df.groupby('GeoID')[npi_col].ffill().fillna(0))
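
The two gap-filling rules can be seen on a small made-up frame. This sketch uses `transform` and direct assignment instead of the `apply` and `df.update` calls above, which is a stylistic variation rather than the notebook's exact code. Gaps inside a region are interpolated or carried forward, and values missing at the start of a region fall back to 0.

```python
import numpy as np
import pandas as pd

toy_fill_df = pd.DataFrame({
    "GeoID": ["A", "A", "A", "A", "B", "B"],
    "NewCases": [2.0, np.nan, 6.0, 8.0, np.nan, 4.0],
    "C1_School closing": [np.nan, 1.0, np.nan, 2.0, np.nan, 3.0],
})

# Linear interpolation inside each region; leading gaps stay NaN and become 0
toy_fill_df["NewCases"] = (toy_fill_df.groupby("GeoID")["NewCases"]
                                      .transform(lambda s: s.interpolate())
                                      .fillna(0))

# Carry the last known NPI value forward inside each region, then default to 0
toy_fill_df["C1_School closing"] = (toy_fill_df.groupby("GeoID")["C1_School closing"]
                                               .ffill()
                                               .fillna(0))
print(toy_fill_df)
```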

In [12]:
df

Unnamed: 0,CountryName,RegionName,GeoID,Date,NewCases,C1_School closing,C2_Workplace closing,C3_Cancel public events,C4_Restrictions on gatherings,C5_Close public transport,C6_Stay at home requirements,C7_Restrictions on internal movement,C8_International travel controls,H1_Public information campaigns,H2_Testing policy,H3_Contact tracing,H6_Facial Coverings
0,Aruba,,Aruba__nan,2020-01-01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Aruba,,Aruba__nan,2020-01-02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Aruba,,Aruba__nan,2020-01-03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Aruba,,Aruba__nan,2020-01-04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Aruba,,Aruba__nan,2020-01-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
87064,Zimbabwe,,Zimbabwe__nan,2020-07-27,78.0,3.0,1.0,2.0,3.0,1.0,2.0,2.0,4.0,2.0,1.0,1.0,4.0
87065,Zimbabwe,,Zimbabwe__nan,2020-07-28,192.0,3.0,1.0,2.0,3.0,1.0,2.0,2.0,4.0,2.0,1.0,1.0,4.0
87066,Zimbabwe,,Zimbabwe__nan,2020-07-29,113.0,3.0,1.0,2.0,3.0,1.0,2.0,2.0,4.0,2.0,1.0,1.0,4.0
87067,Zimbabwe,,Zimbabwe__nan,2020-07-30,62.0,3.0,1.0,2.0,3.0,1.0,2.0,2.0,4.0,2.0,1.0,1.0,4.0


In [13]:
df['CountryName'].unique()

array(['Aruba', 'Afghanistan', 'Angola', 'Albania', 'Andorra',
       'United Arab Emirates', 'Argentina', 'Australia', 'Austria',
       'Azerbaijan', 'Burundi', 'Belgium', 'Benin', 'Burkina Faso',
       'Bangladesh', 'Bulgaria', 'Bahrain', 'Bahamas',
       'Bosnia and Herzegovina', 'Belarus', 'Belize', 'Bermuda',
       'Bolivia', 'Brazil', 'Barbados', 'Brunei', 'Bhutan', 'Botswana',
       'Central African Republic', 'Canada', 'Switzerland', 'Chile',
       'China', "Cote d'Ivoire", 'Cameroon',
       'Democratic Republic of Congo', 'Congo', 'Colombia', 'Comoros',
       'Cape Verde', 'Costa Rica', 'Cuba', 'Cyprus', 'Czech Republic',
       'Germany', 'Djibouti', 'Dominica', 'Denmark', 'Dominican Republic',
       'Algeria', 'Ecuador', 'Egypt', 'Eritrea', 'Spain', 'Estonia',
       'Ethiopia', 'Finland', 'Fiji', 'France', 'Faeroe Islands', 'Gabon',
       'United Kingdom', 'Georgia', 'Ghana', 'Guinea', 'Gambia', 'Greece',
       'Greenland', 'Guatemala', 'Guam', 'Guyana', 'Hong Ko

In [14]:
# Set number of past days to use to make predictions
nb_lookback_days = 30

# Create training data across all countries for predicting one day ahead
X_cols = cases_col + npi_cols
y_col = cases_col
X_samples = []
y_samples = []
geo_ids = df.GeoID.unique()
for g in geo_ids:
    gdf = df[df.GeoID == g]
    all_case_data = np.array(gdf[cases_col])
    all_npi_data = np.array(gdf[npi_cols])

    # Create one sample for each day where we have enough data
    # Each sample consists of cases and npis for previous nb_lookback_days
    nb_total_days = len(gdf)
    for d in range(nb_lookback_days, nb_total_days - 1):
        X_cases = all_case_data[d-nb_lookback_days:d]

        # Take negative of npis to support positive
        # weight constraint in Lasso.
        X_npis = -all_npi_data[d - nb_lookback_days:d]

        # Flatten all input data so it fits Lasso input format.
        X_sample = np.concatenate([X_cases.flatten(),
                                   X_npis.flatten()])
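        # The input window covers indices d - nb_lookback_days .. d - 1;
        # the target is the case count at index d + 1.
        y_sample = all_case_data[d + 1]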
        X_samples.append(X_sample)
        y_samples.append(y_sample)

X_samples = np.array(X_samples)
y_samples = np.array(y_samples).flatten()
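
To make the sample layout concrete, the sketch below wraps the same indexing in a helper and runs it on a tiny made-up series. The function name `make_samples` and the toy arrays are illustrative only. Each sample has `lookback * (1 + number of NPI columns)` features, which is 30 * (1 + 12) = 390 for the settings used in this notebook.

```python
import numpy as np

def make_samples(cases, npis, lookback):
    """Build flattened (X, y) samples with the same indexing as the loop above."""
    X, y = [], []
    for d in range(lookback, len(cases) - 1):
        window_cases = cases[d - lookback:d]
        window_npis = -npis[d - lookback:d]   # negated, as in the notebook
        X.append(np.concatenate([window_cases.flatten(),
                                 window_npis.flatten()]))
        y.append(cases[d + 1])
    return np.array(X), np.array(y).flatten()

# Toy series: 6 days, 1 case column, 2 NPI columns, lookback of 3 days
toy_case_series = np.arange(6, dtype=float).reshape(-1, 1)   # 0, 1, ..., 5
toy_npi_series = np.ones((6, 2))
X_toy, y_toy = make_samples(toy_case_series, toy_npi_series, lookback=3)
print(X_toy.shape, y_toy)
```

The case values come first in each row, followed by the negated NPI values ordered day by day.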

In [15]:
# Helper function to compute the mean absolute error (MAE)
def mae(pred, true):
    return np.mean(np.abs(pred - true))

In [16]:
# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X_samples,
                                                    y_samples,
                                                    test_size=0.2,
                                                    random_state=301)

In [17]:
# Create and train Lasso model.
# Set positive=True to enforce assumption that cases are positively correlated
# with future cases and npis are negatively correlated.
model = Lasso(alpha=0.1,
              precompute=True,
              max_iter=10000,
              positive=True,
              selection='random')
# Fit model
model.fit(X_train, y_train)

Lasso(alpha=0.1, max_iter=10000, positive=True, precompute=True,
      selection='random')

In [18]:
# Evaluate model
train_preds = model.predict(X_train)
train_preds = np.maximum(train_preds, 0) # Don't predict negative cases
print('Train MAE:', mae(train_preds, y_train))

test_preds = model.predict(X_test)
test_preds = np.maximum(test_preds, 0) # Don't predict negative cases
print('Test MAE:', mae(test_preds, y_test))

Train MAE: 140.71122778128432
Test MAE: 152.4956978332136
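
As a small worked example of the metric and of the clipping step, the made-up numbers below show how replacing a negative prediction with 0 changes the MAE. The arrays are illustrative and unrelated to the results above.

```python
import numpy as np

def mae(pred, true):
    return np.mean(np.abs(pred - true))

raw_preds = np.array([-5.0, 10.0, 20.0])
true_vals = np.array([0.0, 12.0, 20.0])

# Clip negative predictions to zero, as done before computing the MAE above
clipped_preds = np.maximum(raw_preds, 0)

raw_mae = mae(raw_preds, true_vals)          # (5 + 2 + 0) / 3
clipped_mae = mae(clipped_preds, true_vals)  # (0 + 2 + 0) / 3
print(raw_mae, clipped_mae)
```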


In [19]:
# Inspect the learned feature coefficients for the model
# to see what features it's paying attention to.

# Give names to the features
x_col_names = []
for d in range(-nb_lookback_days, 0):
    x_col_names.append('Day ' + str(d) + ' ' + cases_col[0])
for d in range(-nb_lookback_days, 0):
    for col_name in npi_cols:
        x_col_names.append('Day ' + str(d) + ' ' + col_name)

# View non-zero coefficients
for (col, coeff) in zip(x_col_names, list(model.coef_)):
    if coeff != 0.:
        print(col, coeff)
print('Intercept', model.intercept_)

Day -7 NewCases 0.0010962418796115084
Day -6 NewCases 0.43961700664093006
Day -5 NewCases 0.2170266217574839
Day -4 NewCases 0.05905864766692127
Day -3 NewCases 0.06940880574121146
Day -2 NewCases 0.05199045901061941
Day -1 NewCases 0.23836656730788633
Day -26 C6_Stay at home requirements 4.313198963092626
Day -22 C2_Workplace closing 9.718141322985813
Day -17 C2_Workplace closing 5.764077234672192
Intercept 26.556057670476775
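
Because the NPI inputs were negated before training, a positive NPI coefficient means that a higher NPI level lowers the prediction. For example, the coefficient of about 9.7 on `Day -22 C2_Workplace closing` lowers the predicted value by about 9.7 cases per unit increase in that NPI. The helper below is a hypothetical sketch (not part of the notebook) that maps a flat coefficient index back to its day offset and column, assuming the layout built during training: all case values first, then NPI values ordered day by day.

```python
def describe_feature(index, lookback, npi_names, case_name="NewCases"):
    """Map a flat feature index to (day offset, column name)."""
    if index < lookback:
        # The first `lookback` features are the case values
        return index - lookback, case_name
    # The remaining features are NPIs, grouped by day and then by column
    day_idx, npi_idx = divmod(index - lookback, len(npi_names))
    return day_idx - lookback, npi_names[npi_idx]

# Toy layout: lookback of 3 days and two placeholder NPI names
toy_npi_names = ["npi_a", "npi_b"]
first_case = describe_feature(0, 3, toy_npi_names)
first_npi = describe_feature(3, 3, toy_npi_names)
last_npi = describe_feature(8, 3, toy_npi_names)
print(first_case, first_npi, last_npi)
```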


In [20]:
# Save model to file
if not os.path.exists('models'):
    os.mkdir('models')
with open('models/model.pkl', 'wb') as model_file:
    pickle.dump(model, model_file)
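
A saved model can be read back with `pickle.load`. The sketch below demonstrates the round trip on a tiny stand-in model fitted to made-up data and written to a temporary directory, so it does not touch `models/model.pkl`. Pickled scikit-learn models are generally only safe to load with a compatible scikit-learn version and from a trusted source.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import Lasso

# Fit a tiny stand-in model (synthetic data, not the notebook's model)
X_demo = np.array([[0.0], [1.0], [2.0], [3.0]])
y_demo = np.array([1.0, 3.0, 5.0, 7.0])
demo_model = Lasso(alpha=0.01, positive=True).fit(X_demo, y_demo)

# Save and reload through pickle in a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "model.pkl")
    with open(demo_path, "wb") as model_file:
        pickle.dump(demo_model, model_file)
    with open(demo_path, "rb") as model_file:
        restored_model = pickle.load(model_file)

# The restored model should give the same predictions as the original
print(restored_model.predict(X_demo))
```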

## Evaluation

Now that the predictor has been trained and saved, this section contains the functionality for evaluating it on sample evaluation data.
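
The `predict` module used below is not reproduced in this notebook. As a rough illustration of what a day-by-day rollout involves, here is a hypothetical sketch: the function `rollout`, its arguments, and the way the planned NPIs enter the window are assumptions, not the actual `predict_df` implementation. Each prediction is appended to the case history and becomes an input for the following day, so errors can compound over a long horizon.

```python
import numpy as np

def rollout(predict_one, past_cases, past_npis, future_npis, lookback):
    """Roll a one-step predictor forward, feeding predictions back as inputs."""
    cases = [float(c) for c in past_cases[-lookback:]]
    npis = [np.asarray(row, dtype=float) for row in past_npis[-lookback:]]
    preds = []
    for day_npis in future_npis:
        # Same layout as in training: case window, then negated NPI window
        x = np.concatenate([np.array(cases[-lookback:]),
                            -np.array(npis[-lookback:]).flatten()])
        pred = max(float(predict_one(x)), 0.0)   # no negative case counts
        preds.append(pred)
        cases.append(pred)                       # prediction becomes an input
        npis.append(np.asarray(day_npis, dtype=float))
    return preds

# Toy predictor (hypothetical): the mean of the case part of the window
toy_lookback = 2
toy_predict = lambda x: x[:toy_lookback].mean()
toy_preds = rollout(toy_predict,
                    past_cases=[2.0, 4.0],
                    past_npis=[[0.0], [0.0]],
                    future_npis=[[1.0], [1.0], [1.0]],
                    lookback=toy_lookback)
print(toy_preds)
```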

In [21]:
# Reload the module to get the latest changes
import predict
from importlib import reload
reload(predict)
from predict import predict_df

In [22]:
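list_countries = sorted(set(df.CountryName))
# Note: error_bad_lines was removed in pandas 2.0; on newer versions,
# pass on_bad_lines="error" instead (raising on bad lines is also the default).
hist_ips_df = pd.read_csv("data/2020-09-30_historical_ip.csv",
                          parse_dates=['Date'],
                          encoding="ISO-8859-1",
                          dtype={"RegionName": str},
                          error_bad_lines=True)
# Keep only the countries present in the training data
hist_ips_df = hist_ips_df[hist_ips_df.CountryName.isin(list_countries)]
hist_ips_df.to_csv("data/2020-09-30_historical_ip_new.csv", index=False)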

In [23]:
%%time
preds_df = predict_df("2020-08-01", "2020-08-31", path_to_ips_file="data/2020-09-30_historical_ip_new.csv", verbose=True)


Predicting for Aruba__nan
2020-08-01: 58.830065523865606
2020-08-02: 71.32111674610171
2020-08-03: 78.98680370746847
2020-08-04: 90.23853933752429
2020-08-05: 87.81164418044506
2020-08-06: 98.75744991922349
2020-08-07: 131.03999756627803
2020-08-08: 147.01964816756603
2020-08-09: 158.9490389985464
2020-08-10: 169.93869080059733
2020-08-11: 177.5151244345863
2020-08-12: 193.47966542702378
2020-08-13: 216.81825862225244
2020-08-14: 234.03561900304697
2020-08-15: 248.5555075723213
2020-08-16: 261.9630532440436
2020-08-17: 275.29474851917416
2020-08-18: 293.2859709226252
2020-08-19: 314.0698935586695
2020-08-20: 332.4224578676101
2020-08-21: 349.22562114030654
2020-08-22: 365.4936514381879
2020-08-23: 382.52642218648214
2020-08-24: 407.88101662136444
2020-08-25: 430.07144610540195
2020-08-26: 450.5597091090083
2020-08-27: 465.98735305673017
2020-08-28: 484.63428886238376
2020-08-29: 515.340280541247
2020-08-30: 541.890745465995
2020-08-31: 566.2508880449996

Predicting for Afghanistan__na

2020-08-13: 2961.314971091926
2020-08-14: 3256.6879160647386
2020-08-15: 3377.030789289596
2020-08-16: 3363.626971879992
2020-08-17: 3234.912574775216
2020-08-18: 3208.247977984357
2020-08-19: 3439.4687983543645
2020-08-20: 3639.875457637493
2020-08-21: 3740.53411429221
2020-08-22: 3755.725920966235
2020-08-23: 3749.195628875257
2020-08-24: 3805.5689591826185
2020-08-25: 3970.779127147044
2020-08-26: 4123.735788686436
2020-08-27: 4218.629130535852
2020-08-28: 4269.368866421063
2020-08-29: 4316.151039665351
2020-08-30: 4406.190695562
2020-08-31: 4545.098088675729

Predicting for Burundi__nan
2020-08-01: 56.712772397170525
2020-08-02: 66.54421141424086
2020-08-03: 74.425969756827
2020-08-04: 80.18851945229098
2020-08-05: 84.26188622310703
2020-08-06: 96.32951633019265
2020-08-07: 127.34227556899472
2020-08-08: 142.0799281694358
2020-08-09: 153.00978608272388
2020-08-10: 162.67255680505636
2020-08-11: 172.81461655994954
2020-08-12: 189.40372582052026
2020-08-13: 212.04691757742484
2020-08

2020-08-27: 554.5145030786359
2020-08-28: 577.8112997416273
2020-08-29: 616.4813772884498
2020-08-30: 649.1362880269409
2020-08-31: 678.6932820305442

Predicting for Bermuda__nan
2020-08-01: 51.59740305214285
2020-08-02: 64.06112687877973
2020-08-03: 69.05908722512672
2020-08-04: 74.03902947583147
2020-08-05: 79.3971962594471
2020-08-06: 93.21431677492409
2020-08-07: 122.81531029308502
2020-08-08: 137.87608720093152
2020-08-09: 147.57214971762377
2020-08-10: 156.89458473657476
2020-08-11: 167.77407294895897
2020-08-12: 184.91883675275866
2020-08-13: 207.08773685870185
2020-08-14: 223.32685550984843
2020-08-15: 236.48509602545883
2020-08-16: 249.48719210796642
2020-08-17: 264.22086103009576
2020-08-18: 282.64151599669924
2020-08-19: 302.7669037249425
2020-08-20: 320.33133789199405
2020-08-21: 336.3373354394138
2020-08-22: 352.4785567909282
2020-08-23: 370.0551454467119
2020-08-24: 389.71422473195145
2020-08-25: 410.05934837094605
2020-08-26: 429.3216927826159
2020-08-27: 439.30594719991

2020-08-10: 5118.0480347007
2020-08-11: 4592.645580483714
2020-08-12: 4247.64482409141
2020-08-13: 4965.4400107423135
2020-08-14: 5396.20644699035
2020-08-15: 5550.773234681447
2020-08-16: 5502.2164287062415
2020-08-17: 5265.060510159993
2020-08-18: 5251.475330353934
2020-08-19: 5650.330019641412
2020-08-20: 5949.073170848825
2020-08-21: 6083.955442916283
2020-08-22: 6085.874263016426
2020-08-23: 6040.094251348739
2020-08-24: 6136.616800841156
2020-08-25: 6405.507202231428
2020-08-26: 6632.598577773741
2020-08-27: 6760.432385656742
2020-08-28: 6818.1300576255735
2020-08-29: 6870.996231267976
2020-08-30: 7009.621133282096
2020-08-31: 7224.5671097764425

Predicting for Switzerland__nan
2020-08-01: 4103.965877134566
2020-08-02: 4296.815252510073
2020-08-03: 3457.951424156261
2020-08-04: 1388.5662514176151
2020-08-05: 1102.0444167047294
2020-08-06: 1769.9936424350624
2020-08-07: 3567.164930775646
2020-08-08: 3795.3982619330527
2020-08-09: 3154.9981150033364
2020-08-10: 2205.562398008631
20

2020-08-01: 51.1724909015977
2020-08-02: 63.53833674843654
2020-08-03: 69.33977339709773
2020-08-04: 75.80663471606206
2020-08-05: 79.77613124776458
2020-08-06: 93.29293038040773
2020-08-07: 122.69275648324282
2020-08-08: 137.8122785604716
2020-08-09: 148.08484281082121
2020-08-10: 157.86922869337423
2020-08-11: 168.20696912211844
2020-08-12: 185.11289172053117
2020-08-13: 207.18678864661547
2020-08-14: 223.5312454496805
2020-08-15: 237.01484322647005
2020-08-16: 250.16541023537832
2020-08-17: 264.6635957428703
2020-08-18: 277.1743541993832
2020-08-19: 295.88913198406766
2020-08-20: 312.9193050221722
2020-08-21: 328.47587833388025
2020-08-22: 344.0497770889854
2020-08-23: 350.2432721387193
2020-08-24: 364.19227130352465
2020-08-25: 381.7760152272068
2020-08-26: 398.92572343890345
2020-08-27: 406.87294633867117
2020-08-28: 419.91095330305745
2020-08-29: 431.42791361704064
2020-08-30: 446.37061707359396
2020-08-31: 463.3728823694188

Predicting for Cape Verde__nan
2020-08-01: 85.23293924

2020-08-01: 922.5386762993446
2020-08-02: 1096.5110549147998
2020-08-03: 1131.4244095656372
2020-08-04: 1126.7026966905494
2020-08-05: 988.0679858771205
2020-08-06: 689.4622792164994
2020-08-07: 1054.9388045809608
2020-08-08: 1201.6998831940543
2020-08-09: 1241.2851488781441
2020-08-10: 1233.9582738032104
2020-08-11: 1140.2841679355465
2020-08-12: 1076.883553652595
2020-08-13: 1250.9235228033633
2020-08-14: 1355.6884558625686
2020-08-15: 1395.7496189801293
2020-08-16: 1395.5737911369495
2020-08-17: 1360.2164688379592
2020-08-18: 1376.3079027221481
2020-08-19: 1479.8372806474079
2020-08-20: 1557.8289055135447
2020-08-21: 1598.5190298561454
2020-08-22: 1612.7023439016536
2020-08-23: 1627.3928185356801
2020-08-24: 1668.5662053469011
2020-08-25: 1744.9890535479644
2020-08-26: 1810.434402595495
2020-08-27: 1854.7847650048525
2020-08-28: 1885.9630619234008
2020-08-29: 1920.1660866911932
2020-08-30: 1971.5857557113684
2020-08-31: 2038.2493402162388

Predicting for Ecuador__nan
2020-08-01: 917

2020-08-01: 20444.847647234114
2020-08-02: 21188.289645807003
2020-08-03: 20904.37107044325
2020-08-04: 18295.504375936118
2020-08-05: 13980.270887933559
2020-08-06: 11488.084161343833
2020-08-07: 19606.65920266463
2020-08-08: 21246.31914799412
2020-08-09: 20941.22217777807
2020-08-10: 19286.325397210163
2020-08-11: 17028.48925846105
2020-08-12: 17141.526154252624
2020-08-13: 20840.382887384476
2020-08-14: 22137.001988366927
2020-08-15: 22021.39388835236
2020-08-16: 21106.7918605536
2020-08-17: 20221.351740089867
2020-08-18: 20881.262377659856
2020-08-19: 22829.824584238468
2020-08-20: 23762.113839518726
2020-08-21: 23831.262918656546
2020-08-22: 23476.07128082054
2020-08-23: 23327.750110336954
2020-08-24: 24045.815179009332
2020-08-25: 25248.373116269577
2020-08-26: 25968.07682144894
2020-08-27: 26188.94073993408
2020-08-28: 26216.618979103932
2020-08-29: 26445.919933762507
2020-08-30: 27136.349354701768
2020-08-31: 28019.22218022095

Predicting for Gabon__nan
2020-08-01: 58.495405765

2020-08-01: 83.11068270650891
2020-08-02: 90.9308699787226
2020-08-03: 115.68160373521808
2020-08-04: 94.90537780575593
2020-08-05: 102.83036676171426
2020-08-06: 111.57767149500877
2020-08-07: 152.2978157496935
2020-08-08: 170.6825262202825
2020-08-09: 184.6374868190424
2020-08-10: 184.87613709970142
2020-08-11: 194.6990677543058
2020-08-12: 211.7988138133358
2020-08-13: 239.1270744834391
2020-08-14: 258.3816025329205
2020-08-15: 272.36583461840576
2020-08-16: 281.85901977215633
2020-08-17: 295.82901739975307
2020-08-18: 320.98347288065503
2020-08-19: 345.40201994137203
2020-08-20: 365.5901994148926
2020-08-21: 382.471995581296
2020-08-22: 397.94665154653353
2020-08-23: 426.6855314807125
2020-08-24: 453.0776016902679
2020-08-25: 478.07758619455905
2020-08-26: 500.88316097182087
2020-08-27: 517.6369988443655
2020-08-28: 539.1686555030515
2020-08-29: 570.9235404351439
2020-08-30: 599.1815841978296
2020-08-31: 626.0210210117648

Predicting for Gambia__nan
2020-08-01: 52.12225907488548
20

2020-08-01: 2763.3415655597382
2020-08-02: 3186.4506334629896
2020-08-03: 3228.533110918135
2020-08-04: 3469.622611194389
2020-08-05: 2888.0965033359607
2020-08-06: 1935.098741525779
2020-08-07: 2999.932706498743
2020-08-08: 3376.250370192186
2020-08-09: 3492.1093869156384
2020-08-10: 3536.743856238136
2020-08-11: 3180.203839310072
2020-08-12: 2939.3094357055306
2020-08-13: 3422.0374420095577
2020-08-14: 3694.2159352686185
2020-08-15: 3807.4474723222993
2020-08-16: 3810.2378525336944
2020-08-17: 3655.218169451232
2020-08-18: 3635.054112846474
2020-08-19: 3900.0904513475125
2020-08-20: 4096.380719120812
2020-08-21: 4197.076771037231
2020-08-22: 4227.725142306657
2020-08-23: 4187.302375594317
2020-08-24: 4246.3277327262485
2020-08-25: 4425.462636591659
2020-08-26: 4578.672327376612
2020-08-27: 4688.159542669008
2020-08-28: 4742.953689968712
2020-08-29: 4777.993776367825
2020-08-30: 4870.623949068017
2020-08-31: 5016.861036639263

Predicting for Haiti__nan
2020-08-01: 58.96861155924829
20

2020-08-01: 608.7600028595571
2020-08-02: 851.4751418682629
2020-08-03: 636.6298254459766
2020-08-04: 792.951865813998
2020-08-05: 810.795150192218
2020-08-06: 512.8516948692738
2020-08-07: 760.1153465620863
2020-08-08: 874.7770646621924
2020-08-09: 835.0803003195456
2020-08-10: 903.5041441943106
2020-08-11: 883.6657542359278
2020-08-12: 809.2121389648846
2020-08-13: 926.0975929768308
2020-08-14: 994.8154079152956
2020-08-15: 1008.4571209264384
2020-08-16: 1044.7285793273866
2020-08-17: 1040.9517384659498
2020-08-18: 1039.5569039389045
2020-08-19: 1108.5680915352345
2020-08-20: 1160.1238003352594
2020-08-21: 1189.625344543588
2020-08-22: 1219.186367977663
2020-08-23: 1233.4973366672978
2020-08-24: 1257.897893314958
2020-08-25: 1309.7782446655926
2020-08-26: 1355.2955631894968
2020-08-27: 1382.1964826020755
2020-08-28: 1412.1509764618747
2020-08-29: 1438.5323252248934
2020-08-30: 1472.9354619691537
2020-08-31: 1518.8881391249413

Predicting for Italy__nan
2020-08-01: 29092.646193090593


2020-08-01: 340.36805312435337
2020-08-02: 413.82543905854595
2020-08-03: 424.716493905274
2020-08-04: 401.35337754562363
2020-08-05: 336.73788156170974
2020-08-06: 279.88331908342593
2020-08-07: 427.270019294069
2020-08-08: 488.61139849905214
2020-08-09: 502.9315253728398
2020-08-10: 492.1241457344961
2020-08-11: 462.483992609616
2020-08-12: 466.3953430882476
2020-08-13: 543.9263590386186
2020-08-14: 590.151090961835
2020-08-15: 607.7384967541301
2020-08-16: 608.778213761351
2020-08-17: 605.5344095136306
2020-08-18: 633.0433165027827
2020-08-19: 684.6629656293153
2020-08-20: 722.4569268920984
2020-08-21: 743.8753010901844
2020-08-22: 755.9255133149331
2020-08-23: 779.8466672729277
2020-08-24: 813.1865161275491
2020-08-25: 855.4039520672001
2020-08-26: 890.8923525006203
2020-08-27: 913.0326654226353
2020-08-28: 935.5669874163308
2020-08-29: 964.8110044068762
2020-08-30: 1000.4312951386178
2020-08-31: 1039.6120277634325

Predicting for Kuwait__nan
2020-08-01: 430.2505650067634
2020-08-0

2020-08-01: 536.5484484642369
2020-08-02: 786.8558615265351
2020-08-03: 708.0007516655639
2020-08-04: 650.3964769559132
2020-08-05: 580.3995455550009
2020-08-06: 435.5103138913556
2020-08-07: 678.2523134199369
2020-08-08: 813.8345233995434
2020-08-09: 797.689233687667
2020-08-10: 768.5797272915682
2020-08-11: 722.2683964995852
2020-08-12: 705.5115558741979
2020-08-13: 832.1151894763102
2020-08-14: 912.8548765138626
2020-08-15: 921.5178918459201
2020-08-16: 912.7128042466634
2020-08-17: 900.1176043391757
2020-08-18: 916.3221366359714
2020-08-19: 992.5918029084643
2020-08-20: 1047.7336208594575
2020-08-21: 1067.209744117426
2020-08-22: 1074.3749902229556
2020-08-23: 1073.6791859988086
2020-08-24: 1102.1567982326192
2020-08-25: 1156.0707473399887
2020-08-26: 1199.3291974415283
2020-08-27: 1215.9302154674026
2020-08-28: 1230.5806284591158
2020-08-29: 1247.0048613966846
2020-08-30: 1279.6078156343958
2020-08-31: 1323.3514172352668

Predicting for Latvia__nan
2020-08-01: 411.26304188266295
2

2020-08-22: 2120.999355510814
2020-08-23: 2132.0084230065268
2020-08-24: 2187.420089588213
2020-08-25: 2289.381335810369
2020-08-26: 2370.164963533198
2020-08-27: 2419.4414335236734
2020-08-28: 2451.6226003325955
2020-08-29: 2490.358393964553
2020-08-30: 2555.9563029128103
2020-08-31: 2641.167241282539

Predicting for Mongolia__nan
2020-08-01: 80.7257857060998
2020-08-02: 89.79759928231839
2020-08-03: 116.75910716344231
2020-08-04: 108.98122826660321
2020-08-05: 107.81344634148853
2020-08-06: 112.99193861513092
2020-08-07: 152.6402350173749
2020-08-08: 171.74787365168567
2020-08-09: 188.82896986400613
2020-08-10: 193.3085514200634
2020-08-11: 199.5341685300318
2020-08-12: 214.44510172529934
2020-08-13: 241.2253597520163
2020-08-14: 261.23133449822046
2020-08-15: 277.2973099570263
2020-08-16: 288.24557636207226
2020-08-17: 300.63861844152615
2020-08-18: 324.5965874834425
2020-08-19: 348.7916591243833
2020-08-20: 369.6223859153881
2020-08-21: 387.70133065968605
2020-08-22: 403.7082983420

2020-08-11: 332.35103907797844
2020-08-12: 338.91110695462856
2020-08-13: 387.31764755118854
2020-08-14: 414.38299849393843
2020-08-15: 435.43261493058185
2020-08-16: 448.8694505589667
2020-08-17: 451.9773820672459
2020-08-18: 475.6113985935101
2020-08-19: 510.74376894665284
2020-08-20: 537.8759549787587
2020-08-21: 560.1934587650039
2020-08-22: 577.3627500678908
2020-08-23: 602.8020969334157
2020-08-24: 630.9280298531935
2020-08-25: 662.8237596187474
2020-08-26: 691.4784016434231
2020-08-27: 712.6754767441616
2020-08-28: 736.186236179386
2020-08-29: 764.0715314345587
2020-08-30: 794.4191053890407
2020-08-31: 820.4939270344114

Predicting for Nicaragua__nan
2020-08-01: 55.10683772175757
2020-08-02: 67.5800553479311
2020-08-03: 83.5282335463402
2020-08-04: 106.04891877501605
2020-08-05: 88.30120781882141
2020-08-06: 98.97469435664567
2020-08-07: 130.03412009675955
2020-08-08: 147.09579030630812
2020-08-09: 164.38254627230046
2020-08-10: 178.24256788175606
2020-08-11: 180.00256114650156


2020-08-07: 1869.2691963597397
2020-08-08: 2134.0944875465684
2020-08-09: 2193.136610781282
2020-08-10: 2108.0249334760992
2020-08-11: 1876.540771668331
2020-08-12: 1808.1265607141797
2020-08-13: 2141.2647715970324
2020-08-14: 2326.0239243754413
2020-08-15: 2376.73938192848
2020-08-16: 2329.9261972117556
2020-08-17: 2237.197440980817
2020-08-18: 2274.8260162116057
2020-08-19: 2465.1962183259857
2020-08-20: 2595.924591385702
2020-08-21: 2646.4568296652697
2020-08-22: 2640.08543067512
2020-08-23: 2638.5786904448833
2020-08-24: 2706.8722155346295
2020-08-25: 2837.7176726252073
2020-08-26: 2940.6225483479384
2020-08-27: 2997.5809097327883
2020-08-28: 3026.550518150033
2020-08-29: 3065.439395709519
2020-08-30: 3144.6644726838686
2020-08-31: 3250.875414582929

Predicting for Philippines__nan
2020-08-01: 1270.2340807188546
2020-08-02: 1538.1171770467834
2020-08-03: 1707.9565901187643
2020-08-04: 1839.1025154392717
2020-08-05: 1625.8636849650914
2020-08-06: 1021.0481741178453
2020-08-07: 1499.

2020-08-03: 793.0198558151649
2020-08-04: 784.6704285498201
2020-08-05: 663.8617321718526
2020-08-06: 510.651204019042
2020-08-07: 805.5910913190628
2020-08-08: 903.4747709966575
2020-08-09: 902.3957159999336
2020-08-10: 888.7106721056266
2020-08-11: 823.2357809870575
2020-08-12: 809.1466641018415
2020-08-13: 952.1066288706136
2020-08-14: 1023.21872315151
2020-08-15: 1039.4201702615878
2020-08-16: 1035.8426448836067
2020-08-17: 1017.3544396361652
2020-08-18: 1048.6103833713157
2020-08-19: 1134.0734731919554
2020-08-20: 1189.5104494063917
2020-08-21: 1214.6695552147123
2020-08-22: 1225.7592504900538
2020-08-23: 1246.9757072872696
2020-08-24: 1289.897998494194
2020-08-25: 1353.1244677763011
2020-08-26: 1402.479566676921
2020-08-27: 1435.2914477818256
2020-08-28: 1462.109438742935
2020-08-29: 1496.0221562704025
2020-08-30: 1543.306827454455
2020-08-31: 1598.6941347722557

Predicting for Romania__nan
2020-08-01: 8291.13833250034
2020-08-02: 9471.293907236486
2020-08-03: 9273.202435060484
2

2020-08-01: 53.398425559185405
2020-08-02: 64.1869598934233
2020-08-03: 70.92230184913753
2020-08-04: 77.25350476915685
2020-08-05: 80.3819664085812
2020-08-06: 94.14379996092023
2020-08-07: 124.24028772628904
2020-08-08: 138.98393258641246
2020-08-09: 149.54984725405896
2020-08-10: 159.20630098040164
2020-08-11: 169.2271498260727
2020-08-12: 186.30704056366034
2020-08-13: 208.63933316889316
2020-08-14: 224.9240635226426
2020-08-15: 238.50100344944588
2020-08-16: 251.574230680967
2020-08-17: 265.96825337405727
2020-08-18: 278.58532407545783
2020-08-19: 297.4209958886835
2020-08-20: 314.46800007736056
2020-08-21: 330.0602825310211
2020-08-22: 345.60173431893713
2020-08-23: 351.7748557586466
2020-08-24: 365.7936438629698
2020-08-25: 383.4497371542869
2020-08-26: 400.6322706370233
2020-08-27: 408.6033951313236
2020-08-28: 421.63930265276406
2020-08-29: 433.1697103058848
2020-08-30: 448.16546741388106
2020-08-31: 465.2213504115127

Predicting for El Salvador__nan
2020-08-01: 236.5914341854

2020-08-01: 3459.3038331419475
2020-08-02: 4415.533845489433
2020-08-03: 4032.9974460598714
2020-08-04: 1488.5120485457974
2020-08-05: 1125.932307151601
2020-08-06: 1687.899736347839
2020-08-07: 3332.0976864903114
2020-08-08: 3918.94298596897
2020-08-09: 3442.559593210425
2020-08-10: 2309.1190654562197
2020-08-11: 2111.789985838078
2020-08-12: 2610.9060151866133
2020-08-13: 3463.60853424563
2020-08-14: 3768.5794717572207
2020-08-15: 3453.8261161163528
2020-08-16: 2941.6895724655924
2020-08-17: 2895.1060187590047
2020-08-18: 3257.803081080566
2020-08-19: 3729.29683423567
2020-08-20: 3893.7578893274804
2020-08-21: 3730.712421318536
2020-08-22: 3518.9456034603645
2020-08-23: 3556.9260643566104
2020-08-24: 3815.0889651129337
2020-08-25: 4097.639967714435
2020-08-26: 4205.973745384475
2020-08-27: 4140.565999194241
2020-08-28: 4080.4333208130474
2020-08-29: 4159.39845126696
2020-08-30: 4351.808472653126
2020-08-31: 4541.7499406759125

Predicting for Eswatini__nan
2020-08-01: 65.9109056672546

2020-08-19: 309.0111158216192
2020-08-20: 327.18029108650944
2020-08-21: 343.8729237447051
2020-08-22: 360.68161160692046
2020-08-23: 389.5162489167685
2020-08-24: 414.73875813436723
2020-08-25: 437.76971719029444
2020-08-26: 459.1985918287834
2020-08-27: 471.3373260082028
2020-08-28: 492.0986182143629
2020-08-29: 518.6946237226186
2020-08-30: 544.339891749911
2020-08-31: 568.7966060361948

Predicting for Timor-Leste__nan
2020-08-01: 50.66467415432003
2020-08-02: 62.74143861625789
2020-08-03: 68.25421517014246
2020-08-04: 73.71272784768558
2020-08-05: 79.13088749925792
2020-08-06: 92.79764188454311
2020-08-07: 121.93550958221095
2020-08-08: 136.851095337674
2020-08-09: 146.81133765998902
2020-08-10: 156.37214243835206
2020-08-11: 167.27902079295683
2020-08-12: 184.285919682207
2020-08-13: 206.22025532030648
2020-08-14: 222.4052707499774
2020-08-15: 235.69817969209544
2020-08-16: 248.8161658139483
2020-08-17: 263.549234346583
2020-08-18: 276.1063615847573
2020-08-19: 294.7350347440222
2


Predicting for United States__Alabama
2020-08-01: 1998.256711226707
2020-08-02: 2334.610315227681
2020-08-03: 2223.425936238197
2020-08-04: 1975.320161404528
2020-08-05: 1611.0969596233483
2020-08-06: 1264.9987829994427
2020-08-07: 2089.517128213066
2020-08-08: 2354.055210872266
2020-08-09: 2312.091164813295
2020-08-10: 2164.3910048451353
2020-08-11: 1958.5554898152047
2020-08-12: 1940.9152825521926
2020-08-13: 2332.7836184045664
2020-08-14: 2510.357033430688
2020-08-15: 2509.464283248697
2020-08-16: 2434.991619806461
2020-08-17: 2358.1826821507675
2020-08-18: 2423.4926928945815
2020-08-19: 2640.6361060411173
2020-08-20: 2764.3619871567253
2020-08-21: 2788.779988999435
2020-08-22: 2770.5517632152187
2020-08-23: 2769.213984399232
2020-08-24: 2852.70240804093
2020-08-25: 2995.093942915444
2020-08-26: 3092.135937423661
2020-08-27: 3130.987323312127
2020-08-28: 3151.830388782235
2020-08-29: 3191.4747182462124
2020-08-30: 3278.0402569152566
2020-08-31: 3388.226876769507

Predicting for Uni

2020-08-28: 10554.329180939178
2020-08-29: 10659.544701043264
2020-08-30: 10924.172552471227
2020-08-31: 11274.629088846184

Predicting for United States__Georgia
2020-08-01: 2933.4096920678276
2020-08-02: 3883.1468979615274
2020-08-03: 4482.759253609625
2020-08-04: 2972.494225131225
2020-08-05: 2283.4116002152823
2020-08-06: 1928.7085340001634
2020-08-07: 3232.505938651292
2020-08-08: 3938.692061000267
2020-08-09: 4046.3656577582024
2020-08-10: 3365.4629763506427
2020-08-11: 2953.2045233101126
2020-08-12: 2994.9851859549626
2020-08-13: 3668.6530683646943
2020-08-14: 4097.826349902757
2020-08-15: 4114.0390198134755
2020-08-16: 3800.753065145518
2020-08-17: 3623.5786456805436
2020-08-18: 3755.649422163793
2020-08-19: 4146.475681881896
2020-08-20: 4408.631263723316
2020-08-21: 4429.7492705128325
2020-08-22: 4307.17976540216
2020-08-23: 4270.769353635066
2020-08-24: 4415.352249953682
2020-08-25: 4669.51617480454
2020-08-26: 4848.110957783761
2020-08-27: 4894.751433976834
2020-08-28: 4879.

2020-08-28: 3493.785749854832
2020-08-29: 3532.1803771818322
2020-08-30: 3625.9127258247763
2020-08-31: 3747.991494091254

Predicting for United States__Maine
2020-08-01: 229.7106975259483
2020-08-02: 266.7177885906677
2020-08-03: 261.1105392945847
2020-08-04: 286.795841032794
2020-08-05: 246.26929672860444
2020-08-06: 208.20919248157367
2020-08-07: 307.2950204454104
2020-08-08: 342.9428896939203
2020-08-09: 354.7064391495231
2020-08-10: 366.48368539701914
2020-08-11: 352.18067388590106
2020-08-12: 357.0333939475455
2020-08-13: 410.21331260225696
2020-08-14: 441.1777506989497
2020-08-15: 458.5821019373798
2020-08-16: 470.4045616436903
2020-08-17: 474.19567278361853
2020-08-18: 498.17392914599384
2020-08-19: 536.0393294723452
2020-08-20: 564.7211696969639
2020-08-21: 585.6657828726263
2020-08-22: 602.2329818365274
2020-08-23: 628.0996783529292
2020-08-24: 657.0376267469122
2020-08-25: 690.5644404254101
2020-08-26: 720.0304950383218
2020-08-27: 740.8548669593157
2020-08-28: 764.306691451

2020-08-15: 2106.4409262704066
2020-08-16: 2100.894473457156
2020-08-17: 2101.0394992317515
2020-08-18: 2160.7285141869797
2020-08-19: 2333.8846932753404
2020-08-20: 2399.0718730373246
2020-08-21: 2402.2683084441887
2020-08-22: 2419.495438937851
2020-08-23: 2451.530709926059
2020-08-24: 2527.9541870235516
2020-08-25: 2639.5561972064115
2020-08-26: 2702.9135433344463
2020-08-27: 2731.916693491251
2020-08-28: 2768.9130320412787
2020-08-29: 2820.916294979914
2020-08-30: 2898.8432499712408
2020-08-31: 2987.2990138069717

Predicting for United States__New Hampshire
2020-08-01: 472.6864234619403
2020-08-02: 552.4933164120966
2020-08-03: 520.4083677341193
2020-08-04: 474.91958169965267
2020-08-05: 453.1725745556172
2020-08-06: 355.20094888038483
2020-08-07: 550.2983191940789
2020-08-08: 616.1528378966991
2020-08-09: 610.019233929941
2020-08-10: 594.983901906593
2020-08-11: 576.3014910433945
2020-08-12: 573.777578172609
2020-08-13: 670.7517328771877
2020-08-14: 719.3847046058049
2020-08-15: 72

2020-08-16: 1926.6874058542685
2020-08-17: 1875.100307451605
2020-08-18: 1927.9309837628477
2020-08-19: 2101.2673794601797
2020-08-20: 2203.8088785611635
2020-08-21: 2223.1386658261963
2020-08-22: 2210.999353515038
2020-08-23: 2215.1872003367707
2020-08-24: 2283.7392585317475
2020-08-25: 2399.1102990877516
2020-08-26: 2479.2127769782182
2020-08-27: 2510.972528678937
2020-08-28: 2530.357330487266
2020-08-29: 2565.708060849771
2020-08-30: 2637.2572969934545
2020-08-31: 2727.5498987435394

Predicting for United States__South Dakota
2020-08-01: 978.5648454697598
2020-08-02: 1221.7917364798182
2020-08-03: 1128.0852528839507
2020-08-04: 996.3050321720899
2020-08-05: 834.5489945260781
2020-08-06: 665.0810070057033
2020-08-07: 1083.7169492224357
2020-08-08: 1243.3476081969764
2020-08-09: 1212.3194873148952
2020-08-10: 1139.130244971567
2020-08-11: 1047.840823374147
2020-08-12: 1045.7260921904167
2020-08-13: 1252.0611798132295
2020-08-14: 1354.3772235778333
2020-08-15: 1354.6056830492682
2020-0

Predicting for United States__Wisconsin
2020-08-01: 5743.862820795583
2020-08-02: 6541.788156494561
2020-08-03: 6082.7379896141565
2020-08-04: 4763.7244883448675
2020-08-05: 3818.986696156331
2020-08-06: 3267.5550471323704
2020-08-07: 5662.816096603855
2020-08-08: 6319.07689433573
2020-08-09: 6018.776515423773
2020-08-10: 5329.602397806132
2020-08-11: 4800.28062516768
2020-08-12: 4932.572938826342
2020-08-13: 6065.836289015074
2020-08-14: 6491.361054482844
2020-08-15: 6348.765173665977
2020-08-16: 6005.190991157059
2020-08-17: 5807.601533207719
2020-08-18: 6061.399743023762
2020-08-19: 6670.052459340722
2020-08-20: 6951.687680642078
2020-08-21: 6919.624859770598
2020-08-22: 6789.779791049271
2020-08-23: 6780.496919281056
2020-08-24: 7029.3922877770365
2020-08-25: 7406.306673732358
2020-08-26: 7618.298237180202
2020-08-27: 7658.873530228578
2020-08-28: 7661.295584164399
2020-08-29: 7750.750255217646
2020-08-30: 7978.744315937129
2020-08-31: 8252.284337269713

Predicting for United State

2020-08-01: 2353.112891997571
2020-08-02: 2707.626049267542
2020-08-03: 2894.4939495100425
2020-08-04: 2703.9935997283874
2020-08-05: 2173.428432353351
2020-08-06: 1583.3079290244634
2020-08-07: 2521.7911371624486
2020-08-08: 2865.7218374672348
2020-08-09: 2965.396861746758
2020-08-10: 2838.6353428907396
2020-08-11: 2531.3729888969574
2020-08-12: 2422.439136462006
2020-08-13: 2864.154925615485
2020-08-14: 3108.827327440985
2020-08-15: 3181.091829743121
2020-08-16: 3112.962018744158
2020-08-17: 2984.690506988065
2020-08-18: 3023.440735618533
2020-08-19: 3272.7144147568824
2020-08-20: 3444.950438254299
2020-08-21: 3511.3306076127874
2020-08-22: 3497.988238046597
2020-08-23: 3486.5984095203403
2020-08-24: 3568.962841572118
2020-08-25: 3738.005081797527
2020-08-26: 3871.4001414995473
2020-08-27: 3943.504810935041
2020-08-28: 3975.9601159895938
2020-08-29: 4019.5408140531454
2020-08-30: 4117.382177879573
2020-08-31: 4252.835570792905

Predicting for Zambia__nan
2020-08-01: 86.55940405553977

In [24]:
# Check the predictions
preds_df.head()

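|     | CountryName | RegionName | Date       | PredictedDailyNewCases |
|-----|-------------|------------|------------|------------------------|
| 213 | Aruba       |            | 2020-08-01 | 58.830066              |
| 214 | Aruba       |            | 2020-08-02 | 71.321117              |
| 215 | Aruba       |            | 2020-08-03 | 78.986804              |
| 216 | Aruba       |            | 2020-08-04 | 90.238539              |
| 217 | Aruba       |            | 2020-08-05 | 87.811644              |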


# Validation
This is how the predictor is going to be called during the competition.  
!!! PLEASE DO NOT CHANGE THE API !!!

In [36]:
!python3 predict.py -s 2020-08-01 -e 2020-08-04 -ip data/2020-09-30_historical_ip_new.csv -o predictions/2020-08-01_2020-08-04.csv

Generating predictions from 2020-08-01 to 2020-08-04...
Saved predictions to predictions/2020-08-01_2020-08-04.csv
Done!
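
For readers wiring up their own `predict.py`, the command-line interface above could be parsed along these lines. This is only a minimal sketch: the short flags `-s`, `-e`, `-ip` and `-o` come from the command shown above, while the long option names, the `dest` names and the `build_parser` helper are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Short flags match the command shown above.
    # Long option names and dest names are assumptions, not taken from predict.py.
    parser = argparse.ArgumentParser(description="Generate daily new case predictions")
    parser.add_argument("-s", "--start_date", dest="start_date", required=True,
                        help="First day to predict, as YYYY-MM-DD")
    parser.add_argument("-e", "--end_date", dest="end_date", required=True,
                        help="Last day to predict (inclusive), as YYYY-MM-DD")
    parser.add_argument("-ip", "--interventions_plan", dest="ip_file", required=True,
                        help="Path to the intervention plan CSV")
    parser.add_argument("-o", "--output_file", dest="output_file", required=True,
                        help="Path of the CSV file to write predictions to")
    return parser


# Parse the same arguments as the notebook cell above, without touching sys.argv.
args = build_parser().parse_args([
    "-s", "2020-08-01",
    "-e", "2020-08-04",
    "-ip", "data/2020-09-30_historical_ip_new.csv",
    "-o", "predictions/2020-08-01_2020-08-04.csv",
])
print(args.start_date, args.end_date, args.ip_file, args.output_file)
```

Keeping the four flags unchanged is what lets an external caller run any submitted predictor with the same command.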


In [37]:
!head predictions/2020-08-01_2020-08-04.csv

CountryName,RegionName,Date,PredictedDailyNewCases
Aruba,,2020-08-01,58.830065523865606
Aruba,,2020-08-02,71.32111674610171
Aruba,,2020-08-03,78.98680370746847
Aruba,,2020-08-04,90.23853933752429
Afghanistan,,2020-08-01,149.7782001433701
Afghanistan,,2020-08-02,287.27501248801843
Afghanistan,,2020-08-03,277.5336348431203
Afghanistan,,2020-08-04,264.79695558556705
Angola,,2020-08-01,198.70220566149834


# Test cases
We can generate a prediction file. Let's validate a few cases...

In [38]:
import os
from predictor_validation import validate_submission

def validate(start_date, end_date, ip_file, output_file):
    # First, delete any potential old file
    try:
        os.remove(output_file)
    except OSError:
        pass
    
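    # Then generate the prediction, calling the official API.
    # Use python3 explicitly, as in the Validation section: predict.py uses
    # type annotations, which a Python 2 interpreter rejects with a SyntaxError.
    !python3 predict.py -s {start_date} -e {end_date} -ip {ip_file} -o {output_file}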
    
    # And validate it
    errors = validate_submission(start_date, end_date, ip_file, output_file)
    if errors:
        for error in errors:
            print(error)
    else:
        print("All good!")
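
The internals of `validate_submission` are not shown in this notebook. As a rough illustration of the kind of checks such a validator might run, the sketch below inspects a predictions DataFrame for the columns seen in the output file above, for NaN or negative values, and for missing dates per region. The function name `check_predictions` and the specific set of checks are assumptions, not the official validation logic.

```python
import pandas as pd

# Column names as they appear in the prediction file shown earlier.
REQUIRED_COLUMNS = ["CountryName", "RegionName", "Date", "PredictedDailyNewCases"]


def check_predictions(pred_df, start_date, end_date):
    """Hypothetical checker: return a list of problems, empty if none were found."""
    missing = [c for c in REQUIRED_COLUMNS if c not in pred_df.columns]
    if missing:
        return [f"Missing columns: {missing}"]

    errors = []
    if pred_df["PredictedDailyNewCases"].isna().any():
        errors.append("Some predictions are NaN")
    if (pred_df["PredictedDailyNewCases"] < 0).any():
        errors.append("Some predictions are negative")

    expected_dates = set(pd.date_range(start_date, end_date))
    dates = pd.to_datetime(pred_df["Date"])
    # RegionName is empty for country-level rows, so fill it before grouping.
    geo = pred_df["CountryName"] + "__" + pred_df["RegionName"].fillna("").astype(str)
    for geo_id, geo_dates in dates.groupby(geo):
        missing_dates = expected_dates - set(geo_dates)
        if missing_dates:
            errors.append(f"{geo_id}: {len(missing_dates)} date(s) missing")
    return errors


example = pd.DataFrame({
    "CountryName": ["Aruba"] * 3,
    "RegionName": [None] * 3,
    "Date": ["2020-08-01", "2020-08-02", "2020-08-03"],
    "PredictedDailyNewCases": [58.8, 71.3, 79.0],
})
print(check_predictions(example, "2020-08-01", "2020-08-03"))
print(check_predictions(example, "2020-08-01", "2020-08-04"))
```

Returning a list of messages rather than raising on the first problem mirrors how `validate` above prints every reported error.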

## 4 days, no gap
- All countries and regions
- Official number of cases is known up to start_date
- Intervention Plans are the official ones

In [39]:
validate(start_date="2020-08-01",
         end_date="2020-08-04",
         ip_file="data/2020-09-30_historical_ip.csv",
         output_file="predictions/val_4_days.csv")

  File "predict.py", line 36
    def predict(start_date: str,
                          ^
SyntaxError: invalid syntax


FileNotFoundError: [Errno 2] No such file or directory: 'predictions/val_4_days.csv'

The `SyntaxError` in this recorded output points at the type annotation in `def predict(start_date: str, ...`, which is what a Python 2 interpreter reports for Python 3 syntax: the run that produced it most likely invoked `predict.py` with a Python 2 interpreter. Because no prediction file was written, `validate_submission` then failed with the `FileNotFoundError`.

## 1 month in the future
- 2 countries only
- There is a gap between the date of the last known number of cases and start_date
- For future dates, the Intervention Plans contain scenarios for which predictions are requested, to answer the question: what will happen if we apply these plans?

In [None]:
%%time
validate(start_date="2021-01-01",
         end_date="2021-01-31",
         ip_file="../../../validation/data/future_ip.csv",
         output_file="predictions/val_1_month_future.csv")

## 180 days, from a future date, all countries and regions
- Prediction start date is 1 week from now (i.e., assuming the submission date is 1 week from now).  
- Prediction end date is 6 months after start date.  
- Prediction is requested for all available countries and regions.  
- Intervention plan scenario: freeze last known intervention plans for each country and region.  

The number of cases between today and the start date is not known yet, but the model relies on it as input. The model therefore has to predict the cases for that gap first, and then feed those predictions back in as if they were observations.  
This is the most demanding test: generating the prediction file should take less than 1 hour.
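
Predicting across the gap can be sketched as a day-by-day rollout that starts right after the last known case count and only keeps the requested days at the end. The example below is a toy illustration under stated assumptions: the `rollout` function, the weights, the lookback of 3 days and the constant NPI plan are all hypothetical, and the linear step is a simplified stand-in for the trained model.

```python
import numpy as np


def rollout(past_cases, npis, weights_cases, weights_npis, lookback):
    """Roll a one-step linear predictor forward over every row of `npis`.

    past_cases: 1-D array of known daily new cases (at least `lookback` long)
    npis: 2-D array (days_to_predict, n_npis), the plan for each predicted day
    Each prediction is appended to the history and feeds the next step.
    """
    history = list(past_cases[-lookback:])
    preds = []
    for day_npis in npis:
        window = np.array(history[-lookback:])
        # Toy linear step: prior cases push the prediction up, NPIs pull it down.
        pred = float(window @ weights_cases - day_npis @ weights_npis)
        pred = max(pred, 0.0)  # daily new cases cannot be negative
        preds.append(pred)
        history.append(pred)
    return np.array(preds)


lookback = 3
weights_cases = np.array([0.2, 0.3, 0.5])  # hypothetical weights
weights_npis = np.array([1.0, 2.0])        # hypothetical weights
known_cases = np.array([100.0, 110.0, 120.0])

gap_days, requested_days = 5, 4
plan = np.ones((gap_days + requested_days, 2))  # hypothetical constant NPI plan

all_preds = rollout(known_cases, plan, weights_cases, weights_npis, lookback)
requested_preds = all_preds[gap_days:]  # drop the gap, keep start_date..end_date
print(requested_preds)
```

Since every day depends on the previous predictions, the cost grows with the length of the gap plus the prediction window, which is why the 180-day run over all regions is the slowest case.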

### Generate the scenario

In [None]:
from datetime import datetime, timedelta

start_date = datetime.now() + timedelta(days=7)
start_date_str = start_date.strftime('%Y-%m-%d')
end_date = start_date + timedelta(days=180)
end_date_str = end_date.strftime('%Y-%m-%d')
print(f"Start date: {start_date_str}")
print(f"End date: {end_date_str}")

In [None]:
from covid_xprize.validation.scenario_generator import get_raw_data, generate_scenario, NPI_COLUMNS
DATA_FILE = 'data/OxCGRT_latest.csv'
latest_df = get_raw_data(DATA_FILE, latest=True)
scenario_df = generate_scenario(start_date_str, end_date_str, latest_df, countries=None, scenario="Freeze")
scenario_file = "predictions/180_days_future_scenario.csv"
scenario_df.to_csv(scenario_file, index=False)
print(f"Saved scenario to {scenario_file}")
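
To make the "Freeze" scenario concrete, here is a minimal sketch of the idea: take each region's last known NPI values and repeat them for every future date. It is not the implementation of `generate_scenario`; the helper `freeze_last_known` and the tiny input frame are hypothetical, and the sketch only produces the future rows.

```python
import pandas as pd


def freeze_last_known(ips_df, npi_columns, start_date, end_date):
    """Hypothetical helper: repeat each region's last known NPIs over start_date..end_date."""
    ips = ips_df.copy()
    # Country-level rows have no region; use "" so they form their own group.
    ips["RegionName"] = ips["RegionName"].fillna("")
    last_known = (ips.sort_values("Date")
                     .groupby(["CountryName", "RegionName"], as_index=False)
                     .last())
    future_dates = pd.DataFrame({"Date": pd.date_range(start_date, end_date)})
    # Cross join: one row per (region, future date), NPIs copied unchanged.
    frozen = last_known[["CountryName", "RegionName"] + list(npi_columns)].merge(
        future_dates, how="cross")
    return frozen[["CountryName", "RegionName", "Date"] + list(npi_columns)]


history = pd.DataFrame({
    "CountryName": ["Aruba", "Aruba", "United States", "United States"],
    "RegionName": [None, None, "Alabama", "Alabama"],
    "Date": pd.to_datetime(["2020-07-30", "2020-07-31"] * 2),
    "C1_School closing": [1, 2, 3, 3],
})
scenario = freeze_last_known(history, ["C1_School closing"], "2020-08-01", "2020-08-03")
print(scenario)
```

Note that `how="cross"` requires pandas 1.2 or later.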

### Check it

In [None]:
%%time
validate(start_date=start_date_str,
         end_date=end_date_str,
         ip_file=scenario_file,
         output_file="predictions/val_6_month_future.csv")