# Example Predictor: Linear Rollout Predictor

This example contains basic functionality for training and evaluating a linear predictor that rolls out predictions day-by-day.

First, a training data set is created from historical case and NPI data.

Second, a linear model is trained to predict future cases from prior case data along with prior and future NPI data.
The model is an off-the-shelf scikit-learn Lasso model that uses a positive weight constraint to enforce the assumption that increased NPIs are negatively correlated with future cases.

Third, a sample evaluation set is created, and the predictor is applied to this evaluation set to produce prediction results in the correct format.
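The day-by-day rollout described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation in the `predict` module: the `rollout` helper, its arguments, and the toy one-step predictor are all hypothetical. Each predicted value is appended to the case history and becomes an input for the next day's prediction.

```python
import numpy as np

def rollout(one_step_predict, past_cases, past_npis, future_npis, nb_lookback_days):
    """Roll a one-day-ahead predictor forward over len(future_npis) days.

    one_step_predict: hypothetical callable mapping a flat feature vector
        to one predicted case count (stands in for a trained model).
    past_cases: 1-D sequence of historical daily new cases.
    past_npis: 2-D array (days x npis) of historical NPI levels.
    future_npis: 2-D array (days x npis) of planned NPI levels.
    """
    cases = list(past_cases)
    npis = np.vstack([past_npis, future_npis])
    n_past = len(past_cases)
    preds = []
    for step in range(len(future_npis)):
        end = n_past + step
        # Inputs: the most recent nb_lookback_days of cases and negated NPIs
        x_cases = np.array(cases[end - nb_lookback_days:end], dtype=float)
        x_npis = -npis[end - nb_lookback_days:end]
        x = np.concatenate([x_cases, x_npis.flatten()])
        pred = max(float(one_step_predict(x)), 0.0)  # never predict negative cases
        preds.append(pred)
        cases.append(pred)  # feed the prediction back in for the next day
    return np.array(preds)

# Toy usage: the "model" is just the mean of the last 3 days of cases.
preds = rollout(lambda x: x[:3].mean(),
                past_cases=[10, 20, 30],
                past_npis=np.zeros((3, 2)),
                future_npis=np.zeros((2, 2)),
                nb_lookback_days=3)
print(preds)
```

Because predictions are fed back as inputs, errors made early in the horizon can compound over later days.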

## Training

In [1]:
import pickle
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

### Copy the data locally

In [2]:
# Main source for the training data
DATA_URL = 'https://raw.githubusercontent.com/OxCGRT/covid-policy-tracker/master/data/OxCGRT_latest.csv'
# Local file
DATA_FILE = 'data/OxCGRT_latest.csv'

In [3]:
import os
import urllib.request
if not os.path.exists('data'):
    os.mkdir('data')
urllib.request.urlretrieve(DATA_URL, DATA_FILE)

('data/OxCGRT_latest.csv', <http.client.HTTPMessage at 0x25c2c52f348>)

In [4]:
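# Load historical data from local file
# Note: error_bad_lines was deprecated in pandas 1.3 and removed in 2.0;
# on newer versions use on_bad_lines='skip' instead.
df = pd.read_csv(DATA_FILE,
                 parse_dates=['Date'],
                 encoding="ISO-8859-1",
                 dtype={"RegionName": str,
                        "RegionCode": str},
                 error_bad_lines=False)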

In [5]:
df.columns

Index(['CountryName', 'CountryCode', 'RegionName', 'RegionCode',
       'Jurisdiction', 'Date', 'C1_School closing', 'C1_Flag',
       'C2_Workplace closing', 'C2_Flag', 'C3_Cancel public events', 'C3_Flag',
       'C4_Restrictions on gatherings', 'C4_Flag', 'C5_Close public transport',
       'C5_Flag', 'C6_Stay at home requirements', 'C6_Flag',
       'C7_Restrictions on internal movement', 'C7_Flag',
       'C8_International travel controls', 'E1_Income support', 'E1_Flag',
       'E2_Debt/contract relief', 'E3_Fiscal measures',
       'E4_International support', 'H1_Public information campaigns',
       'H1_Flag', 'H2_Testing policy', 'H3_Contact tracing',
       'H4_Emergency investment in healthcare', 'H5_Investment in vaccines',
       'H6_Facial Coverings', 'H6_Flag', 'M1_Wildcard', 'ConfirmedCases',
       'ConfirmedDeaths', 'StringencyIndex', 'StringencyIndexForDisplay',
       'StringencyLegacyIndex', 'StringencyLegacyIndexForDisplay',
       'GovernmentResponseIndex', 'Gove

In [6]:
# For testing, restrict training data to that before a hypothetical predictor submission date
HYPOTHETICAL_SUBMISSION_DATE = np.datetime64("2020-07-31")
df = df[df.Date <= HYPOTHETICAL_SUBMISSION_DATE]

In [7]:
# Add GeoID column that combines CountryName and RegionName for easier manipulation of data
df['GeoID'] = df['CountryName'] + '__' + df['RegionName'].astype(str)

In [8]:
# Add new cases column
df['NewCases'] = df.groupby('GeoID').ConfirmedCases.diff().fillna(0)
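As a small illustration of why the difference is taken per `GeoID`, the sketch below applies the same expression to a toy frame. The frame and its values are made up for this example. Grouping keeps the first day of region `B` from being differenced against the last day of region `A`.

```python
import pandas as pd

# Hypothetical cumulative counts for two regions
toy = pd.DataFrame({
    "GeoID": ["A", "A", "A", "B", "B", "B"],
    "ConfirmedCases": [1.0, 4.0, 9.0, 100.0, 100.0, 130.0],
})

# Same expression as in the cell above: per-region day-over-day difference,
# with the undefined first day of each region set to 0
toy["NewCases"] = toy.groupby("GeoID").ConfirmedCases.diff().fillna(0)
print(toy)
```

Without the `groupby`, the first row of region `B` would receive 100 - 9 = 91 new cases, which is an artifact of row order rather than real data.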

In [9]:
# Keep only columns of interest
id_cols = ['CountryName',
           'RegionName',
           'GeoID',
           'Date']
cases_col = ['NewCases']
npi_cols = ['C1_School closing',
            'C2_Workplace closing',
            'C3_Cancel public events',
            'C4_Restrictions on gatherings',
            'C5_Close public transport',
            'C6_Stay at home requirements',
            'C7_Restrictions on internal movement',
            'C8_International travel controls',
            'H1_Public information campaigns',
            'H2_Testing policy',
            'H3_Contact tracing',
            'H6_Facial Coverings']
df = df[id_cols + cases_col + npi_cols]

In [10]:
# Fill any missing case values by interpolation and setting NaNs to 0
df.update(df.groupby('GeoID').NewCases.apply(
    lambda group: group.interpolate()).fillna(0))

In [11]:
# Fill any missing NPIs by assuming they are the same as previous day
for npi_col in npi_cols:
    df.update(df.groupby('GeoID')[npi_col].ffill().fillna(0))
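The per-region forward fill can be checked on a toy frame. The sketch below uses made-up values. It shows that a missing value at the start of a region is not filled from the previous region and instead falls through to `fillna(0)`.

```python
import numpy as np
import pandas as pd

# Hypothetical NPI levels with gaps
toy = pd.DataFrame({
    "GeoID": ["A", "A", "A", "B", "B"],
    "C1_School closing": [2.0, np.nan, np.nan, np.nan, 1.0],
})

# Forward fill within each region, then set any remaining gaps to 0
filled = toy.groupby("GeoID")["C1_School closing"].ffill().fillna(0)
print(filled.tolist())
```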

In [12]:
df

Unnamed: 0,CountryName,RegionName,GeoID,Date,NewCases,C1_School closing,C2_Workplace closing,C3_Cancel public events,C4_Restrictions on gatherings,C5_Close public transport,C6_Stay at home requirements,C7_Restrictions on internal movement,C8_International travel controls,H1_Public information campaigns,H2_Testing policy,H3_Contact tracing,H6_Facial Coverings
0,Aruba,,Aruba__nan,2020-01-01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Aruba,,Aruba__nan,2020-01-02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Aruba,,Aruba__nan,2020-01-03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Aruba,,Aruba__nan,2020-01-04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Aruba,,Aruba__nan,2020-01-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
87064,Zimbabwe,,Zimbabwe__nan,2020-07-27,78.0,3.0,1.0,2.0,3.0,1.0,2.0,2.0,4.0,2.0,1.0,1.0,4.0
87065,Zimbabwe,,Zimbabwe__nan,2020-07-28,192.0,3.0,1.0,2.0,3.0,1.0,2.0,2.0,4.0,2.0,1.0,1.0,4.0
87066,Zimbabwe,,Zimbabwe__nan,2020-07-29,113.0,3.0,1.0,2.0,3.0,1.0,2.0,2.0,4.0,2.0,1.0,1.0,4.0
87067,Zimbabwe,,Zimbabwe__nan,2020-07-30,62.0,3.0,1.0,2.0,3.0,1.0,2.0,2.0,4.0,2.0,1.0,1.0,4.0


In [13]:
# Set number of past days to use to make predictions
nb_lookback_days = 30

# Create training data across all countries for predicting one day ahead
X_cols = cases_col + npi_cols
y_col = cases_col
X_samples = []
y_samples = []
geo_ids = df.GeoID.unique()
for g in geo_ids:
    gdf = df[df.GeoID == g]
    all_case_data = np.array(gdf[cases_col])
    all_npi_data = np.array(gdf[npi_cols])

    # Create one sample for each day where we have enough data
    # Each sample consists of cases and npis for previous nb_lookback_days
    nb_total_days = len(gdf)
    for d in range(nb_lookback_days, nb_total_days - 1):
        X_cases = all_case_data[d-nb_lookback_days:d]

        # Take negative of npis to support positive
        # weight constraint in Lasso.
        X_npis = -all_npi_data[d - nb_lookback_days:d]

        # Flatten all input data so it fits Lasso input format.
        X_sample = np.concatenate([X_cases.flatten(),
                                   X_npis.flatten()])
        y_sample = all_case_data[d + 1]
        X_samples.append(X_sample)
        y_samples.append(y_sample)

X_samples = np.array(X_samples)
y_samples = np.array(y_samples).flatten()
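To make the shape of the training matrix concrete, the sketch below repeats the windowing loop on a tiny synthetic series. The sizes and values are assumptions chosen for readability: a lookback of 3 days instead of 30, one case column, and two NPI columns.

```python
import numpy as np

nb_lookback_days = 3                                  # the notebook uses 30
cases = np.arange(10, dtype=float).reshape(-1, 1)     # 10 days, 1 case column
npis = np.ones((10, 2))                               # 10 days, 2 NPI columns

X, y = [], []
for d in range(nb_lookback_days, len(cases) - 1):
    x_cases = cases[d - nb_lookback_days:d]
    x_npis = -npis[d - nb_lookback_days:d]            # negated, as in the cell above
    X.append(np.concatenate([x_cases.flatten(), x_npis.flatten()]))
    y.append(cases[d + 1])
X = np.array(X)
y = np.array(y).flatten()

print(X.shape)   # (number of samples, nb_lookback_days * (1 + number of NPIs))
print(X[0], y[0])
```

Each row holds all case values first, followed by the negated NPI values ordered by day and then by NPI column. With this indexing, the first sample's inputs cover days 0 to 2 and its target is the value for day 4.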

In [14]:
# Helpful function to compute mae
def mae(pred, true):
    return np.mean(np.abs(pred - true))

In [15]:
# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X_samples,
                                                    y_samples,
                                                    test_size=0.2,
                                                    random_state=301)

In [16]:
# Create and train Lasso model.
# Set positive=True to enforce assumption that cases are positively correlated
# with future cases and npis are negatively correlated.
model = Lasso(alpha=0.1,
              precompute=True,
              max_iter=10000,
              positive=True,
              selection='random')
# Fit model
model.fit(X_train, y_train)

Lasso(alpha=0.1, copy_X=True, fit_intercept=True, max_iter=10000,
      normalize=False, positive=True, precompute=True, random_state=None,
      selection='random', tol=0.0001, warm_start=False)
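The effect of combining `positive=True` with negated NPI inputs can be seen on synthetic data. The sketch below is an illustration under made-up assumptions: the data-generating formula, sample size, and variable names are hypothetical and unrelated to the real data set. When the NPI column is passed with its original sign, the constraint forces its weight to zero. When it is negated, the model can use it.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n = 500
prior_cases = rng.uniform(0, 100, size=n)
npi_level = rng.integers(0, 5, size=n).astype(float)

# Synthetic target: more prior cases raise it, stricter NPIs lower it
y = 2.0 * prior_cases - 3.0 * npi_level + 50.0 + rng.normal(0, 1, size=n)

X_raw = np.column_stack([prior_cases, npi_level])    # NPI with original sign
X_neg = np.column_stack([prior_cases, -npi_level])   # NPI negated, as in this notebook

coef_raw = Lasso(alpha=0.1, positive=True, max_iter=10000).fit(X_raw, y).coef_
coef_neg = Lasso(alpha=0.1, positive=True, max_iter=10000).fit(X_neg, y).coef_

print("raw NPI input:    ", coef_raw)
print("negated NPI input:", coef_neg)
```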

In [17]:
# Evaluate model
train_preds = model.predict(X_train)
train_preds = np.maximum(train_preds, 0) # Don't predict negative cases
print('Train MAE:', mae(train_preds, y_train))

test_preds = model.predict(X_test)
test_preds = np.maximum(test_preds, 0) # Don't predict negative cases
print('Test MAE:', mae(test_preds, y_test))

Train MAE: 140.71006386533378
Test MAE: 152.49615088019445
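The clipping step in the evaluation cell can be illustrated with a two-value example. The numbers are hypothetical. Since true case counts are never negative, replacing a negative prediction with 0 can only move it closer to the target.

```python
import numpy as np

def mae(pred, true):
    return np.mean(np.abs(pred - true))

true = np.array([0.0, 12.0])
raw_preds = np.array([-5.0, 10.0])           # hypothetical raw model outputs
clipped_preds = np.maximum(raw_preds, 0)     # same clipping as in the cell above

raw_error = mae(raw_preds, true)             # (5 + 2) / 2
clipped_error = mae(clipped_preds, true)     # (0 + 2) / 2
print(raw_error, clipped_error)
```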


In [18]:
# Inspect the learned feature coefficients for the model
# to see what features it's paying attention to.

# Give names to the features
x_col_names = []
for d in range(-nb_lookback_days, 0):
    x_col_names.append('Day ' + str(d) + ' ' + cases_col[0])
for d in range(-nb_lookback_days, 0):
    for col_name in npi_cols:
        x_col_names.append('Day ' + str(d) + ' ' + col_name)

# View non-zero coefficients
for (col, coeff) in zip(x_col_names, list(model.coef_)):
    if coeff != 0.:
        print(col, coeff)
print('Intercept', model.intercept_)

Day -7 NewCases 0.0013225356073199859
Day -6 NewCases 0.4392436772339886
Day -5 NewCases 0.21732674537443955
Day -4 NewCases 0.058986375482331224
Day -3 NewCases 0.06940403657691334
Day -2 NewCases 0.05200614681355212
Day -1 NewCases 0.23829209744812396
Day -26 C6_Stay at home requirements 4.317512757562426
Day -22 C2_Workplace closing 9.712146509910697
Day -17 C2_Workplace closing 5.768601948051288
Intercept 26.55309172529894
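When reading the NPI coefficients, recall that the NPI inputs were negated before training. As a rough sketch, holding every other input fixed, the contribution of one NPI feature to the prediction is its coefficient times the negated NPI level. The snippet below works this out for one coefficient copied from the output above. The loop over levels 0 to 3 is only for illustration.

```python
# Coefficient for 'Day -22 C2_Workplace closing', copied from the output above
coef = 9.712146509910697

# Contribution to the prediction = coefficient * (negated NPI level)
contributions = {level: coef * (-level) for level in range(4)}
for level, contribution in contributions.items():
    print(level, contribution)
```

A positive coefficient on a negated NPI therefore means that a stricter level lowers the predicted number of new cases.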


In [19]:
# Save model to file
if not os.path.exists('models'):
    os.mkdir('models')
with open('models/model.pkl', 'wb') as model_file:
    pickle.dump(model, model_file)
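A saved model can later be read back with `pickle.load`. The sketch below shows the round trip with a stand-in object and a temporary directory so that it runs without a trained model. The dictionary and the path are placeholders, not part of the notebook.

```python
import os
import pickle
import tempfile

# Placeholder object standing in for a fitted model
stand_in_model = {"coef": [0.1, 0.2], "intercept": 1.5}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model.pkl")
    # Save, mirroring the cell above
    with open(model_path, "wb") as model_file:
        pickle.dump(stand_in_model, model_file)
    # Load it back
    with open(model_path, "rb") as model_file:
        restored = pickle.load(model_file)

print(restored)
```

For a real scikit-learn model, it is safest to load the pickle with the same scikit-learn version that was used to save it.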

## Evaluation

Now that the predictor has been trained and saved, this section contains the functionality for evaluating it on sample evaluation data.

In [20]:
# Reload the module to get the latest changes
import predict
from importlib import reload
reload(predict)
from predict import predict_df

In [21]:
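# Restrict the historical IP file to countries present in the training data
list_countries = sorted(set(df.CountryName))
# Note: error_bad_lines was deprecated in pandas 1.3 and removed in 2.0;
# on newer versions use on_bad_lines='error' instead.
hist_ips_df = pd.read_csv("data/2020-09-30_historical_ip.csv",
                          parse_dates=['Date'],
                          encoding="ISO-8859-1",
                          dtype={"RegionName": str},
                          error_bad_lines=True)
hist_ips_df = hist_ips_df[hist_ips_df.CountryName.isin(list_countries)]
hist_ips_df.to_csv("data/2020-09-30_historical_ip_new.csv", index=False)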

In [26]:
hist_ips_df

Unnamed: 0,CountryName,RegionName,Date,C1_School closing,C2_Workplace closing,C3_Cancel public events,C4_Restrictions on gatherings,C5_Close public transport,C6_Stay at home requirements,C7_Restrictions on internal movement,C8_International travel controls,H1_Public information campaigns,H2_Testing policy,H3_Contact tracing,H6_Facial Coverings
0,Aruba,,2020-01-01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Aruba,,2020-01-02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Aruba,,2020-01-03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Aruba,,2020-01-04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Aruba,,2020-01-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
66577,Zimbabwe,,2020-09-26,2.0,1.0,2.0,3.0,1.0,2.0,2.0,4.0,2.0,1.0,1.0,3.0
66578,Zimbabwe,,2020-09-27,2.0,1.0,2.0,3.0,1.0,2.0,2.0,4.0,2.0,1.0,1.0,3.0
66579,Zimbabwe,,2020-09-28,2.0,1.0,2.0,3.0,1.0,2.0,2.0,4.0,2.0,1.0,1.0,3.0
66580,Zimbabwe,,2020-09-29,2.0,1.0,2.0,3.0,1.0,2.0,2.0,4.0,2.0,1.0,1.0,1.0


In [22]:
%%time
preds_df = predict_df("2020-08-01", "2020-08-31", path_to_ips_file="data/2020-09-30_historical_ip_new.csv", verbose=True)


Predicting for Aruba__nan
2020-08-01: 58.83473289745202
2020-08-02: 71.32067758710875
2020-08-03: 78.9939859879806
2020-08-04: 90.23322539243564
2020-08-05: 87.80838529780314
2020-08-06: 98.77066089725407
2020-08-07: 131.03650358597326
2020-08-08: 147.0197713649059
2020-08-09: 158.95382619356604
2020-08-10: 169.93478041431484
2020-08-11: 177.51681194106453
2020-08-12: 193.48982775242172
2020-08-13: 216.81710958325104
2020-08-14: 234.03727312037546
2020-08-15: 248.55924399377528
2020-08-16: 261.96334323394944
2020-08-17: 275.30005222237355
2020-08-18: 293.2945345100961
2020-08-19: 314.07285333297244
2020-08-20: 332.42697864544107
2020-08-21: 349.23132322228753
2020-08-22: 365.49901678409543
2020-08-23: 382.53537168234544
2020-08-24: 407.8958403512891
2020-08-25: 430.08469160303
2020-08-26: 450.5744309199954
2020-08-27: 465.99933048449
2020-08-28: 484.64663396426465
2020-08-29: 515.3515195450678
2020-08-30: 541.9013788115575
2020-08-31: 566.2617214003626

Predicting for Afghanistan__nan

2020-08-27: 8743.206635241966
2020-08-28: 8787.225765910822
2020-08-29: 8861.26895439204
2020-08-30: 9067.712280823758
2020-08-31: 9351.865934512774

Predicting for Azerbaijan__nan
2020-08-01: 2144.236787305373
2020-08-02: 2777.1520281078583
2020-08-03: 3002.262683339484
2020-08-04: 2959.430852905886
2020-08-05: 2638.6802092959388
2020-08-06: 1675.7642101748484
2020-08-07: 2515.0987286211994
2020-08-08: 2970.002490129109
2020-08-09: 3126.709823288507
2020-08-10: 3100.940425838303
2020-08-11: 2833.819246054756
2020-08-12: 2565.566329057056
2020-08-13: 2961.468576480843
2020-08-14: 3256.779166804466
2020-08-15: 3377.198304112346
2020-08-16: 3363.7407751140877
2020-08-17: 3234.9811696606043
2020-08-18: 3208.7714504092783
2020-08-19: 3439.72034652292
2020-08-20: 3640.0675332798073
2020-08-21: 3740.764546553231
2020-08-22: 3755.9271216376133
2020-08-23: 3749.458428292196
2020-08-24: 3806.0206997843406
2020-08-25: 3971.103707167391
2020-08-26: 4124.019528299527
2020-08-27: 4218.92721294256
...

2020-08-17: 1438.4924786314682
2020-08-18: 1489.7176507220838
2020-08-19: 1620.9860370834058
2020-08-20: 1695.7481036317586
2020-08-21: 1723.7734658158392
2020-08-22: 1719.9724334977068
2020-08-23: 1721.0567646306777
2020-08-24: 1778.4302381014718
2020-08-25: 1867.5219353120087
2020-08-26: 1930.6896367229178
2020-08-27: 1957.3689752215173
2020-08-28: 1975.1822754478321
2020-08-29: 2003.3938723331366
2020-08-30: 2061.184939697067
2020-08-31: 2132.1701773369787

Predicting for Belarus__nan
2020-08-01: 1205.1332661282
2020-08-02: 1465.4719059187296
2020-08-03: 1541.728922327432
2020-08-04: 1617.2502682620502
2020-08-05: 1378.0996117436416
2020-08-06: 920.5872903369026
2020-08-07: 1392.7252977279004
2020-08-08: 1601.813932705601
2020-08-09: 1680.584876045697
2020-08-10: 1697.3098460380802
2020-08-11: 1543.377954621575
2020-08-12: 1426.6999415818568
2020-08-13: 1648.9173921812935
2020-08-14: 1795.6894444037894
2020-08-15: 1863.5538777761994
2020-08-16: 1869.8958575519787
2020-08-17: 1805.28

2020-08-19: 422.173898839664
2020-08-20: 441.7202234637805
2020-08-21: 444.2677354280012
2020-08-22: 454.42556409980136
2020-08-23: 475.36783046648736
2020-08-24: 505.06694214711735
2020-08-25: 535.4309218274257
2020-08-26: 555.4541881811323
2020-08-27: 568.4536316027409
2020-08-28: 585.4683006644835
2020-08-29: 609.0461728833246
2020-08-30: 637.3045234378483
2020-08-31: 664.9401916420646

Predicting for Central African Republic__nan
2020-08-01: 52.42609841793618
2020-08-02: 67.10560337355324
2020-08-03: 71.1523125061638
2020-08-04: 74.75767633943352
2020-08-05: 79.93321208138464
2020-08-06: 93.89434451129392
2020-08-07: 124.19734050796148
2020-08-08: 140.11291334641965
2020-08-09: 149.33610323517598
2020-08-10: 158.00205848198016
2020-08-11: 168.75103564682962
2020-08-12: 186.0668789377895
2020-08-13: 208.68535838934423
2020-08-14: 225.268226652305
2020-08-15: 238.1875463033743
2020-08-16: 250.87444604116064
2020-08-17: 265.55069529815415
2020-08-18: 284.1182336252879
...

2020-08-19: 335.9235636467507
2020-08-20: 358.24451686745147
2020-08-21: 380.9847483137527
2020-08-22: 395.9621649945
2020-08-23: 424.1716658289229
2020-08-24: 451.096610951849
2020-08-25: 477.2443577894826
2020-08-26: 512.2084021551109
2020-08-27: 538.7055936260713
2020-08-28: 562.9804636325367
2020-08-29: 592.3741605177181
2020-08-30: 622.0888276425796
2020-08-31: 653.0654618016067

Predicting for Colombia__nan
2020-08-01: 6075.915389043067
2020-08-02: 7490.590810740697
2020-08-03: 7654.592709142291
2020-08-04: 7514.792896184089
2020-08-05: 6607.633416094385
2020-08-06: 4320.072237557359
2020-08-07: 6693.532274457165
2020-08-08: 7733.999835949522
2020-08-09: 7936.619032408448
2020-08-10: 7810.465756158912
2020-08-11: 7107.359142784353
2020-08-12: 6518.5094581531075
2020-08-13: 7610.444953433124
2020-08-14: 8287.973341778847
2020-08-15: 8486.8268798293
2020-08-16: 8402.550439995857
2020-08-17: 8067.267396828777
2020-08-18: 8026.0209933707065
2020-08-19: 8630.728073671902
...

2020-08-19: 311.69585043948155
2020-08-20: 329.98772956675
2020-08-21: 346.8801413734324
2020-08-22: 363.87440353056144
2020-08-23: 392.71224851021157
2020-08-24: 417.816187891103
2020-08-25: 440.8604929373747
2020-08-26: 462.3952279613875
2020-08-27: 474.67094423910135
2020-08-28: 495.5504452054971
2020-08-29: 522.163249585814
2020-08-30: 547.7860027194814
2020-08-31: 572.2842933930839

Predicting for Dominica__nan
2020-08-01: 51.7982034139513
2020-08-02: 65.11596312576133
2020-08-03: 69.17953212024327
2020-08-04: 75.22186997725507
2020-08-05: 81.96232112175089
2020-08-06: 94.01787979454224
2020-08-07: 123.53995293001385
2020-08-08: 138.82739188106066
2020-08-09: 148.3555680804266
2020-08-10: 158.305475944244
2020-08-11: 169.5630812358499
2020-08-12: 186.0463589661917
2020-08-13: 208.1179520327616
2020-08-14: 224.42758466695938
2020-08-15: 237.63774151498572
2020-08-16: 250.96757929832808
2020-08-17: 265.8051094296616
2020-08-18: 278.19610307530616
2020-08-19: 296.8859123333905
...

2020-08-09: 501.52352140553216
2020-08-10: 499.25117309380676
2020-08-11: 479.68378671959385
2020-08-12: 471.8952559308829
2020-08-13: 545.8764056962186
2020-08-14: 594.5624822724396
2020-08-15: 611.228557640089
2020-08-16: 617.1664412683384
2020-08-17: 616.9005084390666
2020-08-18: 628.0369923736305
2020-08-19: 675.1383210457996
2020-08-20: 712.3780364157146
2020-08-21: 733.1340736609708
2020-08-22: 746.5151155782725
2020-08-23: 748.7452299166591
2020-08-24: 768.737390484572
2020-08-25: 804.5672410867804
2020-08-26: 836.0194493995712
2020-08-27: 850.3359593120435
2020-08-28: 865.4388321483277
2020-08-29: 879.4207488638787
2020-08-30: 902.958011041262
2020-08-31: 933.7864613401149

Predicting for Ethiopia__nan
2020-08-01: 475.6887437659903
2020-08-02: 540.0934329515015
2020-08-03: 531.6423398239799
2020-08-04: 540.9204574592596
2020-08-05: 463.576496448733
2020-08-06: 361.97549293237296
2020-08-07: 556.255331373598
2020-08-08: 619.5279235403174
2020-08-09: 631.4847397314572
...

2020-08-05: 269.636531617032
2020-08-06: 378.13087305484424
2020-08-07: 696.9403854544154
2020-08-08: 743.4421378680134
2020-08-09: 661.1872803444247
2020-08-10: 546.6661874250432
2020-08-11: 509.1537955977242
2020-08-12: 608.080201620378
2020-08-13: 767.187190114447
2020-08-14: 803.8582708978785
2020-08-15: 764.5674906620468
2020-08-16: 715.4260102404102
2020-08-17: 718.4737020120959
2020-08-18: 799.8304320299361
2020-08-19: 891.6345377827735
2020-08-20: 922.833652675074
2020-08-21: 912.9793047752893
2020-08-22: 902.4494483458576
2020-08-23: 935.6751439139535
2020-08-24: 999.8922859812609
2020-08-25: 1062.8229493186773
2020-08-26: 1094.5272093164795
2020-08-27: 1100.8785272248283
2020-08-28: 1114.7790177011157
2020-08-29: 1152.8705107340288
2020-08-30: 1206.9086819732483
2020-08-31: 1257.7229433634357

Predicting for United Kingdom__Wales
2020-08-01: 630.774158091883
2020-08-02: 596.608962283898
2020-08-03: 345.89839044257843
2020-08-04: 209.5741403293544
2020-08-05: 197.2156888253303

2020-08-21: 1019.0425895456835
2020-08-22: 1017.9400275368612
2020-08-23: 1035.8438121851796
2020-08-24: 1082.8628192063766
2020-08-25: 1143.8300006642735
2020-08-26: 1189.037858202787
2020-08-27: 1215.3863207634463
2020-08-28: 1234.4524246422961
2020-08-29: 1265.1810031882055
2020-08-30: 1311.9165515953014
2020-08-31: 1364.1953131241808

Predicting for Guam__nan
2020-08-01: 124.08217731375495
2020-08-02: 124.67729063607545
2020-08-03: 123.81755889278651
2020-08-04: 137.92364893537743
2020-08-05: 106.0748065319905
2020-08-06: 126.03260698930266
2020-08-07: 184.69201056066638
2020-08-08: 198.56830023427813
2020-08-09: 207.12085006089603
2020-08-10: 214.40221373061354
2020-08-11: 211.3718834060278
2020-08-12: 233.9130225170253
2020-08-13: 268.9444978801939
2020-08-14: 286.7150329768947
2020-08-15: 299.5146013406675
2020-08-16: 309.8007924609375
2020-08-17: 319.7946622042739
2020-08-18: 342.1579503579573
2020-08-19: 368.7547692579906
2020-08-20: 388.18958358952796
2020-08-21: 404.22664502

2020-08-20: 53232.36026538169
2020-08-21: 54189.22066431374
2020-08-22: 54014.94777798287
2020-08-23: 53427.48203059653
2020-08-24: 54238.43407747146
2020-08-25: 56607.7668827723
2020-08-26: 58496.06868957886
2020-08-27: 59476.67917378722
2020-08-28: 59817.87578607526
2020-08-29: 60138.962609205686
2020-08-30: 61283.00979022297
2020-08-31: 63106.014214320625

Predicting for Ireland__nan
2020-08-01: 364.30579840336156
2020-08-02: 433.94350470958847
2020-08-03: 410.9464241095023
2020-08-04: 415.7732867911983
2020-08-05: 362.8530778261977
2020-08-06: 292.46870824124716
2020-08-07: 446.65557062338803
2020-08-08: 502.42035351973755
2020-08-09: 506.7605769786408
2020-08-10: 507.8331788773692
2020-08-11: 482.74482546757
2020-08-12: 482.93589325816987
2020-08-13: 561.7588289163457
2020-08-14: 604.5151767353036
2020-08-15: 619.5496152447364
2020-08-16: 625.862199847865
2020-08-17: 624.7883065847335
2020-08-18: 645.6072693608749
2020-08-19: 695.7521872229887
2020-08-20: 731.2339500396706
...

2020-08-24: 2728.491544498261
2020-08-25: 2880.4522255538996
2020-08-26: 2987.5707557313112
2020-08-27: 3026.916562493362
2020-08-28: 3032.1463322662366
2020-08-29: 3059.0543512661575
2020-08-30: 3147.2578918044687
2020-08-31: 3262.505682541667

Predicting for Kazakhstan__nan
2020-08-01: 860.7930734118924
2020-08-02: 995.1131917470727
2020-08-03: 981.4679863659401
2020-08-04: 929.1230637224027
2020-08-05: 847.6186296281884
2020-08-06: 616.0745249576156
2020-08-07: 958.2966917404985
2020-08-08: 1076.231663087146
2020-08-09: 1084.0626656337631
2020-08-10: 1061.4324114002925
2020-08-11: 998.6284071694525
2020-08-12: 962.547809846896
2020-08-13: 1125.218306967468
2020-08-14: 1210.3679377839367
2020-08-15: 1231.5871000463694
2020-08-16: 1226.6546592576715
2020-08-17: 1206.6304257115466
2020-08-18: 1233.2876723486297
2020-08-19: 1329.4173720972708
2020-08-20: 1394.2580170715305
2020-08-21: 1423.73837417787
2020-08-22: 1435.8893441279047
2020-08-23: 1457.1919669399822
2020-08-24: 1501.3450082

2020-08-06: 96.90999076099338
2020-08-07: 130.3347468345887
2020-08-08: 145.54975350942766
2020-08-09: 154.0426896866046
2020-08-10: 161.31462118383286
2020-08-11: 171.91119305042793
2020-08-12: 190.30016814114308
2020-08-13: 214.2473710620668
2020-08-14: 230.64757284475235
2020-08-15: 243.0333116537052
2020-08-16: 255.09267878932684
2020-08-17: 269.82177482483655
2020-08-18: 294.84999696275077
2020-08-19: 317.25476666174137
2020-08-20: 335.5559729704495
2020-08-21: 351.7709375052418
2020-08-22: 368.1323240508764
2020-08-23: 397.10320456133974
2020-08-24: 422.94465869263627
2020-08-25: 446.55274086920997
2020-08-26: 468.09033541729116
2020-08-27: 488.65501938063204
2020-08-28: 511.3425259058751
2020-08-29: 539.0685743210056
2020-08-30: 566.0727370866858
2020-08-31: 591.8217537162841

Predicting for Libya__nan
2020-08-01: 501.50524414700243
2020-08-02: 653.2451998431362
2020-08-03: 645.2734818915861
2020-08-04: 494.85901131469535
2020-08-05: 722.900207653533
2020-08-06: 442.316009939677

2020-08-05: 1193.9100744449001
2020-08-06: 882.8689338567516
2020-08-07: 1402.5407135225219
2020-08-08: 1560.988938798846
2020-08-09: 1574.66491403091
2020-08-10: 1538.1968310425573
2020-08-11: 1408.3271836626086
2020-08-12: 1361.8061695007118
2020-08-13: 1604.5284411379198
2020-08-14: 1722.0400845993495
2020-08-15: 1750.0670940200675
2020-08-16: 1732.7344003745739
2020-08-17: 1685.3319102579135
2020-08-18: 1719.9244630758471
2020-08-19: 1858.2437474400426
2020-08-20: 1946.7191222545493
2020-08-21: 1983.299561735676
2020-08-22: 1990.3800224229217
2020-08-23: 2004.655015605499
2020-08-24: 2061.374881506535
2020-08-25: 2158.3120110349196
2020-08-26: 2232.764626499528
2020-08-27: 2269.4145272924084
2020-08-28: 2298.3541943960436
2020-08-29: 2336.6478989721763
2020-08-30: 2400.2131202411274
2020-08-31: 2480.356672568414

Predicting for Madagascar__nan
2020-08-01: 57.62516706602822
2020-08-02: 71.09396367037202
2020-08-03: 89.16959534967489
2020-08-04: 120.17291017148511
2020-08-05: 105.994

2020-08-27: 405.8922511803262
2020-08-28: 418.93542718574986
2020-08-29: 430.51672353898664
2020-08-30: 445.4817340286868
2020-08-31: 462.44490135862486

Predicting for Malawi__nan
2020-08-01: 53.168547873503066
2020-08-02: 64.8788584089395
2020-08-03: 69.33728753932056
2020-08-04: 74.2569271899664
2020-08-05: 79.60874437475806
2020-08-06: 93.69502083715423
2020-08-07: 123.83333817298825
2020-08-08: 138.59219646485548
2020-08-09: 148.01281564631006
2020-08-10: 157.27775542545103
2020-08-11: 168.19610478264002
2020-08-12: 185.5483437834772
2020-08-13: 207.91396846211393
2020-08-14: 224.02017259315863
2020-08-15: 237.04063873821121
2020-08-16: 250.01161951365117
2020-08-17: 264.7959234790738
2020-08-18: 283.34414611631996
2020-08-19: 303.54840130977146
2020-08-20: 321.05256195675497
2020-08-21: 336.99320409410063
2020-08-22: 353.1260723702679
2020-08-23: 370.7481454352082
2020-08-24: 390.4828737153867
2020-08-25: 416.63377275509356
2020-08-26: 437.2459555542653
2020-08-27: 452.1403824872

2020-08-01: 53.28586777151234
2020-08-02: 66.6604889216609
2020-08-03: 74.03806921241909
2020-08-04: 79.87234379089087
2020-08-05: 82.21228611622782
2020-08-06: 95.06694741208037
2020-08-07: 125.40140024431318
2020-08-08: 141.35346193735262
2020-08-09: 152.28866809814843
2020-08-10: 161.66858730682395
2020-08-11: 171.19808537370983
2020-08-12: 187.89851592124006
2020-08-13: 210.47774845860374
2020-08-14: 227.36487264687366
2020-08-15: 241.14721226401696
2020-08-16: 254.067404091643
2020-08-17: 268.19447902012394
2020-08-18: 280.67206977772287
2020-08-19: 299.6996552905545
2020-08-20: 317.0679766581273
2020-08-21: 332.7796045000549
2020-08-22: 348.2469512379231
2020-08-23: 354.29948230677485
2020-08-24: 368.29568409326293
2020-08-25: 386.09475491374076
2020-08-26: 403.4651011148581
2020-08-27: 411.5090342287922
2020-08-28: 424.5202075970808
2020-08-29: 453.32102311958465
2020-08-30: 472.46754800236783
2020-08-31: 491.5170240269976

Predicting for Oman__nan
2020-08-01: 318.8969270756217


2020-08-30: 2070.04533728604
2020-08-31: 2131.8457881995428

Predicting for Portugal__nan
2020-08-01: 5245.07550013035
2020-08-02: 6504.6898437221
2020-08-03: 6421.9287157981025
2020-08-04: 6175.204465364295
2020-08-05: 4728.64895930697
2020-08-06: 3474.240475514838
2020-08-07: 5649.3714662438
2020-08-08: 6529.72910865982
2020-08-09: 6591.937285198139
2020-08-10: 6306.658150173905
2020-08-11: 5522.983887087538
2020-08-12: 5297.460862294913
2020-08-13: 6331.913336390916
2020-08-14: 6898.556580085439
2020-08-15: 6991.992828377731
2020-08-16: 6806.681110404468
2020-08-17: 6474.114540289422
2020-08-18: 6555.625081835126
2020-08-19: 7127.628156602298
2020-08-20: 7504.727447264262
2020-08-21: 7611.892662702024
2020-08-22: 7547.999063493065
2020-08-23: 7479.3632974487555
2020-08-24: 7649.041221236722
2020-08-25: 8021.101757867187
2020-08-26: 8299.737528518195
2020-08-27: 8426.896005957207
2020-08-28: 8464.679310561522
2020-08-29: 8528.22396875585
2020-08-30: 8725.890035988317
2020-08-31: 9010

2020-08-19: 598.9231224661389
2020-08-20: 629.9957221842934
2020-08-21: 647.558906355822
2020-08-22: 660.366750948257
2020-08-23: 675.7666235095454
2020-08-24: 702.6947679614746
2020-08-25: 738.0105933557143
2020-08-26: 767.1745821447299
2020-08-27: 784.9596614332353
2020-08-28: 803.7496633154877
2020-08-29: 825.8927982774392
2020-08-30: 854.6246196979549
2020-08-31: 886.8619253645586

Predicting for Sudan__nan
2020-08-01: 95.78597191242108
2020-08-02: 199.90857116962422
2020-08-03: 333.66844386626605
2020-08-04: 215.01657565826704
2020-08-05: 274.5092519328296
2020-08-06: 183.439770666126
2020-08-07: 228.78105393663895
2020-08-08: 306.91101037988125
2020-08-09: 358.1801145854846
2020-08-10: 333.2245534122936
2020-08-11: 344.22422788035834
2020-08-12: 323.6452674047933
2020-08-13: 357.3808348984985
2020-08-14: 409.16110768992996
2020-08-15: 439.67441099894205
2020-08-16: 442.2626449950226
2020-08-17: 450.37615317249595
2020-08-18: 461.69183138638147
2020-08-19: 492.83387990447443
...

2020-08-01: 4879.807859123461
2020-08-02: 5950.673492487485
2020-08-03: 6027.038238168422
2020-08-04: 5765.025634486872
2020-08-05: 4640.37337921774
2020-08-06: 3292.6782094730306
2020-08-07: 5268.910443395446
2020-08-08: 6069.652734433956
2020-08-09: 6181.381516333652
2020-08-10: 5947.908565451672
2020-08-11: 5282.996795360411
2020-08-12: 5003.433005096119
2020-08-13: 5932.9100293103465
2020-08-14: 6458.555017927562
2020-08-15: 6572.92224715959
2020-08-16: 6428.6239640818885
2020-08-17: 6138.371231347586
2020-08-18: 6184.738162981416
2020-08-19: 6699.557222702489
2020-08-20: 7052.96011960426
2020-08-21: 7169.614478607529
2020-08-22: 7127.946416036369
2020-08-23: 7071.085291259593
2020-08-24: 7216.177441250721
2020-08-25: 7554.779952284956
2020-08-26: 7817.870665620192
2020-08-27: 7943.222444995469
2020-08-28: 7988.328362114916
2020-08-29: 8050.329708802553
2020-08-30: 8228.911602117063
2020-08-31: 8491.312837096342

Predicting for South Sudan__nan
2020-08-01: 52.825680974699374
...

2020-08-14: 305.88981570754925
2020-08-15: 320.9980239349668
2020-08-16: 332.2114375107888
2020-08-17: 343.02395210369366
2020-08-18: 367.2244134583468
2020-08-19: 394.9784199979542
2020-08-20: 417.55499683374944
2020-08-21: 435.79875868392696
2020-08-22: 451.9692120778845
2020-08-23: 479.7110177277516
2020-08-24: 506.4364223934811
2020-08-25: 533.5753194971597
2020-08-26: 558.2296009356604
2020-08-27: 576.2472249836082
2020-08-28: 598.4387741195264
2020-08-29: 625.9907478776619
2020-08-30: 654.0886877236566
2020-08-31: 682.1340231648778

Predicting for Chad__nan
2020-08-01: 56.366536606046395
2020-08-02: 68.20962235573671
2020-08-03: 74.5470028565795
2020-08-04: 80.93079383285254
2020-08-05: 85.85226834614242
2020-08-06: 96.79484161690384
2020-08-07: 127.79575851985317
2020-08-08: 143.12406955683093
2020-08-09: 153.62536761292603
2020-08-10: 163.60395112607327
2020-08-11: 173.96866234248972
2020-08-12: 190.1407261028745
2020-08-13: 212.8080979055399
2020-08-14: 229.41462991936606
...

2020-08-02: 4831.6802393161
2020-08-03: 5209.073299080485
2020-08-04: 5557.65443601043
2020-08-05: 4861.082368216803
2020-08-06: 2998.4080160290196
2020-08-07: 4470.594239907382
2020-08-08: 5196.619806030212
2020-08-09: 5518.589459264537
2020-08-10: 5627.6007689678645
2020-08-11: 5097.23691778288
2020-08-12: 4542.552539006387
2020-08-13: 5211.320913180293
2020-08-14: 5702.278699428715
2020-08-15: 5950.342952396639
2020-08-16: 5981.730025133888
2020-08-17: 5722.270309395157
2020-08-18: 5609.250133905842
2020-08-19: 5985.352567729031
2020-08-20: 6323.386489884687
2020-08-21: 6516.779510168178
2020-08-22: 6557.606868429083
2020-08-23: 6484.55233313216
2020-08-24: 6540.1481180093615
2020-08-25: 6802.3529610561245
2020-08-26: 7056.069751396358
2020-08-27: 7223.9806708968
2020-08-28: 7300.977040834843
2020-08-29: 7350.892400383025
2020-08-30: 7474.7182550083335
2020-08-31: 7692.453989707956

Predicting for Taiwan__nan
2020-08-01: 50.66886569838578
2020-08-02: 62.74285598097143
2020-08-03: 68

2020-08-01: 1742.9683401865364
2020-08-02: 1938.5864355246836
2020-08-03: 1796.5636155243674
2020-08-04: 1517.96140905014
2020-08-05: 1191.6754215898845
2020-08-06: 1032.756634934276
2020-08-07: 1756.960986184491
2020-08-08: 1939.5528992877948
2020-08-09: 1867.77852940088
2020-08-10: 1707.5870685500254
2020-08-11: 1542.8479855542364
2020-08-12: 1588.2034050320926
2020-08-13: 1932.6662669579514
2020-08-14: 2061.786773600069
2020-08-15: 2037.8010012296277
2020-08-16: 1959.1226868906756
2020-08-17: 1905.6914450129516
2020-08-18: 1995.1528901779611
2020-08-19: 2186.2411239658136
2020-08-20: 2280.037373286172
2020-08-21: 2287.919576750318
2020-08-22: 2267.0124141765004
2020-08-23: 2285.802887403069
2020-08-24: 2376.0262585862624
2020-08-25: 2501.9539442157948
2020-08-26: 2579.8897274690726
2020-08-27: 2607.105529055535
2020-08-28: 2626.6165765502456
2020-08-29: 2673.3522673598554
2020-08-30: 2759.0120938836
2020-08-31: 2857.1839972011976

Predicting for United States__Arizona
2020-08-01: 34

2020-08-26: 4848.40915304848
2020-08-27: 4895.059516332788
2020-08-28: 4880.016861100956
2020-08-29: 4921.565312798
2020-08-30: 5063.15074885181
2020-08-31: 5251.316051200994

Predicting for United States__Hawaii
2020-08-01: 147.303529953964
2020-08-02: 187.98832472891712
2020-08-03: 207.61854517749376
2020-08-04: 198.5026339120221
2020-08-05: 180.299849988765
2020-08-06: 161.61701052688315
2020-08-07: 230.138207064943
2020-08-08: 266.0251079187042
2020-08-09: 282.46494145843025
2020-08-10: 283.96828144918305
2020-08-11: 279.64620613420084
2020-08-12: 288.6133973757801
2020-08-13: 329.4714747388705
2020-08-14: 358.8892193950584
2020-08-15: 375.98679298147033
2020-08-16: 384.6983430949689
2020-08-17: 392.16755013829834
2020-08-18: 415.90356494034376
2020-08-19: 447.91302928655773
2020-08-20: 474.4986608614442
2020-08-21: 494.02861230634585
2020-08-22: 509.1591130872684
2020-08-23: 535.6764824630108
2020-08-24: 563.0981740106226
2020-08-25: 593.0828468230163
2020-08-26: 620.3512448054067

2020-08-06: 1483.0189206146015
2020-08-07: 2399.6677991005577
2020-08-08: 2766.6732320432707
2020-08-09: 2889.9388088512746
2020-08-10: 2701.004935727842
2020-08-11: 2345.6604173232017
2020-08-12: 2289.3126755760377
2020-08-13: 2733.3827485478746
2020-08-14: 2989.6698637224117
2020-08-15: 3062.532804689069
2020-08-16: 2960.6696258495726
2020-08-17: 2815.5886244501626
2020-08-18: 2872.9504476780203
2020-08-19: 3126.980912523538
2020-08-20: 3303.4146445642878
2020-08-21: 3364.297835765386
2020-08-22: 3332.819388120344
2020-08-23: 3314.031947190172
2020-08-24: 3402.7622588384793
2020-08-25: 3574.3363832627306
2020-08-26: 3707.7398504606517
2020-08-27: 3769.3190656645384
2020-08-28: 3790.2433203017135
2020-08-29: 3828.8007512152617
2020-08-30: 3927.4567326321685
2020-08-31: 4062.52779890146

Predicting for United States__Maryland
2020-08-01: 2249.776511153064
2020-08-02: 2494.1037955749234
2020-08-03: 2601.29002756154
2020-08-04: 2272.811573168994
2020-08-05: 1764.4852807947173
2020-08-06:

2020-08-25: 4909.409795106107
2020-08-26: 5076.7897327936835
2020-08-27: 5161.342696610936
2020-08-28: 5193.796271775156
2020-08-29: 5239.8913671140535
2020-08-30: 5368.377286284341
2020-08-31: 5544.443207780286

Predicting for United States__North Dakota
2020-08-01: 1170.8571794808772
2020-08-02: 1396.3826150977816
2020-08-03: 1406.0925244185614
2020-08-04: 1198.8546598892663
2020-08-05: 890.5827664600955
2020-08-06: 760.5944293045561
2020-08-07: 1262.1368420081267
2020-08-08: 1443.9900636375005
2020-08-09: 1445.726931710697
2020-08-10: 1324.729018815272
2020-08-11: 1174.2615151388409
2020-08-12: 1194.4553953913955
2020-08-13: 1442.7958985983996
2020-08-14: 1566.3620568088527
2020-08-15: 1575.9554656994837
2020-08-16: 1517.2488121753513
2020-08-17: 1465.1198397423796
2020-08-18: 1520.241420723084
2020-08-19: 1663.1197665292934
2020-08-20: 1749.641351650425
2020-08-21: 1770.058813853272
2020-08-22: 1755.488685550038
2020-08-23: 1756.5158262787822
2020-08-24: 1817.7176479478421
2020-08-

2020-08-09: 3647.40335730749
2020-08-10: 3617.7859955334047
2020-08-11: 3265.843201987897
2020-08-12: 2979.3677992048833
2020-08-13: 3450.0722937028777
2020-08-14: 3778.3315775700366
2020-08-15: 3922.7596676349294
2020-08-16: 3900.787531715707
2020-08-17: 3736.7239814707864
2020-08-18: 3701.871675251731
2020-08-19: 3969.743288581708
2020-08-20: 4195.27613885424
2020-08-21: 4309.9515275049125
2020-08-22: 4320.42691304778
2020-08-23: 4280.673331171841
2020-08-24: 4335.697482334576
2020-08-25: 4520.862331059387
2020-08-26: 4690.046268521307
2020-08-27: 4790.092935947018
2020-08-28: 4834.942126620455
2020-08-29: 4868.0071917592295
2020-08-30: 4959.499667801123
2020-08-31: 5110.208748556116

Predicting for United States__Oregon
2020-08-01: 1116.7258124758787
2020-08-02: 1384.0466638150397
2020-08-03: 1491.091532448489
2020-08-04: 1469.9379613209576
2020-08-05: 1153.2462692395475
2020-08-06: 831.2843101597376
2020-08-07: 1290.0120646044566
2020-08-08: 1501.5114244896176
2020-08-09: 1577.5158

2020-08-01: 1918.6015055243088
2020-08-02: 2488.0418336945045
2020-08-03: 2429.351878362959
2020-08-04: 2529.6738649946155
2020-08-05: 2492.490692005312
2020-08-06: 1512.7867687267508
2020-08-07: 2243.098688704909
2020-08-08: 2609.4181754541696
2020-08-09: 2661.2777798463785
2020-08-10: 2721.4910712764026
2020-08-11: 2577.9232858612368
2020-08-12: 2300.389185840199
2020-08-13: 2633.126873042986
2020-08-14: 2864.709596222011
2020-08-15: 2945.8172364980774
2020-08-16: 2979.22665347439
2020-08-17: 2903.8085248485127
2020-08-18: 2857.0817253977693
2020-08-19: 3045.242882266749
2020-08-20: 3204.174659898096
2020-08-21: 3287.333767158657
2020-08-22: 3324.110020359441
2020-08-23: 3316.09001291968
2020-08-24: 3351.5061444335342
2020-08-25: 3484.113040327904
2020-08-26: 3607.2980756073575
2020-08-27: 3685.945676712786
2020-08-28: 3736.906436110513
2020-08-29: 3773.7344528323406
2020-08-30: 3842.2499808384823
2020-08-31: 3953.732974768483

Predicting for United States__Virgin Islands
2020-08-01:

2020-08-12: 516.2235918738098
2020-08-13: 555.3558183398089
2020-08-14: 583.164827405023
2020-08-15: 618.4395710114702
2020-08-16: 679.7069326441557
2020-08-17: 692.4459851898172
2020-08-18: 688.1092702330129
2020-08-19: 717.199001860919
2020-08-20: 748.3362308437552
2020-08-21: 786.5652991150529
2020-08-22: 828.7839391311969
2020-08-23: 868.8677865854276
2020-08-24: 889.5389533172598
2020-08-25: 921.2731715262038
2020-08-26: 957.2060201924364
2020-08-27: 997.2262680158427
2020-08-28: 1039.3594279589024
2020-08-29: 1078.001306748843
2020-08-30: 1110.3270456079763
2020-08-31: 1147.1000517895318

Predicting for United States Virgin Islands__nan
2020-08-01: 59.77169366343009
2020-08-02: 71.82302052047893
2020-08-03: 79.3330859865514
2020-08-04: 84.2525822660838
2020-08-05: 89.10399209240165
2020-08-06: 99.03309169838866
2020-08-07: 131.2920664417735
2020-08-08: 147.12704748920896
2020-08-09: 157.93717518225836
2020-08-10: 167.38637526662788
2020-08-11: 177.49740586526502
2020-08-12: 193.4

In [23]:
# Check the predictions
preds_df.head()

| | CountryName | RegionName | Date | PredictedDailyNewCases |
|---|---|---|---|---|
| 213 | Aruba | | 2020-08-01 | 58.834154 |
| 214 | Aruba | | 2020-08-02 | 71.325152 |
| 215 | Aruba | | 2020-08-03 | 78.992903 |
| 216 | Aruba | | 2020-08-04 | 90.241488 |
| 217 | Aruba | | 2020-08-05 | 87.814744 |


# Validation
This is how the predictor will be called during the competition.  
!!! PLEASE DO NOT CHANGE THE API !!!

In [23]:
!python predict.py -s 2020-08-01 -e 2020-08-04 -ip data/2020-09-30_historical_ip_new.csv -o predictions/2020-08-01_2020-08-04.csv

Generating predictions from 2020-08-01 to 2020-08-04...
Saved predictions to predictions/2020-08-01_2020-08-04.csv
Done!


In [25]:
!head predictions/2020-08-01_2020-08-04.csv

'head' is not recognized as an internal or external command,
operable program or batch file.
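
The `head` shell command is not available in the Windows command prompt, which is why the cell above fails. A portable alternative, sketched here with only the Python standard library, is to read the first few lines directly. `preview_file` is a hypothetical helper name and is not part of the competition code.

```python
from itertools import islice
from pathlib import Path


def preview_file(path, n=10):
    """Return the first n lines of a text file, without trailing newlines."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in islice(f, n)]


# Path taken from the cell above; it only exists after predict.py has run.
predictions_file = Path("predictions/2020-08-01_2020-08-04.csv")
if predictions_file.exists():
    for line in preview_file(predictions_file):
        print(line)
```

Because it does not shell out, this works the same way on Windows, macOS, and Linux.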


# Test cases
Now that we can generate a prediction file, let's validate a few cases.

In [29]:
import os
from predictor_validation import validate_submission

def validate(start_date, end_date, ip_file, output_file):
    # First, delete any potential old file
    try:
        os.remove(output_file)
    except OSError:
        pass
    
    # Then generate the prediction, calling the official API
    !python predict.py -s {start_date} -e {end_date} -ip {ip_file} -o {output_file}
    
    # And validate it
    errors = validate_submission(start_date, end_date, ip_file, output_file)
    if errors:
        for error in errors:
            print(error)
    else:
        print("All good!")
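
The `!python ...` line above is IPython shell syntax, so `validate` only runs inside a notebook. If the same check is needed from a plain Python script, one possible approach is to call `predict.py` through `subprocess`. The sketch below reuses the flags shown above (`-s`, `-e`, `-ip`, `-o`); the helper names `build_predict_command` and `run_predict` are hypothetical.

```python
import subprocess
import sys


def build_predict_command(start_date, end_date, ip_file, output_file):
    """Build the argument list for the predict.py call shown above."""
    return [sys.executable, "predict.py",
            "-s", start_date,
            "-e", end_date,
            "-ip", ip_file,
            "-o", output_file]


def run_predict(start_date, end_date, ip_file, output_file):
    """Run predict.py as a subprocess; raises CalledProcessError on failure."""
    cmd = build_predict_command(start_date, end_date, ip_file, output_file)
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.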

## 4 days, no gap
- All countries and regions
- The official number of cases is known up to `start_date`
- Intervention Plans are the official ones

In [32]:
validate(start_date="2020-08-01",
         end_date="2020-08-04",
         ip_file="data/2020-09-30_historical_ip_new.csv",
         output_file="predictions/val_4_days.csv")

Generating predictions from 2020-08-01 to 2020-08-04...
Saved predictions to predictions/val_4_days.csv
Done!
All good!


## 1 month in the future
- 2 countries only
- There is a gap between the date of the last known number of cases and `start_date`
- For future dates, the Intervention Plans contain the scenarios for which predictions are requested, to answer the question: what will happen if we apply these plans?
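
To make the gap concrete, here is a minimal sketch that counts the days for which no official case data exists before predictions start. The helper name `gap_in_days` and the example dates are assumptions for illustration only; the actual last known date depends on the downloaded data file.

```python
from datetime import date


def gap_in_days(last_known_date, start_date):
    """Number of days strictly between last_known_date and start_date."""
    return max((start_date - last_known_date).days - 1, 0)


# Hypothetical example: data ends on 2020-11-24, predictions start 2021-01-01.
print(gap_in_days(date(2020, 11, 24), date(2021, 1, 1)))
```

A gap of zero corresponds to the "no gap" case above, where `start_date` is the day right after the last known data point.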

In [34]:
%%time
validate(start_date="2021-01-01",
         end_date="2021-01-31",
         ip_file="validation/data/future_ip.csv",
         output_file="predictions/val_1_month_future.csv")

Generating predictions from 2021-01-01 to 2021-01-31...
Saved predictions to predictions/val_1_month_future.csv
Done!
All good!
Wall time: 3.1 s


## 180 days, from a future date, all countries and regions
- Prediction start date is 1 week from now (i.e., assuming the submission date is 1 week from now).  
- Prediction end date is 6 months after start date.  
- Prediction is requested for all available countries and regions.  
- Intervention plan scenario: freeze last known intervention plans for each country and region.  

The number of cases between today and the start date is not known yet, but the model relies on it as input. The model therefore has to predict those cases first and then feed its own predictions back in.  
This is the most demanding test. It should take less than 1 hour to generate the prediction file.
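
The idea of feeding predictions back into the model can be sketched as follows. This is a simplified illustration, not the code in `predict.py`: it uses only past cases as inputs (the real model also uses NPI data), and `roll_out` and its `weights` argument are hypothetical.

```python
import numpy as np


def roll_out(past_cases, weights, n_days):
    """Predict n_days ahead, feeding each prediction back in as an input.

    past_cases: the most recent known daily case counts (oldest first).
    weights: one hypothetical linear weight per day in the lookback window.
    """
    window = list(past_cases[-len(weights):])
    predictions = []
    for _ in range(n_days):
        # Linear prediction from the current window, clipped at zero cases.
        next_cases = max(float(np.dot(weights, window)), 0.0)
        predictions.append(next_cases)
        # Slide the window: drop the oldest day, append the new prediction.
        window = window[1:] + [next_cases]
    return predictions


# Toy example: a 3-day moving average rolled forward 2 days.
print(roll_out([10.0, 20.0, 30.0], weights=[1/3, 1/3, 1/3], n_days=2))
```

Since every step depends on the previous one, the cost grows with the number of days and regions, which is why the 180-day, all-regions case is the slowest.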

### Generate the scenario

In [35]:
from datetime import datetime, timedelta

start_date = datetime.now() + timedelta(days=7)
start_date_str = start_date.strftime('%Y-%m-%d')
end_date = start_date + timedelta(days=180)
end_date_str = end_date.strftime('%Y-%m-%d')
print(f"Start date: {start_date_str}")
print(f"End date: {end_date_str}")

Start date: 2020-12-01
End date: 2021-05-30


In [37]:
from validation.scenario_generator import get_raw_data, generate_scenario, NPI_COLUMNS
DATA_FILE = 'data/OxCGRT_latest.csv'
latest_df = get_raw_data(DATA_FILE, latest=True)
scenario_df = generate_scenario(start_date_str, end_date_str, latest_df, countries=None, scenario="Freeze")
scenario_file = "predictions/180_days_future_scenario.csv"
scenario_df.to_csv(scenario_file, index=False)
print(f"Saved scenario to {scenario_file}")

Saved scenario to predictions/180_days_future_scenario.csv
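
As a rough illustration of what a "Freeze" scenario means, the sketch below repeats each region's last known NPI values unchanged up to the end date. It is an assumption about the behaviour, not the implementation of `generate_scenario`; `freeze_last_known` is a hypothetical helper, and the column names are taken from the data shown earlier.

```python
import pandas as pd


def freeze_last_known(ip_df, end_date, npi_columns):
    """Extend each region's last known NPI values, unchanged, up to end_date."""
    # Group by a "Country__Region" key, like the labels in the prediction log.
    key = ip_df["CountryName"] + "__" + ip_df["RegionName"].fillna("nan").astype(str)
    frozen = []
    for _, group in ip_df.groupby(key):
        group = group.sort_values("Date")
        last_row = group.iloc[-1]
        future_dates = pd.date_range(last_row["Date"] + pd.Timedelta(days=1), end_date)
        if len(future_dates) == 0:
            # The data already reaches end_date; nothing to add.
            frozen.append(group)
            continue
        future = pd.DataFrame({"Date": future_dates})
        future["CountryName"] = last_row["CountryName"]
        future["RegionName"] = last_row["RegionName"]
        for col in npi_columns:
            future[col] = last_row[col]  # same value on every future day
        frozen.append(pd.concat([group, future], ignore_index=True))
    return pd.concat(frozen, ignore_index=True)
```

The resulting frame has one row per region and day, so it can be written to CSV and passed as an intervention plan file in the same way as `scenario_file` above.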


### Check it

In [None]:
%%time
validate(start_date=start_date_str,
         end_date=end_date_str,
         ip_file=scenario_file,
         output_file="predictions/val_6_month_future.csv")