# Example Predictor: Linear Rollout Predictor

This example contains basic functionality for training and evaluating a linear predictor that rolls out predictions day-by-day.

First, a training data set is created from historical case and NPI (non-pharmaceutical intervention) data.

Second, a linear model is trained to predict future cases from prior case data along with prior and future NPI data.
The model is an off-the-shelf sklearn Lasso model that uses a positive weight constraint, applied to negated NPI values, to enforce the assumption that stronger NPIs are negatively correlated with future cases.

Third, a sample evaluation set is created, and the predictor is applied to this evaluation set to produce prediction results in the correct format.
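
The day-by-day rollout described above can be sketched in plain Python. This is only an illustration and not the code in `predict.py`: the function name `rollout`, the toy weights, and the three-day lookback are assumptions made for the example, and NPI inputs are left out for brevity.

```python
def rollout(past_cases, weights, intercept, nb_days):
    """Predict nb_days ahead, feeding each prediction back in as input.

    Hypothetical helper for illustration: weights[i] multiplies the
    i-th oldest day in the lookback window.
    """
    lookback = len(weights)
    history = list(past_cases)
    preds = []
    for _ in range(nb_days):
        window = history[-lookback:]
        pred = intercept + sum(w * c for w, c in zip(weights, window))
        pred = max(pred, 0.0)  # never predict negative cases
        preds.append(pred)
        history.append(pred)   # the prediction becomes the next day's input
    return preds


# Toy numbers, not learned values
preds = rollout(past_cases=[10.0, 20.0, 30.0],
                weights=[0.0, 0.5, 0.5],
                intercept=1.0,
                nb_days=3)
print(preds)  # [26.0, 29.0, 28.5]
```

Because each prediction is appended to the history, errors can compound over longer horizons.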

## Training

In [1]:
import pickle
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

### Copy the data locally

In [2]:
# Main source for the training data
DATA_URL = 'https://raw.githubusercontent.com/OxCGRT/covid-policy-tracker/master/data/OxCGRT_latest.csv'
# Local file
DATA_FILE = 'data/OxCGRT_latest.csv'

In [3]:
import os
import urllib.request
if not os.path.exists('data'):
    os.mkdir('data')
urllib.request.urlretrieve(DATA_URL, DATA_FILE)

('data/OxCGRT_latest.csv', <http.client.HTTPMessage at 0x20268472fa0>)

In [4]:
# Load historical data from local file
df = pd.read_csv(DATA_FILE,
                 parse_dates=['Date'],
                 encoding="ISO-8859-1",
                 dtype={"RegionName": str,
                        "RegionCode": str},
                 # pandas >= 1.3; older versions used error_bad_lines=False
                 on_bad_lines='skip')

In [5]:
df.columns

Index(['CountryName', 'CountryCode', 'RegionName', 'RegionCode',
       'Jurisdiction', 'Date', 'C1_School closing', 'C1_Flag',
       'C2_Workplace closing', 'C2_Flag', 'C3_Cancel public events', 'C3_Flag',
       'C4_Restrictions on gatherings', 'C4_Flag', 'C5_Close public transport',
       'C5_Flag', 'C6_Stay at home requirements', 'C6_Flag',
       'C7_Restrictions on internal movement', 'C7_Flag',
       'C8_International travel controls', 'E1_Income support', 'E1_Flag',
       'E2_Debt/contract relief', 'E3_Fiscal measures',
       'E4_International support', 'H1_Public information campaigns',
       'H1_Flag', 'H2_Testing policy', 'H3_Contact tracing',
       'H4_Emergency investment in healthcare', 'H5_Investment in vaccines',
       'H6_Facial Coverings', 'H6_Flag', 'H7_Vaccination policy', 'H7_Flag',
       'M1_Wildcard', 'ConfirmedCases', 'ConfirmedDeaths', 'StringencyIndex',
       'StringencyIndexForDisplay', 'StringencyLegacyIndex',
       'StringencyLegacyIndexForDispla

In [6]:
# For testing, restrict training data to dates on or before a hypothetical predictor submission date
HYPOTHETICAL_SUBMISSION_DATE = np.datetime64("2020-11-30")
df = df[df.Date <= HYPOTHETICAL_SUBMISSION_DATE]

In [7]:
# Add GeoID column that combines CountryName and RegionName for easier manipulation of data
df['GeoID'] = df['CountryName'] + '__' + df['RegionName'].astype(str)

In [8]:
# Add new cases column
df['NewCases'] = df.groupby('GeoID').ConfirmedCases.diff().fillna(0)
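
To make the per-region differencing concrete, here is a plain-Python sketch of what the `groupby(...).diff().fillna(0)` call computes on a toy input. The helper name `new_cases_per_region` and the sample rows are assumptions for illustration, and missing cumulative values are not handled here.

```python
def new_cases_per_region(rows):
    """rows: list of (geo_id, cumulative_cases), ordered by date within each region."""
    last_seen = {}
    result = []
    for geo_id, cumulative in rows:
        if geo_id in last_seen:
            result.append(cumulative - last_seen[geo_id])
        else:
            # First day of a region has no previous value, like fillna(0)
            result.append(0.0)
        last_seen[geo_id] = cumulative
    return result


# Toy cumulative counts for two hypothetical regions
rows = [("A", 5.0), ("A", 8.0), ("A", 8.0), ("B", 100.0), ("B", 130.0)]
daily = new_cases_per_region(rows)
print(daily)  # [0.0, 3.0, 0.0, 0.0, 30.0]
```

The first row of region B is 0 rather than 100 - 8, which is why the difference is taken within each `GeoID` and not across the whole column.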

In [9]:
# Keep only columns of interest
id_cols = ['CountryName',
           'RegionName',
           'GeoID',
           'Date']
cases_col = ['NewCases']
npi_cols = ['C1_School closing',
            'C2_Workplace closing',
            'C3_Cancel public events',
            'C4_Restrictions on gatherings',
            'C5_Close public transport',
            'C6_Stay at home requirements',
            'C7_Restrictions on internal movement',
            'C8_International travel controls',
            'H1_Public information campaigns',
            'H2_Testing policy',
            'H3_Contact tracing',
            'H6_Facial Coverings']
df = df[id_cols + cases_col + npi_cols]

In [10]:
# Fill any missing case values by interpolation and setting NaNs to 0
df.update(df.groupby('GeoID').NewCases.apply(
    lambda group: group.interpolate()).fillna(0))

In [11]:
# Fill any missing NPIs by assuming they are the same as previous day
for npi_col in npi_cols:
    df.update(df.groupby('GeoID')[npi_col].ffill().fillna(0))
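
The fill rule for NPIs (carry the previous day forward, then use 0 where no earlier value exists) can be illustrated for a single region in plain Python. The helper name `ffill_then_zero` and the use of `None` for missing values are assumptions for this sketch.

```python
def ffill_then_zero(values):
    """Forward-fill missing entries (None), then replace leading gaps with 0."""
    filled = []
    last = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last if last is not None else 0.0)
    return filled


# Toy NPI levels for one hypothetical region
filled = ffill_then_zero([None, 1.0, None, None, 2.0, None])
print(filled)  # [0.0, 1.0, 1.0, 1.0, 2.0, 2.0]
```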

In [12]:
df

Unnamed: 0,CountryName,RegionName,GeoID,Date,NewCases,C1_School closing,C2_Workplace closing,C3_Cancel public events,C4_Restrictions on gatherings,C5_Close public transport,C6_Stay at home requirements,C7_Restrictions on internal movement,C8_International travel controls,H1_Public information campaigns,H2_Testing policy,H3_Contact tracing,H6_Facial Coverings
0,Aruba,,Aruba__nan,2020-01-01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Aruba,,Aruba__nan,2020-01-02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Aruba,,Aruba__nan,2020-01-03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Aruba,,Aruba__nan,2020-01-04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Aruba,,Aruba__nan,2020-01-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
98817,Zimbabwe,,Zimbabwe__nan,2020-11-26,115.0,2.0,1.0,2.0,3.0,1.0,2.0,2.0,2.0,2.0,1.0,1.0,3.0
98818,Zimbabwe,,Zimbabwe__nan,2020-11-27,91.0,2.0,1.0,2.0,3.0,1.0,2.0,2.0,2.0,2.0,1.0,1.0,3.0
98819,Zimbabwe,,Zimbabwe__nan,2020-11-28,108.0,2.0,1.0,2.0,3.0,1.0,2.0,2.0,2.0,2.0,1.0,1.0,3.0
98820,Zimbabwe,,Zimbabwe__nan,2020-11-29,0.0,2.0,1.0,2.0,3.0,1.0,2.0,2.0,2.0,2.0,1.0,1.0,3.0


In [13]:
# Set number of past days to use to make predictions
nb_lookback_days = 30

# Create training data across all countries for predicting one day ahead
X_cols = cases_col + npi_cols
y_col = cases_col
X_samples = []
y_samples = []
geo_ids = df.GeoID.unique()
for g in geo_ids:
    gdf = df[df.GeoID == g]
    all_case_data = np.array(gdf[cases_col])
    all_npi_data = np.array(gdf[npi_cols])

    # Create one sample for each day where we have enough data
    # Each sample consists of cases and npis for previous nb_lookback_days
    nb_total_days = len(gdf)
    for d in range(nb_lookback_days, nb_total_days - 1):
        X_cases = all_case_data[d-nb_lookback_days:d]

        # Take negative of npis to support positive
        # weight constraint in Lasso.
        X_npis = -all_npi_data[d - nb_lookback_days:d]

        # Flatten all input data so it fits Lasso input format.
        X_sample = np.concatenate([X_cases.flatten(),
                                   X_npis.flatten()])
        y_sample = all_case_data[d]
        X_samples.append(X_sample)
        y_samples.append(y_sample)

X_samples = np.array(X_samples)
y_samples = np.array(y_samples).flatten()
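
The layout of each flattened sample is easier to see on a toy input. The sketch below mirrors the loop above in plain Python with a lookback of 2 days and 2 NPIs; the function name `build_samples` and the toy numbers are assumptions for illustration.

```python
def build_samples(cases, npis, lookback):
    """cases: list of daily case counts; npis: list of per-day lists of NPI levels."""
    X, y = [], []
    for d in range(lookback, len(cases) - 1):
        x_cases = cases[d - lookback:d]
        # Negate NPIs so that positive weights mean NPIs reduce cases
        x_npis = [-level for day in npis[d - lookback:d] for level in day]
        X.append(x_cases + x_npis)  # cases first, then NPIs day by day
        y.append(cases[d])
    return X, y


cases = [1.0, 2.0, 3.0, 4.0, 5.0]
npis = [[0, 1], [1, 1], [2, 0], [2, 2], [3, 1]]
X, y = build_samples(cases, npis, lookback=2)
print(X)  # [[1.0, 2.0, 0, -1, -1, -1], [2.0, 3.0, -1, -1, -2, 0]]
print(y)  # [3.0, 4.0]
```

With the notebook's settings (30 lookback days, 1 case column, 12 NPI columns), each sample has 30 * (1 + 12) = 390 features.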

In [14]:
# Helper function to compute the mean absolute error (MAE)
def mae(pred, true):
    return np.mean(np.abs(pred - true))

In [15]:
# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X_samples,
                                                    y_samples,
                                                    test_size=0.2,
                                                    random_state=301)

In [16]:
# Create and train Lasso model.
# Set positive=True to enforce assumption that cases are positively correlated
# with future cases and npis are negatively correlated.
model = Lasso(alpha=0.1,
              precompute=True,
              max_iter=10000,
              positive=True,
              selection='random')
# Fit model
model.fit(X_train, y_train)

Lasso(alpha=0.1, max_iter=10000, positive=True, precompute=True,
      selection='random')

In [17]:
# Evaluate model
train_preds = model.predict(X_train)
train_preds = np.maximum(train_preds, 0) # Don't predict negative cases
print('Train MAE:', mae(train_preds, y_train))

test_preds = model.predict(X_test)
test_preds = np.maximum(test_preds, 0) # Don't predict negative cases
print('Test MAE:', mae(test_preds, y_test))

Train MAE: 243.10365933046535
Test MAE: 239.27522830880685


In [18]:
# Inspect the learned feature coefficients for the model
# to see what features it's paying attention to.

# Give names to the features
x_col_names = []
for d in range(-nb_lookback_days, 0):
    x_col_names.append('Day ' + str(d) + ' ' + cases_col[0])
for d in range(-nb_lookback_days, 0):
    for col_name in npi_cols:
        x_col_names.append('Day ' + str(d) + ' ' + col_name)

# View non-zero coefficients
for (col, coeff) in zip(x_col_names, list(model.coef_)):
    if coeff != 0.:
        print(col, coeff)
print('Intercept', model.intercept_)

Day -14 NewCases 0.01063511120413984
Day -7 NewCases 0.3901833563851552
Day -6 NewCases 0.13934255461612743
Day -4 NewCases 0.12938927512555035
Day -3 NewCases 0.049278982260770494
Day -2 NewCases 0.07721298168596523
Day -1 NewCases 0.23360962065089283
Day -27 C6_Stay at home requirements 10.081487437570491
Day -18 C6_Stay at home requirements 13.859149414673897
Day -10 C6_Stay at home requirements 5.464966906524101
Intercept 41.675232903469805
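
Because the NPI inputs were negated before training, a positive NPI coefficient lowers the prediction as the NPI level rises. The short calculation below illustrates this with the `Day -27 C6_Stay at home requirements` coefficient printed above; the helper name `npi_contribution` and the chosen levels are assumptions for illustration.

```python
# Coefficient copied from the printed output for Day -27 C6_Stay at home requirements
coeff_c6_day_minus_27 = 10.081487437570491


def npi_contribution(coeff, npi_level):
    """Contribution of one NPI feature to the predicted new cases."""
    # The model sees the negated NPI level as its input
    return coeff * (-npi_level)


# Illustrative NPI levels
for level in (0, 1, 2, 3):
    print(level, round(npi_contribution(coeff_c6_day_minus_27, level), 2))
```

Under this model, a level of 2 on that day shifts the prediction by about -20.16 cases relative to a level of 0.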


In [19]:
# Save model to file
if not os.path.exists('models'):
    os.mkdir('models')
with open('models/model.pkl', 'wb') as model_file:
    pickle.dump(model, model_file)
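
A saved model can be read back with `pickle.load`. The sketch below shows the round trip using a stand-in dictionary in place of the fitted Lasso model and a temporary directory in place of `models/`, so it runs without the notebook's dependencies; both substitutions are assumptions for illustration.

```python
import os
import pickle
import tempfile

# Stand-in for the fitted model: any picklable object round-trips the same way
stand_in_model = {"coef": [0.2, 0.3], "intercept": 41.7}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model.pkl")
    # Save, as in the cell above
    with open(model_path, "wb") as model_file:
        pickle.dump(stand_in_model, model_file)
    # Load it back, as a predictor script would need to do
    with open(model_path, "rb") as model_file:
        restored = pickle.load(model_file)

print(restored == stand_in_model)  # True
```

Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.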

## Evaluation

Now that the predictor has been trained and saved, this section contains the functionality for evaluating it on sample evaluation data.

In [20]:
# Reload the module to get the latest changes
import predict
from importlib import reload
reload(predict)
from predict import predict_df

In [21]:
%%time
preds_df = predict_df("2020-08-01",
                      "2020-08-31",
                      path_to_ips_file="../../../validation/data/2020-09-30_historical_ip.csv",
                      verbose=True)


Predicting for Aruba__nan
2020-08-01: 110.721549344085
2020-08-02: 132.85696554341925
2020-08-03: 146.01904406625135
2020-08-04: 163.24031979966742
2020-08-05: 181.43132439511896
2020-08-06: 180.02027812762324
2020-08-07: 199.12830368230138
2020-08-08: 252.81951918007303
2020-08-09: 279.5077050098621
2020-08-10: 298.20358136686855
2020-08-11: 313.70944547675066
2020-08-12: 333.9813118403191
2020-08-13: 346.11359395449927
2020-08-14: 368.63332678789396
2020-08-15: 403.68194682213084
2020-08-16: 430.0831156889156
2020-08-17: 451.23185456554194
2020-08-18: 471.9099025627045
2020-08-19: 480.14399095447106
2020-08-20: 495.97914841569474
2020-08-21: 517.9433903826953
2020-08-22: 545.3036200405745
2020-08-23: 568.7689485036241
2020-08-24: 590.8265656062034
2020-08-25: 611.3619824372208
2020-08-26: 628.1937646381347
2020-08-27: 647.202767061426
2020-08-28: 659.3496068988902
2020-08-29: 681.4594541200113
2020-08-30: 703.1871419561869
2020-08-31: 724.7211427531294

Predicting for Afghanistan__n

2020-08-16: 2422.3112015648176
2020-08-17: 2408.4080135947115
2020-08-18: 2461.6842200495657
2020-08-19: 2429.2630726086672
2020-08-20: 2276.678155516797
2020-08-21: 2304.4621318180925
2020-08-22: 2471.8732079693864
2020-08-23: 2549.351959356418
2020-08-24: 2563.2383608290884
2020-08-25: 2601.333162716952
2020-08-26: 2602.54560002639
2020-08-27: 2556.497135682995
2020-08-28: 2563.8961157749754
2020-08-29: 2647.0259103968424
2020-08-30: 2698.3795155804637
2020-08-31: 2721.7799557000394

Predicting for Azerbaijan__nan
2020-08-01: 3185.769085093367
2020-08-02: 3588.899516198826
2020-08-03: 3027.3519190987067
2020-08-04: 3523.815374522946
2020-08-05: 3384.1000884396094
2020-08-06: 1821.5485328918448
2020-08-07: 1843.1766909520234
2020-08-08: 3082.9375224666246
2020-08-09: 3360.14247791073
2020-08-10: 3155.3457993155203
2020-08-11: 3384.855573276566
2020-08-12: 3322.4014491864614
2020-08-13: 2701.2806392258544
2020-08-14: 2717.360476261753
2020-08-15: 3256.0182162712053
2020-08-16: 3428.317

2020-08-02: 135.97231024969113
2020-08-03: 150.478713922087
2020-08-04: 163.4350266088478
2020-08-05: 178.26569370900071
2020-08-06: 179.75912375417295
2020-08-07: 200.2975329063928
2020-08-08: 255.45402758172654
2020-08-09: 281.6917512227832
2020-08-10: 300.7933424607827
2020-08-11: 308.9447194124899
2020-08-12: 326.66317442134164
2020-08-13: 339.04237927970473
2020-08-14: 361.8750002665186
2020-08-15: 396.50048465855394
2020-08-16: 422.36948327620456
2020-08-17: 442.5564056758106
2020-08-18: 459.7175506151978
2020-08-19: 452.1180391331872
2020-08-20: 464.03734351135506
2020-08-21: 483.64482216502415
2020-08-22: 508.6931502822019
2020-08-23: 528.848299661458
2020-08-24: 548.4702857684773
2020-08-25: 564.1055118347124
2020-08-26: 572.3914345494195
2020-08-27: 586.6236844824919
2020-08-28: 585.1182808242447
2020-08-29: 601.2470275225327
2020-08-30: 617.5455827220599
2020-08-31: 634.4152612897261

Predicting for Bosnia and Herzegovina__nan
2020-08-01: 678.4613086439142
2020-08-02: 747.92

2020-08-22: 1065.8499905894205
2020-08-23: 1110.0001970437338
2020-08-24: 1150.025259256789
2020-08-25: 1187.612547026443
2020-08-26: 1229.8656350849694
2020-08-27: 1247.8651659019388
2020-08-28: 1256.8485277517414
2020-08-29: 1284.2016251911302
2020-08-30: 1321.0757388239294
2020-08-31: 1356.007777084476

Predicting for Brazil__Amapa
2020-08-01: 157.28648304646535
2020-08-02: 281.63338635204855
2020-08-03: 324.92301964415606
2020-08-04: 305.9826422294909
2020-08-05: 542.8539452225042
2020-08-06: 402.5134908035451
2020-08-07: 315.4696035812825
2020-08-08: 372.21868723874115
2020-08-09: 457.03785268199874
2020-08-10: 476.1008137269047
2020-08-11: 495.72539061827433
2020-08-12: 593.6380042547906
2020-08-13: 558.2036606719265
2020-08-14: 532.2478527044564
2020-08-15: 566.4468050231177
2020-08-16: 620.4283091844645
2020-08-17: 640.4485900061849
2020-08-18: 668.7196234652154
2020-08-19: 705.8813291788484
2020-08-20: 705.7815826601939
2020-08-21: 706.3233480867412
2020-08-22: 733.40078304720

2020-08-26: 1389.3002380950625
2020-08-27: 1408.290127486399
2020-08-28: 1431.8439196006248
2020-08-29: 1466.6719475273262
2020-08-30: 1510.8870743013597
2020-08-31: 1554.5092869437005

Predicting for Brazil__Para
2020-08-01: 370.5374565159466
2020-08-02: 445.30703782600045
2020-08-03: 688.6340624811442
2020-08-04: 839.5598060253935
2020-08-05: 994.7090339374656
2020-08-06: 937.3677824767525
2020-08-07: 578.375173647121
2020-08-08: 672.253459005228
2020-08-09: 746.7940229400803
2020-08-10: 867.0191577626947
2020-08-11: 934.0490489834845
2020-08-12: 1028.4103900305495
2020-08-13: 999.5256451060958
2020-08-14: 880.0197936923419
2020-08-15: 914.1519038147853
2020-08-16: 970.3161723312573
2020-08-17: 1035.2833977189462
2020-08-18: 1081.9238077137543
2020-08-19: 1111.744897441125
2020-08-20: 1104.2487808248125
2020-08-21: 1069.8135801475448
2020-08-22: 1090.83697020349
2020-08-23: 1128.3383291827101
2020-08-24: 1169.1821689327335
2020-08-25: 1201.2662690840543
2020-08-26: 1228.0780364011234

2020-08-16: 523.592380403886
2020-08-17: 554.465084581175
2020-08-18: 569.9407800590802
2020-08-19: 607.7602146287534
2020-08-20: 640.000453037522
2020-08-21: 650.0267466438579
2020-08-22: 673.454699618837
2020-08-23: 703.4352928800879
2020-08-24: 731.5415958445401
2020-08-25: 754.1885923001996
2020-08-26: 785.9466892831186
2020-08-27: 814.8496672059097
2020-08-28: 835.7773302387836
2020-08-29: 860.9782160561012
2020-08-30: 889.9722108048306
2020-08-31: 917.912988360934

Predicting for Brazil__Rio Grande do Sul
2020-08-01: 1375.5855828849067
2020-08-02: 1902.6433402855828
2020-08-03: 2824.821372108322
2020-08-04: 3228.2541756643705
2020-08-05: 4765.709071620988
2020-08-06: 4280.418416642526
2020-08-07: 2184.6722493920506
2020-08-08: 2395.75130845497
2020-08-09: 2792.405949444198
2020-08-10: 3175.8252271127762
2020-08-11: 3418.1229309264254
2020-08-12: 4116.372395902803
2020-08-13: 3883.410550684089
2020-08-14: 3091.1380941696048
2020-08-15: 3106.087607070265
2020-08-16: 3341.0720865721

2020-08-22: 547.9166534128826
2020-08-23: 572.7003162959198
2020-08-24: 595.7228648124972
2020-08-25: 617.5395902380687
2020-08-26: 635.2338552715937
2020-08-27: 656.9387588741802
2020-08-28: 674.584648819695
2020-08-29: 727.0228130198047
2020-08-30: 757.152370189697
2020-08-31: 784.1646332082607

Predicting for Botswana__nan
2020-08-01: 100.48644042100679
2020-08-02: 175.79646996551614
2020-08-03: 300.06595694519154
2020-08-04: 189.1103739240796
2020-08-05: 189.49836918370784
2020-08-06: 201.6013354496675
2020-08-07: 224.3607067608886
2020-08-08: 265.9767070246905
2020-08-09: 324.8032961386059
2020-08-10: 381.4299139658867
2020-08-11: 357.00100680806406
2020-08-12: 365.7879151590044
2020-08-13: 384.25014948336366
2020-08-14: 410.0438214619175
2020-08-15: 440.27076919741035
2020-08-16: 483.0149972447894
2020-08-17: 519.0066712142213
2020-08-18: 526.0546718605873
2020-08-19: 542.5028186534607
2020-08-20: 565.1202679705195
2020-08-21: 591.1963971535736
2020-08-22: 618.9496087713244
2020-

2020-08-01: 100.48644042100679
2020-08-02: 145.00176539535198
2020-08-03: 201.0367923267982
2020-08-04: 163.59845738420114
2020-08-05: 177.29932684559282
2020-08-06: 183.20575544018706
2020-08-07: 205.05088475171192
2020-08-08: 251.8522316653234
2020-08-09: 291.7132152010352
2020-08-10: 324.73261368355946
2020-08-11: 326.3522935939013
2020-08-12: 343.4689617471189
2020-08-13: 359.72593279148805
2020-08-14: 384.2425439408969
2020-08-15: 417.1622929403564
2020-08-16: 450.3890290489535
2020-08-17: 477.70982639328656
2020-08-18: 494.0710926908043
2020-08-19: 514.9892415046751
2020-08-20: 536.6069711361125
2020-08-21: 561.9989390396597
2020-08-22: 590.7200878378337
2020-08-23: 620.3574929982549
2020-08-24: 646.8382202381046
2020-08-25: 669.3294701764255
2020-08-26: 693.1612290914665
2020-08-27: 717.7508340531801
2020-08-28: 744.0403299067877
2020-08-29: 771.8513483048072
2020-08-30: 800.280769277452
2020-08-31: 827.3035649166873

Predicting for Colombia__nan
2020-08-01: 6500.400863077701
20

2020-08-24: 17357.530016908007
2020-08-25: 18186.85743055064
2020-08-26: 18254.483944919542
2020-08-27: 17493.05006406574
2020-08-28: 17445.74876824239
2020-08-29: 18009.940613826402
2020-08-30: 18174.665594712853
2020-08-31: 18281.353717908845

Predicting for Djibouti__nan
2020-08-01: 105.04948995742058
2020-08-02: 128.28388351385846
2020-08-03: 142.82081756271768
2020-08-04: 156.632840448907
2020-08-05: 171.95271550597806
2020-08-06: 176.38696093454197
2020-08-07: 195.8477070857231
2020-08-08: 247.50448373443237
2020-08-09: 274.3560206748888
2020-08-10: 293.76806384339096
2020-08-11: 302.1507807173424
2020-08-12: 320.14660374616426
2020-08-13: 333.7642900691262
2020-08-14: 356.05107218202465
2020-08-15: 389.29477981843615
2020-08-16: 415.21023771943703
2020-08-17: 435.5883635519909
2020-08-18: 452.79705919307867
2020-08-19: 445.3329221101566
2020-08-20: 457.7414085955879
2020-08-21: 477.08372186777365
2020-08-22: 501.5508072458456
2020-08-23: 521.6426192979325
2020-08-24: 541.3396894


Predicting for Ethiopia__nan
2020-08-01: 435.78497221632324
2020-08-02: 518.3046853895964
2020-08-03: 486.25460132004673
2020-08-04: 462.9322797780575
2020-08-05: 515.1902919027826
2020-08-06: 353.20349861856636
2020-08-07: 373.7179245338077
2020-08-08: 548.0839879637458
2020-08-09: 616.4552081662805
2020-08-10: 610.1977167845941
2020-08-11: 616.1389232813042
2020-08-12: 642.5367606269814
2020-08-13: 592.4197176994805
2020-08-14: 614.5342746547504
2020-08-15: 700.0967265530953
2020-08-16: 749.4213332041536
2020-08-17: 760.2011805808874
2020-08-18: 779.3542401315215
2020-08-19: 788.1756336518963
2020-08-20: 780.4324711688222
2020-08-21: 802.4127558201317
2020-08-22: 851.9750627360638
2020-08-23: 887.4851438934147
2020-08-24: 906.4971820152049
2020-08-25: 927.7323953767991
2020-08-26: 944.9676061170816
2020-08-27: 955.6736044133525
2020-08-28: 968.6479971918267
2020-08-29: 1001.2989436022644
2020-08-30: 1029.7151437722055
2020-08-31: 1051.3909192714093

Predicting for Finland__nan
2020-

2020-08-02: 1746.1687778700061
2020-08-03: 1752.3563925377075
2020-08-04: 958.0375875264209
2020-08-05: 748.6965378708885
2020-08-06: 689.9307060054953
2020-08-07: 809.3037265739513
2020-08-08: 1291.566850880072
2020-08-09: 1546.8214367888474
2020-08-10: 1536.1771238327801
2020-08-11: 1225.7967885669045
2020-08-12: 1131.6304376481737
2020-08-13: 1111.8419327099064
2020-08-14: 1197.0529374410573
2020-08-15: 1408.6098239094479
2020-08-16: 1553.8951114212957
2020-08-17: 1558.472165000216
2020-08-18: 1449.5357274959638
2020-08-19: 1403.3882004562145
2020-08-20: 1406.747690245315
2020-08-21: 1463.1895942764293
2020-08-22: 1568.1843735350276
2020-08-23: 1651.3051093150038
2020-08-24: 1668.539159556096
2020-08-25: 1639.2237837524317
2020-08-26: 1627.383112361524
2020-08-27: 1642.9228213873466
2020-08-28: 1673.9017361510541
2020-08-29: 1732.7613865376843
2020-08-30: 1784.5161561194273
2020-08-31: 1807.376816080214

Predicting for Georgia__nan
2020-08-01: 2730.20837823628
2020-08-02: 2483.02382

2020-08-13: 386.8642437886558
2020-08-14: 409.31354770710027
2020-08-15: 452.41272825967224
2020-08-16: 482.95498062309076
2020-08-17: 503.77998665562893
2020-08-18: 524.4487989635262
2020-08-19: 530.9943538114089
2020-08-20: 543.218546775463
2020-08-21: 565.2595841877538
2020-08-22: 596.1918158943424
2020-08-23: 621.7612386795034
2020-08-24: 643.9770585866341
2020-08-25: 664.6568222931187
2020-08-26: 680.8957684844561
2020-08-27: 698.5639259128563
2020-08-28: 710.9062870128986
2020-08-29: 734.720286057094
2020-08-30: 757.6054626867693
2020-08-31: 779.4559896009101

Predicting for Honduras__nan
2020-08-01: 334.7336454878768
2020-08-02: 401.1120360435497
2020-08-03: 372.3998526348328
2020-08-04: 411.78274015860444
2020-08-05: 445.9819607445138
2020-08-06: 311.131742731569
2020-08-07: 328.0530191085909
2020-08-08: 466.46623404675677
2020-08-09: 519.584012865484
2020-08-10: 519.9995627688827
2020-08-11: 558.9966682642067
2020-08-12: 583.995584684722
2020-08-13: 545.5040152466013
2020-08-1

2020-08-01: 107.65510618565665
2020-08-02: 129.531559195512
2020-08-03: 142.9142035031697
2020-08-04: 152.99225775197905
2020-08-05: 170.4119696621941
2020-08-06: 175.997046343129
2020-08-07: 195.95039207915647
2020-08-08: 248.23762512833076
2020-08-09: 274.8377527332687
2020-08-10: 293.44223833426264
2020-08-11: 300.4303732723596
2020-08-12: 319.150731542062
2020-08-13: 333.30725177044604
2020-08-14: 355.8826948918928
2020-08-15: 389.3293744966435
2020-08-16: 415.20977253388776
2020-08-17: 435.15762823351116
2020-08-18: 451.82755820532645
2020-08-19: 444.60898860378717
2020-08-20: 457.27020866947555
2020-08-21: 476.75445397556734
2020-08-22: 501.2976170984979
2020-08-23: 521.3860804102111
2020-08-24: 540.8763920643271
2020-08-25: 556.2839158930128
2020-08-26: 564.700502004614
2020-08-27: 579.221790800852
2020-08-28: 577.6445304185404
2020-08-29: 593.5368490789901
2020-08-30: 609.7615277113795
2020-08-31: 626.5439018988282

Predicting for Israel__nan
2020-08-01: 1657.4620811616387
2020

2020-08-15: 391.9104843558424
2020-08-16: 418.347475582078
2020-08-17: 439.395886278626
2020-08-18: 456.2884957014534
2020-08-19: 450.1416513960864
2020-08-20: 465.97686355157396
2020-08-21: 500.716895609634
2020-08-22: 529.131160960712
2020-08-23: 551.9707279186749
2020-08-24: 574.0292686840744
2020-08-25: 592.6444587330093
2020-08-26: 603.505769967574
2020-08-27: 622.7631718205641
2020-08-28: 629.4756285176352
2020-08-29: 650.0974956836013
2020-08-30: 680.1355334917905
2020-08-31: 702.9920840594498

Predicting for South Korea__nan
2020-08-01: 798.4366965328907
2020-08-02: 827.1351827010185
2020-08-03: 855.2527382559763
2020-08-04: 972.1863808694313
2020-08-05: 939.2630288346709
2020-08-06: 551.4580922458633
2020-08-07: 581.7641071063413
2020-08-08: 888.7942776363311
2020-08-09: 951.2864248084877
2020-08-10: 969.8952791677525
2020-08-11: 1035.7978214944046
2020-08-12: 1027.8396681485056
2020-08-13: 882.2277261080775
2020-08-14: 907.8595701107331
2020-08-15: 1047.7365366701729
2020-08-

2020-08-02: 2452.3305899521088
2020-08-03: 1804.234956239306
2020-08-04: 1867.5328782764177
2020-08-05: 2444.248326599468
2020-08-06: 1257.2833014838518
2020-08-07: 1243.3981504211697
2020-08-08: 2073.6945312550206
2020-08-09: 2297.709792850484
2020-08-10: 2005.950728296334
2020-08-11: 2083.1377575233037
2020-08-12: 2277.863855107489
2020-08-13: 1842.5122033011185
2020-08-14: 1832.1763199978686
2020-08-15: 2193.991767836832
2020-08-16: 2331.2133934955245
2020-08-17: 2229.8556630398075
2020-08-18: 2291.1892639950975
2020-08-19: 2344.99943635259
2020-08-20: 2191.1352592550375
2020-08-21: 2195.489534263057
2020-08-22: 2364.339515645257
2020-08-23: 2445.301468036107
2020-08-24: 2423.4539605196396
2020-08-25: 2479.5950772625843
2020-08-26: 2518.487310221324
2020-08-27: 2477.2481999511288
2020-08-28: 2475.5111314977876
2020-08-29: 2562.113614244755
2020-08-30: 2615.215765661625
2020-08-31: 2627.1066375537293

Predicting for Luxembourg__nan
2020-08-01: 175.09368116010785
2020-08-02: 377.92385

2020-08-13: 368.16131420810916
2020-08-14: 388.8902470951752
2020-08-15: 430.5548834922149
2020-08-16: 459.15735605280344
2020-08-17: 474.78465088329614
2020-08-18: 493.405443714194
2020-08-19: 487.9892978294134
2020-08-20: 496.72566973569235
2020-08-21: 515.3635828787171
2020-08-22: 543.5324605027151
2020-08-23: 564.9657352399694
2020-08-24: 582.9643647898358
2020-08-25: 599.2584255830244
2020-08-26: 608.5078788667358
2020-08-27: 621.6320336403065
2020-08-28: 619.7987311950318
2020-08-29: 637.3685921015756
2020-08-30: 654.3501235134419
2020-08-31: 670.7409953295851

Predicting for Myanmar__nan
2020-08-01: 1011.4988100123745
2020-08-02: 1110.8704047077567
2020-08-03: 1069.122872798692
2020-08-04: 1122.2449492158275
2020-08-05: 1127.095864978819
2020-08-06: 660.9077419943301
2020-08-07: 691.1221433658854
2020-08-08: 1077.558668221335
2020-08-09: 1178.384317031285
2020-08-10: 1164.5155500573937
2020-08-11: 1213.2571021870165
2020-08-12: 1216.3045173113235
2020-08-13: 1042.340435491928
20

2020-08-09: 301.6504178951243
2020-08-10: 321.7120913374076
2020-08-11: 323.3042396280682
2020-08-12: 336.5905722984762
2020-08-13: 350.12221921432734
2020-08-14: 373.83562994696905
2020-08-15: 410.859347457704
2020-08-16: 439.4722235535892
2020-08-17: 460.1549937910669
2020-08-18: 474.5415778786976
2020-08-19: 465.093844549295
2020-08-20: 477.36160269272807
2020-08-21: 497.5232372058334
2020-08-22: 523.655404309579
2020-08-23: 545.0884554140264
2020-08-24: 564.9821202726094
2020-08-25: 579.4947144730218
2020-08-26: 587.0228404672822
2020-08-27: 601.3955495747709
2020-08-28: 600.2208915308798
2020-08-29: 616.881616384243
2020-08-30: 633.804509408304
2020-08-31: 650.8504351678505

Predicting for Nigeria__nan
2020-08-01: 546.678167661156
2020-08-02: 542.7355223537372
2020-08-03: 456.90116894885045
2020-08-04: 707.2657919436668
2020-08-05: 766.3805455346213
2020-08-06: 434.0488900626171
2020-08-07: 439.67149442173513
2020-08-08: 661.4839419913462
2020-08-09: 689.3923313468298
2020-08-10: 

2020-08-27: 2539.315655817634
2020-08-28: 2580.187715848261
2020-08-29: 2676.616740568791
2020-08-30: 2737.0119181712575
2020-08-31: 2763.1365482687706

Predicting for Peru__nan
2020-08-01: 1005.2596365465422
2020-08-02: 916.31624115446
2020-08-03: 2138.8414618565625
2020-08-04: 1171.8722072390015
2020-08-05: 743.4288525598008
2020-08-06: 615.0851715087497
2020-08-07: 776.1458742620619
2020-08-08: 1056.9582050629542
2020-08-09: 1189.3960542182836
2020-08-10: 1618.4763047884192
2020-08-11: 1296.0667986961344
2020-08-12: 1099.379881961252
2020-08-13: 1039.1834435508158
2020-08-14: 1151.5368764935781
2020-08-15: 1260.4404834771508
2020-08-16: 1376.6592203934263
2020-08-17: 1545.4619967330268
2020-08-18: 1450.2840264378115
2020-08-19: 1371.212972864648
2020-08-20: 1359.5511429684484
2020-08-21: 1428.5987557142362
2020-08-22: 1489.2900090026726
2020-08-23: 1568.270330484478
2020-08-24: 1650.4657349156462
2020-08-25: 1636.1068185813465
2020-08-26: 1616.2748714940535
2020-08-27: 1629.23377853

2020-08-08: 528.7813566443521
2020-08-09: 571.6155062411099
2020-08-10: 559.8662189665743
2020-08-11: 574.1474565526906
2020-08-12: 591.5400318359966
2020-08-13: 559.3711559686335
2020-08-14: 588.8341323704276
2020-08-15: 675.3840718821461
2020-08-16: 713.921109365286
2020-08-17: 723.7328380236263
2020-08-18: 745.05958652805
2020-08-19: 766.3517115197365
2020-08-20: 768.812971361054
2020-08-21: 797.2329666486046
2020-08-22: 848.8846891514818
2020-08-23: 882.8809299366749
2020-08-24: 903.2050732854932
2020-08-25: 928.2406506783358
2020-08-26: 952.8527322042048
2020-08-27: 970.5140626625699
2020-08-28: 999.0033796649896
2020-08-29: 1037.2858694469028
2020-08-30: 1068.9902753204676
2020-08-31: 1094.564758628011

Predicting for Romania__nan
2020-08-01: 4369.040083294583
2020-08-02: 4135.544824053837
2020-08-03: 3571.6197963974582
2020-08-04: 4791.20811089467
2020-08-05: 4680.552615112145
2020-08-06: 2350.1112918961485
2020-08-07: 2386.625651948736
2020-08-08: 4038.4043311347727
2020-08-09:

2020-08-30: 607.5842694490069
2020-08-31: 624.4073451698675

Predicting for El Salvador__nan
2020-08-01: 400.94119977125604
2020-08-02: 300.5511023819023
2020-08-03: 428.67528371387306
2020-08-04: 381.0363475572545
2020-08-05: 394.7070980786408
2020-08-06: 282.1275393741443
2020-08-07: 326.98189846757936
2020-08-08: 472.6741161585975
2020-08-09: 478.13147906784354
2020-08-10: 526.8219478427083
2020-08-11: 532.429507540526
2020-08-12: 546.4041998975267
2020-08-13: 512.71248629693
2020-08-14: 550.299715004159
2020-08-15: 619.764407912854
2020-08-16: 646.8884990629564
2020-08-17: 677.2236348849931
2020-08-18: 698.3196804693489
2020-08-19: 716.8182127089542
2020-08-20: 718.6672459496236
2020-08-21: 750.3144875422279
2020-08-22: 793.9245080219296
2020-08-23: 823.9088203072419
2020-08-24: 851.3731633115196
2020-08-25: 877.2167429249721
2020-08-26: 900.11901171946
2020-08-27: 917.4706795526035
2020-08-28: 946.9443882508039
2020-08-29: 981.5747462689596
2020-08-30: 1011.5736635350568
2020-08-3

2020-08-20: 5006.704544141137
2020-08-21: 4993.415227824522
2020-08-22: 5261.787992546124
2020-08-23: 5208.86495993625
2020-08-24: 5527.689406083486
2020-08-25: 5977.711195802778
2020-08-26: 5833.517270387623
2020-08-27: 5546.298399912544
2020-08-28: 5552.9625949819
2020-08-29: 5687.2090324915625
2020-08-30: 5707.429505958508
2020-08-31: 5879.462561827858

Predicting for Eswatini__nan
2020-08-01: 136.36045328715997
2020-08-02: 176.16816907465923
2020-08-03: 186.83085343092452
2020-08-04: 214.64669620059567
2020-08-05: 236.38273232100082
2020-08-06: 204.50567561366648
2020-08-07: 220.81806842682846
2020-08-08: 285.3455597947976
2020-08-09: 320.20482372143897
2020-08-10: 338.0781899024618
2020-08-11: 364.4367527159534
2020-08-12: 386.3677738689912
2020-08-13: 387.540201646147
2020-08-14: 408.47524630127003
2020-08-15: 449.43279515729887
2020-08-16: 480.02812939273315
2020-08-17: 502.28146388001176
2020-08-18: 528.2059520679213
2020-08-19: 551.7392364030321
2020-08-20: 567.3294632208006
2

2020-08-28: 588.6736123765386
2020-08-29: 605.4372625461328
2020-08-30: 622.2824578748955
2020-08-31: 640.0082163424613

Predicting for Tunisia__nan
2020-08-01: 1057.5294740622007
2020-08-02: 904.967479406
2020-08-03: 658.9922379416436
2020-08-04: 1303.817451181334
2020-08-05: 1161.9679499739627
2020-08-06: 632.3609549124425
2020-08-07: 648.5265302332555
2020-08-08: 1079.2150832930608
2020-08-09: 1039.408126864012
2020-08-10: 979.2161036860441
2020-08-11: 1226.2950691915096
2020-08-12: 1197.8836604480612
2020-08-13: 983.9266532194462
2020-08-14: 1002.4588877168532
2020-08-15: 1194.5848661924615
2020-08-16: 1201.1367211549918
2020-08-17: 1199.0581393143016
2020-08-18: 1310.2490655580227
2020-08-19: 1291.119893150916
2020-08-20: 1209.449263167963
2020-08-21: 1228.2778441225003
2020-08-22: 1320.2723281135193
2020-08-23: 1338.5606186672353
2020-08-24: 1354.3391082014753
2020-08-25: 1409.7538729104426
2020-08-26: 1417.5756898358645
2020-08-27: 1395.3071827774907
2020-08-28: 1395.68080445093

2020-08-27: 3120.995168047344
2020-08-28: 3136.64343662423
2020-08-29: 3242.3802703790197
2020-08-30: 3298.149563893044
2020-08-31: 3325.9397660276263

Predicting for United States__Arkansas
2020-08-01: 1742.4706009850133
2020-08-02: 1576.8903746920796
2020-08-03: 1442.325464924299
2020-08-04: 1826.0375864392824
2020-08-05: 1866.1079223786257
2020-08-06: 975.9485593867673
2020-08-07: 1021.4314531403272
2020-08-08: 1670.2481647157983
2020-08-09: 1690.7629569512187
2020-08-10: 1632.6688621820242
2020-08-11: 1816.735373824134
2020-08-12: 1833.5628274528767
2020-08-13: 1485.9855511355086
2020-08-14: 1515.795757099182
2020-08-15: 1795.1151368104754
2020-08-16: 1845.8661904226203
2020-08-17: 1837.33528802156
2020-08-18: 1935.1280137894291
2020-08-19: 1940.6570210793798
2020-08-20: 1814.7137843304047
2020-08-21: 1840.4707002022164
2020-08-22: 1972.6468968850731
2020-08-23: 2018.8540118420024
2020-08-24: 2034.5079621240357
2020-08-25: 2092.4639295484994
2020-08-26: 2111.3779472826236
2020-08-2

2020-08-31: 893.3605370798775

Predicting for United States__Iowa
2020-08-01: 1084.1918646352128
2020-08-02: 1325.653063375262
2020-08-03: 1053.2720459753266
2020-08-04: 1242.1128918666898
2020-08-05: 1685.9813804557798
2020-08-06: 837.858222279838
2020-08-07: 786.2266439194448
2020-08-08: 1215.260368713618
2020-08-09: 1381.7274839264594
2020-08-10: 1259.6023609130045
2020-08-11: 1387.1790963229837
2020-08-12: 1543.0715866457626
2020-08-13: 1239.931878047909
2020-08-14: 1211.2744929474115
2020-08-15: 1407.491088782737
2020-08-16: 1506.8522889662397
2020-08-17: 1471.8081311553137
2020-08-18: 1550.7639691953393
2020-08-19: 1606.2355256060287
2020-08-20: 1505.1268174640115
2020-08-21: 1500.7570793858335
2020-08-22: 1599.847496974557
2020-08-23: 1660.509801960952
2020-08-24: 1665.063894708479
2020-08-25: 1715.022964759328
2020-08-26: 1752.0693995451634
2020-08-27: 1729.3710145274235
2020-08-28: 1731.6963529309382
2020-08-29: 1787.979800411942
2020-08-30: 1830.343277665135
2020-08-31: 1850.

2020-08-23: 872.9374650169266
2020-08-24: 895.8233267442629
2020-08-25: 922.154396913822
2020-08-26: 939.1531515610279
2020-08-27: 947.5360355001517
2020-08-28: 959.3569311207107
2020-08-29: 990.4766713116618
2020-08-30: 1017.9247502322798
2020-08-31: 1041.2177611207157

Predicting for United States__Michigan
2020-08-01: 2971.914391999333
2020-08-02: 2457.5676365554627
2020-08-03: 4733.330658658621
2020-08-04: 4364.032361831235
2020-08-05: 3859.9055104248673
2020-08-06: 1957.4165367274975
2020-08-07: 2157.237163821299
2020-08-08: 3063.6690374204472
2020-08-09: 3197.1019613074977
2020-08-10: 3979.5028402845855
2020-08-11: 3999.583015861498
2020-08-12: 3718.7994358459705
2020-08-13: 2946.7113757673983
2020-08-14: 3051.1590685336214
2020-08-15: 3408.5794187872084
2020-08-16: 3581.3777796732984
2020-08-17: 3886.870608514001
2020-08-18: 3967.4889919163697
2020-08-19: 3828.309524585554
2020-08-20: 3532.4981939598993
2020-08-21: 3588.830377727817
2020-08-22: 3755.899815359098
2020-08-23: 3878

2020-08-31: 4490.234935393901

Predicting for United States__New Mexico
2020-08-01: 1272.9210869765277
2020-08-02: 1415.3750941745752
2020-08-03: 1309.3661684831807
2020-08-04: 1320.04782186478
2020-08-05: 1478.680238648587
2020-08-06: 814.444921040033
2020-08-07: 836.2673422958281
2020-08-08: 1315.3635037231097
2020-08-09: 1453.8411475563107
2020-08-10: 1399.0454527660054
2020-08-11: 1441.613902438938
2020-08-12: 1491.2450182999019
2020-08-13: 1246.067433362967
2020-08-14: 1262.9047404512517
2020-08-15: 1475.6297109818488
2020-08-16: 1568.8753155495253
2020-08-17: 1559.6137021803468
2020-08-18: 1600.9501359743288
2020-08-19: 1615.0404743664378
2020-08-20: 1532.7501244942423
2020-08-21: 1551.896289758282
2020-08-22: 1657.1480483154742
2020-08-23: 1717.5472294250962
2020-08-24: 1731.643314722292
2020-08-25: 1765.80879713565
2020-08-26: 1786.0325426087122
2020-08-27: 1769.8567296711324
2020-08-28: 1783.3811054319067
2020-08-29: 1842.454813902321
2020-08-30: 1885.6414831922852
2020-08-31:

2020-08-02: 649.0950355711063
2020-08-03: 473.22741838913385
2020-08-04: 555.7431837818744
2020-08-05: 736.3127134529133
2020-08-06: 430.19998234658283
2020-08-07: 436.4906978530381
2020-08-08: 666.0999715353231
2020-08-09: 732.1848880590306
2020-08-10: 665.5813530210413
2020-08-11: 719.4543515243914
2020-08-12: 793.6082420080808
2020-08-13: 692.1818550898597
2020-08-14: 702.6992169727961
2020-08-15: 812.8524735376889
2020-08-16: 861.2969068364391
2020-08-17: 848.1628908336204
2020-08-18: 887.8549759354942
2020-08-19: 915.6150843634407
2020-08-20: 889.4209012318486
2020-08-21: 905.2213517262048
2020-08-22: 965.5659366153536
2020-08-23: 1000.9589543919426
2020-08-24: 1010.9736590156577
2020-08-25: 1040.9924201032659
2020-08-26: 1066.3006334476868
2020-08-27: 1070.506164922647
2020-08-28: 1080.8218368132186
2020-08-29: 1118.3363656861807
2020-08-30: 1147.0989329501817
2020-08-31: 1165.7180306885614

Predicting for United States__Tennessee
2020-08-01: 5975.14499110981
2020-08-02: 8872.577

2020-08-25: 3109.838244431633
2020-08-26: 3110.1158297521633
2020-08-27: 3050.171680587778
2020-08-28: 3082.9635264223125
2020-08-29: 3191.426299254079
2020-08-30: 3241.5733581907407
2020-08-31: 3271.868037508511

Predicting for United States__West Virginia
2020-08-01: 1080.8971871692318
2020-08-02: 1095.7242633521378
2020-08-03: 1094.2877026407891
2020-08-04: 1175.5949350585179
2020-08-05: 1113.5722591954836
2020-08-06: 660.2122682237382
2020-08-07: 700.8550770710224
2020-08-08: 1112.7056397478932
2020-08-09: 1182.5104323823703
2020-08-10: 1185.9101618536092
2020-08-11: 1236.7229017617349
2020-08-12: 1216.3722754267753
2020-08-13: 1041.3761912116988
2020-08-14: 1075.1144355572224
2020-08-15: 1256.9752847945924
2020-08-16: 1318.6760603948346
2020-08-17: 1334.543560315262
2020-08-18: 1374.1970349535459
2020-08-19: 1364.4133954049455
2020-08-20: 1305.554038711722
2020-08-21: 1333.9929674540228
2020-08-22: 1424.6771716967057
2020-08-23: 1470.9193169311689
2020-08-24: 1494.262404039875
202

2020-08-04: 209.28770530674117
2020-08-05: 202.7109735535185
2020-08-06: 195.0468731835658
2020-08-07: 216.94904791968804
2020-08-08: 284.0962337694572
2020-08-09: 311.03313289242465
2020-08-10: 332.6666954078679
2020-08-11: 344.3914620309513
2020-08-12: 354.3528155903365
2020-08-13: 361.9003310617842
2020-08-14: 385.71190909882614
2020-08-15: 425.3691133566746
2020-08-16: 451.88607616250647
2020-08-17: 473.5503793355641
2020-08-18: 491.92068486847575
2020-08-19: 481.4730776213809
2020-08-20: 491.13082000051764
2020-08-21: 511.33841872463654
2020-08-22: 538.5495245355944
2020-08-23: 559.2424187931683
2020-08-24: 579.6658379412672
2020-08-25: 595.8269727417796
2020-08-26: 603.0950645616655
2020-08-27: 616.3890314699066
2020-08-28: 615.2613943486921
2020-08-29: 632.39615609875
2020-08-30: 649.1093445393745
2020-08-31: 666.4524744422345

Predicting for Zimbabwe__nan
2020-08-01: 159.63976858767103
2020-08-02: 194.36665462609244
2020-08-03: 226.05818673158612
2020-08-04: 272.56751539968207


In [22]:
# Check the predictions
preds_df.head()

Unnamed: 0,CountryName,RegionName,Date,PredictedDailyNewCases
213,Aruba,,2020-08-01,110.721549
214,Aruba,,2020-08-02,132.856966
215,Aruba,,2020-08-03,146.019044
216,Aruba,,2020-08-04,163.24032
217,Aruba,,2020-08-05,181.431324


# Validation
During the competition, the predictor will be called exactly as shown in the cell below.  
!!! PLEASE DO NOT CHANGE THE API !!!

In [23]:
!python predict.py -s 2020-08-01 -e 2020-08-04 -ip ../../../validation/data/2020-09-30_historical_ip.csv -o predictions/2020-08-01_2020-08-04.csv

Generating predictions from 2020-08-01 to 2020-08-04...
Saved predictions to predictions/2020-08-01_2020-08-04.csv
Done!
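
The command above implies a command-line interface with four options. As a minimal sketch of how such an interface could be parsed with `argparse`: only the short flags `-s`, `-e`, `-ip` and `-o` come from the command above, while the long option names, the destination names and the helper `parse_predict_args` are assumptions made for illustration and may differ from the actual `predict.py`.

```python
import argparse

def parse_predict_args(argv=None):
    """Hypothetical helper: parse the four options used in the call above."""
    parser = argparse.ArgumentParser(description="Generate daily new case predictions")
    # Long option names below are assumed; only the short flags are from the notebook
    parser.add_argument("-s", "--start_date", dest="start_date", required=True,
                        help="first day to predict, formatted YYYY-MM-DD")
    parser.add_argument("-e", "--end_date", dest="end_date", required=True,
                        help="last day to predict, formatted YYYY-MM-DD")
    parser.add_argument("-ip", "--interventions_plan", dest="ip_file", required=True,
                        help="path to the intervention plan CSV file")
    parser.add_argument("-o", "--output_file", dest="output_file", required=True,
                        help="path of the CSV file to write predictions to")
    return parser.parse_args(argv)

# Parse the same arguments as the notebook cell above
args = parse_predict_args(["-s", "2020-08-01", "-e", "2020-08-04",
                           "-ip", "../../../validation/data/2020-09-30_historical_ip.csv",
                           "-o", "predictions/2020-08-01_2020-08-04.csv"])
print(args.start_date, args.end_date, args.output_file)
```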


In [24]:
!head predictions/2020-08-01_2020-08-04.csv

CountryName,RegionName,Date,PredictedDailyNewCases
Aruba,,2020-08-01,110.721549344085
Aruba,,2020-08-02,132.85696554341925
Aruba,,2020-08-03,146.01904406625135
Aruba,,2020-08-04,163.24031979966742
Afghanistan,,2020-08-01,229.17688977298798
Afghanistan,,2020-08-02,326.44440936029133
Afghanistan,,2020-08-03,303.8069782507652
Afghanistan,,2020-08-04,330.51054499811056
Angola,,2020-08-01,156.9399774033422
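
The file above holds one row per country, region and day, with four columns. A rough sanity check of that layout might look like the sketch below. It is not the official `validate_submission` logic: the helper name `check_prediction_format` and the particular checks are assumptions made for illustration. The sample rows are copied from the output above.

```python
import io
import pandas as pd

EXPECTED_COLUMNS = ["CountryName", "RegionName", "Date", "PredictedDailyNewCases"]

def check_prediction_format(csv_source, start_date, end_date):
    """Hypothetical helper: return a list of problems (empty if none are found)."""
    df = pd.read_csv(csv_source, dtype={"RegionName": str})
    if list(df.columns) != EXPECTED_COLUMNS:
        return [f"Unexpected columns: {list(df.columns)}"]
    df["Date"] = pd.to_datetime(df["Date"])
    # Countries without regions have an empty RegionName, which pandas reads as NaN
    df["RegionName"] = df["RegionName"].fillna("")
    expected_days = set(pd.date_range(start_date, end_date))
    problems = []
    # Every (country, region) pair should have a row for each requested day
    for (country, region), group in df.groupby(["CountryName", "RegionName"]):
        missing = expected_days - set(group["Date"])
        if missing:
            days = ", ".join(d.strftime("%Y-%m-%d") for d in sorted(missing))
            problems.append(f"{country}/{region}: missing {days}")
    if (df["PredictedDailyNewCases"] < 0).any():
        problems.append("Negative predictions found")
    return problems

SAMPLE = """CountryName,RegionName,Date,PredictedDailyNewCases
Aruba,,2020-08-01,110.721549344085
Aruba,,2020-08-02,132.85696554341925
Aruba,,2020-08-03,146.01904406625135
Aruba,,2020-08-04,163.24031979966742
"""

print(check_prediction_format(io.StringIO(SAMPLE), "2020-08-01", "2020-08-04"))
```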


# Test cases
Now that we can generate a prediction file, let's validate a few cases.

In [25]:
import os
from covid_xprize.validation.predictor_validation import validate_submission

def validate(start_date, end_date, ip_file, output_file):
    # First, delete any potential old file
    try:
        os.remove(output_file)
    except OSError:
        pass
    
    # Then generate the prediction, calling the official API
    !python predict.py -s {start_date} -e {end_date} -ip {ip_file} -o {output_file}
    
    # And validate it
    errors = validate_submission(start_date, end_date, ip_file, output_file)
    if errors:
        for error in errors:
            print(error)
    else:
        print("All good!")
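
The `!python ...` line in `validate` is IPython shell syntax, so the function only works inside a notebook. As a hedged sketch, the same call could be made from a plain Python script with `subprocess`. The helper names `build_predict_command` and `run_predict` are assumptions for illustration, and the sketch assumes `predict.py` is in the current working directory, as it is for the notebook.

```python
import subprocess
import sys

def build_predict_command(start_date, end_date, ip_file, output_file):
    """Hypothetical helper: build the argument list for the predict.py call."""
    # sys.executable runs predict.py with the same interpreter as this script
    return [sys.executable, "predict.py",
            "-s", start_date,
            "-e", end_date,
            "-ip", ip_file,
            "-o", output_file]

def run_predict(start_date, end_date, ip_file, output_file):
    """Hypothetical helper: run predict.py and raise if it exits with an error."""
    command = build_predict_command(start_date, end_date, ip_file, output_file)
    subprocess.run(command, check=True)

command = build_predict_command("2020-08-01", "2020-08-04",
                                "../../../validation/data/2020-09-30_historical_ip.csv",
                                "predictions/val_4_days.csv")
print(command[1:])
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.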

## 4 days, no gap
- All countries and regions
- Official number of cases is known up to start_date
- Intervention Plans are the official ones

In [26]:
validate(start_date="2020-08-01",
         end_date="2020-08-04",
         ip_file="../../../validation/data/2020-09-30_historical_ip.csv",
         output_file="predictions/val_4_days.csv")

Generating predictions from 2020-08-01 to 2020-08-04...
Saved predictions to predictions/val_4_days.csv
Done!
All good!


## 1 month in the future
- 2 countries only
- There is a gap between the date of the last known number of cases and start_date
- For future dates, the Intervention Plans file contains scenarios for which predictions are requested, to answer the question: what will happen if we apply these plans?

In [27]:
%%time
validate(start_date="2021-01-01",
         end_date="2021-01-31",
         ip_file="../../../validation/data/future_ip.csv",
         output_file="predictions/val_1_month_future.csv")

Generating predictions from 2021-01-01 to 2021-01-31...
Saved predictions to predictions/val_1_month_future.csv
Done!
All good!
Wall time: 1.92 s


## 180 days, from a future date, all countries and regions
- Prediction start date is 1 week from now (i.e. assuming the submission date is 1 week from now).  
- Prediction end date is 6 months after start date.  
- Prediction is requested for all available countries and regions.  
- Intervention plan scenario: freeze last known intervention plans for each country and region.  

The number of cases between today and the start date is not known yet, but the model relies on it, so the model first has to predict those cases and then use them as inputs.  
This is the most demanding test. It should take less than 1 hour to generate the prediction file.
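
To illustrate how predictions can bridge the gap before the start date, here is a minimal rollout sketch. It is not the notebook's model: `toy_predictor` (the mean of the recent window) stands in for the trained predictor and ignores intervention plans, and the names `rollout`, `gap_days`, `horizon_days` and `window` are assumptions made for illustration.

```python
import numpy as np

def rollout(known_cases, gap_days, horizon_days, predict_next, window=7):
    """Roll a one-step predictor forward, feeding its own outputs back in.

    known_cases: daily new cases up to the last known date
    gap_days: days between the last known date and the prediction start date
    horizon_days: number of days for which predictions are requested
    """
    history = [float(c) for c in known_cases]
    for _ in range(gap_days + horizon_days):
        next_value = predict_next(np.array(history[-window:]))
        # Case counts cannot be negative
        history.append(max(float(next_value), 0.0))
    # Drop the known days and the gap; keep only the requested horizon
    return history[len(known_cases) + gap_days:]

def toy_predictor(recent_cases):
    # Stand-in for a trained model: the average of the recent window
    return recent_cases.mean()

predictions = rollout([10, 12, 14], gap_days=2, horizon_days=3,
                      predict_next=toy_predictor, window=3)
print(len(predictions))
```

The predictions for the gap days are computed but discarded from the result, since only the requested date range goes into the output file.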

### Generate the scenario

In [28]:
from datetime import datetime, timedelta

start_date = datetime.now() + timedelta(days=7)
start_date_str = start_date.strftime('%Y-%m-%d')
end_date = start_date + timedelta(days=180)
end_date_str = end_date.strftime('%Y-%m-%d')
print(f"Start date: {start_date_str}")
print(f"End date: {end_date_str}")

Start date: 2020-12-24
End date: 2021-06-22


In [29]:
from covid_xprize.validation.scenario_generator import get_raw_data, generate_scenario

# Load the latest data downloaded at the top of this notebook
DATA_FILE = 'data/OxCGRT_latest.csv'
latest_df = get_raw_data(DATA_FILE, latest=True)
# countries=None requests all countries and regions; "Freeze" keeps the last known intervention plans
scenario_df = generate_scenario(start_date_str, end_date_str, latest_df, countries=None, scenario="Freeze")
scenario_file = "predictions/180_days_future_scenario.csv"
scenario_df.to_csv(scenario_file, index=False)
print(f"Saved scenario to {scenario_file}")

Saved scenario to predictions/180_days_future_scenario.csv
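
To make the "freeze" idea concrete, the sketch below repeats each region's last known intervention values over the future dates. It is an illustration only, not the implementation of `generate_scenario`: the helper `freeze_last_known` is hypothetical, it produces future rows only, and the toy values are made up. The column name `C1_School closing` is taken from the data set's columns.

```python
import pandas as pd

def freeze_last_known(ip_df, start_date, end_date, npi_columns):
    """Hypothetical helper: repeat each region's last known NPI values."""
    ip_df = ip_df.copy()
    # Countries without regions have a missing RegionName; use "" so groupby keeps them
    ip_df["RegionName"] = ip_df["RegionName"].fillna("")
    future_dates = pd.date_range(start_date, end_date)
    frames = []
    for (country, region), group in ip_df.groupby(["CountryName", "RegionName"]):
        last_row = group.sort_values("Date").iloc[-1]
        frame = pd.DataFrame({"CountryName": country,
                              "RegionName": region,
                              "Date": future_dates})
        for column in npi_columns:
            # Same value on every future day
            frame[column] = last_row[column]
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)

# Toy data, made up for illustration
toy_ip = pd.DataFrame({
    "CountryName": ["Aruba", "Aruba"],
    "RegionName": [None, None],
    "Date": pd.to_datetime(["2020-12-15", "2020-12-16"]),
    "C1_School closing": [1, 3],
})
frozen = freeze_last_known(toy_ip, "2020-12-24", "2020-12-26", ["C1_School closing"])
print(frozen)
```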


### Check it

In [30]:
%%time
validate(start_date=start_date_str,
         end_date=end_date_str,
         ip_file=scenario_file,
         output_file="predictions/val_6_month_future.csv")

Generating predictions from 2020-12-24 to 2021-06-22...
Saved predictions to predictions/val_6_month_future.csv
Done!
All good!
Wall time: 4min 38s
