# Gaussian Process (Erf or Linear Prior)

This model produced our final submission. It is relatively simple, but it gave us our best results. We also worked on other models, including SIR-based models, multiparameter logistic models, and residual-fitting models, all of which can be found in our GitHub repository.

The whole model runs with the single command given at the end of this file. We include some of the functions in this code demo as well, because they contain important model parameters. The model predicts every date through the end of June. To change the prediction dates, edit the file erf_model_small_changes in the Gaussian_Processes folder. To change the training dates, change the boundary_date argument of the em.fit_erf call inside the predict_county function below.

In [1]:
import numpy as np
from matplotlib import pyplot as plt

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C
from sklearn.kernel_ridge import KernelRidge as KR

import pandas as pd
import git

In [2]:
import warnings
warnings.simplefilter('ignore')

In [3]:
repo = git.Repo("./", search_parent_directories=True)
homedir = repo.working_dir
datadir = f"{homedir}/Gaussian_Processes"
import sys
sys.path.insert(0, datadir)
import utils2 as utils

## Process the Dataset

In [4]:
# Return the rows for one county, starting at the first row whose
# death count exceeds min_deaths (an empty list if there is no such row).
# We use this function when checking whether a county has always
# had 0 deaths, in which case we predict zeros for all deciles.
def select_region(df, fips, min_deaths=-1):
    d = df.loc[df['fips'] == fips]
    deaths = np.where(d['deaths'].values > min_deaths)[0]
    if len(deaths) == 0:
        return []
    start = deaths[0]
    d = d[start:]
    return d

In [5]:
# Import the processed dataset using utils helper function 
df = utils.get_processed_df()

## Gaussian Process with Erf prior

The Gaussian Process is implemented with pymc3, following the tutorial links provided in class.
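For reference, the model that `predict_county` builds further down can be written out as below. This is read off the pymc3 calls in that function (`HalfCauchy`, `HalfNormal`, `ExpQuad`, `gp.Marginal`). The symbol $m(x)$ for the fitted Erf or linear mean is notation introduced here, and reading $y$ as the cumulative death count is an inference from the later `np.diff` step rather than something the code states.

```latex
\begin{aligned}
\rho &\sim \mathrm{HalfCauchy}(6), \qquad
\eta \sim \mathrm{HalfCauchy}(6), \qquad
\sigma \sim \mathrm{HalfNormal}(40) \\
k(x, x') &= \eta^{2} \exp\!\left(-\frac{(x - x')^{2}}{2\rho^{2}}\right) \\
f &\sim \mathcal{GP}\big(m(x),\; k(x, x')\big) \\
y_i &= f(x_i) + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^{2})
\end{aligned}
```

Only $\rho$, $\eta$, and $\sigma$ are sampled. The parameters of $m(x)$ are fixed at the values returned by the curve fit.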

In [6]:
# Import the Erf model functions
import erf_model_small_changes as em

import pymc3 as pm
import arviz as az

Initialize the Erf/Linear prior as a mean function class.

In [7]:
import theano.tensor as tt

class Erf(pm.gp.mean.Mean):
    def __init__(self, fit_func, params):
        self.fit_func = fit_func # Either Erf or Linear
        self.params = params 

    def __call__(self, X):
        X = X.reshape(1, -1)[0]
        return em.run_model(self.fit_func, self.params, X)
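The body of `em.run_model` is not shown in this notebook, so the following is only a sketch of the kind of curve an Erf or linear mean function might evaluate. The names `erf_curve` and `linear_curve`, and their parameters, are hypothetical and are not taken from `erf_model_small_changes`. The only detail borrowed from the notebook is that the linear fit has two parameters, which is how `predict_county` tells the two cases apart.

```python
import math

def erf_curve(t, max_deaths, rate, center):
    """Hypothetical Erf-shaped cumulative curve (not the repository's em.run_model).

    Rises from 0 to max_deaths, passing max_deaths / 2 at t == center.
    """
    return 0.5 * max_deaths * (1.0 + math.erf(rate * (t - center)))

def linear_curve(t, slope, intercept):
    """Hypothetical two-parameter fallback for counties with little data."""
    return slope * t + intercept

# Evaluate the Erf curve every 10 days over a 60-day window.
days = list(range(0, 61, 10))
curve = [erf_curve(t, max_deaths=200.0, rate=0.08, center=30.0) for t in days]
for t, value in zip(days, curve):
    print(f"day {t:2d}: {value:7.2f}")
```

A saturating shape like this is a natural prior mean for cumulative deaths, since the Gaussian Process then only has to model deviations from the fitted curve.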

Create a function to make predictions for one county.

In [8]:
def predict_county(county_fips):
    
    # If the number of deaths has always been 0, 
    # simply return zeroes for all percentiles
    d = select_region(df, county_fips)
    if len(d) == 0:
        return np.zeros((91, 9))
    if np.max(d['deaths'].values) == 0:
        return np.zeros((91, 9))

    # Following is the Gaussian Process model with the Erf/Linear Prior
    with pm.Model() as gp_covid_model:
        # Lengthscale (ρ) and amplitude (η) of the covariance function
        ρ = pm.HalfCauchy('ρ', 6)
        η = pm.HalfCauchy('η', 6)

        # Fit the Erf model
        fit_func, popt, pcov, X_train, y_train, X_pred = em.fit_erf(df, county_fips, boundary_date='2020-05-24')

        # Set the Erf prior
        M = Erf(fit_func, popt)
        # Initialize the covariance function
        K = (η**2) * pm.gp.cov.ExpQuad(1, ρ)
        
        # Noise of the data 
        σ = pm.HalfNormal('σ', 40)

        # Compute the Marginal Likelihood of the data
        covid_deaths_gp = pm.gp.Marginal(mean_func=M, cov_func=K)
        covid_deaths_gp.marginal_likelihood('covid_deaths', X=X_train.reshape(-1, 1), 
                               y=y_train, noise=σ)

    # Train the model using the Markov Chain Monte Carlo (MCMC) method
    if len(popt) == 2:
        with gp_covid_model:
            gp_trace = pm.sample(100, tune=100, cores=2, random_seed=10)
    else:
        with gp_covid_model:
            gp_trace = pm.sample(800, tune=1500, cores=2, random_seed=35)

    # Make posterior predictions
    with gp_covid_model:
        covid_pred_noise = covid_deaths_gp.conditional("covid_pred_noise", X_pred.reshape(-1,1), pred_noise=True)
        gp_covid_samples = pm.sample_posterior_predictive(gp_trace, vars=[covid_pred_noise], samples=500, random_seed=42)

    # Compute daily death count from cumulative predictions
    all_covid_samples = np.diff(gp_covid_samples['covid_pred_noise'])
    all_covid_samples = np.array(all_covid_samples)

    # Compute the deciles for the predictions
    all_deciles = np.transpose(np.array([np.percentile(all_covid_samples, per, axis=0) for per in np.arange(10, 100, 10)]))
    all_deciles[all_deciles < 0] = 0
    
    return all_deciles
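The last three steps of `predict_county` (differencing, percentiles, clipping) can be checked in isolation on synthetic data. The sketch below assumes the posterior samples form an array with one row per draw and 92 cumulative values per row, which is inferred from the `(91, 9)` shape returned for zero-death counties. The Poisson toy data stands in for the real samples and is not part of the model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for gp_covid_samples['covid_pred_noise']:
# 500 draws of a cumulative curve over 92 consecutive days.
n_draws, n_days = 500, 92
daily_true = rng.poisson(3.0, size=(n_draws, n_days))
cumulative = np.cumsum(daily_true, axis=1).astype(float)

# np.diff works along the last axis, so 92 cumulative values give 91 daily values.
daily = np.diff(cumulative)

# One row per day, one column per decile (10th, 20th, ..., 90th percentile).
deciles = np.percentile(daily, np.arange(10, 100, 10), axis=0).T

# Daily deaths cannot be negative, so clip any negative quantiles to zero.
deciles[deciles < 0] = 0

print(daily.shape, deciles.shape)
```

Passing the whole array of percentiles to `np.percentile` in one call gives the same result as the list comprehension in `predict_county`, just transposed in a single step.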

Function to make the final predictions for all counties. 

In [9]:
def predict_all_counties(out_file):
    out_dates = utils.all_output_dates()
    out_fips, all_row_starts = utils.all_output_fips('sample_submission.csv')
    num_dates, num_fips = len(out_dates), len(out_fips)
    out = np.zeros((num_dates * num_fips, 9))
    # Go through each county one by one, perform our fit, and record predictions
    for fi, fips in enumerate(out_fips):
        print('Processing FIPS', fips)
        preds = predict_county(float(fips))
        # Indices are non-contiguous because we're recording a single FIPS on many different dates
        out[np.arange(fi, out.shape[0], num_fips)] = preds
    # Add in the header line
    out_lines = [','.join(['id'] + ['%d' % x for x in np.arange(10, 91, 10)]) + '\n']
    # Add in all other lines one at a time
    for row_head, row in zip(all_row_starts, out):
        out_lines.append(','.join([row_head] + ['%.2f' % val for val in row]) + '\n')
    with open(out_file, 'w') as f:
        f.writelines(out_lines)
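The strided assignment in `predict_all_counties` is easier to follow on a small array. The sketch below assumes the submission rows are ordered by date first and by county within each date, which is what the indexing `np.arange(fi, out.shape[0], num_fips)` implies. The toy sizes and the fake predictions are made up for illustration.

```python
import numpy as np

num_dates, num_fips = 3, 4  # toy sizes; the notebook predicts 91 dates
out = np.zeros((num_dates * num_fips, 2))

for fi in range(num_fips):
    # Fake per-county predictions: column 0 = county index, column 1 = date index.
    preds = np.column_stack([np.full(num_dates, fi), np.arange(num_dates)])
    # Rows fi, fi + num_fips, fi + 2 * num_fips, ... belong to county fi.
    out[np.arange(fi, out.shape[0], num_fips)] = preds

print(out.astype(int).tolist())
```

Each county fills every `num_fips`-th row, so after the loop the county index cycles fastest and the date index changes once per block of `num_fips` rows.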

## Run the Model

The model has been finalized. To produce the submission file, one simply needs to run the following command.

In [10]:
predict_all_counties('final_predictions.csv')

Processing FIPS 10001


Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 4 divergences: 100%|██████████| 4600/4600 [00:30<00:00, 148.56draws/s]
There were 3 divergences after tuning. Increase `target_accept` or reparameterize.
There was 1 divergence after tuning. Increase `target_accept` or reparameterize.
The number of effective samples is smaller than 25% for some parameters.
100%|██████████| 500/500 [00:05<00:00, 92.92it/s] 


Processing FIPS 10003


Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 0 divergences: 100%|██████████| 4600/4600 [00:25<00:00, 178.54draws/s]
100%|██████████| 500/500 [00:05<00:00, 93.08it/s] 


Processing FIPS 10005


Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 5 divergences: 100%|██████████| 4600/4600 [00:48<00:00, 95.50draws/s] 
There were 5 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.6428342269228152, but should be close to 0.8. Try to increase the number of tuning steps.
The estimated number of effective samples is smaller than 200 for some parameters.
100%|██████████| 500/500 [00:07<00:00, 71.34it/s]


Processing FIPS 1001


Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 0 divergences: 100%|██████████| 400/400 [00:02<00:00, 146.64draws/s]
The acceptance probability does not match the target. It is 0.9177289136562271, but should be close to 0.8. Try to increase the number of tuning steps.
The acceptance probability does not match the target. It is 0.9004037481219446, but should be close to 0.8. Try to increase the number of tuning steps.
The number of effective samples is smaller than 25% for some parameters.
100%|██████████| 500/500 [00:04<00:00, 101.20it/s]


Processing FIPS 1003


Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 2 divergences: 100%|██████████| 400/400 [00:03<00:00, 122.18draws/s]
There were 2 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.8907692910359618, but should be close to 0.8. Try to increase the number of tuning steps.
The acceptance probability does not match the target. It is 0.9151547997990539, but should be close to 0.8. Try to increase the number of tuning steps.
100%|██████████| 500/500 [00:04<00:00, 109.34it/s]


Processing FIPS 1005


Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 165 divergences: 100%|██████████| 400/400 [00:02<00:00, 142.35draws/s]
There were 64 divergences after tuning. Increase `target_accept` or reparameterize.
The chain contains only diverging samples. The model is probably misspecified.
The acceptance probability does not match the target. It is 0.49349111079823066, but should be close to 0.8. Try to increase the number of tuning steps.
The rhat statistic is larger than 1.4 for some parameters. The sampler did not converge.
The number of effective samples is smaller than 10% for some parameters.
100%|██████████| 500/500 [00:04<00:00, 107.32it/s]


Processing FIPS 1007


Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 165 divergences: 100%|██████████| 400/400 [00:02<00:00, 141.44draws/s]
There were 64 divergences after tuning. Increase `target_accept` or reparameterize.
The chain contains only diverging samples. The model is probably misspecified.
The acceptance probability does not match the target. It is 0.49349111079823066, but should be close to 0.8. Try to increase the number of tuning steps.
The rhat statistic is larger than 1.4 for some parameters. The sampler did not converge.
The number of effective samples is smaller than 10% for some parameters.
100%|██████████| 500/500 [00:04<00:00, 101.34it/s]


Processing FIPS 1009


Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 165 divergences: 100%|██████████| 400/400 [00:02<00:00, 172.00draws/s]
There were 64 divergences after tuning. Increase `target_accept` or reparameterize.
The chain contains only diverging samples. The model is probably misspecified.
The acceptance probability does not match the target. It is 0.49349111079823066, but should be close to 0.8. Try to increase the number of tuning steps.
The rhat statistic is larger than 1.4 for some parameters. The sampler did not converge.
The number of effective samples is smaller than 10% for some parameters.
100%|██████████| 500/500 [00:04<00:00, 101.09it/s]


Processing FIPS 1011


Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 165 divergences: 100%|██████████| 400/400 [00:03<00:00, 121.38draws/s]
There were 64 divergences after tuning. Increase `target_accept` or reparameterize.
The chain contains only diverging samples. The model is probably misspecified.
The acceptance probability does not match the target. It is 0.49349111079823066, but should be close to 0.8. Try to increase the number of tuning steps.
The rhat statistic is larger than 1.4 for some parameters. The sampler did not converge.
The number of effective samples is smaller than 10% for some parameters.
100%|██████████| 500/500 [00:04<00:00, 115.14it/s]


Processing FIPS 1013


Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 0 divergences: 100%|██████████| 4600/4600 [00:39<00:00, 117.56draws/s]
The number of effective samples is smaller than 25% for some parameters.
100%|██████████| 500/500 [00:04<00:00, 110.08it/s]


Processing FIPS 1015


Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 200 divergences: 100%|██████████| 400/400 [00:02<00:00, 164.13draws/s]
The chain contains only diverging samples. The model is probably misspecified.
The acceptance probability does not match the target. It is 0.5896913218667597, but should be close to 0.8. Try to increase the number of tuning steps.
The chain contains only diverging samples. The model is probably misspecified.
The acceptance probability does not match the target. It is 0.1513000928280537, but should be close to 0.8. Try to increase the number of tuning steps.
The rhat statistic is larger than 1.4 for some parameters. The sampler did not converge.
The number of effective samples is smaller than 10% for some parameters.
100%|██████████| 500/500 [00:05<00:00, 90.26it/s] 


Processing FIPS 1017


Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 0 divergences: 100%|██████████| 4600/4600 [00:28<00:00, 159.84draws/s]
100%|██████████| 500/500 [00:04<00:00, 106.16it/s]


Processing FIPS 1019


Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 1 divergences: 100%|██████████| 400/400 [00:01<00:00, 205.01draws/s]
The acceptance probability does not match the target. It is 0.9480888519286992, but should be close to 0.8. Try to increase the number of tuning steps.
The acceptance probability does not match the target. It is 0.9066906348036741, but should be close to 0.8. Try to increase the number of tuning steps.
100%|██████████| 500/500 [00:05<00:00, 90.09it/s] 


Processing FIPS 1021


Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 165 divergences: 100%|██████████| 400/400 [00:02<00:00, 143.73draws/s]
There were 64 divergences after tuning. Increase `target_accept` or reparameterize.
The chain contains only diverging samples. The model is probably misspecified.
The acceptance probability does not match the target. It is 0.49349111079823066, but should be close to 0.8. Try to increase the number of tuning steps.
The rhat statistic is larger than 1.4 for some parameters. The sampler did not converge.
The number of effective samples is smaller than 10% for some parameters.
100%|██████████| 500/500 [00:05<00:00, 95.10it/s] 


Processing FIPS 1023


Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 1 divergences: 100%|██████████| 400/400 [00:02<00:00, 152.75draws/s]
The acceptance probability does not match the target. It is 0.9579414655034245, but should be close to 0.8. Try to increase the number of tuning steps.
The acceptance probability does not match the target. It is 0.9268493922796245, but should be close to 0.8. Try to increase the number of tuning steps.
The rhat statistic is larger than 1.05 for some parameters. This indicates slight problems during sampling.
The number of effective samples is smaller than 25% for some parameters.
100%|██████████| 500/500 [00:05<00:00, 97.18it/s] 


Processing FIPS 1025


Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 200 divergences: 100%|██████████| 400/400 [00:05<00:00, 71.85draws/s] 
The chain contains only diverging samples. The model is probably misspecified.
The acceptance probability does not match the target. It is 0.6105094887389774, but should be close to 0.8. Try to increase the number of tuning steps.
The chain contains only diverging samples. The model is probably misspecified.
The acceptance probability does not match the target. It is 0.5838064932235352, but should be close to 0.8. Try to increase the number of tuning steps.
The rhat statistic is larger than 1.4 for some parameters. The sampler did not converge.
The number of effective samples is smaller than 10% for some parameters.
100%|██████████| 500/500 [00:04<00:00, 112.26it/s]


Processing FIPS 1027


Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 101 divergences:  75%|███████▌  | 301/400 [00:03<00:01, 90.79draws/s]
The rhat statistic is larger than 1.4 for some parameters. The sampler did not converge.
The number of effective samples is smaller than 10% for some parameters.
100%|██████████| 500/500 [00:04<00:00, 110.37it/s]


Processing FIPS 1029


Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, η, ρ]
Sampling 2 chains, 0 divergences:  11%|█▏        | 45/400 [00:00<00:01, 181.63draws/s]


ValueError: Not enough samples to build a trace.

Runtime: This model takes about 8 hours to run.