In [1]:
import pandas as pd
import numpy as np
import datetime
from scorepi import *
from epiweeks import Week
import matplotlib.pyplot as plt
import matplotlib as mpl
from collections import defaultdict
import seaborn as sns
import pickle
from pathlib import Path
import requests
import os

import warnings
warnings.filterwarnings('ignore')

In [2]:
!python --version

Python 3.10.15


In [3]:
pd.__version__

'1.4.4'

Misc questions/notes for Clara:

The 'incidence' attribute for your classes is not used anywhere, comment indicates incidence=F would indicate cumulative data but there is no implementation for this. Should it be implemented or deleted?

The 'format_forecasts' method for the Flusight_2324 class is not used. Delete?

Better docstrings for classes and methods?

See the error noted in the WIS section.

# Load Data

At least for now, I'm using Clara's functions for pulling predictions and surveillance data. We will probably want to set up the Flusight repo as a submodule and pull new data on a schedule. 


Here's how it's currently implemented:
1. For each selected model and date, we save each forecast submission as-is in parquet format.
2. For each selected model and date, we read the corresponding parquets and concatenate into a single data frame.
3. We instantiate the Flusight_2324 class with a given start and end week, and call the format_forecasts_all method on the concatenated predictions dataframe, which filters for quantile predictions, removes dates after the end week, enforces datatypes, and returns the formatted dataframe to be used in scoring. (NOTE: start week is not used, what's up with that?)

Automating this workflow will probably involve:
1. Calculate scorings for all historical data once and save.
2. Set up the Flusight repo as a submodule and pull new data on a schedule.
3. It might be possible to keep the implementation with parquets if github can run it (pulling from flusight repo and writing and reading parquets within a python script). This would eliminate (2.) and could avoid problems when making epistorm-evaluations a submodule of epistorm-dashboard, since it also has the flusight repo as a submodule.
4. Every time new data is pulled, calculate scoring for only the new data and append to existing scoring files. Do we need to re-calculate scores for previous weeks in case surveillance data changes retrospectively?
5. We might generate separate files with transformed data for charts in the dashboard, if so either:
    1. Do that here, add epistorm-evaluations as a submodule of epistorm-dashboard, and pull the transformed scores on a schedule.
    2. Add epistorm-evaluations as a submodule of epistorm-dashboard, pull the scoring files on a schedule, and trigger a script within epistorm-dashboard to create the transformed scoring files.


In [4]:
# functions to download flusight model predictions and surveillance data

def pull_flusight_predictions(model,date):
    """pull_scenario_modeling_hub_predictions. Load predictions of the model saved by the scenario modeling
    hub.

    Parameters
    ----------
    model : str
        Model name on the
    dates : list or string
        List of potential dates in the iso format, e.g., 'yyyy-mm-dd', for the submission.
    """
    predictions = None
    
    url = f"https://raw.githubusercontent.com/cdcepi/Flusight-forecast-hub/main/model-output/{model}/{date}-{model}"
    for ext in [".csv",".gz",".zip",".csv.zip",".csv.gz"]:
        try:
            predictions = pd.read_csv(url+ext,dtype={'location':str},parse_dates=['target_end_date'])
        except:
            pass
    if predictions is None:
        print(f"Data for model {model} and date {dates} unavailable")
    return predictions


def pull_surveillance_data():
    mapping = {'death':'Deaths', 'case':'Cases', 'hospitalization': 'Hospitalizations'}
    
    url = f"https://raw.githubusercontent.com/cdcepi/FluSight-forecast-hub/main/target-data/target-hospital-admissions.csv"
    return pd.read_csv(url, dtype={'location':str})


In [5]:
# download and save surveillance data
surv = pull_surveillance_data() 
surv.to_parquet(f"./dat/target-hospital-admissions.pq", index=False)

Clara was hard-coding dates to pull each time she manually ran this notebook. Since we want scores for all historical data, I'm using all dates from the surveillance file. I don't think this should cause a bug or miss any data but it would be good to have a second opinion.

Once we've got all the historical scores, if we're only scoring new predictions every week, should we only use the most recent surveillance date?

In [6]:
#selecting all target dates that exist in the surveillance file
dates = pd.unique(surv.date)

#selecting just models used in the dashboard for now
#will need to expand eventually whether we keep the parquet implementation or pull files from the flusight repo as a submodule
models = ['CEPH-Rtrend_fluH', 'FluSight-baseline', 'FluSight-ensemble', 'MIGHTE-Nsemble', 'MOBS-GLEAM_FLUH', 'NU_UCSD-GLEAM_AI_FLUH']

In [9]:


for model in models:
    for date in dates:
        try:
            predictions = pull_flusight_predictions(model,date)

            predictions.to_parquet(f'./dat/{model}_{date}.pq', index=False)
        except Exception as e:
            print(e)

Data for model CEPH-Rtrend_fluH and date ['2024-04-27' '2024-04-20' '2024-04-13' '2024-04-06' '2024-03-30'
 '2024-03-23' '2024-03-16' '2024-03-09' '2024-03-02' '2024-02-24'
 '2024-02-17' '2024-02-10' '2024-02-03' '2024-01-27' '2024-01-20'
 '2024-01-13' '2024-01-06' '2023-12-30' '2023-12-23' '2023-12-16'
 '2023-12-09' '2023-12-02' '2023-11-25' '2023-11-18' '2023-11-11'
 '2023-11-04' '2023-10-28' '2023-10-21' '2023-10-14' '2023-10-07'
 '2023-09-30' '2023-09-23' '2023-09-16' '2023-09-09' '2023-09-02'
 '2023-08-26' '2023-08-19' '2023-08-12' '2023-08-05' '2023-07-29'
 '2023-07-22' '2023-07-15' '2023-07-08' '2023-07-01' '2023-06-24'
 '2023-06-17' '2023-06-10' '2023-06-03' '2023-05-27' '2023-05-20'
 '2023-05-13' '2023-05-06' '2023-04-29' '2023-04-22' '2023-04-15'
 '2023-04-08' '2023-04-01' '2023-03-25' '2023-03-18' '2023-03-11'
 '2023-03-04' '2023-02-25' '2023-02-18' '2023-02-11' '2023-02-04'
 '2023-01-28' '2023-01-21' '2023-01-14' '2023-01-07' '2022-12-31'
 '2022-12-24' '2022-12-17' '2022-12

# Classes and Methods

It would be nice to have proper docstrings for everything here. Should rename Flusight_2324 to something else.

In [7]:
# functions for calculating scores of the forecasts against the surveillance data

class Flusight_2324:
    
    def __init__(self, df, obsdf, target, incidence = True, max_date = False, start_week = False, end_week = False):
        self.df = df # input dataframe with all scenarios, locations, and quantiles
        self.obsdf = obsdf # input of surveillance data of interest
        self.target = target # target metric of interest (case, death, hospitalization)
        self.inc = incidence # True if incident measures, False if cumulative
        self.max_date = max_date # maximum date you want to analyze, cut off date
        self.start_week = start_week # beginning of observations of interest
        self.end_week = end_week # end of observations of interest
        self.locations = pd.DataFrame()
        self.scenario_ensemble = pd.DataFrame()
        
        
    def get_locations(self):
        # get df with US state names, populations, and abbreviations and corresponding numerical code 
        locations = pd.read_csv('./dat/locations.csv',dtype={'location':str})
        self.locations = locations
        
        return locations
        
        
    def get_observations(self, target_location):
        # get and format surveillance data of interest
        #observations = self.obsdf.copy()
        
        if self.target == 'hosp':
            target_obs = 'hospitalization'
        else:
            target_obs = self.target
            
        # read in observations dataframe
        observations = self.obsdf.copy().drop(columns= ['Unnamed: 0', 'weekly_rate'])
        observations['date'] = pd.to_datetime(observations['date'])

        #filter start - end week
        if self.start_week:
            observations = observations[(observations['date'] >= pd.to_datetime(self.start_week.startdate())) ]
            
        if self.end_week:
            observations = observations[(observations['date'] <= pd.to_datetime(self.end_week.enddate()))]
                                
        #filter location
        observations = observations[observations['location'] == target_location]

        #aggregate to weekly
        observations = observations.groupby(['location', pd.Grouper(key='date', freq='W-SAT')]).sum().reset_index()

        if self.max_date:
            observations = observations[observations['date'] <= max_date].copy()
            
        #transform to Observation object
        observations = Observations(observations)

        return observations
    
    
    
    def format_forecasts(self, model, date, target_location):
        # get forecast into standard format to use for scoring
        # read predictions from saved file 
        pred = pd.read_parquet(f"./dat/{model}_{date}.pq")
        pred['Model'] = model
        pred = pred[pred.output_type == 'quantile'] # only keep quantile predictions
        pred['target_end_date'] = pd.to_datetime(pred['target_end_date'])  #make sure dates are in datetime format
        if self.end_week:
            pred = pred[(pred['target_end_date'] <= pd.to_datetime(self.end_week.enddate()))] # filter dates
        
        
        pred['output_type_id'] = pred["output_type_id"].astype(float)  # make sure quantile levels are floats
        predictions = pred[pred['location'] == target_location].copy() # filter for location of interest

        return predictions
    
    
    def format_forecasts_all(self, dfformat):
        # get forecasts into standard format to use for scoring
        # dfformat input is the dataframe you want to format
        pred = dfformat.copy()
        pred = pred[pred.output_type == 'quantile'] # only keep quantile predictions
        pred['target_end_date'] = pd.to_datetime(pred['target_end_date']) #make sure dates are in datetime format
        if self.end_week:
            pred = pred[(pred['target_end_date'] <= pd.to_datetime(self.end_week.enddate()))] # filter dates
        
        pred['output_type_id'] = pred["output_type_id"].astype(float) # make sure quantile levels are floats
        
        return pred
        
    

class Scoring(Flusight_2324):
    # calculate score values for probabilistic epidemic forecasts 
    # find WIS, MAE, and coverage over whole projection window as well as timestamped for every week.
    # uses scorepi package to calculate the scores 
    # https://github.com/gstonge/scorepi/tree/main 
    # score dataframe must have 'Model' column to differentiate and calculate scores for different models
    
    def __init__(self, df, obsdf, target, incidence = True, max_date = False, start_week = False, 
                 end_week = False):
        super().__init__(df, obsdf, target, incidence, max_date, start_week, end_week)
        
    def get_all_average_scores(self, models, date):
        # calculate scores averaged over the full projection period
        
        pred1 = self.df.copy() # dataframe that will be scored
        loclist = list(pred1.location.unique()) 
        
        
        allscore = {}
        for model in models:
            allscore[model] = {}
            for target_location in loclist:
                if target_location == '72':
                    continue
                #print(target_location)
                
                observations = self.get_observations(target_location) # get surveillance data for target location 

                # filter by model and location
                pred = pred1[(pred1.Model==model) & (pred1['location']==target_location) ] 
                # make into Predictions object
                pred = Predictions(pred, t_col = 'target_end_date', quantile_col = 'output_type_id')
                observations = Observations(observations[observations.date<=pred.target_end_date.max()])
                #calculate scores
                d,_ = score_utils.all_scores_from_df(observations, pred, mismatched_allowed=False) 

                # save in dictionary
                allscore[model][target_location] = d
            
        
        return allscore
    
    def organize_average_scores(self, want_scores, models, date):
        # save average scores of interest into a dataframe
        # want_scores is list of scores you want to save in the dataframe
        # wis is 'wis_mean', and all coverages are '10_cov', '20_cov', ... '95_cov' etc.
        
        average_scores = pd.DataFrame()
        
        allscore = self.get_all_average_scores(models, date) #calculate all average scores
        
        for model in allscore.keys():
            scoresmod = allscore[model]
            for loc in scoresmod.keys():
                    
                scoresloc = scoresmod[loc]

                scoredict = {'Model': model ,'location': loc}
                for score in want_scores: # only save scores input into want_scores
                    scoredict[score] = scoresloc[score]

                average_scores = pd.concat([average_scores, pd.DataFrame(scoredict, index=[0])])

        average_scores = average_scores.reset_index() 
        average_scores = average_scores.drop(columns=['index'])
        
        return average_scores
    
    def get_all_timestamped_scores(self, models, date):
        # calculate score at each week of projection period
        pred = self.df.copy() # dataframe used for scoring
        loclist = list(pred.location.unique())
        
        allscore = {}
        
        for model in models:
            allscore[model] = {}
            for target_location in loclist:
                    
                observations = self.get_observations(target_location) # get surveillance data for target location
                
                try:
                    predss = pred[pred['location'] == target_location] #filter by location
                    # format forecasts into Predictions scorepi objec
                    predss = Predictions(predss, t_col = 'target_end_date', quantile_col = 'output_type_id')
                    
                    if len(predss)==0:
                        continue
                    
                    allscore[model][target_location] = {}
                    # loop over all time points in the predictions
                    for t in predss.target_end_date.unique():
                        prednew = predss[predss.target_end_date == t]
                        obsnew = observations[observations.date == t]

                        obsnew = Observations(obsnew)
                        prednew = Predictions(prednew, t_col = 'target_end_date', quantile_col = 'output_type_id')

                        # calculate scores
                        d = score_utils.all_timestamped_scores_from_df(obsnew, prednew)

                        allscore[model][target_location][t] = d
                except Exception as e:
                    print(e)
        
        return allscore
    
    
    def organize_timestamped_scores(self, want_scores, models, date):
        # save timestamped scores of interest into a dataframe
        # want_scores is list of scores you want to save in the dataframe
        # wis is 'wis'
        
        time_scores = pd.DataFrame()
        
        # calculate all scores evaluated for each time point
        allscore = self.get_all_timestamped_scores(models=models, date=date)
        
        for model in allscore.keys():
            scoremod = allscore[model]
        
            for loc in scoremod.keys():
                    
                scoresloc = scoremod[loc]

                for t in scoresloc.keys():
                    tdf = scoresloc[t]

                    scoredict = {'Model':model ,'location':loc, 'target_end_date':t}
                    for score in want_scores:
                        scoredict[score] = tdf[score]

                    # save scores in want_scores into a dataframe
                    time_scores = pd.concat([time_scores, pd.DataFrame(scoredict, index=[0])])

        
        time_scores = time_scores.reset_index() 
        time_scores = time_scores.drop(columns=['index'])
        
        return time_scores
    
    
    def get_rescaled_wis(self):
        # calculate WIS rescaled by standard deviation 
        # need to have WIS scores for multiple models
        # calculate standard deviation across models and divide WIS scores by this value
        
        time_scores = self.organize_timestamped_scores(['wis'])
        
        time_scores = time_scores[~time_scores['location'].isin(['60','66','69', '72', '78'])]
        
        wisstdev = pd.DataFrame()
        for loc in time_scores.location.unique():
            for date in time_scores.target_end_date.unique():
                # get all scores at each location and week
                df = time_scores[(time_scores.location == loc) & (time_scores.target_end_date == date)]
                
                stdev = df['wis'].std() # find standard deviation across models
                
                df['wis_scaled'] = df['wis'] / stdev #calculate rescaled wis by dividing by standard deviation
                
                wisstdev = pd.concat([wisstdev, df])
                
        wisstdev = wisstdev.reset_index()
        wisstdev = wisstdev.drop(columns=['index'])
        
        return wisstdev
    
    
    def get_rescaled_wis_obs(self, models, date):
        # calculate WIS rescaled by observations
        # divide WIS scores by corresponding value in surveillance data
        
        time_scores = self.organize_timestamped_scores(['wis'], models, date) # calculate WIS at each time point
        
        
        wisnorm = pd.DataFrame()
        for loc in time_scores.location.unique():
            for wk in time_scores.target_end_date.unique():
                # get forecast for each location and week
                df = time_scores[(time_scores.location == loc) & (time_scores.target_end_date == wk)]
                observations = self.get_observations(loc) # get observations for target location
                obs = observations[observations.date == wk].copy() # filter for week of interest
                
                if list(obs['value'])[0] == 0:
                    continue
                    
                df['wis_scaled'] = list(df['wis'])[0] / list(obs['value'])[0] # calculate rescaled WIS
                
                wisnorm = pd.concat([wisnorm, df])
                
        wisnorm = wisnorm.reset_index()
        wisnorm = wisnorm.drop(columns=['index'])
        
        return wisnorm
    
    
    def get_wis_ratios(self, numerator_model, denominator_model, timestamped=False):
        # need all models of interest in score df so we can calculate wis of each
        # input is numerator model and denominator model to calculate wis ratio
        # input whether you want ratio taken at each time point or over average of whole time window
        
        if timestamped == True: # ratio at each week
            scores = self.organize_timestamped_scores(['wis'])
        else: # one ratio for full projection period
            scores = self.organize_average_scores(['wis_mean'])
            scores = scores.rename(columns={'wis_mean':'wis'})
            
        num = scores[scores.Model == numerator_model] # get scores from numerator model
        num = num.rename(columns={"wis": "wis_num"})
        
        denom = scores[scores.Model == denominator_model] # get scores from denominator model
        denom = denom.rename(columns={"wis": "wis_denom"})
        
        # merge numerator and denominator dataframes
        if timestamped == True:
            scoresmerge = pd.merge(num, denom, how='inner', on = ['location', 'target_end_date'])
        else:
            scoresmerge = pd.merge(num, denom, how='inner', on = ['location'])
            
        scoresmerge['wis_ratio'] = scoresmerge['wis_num'] / scoresmerge['wis_denom'] # calculate WIS ratio
        
        return scoresmerge

# Calculate Scores

## Instantiate Flusight_2324 Class and Format Data for Scoring

I'm keeping the dates and models specified earlier.

In [8]:
# put all forecasts into one dataframe
predictionsall = pd.DataFrame()
for model in models:
    for date in dates:
        try:
            predictions = pd.read_parquet(f'./dat/{model}_{date}.pq')
            predictions['Model'] = model
            predictionsall = pd.concat([predictionsall, predictions])
        except Exception as e:
            print(e)

[Errno 2] No such file or directory: './dat/CEPH-Rtrend_fluH_2023-10-07.pq'
[Errno 2] No such file or directory: './dat/CEPH-Rtrend_fluH_2023-09-30.pq'
[Errno 2] No such file or directory: './dat/CEPH-Rtrend_fluH_2023-09-23.pq'
[Errno 2] No such file or directory: './dat/CEPH-Rtrend_fluH_2023-09-16.pq'
[Errno 2] No such file or directory: './dat/CEPH-Rtrend_fluH_2023-09-09.pq'
[Errno 2] No such file or directory: './dat/CEPH-Rtrend_fluH_2023-09-02.pq'
[Errno 2] No such file or directory: './dat/CEPH-Rtrend_fluH_2023-08-26.pq'
[Errno 2] No such file or directory: './dat/CEPH-Rtrend_fluH_2023-08-19.pq'
[Errno 2] No such file or directory: './dat/CEPH-Rtrend_fluH_2023-08-12.pq'
[Errno 2] No such file or directory: './dat/CEPH-Rtrend_fluH_2023-08-05.pq'
[Errno 2] No such file or directory: './dat/CEPH-Rtrend_fluH_2023-07-29.pq'
[Errno 2] No such file or directory: './dat/CEPH-Rtrend_fluH_2023-07-22.pq'
[Errno 2] No such file or directory: './dat/CEPH-Rtrend_fluH_2023-07-15.pq'
[Errno 2] No

In [9]:
# format forecasts in order to calculate scores
# input start and end weeks for the period of interest
test = Flusight_2324(df=pd.DataFrame(), obsdf=surv, target='hosp', incidence = True, max_date = False, 
                            start_week = Week(2023,40), end_week = Week(2024, 17))
predsall = test.format_forecasts_all( dfformat = predictionsall)

## WIS

At first I was working with a conda environment with fully updated packages. The following chunk threw an error in organize_timestamped_scores which traced down to something in Pandas that was making a deep copy. I changed my conda requirements to enforce python=3.10.* and pandas=1.4.* which eliminated the error. If we want to fix this to run on current versions of the required packages I can recreate the error.

In [10]:
# calculate wis for all forecasts

dfwis = pd.DataFrame()
#dic = {}
for horizon in [0, 1, 2,3]:
    for model in models:
        for date in dates: 
            start_week = Week.fromdate(pd.to_datetime(date)) # week of submission date
            end_week = start_week + 3 # target end date of last horizon
            
            # filter by horizon, model and submission date
            pred = predsall[(predsall.horizon==horizon) & (predsall.Model == model) & \
                            (predsall.reference_date == date)]
            if len(pred)==0:
                continue
            
            # calculate wis for each week
            test = Scoring(df=pred, obsdf=surv, target='hosp', incidence = True, max_date = False, 
                            start_week = start_week, end_week = end_week)

            out = test.organize_timestamped_scores(want_scores = ['wis'], models = [model], date=date)
            
            out['horizon'] = horizon
            out['reference_date'] = date
            
            dfwis = pd.concat([dfwis, out])

In [12]:
dfwis.to_csv('./WIS.csv', index=False)

In [13]:
dfwis

Unnamed: 0,Model,location,target_end_date,wis,horizon,reference_date
0,CEPH-Rtrend_fluH,01,2024-04-27,1.912889,0,2024-04-27
1,CEPH-Rtrend_fluH,02,2024-04-27,3.405661,0,2024-04-27
2,CEPH-Rtrend_fluH,04,2024-04-27,23.574783,0,2024-04-27
3,CEPH-Rtrend_fluH,05,2024-04-27,3.179565,0,2024-04-27
4,CEPH-Rtrend_fluH,06,2024-04-27,6.374348,0,2024-04-27
...,...,...,...,...,...,...
47,NU_UCSD-GLEAM_AI_FLUH,53,2023-12-23,79.365652,3,2023-12-02
48,NU_UCSD-GLEAM_AI_FLUH,54,2023-12-23,53.268863,3,2023-12-02
49,NU_UCSD-GLEAM_AI_FLUH,55,2023-12-23,113.721773,3,2023-12-02
50,NU_UCSD-GLEAM_AI_FLUH,56,2023-12-23,37.325938,3,2023-12-02


## WIS Ratio

In [14]:
# compute wis ratio, comparing the Flusight models' forecast scores to the Flusight baseline model
# divide flusight models by flusight baseline WIS scores at each location, week, horizon, location
baseline = dfwis[dfwis.Model == 'FluSight-baseline'] 
baseline = baseline.rename(columns={'wis':'wis_baseline', 'Model':'baseline'})
dfwis_test = dfwis[dfwis.Model != 'FluSight-baseline']

dfwis_ratio = pd.merge(dfwis_test, baseline, how='inner', on = ['location', 'target_end_date',
                                                                'horizon', 'reference_date'])

# calculate wis ratio
dfwis_ratio['wis_ratio'] = dfwis_ratio['wis']/dfwis_ratio['wis_baseline']

In [16]:
dfwis_ratio.to_csv('./WIS_ratio.csv', index=False)

In [15]:
dfwis_ratio

Unnamed: 0,Model,location,target_end_date,wis,horizon,reference_date,baseline,wis_baseline,wis_ratio
0,CEPH-Rtrend_fluH,01,2024-04-27,1.912889,0,2024-04-27,FluSight-baseline,3.186017,0.600401
1,FluSight-ensemble,01,2024-04-27,2.005467,0,2024-04-27,FluSight-baseline,3.186017,0.629459
2,MIGHTE-Nsemble,01,2024-04-27,2.774675,0,2024-04-27,FluSight-baseline,3.186017,0.870891
3,MOBS-GLEAM_FLUH,01,2024-04-27,9.271692,0,2024-04-27,FluSight-baseline,3.186017,2.910120
4,NU_UCSD-GLEAM_AI_FLUH,01,2024-04-27,6.820537,0,2024-04-27,FluSight-baseline,3.186017,2.140772
...,...,...,...,...,...,...,...,...,...
27397,MIGHTE-Nsemble,72,2023-11-04,29.624689,3,2023-10-14,FluSight-baseline,24.647738,1.201923
27398,CEPH-Rtrend_fluH,US,2023-11-04,437.658750,3,2023-10-14,FluSight-baseline,551.338550,0.793811
27399,FluSight-ensemble,US,2023-11-04,275.942167,3,2023-10-14,FluSight-baseline,551.338550,0.500495
27400,MIGHTE-Nsemble,US,2023-11-04,230.569058,3,2023-10-14,FluSight-baseline,551.338550,0.418199


## Coverage

In [17]:
dfcoverage = pd.DataFrame()

for date in dates:
    for model in models:
         
        start_week = Week.fromdate(pd.to_datetime(date)) # week of submission date
        end_week = start_week + 3 # target end date of last horizon

        # filter by model and submission date, only look at horizon 0-3
        pred = predsall[(predsall.Model == model) & \
                        (predsall.reference_date == date) & (predsall.horizon >=0)]
        if len(pred)==0:
            continue

        # calculate wis for each week
        test = Scoring(df=pred, obsdf=surv, target='hosp', incidence = True, max_date = None, 
                        start_week = start_week, end_week = end_week)

        out = test.organize_average_scores(want_scores=['10_cov', '20_cov', '30_cov', '40_cov', '50_cov',
            '60_cov', '70_cov', '80_cov', '90_cov', '95_cov', '98_cov'], models = [model], date=date)

        out['horizon'] = horizon
        out['reference_date'] = date

        dfcoverage = pd.concat([dfcoverage, out])

In [19]:
dfcoverage.to_csv('./coverage.csv', index=False)

In [18]:
dfcoverage

Unnamed: 0,Model,location,10_cov,20_cov,30_cov,40_cov,50_cov,60_cov,70_cov,80_cov,90_cov,95_cov,98_cov,horizon,reference_date
0,CEPH-Rtrend_fluH,01,0.0,1.00,1.00,1.00,1.0,1.0,1.00,1.00,1.00,1.00,1.00,3,2024-04-27
1,CEPH-Rtrend_fluH,02,0.0,0.00,0.00,0.00,0.0,0.0,0.00,1.00,1.00,1.00,1.00,3,2024-04-27
2,CEPH-Rtrend_fluH,04,0.0,0.00,0.00,0.00,0.0,0.0,0.00,0.00,0.00,1.00,1.00,3,2024-04-27
3,CEPH-Rtrend_fluH,05,0.0,0.00,0.00,1.00,1.0,1.0,1.00,1.00,1.00,1.00,1.00,3,2024-04-27
4,CEPH-Rtrend_fluH,06,1.0,1.00,1.00,1.00,1.0,1.0,1.00,1.00,1.00,1.00,1.00,3,2024-04-27
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
47,MOBS-GLEAM_FLUH,53,0.0,0.00,0.00,0.00,0.0,0.0,0.00,0.00,0.25,0.25,0.50,3,2023-10-14
48,MOBS-GLEAM_FLUH,54,0.0,0.25,0.25,0.25,0.5,0.5,0.50,1.00,1.00,1.00,1.00,3,2023-10-14
49,MOBS-GLEAM_FLUH,55,0.0,0.00,0.00,0.00,0.0,0.0,0.25,0.25,0.25,0.25,0.75,3,2023-10-14
50,MOBS-GLEAM_FLUH,56,0.0,0.00,0.00,0.25,0.5,0.5,0.50,0.50,0.75,0.75,0.75,3,2023-10-14
