# Hindcast Summary Demo

This example shows how to retrieve accuracy statistics from Salient's data validation system and highlight areas of particular interest.


In [None]:
import os
import sys

import pandas as pd

# Prevent wrapping on tables for readability
pd.set_option("display.width", None)
pd.set_option("display.max_columns", None)
pd.set_option("display.expand_frame_repr", False)

try:
    import salientsdk as sk
except ModuleNotFoundError as e:
    # Fall back to a local checkout of the SDK in the parent directory
    if os.path.exists("../salientsdk"):
        sys.path.append(os.path.abspath(".."))
        import salientsdk as sk
    else:
        raise ModuleNotFoundError("Install salient SDK with: pip install salientsdk") from e

sk.set_file_destination("hindcast_summary_example", force=False)
sk.login("username", "password", verbose=False)

<requests.sessions.Session at 0x7f83101078d0>

## Validate

The `hindcast_summary` function is a convenience interface to the `hindcast_summary` [API endpoint](https://api.salientpredictions.com/v2/documentation/api/#/Validation/hindcast_summary), which delivers accuracy statistics. This example vectorizes multiple requests, downloads them in parallel, and concatenates all the results into a single file.
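To make "vectorizes multiple requests" concrete, here is a minimal sketch of how list-valued arguments could expand into individual requests. It assumes the expansion is a Cartesian product of the list-valued arguments, which is consistent with the `region`, `season`, and `variable` columns in the output below but is not confirmed SDK behavior. It uses only the standard library and does not call the Salient API.

```python
from itertools import product

# The same argument values used in this example's hindcast_summary call.
regions = ["united states", "germany"]
variables = ["temp", "precip"]
seasons = ["DJF", "JJA"]

# Assumption: each combination of list-valued arguments is one request.
combinations = list(product(regions, variables, seasons))

for region, variable, season in combinations:
    print(f"{region:<14} {variable:<7} {season}")

print(len(combinations), "combinations")
```

With two regions, two variables, and two seasons, this yields eight combinations, each of which would contribute its own block of `Lead` rows to the concatenated file.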


In [None]:
score_file = sk.hindcast_summary(
    loc=sk.Location(region=["united states", "germany"]),
    variable=["temp", "precip"],  # Vectorize multi-variable
    metric="crps_skill_score",
    reference="clim",
    season=["DJF", "JJA"],  # Vectorize meteorological winter and summer
    version="-default",  # this is vectorizable
    force=False,
    verbose=False,
)

print(score_file)
print(pd.read_csv(score_file))

hindcast_summary_example/hindcast_summary_f326bf0e2223e1324dff1a5af22b4065.csv
            Lead      Reference Model  Reference CRPS  Salient CRPS  Salient CRPS Skill Score (%)         region season variable
0         Week 1  30 year Climatology            2.58          0.93                          63.1  united states    DJF     temp
1         Week 2  30 year Climatology            2.57          1.86                          27.8  united states    DJF     temp
2         Week 3  30 year Climatology            2.57          2.39                           7.6  united states    DJF     temp
3         Week 4  30 year Climatology            2.57          2.48                           3.5  united states    DJF     temp
4         Week 5  30 year Climatology            2.57          2.50                           2.8  united states    DJF     temp
..           ...                  ...             ...           ...                           ...            ...    ...      ...
91       Month 3  

## Simplify Score File

The native score file returned by `hindcast_summary` is in "long" format, which means that each `Lead` time gets a row of its own. To make the data more compact, we can keep only the fifth column, the skill score, and pivot its rows into columns. The SDK function `transpose_hindcast_summary()` performs this transformation on the hindcast CSV.

The `min_score` argument converts any score below the threshold value into `NaN`. By default, this is set to zero. In this example, setting it to `5.0` creates a "bright spot" analysis that highlights the cases where the Salient forecast significantly outperforms the reference model.
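The reshaping and thresholding described above can be sketched in plain pandas. This is an illustration of the idea, not the implementation of `transpose_hindcast_summary()`. It uses the five United States winter temperature rows shown in the score file output, and the names `long_scores`, `wide`, and `bright` are hypothetical.

```python
import pandas as pd

# Five rows copied from the score file output above (united states, DJF, temp).
long_scores = pd.DataFrame(
    {
        "Lead": ["Week 1", "Week 2", "Week 3", "Week 4", "Week 5"],
        "Salient CRPS Skill Score (%)": [63.1, 27.8, 7.6, 3.5, 2.8],
        "region": "united states",
        "season": "DJF",
        "variable": "temp",
    }
)

# Pivot: one row per (region, season, variable), one column per Lead.
wide = long_scores.pivot(
    index=["region", "season", "variable"],
    columns="Lead",
    values="Salient CRPS Skill Score (%)",
)

# Keep scores at or above the threshold; everything else becomes NaN.
min_score = 5.0
bright = wide.where(wide >= min_score)

print(bright)
```

In this sketch, the Week 4 and Week 5 scores (3.5 and 2.8) fall below the threshold and are replaced with `NaN`, while Weeks 1 through 3 survive.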


In [None]:
scores = sk.transpose_hindcast_summary(score_file, min_score=5.0)

save_file = os.path.join(sk.get_file_destination(), "hindcast_summary_transposed.csv")
scores.to_csv(save_file)

print(save_file)
print(scores)

hindcast_summary_example/hindcast_summary_transposed.csv
Lead                           Week 1  Week 2  Week 3  Week 4  Week 5  Month 1  Month 2  Month 3  Months 1-3  Months 4-6  Months 7-9  Months 10-12   mean
region        season variable                                                                                                                            
germany       DJF    precip      47.0    12.0     NaN     NaN     NaN     13.0      NaN      NaN         NaN         NaN         NaN           NaN    NaN
                     temp        70.2    29.1     7.9     5.1     NaN     30.3      7.7      NaN        11.2         8.8         6.5           7.1   5.99
              JJA    precip      32.0     NaN     NaN     NaN     NaN      7.1      NaN      NaN         NaN         NaN         NaN           NaN    NaN
                     temp        70.3    27.7     9.6     7.1     8.8     34.3     19.8     19.1        21.5        22.2        23.6          23.2  13.09
united states DJF  