# Evaluating Sequence Risk with `emergenet.emergenet`

## Installation

In [14]:
%%capture
!pip install emergenet --upgrade

import pandas as pd
from emergenet.emergenet import Enet, predict_irat_emergence

## Evaluation of Risk at a Particular Time

We demonstrate `emergenet.emergenet` on a sequence that has already been assessed by the CDC's Influenza Risk Assessment Tool (IRAT): A/mink/Spain/3691-8_22VIR10586-10/2022. IRAT assessed this virus in April 2023, so we use the same analysis date.

In [15]:
# Load IRAT sequence - A/mink/Spain/3691-8_22VIR10586-10/2022
irat_df = pd.read_csv('data/emergenet/irat.csv')
row = irat_df.iloc[0]
print(row)

Influenza Virus                              A/mink/Spain/3691-8_22VIR10586-10/2022
Virus Type                                                                     H5N1
Date of Risk Assessment                                                  2023-04-01
Risk Score Category                                                        Moderate
Emergence Score                                                                 5.1
Impact Score                                                                    6.2
Mean Low Acceptable Emergence                                                  3.96
Mean High Acceptable Emergence                                                 6.27
Mean Low Acceptable Impact                                                     4.95
Mean High Acceptable Impact                                                    7.43
HA Sequence                       MENIVLLLAIVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQ...
NA Sequence                       MNPNQRIITTGSICMVIGIVSLMLQIGNIISIWVSHSIQTGN

The `Enet` model requires three inputs: the analysis date in `YYYY-MM-DD` format, the HA sequence, and the NA sequence. It trains multiple Emergenet models, so expect it to take several minutes.
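
Because training takes several minutes, it can be worth sanity-checking the two sequences first. The sketch below is not part of `emergenet`: `summarize_sequence` is a hypothetical helper that reports a sequence's length and any characters outside the 20 standard amino-acid letters. Whether Emergenet accepts ambiguity codes or gap characters is not stated here, so the helper only flags them for review.

```python
# Hypothetical helper (not part of emergenet): quick sanity check for a protein sequence.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

def summarize_sequence(seq: str) -> dict:
    """Return the sequence length and any characters outside the 20 standard residues."""
    cleaned = seq.strip().upper()
    nonstandard = sorted(set(cleaned) - STANDARD_AMINO_ACIDS)
    return {"length": len(cleaned), "nonstandard": nonstandard}

# Toy input for illustration only; in the notebook, pass row['HA Sequence'] or row['NA Sequence'].
print(summarize_sequence("MNPNQ"))
```

A non-empty `nonstandard` list does not necessarily mean the sequence is unusable, only that it deserves a closer look before spending several minutes on training.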

**As of 2024-04-01, Emergenet only supports sequences from 2010-01-01 to 2024-01-01.**
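
A date outside this window can be caught before any training starts. The following is a minimal sketch, not an `emergenet` function: `check_analysis_date` is a hypothetical helper, and treating both endpoints as inclusive is an assumption, since the notebook does not say whether the bounds themselves are supported.

```python
from datetime import date, datetime

# Supported window as stated in this notebook (as of 2024-04-01).
# Assumption: both endpoints are treated as inclusive.
SUPPORTED_START = date(2010, 1, 1)
SUPPORTED_END = date(2024, 1, 1)

def check_analysis_date(analysis_date: str) -> date:
    """Parse a YYYY-MM-DD string and verify it lies in the supported window."""
    # strptime raises ValueError if the string is not in YYYY-MM-DD form
    parsed = datetime.strptime(analysis_date, "%Y-%m-%d").date()
    if not (SUPPORTED_START <= parsed <= SUPPORTED_END):
        raise ValueError(
            f"{analysis_date} is outside the supported range "
            f"{SUPPORTED_START} to {SUPPORTED_END}"
        )
    return parsed

print(check_analysis_date("2023-04-01"))
```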

Optionally, you can pass a `save_data` directory. Emergenet then saves the trained models, the data used to train them, and the risk results to that directory.

In [None]:
analysis_date = row['Date of Risk Assessment']
ha_seq = row['HA Sequence']
na_seq = row['NA Sequence']
SAVE_DIR = 'data/emergenet/example_results/'

# Initialize the Enet; save_data stores the trained models, training data, and risk results
enet = Enet(analysis_date, ha_seq, na_seq, save_data=SAVE_DIR, random_state=42)

# Estimate the Enet risk scores
ha_risk, na_risk = enet.risk(sample_size=10000)

# Map the Enet risk scores to the IRAT risk scale
irat, irat_low, irat_high = predict_irat_emergence(ha_risk, na_risk)
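
Since IRAT has already scored this virus, a natural next step is to compare the predicted emergence score with the published one. The sketch below assumes `irat` is a single number on the IRAT emergence scale. `compare_to_irat` is a hypothetical helper, and the prediction used here is a placeholder, not an actual Emergenet result.

```python
def compare_to_irat(predicted: float, irat_score: float, low: float, high: float) -> dict:
    """Relate a predicted emergence score to IRAT's published score and acceptable range."""
    return {
        "difference": round(predicted - irat_score, 2),
        "within_acceptable_range": low <= predicted <= high,
    }

# Published IRAT values for this virus, copied from the row printed earlier.
IRAT_EMERGENCE = 5.1
IRAT_LOW, IRAT_HIGH = 3.96, 6.27

# Placeholder prediction for illustration only; substitute the `irat` value computed above.
example_prediction = 5.0
summary = compare_to_irat(example_prediction, IRAT_EMERGENCE, IRAT_LOW, IRAT_HIGH)
print(summary)
```

In the notebook, the same values can be read from `row['Emergence Score']`, `row['Mean Low Acceptable Emergence']`, and `row['Mean High Acceptable Emergence']` rather than typed in by hand.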

## Evaluation of Risk at Present Time

**As of 2024-04-01, "present_time" = 2024-01-01.**

What if we want to evaluate the risk of A/mink/Spain/3691-8_22VIR10586-10/2022 at the present time? Instead of providing an analysis date, pass `'PRESENT'`. This uses Enet models pre-trained as of **"present_time"**, so it takes only a couple of minutes.

In [None]:
# Initialize the Enet
enet_present = Enet('PRESENT', ha_seq, na_seq, random_state=42)

# Estimate the Enet risk scores at present
ha_risk_present, na_risk_present = enet_present.risk(sample_size=10000)

# Map the Enet risk scores to the IRAT risk scale
irat_present, irat_low_present, irat_high_present = predict_irat_emergence(ha_risk_present, na_risk_present)
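
To read the two assessments side by side, a small formatter is enough. This is a sketch with a hypothetical helper, `format_estimate`, and placeholder numbers. It assumes the three values returned by `predict_irat_emergence` are a point estimate followed by a lower and an upper value, as the variable names above suggest.

```python
def format_estimate(label: str, estimate: float, low: float, high: float) -> str:
    """Format one assessment as a single readable line."""
    return f"{label}: {estimate:.2f} (low {low:.2f}, high {high:.2f})"

# Placeholder numbers for illustration only; in the notebook, pass
# (irat, irat_low, irat_high) and (irat_present, irat_low_present, irat_high_present).
print(format_estimate("April 2023", 5.0, 4.5, 5.5))
print(format_estimate("Present", 5.2, 4.7, 5.7))
```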