# Data Collection

### NOTE: Data is current as of 1/1/2024

For each of the following four tasks:
- Download viruses with complete HA and NA protein segments as FASTA
- Use this metadata format for the FASTA headers: `Isolate name|Type|Gene name|Collection date|Protein Accession no.|Isolate ID|Lineage|Clade`
    
1. Collect all human viruses dated **1/1/2010 - 1/1/2024** from [GISAID](https://gisaid.org/)
2. Collect data on strains previously analyzed by IRAT and replicate the [IRAT table](https://www.cdc.gov/flu/pandemic-resources/monitoring/irat-virus-summaries.htm#H1N2variant)
3. Collect variant sequences (animal sequences that have emerged in humans)
4. Collect all animal sequences from **1/1/2020 - 1/1/2024**
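
As a rough illustration of the header layout listed above, the sketch below splits one pipe-delimited FASTA header into named fields. The function `split_header`, the field keys, and the sample header are all hypothetical; this is not how `emergenet.utils.parse_fasta` is implemented.

```python
# Hypothetical helper: split a pipe-delimited FASTA header into named fields.
# Field order follows the metadata format listed above; key names are assumptions.
FIELDS = ['name', 'type', 'gene', 'date', 'accession', 'isolate_id', 'lineage', 'clade']

def split_header(header):
    """Return a dict mapping each field name to its value in the header line."""
    parts = header.lstrip('>').strip().split('|')
    if len(parts) != len(FIELDS):
        raise ValueError(f'expected {len(FIELDS)} fields, got {len(parts)}')
    return dict(zip(FIELDS, parts))

# Made-up example header, not a real GISAID record
example = '>A/Example/1/2020|H1N1|HA|2020-01-15|EPI0000001|EPI_ISL_000001||'
record = split_header(example)
print(record['name'], record['gene'], record['date'])
```

Empty trailing fields (here lineage and clade) come back as empty strings, so downstream code can decide how to treat them.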

In [3]:
import glob, os, json
import pandas as pd
from datetime import date
from collections import Counter
import warnings
warnings.filterwarnings('ignore')
from emergenet.utils import parse_fasta, filter_by_date_range, save_model
from emergenet.emergenet import Enet

# Directory paths
HUMAN_DIR = 'data/human/'
ENET_DATA_DIR = 'emergenet/data/'
ENET_MODEL_DIR = 'emergenet/models/'
IRAT_DIR = 'data/irat/'
VARIANT_DIR = 'data/variant/'
ANIMAL_DIR = 'data/animal/'

# Length to truncate sequences
HA_TRUNC = 560
NA_TRUNC = 460

## 1) Human Sequences
Collect all human viruses dated **1/1/2010 - 1/1/2024** from [GISAID](https://gisaid.org/)

In [60]:
# Compile all human sequences
fasta_files = glob.glob(os.path.join(HUMAN_DIR, '*.fasta'))

human = pd.DataFrame()
for file in fasta_files:
    df = parse_fasta(file)
    human = pd.concat([human, df], ignore_index=True)
    
# Sort by date and save
human = human.sort_values(by=['date']).reset_index(drop=True)
human.to_csv(ENET_DATA_DIR + 'human.csv', index=False)
human.to_csv('data/human.csv', index=False)
human.head()

Unnamed: 0,name,subtype,segment,date,accession,sequence,HA,NA
0,A/Finland/836N/2010,H1N1,HA,2010-01-01,EPI545126,MKAILVVLLYTFATANADTLCIGYHANNSTDTVDTVLEKNVTVTHS...,H1,N1
1,A/Finland/776N/2010,H1N1,NA,2010-01-01,EPI545107,MNPNQKIITIGSVCMTIGMANLILQIGNIISIWISHSIQLGNQNQI...,H1,N1
2,A/Finland/776N/2010,H1N1,HA,2010-01-01,EPI545105,MKAILVVLLYIFATANADTLCIGYHANNSTDTVDTVLEKNVTVTHS...,H1,N1
3,A/Finland/799N/2010,H1N1,NA,2010-01-01,EPI545114,MNPNQKIITIGSVCMTIGMANLILQIGNIISIWISHSIQLGNQNQI...,H1,N1
4,A/Finland/799N/2010,H1N1,HA,2010-01-01,EPI545112,MKAILVVLLYTFATANADTLCIGYHANNSTDTVDTVLEKNVTVTHS...,H1,N1


### Prepare human sequences for current time

These sequences are used to evaluate animal strains at the current time and are preloaded into Emergenet.

1. Filter within one year of current time (1/1/2023 - 1/1/2024)
2. Filter by subtype for both HA and NA
3. Save to `emergenet/data/current/<subtype>.csv`
4. Train Enet models for each subtype and save to `emergenet/models/<subtype>.joblib`
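
Steps 1 and 2 above can be sketched in plain pandas on a toy table. The helper `in_window`, the inclusive bounds, and the toy rows are assumptions made for illustration; the notebook itself relies on `filter_by_date_range` from `emergenet.utils`, whose exact boundary handling is not shown here.

```python
import pandas as pd

def in_window(df, col, start, end):
    """Keep rows whose ISO date string in `col` lies in [start, end] (inclusive; an assumption)."""
    dates = pd.to_datetime(df[col])
    mask = (dates >= pd.Timestamp(start)) & (dates <= pd.Timestamp(end))
    return df[mask]

# Toy data, not real GISAID records
toy = pd.DataFrame({
    'date': ['2022-06-01', '2023-03-10', '2023-11-30', '2023-12-15'],
    'segment': ['HA', 'HA', 'HA', 'NA'],
    'HA': ['H3', 'H1', 'H3', 'H3'],
    'NA': ['N2', 'N1', 'N2', 'N2'],
})

# Step 1: keep the one-year window ending at the current time
current = in_window(toy, 'date', '2023-01-01', '2024-01-01')
# Step 2: count HA subtypes among HA-segment rows
ha_counts = current[current['segment'] == 'HA']['HA'].value_counts()
print(ha_counts.to_dict())
```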

In [5]:
# Filter human sequences for current time
human = pd.read_csv(ENET_DATA_DIR + 'human.csv', na_filter=False)

os.makedirs(ENET_DATA_DIR + 'current/', exist_ok=True)
os.makedirs(ENET_MODEL_DIR, exist_ok=True)  # ensure the model directory exists before saving
end = date(2024, 1, 1)
start = date(end.year - 1, end.month, end.day)
current = filter_by_date_range(human, 'date', str(start), str(end))

# Initialize Enet
enet = Enet(str(end), ' '*HA_TRUNC, ' '*NA_TRUNC, random_state=42)

current_models = {'HA':[], 'NA':[]}
for segment in ['HA', 'NA']:
    # Minimum sequence length for this segment
    TRUNC = HA_TRUNC if segment == 'HA' else NA_TRUNC
    filtered = current[current['segment'] == segment]
    filtered = filtered[filtered['sequence'].str.len() >= TRUNC]
    subtypes = Counter(filtered[segment])
    for subtype in subtypes:
        # Skip subtypes with fewer than 15 sequences
        if subtypes[subtype] < 15:
            continue
        current_models[segment].append(subtype)
        # Use entire population for constructing Enet
        df = filtered[filtered[segment] == subtype]
        print(subtype)
        print('Total:', len(df))
        enet_model = enet.train(segment, df, 10000, include_target=False, n_jobs=8)
        save_model(enet_model, ENET_MODEL_DIR + subtype + '.joblib')
        # Save only unique sequences
        df = df.drop_duplicates(subset=['sequence'])
        df.to_csv(ENET_DATA_DIR + 'current/' + subtype + '.csv', index=False)
        print('Unique:', len(df))
        print()

# Save dict of available models
with open(ENET_DATA_DIR + 'current_subtypes.json', 'w') as file:
    json.dump(current_models, file)

H3
Total: 6911
Unique: 2196

H1
Total: 11106
Unique: 2925

N2
Total: 6912
Unique: 1681

N1
Total: 11111
Unique: 2136
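
One likely reason for `na_filter=False` in the cell above: by default, pandas treats the string `NA` as a missing value, which would blank out every `NA` entry in the `segment` column when `human.csv` is read back. The toy CSV below is made up to show the difference.

```python
import io
import pandas as pd

# Toy CSV whose segment column contains the literal string 'NA'
csv_text = 'name,segment\nA/Example/1/2020,HA\nA/Example/1/2020,NA\n'

default = pd.read_csv(io.StringIO(csv_text))                 # 'NA' is parsed as missing
kept = pd.read_csv(io.StringIO(csv_text), na_filter=False)   # 'NA' is kept as a string

print(default['segment'].isna().sum())
print(kept['segment'].tolist())
```

Without `na_filter=False`, a filter such as `current['segment'] == 'NA'` would match no rows.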



## 2) IRAT Sequences
Collect data on strains previously analyzed by IRAT and replicate the [IRAT table](https://www.cdc.gov/flu/pandemic-resources/monitoring/irat-virus-summaries.htm#H1N2variant)
- Protein HA and NA for each sequence are downloaded from GISAID or NCBI
- `A/duck/New York/1996` was difficult to locate; only its NA sequence is available
    
### Mean Low - Mean High

From [IRAT](https://www.cdc.gov/flu/pandemic-resources/national-strategy/risk-assessment.htm): "Since the IRAT is qualitative in nature, its scores involve some degree of subjectivity. Accordingly, subject matter experts provide a range of “acceptable” scores for each risk element by identifying a lower and upper bound they would consider acceptable from other experts scoring the same element. The mean of the lowest acceptable bound and the mean of the highest acceptable bound from each risk element are used in the weighted “emergence” or “public health impact” calculations to create the “mean-high” and “mean-low” acceptable score ranges."
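
To make the quoted procedure concrete, here is a small worked sketch: each expert gives a lower and an upper acceptable bound per risk element, the bounds are averaged per element, and the averages are combined with element weights. The element names, weights, and bounds below are made up; IRAT's actual risk elements and weights are not reproduced here.

```python
# Hypothetical inputs: three risk elements, three experts each.
# Element names, weights, and bounds are made up for illustration only.
weights = {'element_a': 0.5, 'element_b': 0.3, 'element_c': 0.2}
lower_bounds = {'element_a': [4, 5, 6], 'element_b': [2, 3, 4], 'element_c': [6, 7, 8]}
upper_bounds = {'element_a': [6, 7, 8], 'element_b': [4, 5, 6], 'element_c': [8, 9, 10]}

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def weighted_score(bounds, weights):
    """Weighted sum of the per-element mean bound."""
    return sum(weights[name] * mean(vals) for name, vals in bounds.items())

mean_low = weighted_score(lower_bounds, weights)
mean_high = weighted_score(upper_bounds, weights)
print(round(mean_low, 2), round(mean_high, 2))
```

The resulting pair plays the role of the "Mean Low Acceptable" and "Mean High Acceptable" columns in the table compiled below; entries recorded as `-1` there mark assessments for which no range was entered.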

In [10]:
def add_irat_entry(df, subtype, virus_name, 
                   assessment_date, risk_category,
                   emergence_risk, impact_risk,
                   emergence_low, emergence_high,
                   impact_low, impact_high):
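    # FASTA file names use ':' in place of '/' in the virus name
    fasta_path = IRAT_DIR + virus_name.replace('/',':') + '.fasta'
    seq_df = parse_fasta(fasta_path)
    try:
        ha_seq = seq_df[seq_df['segment'] == 'HA']['sequence'].values[0]
        ha_acc = seq_df[seq_df['segment'] == 'HA']['accession'].values[0]
    except (IndexError, KeyError):
        # No HA record found; use '-1' as a placeholder
        ha_seq = '-1'
        ha_acc = '-1'
    try:
        na_seq = seq_df[seq_df['segment'] == 'NA']['sequence'].values[0]
        na_acc = seq_df[seq_df['segment'] == 'NA']['accession'].values[0]
    except (IndexError, KeyError):
        # No NA record found; use '-1' as a placeholder
        na_seq = '-1'
        na_acc = '-1'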
    entry_df = pd.DataFrame({'Influenza Virus':[virus_name],
                             'Virus Type':[subtype],
                             'Date of Risk Assessment':[assessment_date],
                             'Risk Score Category':[risk_category],
                             'Emergence Score':[emergence_risk],
                             'Impact Score':[impact_risk],
                             'Mean Low Acceptable Emergence':[emergence_low],
                             'Mean High Acceptable Emergence':[emergence_high],
                             'Mean Low Acceptable Impact':[impact_low],
                             'Mean High Acceptable Impact':[impact_high],
                             'HA Accession':[ha_acc],
                             'NA Accession':[na_acc],
                             'HA Sequence':[ha_seq],
                             'NA Sequence':[na_seq],
                             'HA Length':[len(ha_seq)],
                             'NA Length':[len(na_seq)]})
    # DataFrame.append was removed in pandas 2.0; use pd.concat instead
    return pd.concat([df, entry_df], ignore_index=True)


# Compile all IRAT sequences
df = pd.DataFrame()
df = add_irat_entry(df,'H1N1','A/swine/Shandong/1207/2016',date(2020,7,1),'Moderate',7.5,6.9,6.33,8.65,5.42,8.09)
df = add_irat_entry(df,'H1N1','A/duck/New York/1996',date(2011,11,1),'Low',2.3,2.4,-1,-1,-1,-1)
df = add_irat_entry(df,'H1N2','A/California/62/2018',date(2019,7,1),'Moderate',5.8,5.7,4.22,7.16,3.8,7.09)
df = add_irat_entry(df,'H3N2','A/Ohio/13/2017',date(2019,7,1),'Moderate',6.6,5.8,5.01,7.59,4.09,7.26)
df = add_irat_entry(df,'H3N2','A/Indiana/08/2011',date(2012,12,1),'Moderate',6.0,4.5,-1,-1,-1,-1)
df = add_irat_entry(df,'H3N2','A/canine/Illinois/12191/2015',date(2016,6,1),'Low',3.7,3.7,2.81,4.9,2.69,4.9)
df = add_irat_entry(df,'H5N1','A/American wigeon/South Carolina/AH0195145/2021',date(2022,3,1),'Moderate',4.4,5.1,3.28,5.51,3.84,6.19)
df = add_irat_entry(df,'H5N1','A/American green-winged teal/Washington/1957050/2014',date(2015,3,1),'Low-Moderate',3.6,4.1,2.4,4.6,3,5.6)
df = add_irat_entry(df,'H5N1','A/Vietnam/1203/2004',date(2011,11,1),'Moderate',5.2,6.6,-1,-1,-1,-1)
df = add_irat_entry(df,'H5N2','A/Northern pintail/Washington/40964/2014',date(2015,3,1),'Low-Moderate',3.8,4.1,2.6,5,3,5.7)
df = add_irat_entry(df,'H5N6','A/Sichuan/06681/2021',date(2021,10,1),'Moderate',5.3,6.3,3.88,6.45,5.04,7.47)
df = add_irat_entry(df,'H5N6','A/Yunnan/14564/2015',date(2016,4,1),'Moderate',5.0,6.6,4.07,6.18,5.57,7.93)
df = add_irat_entry(df,'H5N8','A/Astrakhan/3212/2020',date(2021,3,1),'Moderate',4.6,5.2,3.64,5.82,4.07,6.37)
df = add_irat_entry(df,'H5N8','A/gyrfalcon/Washington/41088/2014',date(2015,3,1),'Low-Moderate',4.2,4.6,2.9,5.3,3.4,5.9)
df = add_irat_entry(df,'H7N7','A/Netherlands/219/2003',date(2012,6,1),'Moderate',4.6,5.8,3.22,4.39,5.99,7.22)
df = add_irat_entry(df,'H7N8','A/turkey/Indiana/1573-2/2016',date(2017,7,1),'Low',3.4,3.9,2.4,4.26,2.91,4.63)
df = add_irat_entry(df,'H7N9','A/chicken/Tennessee/17-007431-3/2017',date(2017,10,1),'Low',3.1,3.5,2.2,3.94,2.53,4.32)
df = add_irat_entry(df,'H7N9','A/chicken/Tennessee/17-007147-2/2017',date(2017,10,1),'Low',2.8,3.5,2.01,3.71,2.67,4.39)
df = add_irat_entry(df,'H7N9','A/Hong Kong/125/2017',date(2017,5,1),'Moderate-High',6.5,7.5,5.65,7.51,6.74,8.5)
df = add_irat_entry(df,'H7N9','A/Shanghai/02/2013',date(2016,4,1),'Moderate-High',6.4,7.2,5.52,7.43,6.41,8.32)
df = add_irat_entry(df,'H9N2','A/Bangladesh/0994/2011',date(2014,2,1),'Moderate',5.6,5.4,4.49,6.74,4.41,6.65)
df = add_irat_entry(df,'H9N2','A/Anhui-Lujiang/39/2018',date(2019,7,1),'Moderate',6.2,5.9,4.76,7.57,4.3,7.3)
df = add_irat_entry(df,'H10N8','A/Jiangxi-Donghu/346/2013',date(2014,2,1),'Moderate',4.3,6.0,3.37,5.96,5.21,7.24)
df = add_irat_entry(df,'H5N1','A/mink/Spain/3691-8_22VIR10586-10/2022',date(2023,4,1),'Moderate',5.1,6.2,3.96,6.27,4.95,7.43)
df.sort_values(by=['Date of Risk Assessment'], inplace=True, ascending=False)
df.to_csv('data/irat.csv', index=False)
df.reset_index(drop=True)

Unnamed: 0,Influenza Virus,Virus Type,Date of Risk Assessment,Risk Score Category,Emergence Score,Impact Score,Mean Low Acceptable Emergence,Mean High Acceptable Emergence,Mean Low Acceptable Impact,Mean High Acceptable Impact,HA Accession,NA Accession,HA Sequence,NA Sequence,HA Length,NA Length
0,A/mink/Spain/3691-8_22VIR10586-10/2022,H5N1,2023-04-01,Moderate,5.1,6.2,3.96,6.27,4.95,7.43,EPI2220597,EPI2220596,MENIVLLLAIVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQ...,MNPNQRIITTGSICMVIGIVSLMLQIGNIISIWVSHSIQTGNQYQP...,567,469
1,A/American wigeon/South Carolina/AH0195145/2021,H5N1,2022-03-01,Moderate,4.4,5.1,3.28,5.51,3.84,6.19,EPI1985910,EPI1985912,MENIVLLLAIVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQ...,MNPNQKITTIGSICMVIGIVSLMLQIGNIISIWVSHSIQTGNQYQP...,567,469
2,A/Sichuan/06681/2021,H5N6,2021-10-01,Moderate,5.3,6.3,3.88,6.45,5.04,7.47,EPI1883261,EPI1883263,MENIVLLLAIVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQ...,MNPNQKITCISATGVTLSIVSLLIGITNLGLNIGLHYKVSDSTTIN...,567,459
3,A/Astrakhan/3212/2020,H5N8,2021-03-01,Moderate,4.6,5.2,3.64,5.82,4.07,6.37,EPI1846961,EPI1846963,MENIVLLLAIVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQ...,MNPNQKIATIGSISLGLVVFNVLLHALNIILMVLALGKSENNGICK...,567,470
4,A/swine/Shandong/1207/2016,H1N1,2020-07-01,Moderate,7.5,6.9,6.33,8.65,5.42,8.09,EPI1751427,EPI1751500,MEARLFVLFCAFTTLKADTICVGYHANNSTDTVDTILEKNVTVTHS...,MNPNQKIITIGSICMTIGIASLILQIGNIISIWISHSIQIENQNQS...,566,469
5,A/Ohio/13/2017,H3N2,2019-07-01,Moderate,6.6,5.8,5.01,7.59,4.09,7.26,EPI1056653,EPI1056652,MKTIIALSHILCLVFAQKLPGNDNNMATLCLGHHAVPNGTIVKTIT...,MNPNQKIITIGSVSLIIATICFLMQIAILVTTITLHFKQHNCDSSP...,566,469
6,A/California/62/2018,H1N2,2019-07-01,Moderate,5.8,5.7,4.22,7.16,3.8,7.09,EPI1311361,EPI1311360,MKVKLMVLLCTFTATYADTICVGYHANNSTDTVDTVLEKNVTVTHS...,MNPNQKIITIGSISLTLAAMCFLMQTAILVTNVTLHFNQCECHYPP...,565,469
7,A/Anhui-Lujiang/39/2018,H9N2,2019-07-01,Moderate,6.2,5.9,4.76,7.57,4.3,7.3,EPI1315830,EPI1315828,METVSLITILLVATASNADKICIGYQSTNSTETVDTLTENNVPVTH...,MNPNQKITAIGSVSLIIAIICLLMQIAILTTTMTLHFGQKECSNPS...,560,466
8,A/chicken/Tennessee/17-007431-3/2017,H7N9,2017-10-01,Low,3.1,3.5,2.2,3.94,2.53,4.32,ARB51617.1,ARB51619.1,MNTQILALIACMLIGAKGDKICLGHHAVANGTKVNTLTERGIEVVN...,MNPNQKILCTSATAIVIGTIAVLIGIANLGLNIGLHLKPNCNCSNS...,560,470
9,A/chicken/Tennessee/17-007147-2/2017,H7N9,2017-10-01,Low,2.8,3.5,2.01,3.71,2.67,4.39,ARB51605.1,ARB51607.1,MNTQILALIACMLIGAKGDKICLGHHAVANGTKVNTLTERGIEVVN...,MNPNQKILCTSATAIVIGTIAVLIGIANLGLNIGLHLKPNCNCSNS...,569,470


## 3) Variant Sequences
Collect variant sequences (animal sequences that have emerged in humans)


In [10]:
def add_variant_entry(df, filepath):
    seq_df = parse_fasta(filepath)
    ha_acc = seq_df[seq_df['segment'] == 'HA']['accession'].values[0]
    ha_seq = seq_df[seq_df['segment'] == 'HA']['sequence'].values[0]
    na_acc = seq_df[seq_df['segment'] == 'NA']['accession'].values[0]
    na_seq = seq_df[seq_df['segment'] == 'NA']['sequence'].values[0]
    entry_df = pd.DataFrame({'name':[seq_df['name'].values[0]],
                             'subtype':[seq_df['subtype'].values[0]],
                             'date':[seq_df['date'].values[0]],
                             'ha_accession':[ha_acc],
                             'ha_sequence':[ha_seq],
                             'na_accession':[na_acc],
                             'na_sequence':[na_seq]})
    # DataFrame.append was removed in pandas 2.0; use pd.concat instead
    return pd.concat([df, entry_df], ignore_index=True)


# Compile all variants
fasta_files = glob.glob(os.path.join(VARIANT_DIR, '*.fasta'))
variant = pd.DataFrame()
for file in fasta_files:
    variant = add_variant_entry(variant, file)
variant = variant.sort_values(by=['date']).reset_index(drop=True)
variant.to_csv('data/variant.csv', index=False)
variant

Unnamed: 0,name,subtype,date,ha_accession,ha_sequence,na_accession,na_sequence
0,A/Minnesota/19/2011,H1N2,2011-11-04,EPI347610,MKVKLLTLFCTFTATYADTICIGYHANNSTDTVDTVLEKNVTVTHS...,EPI347609,MNPNQKIITIGSVSLIIATICFLMQIAILVTTVTLHFKQHDYNSPP...
1,A/Ohio/09/2015,H1N1,2015-04-21,EPI590132,MKAILIVLLYTFTTANADKICIGYHANNSTDTVDTVLEKNVTVTHS...,EPI590131,MNANQRIITIGTVCLIIGIISLLLQIGNMVSLWISHSIQTGGKNHT...
2,A/Hunan/42443/2015,H1N1,2015-07-02,EPI691395,MEARLFVLFCAFTTLKADTICVGYHANNSTDTVDTILEKNVTVTHS...,EPI691397,MNPNQKIITIGSICMAIGIASLILQIGNIISIWISHSIQTENQNQS...
3,A/Iowa/39/2015,H1N1,2015-08-07,EPI638970,MKAILVVLLYTFTTANADTLCIGYHANNSTDTVDTVLEKNVTVTHS...,EPI638969,MNTNQRIITIGTVCMIVGMISLLLQIGSIVSLWISHSIQTGWENHT...
4,A/Parana/720/2015,H1N2,2015-11-27,EPI768456,MKIKLLILLCTFTATYADTICTGYHANNSTDTVDTVLEKNVTVTHS...,EPI768455,MNPNQKIITIGSISLTIALMCFLMQVVILVTTVTLHFKQYECNPAP...
5,A/Minnesota/45/2016,H1N2,2016-03-29,EPI760602,MKAILLVLLHTFAAANADTICIGYHANNSTDTVDTVLEKNVTVTHS...,EPI760601,MNPNQKIITIGSASLIIATICFLMQVAILVTTVTLHFKQHDCNSSP...
6,A/Wisconsin/71/2016,H1N2,2016-06-09,EPI816890,MKVKLLILLCTFTAAYADTICIGYHANNSTDTVDTVLEKNVTVTHS...,EPI816889,MNPNQKIITIGSVSLIIATICFLMQIAILVTTVTLHFKQHNCDSSP...
7,A/Ohio/24/2017,H1N2,2017-07-25,EPI1056725,MKAILLVLLHTFAAANADTICIGYHANNSTDTVDTVLEKNVTVTHS...,EPI1056724,MNPNQKIITIGSASLIIATICFLMQVAILVTTVTLHFKQHDCNPSP...
8,A/Michigan/383/2018,H1N2,2018-07-31,EPI1271066,MKVKLMVLLCTFTATYADTICVGYHANNSTDTVDTVLEKNVTVTHS...,EPI1271065,MNPNQKIITIGSISLTLAAMCFLMQTAILVTNVTLHFNQCECHYPP...
9,A/Alberta/01/2020_(H1N2)v,H1N2,2020-10-01,EPI1815179,MKAILLVLLHTFAATSADTICVGYHANNSTDTVDTVLEKNVTVTHS...,EPI1815178,MNPNQKIITIGSVSLIIATICFLMQIAILVTTVTLHFKQHDCNSSS...


## 4) Animal Sequences
Collect all animal sequences from **1/1/2020 - 1/1/2024**

In [6]:
animal = parse_fasta(ANIMAL_DIR + 'animal.fasta').drop(columns=['HA', 'NA'])
animal_ha = animal[animal['segment'] == 'HA'].rename(columns={'accession':'ha_accession', 'sequence':'ha_sequence'}).drop(columns=['segment'])
animal_ha = animal_ha[animal_ha['ha_sequence'].str.len() >= HA_TRUNC]
animal_na = animal[animal['segment'] == 'NA'].rename(columns={'accession':'na_accession', 'sequence':'na_sequence'}).drop(columns=['segment'])
animal_na = animal_na[animal_na['na_sequence'].str.len() >= NA_TRUNC]
animal = pd.merge(animal_ha, animal_na, on=['name', 'subtype', 'date'], how='inner')
animal = animal.drop_duplicates(subset=['ha_sequence']).sort_values(by='date').reset_index(drop=True)
animal.to_csv('data/animal.csv', index=False)
animal

Unnamed: 0,name,subtype,date,ha_accession,ha_sequence,na_accession,na_sequence
0,A/chicken/Pakistan/062BYP/2020,H9N2,2020-01-01,EPI1905094,MEAKSLMITLLVVTTSSADKICIGHQSTNSTETVDTLTESNIPVTQ...,EPI1905097,MNPNQKIIALGSASLTIATVCLLIQIAILATTMTLHFNRNEYTNSS...
1,A/chicken/Pakistan/041CP/2020,H9N2,2020-01-01,EPI1905093,MEAISLMIILLVVTTSNADKICIGHQSTNSTETVDTLTESNIPVTQ...,EPI1905099,MNPNQKIIALGSASLTIATVCLLIQIAILATTMTLHFNRNEYTNSS...
2,A/swine/Denmark/UCPH-S39742/2020,H3N2,2020-01-01,EPI2904581,MKTIIALSCILCLVFAQKIPGNDNSTATLCLGHHAVPNGTIVKTIT...,EPI2904579,MNPNQKIITIGSVSLTISTICFFMQIAILITTVTLHFKQYEFNSPP...
3,A/laying_hen/Poland/002/2020,H5N8,2020-01-01,EPI1800003,MENIVLLLAIVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQ...,EPI1800005,MNPNQKIVTVGSISLGLVVFNVLLHAVSIILMVLALGKSENNGICK...
4,A/swine/Illinois/A02479030/2020(H1N2),H1N2,2020-01-02,EPI1785473,MKAILLVLLHTFAAANADTICIGYHANNSTDTVDTVLEKNVTVTHS...,EPI1785474,MNPNQKIITIGSVSLTIATMCFLMQIAILVTTVTLHFKQYECNYPP...
...,...,...,...,...,...,...,...
6349,A/barnacle_goose/Denmark/24-00026-1.02/2023,H5N1,2023-12-28,EPI2957711,MENIVLLLAIVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQ...,EPI2957710,MNPNQRIITTGSICMIIGIVSLMLQIGNIISIWVSHSIQTGNQYQP...
6350,A/chicken/Austria/24000087-002/2024,H5N1,2023-12-29,EPI2923389,MENIVLLLATVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQ...,EPI2923392,MNPNQRIITTGSICMVIGIVSLMLQIGNIISIWVSHSIQTGNQYQP...
6351,A/swine/Iowa/ISU-A02862194/2023,H3N2,2023-12-29,EPI2971632,MKTIIAFSCTLCLILARKIPGGDNNMATLCLGHHAVPNGTLVKTIT...,EPI2971631,MNPNQKIITIGSVSLIIATICFLMQIAILVTTVTLHFKQHDCNSSS...
6352,A/chicken/Czech_Republic/11-org/2024,H5N1,2023-12-31,EPI2882288,MENIVLLLAIVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQ...,EPI2882290,MNPNQRIITTGSICMIIGIVSLMLQIGNIISIWVSHSIQTGNQYQP...
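
The pairing step in the cell above can be illustrated on a toy table: HA and NA rows are split, their accession and sequence columns are renamed, and the two halves are joined with an inner merge so that only isolates with both segments survive. The helper `one_segment` and all rows below are made up for illustration.

```python
import pandas as pd

# Toy long-format table: one row per (isolate, segment); values are made up
toy = pd.DataFrame({
    'name': ['A/x/1/2020', 'A/x/1/2020', 'A/y/2/2021'],
    'subtype': ['H5N1', 'H5N1', 'H3N2'],
    'date': ['2020-01-01', '2020-01-01', '2021-05-05'],
    'segment': ['HA', 'NA', 'HA'],
    'accession': ['ACC1', 'ACC2', 'ACC3'],
    'sequence': ['MENIV', 'MNPNQ', 'MKTII'],
})

def one_segment(df, segment):
    """Select one segment and prefix its accession/sequence columns."""
    prefix = segment.lower()
    out = df[df['segment'] == segment].drop(columns=['segment'])
    return out.rename(columns={'accession': prefix + '_accession',
                               'sequence': prefix + '_sequence'})

# Inner merge keeps only isolates that have both an HA and an NA row
paired = pd.merge(one_segment(toy, 'HA'), one_segment(toy, 'NA'),
                  on=['name', 'subtype', 'date'], how='inner')
print(paired)
```

In this toy case `A/y/2/2021` is dropped because it has no NA row, which mirrors how isolates with a missing or too-short segment fall out of `animal.csv`.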
