In [1]:
%cd ..

/Users/nattkorat/Workspace/outbreak_event_extraction


In [2]:
import json
from utils import prompt, response_parser, evaluation, semantic_similarity
from llms import gemini2_5

In [12]:
with open('speedpp_templates.json', 'r') as f:
    templates = json.load(f)

templates[0]

{'text': 'My COVID19 antibodies test came back positive . Crazy . Ive had no symptoms . Please get tested if possible . The more data we have on this the better .',
 'events': {'Infect': [{'trigger': 'positive',
    'arguments': {'disease': 'COVID19', 'infected': 'My'}}]}}

In [4]:
with open('speedpp_test.json', 'r') as f:
    test = json.load(f)
test[0]

{'text': 'I know it gets harder as the weeks go on , but PLEASE #( Stay At Home ) this weekend ( except for essential purposes ie food / medicine and exercise ). You are making a difference and saving lives . Lets stick with it for a bit longer to get this virus properly under control . Thank you',
 'events': {'Control': [{'trigger': 'saving',
    'arguments': {'disease': 'virus'}},
   {'trigger': 'control',
    'arguments': {'disease': 'this virus',
     'effectiveness': 'saving lives',
     'means': 'Stay At Home',
     'subject': 'You'}}],
  'Prevent': [{'trigger': 'Stay',
    'arguments': {'disease': 'virus', 'means': 'Stay'}}]}}

In [5]:
prompt_test = prompt.event_extraction_prompt_few_shot(
    article=test[1]['text'],
    samples=templates,
)
print(prompt_test)

You are an AI assistant specializing in public health intelligence. Carefully read the news article provided below. Your mission is to identify all events related to the epidemic and extract key information for each event.
Rule:
- Identify a trigger word which of the following event types are mentioned: Infect, Spread, Symptom, Prevent, Control, Cure, and Death.
- Trigger word is one word that MOST LIKELY manifests the event's occurrence.
- For each event you find, extract the specific details corresponding to its "Argument Roles".
- Requirement: If a specific detail (an argument role) is not mentioned in the text for a given event, you MUST not fill in.
- If the article mentions multiple distinct events of the same type (e.g., two different control measures), list each one as a separate entry.
- Present the final output as a single JSON object, with event types as keys.
- Keep the orginal langauage

EVENT DEFINITIONS & ARGUMENT ROLES:
1. Infect: An event describing one or more individ

In [6]:
llm_response = gemini2_5.chat_with_gemini2_5(prompt_test)

llm_response = response_parser.json_string_response_parser(llm_response)

llm_response

{'Infect': [{'trigger': 'infected',
   'arguments': {'disease': 'COVID - 19',
    'infected': 'the Dem governors',
    'place': 'nursing homes'}}],
 'Death': [{'trigger': 'death',
   'arguments': {'trend': 'drive up', 'disease': 'COVID - 19'}}],
 'Control': [{'trigger': 'shut',
   'arguments': {'means': 'shut everything down'}}]}

In [7]:
test[1]['events']

{'Death': [{'trigger': 'death',
   'arguments': {'disease': 'COVID - 19',
    'place': 'nursing homes',
    'trend': 'drive up'}}],
 'Infect': [{'trigger': 'infected',
   'arguments': {'disease': 'COVID - 19',
    'infected': 'infected',
    'place': 'nursing homes'}}]}

In [8]:
event_classification = evaluation.evaluate_event_types(test[1]['events'], llm_response)
event_classification

{'precision': 0.6666666644444444,
 'recall': 0.999999995,
 'f1': 0.799999992,
 'tp': 2,
 'fp': 1,
 'fn': 0}
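
The slightly-off values above (for example a recall of 0.999999995 instead of 1.0) are consistent with precision, recall and F1 being computed with a small constant added to each denominator to avoid division by zero. The sketch below reproduces the reported numbers under that assumption. The helper name `prf` and the constant `EPS = 1e-8` are hypothetical; the actual code in `evaluation` may differ.

```python
EPS = 1e-8  # assumed smoothing constant, inferred from the printed values


def prf(tp, fp, fn, eps=EPS):
    """Precision / recall / F1 from confusion counts, with eps-smoothed denominators."""
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return {"precision": precision, "recall": recall, "f1": f1,
            "tp": tp, "fp": fp, "fn": fn}


# Event-type counts from the cell above: 2 correct types, 1 spurious ('Control').
scores = prf(tp=2, fp=1, fn=0)
print(scores)
```

With this formulation a perfect score prints as roughly 0.99999999 rather than 1.0, and a category with no true positives evaluates to 0.0 instead of raising `ZeroDivisionError`.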

In [11]:
trigger_identification = evaluation.evaluate_event_triggers(
    test[1]['events'],
    llm_response,
    semantic_fn=semantic_similarity.calculate_bleu,
    threshold=0.5
)
trigger_identification

{'Infect': {'precision': 0.9999999900000002,
  'recall': 0.9999999900000002,
  'f1': 0.9999999850000002,
  'tp': 1,
  'fp': 0,
  'fn': 0},
 'Death': {'precision': 0.9999999900000002,
  'recall': 0.9999999900000002,
  'f1': 0.9999999850000002,
  'tp': 1,
  'fp': 0,
  'fn': 0},
 'Control': {'precision': 0.0,
  'recall': 0.0,
  'f1': 0.0,
  'tp': 0,
  'fp': 1,
  'fn': 0}}

In [13]:
arguments_identification = evaluation.evaluate_event_arguments(
    test[1]['events'],
    llm_response,
    semantic_fn=semantic_similarity.calculate_bleu,
    threshold=0.5
)
arguments_identification

{'Death': [{'disease': {'precision': 0.9999999900000002,
    'recall': 0.9999999900000002,
    'f1': 0.9999999850000002,
    'tp': 1,
    'fp': 0,
    'fn': 0},
   'place': {'precision': 0.0,
    'recall': 0.0,
    'f1': 0.0,
    'tp': 0,
    'fp': 0,
    'fn': 1},
   'trend': {'precision': 0.9999999900000002,
    'recall': 0.9999999900000002,
    'f1': 0.9999999850000002,
    'tp': 1,
    'fp': 0,
    'fn': 0}}],
 'Infect': [{'infected': {'precision': 0.0,
    'recall': 0.0,
    'f1': 0.0,
    'tp': 0,
    'fp': 1,
    'fn': 1},
   'disease': {'precision': 0.9999999900000002,
    'recall': 0.9999999900000002,
    'f1': 0.9999999850000002,
    'tp': 1,
    'fp': 0,
    'fn': 0},
   'place': {'precision': 0.9999999900000002,
    'recall': 0.9999999900000002,
    'f1': 0.9999999850000002,
    'tp': 1,
    'fp': 0,
    'fn': 0}}]}

## Notice

This method penalizes argument extraction harshly when the trigger word is misidentified. This needs to be taken into account when interpreting the argument scores.
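
A minimal sketch of why this happens, assuming arguments are keyed by event type and trigger as in TextEE-style scoring. It uses exact matching for brevity, and the helper names and the wrong trigger `'homes'` are hypothetical, made up for illustration rather than taken from `evaluation` or from a model response.

```python
def argument_tuples(events):
    """Flatten events into (event_type, trigger, role, value) tuples."""
    return {
        (etype, mention["trigger"], role, value)
        for etype, mentions in events.items()
        for mention in mentions
        for role, value in mention.get("arguments", {}).items()
    }


def argument_counts(gold, pred):
    """Exact-match confusion counts over the flattened argument tuples."""
    g, p = argument_tuples(gold), argument_tuples(pred)
    return {"tp": len(g & p), "fp": len(p - g), "fn": len(g - p)}


args = {"disease": "COVID - 19", "place": "nursing homes"}
gold = {"Infect": [{"trigger": "infected", "arguments": args}]}
pred_same_trigger = {"Infect": [{"trigger": "infected", "arguments": args}]}
pred_wrong_trigger = {"Infect": [{"trigger": "homes", "arguments": args}]}  # hypothetical

same = argument_counts(gold, pred_same_trigger)
wrong = argument_counts(gold, pred_wrong_trigger)
print(same)
print(wrong)
```

In the second case both arguments are extracted correctly, yet each one is counted once as a false positive and once as a false negative because the trigger in the key differs.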

This evaluation is adapted from TextEE (mentioned in SPEED++), but uses semantic matching instead of exact matching to build the confusion matrix.
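
The idea of thresholded semantic matching can be sketched as follows. This is an illustration under stated assumptions, not the code in `evaluation`: the greedy one-to-one matching strategy and the function names are hypothetical, and `difflib.SequenceMatcher` stands in for `semantic_similarity.calculate_bleu` so that the example runs with the standard library only.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity in [0, 1]; the notebook passes a BLEU-based function instead."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def confusion_counts(gold, pred, semantic_fn=similarity, threshold=0.5):
    """Greedily match each prediction to its most similar unmatched gold string."""
    unmatched_gold = list(gold)
    tp = fp = 0
    for p in pred:
        best = max(unmatched_gold, key=lambda g: semantic_fn(g, p), default=None)
        if best is not None and semantic_fn(best, p) >= threshold:
            unmatched_gold.remove(best)  # each gold item can be matched only once
            tp += 1
        else:
            fp += 1
    # Gold items left without a match are false negatives.
    return {"tp": tp, "fp": fp, "fn": len(unmatched_gold)}


result = confusion_counts(
    gold=["COVID - 19", "nursing homes"],
    pred=["COVID-19", "the Dem governors"],
)
print(result)
```

Here `"COVID-19"` clears the threshold against `"COVID - 19"` even though the strings differ, while `"the Dem governors"` matches nothing and leaves `"nursing homes"` unmatched. Raising `threshold` moves the behaviour closer to exact matching.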