# LLM-human collaborative annotation 
This notebook illustrates the integration of Large Language Models (LLMs) into the MEGAnno framework. In this framework, LLMs serve as annotators, and human verification is used to validate the annotation results. Initially, we demonstrate this integration with OpenAI's GPT models and completion APIs.

Users can register agents by specifying model configurations and prompt configurations, select a subset, and run the job. MEGAnno takes care of the following tasks:

* Interfacing with OpenAI and handling errors.
* Running LLM annotation jobs and persisting the results.
* Providing flexible search capabilities to support human verification and downstream applications.


# 1. Setup
## 1.1 Authentication and MEGAnno project connection

In [None]:
from meganno_client import Authentication
# replace the placeholder string with the token distributed by Megagon
auth = Authentication(project="eacl_demo", token="<megagon_distributed_token>")

In [None]:
from meganno_client import Service

# connect to the demo project; you can also pass your own auth/token
demo = Service(project="eacl_demo", auth=auth)

## 1.2 Review labeling schema

In [None]:
### review schema
import pprint
schema = demo.get_schemas().value(active=True)
pprint.pprint(schema)

# 2. LLM Annotation
## 2.1 Configure the model and prompt template

In [None]:
model_config = {
    "model": "gpt-3.5-turbo",
    "temperature": 0,
    "n": 1,
    "logprobs": True,
    "messages": [{"role": "system", "content": "You are a helpful assistant."}],
}

In [None]:
label_name = "sentiment"

In [None]:
from meganno_client.prompt import PromptTemplate

prompt_template = PromptTemplate(
    label_schema=schema[0]["schemas"]["label_schema"], label_names=[label_name]
)
prompt_template.preview(
    records=["[sample input]", "Megagon Labs is located in Mountain View."]
)

In [None]:
prompt_template.get_template()

## 2.2 Register an agent with service

In [None]:
from meganno_client.controller import Controller

controller = Controller(demo, auth)

In [None]:
agent_uuid = controller.create_agent(
    model_config, prompt_template, provider_api="openai:chat"
)

## 2.3 Run an LLM annotation job on subsets
**Make sure `OPENAI_API_KEY` is set as an environment variable.**
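
A quick pre-flight check can catch a missing key before a job is submitted. The sketch below uses only the standard library; the helper name `openai_key_is_set` is hypothetical and not part of MEGAnno.

```python
import os


def openai_key_is_set(env=None):
    """Return True if OPENAI_API_KEY is present and non-empty.

    `env` defaults to os.environ; pass a plain dict to check other mappings.
    """
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not openai_key_is_set():
    print("OPENAI_API_KEY is not set; export it before running the job.")
```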

In [None]:
# select a subset to run the job on
subset = demo.search(keyword="delay", limit=4)
subset.show({"view": "table"})

In [None]:
job_uuid = controller.run_job(agent_uuid, subset, label_name, label_meta_names=["conf"])
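
The job stores a `conf` value as label metadata, and the model configuration requests `logprobs`. One common way to turn a token log probability into a confidence score in [0, 1] is to exponentiate it. Whether MEGAnno derives `conf` exactly this way is an assumption; the function name and sample values below are illustrative only.

```python
import math


def logprob_to_confidence(logprob):
    """Convert a natural-log probability (<= 0) into a probability in [0, 1]."""
    if logprob > 0:
        raise ValueError("log probabilities must be <= 0")
    return math.exp(logprob)


# Illustrative values, not real model output.
for lp in (0.0, -0.01, -0.7):
    print(f"logprob={lp} -> conf={logprob_to_confidence(lp):.3f}")
```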

## 2.4 List agents & jobs

In [None]:
# list agents
controller.list_my_agents()
# job_list = controller.list_jobs('agent_uuid', [agent_uuid])

In [None]:
# filter over agent properties and get jobs
ret = controller.list_agents(provider_filter="openai", show_job_list=True)
job_list = [val for sublist in ret for val in sublist["job_list"]]
job_list
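
The comprehension in the cell above flattens one `job_list` per agent into a single list. Here is a self-contained illustration with made-up data, assuming each entry returned by `list_agents` is a dict with a `job_list` key (as that cell implies). The `uuid` field and all values are hypothetical.

```python
# Hypothetical response shape: one dict per agent, each with its own job list.
sample_agents = [
    {"uuid": "agent-a", "job_list": ["job-1", "job-2"]},
    {"uuid": "agent-b", "job_list": []},
    {"uuid": "agent-c", "job_list": ["job-3"]},
]

# Nested comprehension: outer loop over agents, inner loop over each agent's jobs.
sample_jobs = [job for agent in sample_agents for job in agent["job_list"]]
print(sample_jobs)  # ['job-1', 'job-2', 'job-3']
```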

# 3. Verification

## 3.1 Verify annotations from a job

In [None]:
args = {
    "job_id": job_uuid,
    "label_metadata_condition": {
        "label_name": "sentiment",
        "name": "conf",
        "operator": "<",
        "value": 0.99,
    },
    "verification_condition": {
        "label_name": label_name,
        "search_mode": "ALL",  # "ALL"|"UNVERIFIED"|"VERIFIED"
    },
}
verf_subset = demo.search_by_job(**args)
verf_subset.show({"mode": "verifying", "label_meta_names": ["conf"]})
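
To make the `label_metadata_condition` concrete, the sketch below applies the same kind of condition to a few in-memory rows. It only illustrates the intended semantics (keep records whose `conf` is below 0.99). It is not how the MEGAnno service evaluates the search, and the row layout, operator set, and helper name are assumptions.

```python
import operator

# Map comparison strings to Python functions (an assumed, partial set).
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def matches(row, condition):
    """Return True if the row's metadata value satisfies the condition."""
    compare = OPERATORS[condition["operator"]]
    value = row["metadata"].get(condition["name"])
    return value is not None and compare(value, condition["value"])


condition = {"label_name": "sentiment", "name": "conf", "operator": "<", "value": 0.99}
rows = [
    {"record": "Flight delayed again.", "metadata": {"conf": 0.999}},
    {"record": "Delay was short.", "metadata": {"conf": 0.62}},
]
low_confidence = [r["record"] for r in rows if matches(r, condition)]
print(low_confidence)  # ['Delay was short.']
```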

## 3.2 Retrieve verification annotations
The current version supports only programmatic retrieval of previous verifications.

In [None]:
# further filter by type of verification (CONFIRMS|CORRECTS)
# CONFIRMS: the verification confirms the original label
# CORRECTS: the verification differs from the original label
verf_subset.get_verification_annotations(
    label_name="sentiment",
    label_level="record",
    annotator=job_uuid,
    verified_status="CORRECTS",  # CONFIRMS|CORRECTS
)
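
The two statuses amount to a comparison between the original label and the verified one. A minimal sketch of that rule follows, with a hypothetical helper name and made-up labels; the service determines the status itself.

```python
def verification_status(original_label, verified_label):
    """Classify a verification as CONFIRMS (same label) or CORRECTS (different label)."""
    return "CONFIRMS" if original_label == verified_label else "CORRECTS"


print(verification_status("neg", "neg"))
print(verification_status("neg", "pos"))
```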