# Abstraction Review

This notebook walks through a simple example of using LatteReview to abstract (extract) specific concepts from input data.

## Setting up the notebook

High-level configs

In [1]:
%reload_ext autoreload
%autoreload 2

from dotenv import load_dotenv

# Load environment variables from .env file. Adjust the path to the .env file as needed.
load_dotenv(dotenv_path='../.env')

# Enable asyncio in Jupyter
import asyncio
import nest_asyncio

nest_asyncio.apply()

# Add the package to the path (required if you are running this notebook from the examples folder)
import sys
sys.path.append('../../')


Import required packages

In [3]:
import json
import pandas as pd
from pydantic import BaseModel
from tqdm.auto import tqdm

from lattereview.providers import OpenAIProvider
from lattereview.providers import LiteLLMProvider
from lattereview.agents import AbstractionReviewer
from lattereview.workflows import ReviewWorkflow

## Data

To demonstrate how the `AbstractionReviewer` extracts specific information from input items, we will create a dummy dataset of imaginary stories. The `AbstractionReviewer` will be tasked with extracting the location of each story and identifying the characters introduced in it. To generate the stories, we prompt the GPT-4o model to return each story together with its ground-truth location and characters. Note that we use the base functionality of a LatteReview provider together with a Pydantic model (`BuildStoryOutput`) to define the structure of the output we expect from GPT-4o.

In [4]:
class BuildStoryOutput(BaseModel):
    story: str
    location: str
    characters: list[str]

async def build_story():
    prompt = """
    Write a one-paragraph story with whatever realistic or imaginary theme you like,  
    then create a list of all characters you named in your story.
    Return your story, the main location that your story happens in, and a Python list of your characters as your output.
    """
    provider = OpenAIProvider(model="gpt-4o", response_format_class=BuildStoryOutput)
    return await provider.get_json_response(prompt, temperature=0.9)

def run_build_story():
    # Index 0 of the returned value holds the JSON string, which is parsed with json.loads below.
    return asyncio.run(build_story())[0]

data = {
    "story": [],
    "location": [],
    "characters": [],
}
for _ in tqdm(range(5)):
    out = json.loads(run_build_story())
    data["characters"].append(out["characters"])
    data["location"].append(out["location"])
    data["story"].append(out["story"])


data = pd.DataFrame(data)
data.to_csv("data.csv", index=False)
data


| | story | location | characters |
|---|---|---|---|
| 0 | In the quaint seaside village of Brightwater, ... | Brightwater | [Emily Langley, Mrs. Henderson, Eliza Marlowe] |
| 1 | In the heart of the ancient forest of Eldergle... | ancient forest of Elderglen | [Leona, Alaric, Finn, Malgor] |
| 2 | In the heart of the bustling city of Aurelian,... | Aurelian | [Ava, Milo] |
| 3 | In the heart of the ancient forest of Eldergro... | Eldergrove Forest | [Astra, Eldwin, Damon] |
| 4 | In the bustling city of Verenthia, amidst towe... | The hidden garden behind the old library in Ve... | [Elara, Cedric, Professor Hoots] |
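The loop above parses each response with `json.loads` and reads three keys from it. As a minimal sketch of that parsing step, using only the standard library and no API call, the snippet below checks that a response string carries the expected fields. The `StoryRecord` dataclass, the `parse_story` helper, and the sample string are all hypothetical and only stand in for the Pydantic model and the real GPT-4o responses.

```python
import json
from dataclasses import dataclass


@dataclass
class StoryRecord:
    # Hypothetical stand-in mirroring the fields of BuildStoryOutput.
    story: str
    location: str
    characters: list[str]


def parse_story(raw: str) -> StoryRecord:
    """Parse a JSON string and check that it has the three expected fields."""
    payload = json.loads(raw)
    missing = {"story", "location", "characters"} - set(payload)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if not isinstance(payload["characters"], list):
        raise TypeError("characters must be a list")
    return StoryRecord(
        story=payload["story"],
        location=payload["location"],
        characters=list(payload["characters"]),
    )


# Hypothetical response string, invented for illustration only.
sample = (
    '{"story": "Ava met Milo in Aurelian.", '
    '"location": "Aurelian", "characters": ["Ava", "Milo"]}'
)
record = parse_story(sample)
print(record.location, record.characters)
```

A check like this fails early when a response is missing a field, which is roughly the guarantee the Pydantic response format is meant to provide on the provider side.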


## Abstraction with a single agent

Here, we will use the `AbstractionReviewer` without defining a workflow. To do so, we specify the keys we expect the `AbstractionReviewer` to output in the `abstraction_keys` argument, which in this case are `location` and `characters`, and we describe the meaning of each key in the `key_descriptions` argument.

In [11]:
Albert = AbstractionReviewer(
    provider=LiteLLMProvider(model="gpt-4o-mini"),
    name="Albert",
    max_concurrent_requests=1,
    backstory="an expert reviewer!",
    input_description="stories",
    model_args={"max_tokens": 200, "temperature": 0.1},
    # Keys to extract, mapped to their expected types
    abstraction_keys={
        "location": str,
        "characters": list[str],
    },
    # Plain-language instructions for each key
    key_descriptions={
        "location": "The main location that the story happens in.",
        "characters": "The name of the characters mentioned in the story.",
    },
)


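# Dummy input (the stories are lowercased here, which likely explains the lowercase values in some outputs below)
input_list = data.story.str.lower().tolist()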
print("====== Inputs ======\n\n", '\n'.join(input_list))

# Dummy review
results, total_cost = asyncio.run(Albert.review_items(input_list))
print("\n====== Outputs ======")
for result in results:
    print(result)

# Dummy costs
print("\n====== Costs ======\n")
for i, item in enumerate(Albert.memory):
    print(f"Cost for item {i}: {item['cost']}")

print(f"\nTotal cost: {total_cost}")


 in the quaint seaside village of brightwater, emily langley discovered a mysterious key buried beneath the roots of an old oak tree. as she examined it under the golden afternoon sun, mrs. henderson, the village's wise and curious librarian, happened to pass by and noticed the glint of the metal. curious herself, mrs. henderson invited emily to the library's archives, where they began unraveling tales of ancient shipwrecks and lost treasures tied to the enigmatic captain, eliza marlowe, who once docked in the village long ago. the key, it seemed, was more than just a trinket; it was a link to the past adventures of captain marlowe, and the two women soon found themselves caught up in a quest that would take them beyond the peaceful shores of brightwater.
in the heart of the ancient forest of elderglen, a young alchemist named leona was deep in her studies, surrounded by scrolls detailing the secrets of the forest's mystical flora. her mentor, the wise and eccentric mage alaric, had t

Reviewing 5 items - 2025-01-04 18:21:22: 100%|██████████| 5/5 [00:05<00:00,  1.14s/it]


{'location': 'brightwater', 'characters': ['Emily Langley', 'Mrs. Henderson', 'Captain Eliza Marlowe']}
{'location': 'the ancient forest of elderglen', 'characters': ['Leona', 'Alaric', 'Finn', 'Malgor']}
{'location': 'the bustling city of Aurelian', 'characters': ['Ava', 'Milo']}
{'location': 'the ancient forest of eldergrove', 'characters': ['astra', 'eldwin', 'damon']}
{'location': 'the bustling city of verenthia', 'characters': ['elara', 'cedric', 'professor hoots']}


Cost for item 0: 7.274999999999999e-05
Cost for item 1: 7.635e-05
Cost for item 2: 7.335e-05
Cost for item 3: 7.214999999999999e-05
Cost for item 4: 6.675e-05

Total cost: 6.675e-05
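In the output above, the reported total (6.675e-05) equals the cost of the last item rather than the sum of the five per-item costs. The sketch below adds up the printed per-item values to make the difference explicit. The numbers are copied from the output, and no claim is made here about how the library computes `total_cost` internally.

```python
import math

# Per-item costs copied from the printed output above.
item_costs = [
    7.274999999999999e-05,
    7.635e-05,
    7.335e-05,
    7.214999999999999e-05,
    6.675e-05,
]
reported_total = 6.675e-05  # value printed as "Total cost"

# fsum reduces floating-point rounding error when adding small values.
summed = math.fsum(item_costs)
print(f"Sum of per-item costs: {summed:.6e}")
print(f"Matches reported total: {math.isclose(summed, reported_total)}")
```

In the notebook itself, the same cross-check could be written as `sum(item["cost"] for item in Albert.memory)`, based on the `cost` field used in the loop above.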





## Abstraction with a workflow

The same functionality can also be achieved by defining a workflow. Here, we define a workflow with a single round and a single agent to show that `AbstractionReviewer` agents are used in workflows the same way as `ScoringReviewer` agents. In more complex reviews, an `AbstractionReviewer` could be combined with other `AbstractionReviewer` or `ScoringReviewer` agents to accomplish more sophisticated review goals.

In [6]:
workflow = ReviewWorkflow(
    workflow_schema=[
        {
            "round": 'A',
            "reviewers": [Albert],
            "text_inputs": ["story"]
        }
    ]
)

# Reload the data if needed.
updated_data = asyncio.run(workflow(data))

print("\n====== Costs ======\n")
print("Total cost: ", workflow.get_total_cost())
print("Detailed costs: ", workflow.reviewer_costs)

updated_data



Processing 5 eligible rows


['round: A', 'reviewer_name: Albert'] -                     2025-01-04 18:19:22: 100%|██████████| 5/5 [00:06<00:00,  1.38s/it]

The following columns are present in the dataframe at the end of Albert's reivew in round A: ['story', 'location', 'characters', 'round-A_Albert_output', 'round-A_Albert_location', 'round-A_Albert_characters']


Total cost:  6.675e-05
Detailed costs:  {('A', 'Albert'): 6.675e-05}





| | story | location | characters | round-A_Albert_output | round-A_Albert_location | round-A_Albert_characters |
|---|---|---|---|---|---|---|
| 0 | In the quaint seaside village of Brightwater, ... | Brightwater | [Emily Langley, Mrs. Henderson, Eliza Marlowe] | {'location': 'Brightwater', 'characters': ['Em... | Brightwater | [Emily Langley, Mrs. Henderson, Eliza Marlowe] |
| 1 | In the heart of the ancient forest of Eldergle... | ancient forest of Elderglen | [Leona, Alaric, Finn, Malgor] | {'location': 'the ancient forest of Elderglen'... | the ancient forest of Elderglen | [Leona, Alaric, Finn, Malgor] |
| 2 | In the heart of the bustling city of Aurelian,... | Aurelian | [Ava, Milo] | {'location': 'the bustling city of Aurelian', ... | the bustling city of Aurelian | [Ava, Milo] |
| 3 | In the heart of the ancient forest of Eldergro... | Eldergrove Forest | [Astra, Eldwin, Damon] | {'location': 'the ancient forest of Eldergrove... | the ancient forest of Eldergrove | [Astra, Eldwin, Damon] |
| 4 | In the bustling city of Verenthia, amidst towe... | The hidden garden behind the old library in Ve... | [Elara, Cedric, Professor Hoots] | {'location': 'Verenthia', 'characters': ['Elar... | Verenthia | [Elara, Cedric, Professor Hoots] |
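Because the dataframe holds both the ground-truth columns and Albert's extracted columns, the two can be compared directly. The following is a minimal sketch of one way to do that, assuming a lenient containment match for locations and a Jaccard overlap for character lists. The helper names are hypothetical, and the values are copied from the tables and outputs above (row 4 is left out because its ground-truth location is truncated in the display).

```python
def normalize(text: str) -> str:
    """Lowercase, trim, and drop a leading 'the ' so near-identical names compare equal."""
    cleaned = text.strip().lower()
    return cleaned[4:] if cleaned.startswith("the ") else cleaned


def location_match(truth: str, predicted: str) -> bool:
    """True when either normalized location contains the other."""
    t, p = normalize(truth), normalize(predicted)
    return t in p or p in t


def character_overlap(truth: list[str], predicted: list[str]) -> float:
    """Jaccard overlap between two lists of names, ignoring case."""
    t = {name.strip().lower() for name in truth}
    p = {name.strip().lower() for name in predicted}
    if not t and not p:
        return 1.0
    return len(t & p) / len(t | p)


# (ground truth, extracted) location pairs for rows 0 to 3 of the table above.
location_pairs = [
    ("Brightwater", "Brightwater"),
    ("ancient forest of Elderglen", "the ancient forest of Elderglen"),
    ("Aurelian", "the bustling city of Aurelian"),
    ("Eldergrove Forest", "the ancient forest of Eldergrove"),
]
matches = [location_match(t, p) for t, p in location_pairs]
print(matches)

# Row 0 characters: ground truth vs. the single-agent output shown earlier.
overlap = character_overlap(
    ["Emily Langley", "Mrs. Henderson", "Eliza Marlowe"],
    ["Emily Langley", "Mrs. Henderson", "Captain Eliza Marlowe"],
)
print(overlap)
```

The last location pair does not match under this rule even though both strings name the same place, which illustrates why free-text fields usually need a more forgiving comparison (or a human check) than exact or substring matching.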


In [10]:
Albert.formatted_prompt

"**Review the input item below and extract the specified keys as instructed:** --- **Input Item:** <<${item}$>> **Keys to Extract and Their Expected Formats:** <<{'location': <class 'str'>, 'characters': list[str]}>> --- **Instructions:** Follow the detailed guidelines below for extracting the specified keys: <<{'location': 'The main location that the story happens in.', 'characters': 'The name of the characters mentioned in the story.'}>> --- ${additional_context}$"
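The formatted prompt still contains the placeholders `${item}$` and `${additional_context}$`, which are presumably filled in for each input item at review time. As an illustration only, the sketch below fills placeholders of that shape with plain string replacement. The `fill_placeholders` helper and the shortened template are hypothetical, and the library's actual substitution mechanism is not shown in this notebook.

```python
def fill_placeholders(template: str, values: dict[str, str]) -> str:
    """Replace each ${key}$ placeholder in the template with its value."""
    filled = template
    for key, value in values.items():
        filled = filled.replace("${" + key + "}$", value)
    return filled


# Shortened, hypothetical template that keeps only the two placeholders.
template = "**Input Item:** <<${item}$>> --- ${additional_context}$"

prompt = fill_placeholders(
    template,
    {
        "item": "in the bustling city of aurelian, ...",
        "additional_context": "",
    },
)
print(prompt)
```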