# Scoring Review

In this notebook, we demonstrate how LatteReview can be used to score input articles in a multiagent framework.

## Setting up the notebook

High-level configs

In [1]:
%reload_ext autoreload
%autoreload 2

from dotenv import load_dotenv

# Load environment variables from .env file. Adjust the path to the .env file as needed.
load_dotenv(dotenv_path='../.env')

# Enable asyncio in Jupyter
import asyncio
import nest_asyncio

nest_asyncio.apply()

#  Add the package to the path (required if you are running this notebook from the examples folder)
import sys
sys.path.append('../../')


Import required packages

In [5]:
import pandas as pd

from lattereview.providers import OpenAIProvider
from lattereview.providers import GoogleProvider
from lattereview.providers import LiteLLMProvider
from lattereview.agents import ScoringReviewer
from lattereview.workflows import ReviewWorkflow

Let's load a dummy dataset of 20 random articles about AI.

In [6]:
data = pd.read_csv('data.csv')
data

Unnamed: 0,Title,Abstract,Authors,Year
0,Fusing an agent-based model of mosquito popula...,The mosquito Aedes aegypti is the vector of a ...,"Cavany, S.M.",2022
1,PDRL: Multi-Agent based Reinforcement Learning...,Reinforcement learning has been increasingly a...,"Shaik, T.",2023
2,Learning-accelerated Discovery of Immune-Tumou...,We present an integrated framework for enablin...,"Ozik, J.",2019
3,Investigating spatiotemporal dynamics and sync...,"In this paper we present AceMod, an agent-base...","Cliff, O.M.",2018
4,Modeling the Spread of COVID-19 in University ...,Mathematical and simulation models are often u...,"Herrmann, J.W.",2024
5,Multi-Agent Reinforcement Learning with Action...,Unmanned Aerial Vehicles (UAVs) are increasing...,"Rizvi, D.",2023
6,A new (N) stochastic quenched disorder model f...,Human beings live in a networked world in whic...,"Ferreira, A.A.",2019
7,Frustration induced phases in migrating cell c...,Collective motion of cells is common in many p...,"Copenhagen, K.",2017
8,Universal masking is urgent in the COVID-19 pa...,We present two models for the COVID-19 pandemi...,"Kai, D.",2020
9,Calculus of consent via MARL: Legitimating the...,"Public policies that supply public goods, espe...","Hu, Y.",2021


## A multiagent review workflow for doing title/abstract analysis

Here, we define three reviewer agents. Agents 1 and 2 are scoring reviewers, each given a different background and prompted toward a different style of thinking. They work in parallel, reviewing every input title and abstract to determine whether the article discusses large language model-based AI agents applied to medical imaging data. Agent 3 is another scoring reviewer, prompted to act as a more senior reviewer. It reviews only the articles on which Agents 1 and 2 gave different scores and decides which of the two it agrees with, and thereby whether the article is included or excluded.

It is also important to note that different models are used for these agents. Agents 1 and 2, which need to review many articles, are based on smaller and more cost-efficient models: Gemini 2.5 Flash (through `GoogleProvider`) and o4-mini (through `LiteLLMProvider`), respectively. Agent 3, tasked with the more complex job of conflict resolution, uses Gemini 2.5 Pro, a more expensive model. However, since Agent 3 only reviews the small subset of articles where Agents 1 and 2 disagree, its usage is limited. If Agents 1 and 2 agree on every article, Agent 3 is not called at all.

In [7]:
Agent1 = ScoringReviewer(
    provider=GoogleProvider(model="gemini-2.5-flash-preview-04-17"),
    name="Agent1",
    max_concurrent_requests=20, 
    backstory="a radiologist with many years of background in statistics and data science, who is famous among your colleagues for your systematic thinking, organization of thoughts, and being conservative",
    model_args={"max_tokens": 200, "temperature": 0.1, "reasoning_effort": "low"},
    reasoning = "brief",
    scoring_task="Look for articles that discuss large language models-based AI agents applied to medical imaging data",
    scoring_set=[1, 2],
    scoring_rules='Score 1 if the paper meets the criteria, and 2 if the paper does not meet the criteria.',
)

Agent2 = ScoringReviewer(
    provider=LiteLLMProvider(model="o4-mini"),
    name="Agent2",
    max_concurrent_requests=20, 
    backstory="an expert in data science with a background in developing ML models for healthcare, who is famous among your colleagues for your creativity and out of the box thinking",
    # model_args={"max_tokens": 200, "temperature": 0.9},
    reasoning = "brief",
    scoring_task="Look for articles that discuss large language models-based AI agents applied to medical imaging data",
    scoring_set=[1, 2],
    scoring_rules='Score 1 if the paper meets the criteria, and 2 if the paper does not meet the criteria.',
)

Agent3 = ScoringReviewer(
    provider=GoogleProvider(model="gemini-2.5-pro-preview-05-06"),
    name="Agent3",
    max_concurrent_requests=20, 
    model_args={"max_tokens": 200, "temperature": 0.1},
    backstory="a senior radiologist with a PhD in computer science and years of experience as the director of a DL lab focused on developing ML models for radiology and healthcare",
    reasoning = "cot",
    scoring_task="""Agent1 and Agent2 have looked for articles that discuss large language models-based AI agents applied to medical imaging data. 
                       They scored an article 1 if they thought it meets the criteria, and 2 if they thought it does not meet the criteria.
                       You will receive an article they have had different opinions about, as well as each of their scores and their reasoning for that score. Read their reviews and determine who you agree with. 
                    """,
    scoring_set=[1, 2],
    scoring_rules="""Score 1 if you agree with Agent1, and score 2 if you agree with Agent2.""",
)


Setting up the review workflow:

In [8]:
title_abs_review = ReviewWorkflow(
    workflow_schema=[
        {
            "round": 'A',
            "reviewers": [Agent1, Agent2],
            "text_inputs": ["Title", "Abstract"]
        },
        {
            "round": 'B',
            "reviewers": [Agent3],
            "text_inputs": ["Title", "Abstract", "round-A_Agent1_output", "round-A_Agent2_output"],
            "filter": lambda row: row["round-A_Agent1_output"]["score"] != row["round-A_Agent2_output"]["score"]
        }
    ]
)
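
The `filter` entry in round B is an ordinary Python callable that receives one row and returns a truthy value when that row should be passed to Agent3. As a minimal sketch of that selection logic, the snippet below applies the same predicate to plain dictionaries standing in for dataframe rows. The sample rows and the `needs_round_b` name are made up for illustration, and how `ReviewWorkflow` applies the filter internally is not shown here.

```python
# Hypothetical stand-ins for dataframe rows after round A (not real outputs).
rows = [
    {"Title": "Paper X",
     "round-A_Agent1_output": {"score": 2},
     "round-A_Agent2_output": {"score": 2}},
    {"Title": "Paper Y",
     "round-A_Agent1_output": {"score": 1},
     "round-A_Agent2_output": {"score": 2}},
    {"Title": "Paper Z",
     "round-A_Agent1_output": {"score": 1},
     "round-A_Agent2_output": {"score": 1}},
]


def needs_round_b(row):
    """Same predicate as the round-B filter: True when the two scores differ."""
    return (
        row["round-A_Agent1_output"]["score"]
        != row["round-A_Agent2_output"]["score"]
    )


# Only rows on which the two reviewers disagree are kept for the senior reviewer.
disputed = [row["Title"] for row in rows if needs_round_b(row)]
print(disputed)
```

Only "Paper Y" is selected, because it is the only row where the two scores differ. When no row satisfies the predicate, round B has nothing to review.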

Applying the review workflow to a number of sample articles:

In [9]:
# Reload the data if needed.
data = pd.read_csv('data.csv')
updated_data = asyncio.run(title_abs_review(data))

print("\n====== Costs ======\n")
print("Total cost: ", title_abs_review.get_total_cost())
print("Detailed costs: ", title_abs_review.reviewer_costs)

updated_data



Processing 20 eligible rows


['round: A', 'reviewer_name: Agent1'] -                     2025-05-14 12:51:46: 100%|██████████| 20/20 [00:05<00:00,  3.51it/s]


The following columns are present in the dataframe at the end of Agent1's reivew in round A: ['Title', 'Abstract', 'Authors', 'Year', 'round-A_Agent1_output', 'round-A_Agent1_reasoning', 'round-A_Agent1_score', 'round-A_Agent1_certainty']


['round: A', 'reviewer_name: Agent2'] -                     2025-05-14 12:51:52: 100%|██████████| 20/20 [00:06<00:00,  2.98it/s]

The following columns are present in the dataframe at the end of Agent2's reivew in round A: ['Title', 'Abstract', 'Authors', 'Year', 'round-A_Agent1_output', 'round-A_Agent1_reasoning', 'round-A_Agent1_score', 'round-A_Agent1_certainty', 'round-A_Agent2_output', 'round-A_Agent2_reasoning', 'round-A_Agent2_score', 'round-A_Agent2_certainty']


Skipping review round B - no eligible rows


Total cost:  0.0049851
Detailed costs:  {('A', 'Agent1'): 0.00318, ('A', 'Agent2'): 0.0018051}
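
In this run, the reported total matches the sum of the per-reviewer entries. The short check below copies the printed figures into a plain dictionary and compares them. The variable names are illustrative, and the idea that `get_total_cost()` is computed as this sum is an assumption based on the numbers above, not on the library's source.

```python
import math

# Figures copied from the printed output of this run.
reviewer_costs = {("A", "Agent1"): 0.00318, ("A", "Agent2"): 0.0018051}
reported_total = 0.0049851

# Sum the per-reviewer costs and compare with a tolerance,
# since floating-point addition is not exact.
summed = sum(reviewer_costs.values())
totals_match = math.isclose(summed, reported_total, rel_tol=1e-9)
print(totals_match)

# Group costs by round (the first element of each key).
cost_per_round = {}
for (round_id, _reviewer), cost in reviewer_costs.items():
    cost_per_round[round_id] = cost_per_round.get(round_id, 0.0) + cost
print(cost_per_round)
```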





Unnamed: 0,Title,Abstract,Authors,Year,round-A_Agent1_output,round-A_Agent1_reasoning,round-A_Agent1_score,round-A_Agent1_certainty,round-A_Agent2_output,round-A_Agent2_reasoning,round-A_Agent2_score,round-A_Agent2_certainty
0,Fusing an agent-based model of mosquito popula...,The mosquito Aedes aegypti is the vector of a ...,"Cavany, S.M.",2022,{'reasoning': 'The paper discusses an agent-ba...,The paper discusses an agent-based model of mo...,2,100,{'reasoning': 'The paper focuses on mechanisti...,The paper focuses on mechanistic and statistic...,2,100
1,PDRL: Multi-Agent based Reinforcement Learning...,Reinforcement learning has been increasingly a...,"Shaik, T.",2023,{'reasoning': 'The paper discusses reinforceme...,The paper discusses reinforcement learning age...,2,100,{'reasoning': 'The paper focuses on deep reinf...,The paper focuses on deep reinforcement learni...,2,100
2,Learning-accelerated Discovery of Immune-Tumou...,We present an integrated framework for enablin...,"Ozik, J.",2019,{'reasoning': 'The article discusses agent-bas...,The article discusses agent-based simulation a...,2,100,{'reasoning': 'The paper describes agent-based...,The paper describes agent-based simulations fo...,2,95
3,Investigating spatiotemporal dynamics and sync...,"In this paper we present AceMod, an agent-base...","Cliff, O.M.",2018,{'reasoning': 'The paper discusses agent-based...,The paper discusses agent-based modeling for s...,2,100,{'reasoning': 'The paper presents an agent-bas...,The paper presents an agent-based model for in...,2,100
4,Modeling the Spread of COVID-19 in University ...,Mathematical and simulation models are often u...,"Herrmann, J.W.",2024,{'reasoning': 'The paper discusses mathematica...,The paper discusses mathematical and simulatio...,2,100,{'reasoning': 'The paper focuses on epidemiolo...,The paper focuses on epidemiological modeling ...,2,95
5,Multi-Agent Reinforcement Learning with Action...,Unmanned Aerial Vehicles (UAVs) are increasing...,"Rizvi, D.",2023,{'reasoning': 'The paper discusses multi-agent...,The paper discusses multi-agent reinforcement ...,2,100,{'reasoning': 'This paper addresses multi-agen...,This paper addresses multi-agent RL for UAV co...,2,95
6,A new (N) stochastic quenched disorder model f...,Human beings live in a networked world in whic...,"Ferreira, A.A.",2019,{'reasoning': 'The paper discusses a network m...,The paper discusses a network model for opinio...,2,100,{'reasoning': 'The paper focuses on a stochast...,The paper focuses on a stochastic network mode...,2,90
7,Frustration induced phases in migrating cell c...,Collective motion of cells is common in many p...,"Copenhagen, K.",2017,{'reasoning': 'The abstract discusses cell clu...,The abstract discusses cell cluster migration ...,2,100,{'reasoning': 'The article focuses on modeling...,The article focuses on modeling cell cluster m...,2,95
8,Universal masking is urgent in the COVID-19 pa...,We present two models for the COVID-19 pandemi...,"Kai, D.",2020,{'reasoning': 'The paper discusses SEIR and ag...,The paper discusses SEIR and agent-based model...,2,100,{'reasoning': 'This paper focuses on epidemiol...,This paper focuses on epidemiological models o...,2,95
9,Calculus of consent via MARL: Legitimating the...,"Public policies that supply public goods, espe...","Hu, Y.",2021,{'reasoning': 'The abstract discusses Multi-Ag...,The abstract discusses Multi-Agent Reinforceme...,2,100,{'reasoning': 'The paper focuses on multi-agen...,The paper focuses on multi-agent reinforcement...,2,95
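
One way to collapse these columns into a single decision per article is sketched below. It assumes the scoring rules defined earlier: for Agents 1 and 2, a score of 1 means the paper meets the criteria and 2 means it does not; for Agent3, 1 means agreement with Agent1 and 2 means agreement with Agent2. `final_score` is a hypothetical helper, not part of LatteReview, and the sample rows are invented.

```python
def final_score(row):
    """Return the agreed score for one row, or None if it is unresolved."""
    score_1 = row["round-A_Agent1_score"]
    score_2 = row["round-A_Agent2_score"]
    if score_1 == score_2:
        # Both reviewers agree, so round B was never needed for this row.
        return score_1
    # Agent3's score names the reviewer it sides with (1 -> Agent1, 2 -> Agent2).
    verdict = row.get("round-B_Agent3_score")
    if verdict == 1:
        return score_1
    if verdict == 2:
        return score_2
    return None  # disagreement with no usable round-B verdict


# Invented rows for illustration only.
agreed = {"round-A_Agent1_score": 2, "round-A_Agent2_score": 2}
resolved = {"round-A_Agent1_score": 1, "round-A_Agent2_score": 2,
            "round-B_Agent3_score": 2}
unresolved = {"round-A_Agent1_score": 1, "round-A_Agent2_score": 2}

print(final_score(agreed), final_score(resolved), final_score(unresolved))
```

Because the helper only uses key lookup and `.get`, it should also work on pandas rows, for example through `updated_data.apply(final_score, axis=1)`, though that usage is untested here.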


In [11]:
for i, row in updated_data.iterrows():
    print(
        f"""
        ====== item {i} ======
        Title: {row.Title}
        Abstract: {row.Abstract}
        Agent1's score: {row["round-A_Agent1_score"]}
        Agent1's reasoning: {row["round-A_Agent1_reasoning"]}
        Agent1's certainty: {row["round-A_Agent1_certainty"]}
        Agent2's score: {row["round-A_Agent2_score"]}
        Agent2's reasoning: {row["round-A_Agent2_reasoning"]}
        Agent2's certainty: {row["round-A_Agent2_certainty"]}
        Agent3's score: {None if "round-B_Agent3_score" not in row else row["round-B_Agent3_score"]}
        Agent3's reasoning: {None if "round-B_Agent3_reasoning" not in row else row["round-B_Agent3_reasoning"]}
        Agent3's certainty: {None if "round-B_Agent3_certainty" not in row else row["round-B_Agent3_certainty"]}
        """
    )


        Title: Fusing an agent-based model of mosquito population dynamics with a statistical reconstruction of spatio-temporal abundance patterns
        Abstract: The mosquito Aedes aegypti is the vector of a number of medically-important viruses, including dengue virus, yellow virus, chikungunya virus, and Zika virus, and as such vector control is a key approach to managing the diseases they cause. Understanding the impact of vector control on these diseases is aided by first understanding its impact on Ae. aegypti population dynamics. A number of detail-rich models have been developed to couple the dynamics of the immature and adult stages of Ae. aegypti. The numerous assumptions of these models enable them to realistically characterize impacts of mosquito control, but they also constrain the ability of such models to reproduce empirical patterns that do not conform to the models’ behavior. In contrast, statistical models afford sufficient flexibility to extract nuanced signals fr