# TitleAbstract Reviewer

In this notebook, we will demonstrate the use of `TitleAbstractReviewer` agents.

## Setting up the notebook

High-level configs

In [1]:
%reload_ext autoreload
%autoreload 2

from dotenv import load_dotenv

# Load environment variables from .env file. Adjust the path to the .env file as needed.
load_dotenv(dotenv_path='../.env')

# Enable asyncio in Jupyter
import asyncio
import nest_asyncio

nest_asyncio.apply()

# Add the package to the path (required if you are running this notebook from the examples folder)
import sys
sys.path.append('../../')


Import required packages

In [2]:
import pandas as pd 

from lattereview.providers import LiteLLMProvider, OpenAIProvider
from lattereview.agents import TitleAbstractReviewer
from lattereview.workflows import ReviewWorkflow



## Data

First, let's load a sample dataset of articles that we pooled for a prior systematic review, published [here](https://pubmed.ncbi.nlm.nih.gov/36292201/).

In [3]:
data = pd.read_csv('data.csv')
data

Unnamed: 0,title,abstract,DOI,study_type,clinical_application,organ,modality,cardiovascular,task,deep_learning,external_validation
0,(18)F-FDG PET/CT Uptake Classification in Lymp...,Background Fluorine 18 ((18)F)-fluorodeoxygluc...,10.1148/radiol.2019191114,cross-sectional,diagnosis,lung,pet,0,classification,1,0
1,(18)F-FDG-PET/CT Whole-Body Imaging Lung Tumor...,Under the background of (18)F-FDG-PET/CT multi...,10.1155/2021/8865237,cross-sectional,diagnosis,lung,pet,0,classification,1,0
2,3-D Convolutional Neural Networks for Automati...,Deep two-dimensional (2-D) convolutional neura...,10.1109/jbhi.2018.2879449,cross-sectional,diagnosis,lung,ct,0,detection,1,0
3,3D CNN with Visual Insights for Early Detectio...,The 3D convolutional neural network is able to...,10.1155/2021/6695518,cross-sectional,diagnosis,lung,ct,0,classification,1,0
4,3D deep learning based classification of pulmo...,Classifying ground-glass lung nodules (GGNs) i...,10.1016/j.compmedimag.2020.101814,cross-sectional,diagnosis,lung,ct,0,"segmentation, classification",1,0
...,...,...,...,...,...,...,...,...,...,...,...
973,Vulture-Based AdaBoost-Feedforward Neural Fram...,"In today's scenario, many scientists and medic...",10.1007/s12539-022-00505-3,cross-sectional,diagnosis,lung,xr,0,"segmentation, classification",1,0
974,Wavelet decomposition facilitates training on ...,The adoption of low-dose computed tomography (...,10.1007/s00418-020-01961-y,cross-sectional,diagnosis,lung,ct,0,classification,1,0
975,Weakly unsupervised conditional generative adv...,Because of the rapid spread and wide range of ...,10.1016/j.media.2021.102159,cross-sectional,diagnosis,lung,ct,0,classification,1,0
976,Weakly-supervised lesion analysis with a CNN-b...,Objective.Lesions of COVID-19 can be clearly v...,10.1088/1361-6560/ac4316,cross-sectional,diagnosis,lung,ct,0,classification,1,0


Let's then write some inclusion and exclusion criteria to screen these papers.

In [4]:
inclusion_criteria = "The study must involve CT scans. If multiple modalities are involved, CT scans should be among them."
exclusion_criteria = "The study must not include PET scans as one of its modalities."

## Review

Next, we will define three reviewers (two juniors and one senior) that will go through the dataset and look for the items that meet all the inclusion criteria and do not violate any of the exclusion criteria. Each agent returns one of the following scores:

- 1 means absolutely to exclude.
- 2 means better to exclude.
- 3 means not sure whether to include or exclude.
- 4 means better to include.
- 5 means absolutely to include.
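
When reading the results later, it can help to turn the numeric scores back into the labels listed above. The following is a minimal sketch; `SCORE_LABELS` and `describe_score` are hypothetical helpers written for this notebook and are not part of `lattereview`.

```python
# Hypothetical lookup table mirroring the score scale described above.
SCORE_LABELS = {
    1: "absolutely exclude",
    2: "better to exclude",
    3: "not sure",
    4: "better to include",
    5: "absolutely include",
}


def describe_score(score):
    """Return the label for a 1-5 score; accepts ints or whole-number floats such as 2.0."""
    return SCORE_LABELS[int(score)]


print(describe_score(4))
```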

In [5]:
Agent1 = TitleAbstractReviewer(
    provider=LiteLLMProvider(model="gemini/gemini-1.5-flash"),
    name="Agent1",
    backstory="a PhD researcher in biology and computer science",
    inclusion_criteria = inclusion_criteria,
    exclusion_criteria = exclusion_criteria,        
    max_concurrent_requests=30, 
    model_args={"max_tokens": 200, "temperature": 0.1},
)

Agent2 = TitleAbstractReviewer(
    provider=LiteLLMProvider(model="gpt-4o-mini"), 
    name="Agent2",
    backstory="a PhD researcher in biology and computer science",
    inclusion_criteria = inclusion_criteria,
    exclusion_criteria = exclusion_criteria,        
    max_concurrent_requests=30, 
    model_args={"max_tokens": 200, "temperature": 0.1},
)

Agent3 = TitleAbstractReviewer(
    provider=LiteLLMProvider(model="o3-mini"),
    name="Agent3",
    backstory="a senior MD-PhD researcher with years of experience in conducting systematic reviews in radiology and deep learning",
    inclusion_criteria = inclusion_criteria,
    exclusion_criteria = exclusion_criteria,        
    max_concurrent_requests=30, 
    model_args={'reasoning_effort': 'medium'},
    additional_context="""
    Two PhD reviewers have already reviewed this article and disagree on how to evaluate it. You can read their evaluation above.
    """
)

Finally, let's create a review workflow and put our agents to work!

In [14]:
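def filter_func(row):
    """Return True when a row should be escalated to the senior reviewer in round B."""
    score1 = int(row["round-A_Agent1_output"]["evaluation"])
    score2 = int(row["round-A_Agent2_output"]["evaluation"])
    if score1 != score2:
        # Both juniors lean toward inclusion (4 vs. 5): no escalation needed.
        if score1 >= 4 and score2 >= 4:
            return False
        # The juniors disagree and at least one is unsure or leans toward inclusion.
        if score1 >= 3 or score2 >= 3:
            return True
    # Both juniors are unsure.
    elif score1 == score2 == 3:
        return True
    # Remaining cases (agreement, or 1 vs. 2) are not escalated.
    return False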

async def review(df, sample_size=None):
    
    title_abs_review = ReviewWorkflow(
        workflow_schema=[
                {
                    "round": 'A',
                    "reviewers": [Agent1, Agent2],
                    "text_inputs": ["title", "abstract"]
                },
                {
                    "round": 'B',
                    "reviewers": [Agent3],
                    "text_inputs": ["title", "abstract", "round-A_Agent1_output", "round-A_Agent2_output"],
                    "filter": filter_func
                }
            ]
        )

    if sample_size:
        df = df.sample(sample_size)

    updated_df = await title_abs_review(df)
    total_cost = title_abs_review.get_total_cost()
    print("\n====== Finished reviewing ======\n")
    print(f"\nTotal cost: {total_cost}")
    print("-" * 100)
    return updated_df


updated_df = asyncio.run(review(data, sample_size=10))



Processing 10 eligible rows


['round: A', 'reviewer_name: Agent1'] -                     2025-03-16 13:59:23: 100%|██████████| 10/10 [00:00<00:00, 10.99it/s]


The following columns are present in the dataframe at the end of Agent1's reivew in round A: ['title', 'abstract', 'DOI', 'study_type', 'clinical_application', 'organ', 'modality', 'cardiovascular', 'task', 'deep_learning', 'external_validation', 'round-A_Agent1_output', 'round-A_Agent1_reasoning', 'round-A_Agent1_evaluation']


['round: A', 'reviewer_name: Agent2'] -                     2025-03-16 13:59:24: 100%|██████████| 10/10 [00:01<00:00,  5.37it/s]


The following columns are present in the dataframe at the end of Agent2's reivew in round A: ['title', 'abstract', 'DOI', 'study_type', 'clinical_application', 'organ', 'modality', 'cardiovascular', 'task', 'deep_learning', 'external_validation', 'round-A_Agent1_output', 'round-A_Agent1_reasoning', 'round-A_Agent1_evaluation', 'round-A_Agent2_output', 'round-A_Agent2_reasoning', 'round-A_Agent2_evaluation']


Processing 1 eligible rows


['round: B', 'reviewer_name: Agent3'] -                     2025-03-16 13:59:25: 100%|██████████| 1/1 [00:05<00:00,  5.44s/it]

The following columns are present in the dataframe at the end of Agent3's reivew in round B: ['title', 'abstract', 'DOI', 'study_type', 'clinical_application', 'organ', 'modality', 'cardiovascular', 'task', 'deep_learning', 'external_validation', 'round-A_Agent1_output', 'round-A_Agent1_reasoning', 'round-A_Agent1_evaluation', 'round-A_Agent2_output', 'round-A_Agent2_reasoning', 'round-A_Agent2_evaluation', 'round-B_Agent3_output', 'round-B_Agent3_reasoning', 'round-B_Agent3_evaluation']



Total cost: 0.00293695
----------------------------------------------------------------------------------------------------
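
Only one of the ten sampled rows reached Agent3, which follows from the escalation rule in `filter_func`. To see which combinations of junior scores are escalated, the sketch below repeats the same logic on plain dictionaries and enumerates all 25 score pairs. `make_row` is a hypothetical helper that only mimics the shape of the round-A output columns; no model is called.

```python
from itertools import product


def filter_func(row):
    # Same logic as the notebook's filter_func, repeated so this block runs on its own.
    score1 = int(row["round-A_Agent1_output"]["evaluation"])
    score2 = int(row["round-A_Agent2_output"]["evaluation"])
    if score1 != score2:
        if score1 >= 4 and score2 >= 4:
            return False
        if score1 >= 3 or score2 >= 3:
            return True
    elif score1 == score2 == 3:
        return True
    return False


def make_row(score1, score2):
    # Hypothetical helper: a minimal dict shaped like a row after round A.
    return {
        "round-A_Agent1_output": {"evaluation": score1},
        "round-A_Agent2_output": {"evaluation": score2},
    }


# Collect every (Agent1, Agent2) score pair that would be sent to Agent3.
escalated = [
    (s1, s2)
    for s1, s2 in product(range(1, 6), repeat=2)
    if filter_func(make_row(s1, s2))
]

print(f"{len(escalated)} of 25 score pairs are escalated:")
print(escalated)
```

Under this rule, a 2 vs. 3 split is escalated, while 1 vs. 2 and 4 vs. 5 splits are not, since in those cases both juniors already lean the same way.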





In [15]:
updated_df

Unnamed: 0,title,abstract,DOI,study_type,clinical_application,organ,modality,cardiovascular,task,deep_learning,external_validation,round-A_Agent1_output,round-A_Agent1_reasoning,round-A_Agent1_evaluation,round-A_Agent2_output,round-A_Agent2_reasoning,round-A_Agent2_evaluation,round-B_Agent3_output,round-B_Agent3_reasoning,round-B_Agent3_evaluation
351,CovXNet: A multi-dilation convolutional neural...,"With the recent outbreak of COVID-19, fast dia...",10.1016/j.compbiomed.2020.103869,cross-sectional,diagnosis,lung,xr,0,classification,1,0,"{'evaluation': 1, 'reasoning': 'The study uses...","The study uses chest X-ray images, not CT scan...",1,{'reasoning': 'The study focuses solely on che...,The study focuses solely on chest X-ray images...,1,,,
224,Automatic segmentation of organs at risk and t...,BACKGROUND AND OBJECTIVE: Accurately and relia...,10.1016/j.cmpb.2021.106419,cross-sectional,treatment,lung,ct,0,segmentation,1,1,"{'evaluation': 5, 'reasoning': 'The abstract e...",The abstract explicitly states that the study ...,5,{'reasoning': 'The study focuses on CT images ...,The study focuses on CT images for lung cancer...,5,,,
18,A cascade and heterogeneous neural network for...,Screening of pulmonary nodules in computed tom...,10.1016/j.compmedimag.2021.101889,cross-sectional,diagnosis,lung,ct,0,classification,1,1,"{'evaluation': 5, 'reasoning': 'The study focu...",The study focuses on CT pulmonary nodule detec...,5,{'reasoning': 'The study involves CT scans exc...,The study involves CT scans exclusively for pu...,5,,,
260,Classification of malignant and benign lung no...,Lung cancer presents the highest cause of deat...,10.1007/s11517-018-1841-0,cross-sectional,diagnosis,lung,ct,0,classification,0,0,"{'evaluation': 5, 'reasoning': 'The abstract e...",The abstract explicitly states that the study ...,5,{'reasoning': 'The study involves CT scans and...,The study involves CT scans and does not menti...,5,,,
598,Implementation of a Deep Learning-Based Comput...,OBJECTIVE: To describe the experience of imple...,10.3348/kjr.2020.0536,cross-sectional,diagnosis,lung,xr,0,classification,1,1,"{'evaluation': 2, 'reasoning': 'Although the s...","Although the study involves chest X-rays, a mo...",2,{'reasoning': 'The study involves chest X-rays...,The study involves chest X-rays and evaluates ...,1,,,
899,Strategies to develop radiomics and machine le...,Predictive models based on radiomics and machi...,10.1016/j.ejmp.2021.08.015,cross-sectional,diagnosis,lung,ct,0,classification,0,0,"{'evaluation': 5, 'reasoning': 'The study uses...",The study uses CT scans and does not involve P...,5,{'reasoning': 'The study involves CT scans exc...,The study involves CT scans exclusively for th...,5,,,
544,Exploiting Global Structure Information to Imp...,"In this paper, we propose a method to enhance ...",10.3390/s21093249,cross-sectional,diagnosis,lung,xr,0,segmentation,1,0,"{'evaluation': 2, 'reasoning': 'The abstract d...",The abstract does not specify the use of CT sc...,2,{'reasoning': 'The abstract does not mention C...,The abstract does not mention CT scans or any ...,3,{'reasoning': 'The study does not explicitly m...,The study does not explicitly mention the use ...,2.0
805,Pulmonary Nodule Detection Model Based on SVM ...,In order to improve the detection accuracy of ...,10.1155/2016/8052436,cross-sectional,diagnosis,lung,ct,0,classification,0,0,"{'evaluation': 4, 'reasoning': 'The abstract m...",The abstract mentions the use of CT scans for ...,4,{'reasoning': 'The study involves CT scans for...,The study involves CT scans for pulmonary nodu...,5,,,
567,Highly accurate model for prediction of lung n...,Computed tomography (CT) examinations are comm...,10.1038/s41598-018-27569-w,cross-sectional,diagnosis,lung,ct,0,classification,1,0,"{'evaluation': 5, 'reasoning': 'The study focu...",The study focuses on CT scan analysis for lung...,5,{'reasoning': 'The study involves CT scans exc...,The study involves CT scans exclusively for pr...,5,,,
227,Auxiliary Diagnosis for COVID-19 with Deep Tra...,To assist physicians identify COVID-19 and its...,10.1007/s10278-021-00431-8,cross-sectional,diagnosis,lung,ct,0,classification,1,0,"{'evaluation': 4, 'reasoning': 'The study uses...",The study uses CT scans and does not mention P...,4,{'reasoning': 'The study involves CT scans exc...,The study involves CT scans exclusively for th...,5,,,
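
The table above holds up to three evaluations per article. One possible way to condense them into a single score is to take Agent3's evaluation when it exists and otherwise average the two junior scores. This is only a sketch under that assumption: `final_evaluation` is a hypothetical helper, not something `lattereview` provides, and the averaging rule is one choice among many.

```python
import math


def final_evaluation(row):
    """Prefer the senior reviewer's score; otherwise average the two junior scores.

    `row` can be a dict or a pandas Series, since both support `.get` and key lookup.
    """
    senior = row.get("round-B_Agent3_evaluation")
    # Rows that were never escalated hold NaN (or lack the column entirely).
    if senior is not None and not math.isnan(float(senior)):
        return float(senior)
    junior1 = float(row["round-A_Agent1_evaluation"])
    junior2 = float(row["round-A_Agent2_evaluation"])
    return (junior1 + junior2) / 2


# Two rows shaped like the table above (scores taken from rows 544 and 805).
escalated_row = {
    "round-A_Agent1_evaluation": 2,
    "round-A_Agent2_evaluation": 3,
    "round-B_Agent3_evaluation": 2.0,
}
agreed_row = {
    "round-A_Agent1_evaluation": 4,
    "round-A_Agent2_evaluation": 5,
    "round-B_Agent3_evaluation": float("nan"),
}

print(final_evaluation(escalated_row))
print(final_evaluation(agreed_row))
```

On the real dataframe, the same function could be applied row by row with `updated_df.apply(final_evaluation, axis=1)`.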


Now let's print the responses of all three agents for the cases where Agent3 needed to intervene.

In [16]:
# Keep only the rows that Agent3 reviewed, i.e. the cases that filter_func escalated after round A

if "round-B_Agent3_evaluation" in updated_df.columns:   
    disagreed_cases = updated_df[
        updated_df["round-B_Agent3_evaluation"].notna()
    ]

    # Print the responses of all agents for these cases
    for index, row in disagreed_cases.iterrows():
        print(f"Title: {row['title']}")
        print(f"Abstract: {row['abstract']}")
        print(f"Agent1's response: {row['round-A_Agent1_output']}")
        print(f"Agent2's response: {row['round-A_Agent2_output']}")
        print(f"Agent3's response: {row['round-B_Agent3_output']}")
        print("-" * 100)

else:
    print("No cases where Agent3 intervened.")

Title: Exploiting Global Structure Information to Improve Medical Image Segmentation
Abstract: In this paper, we propose a method to enhance the performance of segmentation models for medical images. The method is based on convolutional neural networks that learn the global structure information, which corresponds to anatomical structures in medical images. Specifically, the proposed method is designed to learn the global boundary structures via an autoencoder and constrain a segmentation network through a loss function. In this manner, the segmentation model performs the prediction in the learned anatomical feature space. Unlike previous studies that considered anatomical priors by using a pre-trained autoencoder to train segmentation networks, we propose a single-stage approach in which the segmentation network and autoencoder are jointly learned. To verify the effectiveness of the proposed method, the segmentation performance is evaluated in terms of both the overlap and distance me