# TitleAbstract Reviewer

In this notebook, we will demonstrate the use of `TitleAbstractReviewer` agents.

## Setting up the notebook

High-level configs

In [1]:
%reload_ext autoreload
%autoreload 2

from dotenv import load_dotenv

# Load environment variables from .env file. Adjust the path to the .env file as needed.
load_dotenv(dotenv_path='../.env')

# Enable asyncio in Jupyter
import asyncio
import nest_asyncio

nest_asyncio.apply()

# Add the package to the path (required if you are running this notebook from the examples folder)
import sys
sys.path.append('../../')


Import required packages

In [2]:
import pandas as pd 

from lattereview.providers import LiteLLMProvider, OpenAIProvider
from lattereview.agents import TitleAbstractReviewer
from lattereview.workflows import ReviewWorkflow



## Data

First, let's load a dummy dataset of some articles that we once pooled for a prior systematic review published [here](https://pubmed.ncbi.nlm.nih.gov/36292201/).

In [3]:
data = pd.read_csv('data.csv')
data

Unnamed: 0,title,abstract,DOI,study_type,clinical_application,organ,modality,cardiovascular,task,deep_learning,external_validation
0,(18)F-FDG PET/CT Uptake Classification in Lymp...,Background Fluorine 18 ((18)F)-fluorodeoxygluc...,10.1148/radiol.2019191114,cross-sectional,diagnosis,lung,pet,0,classification,1,0
1,(18)F-FDG-PET/CT Whole-Body Imaging Lung Tumor...,Under the background of (18)F-FDG-PET/CT multi...,10.1155/2021/8865237,cross-sectional,diagnosis,lung,pet,0,classification,1,0
2,3-D Convolutional Neural Networks for Automati...,Deep two-dimensional (2-D) convolutional neura...,10.1109/jbhi.2018.2879449,cross-sectional,diagnosis,lung,ct,0,detection,1,0
3,3D CNN with Visual Insights for Early Detectio...,The 3D convolutional neural network is able to...,10.1155/2021/6695518,cross-sectional,diagnosis,lung,ct,0,classification,1,0
4,3D deep learning based classification of pulmo...,Classifying ground-glass lung nodules (GGNs) i...,10.1016/j.compmedimag.2020.101814,cross-sectional,diagnosis,lung,ct,0,"segmentation, classification",1,0
...,...,...,...,...,...,...,...,...,...,...,...
973,Vulture-Based AdaBoost-Feedforward Neural Fram...,"In today's scenario, many scientists and medic...",10.1007/s12539-022-00505-3,cross-sectional,diagnosis,lung,xr,0,"segmentation, classification",1,0
974,Wavelet decomposition facilitates training on ...,The adoption of low-dose computed tomography (...,10.1007/s00418-020-01961-y,cross-sectional,diagnosis,lung,ct,0,classification,1,0
975,Weakly unsupervised conditional generative adv...,Because of the rapid spread and wide range of ...,10.1016/j.media.2021.102159,cross-sectional,diagnosis,lung,ct,0,classification,1,0
976,Weakly-supervised lesion analysis with a CNN-b...,Objective.Lesions of COVID-19 can be clearly v...,10.1088/1361-6560/ac4316,cross-sectional,diagnosis,lung,ct,0,classification,1,0


Let's then write some inclusion and exclusion criteria to screen these papers.

In [4]:
inclusion_criteria = "The study must involve CT scans. If multiple modalities are involved, CT scans should be among them."
exclusion_criteria = "The study must not include PET scans as one of its modalities."

## Review

Then, we will define three reviewers (two juniors and one senior) that will go through the dataset and look for items that meet all the inclusion criteria and do not violate any of the exclusion criteria. Each agent returns one of the following scores:

- 1 means absolutely to exclude.
- 2 means better to exclude.
- 3 means not sure whether to include or exclude.
- 4 means better to include.
- 5 means absolutely to include.
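
As a small illustration of this scale, the sketch below maps a numeric score to its label. The names `SCORE_LABELS` and `describe_score` are hypothetical helpers written for this notebook; they are not part of `lattereview`.

```python
# Hypothetical helper for reading the 1-5 scale; not part of lattereview.
SCORE_LABELS = {
    1: "absolutely exclude",
    2: "better to exclude",
    3: "not sure",
    4: "better to include",
    5: "absolutely include",
}


def describe_score(score):
    """Return the human-readable label for a 1-5 reviewer score."""
    score = int(score)  # tolerate scores stored as strings or floats
    if score not in SCORE_LABELS:
        raise ValueError(f"score must be between 1 and 5, got {score}")
    return SCORE_LABELS[score]


print(describe_score(4))
```

A helper like this can make printed results easier to scan when reading the reviewers' evaluations later on.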

In [6]:
Agent1 = TitleAbstractReviewer(
    provider=LiteLLMProvider(model="gemini/gemini-1.5-flash"),
    name="Agent1",
    backstory="a PhD researcher in biology and computer science",
    inclusion_criteria = inclusion_criteria,
    exclusion_criteria = exclusion_criteria,        
    max_concurrent_requests=30, 
    model_args={"max_tokens": 200, "temperature": 0.1},
)

Agent2 = TitleAbstractReviewer(
    provider=LiteLLMProvider(model="gpt-4o-mini"), 
    name="Agent2",
    backstory="a PhD researcher in biology and computer science",
    inclusion_criteria = inclusion_criteria,
    exclusion_criteria = exclusion_criteria,        
    max_concurrent_requests=30, 
    model_args={"max_tokens": 200, "temperature": 0.1},
)

Agent3 = TitleAbstractReviewer(
    provider=LiteLLMProvider(model="o3-mini"),
    name="Agent3",
    backstory="a senior MD-PhD researcher with years of experience in conducting systematic reviews in radiology and deep learning",
    inclusion_criteria = inclusion_criteria,
    exclusion_criteria = exclusion_criteria,        
    max_concurrent_requests=30, 
    model_args={'reasoning_effort': 'medium'},
    additional_context="""
    Two PhD reviewers have already reviewed this article and disagree on how to evaluate it. You can read their evaluation above.
    """
)

Finally, let's create a review workflow and put our agents to work!

In [7]:
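def filter_func(row):
    # Send a row to the senior reviewer (round B) only when the juniors need arbitration.
    score1 = int(row["round-A_Agent1_output"]["evaluation"])
    score2 = int(row["round-A_Agent2_output"]["evaluation"])
    if score1 != score2:
        # Both juniors lean towards inclusion (4 vs. 5): no arbitration needed.
        if score1 >= 4 and score2 >= 4:
            return False
        # At least one junior is unsure or leans towards inclusion: escalate.
        if score1 >= 3 or score2 >= 3:
            return True
    elif score1 == score2 == 3:
        # Both juniors are unsure: escalate.
        return True
    # Agreement on a score other than 3, or a 1-vs-2 split: no arbitration needed.
    return False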

async def review(df, sample_size=None):
    
    title_abs_review = ReviewWorkflow(
        workflow_schema=[
                {
                    "round": 'A',
                    "reviewers": [Agent1, Agent2],
                    "text_inputs": ["title", "abstract"]
                },
                {
                    "round": 'B',
                    "reviewers": [Agent3],
                    "text_inputs": ["title", "abstract", "round-A_Agent1_output", "round-A_Agent2_output"],
                    "filter": filter_func
                }
            ]
        )

    if sample_size:
        df = df.sample(sample_size)

    updated_df = await title_abs_review(df)
    total_cost = title_abs_review.get_total_cost()
    print("\n====== Finished reviewing ======\n")
    print(f"\nTotal cost: {total_cost}")
    print("-" * 100)
    return updated_df


updated_df = asyncio.run(review(data, sample_size=10))



Processing 10 eligible rows


['round: A', 'reviewer_name: Agent1'] -                     2025-02-01 11:17:32: 100%|██████████| 10/10 [00:00<00:00, 10.21it/s]


The following columns are present in the dataframe at the end of Agent1's reivew in round A: ['title', 'abstract', 'DOI', 'study_type', 'clinical_application', 'organ', 'modality', 'cardiovascular', 'task', 'deep_learning', 'external_validation', 'round-A_Agent1_output', 'round-A_Agent1_reasoning', 'round-A_Agent1_evaluation']


['round: A', 'reviewer_name: Agent2'] -                     2025-02-01 11:17:33: 100%|██████████| 10/10 [00:01<00:00,  8.50it/s]


The following columns are present in the dataframe at the end of Agent2's reivew in round A: ['title', 'abstract', 'DOI', 'study_type', 'clinical_application', 'organ', 'modality', 'cardiovascular', 'task', 'deep_learning', 'external_validation', 'round-A_Agent1_output', 'round-A_Agent1_reasoning', 'round-A_Agent1_evaluation', 'round-A_Agent2_output', 'round-A_Agent2_reasoning', 'round-A_Agent2_evaluation']


Processing 1 eligible rows


['round: B', 'reviewer_name: Agent3'] -                     2025-02-01 11:17:34: 100%|██████████| 1/1 [00:04<00:00,  4.33s/it]

The following columns are present in the dataframe at the end of Agent3's reivew in round B: ['title', 'abstract', 'DOI', 'study_type', 'clinical_application', 'organ', 'modality', 'cardiovascular', 'task', 'deep_learning', 'external_validation', 'round-A_Agent1_output', 'round-A_Agent1_reasoning', 'round-A_Agent1_evaluation', 'round-A_Agent2_output', 'round-A_Agent2_reasoning', 'round-A_Agent2_evaluation', 'round-B_Agent3_output', 'round-B_Agent3_reasoning', 'round-B_Agent3_evaluation']



Total cost: 0.0026872000000000003
----------------------------------------------------------------------------------------------------
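
To see why only one of the ten sampled rows reached round B, the escalation rule can be checked in isolation. The sketch below repeats the branching of `filter_func` on two plain integers; `needs_senior_review` is a hypothetical name used only for this check, and the score pairs are illustrative.

```python
def needs_senior_review(score1, score2):
    """Same branching as filter_func, applied to two plain integer scores."""
    if score1 != score2:
        if score1 >= 4 and score2 >= 4:
            return False  # 4 vs. 5: both lean towards inclusion
        if score1 >= 3 or score2 >= 3:
            return True  # real disagreement involving a score of 3 or higher
    elif score1 == score2 == 3:
        return True  # both reviewers are unsure
    return False  # agreement, or a 1-vs-2 split


# Illustrative score pairs (Agent1, Agent2)
for pair in [(4, 5), (2, 5), (3, 3), (1, 2), (5, 5)]:
    print(pair, needs_senior_review(*pair))
```

Under this rule, a 2-versus-5 split is escalated, while 4-versus-5 and 1-versus-2 splits are treated as agreement in direction and skipped.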





In [14]:
updated_df

Unnamed: 0,title,abstract,DOI,study_type,clinical_application,organ,modality,cardiovascular,task,deep_learning,external_validation,round-A_Agent1_output,round-A_Agent1_reasoning,round-A_Agent1_evaluation,round-A_Agent2_output,round-A_Agent2_reasoning,round-A_Agent2_evaluation,round-B_Agent3_output,round-B_Agent3_reasoning,round-B_Agent3_evaluation
910,The deep learning model combining CT image and...,PURPOSE: This study aimed to investigate the d...,10.1007/s00259-020-04986-6,cross-sectional,treatment,lung,ct,0,classification,1,0,"{'evaluation': 4, 'reasoning': 'The study uses...",The study uses CT scans and clinicopathologica...,4,{'reasoning': 'The study involves CT scans and...,The study involves CT scans and does not menti...,5,,,
260,Classification of malignant and benign lung no...,Lung cancer presents the highest cause of deat...,10.1007/s11517-018-1841-0,cross-sectional,diagnosis,lung,ct,0,classification,0,0,"{'evaluation': 5, 'reasoning': 'The abstract e...",The abstract explicitly states that the study ...,5,{'reasoning': 'The study involves CT scans as ...,The study involves CT scans as it tests the pr...,5,,,
813,Quantifying the incremental value of deep lear...,We present a case study for implementing a mac...,10.1371/journal.pone.0231468,cross-sectional,diagnosis,lung,ct,0,detection,1,1,"{'evaluation': 2, 'reasoning': 'Although the a...",Although the abstract mentions using image dat...,2,{'reasoning': 'The study involves the use of C...,The study involves the use of CT scans for lun...,5,{'reasoning': 'Although the abstract does not ...,Although the abstract does not explicitly ment...,5.0
952,Use of a Dual Artificial Intelligence Platform...,OBJECTIVE: To investigate the performance of D...,10.1097/rct.0000000000001118,cross-sectional,diagnosis,lung,ct,0,classification,1,0,"{'evaluation': 4, 'reasoning': 'The study uses...",The study uses CT scans and does not involve P...,4,{'reasoning': 'The study involves CT scans exc...,The study involves CT scans exclusively and do...,5,,,
696,Mix-and-Interpolate: A Training Strategy to De...,"Till March 31st, 2021, the coronavirus disease...",10.1109/jbhi.2021.3119325,cross-sectional,diagnosis,lung,ct,0,classification,1,0,"{'evaluation': 5, 'reasoning': 'The study focu...",The study focuses on CT scans for COVID-19 dia...,5,{'reasoning': 'The study involves CT scans for...,The study involves CT scans for COVID-19 diagn...,5,,,
315,Convolutional Neural Networks Promising in Lun...,"AIM: To develop an algorithm, based on convolu...",10.1155/2018/1382309,cross-sectional,diagnosis,lung,pet,0,"detection, classification",1,0,"{'evaluation': 1, 'reasoning': 'The study incl...","The study includes PET scans, violating the ex...",1,{'reasoning': 'The study involves both FDG-PET...,"The study involves both FDG-PET and CT scans, ...",1,,,
714,Multicenter analysis and a rapid screening mod...,Early determination of coronavirus disease 201...,10.1097/md.0000000000026279,cross-sectional,diagnosis,lung,xr,0,classification,0,0,"{'evaluation': 4, 'reasoning': 'The abstract m...",The abstract mentions the use of chest CT scan...,4,{'reasoning': 'The study involves imaging chan...,The study involves imaging changes on chest X-...,5,,,
283,Comparison of Deep Learning Approaches for Mul...,The increased availability of labeled X-ray im...,10.1038/s41598-019-42294-8,cross-sectional,diagnosis,lung,xr,0,classification,1,0,"{'evaluation': 1, 'reasoning': 'The study uses...","The study uses chest X-ray images, not CT scan...",1,{'reasoning': 'The study focuses exclusively o...,The study focuses exclusively on chest X-ray c...,1,,,
561,Fused feature signatures to probe tumour radio...,Radiogenomics relationships (RRs) aims to iden...,10.1038/s41598-022-06085-y,cross-sectional,diagnosis,lung,ct,0,classification,0,0,"{'evaluation': 4, 'reasoning': 'The abstract m...",The abstract mentions the use of CT scans and ...,4,{'reasoning': 'The study involves CT scans as ...,The study involves CT scans as one of the moda...,5,,,
256,Classification of early stage non-small cell l...,"Radiomics, which involves the extraction of la...",10.1007/s12194-017-0433-2,cross-sectional,diagnosis,lung,ct,0,classification,1,0,"{'evaluation': 5, 'reasoning': 'The study uses...",The study uses CT scans and does not involve P...,5,{'reasoning': 'The study involves the use of c...,The study involves the use of computed tomogra...,5,,,
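
The round-B columns are empty for rows that were not escalated, so a single score per article has to be derived from whichever columns are filled. The sketch below shows one possible rule, which is an assumption rather than something `lattereview` prescribes: use Agent3's score when it exists, otherwise average the two junior scores. `final_score` is a hypothetical helper, and the rows are toy dictionaries that mimic the evaluation columns above.

```python
import math


def final_score(row):
    """Assumed rule: prefer the senior score, else average the two junior scores."""
    senior = row["round-B_Agent3_evaluation"]
    # Rows that were not escalated hold NaN (or None) in the round-B column.
    if senior is not None and not math.isnan(float(senior)):
        return float(senior)
    junior1 = int(row["round-A_Agent1_evaluation"])
    junior2 = int(row["round-A_Agent2_evaluation"])
    return (junior1 + junior2) / 2


# Toy rows mimicking the evaluation columns of the dataframe
toy_rows = [
    {"round-A_Agent1_evaluation": 4, "round-A_Agent2_evaluation": 5,
     "round-B_Agent3_evaluation": float("nan")},
    {"round-A_Agent1_evaluation": 2, "round-A_Agent2_evaluation": 5,
     "round-B_Agent3_evaluation": 5.0},
    {"round-A_Agent1_evaluation": 1, "round-A_Agent2_evaluation": 1,
     "round-B_Agent3_evaluation": float("nan")},
]

for row in toy_rows:
    print(final_score(row))
```

Because the function only indexes the row by column name, it should also work row-wise on the real dataframe, for example with `updated_df.apply(final_score, axis=1)`.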


Now let's print the responses of all three agents for the cases where Agent3 needed to intervene.

In [13]:
# Keep only the rows that were escalated to Agent3, i.e. where the two junior reviewers (Agent1 and Agent2) disagreed or were both unsure
disagreed_cases = updated_df[
    updated_df["round-B_Agent3_evaluation"].notna()
]

# Print the responses of all agents for these cases
for index, row in disagreed_cases.iterrows():
    print(f"Title: {row['title']}")
    print(f"Abstract: {row['abstract']}")
    print(f"Agent1's response: {row['round-A_Agent1_output']}")
    print(f"Agent2's response: {row['round-A_Agent2_output']}")
    print(f"Agent3's response: {row['round-B_Agent3_output']}")
    print("-" * 100)


Title: Quantifying the incremental value of deep learning: Application to lung nodule detection
Abstract: We present a case study for implementing a machine learning algorithm with an incremental value framework in the domain of lung cancer research. Machine learning methods have often been shown to be competitive with prediction models in some domains; however, implementation of these methods is in early development. Often these methods are only directly compared to existing methods; here we present a framework for assessing the value of a machine learning model by assessing the incremental value. We developed a machine learning model to identify and classify lung nodules and assessed the incremental value added to existing risk prediction models. Multiple external datasets were used for validation. We found that our image model, trained on a dataset from The Cancer Imaging Archive (TCIA), improves upon existing models that are restricted to patient characteristics, but it was inconcl