RoastChip: Automated Paper Reviewer

An automated system for reviewing research papers using large language models (Google Gemini by default, with other models available via OpenRouter), with context from eminent researchers in the field.

Features

  • Download papers from ArXiv or direct URLs
  • Advanced PDF text extraction using multiple methods (PyPDF2, PyMuPDF, pdfplumber); a fallback sketch follows this list
  • Extract sections and structure from scientific papers
  • Find eminent researchers in the paper's field using Semantic Scholar
  • Retrieve context papers from these researchers
  • Integrate with PubMed to find additional relevant papers
  • Generate critical reviews using Google AI (Gemini), focusing on errors, inaccuracies, missing context, and citations
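
The multi-method extraction in the second bullet above might look roughly like the sketch below, which tries pdfplumber first and falls back to PyMuPDF and then PyPDF2. The function name and fallback order are illustrative assumptions, not the repository's actual implementation.

# Illustrative sketch only; the project's real extraction logic may differ.
import fitz  # PyMuPDF
import pdfplumber
from PyPDF2 import PdfReader

def extract_text(pdf_path):
    """Try several extraction backends and return the first non-empty result."""
    # 1. pdfplumber: good layout handling for most scientific PDFs
    try:
        with pdfplumber.open(pdf_path) as pdf:
            text = "\n".join(page.extract_text() or "" for page in pdf.pages)
        if text.strip():
            return text
    except Exception:
        pass
    # 2. PyMuPDF: fast and robust against slightly malformed files
    try:
        doc = fitz.open(pdf_path)
        text = "\n".join(page.get_text() for page in doc)
        doc.close()
        if text.strip():
            return text
    except Exception:
        pass
    # 3. PyPDF2: last-resort fallback
    reader = PdfReader(pdf_path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)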

Installation

  1. Clone this repository:

    git clone https://github.com/yourusername/roastchip.git
    cd roastchip
    
  2. Create a virtual environment and install dependencies:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    pip install -e .
    
  3. Create a .env file with your API keys:

    cp .env.example .env
    

    Then edit the .env file to add your API keys and other configuration:

    # Google Gemini API key (required if not using OpenRouter)
    GOOGLE_AI_API_KEY=your_gemini_api_key_here
    GOOGLE_AI_API_BASE=https://generativelanguage.googleapis.com/v1beta
    GOOGLE_AI_MODEL=gemini-1.5-pro
    
    # OpenRouter API key (required if using OpenRouter)
    OPENROUTER_API_KEY=your_openrouter_api_key_here
    OPENROUTER_API_BASE=https://openrouter.ai/api/v1
    OPENROUTER_MODEL=google/gemini-1.5-pro  # Can also use google/gemini-2.5-pro, openai/gpt-4o, anthropic/claude-3-sonnet, etc.
    
    # Email for PubMed API access (required for PubMed integration)
    PUBMED_EMAIL=your_email@example.com
    
    # Semantic Scholar API key (optional, but recommended for higher rate limits)
    # Get your API key from https://www.semanticscholar.org/product/api
    SEMANTIC_SCHOLAR_API_KEY=your_semantic_scholar_api_key_here
    
    # Configuration
    MAX_PAPERS_PER_RESEARCHER=5
    MAX_RESEARCHERS=3
    DOWNLOAD_DIR=downloads
    REVIEWS_DIR=reviews
    
  4. Getting API keys:

    Google Gemini API key:

    • Go to Google AI Studio
    • Sign in with your Google account
    • Click on "Get API key" in the top right corner
    • Create a new API key or use an existing one
    • Copy the API key and paste it in your .env file as GOOGLE_AI_API_KEY

    OpenRouter API key:

    • Go to OpenRouter
    • Sign in or create an account
    • Navigate to the API Keys section
    • Create a new API key
    • Copy the API key and paste it in your .env file as OPENROUTER_API_KEY

    Semantic Scholar API key:

    • Go to Semantic Scholar API
    • Sign up for an API key
    • Copy the API key and paste it in your .env file as SEMANTIC_SCHOLAR_API_KEY

    Note: The system is configured to use Gemini 1.5 Pro by default, but you can specify other models like Gemini 2.5 Pro, GPT-4o, or Claude 3 Sonnet when using OpenRouter.
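
As a rough illustration of how these keys might be used (not the repository's actual client code), a review request through OpenRouter can be sent with a standard OpenAI-compatible chat-completions call; the Google AI path with GOOGLE_AI_API_KEY works analogously (e.g. via the google-generativeai client). The prompt text below is a placeholder.

# Hedged sketch: uses OPENROUTER_API_KEY / OPENROUTER_API_BASE / OPENROUTER_MODEL from .env.
import os
import requests
from dotenv import load_dotenv

load_dotenv()  # read the .env file created above

response = requests.post(
    f"{os.environ['OPENROUTER_API_BASE']}/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": os.environ.get("OPENROUTER_MODEL", "google/gemini-1.5-pro"),
        "messages": [
            {"role": "system", "content": "You are a critical academic reviewer."},
            {"role": "user", "content": "Review the following paper: ..."},
        ],
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])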

Usage

Review a paper from ArXiv

python -m paper_reviewer.main review 2101.12345 --output review.json

Review a paper from a URL

python -m paper_reviewer.main review-url https://example.com/paper.pdf --field cs.AI --output review.json

Run the test example (Gene Set Summarization paper)

python -m paper_reviewer.main test --output review.json

Process all PDFs in the downloads directory

python -m paper_reviewer.main process-all --field cs.AI

Command-line options

usage: main.py [-h] {review,review-url,test,process-all} ...

Automatically review research papers using Google AI with context from eminent researchers.

positional arguments:
  {review,review-url,test,process-all}  Command to execute
    review                  Review a paper from ArXiv
    review-url              Review a paper from a URL
    test                    Run a test review on a predefined paper
    process-all             Process all PDFs in the downloads directory

options:
  -h, --help           show this help message and exit

Review command options

usage: main.py review [-h] [--use-rules] [--use-scholar] [--max-context-papers MAX_CONTEXT_PAPERS] [--max-researchers MAX_RESEARCHERS] [--no-pubmed] [--output OUTPUT] paper_id

positional arguments:
  paper_id              ArXiv paper ID (e.g., 2101.12345)

options:
  -h, --help            show this help message and exit
  --use-rules           Use reviewer rules from AIreviewer_rules.txt
  --use-scholar         Include Semantic Scholar results in the prompt
  --max-context-papers MAX_CONTEXT_PAPERS
                        Maximum number of context papers per researcher
  --max-researchers MAX_RESEARCHERS
                        Maximum number of researchers to consider
  --no-pubmed           Disable PubMed integration
  --output OUTPUT       Output file path (JSON format)

Review URL command options

usage: main.py review-url [-h] [--use-rules] [--use-scholar] [--paper-id PAPER_ID] [--field FIELD] [--max-context-papers MAX_CONTEXT_PAPERS] [--max-researchers MAX_RESEARCHERS] [--no-pubmed] [--output OUTPUT] pdf_url

positional arguments:
  pdf_url               URL to the PDF file

options:
  -h, --help            show this help message and exit
  --use-rules           Use reviewer rules from AIreviewer_rules.txt
  --use-scholar         Include Semantic Scholar results in the prompt
  --paper-id PAPER_ID   Optional paper ID (will be generated if not provided)
  --field FIELD         Research field of the paper (e.g., cs.AI, physics.optics)
  --max-context-papers MAX_CONTEXT_PAPERS
                        Maximum number of context papers per researcher
  --max-researchers MAX_RESEARCHERS
                        Maximum number of researchers to consider
  --no-pubmed           Disable PubMed integration
  --output OUTPUT       Output file path (JSON format)

Test command options

usage: main.py test [-h] [--use-rules] [--use-scholar] [--output OUTPUT]

options:
  -h, --help       show this help message and exit
  --use-rules      Use reviewer rules from AIreviewer_rules.txt
  --use-scholar    Include Semantic Scholar results in the prompt
  --output OUTPUT  Output file path (JSON format)

Process-all command options

usage: main.py process-all [-h] [--use-rules] [--use-scholar] [--field FIELD] [--max-context-papers MAX_CONTEXT_PAPERS] [--max-researchers MAX_RESEARCHERS] [--no-pubmed]

options:
  -h, --help            show this help message and exit
  --use-rules           Use reviewer rules from AIreviewer_rules.txt
  --use-scholar         Include Semantic Scholar results in the prompt
  --field FIELD         Default research field for papers without ArXiv IDs
  --max-context-papers MAX_CONTEXT_PAPERS
                        Maximum number of context papers per researcher
  --max-researchers MAX_RESEARCHERS
                        Maximum number of researchers to consider
  --no-pubmed           Disable PubMed integration

Examples

Review an ArXiv paper

python -m paper_reviewer.main review 2101.12345

This will:

  1. Download the paper with ID 2101.12345 from ArXiv
  2. Extract text from the PDF using the best available method
  3. Find top researchers in the paper's field using Semantic Scholar
  4. Retrieve context papers from these researchers
  5. Find related papers in PubMed (see the sketch after this list)
  6. Generate a critical review using Google AI (Gemini)
  7. Print the review to the console
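
Step 5 above (the PubMed lookup) is the kind of query that Biopython's Entrez module handles, using the PUBMED_EMAIL value from the .env file. The search term below is only an example; the repository's actual query construction may differ.

# Illustrative PubMed lookup; not the repository's exact implementation.
import os
from Bio import Entrez

Entrez.email = os.environ["PUBMED_EMAIL"]  # NCBI requires an email for E-utilities access

# Search PubMed for papers related to the reviewed paper's topic
handle = Entrez.esearch(db="pubmed", term="gene set summarization large language models", retmax=5)
pmids = Entrez.read(handle)["IdList"]
handle.close()

# Fetch abstracts for the matching PubMed IDs
handle = Entrez.efetch(db="pubmed", id=",".join(pmids), rettype="abstract", retmode="text")
print(handle.read())
handle.close()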

Run the test example

python -m paper_reviewer.main test

This will run a review on the paper "Gene Set Summarization using Large Language Models" (ArXiv ID: 2305.13338).

Using Reviewer Rules and Semantic Scholar

python -m paper_reviewer.main test --use-rules --use-scholar

When using the --use-rules option:

  1. The system will include the reviewer rules from AIreviewer_rules.txt in the prompt to the AI model
  2. The output filenames will include _rulesyes_ or _rulesno_ to indicate whether rules were used
  3. The review structure and content will follow the guidelines specified in the rules file
  4. The full prompt will be saved to a file with the same prefix as the review file but with an additional _prompt suffix
  5. The prompt structure will also be saved as a JSON file with the same prefix as the review file but with an additional _prompt.json suffix, following the LinkML schema defined in src/paper_reviewer/models/prompt_schema.linkml.yaml

When using the --use-scholar option:

  1. The system will include Semantic Scholar results in the prompt to the AI model
  2. The output filenames will include _scholaryes_ or _scholarno_ to indicate whether Semantic Scholar was used
  3. The review will include insights from related papers by eminent researchers in the field

You can use both options together to get the benefits of both structured reviews and context from related papers.
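
For context, the Semantic Scholar results come from the public Graph API; a minimal sketch of such a lookup is shown below. The query string, field list, and result handling are illustrative assumptions rather than the repository's own code.

# Hedged sketch: one way context papers could be fetched from Semantic Scholar.
import os
import requests

headers = {}
if os.environ.get("SEMANTIC_SCHOLAR_API_KEY"):  # optional, raises the rate limit
    headers["x-api-key"] = os.environ["SEMANTIC_SCHOLAR_API_KEY"]

resp = requests.get(
    "https://api.semanticscholar.org/graph/v1/paper/search",
    params={
        "query": "gene set summarization large language models",
        "fields": "title,abstract,authors,citationCount",
        "limit": 5,
    },
    headers=headers,
    timeout=30,
)
resp.raise_for_status()
for paper in resp.json().get("data", []):
    print(paper["title"], "-", paper["citationCount"], "citations")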

Processing All PDFs in the Downloads Directory

python -m paper_reviewer.main process-all --use-rules --use-scholar

This will:

  1. Find all PDF files in the downloads directory
  2. Process each PDF file to extract text
  3. Generate reviews for each PDF using Google AI (Gemini)
  4. Save the reviews and prompts to the reviews directory

The system will automatically detect ArXiv papers based on their filenames and use appropriate field settings for them. For non-ArXiv papers, it will use the field specified with the --field option (default: cs.AI).
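
The ArXiv detection mentioned above presumably keys off the modern ArXiv ID pattern (e.g. 2101.12345); the regular expression below is an illustrative guess at such a check, not the repository's exact rule.

# Illustrative ArXiv-ID detection from a PDF filename.
import re

ARXIV_ID_RE = re.compile(r"(\d{4}\.\d{4,5})(v\d+)?")  # e.g. 2101.12345 or 2305.13338v2

def detect_arxiv_id(filename):
    """Return the ArXiv ID embedded in a filename, or None for non-ArXiv papers."""
    match = ARXIV_ID_RE.search(filename)
    return match.group(1) if match else None

print(detect_arxiv_id("2305.13338v2.pdf"))      # -> 2305.13338
print(detect_arxiv_id("ISMEJ-D-23-00112.pdf"))  # -> None (falls back to --field, default cs.AI)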

Tests

The project includes test scripts in the src/tests directory that use the already extracted text from the raw_text directory, avoiding the need to extract text from PDFs.

Running Tests

Process all raw text files

# Process all raw text files with reviewer rules
python -m tests.run_tests raw-text --use-rules

# Process all raw text files with Semantic Scholar context
python -m tests.run_tests raw-text --use-scholar

# Process all raw text files with both rules and Semantic Scholar
python -m tests.run_tests raw-text --use-rules --use-scholar

Process a specific raw text file

# Process a specific file with reviewer rules
python -m tests.run_tests raw-text --file raw_text/ISMEJ-D-23-00112.txt --use-rules

# Process a specific file with a custom output directory
python -m tests.run_tests raw-text --file raw_text/ISMEJ-D-23-00112.txt --output-dir custom_reviews

Generate a review for a specific file

# Generate a review with reviewer rules
python -m tests.test_single_file raw_text/ISMEJ-D-23-00112.txt --use-rules

# Generate a review with Semantic Scholar context
python -m tests.test_single_file raw_text/ISMEJ-D-23-00112.txt --use-scholar

# Generate a review with a specific model
python -m tests.test_single_file raw_text/ISMEJ-D-23-00112.txt --use-rules --model "openai/gpt-4o"

Test with multiple LLMs using the new prompt schema

# Run pytest tests (mocked API calls)
python -m pytest src/tests/test_all_llms_with_schema.py

# Run pytest tests with verbose output
python -m pytest -v src/tests/test_all_llms_with_schema.py

# Run actual tests with real API calls for all models
python -m src.tests.test_all_llms_with_schema

# Run actual tests with Gemini and reviewer rules
python -m src.tests.test_all_llms_with_schema --model gemini --use-rules

# Run actual tests with Claude
python -m src.tests.test_all_llms_with_schema --model claude

# Run actual tests with GPT-4o and a specific file
python -m src.tests.test_all_llms_with_schema --model gpt4o --file raw_text/ISMEJ-D-23-00112.txt

The tests with real API calls will generate the following files in the test_reviews directory:

  • Text review files
  • JSON review files
  • Text prompt files
  • JSON prompt files (following the LinkML schema)

See the tests README for more information.

Evaluation

The project includes an evaluation module in the src/evaluation directory that can be used to compare reviews generated with and without rules.

Running Evaluation

Compare specific reviews

python -m evaluation.compare_specific_reviews --rules-yes <path_to_rules_yes_review> --rules-no <path_to_rules_no_review>

Compare all reviews for a specific timestamp

python -m evaluation.compare_all_reviews --reviews-dir reviews --output-dir evaluation_results --timestamp <timestamp>

Manual comparison

python -m evaluation.manual_compare --rules-yes <path_to_rules_yes_review> --rules-no <path_to_rules_no_review>

Evaluation Metrics

The evaluation module compares reviews based on several metrics (a rough computation sketch follows the list):

  • Basic Metrics: Sentence count, word count, average sentence length
  • Structure Metrics: Section count, bullet point count
  • Complexity Metrics: Lexical diversity, content word ratio
  • Unique Content Analysis: Identifies unique sentences in each review
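
A rough sketch of how the basic and complexity metrics above might be computed is shown below; the tokenization, sentence splitting, and stopword list are simplifications, and the evaluation module's actual implementation may differ.

# Illustrative metric computation; not the evaluation module's exact code.
import re

def review_metrics(review_text):
    sentences = [s for s in re.split(r"[.!?]+\s+", review_text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", review_text.lower())
    stopwords = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "that", "for"}  # tiny example set
    return {
        "sentence_count": len(sentences),
        "word_count": len(words),
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "lexical_diversity": len(set(words)) / max(len(words), 1),  # unique words / total words
        "content_word_ratio": sum(w not in stopwords for w in words) / max(len(words), 1),
    }

print(review_metrics("The methods are sound. However, the discussion omits key prior work."))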

Evaluation Output

The evaluation results are saved in the following formats:

  • JSON Files: Detailed comparison results for each paper
  • Summary JSON: Overall results across all papers
  • HTML Report: User-friendly visualization of the evaluation results

Prompt Structure

The system uses a structured prompt format defined by a LinkML schema. The prompt is saved in both text and JSON formats:

  • Text Format: Human-readable prompt with clear section delimiters
  • JSON Format: Structured representation of the prompt following the LinkML schema

Example of the prompt JSON structure:

{
  "introduction": {
    "section_type": "INTRODUCTION",
    "section_title": "REVIEWER ROLE",
    "section_content": "You are a critical academic reviewer with expertise in analyzing research papers.",
    "section_delimiter_start": "===",
    "section_delimiter_end": "==="
  },
  "reviewer_rules": {
    "section_type": "REVIEWER_RULES",
    "section_title": "REVIEWER RULES",
    "section_content": "1. Ethics & Integrity\n...\n\n2. Review Structure\n...\n\n3. Content‑Level Expectations\n...",
    "section_delimiter_start": "===",
    "section_delimiter_end": "==="
  },
  "task_description": {
    "section_type": "TASK_DESCRIPTION",
    "section_title": "REVIEW TASK",
    "section_content": "Your task is to provide a thorough, critical review of the following paper...",
    "section_delimiter_start": "===",
    "section_delimiter_end": "==="
  },
  "paper_content": {
    "section_type": "PAPER_CONTENT",
    "section_title": "PAPER TO REVIEW",
    "section_content": "...[paper content]...",
    "section_delimiter_start": "===",
    "section_delimiter_end": "==="
  },
  "context_papers": {
    "section_type": "CONTEXT_PAPERS",
    "section_title": "CONTEXT PAPERS",
    "section_content": "To help with your review, here are relevant papers from eminent researchers...",
    "section_delimiter_start": "===",
    "section_delimiter_end": "==="
  },
  "review_instructions": {
    "section_type": "REVIEW_INSTRUCTIONS",
    "section_title": "REVIEW INSTRUCTIONS",
    "section_content": "Please provide a comprehensive review that:\n1. Identifies any factual errors...",
    "section_delimiter_start": "===",
    "section_delimiter_end": "==="
  },
  "metadata": {
    "model_name": "openai/gpt-4o",
    "use_rules": true,
    "context_papers_count": 5,
    "semantic_scholar_papers_count": 3,
    "pubmed_papers_count": 2
  }
}
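
Given a saved _prompt.json file of this shape, the full text prompt can be reassembled from the section titles, delimiters, and contents. The sketch below is an illustration of the structure only; the file path is hypothetical and the exact delimiter layout used by the repository's serializer may differ.

# Illustrative sketch: rebuild the text prompt from a saved *_prompt.json file.
import json

with open("reviews/example_prompt.json") as f:  # hypothetical path
    prompt = json.load(f)

SECTION_ORDER = ["introduction", "reviewer_rules", "task_description",
                 "paper_content", "context_papers", "review_instructions"]

parts = []
for key in SECTION_ORDER:
    section = prompt.get(key)
    if not section:  # e.g. reviewer_rules is absent when --use-rules is off
        continue
    parts.append(f"{section['section_delimiter_start']} {section['section_title']} "
                 f"{section['section_delimiter_end']}\n{section['section_content']}")

full_prompt = "\n\n".join(parts)
print(full_prompt[:500])
print("Model:", prompt["metadata"]["model_name"])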

Pipeline Runner

The project includes a unified pipeline runner (src.pipeline_runner) that provides a consistent interface to run different stages of the paper review pipeline. This tool enforces consistent output directories and simplifies the execution of the entire workflow.

Pipeline Stages

The pipeline runner supports the following stages:

  1. PDF to Text Extraction: Extract text from PDF files
  2. Text to Reviews Generation: Generate reviews from raw text with various parameter combinations
  3. Reviews to Evaluation: Compare and evaluate generated reviews
  4. Evaluation to Visualization: Create visual reports from evaluation data

Directory Structure

The pipeline runner enforces a consistent directory structure:

  • pdf/: PDF files
  • raw_text/: Extracted text from PDFs
  • reviews/: Generated reviews
  • evaluation/: Evaluation results
  • evaluation_viz/: Visualization of evaluation results

Usage

Extract Text from PDFs

python -m src.pipeline_runner extract [options]

Options:

  • --pdf-dir: Directory containing PDF files (default: "pdf")
  • --output-dir: Directory to save extracted text (default: "raw_text")
  • --limit: Maximum number of PDFs to process

Generate Reviews

python -m src.pipeline_runner review [options]

Options:

  • --raw-text-dir: Directory containing raw text files (default: "raw_text")
  • --reviews-dir: Directory to save reviews (default: "reviews")
  • --use-rules: Use reviewer rules
  • --use-scholar: Include Semantic Scholar results
  • --use-pubmed: Include PubMed results
  • --timestamp: Timestamp to use in filenames
  • --delay: Delay in seconds between API calls (default: 30)
  • --model: LLM model name to use (e.g., 'gemini-1.5-pro', 'google/gemini-2.5-pro', 'openai/gpt-4o', 'anthropic/claude-3-sonnet'). Use 'all' to run all three main models.
  • --use-openrouter: Use OpenRouter API instead of Google AI

Generate All Review Combinations

python -m src.pipeline_runner all-reviews [options]

This command generates reviews with all combinations of parameters (with/without reviewer rules, crossed with with/without Semantic Scholar and PubMed context) using a consistent timestamp for easy comparison.

Options:

  • --raw-text-dir: Directory containing raw text files (default: "raw_text")
  • --reviews-dir: Directory to save reviews (default: "reviews")
  • --timestamp: Timestamp to use in filenames
  • --delay: Delay in seconds between API calls (default: 60)
  • --model: LLM model name to use. Use 'all' to run all three main models (Gemini 2.5 Pro, GPT-4o, and Claude 3 Sonnet)

Evaluate Reviews

python -m src.pipeline_runner evaluate [options]

Options:

  • --reviews-dir: Directory containing review files (default: "reviews")
  • --output-dir: Directory to save evaluation results (default: "evaluation")
  • --timestamp: Timestamp to filter reviews (required)

Visualize Evaluation Results

python -m src.pipeline_runner visualize [options]

Options:

  • --summary-file: Path to the evaluation summary JSON file (required)
  • --output-dir: Directory to save visualization results

Run Full Pipeline

python -m src.pipeline_runner full [options]

This command runs the entire pipeline from PDF extraction to visualization in one go.

Options:

  • --pdf-dir: Directory containing PDF files (default: "pdf")
  • --raw-text-dir: Directory to save extracted text (default: "raw_text")
  • --reviews-dir: Directory to save reviews (default: "reviews")
  • --evaluation-dir: Directory to save evaluation results (default: "evaluation")
  • --evaluation-viz-dir: Directory to save visualization results (default: "evaluation_viz")
  • --limit: Maximum number of PDFs to process
  • --delay: Delay in seconds between API calls (default: 60)
  • --model: LLM model name to use. Use 'all' to run all three main models (Gemini 2.5 Pro, GPT-4o, and Claude 3 Sonnet)

Examples

Extract Text from PDFs

# Extract text from all PDFs in the pdf/ directory
python -m src.pipeline_runner extract --pdf-dir pdf --output-dir raw_text

# Extract text from only the first 3 PDFs
python -m src.pipeline_runner extract --pdf-dir pdf --output-dir raw_text --limit 3

# Extract text from a specific PDF
cp path/to/your/paper.pdf pdf/
python -m src.pipeline_runner extract --pdf-dir pdf --output-dir raw_text

Generate Reviews with Different Parameters

# Generate reviews with rules and scholar for all papers in raw_text/
python -m src.pipeline_runner review --use-rules --use-scholar --use-pubmed

# Generate reviews with a specific model (Gemini 2.5 Pro)
python -m src.pipeline_runner review --use-rules --model "google/gemini-2.5-pro" --use-openrouter

# Generate reviews with OpenAI GPT-4o and include context from Semantic Scholar
python -m src.pipeline_runner review --use-rules --use-scholar --model "openai/gpt-4o" --use-openrouter

# Generate reviews with Claude 3 Sonnet with a longer delay between API calls
python -m src.pipeline_runner review --model "anthropic/claude-3-sonnet" --use-openrouter --delay 60

# Generate reviews with all three main models (Gemini 2.5 Pro, GPT-4o, and Claude 3 Sonnet)
python -m src.pipeline_runner review --use-rules --model all --use-openrouter --delay 60

# Generate reviews with a specific timestamp (useful for grouping related reviews)
python -m src.pipeline_runner review --use-rules --model "openai/gpt-4o" --use-openrouter --timestamp "20250504_120000"

This will generate the following files for each paper in the raw_text/ directory (the naming pattern is sketched after the list):

  1. A text review file (e.g., ISMEJ-D-23-00112_review_20250503_204228_rulesyes_scholarno_pubmedno_openai_gpt_4o.txt)
  2. A JSON review file (e.g., ISMEJ-D-23-00112_review_20250503_204228_rulesyes_scholarno_pubmedno_openai_gpt_4o.json)
  3. A text prompt file (e.g., ISMEJ-D-23-00112_review_20250503_204228_rulesyes_scholarno_pubmedno_openai_gpt_4o_prompt.txt)
  4. A JSON prompt file (e.g., ISMEJ-D-23-00112_review_20250503_204228_rulesyes_scholarno_pubmedno_openai_gpt_4o_prompt.json)
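
The naming pattern in these examples can be reproduced with a simple format string; the helper below is illustrative only (note how the model name openai/gpt-4o is slugified to openai_gpt_4o, matching the filenames above).

# Illustrative reconstruction of the review filename pattern shown above.
import re

def review_filename(paper_id, timestamp, use_rules, use_scholar, use_pubmed, model, ext="txt"):
    model_slug = re.sub(r"[^A-Za-z0-9]+", "_", model)  # openai/gpt-4o -> openai_gpt_4o
    return (f"{paper_id}_review_{timestamp}"
            f"_rules{'yes' if use_rules else 'no'}"
            f"_scholar{'yes' if use_scholar else 'no'}"
            f"_pubmed{'yes' if use_pubmed else 'no'}"
            f"_{model_slug}.{ext}")

print(review_filename("ISMEJ-D-23-00112", "20250503_204228", True, False, False, "openai/gpt-4o"))
# -> ISMEJ-D-23-00112_review_20250503_204228_rulesyes_scholarno_pubmedno_openai_gpt_4o.txt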

The JSON prompt file follows the LinkML schema defined in src/paper_reviewer/models/prompt_schema.linkml.yaml and contains the structured prompt sections; see the example JSON structure in the Prompt Structure section above.

Generate All Review Combinations

# Generate all combinations with default settings
python -m src.pipeline_runner all-reviews --delay 60

# Generate all combinations with a specific model
python -m src.pipeline_runner all-reviews --model "openai/gpt-4o" --delay 60

# Generate all combinations with all three main models (Gemini 2.5 Pro, GPT-4o, and Claude 3 Sonnet)
python -m src.pipeline_runner all-reviews --model all --delay 60

# Generate all combinations with a specific timestamp
python -m src.pipeline_runner all-reviews --timestamp "20250504_120000" --delay 60

This will generate reviews with the following combinations of parameters (Semantic Scholar and PubMed are toggled together):

  1. Without rules, without scholar, without pubmed
  2. With rules, without scholar, without pubmed
  3. Without rules, with scholar, with pubmed
  4. With rules, with scholar, with pubmed

All reviews will have the same timestamp for easy comparison. For each combination, the following files will be generated:

  • Text review files
  • JSON review files
  • Text prompt files
  • JSON prompt files (following the LinkML schema)

Evaluate Reviews

# Evaluate reviews with a specific timestamp
python -m src.pipeline_runner evaluate --timestamp 20250504_120000

# Evaluate reviews with custom directories
python -m src.pipeline_runner evaluate --reviews-dir custom_reviews --output-dir custom_evaluation --timestamp 20250504_120000

This will generate evaluation files in the evaluation/ directory:

  • Individual evaluation JSON files for each paper
  • A summary JSON file with overall metrics
  • Comparison data between different review configurations

Visualize Evaluation Results

# Visualize evaluation results from a summary file
python -m src.pipeline_runner visualize --summary-file evaluation/evaluation_summary_20250504_120000.json

# Visualize with a custom output directory
python -m src.pipeline_runner visualize --summary-file evaluation/evaluation_summary_20250504_120000.json --output-dir custom_viz

This will generate HTML visualization files in the evaluation_viz/ directory with:

  • Bar plots comparing metrics across different models and configurations
  • Detailed comparisons of review content
  • Visualizations of unique and common content between reviews

Run Full Pipeline

# Run the full pipeline with default settings
python -m src.pipeline_runner full --limit 5

# Run the full pipeline with a specific model
python -m src.pipeline_runner full --model "openai/gpt-4o" --limit 3

# Run the full pipeline with all three main models (Gemini 2.5 Pro, GPT-4o, and Claude 3 Sonnet)
python -m src.pipeline_runner full --model all --limit 3 --delay 60

# Run the full pipeline with custom directories
python -m src.pipeline_runner full --pdf-dir custom_pdf --raw-text-dir custom_text --reviews-dir custom_reviews

This will:

  1. Extract text from PDFs (limited by the --limit parameter if provided)
  2. Generate all review combinations
  3. Evaluate the reviews
  4. Create visualization reports

All outputs will use consistent timestamps throughout the pipeline for easy tracking.

Running the Pipeline with Multiple Models

The project includes a script to run the full pipeline with multiple models in sequence. This is useful for comparing the performance of different LLM models on the same set of papers.

# Run the pipeline with all three models (Claude, GPT-4o, and Gemini)
python run_pipeline_with_models.py

# Run the pipeline with a specific set of models
python run_pipeline_with_models.py --models "anthropic/claude-3.7-sonnet:thinking" "openai/gpt-4o"

# Run the pipeline with custom directories
python run_pipeline_with_models.py --raw-text-dir custom_text --reviews-dir custom_reviews

This script will:

  1. Run the pipeline for each model with all parameter combinations (with/without rules, with/without scholar & pubmed)
  2. Use a consistent timestamp across all models for easy comparison
  3. Generate evaluation and visualization reports comparing the performance of different models
  4. Use a shorter delay (1 second) between OpenRouter API calls to speed up processing

Note: Before running this script, make sure you have installed the package in development mode with pip install -e . to ensure all dependencies are available.
