# üß™ SLR Abstract Screening Experiment
#### Experiment Information
- **ID**: 005
- **Date**: 08/13
#### üéØ Goal
- Test all configurations on a smaller portion of the dataset to identify problems and patterns
#### ‚öôÔ∏è Configuration
- **LLM** : GPT-4o
- **Data**: LB
- **Examples** : 1
- **Output**: Yes/Maybe/No
#### üìù Notes
- 


## üîß Setup and Configuration

In [1]:
# Essential imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
import json
from pathlib import Path
import os
from dotenv import load_dotenv
from openai import OpenAI

# Configure pandas display options
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)
pd.set_option('display.width', 1000)
pd.set_option('display.float_format', lambda x: '%.3f' % x)

# Set plotting style
sns.set_theme()  # This is the correct way to set seaborn style
plt.rcParams['figure.figsize'] = (12, 8)

In [2]:
# Data Import 

# Define the data paths for both datasets
DATA_PATH_1 = "../data/SSOT_manual_LB_20250808_120908.csv" # ‚¨ÖÔ∏è Change this path if needed
DATA_PATH_2 = "../data/SSOT_manual_BM_20250813_132621.csv" # ‚¨ÖÔ∏è Change this path if needed

# Load the first dataset (df1)
try:
    df_LB = pd.read_csv(DATA_PATH_1)
    print(f"‚úì First dataset loaded successfully")
    print(f"‚úì Shape of dataset 1: {df_LB.shape}")
except FileNotFoundError:
    print("‚ùå Error: The file LB dataset was not found in the data directory")
except Exception as e:
    print(f"‚ùå Error loading the first dataset: {str(e)}")

# Load the second dataset (df2)
try:
    df_BM = pd.read_csv(DATA_PATH_2)
    print(f"\n‚úì Second dataset loaded successfully")
    print(f"‚úì Shape of dataset 2: {df_BM.shape}")
except FileNotFoundError:
    print("‚ùå Error: The file df_BM was not found in the data directory")
except Exception as e:
    print(f"‚ùå Error loading the second dataset: {str(e)}")

# Display basic information about both datasets
print("\nFirst few rows of dataset 1:\n")
display(df_LB.head())

print("\nFirst few rows of dataset 2:\n")
display(df_BM.head())

‚úì First dataset loaded successfully
‚úì Shape of dataset 1: (3944, 15)

‚úì Second dataset loaded successfully
‚úì Shape of dataset 2: (917, 13)

First few rows of dataset 1:



Unnamed: 0,ID,abstract,acmid,author,doi,outlet,title_full,url,year,qualtrics_id,wos_id,ebsco_id,stage_1,stage_2,stage_3
0,Bindu2018503,Online social networks have become immensely p...,,"Bindu, P V and Mishra, R and Thilagam, P S",10.1007/s10844-017-0494-z,Journal of Intelligent Information Systems,{Discovering spammer communities in TWITTER},https://www.scopus.com/inward/record.uri?eid=2...,2018,12,,,True,False,False
1,Moraga2018470,This article explores the ways Latinos‚Äîas audi...,,"Moraga, J E",10.1177/0193723518797030,Journal of Sport and Social Issues,"{On ESPN Deportes: Latinos, Sport MEDIA, and t...",https://www.scopus.com/inward/record.uri?eid=2...,2018,22,,,True,False,False
2,Lanosga20181676,This study of American investigative reporting...,,"Lanosga, G and Martin, J",10.1177/1464884916683555,JOURNALISm,"{JOURNALISts, sources, and policy outcomes: In...",https://www.scopus.com/inward/record.uri?eid=2...,2018,47,,,True,False,True
3,Warner2018720,"In this study, we test the indirect and condit...",,"Warner, B R and Jennings, F J and Bramlett, J ...",10.1080/15205436.2018.1472283,Mass Communication and Society,{A MultiMEDIA Analysis of Persuasion in the 20...,https://www.scopus.com/inward/record.uri?eid=2...,2018,50,,,True,False,False
4,Burrows20181117,Professional communicators produce a diverse r...,,"Burrows, E",10.1177/0163443718764807,"MEDIA, Culture and Society",{Indigenous MEDIA producers' perspectives on o...,https://www.scopus.com/inward/record.uri?eid=2...,2018,56,,,True,False,False



First few rows of dataset 2:



Unnamed: 0,(internal) id,(source) id,abstract,title_full,journal,authors,tags,consensus,labeled_at...9,code,stage_1,stage_2,stage_3
0,33937314,175,There is a worry that serious forms of politic...,Is Context the Key? The (Non-)Differential Eff...,Polit. Commun.,,,o,,-1,True,False,False
1,33937315,113,The electoral model of democracy holds the ide...,POLITICAL NEWS IN ONLINE AND PRINT NEWSPAPERS ...,Digit. Journal.,,,o,,-1,True,False,False
2,33937316,122,Machine learning is a field at the intersectio...,Machine Learning for Sociology,Annu. Rev. Sociol.,,,o,,-1,True,False,False
3,33937317,467,Research on digital glocalization has found th...,Improving Health in Low-Income Communities Wit...,J. Commun.,,,o,,-1,True,False,False
4,33937318,10,Political scientists often wish to classify do...,Using Word Order in Political Text Classificat...,Polit. Anal.,,,o,,-1,True,False,False


## üß´ Define Experiment Parameters

In [11]:
from datetime import datetime

# Experiment Metadata
EXPERIMENT_ID = "005"  # ‚¨ÖÔ∏è Change this for each new experiment
EXPERIMENT_DATE = "2025-08-13"  # ‚¨ÖÔ∏è Update the date
EXPERIMENT_CATEGORY = "Testing"  # ‚¨ÖÔ∏è Category of experiment
EXPERIMENT_GOAL = "Test Set Up"  # ‚¨ÖÔ∏è What are you testing?

# Model Configuration
MODEL_NAME = "gpt-4o"
TEMPERATURE = 0.0
MAX_TOKENS = 4000

# Print experiment info
print("üß™ EXPERIMENT SETUP")
print("=" * 50)
print(f"ID: {EXPERIMENT_ID}")
print(f"Date: {EXPERIMENT_DATE}")
print(f"Category: {EXPERIMENT_CATEGORY}")
print(f"üéØGoal: {EXPERIMENT_GOAL}")
print(f"Model: {MODEL_NAME} (temp={TEMPERATURE})")
print("=" * 50)
print("‚úÖ Experiment configuration loaded")

üß™ EXPERIMENT SETUP
ID: 005
Date: 2025-08-13
Category: Testing
üéØGoal: Test Set Up
Model: gpt-4o (temp=0.0)
‚úÖ Experiment configuration loaded


## üì£ Set up Basic API Call

In [4]:
import os
import json
from openai import OpenAI
from dotenv import load_dotenv
from datetime import datetime

# Load environment variables
load_dotenv()

# Get the API key from environment variables
api_key = os.getenv("OPENAI_API_KEY")

# Validate API key
if not api_key:
    print("‚ö†Ô∏è  Error: OPENAI_API_KEY not found.")
    print("Please make sure you have a .env file with OPENAI_API_KEY='sk-...'")
else:
    print("‚úÖ OpenAI API Key loaded successfully.")
    client = OpenAI(api_key=api_key)
    print("‚úÖ OpenAI client initialized.")

# Enhanced analysis function for abstract screening
def screen_abstract_llm(abstract_text, system_prompt, user_prompt_template, 
                       model="gpt-4o", temperature=0.0):
    """
    Screen an abstract using LLM with system and user prompts.
    
    Args:
        abstract_text (str): The abstract to analyze
        system_prompt (str): The system prompt defining the role
        user_prompt_template (str): Template with {abstract} placeholder
        model (str): The OpenAI model to use
        temperature (float): Temperature setting for response randomness
    
    Returns:
        dict: Result with decision, reasoning, and metadata
    """
    if 'client' not in globals():
        return {"error": "OpenAI client is not initialized. Please check your API key."}

    try:
        # Insert abstract into user prompt template
        user_prompt = user_prompt_template.format(abstract=abstract_text)
        
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt}
            ],
            temperature=temperature,
            max_tokens=4000
        )
        
        if response and response.choices:
            result = {
                "decision": "INCLUDE" if "INCLUDE" in response.choices[0].message.content.upper() else "EXCLUDE",
                "reasoning": response.choices[0].message.content,
                "model": model,
                "temperature": temperature,
                "timestamp": datetime.now().isoformat(),
                "error": None
            }
            return result
        else:
            return {"error": "API Error: Empty or invalid response."}
            
    except Exception as e:
        return {"error": f"API Error: {e}"}

print("‚úÖ Enhanced screening function defined.")

‚úÖ OpenAI API Key loaded successfully.
‚úÖ OpenAI client initialized.
‚úÖ Enhanced screening function defined.


## üèõÔ∏è Set Up System Prompt 

In [5]:
# System prompt configuration
# System prompt configuration
SYSTEM_PROMPT_ID = "SYS_001"  # ‚¨ÖÔ∏è Change this ID for different system prompts
SYSTEM_PROMPT_DESCRIPTION = "Generic expert literature review screener for systematic reviews"

# Define the system prompt that sets the LLM's role
SYSTEM_PROMPT = """You are an expert in scientific literature review and systematic review methodology.

Your task is to screen research abstracts and decide whether they should be INCLUDED or EXCLUDED from a systematic literature review based on provided criteria.

INSTRUCTIONS:
1. Carefully read the provided inclusion/exclusion criteria
2. Review any example abstracts to understand the decision-making pattern
3. Apply the criteria systematically to the given abstract and title
4. Provide your decision in the exact format requested
5. Base your reasoning strictly on the provided criteria

Be consistent, objective, and systematic in your evaluation. Do not make up additional criteria beyond what is provided. Focus only on what is explicitly stated in the instructions."""

print(f"‚úÖ System prompt defined")
print(f"üìã ID: {SYSTEM_PROMPT_ID}")
print(f"üìè Length: {len(SYSTEM_PROMPT)} characters")
print(f"üìÑ Description: {SYSTEM_PROMPT_DESCRIPTION}")

‚úÖ System prompt defined
üìã ID: SYS_001
üìè Length: 759 characters
üìÑ Description: Generic expert literature review screener for systematic reviews


## üë©üèª‚Äç‚öïÔ∏è Create User Prompt


In [6]:
# User prompt configuration
USER_PROMPT_ID = "USR_004"  # ‚¨ÖÔ∏è Change this ID for different user prompts
USER_PROMPT_DESCRIPTION = "Basic screening with criteria and example from CSV files and maybe option"

# File paths for modular components
CRITERIA_FILE = "../prompts/Criteria_LB_01.csv"  # ‚¨ÖÔ∏è Change criteria file here
EXAMPLES_FILE = "../prompts/exmpl_single_LB_01.csv"  # ‚¨ÖÔ∏è Change examples file here (or set to None)

# Output configuration
OUTPUT_FORMAT = "Yes/Maybe/No"  # ‚¨ÖÔ∏è Options: "Binary", "Yes/Maybe/No", "Likert"
DECISION_OPTIONS = ["INCLUDE", "EXCLUDE", "MAYBE"] # ‚¨ÖÔ∏è Change according to the output format

# Additional metadata for results tracking
DOMAIN = "political_communication" # ‚¨ÖÔ∏è Change this to the domain of the study
TOPIC = "media_diversity"  # ‚¨ÖÔ∏è Change this to the topic of the study
DATASET_SOURCE = "LB"  # ‚¨ÖÔ∏è Which dataset (BM/LB)

# Define the user prompt template with placeholders
USER_PROMPT_TEMPLATE = """## SCREENING TASK:
You are conducting the first screening stage for a systematic literature review on {topic} in {domain}. Your task is to evaluate whether each abstract should be included, excluded, or requires further review based on the full text.

## INCLUSION/EXCLUSION CRITERIA:
{criteria_text}

{examples_section}

## DECISION GUIDELINES:

**INCLUDE**: Choose this when the abstract clearly meets the inclusion criteria and does NOT meet any exclusion criteria. The abstract provides sufficient information to confidently determine relevance.

**EXCLUDE**: Choose this when the abstract clearly violates one or more exclusion criteria OR clearly fails to meet the inclusion criteria. The abstract provides sufficient information to confidently determine irrelevance.

**MAYBE**: Choose this when the abstract is potentially relevant but lacks sufficient detail to make a confident decision. Use this option when:
- The abstract mentions relevant concepts but lacks specific details about methodology, context, or scope
- Key information needed to apply the criteria is missing or ambiguous
- The study appears to be in the right domain but the connection to your research question is unclear
- You would need to see the full text to properly evaluate against the criteria

## ABSTRACT TO SCREEN:
**Title:** {title}
**Abstract:** {abstract}

## YOUR DECISION:
Provide your decision as one of: **INCLUDE**, **MAYBE**, or **EXCLUDE**

**Decision:** [Choose exactly one: INCLUDE, MAYBE, or EXCLUDE]

**Reasoning:** [Explain your decision. For MAYBE decisions, specifically describe what additional information from the full text would be needed to make a final determination.]"""

print(f"‚úÖ User prompt configuration and template loaded")
print(f"üìã ID: {USER_PROMPT_ID}")
print(f"üìÑ Description: {USER_PROMPT_DESCRIPTION}")
print(f"üìÅ Criteria: {CRITERIA_FILE}")
print(f"üìÅ Examples: {EXAMPLES_FILE}")
print(f"üéØ Output: {OUTPUT_FORMAT}")
print(f"üî¨ Topic: {TOPIC} | Domain: {DOMAIN} | Source: {DATASET_SOURCE}")
print(f"üìè Template length: {len(USER_PROMPT_TEMPLATE)} characters")

‚úÖ User prompt configuration and template loaded
üìã ID: USR_004
üìÑ Description: Basic screening with criteria and example from CSV files and maybe option
üìÅ Criteria: ../prompts/Criteria_LB_01.csv
üìÅ Examples: ../prompts/exmpl_single_LB_01.csv
üéØ Output: Yes/Maybe/No
üî¨ Topic: media_diversity | Domain: political_communication | Source: LB
üìè Template length: 1679 characters


## ‚úÖ Valdiation Check

In [7]:
def validate_experiment_setup(df, dataset_source="LB"):
    """
    Validate that all required variables and data are available for the experiment.
    
    Args:
        df: DataFrame to be used in experiment
        dataset_source: Dataset identifier
    
    Returns:
        bool: True if all validations pass, False otherwise
    """
    
    print("üîç VALIDATION CHECK")
    print("=" * 50)
    
    validation_passed = True
    
    # Check required global variables
    required_vars = {
        'EXPERIMENT_ID': globals().get('EXPERIMENT_ID'),
        'SYSTEM_PROMPT_ID': globals().get('SYSTEM_PROMPT_ID'), 
        'USER_PROMPT_ID': globals().get('USER_PROMPT_ID'),
        'SYSTEM_PROMPT': globals().get('SYSTEM_PROMPT'),
        'USER_PROMPT_TEMPLATE': globals().get('USER_PROMPT_TEMPLATE'),
        'CRITERIA_FILE': globals().get('CRITERIA_FILE'),
        'DECISION_OPTIONS': globals().get('DECISION_OPTIONS'),
        'MODEL_NAME': globals().get('MODEL_NAME'),
        'TEMPERATURE': globals().get('TEMPERATURE'),
        'TOPIC': globals().get('TOPIC'),
        'DOMAIN': globals().get('DOMAIN')
    }
    
    # Optional variables that can be None
    optional_vars = {
        'EXAMPLES_FILE': globals().get('EXAMPLES_FILE')
    }
    
    print("üìã Checking required variables:")
    for var_name, var_value in required_vars.items():
        if var_value is None:
            print(f"   ‚ùå {var_name}: NOT DEFINED")
            validation_passed = False
        else:
            print(f"   ‚úÖ {var_name}: {str(var_value)[:50]}{'...' if len(str(var_value)) > 50 else ''}")
    
    print("üìã Checking optional variables:")
    for var_name, var_value in optional_vars.items():
        if var_value is None:
            print(f"   ‚úÖ {var_name}: None (optional - will run without examples)")
        else:
            print(f"   ‚úÖ {var_name}: {str(var_value)[:50]}{'...' if len(str(var_value)) > 50 else ''}")
    
    # Check DataFrame structure
    print(f"\nüìä Checking DataFrame structure:")
    required_columns = ['abstract', 'title_full', 'stage_2', 'stage_3']
    
    if df is None:
        print(f"   ‚ùå DataFrame is None")
        validation_passed = False
    else:
        print(f"   ‚úÖ DataFrame shape: {df.shape}")
        
        for col in required_columns:
            if col in df.columns:
                print(f"   ‚úÖ Column '{col}': Present")
            else:
                print(f"   ‚ùå Column '{col}': MISSING")
                validation_passed = False
    
    # Check data availability
    if df is not None and all(col in df.columns for col in required_columns):
        print(f"\nüìà Checking data availability:")
        stage2_true = len(df[df['stage_2'] == True])
        stage2_false = len(df[df['stage_2'] == False])
        stage3_true = len(df[df['stage_3'] == True])
        stage3_false = len(df[df['stage_3'] == False])
        
        print(f"   üìä Stage 2 True: {stage2_true}")
        print(f"   üìä Stage 2 False: {stage2_false}")
        print(f"   üìä Stage 3 True: {stage3_true}")
        print(f"   üìä Stage 3 False: {stage3_false}")
        
        if stage3_true < 10:
            print(f"   ‚ö†Ô∏è  Warning: Only {stage3_true} stage_3=True examples available")
        if stage3_false < 10:
            print(f"   ‚ö†Ô∏è  Warning: Only {stage3_false} stage_3=False examples available")
    
    # Check file paths
    print(f"\nüìÅ Checking file paths:")
    import os
    
    # CRITERIA_FILE is required
    if CRITERIA_FILE and os.path.exists(CRITERIA_FILE):
        print(f"   ‚úÖ Criteria file: {CRITERIA_FILE}")
    elif CRITERIA_FILE:
        print(f"   ‚ùå Criteria file: {CRITERIA_FILE} (NOT FOUND)")
        validation_passed = False
    else:
        print(f"   ‚ùå Criteria file: NOT SPECIFIED")
        validation_passed = False
    
    # EXAMPLES_FILE is optional
    if EXAMPLES_FILE is None:
        print(f"   ‚úÖ Examples file: None (will run without examples)")
    elif os.path.exists(EXAMPLES_FILE):
        print(f"   ‚úÖ Examples file: {EXAMPLES_FILE}")
    else:
        print(f"   ‚ùå Examples file: {EXAMPLES_FILE} (NOT FOUND)")
        validation_passed = False
    
    # Check API function
    print(f"\nü§ñ Checking API function:")
    if 'screen_abstract_llm' in globals():
        print(f"   ‚úÖ screen_abstract_llm function: Available")
    else:
        print(f"   ‚ùå screen_abstract_llm function: NOT DEFINED")
        validation_passed = False
    
    # Final result
    print("\n" + "=" * 50)
    if validation_passed:
        print("‚úÖ ALL VALIDATIONS PASSED - Ready to run experiment!")
    else:
        print("‚ùå VALIDATION FAILED - Please fix the issues above before running")
    
    return validation_passed

# Run validation
validation_result = validate_experiment_setup(df_LB, "LB")

üîç VALIDATION CHECK
üìã Checking required variables:
   ‚úÖ EXPERIMENT_ID: 004
   ‚úÖ SYSTEM_PROMPT_ID: SYS_001
   ‚úÖ USER_PROMPT_ID: USR_004
   ‚úÖ SYSTEM_PROMPT: You are an expert in scientific literature review ...
   ‚úÖ USER_PROMPT_TEMPLATE: ## SCREENING TASK:
You are conducting the first sc...
   ‚úÖ CRITERIA_FILE: ../prompts/Criteria_LB_01.csv
   ‚úÖ DECISION_OPTIONS: ['INCLUDE', 'EXCLUDE', 'MAYBE']
   ‚úÖ MODEL_NAME: gpt-4o
   ‚úÖ TEMPERATURE: 0.0
   ‚úÖ TOPIC: media_diversity
   ‚úÖ DOMAIN: political_communication
üìã Checking optional variables:
   ‚úÖ EXAMPLES_FILE: ../prompts/exmpl_single_LB_01.csv

üìä Checking DataFrame structure:
   ‚úÖ DataFrame shape: (3944, 15)
   ‚úÖ Column 'abstract': Present
   ‚úÖ Column 'title_full': Present
   ‚úÖ Column 'stage_2': Present
   ‚úÖ Column 'stage_3': Present

üìà Checking data availability:
   üìä Stage 2 True: 277
   üìä Stage 2 False: 3667
   üìä Stage 3 True: 207
   üìä Stage 3 False: 3737

üìÅ Checking file paths:
 

## üî¨ Set Up Function

In [8]:
import pandas as pd
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix 
from datetime import datetime
import os
import time

def run_classification_experiment(
    df, 
    n_total_examples=50,  # ‚¨ÖÔ∏è Total number of examples to test
    n_stage3_true=5,     # ‚¨ÖÔ∏è Number of stage_3=True examples
    n_stage3_false=45,    # ‚¨ÖÔ∏è Number of stage_3=False examples
    dataset_source="LB",  # ‚¨ÖÔ∏è Dataset identifier (LB/BM)
    batch_size=20,        # ‚¨ÖÔ∏è Batch size for processing (max 20 to avoid timeouts)
    save_results=True,    # ‚¨ÖÔ∏è Whether to save results to CSV
    verbose=True          # ‚¨ÖÔ∏è Print progress updates
):
    """
    Run LLM classification experiment on abstracts with batch processing.
    
    Args:
        df: DataFrame with abstracts (must have 'abstract', 'title_full', 'stage_2', 'stage_3')
        n_total_examples: Total number of examples to test
        n_stage3_true: Number of stage_3=True examples to include
        n_stage3_false: Number of stage_3=False examples to include
        dataset_source: Dataset identifier for results filename
        batch_size: Number of examples to process in each batch (max 20)
        save_results: Whether to save results to CSV
        verbose: Whether to print progress
    
    Returns:
        dict: Results including metrics and DataFrame
    """
    
    # Validate batch size
    if batch_size > 20:
        print("‚ö†Ô∏è  Warning: Batch size > 20 may cause timeouts. Setting to 20.")
        batch_size = 20
    
    if verbose:
        print(f"üß™ Starting Classification Experiment with Batch Processing")
        print(f"üìä Dataset: {dataset_source}")
        print(f"üéØ Total examples: {n_total_examples}")
        print(f"‚úÖ Stage 3 True: {n_stage3_true}")
        print(f"‚ùå Stage 3 False: {n_stage3_false}")
        print(f"üì¶ Batch size: {batch_size}")
        print("=" * 50)
    
    # Sample examples
    stage3_true_samples = df[df['stage_3'] == True].sample(n=n_stage3_true, random_state=42)
    stage3_false_samples = df[df['stage_3'] == False].sample(n=n_stage3_false, random_state=42)
    
    # Combine samples
    test_samples = pd.concat([stage3_true_samples, stage3_false_samples]).reset_index(drop=True)
    
    if verbose:
        print(f"üìù Sampled {len(test_samples)} examples")
    
    # Load criteria and examples text
    def load_criteria_text(criteria_file):
        try:
            criteria_df = pd.read_csv(criteria_file)
            criteria_text = ""
            
            # Add inclusion criteria
            inclusion_criteria = criteria_df[criteria_df['type'] == 'inclusion']
            if len(inclusion_criteria) > 0:
                criteria_text += "**INCLUSION CRITERIA:**\n"
                for _, row in inclusion_criteria.iterrows():
                    criteria_text += f"- **{row['criterion_id']}**: {row['description']}\n"
                    if pd.notna(row['examples']) and row['examples'].strip():
                        criteria_text += f"  *Examples: {row['examples']}*\n"
            
            # Add exclusion criteria
            exclusion_criteria = criteria_df[criteria_df['type'] == 'exclusion']
            if len(exclusion_criteria) > 0:
                criteria_text += "\n**EXCLUSION CRITERIA:**\n"
                for _, row in exclusion_criteria.iterrows():
                    criteria_text += f"- **{row['criterion_id']}**: {row['description']}\n"
                    if pd.notna(row['examples']) and row['examples'].strip():
                        criteria_text += f"  *Examples: {row['examples']}*\n"
            
            return criteria_text
        except Exception as e:
            return f"Error loading criteria: {e}"
    
    def load_examples_text(examples_file):
        if not examples_file:
            return ""
        try:
            examples_df = pd.read_csv(examples_file)
            examples_text = "\n## EXAMPLE DECISIONS:\n"
            
            for _, row in examples_df.iterrows():
                decision_label = "INCLUDE" if row['decision'].upper() == 'INCLUDE' else "EXCLUDE"
                examples_text += f"\n**{decision_label} Example:**\n"
                examples_text += f"*Title:* {row['title']}\n"
                examples_text += f"*Abstract:* {row['abstract_text'][:200]}{'...' if len(row['abstract_text']) > 200 else ''}\n"
                examples_text += f"‚Üí **{decision_label}** ({row['reasoning']})\n"
            
            return examples_text
        except Exception as e:
            return f"\n## EXAMPLES:\nError loading examples: {e}\n"
    
    # Load prompt components
    criteria_text = load_criteria_text(CRITERIA_FILE)
    examples_section = load_examples_text(EXAMPLES_FILE) if EXAMPLES_FILE else ""
    
    # Initialize results list
    results_list = []
    
    # Calculate number of batches
    total_examples = len(test_samples)
    num_batches = (total_examples + batch_size - 1) // batch_size  # Ceiling division
    
    if verbose:
        print(f"üì¶ Processing {total_examples} examples in {num_batches} batch(es)")
        print(f"‚è±Ô∏è  Estimated time: ~{num_batches * 2} minutes (2 min per batch)")
    
    # Process examples in batches
    for batch_idx in range(num_batches):
        start_idx = batch_idx * batch_size
        end_idx = min(start_idx + batch_size, total_examples)
        batch_samples = test_samples.iloc[start_idx:end_idx]
        
        if verbose:
            print(f"\nüîÑ Processing Batch {batch_idx + 1}/{num_batches} (examples {start_idx + 1}-{end_idx})")
        
        batch_start_time = time.time()
        
        for idx, row in batch_samples.iterrows():
            sample_number = start_idx + (idx - batch_samples.index[0]) + 1
            
            try:
                # Create complete prompt
                complete_prompt = USER_PROMPT_TEMPLATE.format(
                    topic=TOPIC,
                    domain=DOMAIN,
                    criteria_text=criteria_text,
                    examples_section=examples_section,
                    title=row['title_full'],
                    abstract=row['abstract'],
                    decision_include=DECISION_OPTIONS[0],
                    decision_exclude=DECISION_OPTIONS[1]
                )
                
                # Call LLM
                llm_result = screen_abstract_llm(
                    abstract_text=complete_prompt,
                    system_prompt=SYSTEM_PROMPT,
                    user_prompt_template="{abstract}",  # Just pass through since we formatted above
                    model=MODEL_NAME,
                    temperature=TEMPERATURE
                )
                
                # Parse LLM decision
                llm_decision = llm_result.get('decision', 'UNKNOWN')
                llm_reasoning = llm_result.get('reasoning', 'No reasoning provided')
                
                # Convert to binary for evaluation - MAYBE counts as INCLUDE (1) for stage_2 comparison
                llm_binary = 1 if llm_decision in ['INCLUDE', 'MAYBE'] else 0
                stage2_binary = 1 if row['stage_2'] else 0
                stage3_binary = 1 if row['stage_3'] else 0
                
                # Store result
                result_row = {
                    'example_id': sample_number,
                    'title': row['title_full'],
                    'abstract': row['abstract'],
                    'stage_2_true': row['stage_2'],
                    'stage_3_true': row['stage_3'],
                    'stage_2_binary': stage2_binary,
                    'stage_3_binary': stage3_binary,
                    'llm_decision': llm_decision,
                    'llm_binary': llm_binary,
                    'llm_reasoning': llm_reasoning,
                    'experiment_id': EXPERIMENT_ID,
                    'dataset_source': dataset_source,
                    'system_prompt_id': SYSTEM_PROMPT_ID,
                    'user_prompt_id': USER_PROMPT_ID,
                    'model': MODEL_NAME,
                    'temperature': TEMPERATURE,
                    'timestamp': datetime.now().isoformat()
                }
                
                results_list.append(result_row)
                
            except Exception as e:
                if verbose:
                    print(f"‚ùå Error processing example {sample_number}: {e}")
                
                # Store error result
                result_row = {
                    'example_id': sample_number,
                    'title': row['title_full'],
                    'abstract': row['abstract'],
                    'stage_2_true': row['stage_2'],
                    'stage_3_true': row['stage_3'],
                    'stage_2_binary': 1 if row['stage_2'] else 0,
                    'stage_3_binary': 1 if row['stage_3'] else 0,
                    'llm_decision': 'ERROR',
                    'llm_binary': 0,
                    'llm_reasoning': f'Processing error: {e}',
                    'experiment_id': EXPERIMENT_ID,
                    'dataset_source': dataset_source,
                    'system_prompt_id': SYSTEM_PROMPT_ID,
                    'user_prompt_id': USER_PROMPT_ID,
                    'model': MODEL_NAME,
                    'temperature': TEMPERATURE,
                    'timestamp': datetime.now().isoformat()
                }
                
                results_list.append(result_row)
        
        # Batch completion info
        batch_time = time.time() - batch_start_time
        if verbose:
            print(f"‚úÖ Batch {batch_idx + 1} completed in {batch_time:.1f}s")
            if batch_idx < num_batches - 1:  # Not the last batch
                print(f"‚è≥ Brief pause before next batch...")
                time.sleep(2)  # Small delay between batches
    
    # Create results DataFrame
    results_df = pd.DataFrame(results_list)
    
    # Calculate detailed metrics for stage_2 (MAYBE counts as positive)
    valid_results_stage2 = results_df[results_df['llm_decision'] != 'ERROR']
    if len(valid_results_stage2) > 0:
        y_true_stage2 = valid_results_stage2['stage_2_binary'].values
        y_pred_stage2 = valid_results_stage2['llm_binary'].values
        
        # Basic metrics
        accuracy_stage2 = accuracy_score(y_true_stage2, y_pred_stage2)
        precision_stage2 = precision_score(y_true_stage2, y_pred_stage2, zero_division=0)
        recall_stage2 = recall_score(y_true_stage2, y_pred_stage2, zero_division=0)
        f1_stage2 = f1_score(y_true_stage2, y_pred_stage2, zero_division=0)
        
        # Confusion matrix metrics
        tn2, fp2, fn2, tp2 = confusion_matrix(y_true_stage2, y_pred_stage2).ravel()
    else:
        accuracy_stage2 = precision_stage2 = recall_stage2 = f1_stage2 = 0.0
        tp2 = fp2 = tn2 = fn2 = 0
    
    # Get value counts for LLM decisions
    llm_decision_counts = valid_results_stage2['llm_decision'].value_counts() if len(valid_results_stage2) > 0 else {}
    
    # Updated metrics dictionary (only stage_2, no stage_3)
    metrics = {
        'stage_2_metrics': {
            'accuracy': accuracy_stage2,
            'precision': precision_stage2,
            'recall': recall_stage2,
            'f1_score': f1_stage2,
            'tp': int(tp2),
            'fp': int(fp2),
            'tn': int(tn2),
            'fn': int(fn2)
        },
        'decision_counts': {
            'INCLUDE': int(llm_decision_counts.get('INCLUDE', 0)),
            'MAYBE': int(llm_decision_counts.get('MAYBE', 0)),
            'EXCLUDE': int(llm_decision_counts.get('EXCLUDE', 0)),
            'ERROR': int(llm_decision_counts.get('ERROR', 0))
        },
        'total_examples': len(results_df),
        'successful_classifications': len(valid_results_stage2),
        'errors': len(results_df) - len(valid_results_stage2)
    }
    
    # Enhanced results printing
    if verbose:
        print(f"\nüìä EXPERIMENT RESULTS")
        print("=" * 50)
        print(f"üìà Stage 2 Evaluation (MAYBE counted as INCLUDE):")
        print(f"   Accuracy:  {accuracy_stage2:.3f}")
        print(f"   Precision: {precision_stage2:.3f}")
        print(f"   Recall:    {recall_stage2:.3f}")
        print(f"   F1 Score:  {f1_stage2:.3f}")
        print(f"   TP: {tp2}, FP: {fp2}, TN: {tn2}, FN: {fn2}")
        print(f"\nüìä LLM Decision Distribution:")
        print(f"   INCLUDE: {metrics['decision_counts']['INCLUDE']}")
        print(f"   MAYBE:   {metrics['decision_counts']['MAYBE']}")
        print(f"   EXCLUDE: {metrics['decision_counts']['EXCLUDE']}")
        print(f"   ERROR:   {metrics['decision_counts']['ERROR']}")
        print(f"\nüìã Processing Summary:")
        print(f"   Total examples: {len(results_df)}")
        print(f"   Successful: {len(valid_results_stage2)}")
        print(f"   Errors: {len(results_df) - len(valid_results_stage2)}")
    
    # Save results
    if save_results:
        # Create filename with timestamp
        timestamp = datetime.now().strftime("%m%d%H%M")
        filename = f"{EXPERIMENT_ID}_{dataset_source}_{timestamp}.csv"
        results_dir = "../results"
        os.makedirs(results_dir, exist_ok=True)
        output_path = os.path.join(results_dir, filename)
        
        results_df.to_csv(output_path, index=False)
        
        if verbose:
            print(f"\nüíæ Results saved to: {output_path}")
    
    return {
        'results_df': results_df,
        'metrics': metrics,
        'filename': filename if save_results else None
    }

print("‚úÖ Classification experiment function with MAYBE option support defined")
print("üöÄ Ready to run: run_classification_experiment(df_LB, batch_size=15)")

‚úÖ Classification experiment function with MAYBE option support defined
üöÄ Ready to run: run_classification_experiment(df_LB, batch_size=15)


## üöÄ Run experiment! 

In [9]:
# Run experiment with default settings
results = run_classification_experiment(df_LB)

üß™ Starting Classification Experiment with Batch Processing
üìä Dataset: LB
üéØ Total examples: 50
‚úÖ Stage 3 True: 5
‚ùå Stage 3 False: 45
üì¶ Batch size: 20
üìù Sampled 50 examples
üì¶ Processing 50 examples in 3 batch(es)
‚è±Ô∏è  Estimated time: ~6 minutes (2 min per batch)

üîÑ Processing Batch 1/3 (examples 1-20)
‚úÖ Batch 1 completed in 94.8s
‚è≥ Brief pause before next batch...

üîÑ Processing Batch 2/3 (examples 21-40)
‚úÖ Batch 2 completed in 103.0s
‚è≥ Brief pause before next batch...

üîÑ Processing Batch 3/3 (examples 41-50)
‚úÖ Batch 3 completed in 46.0s

üìä EXPERIMENT RESULTS
üìà Stage 2 Evaluation (MAYBE counted as INCLUDE):
   Accuracy:  0.820
   Precision: 0.444
   Recall:    0.500
   F1 Score:  0.471
   TP: 4, FP: 5, TN: 37, FN: 4

üìä LLM Decision Distribution:
   INCLUDE: 9
   MAYBE:   0
   EXCLUDE: 41
   ERROR:   0

üìã Processing Summary:
   Total examples: 50
   Successful: 50
   Errors: 0

üíæ Results saved to: ../results/004_LB_08131718.csv


## üìä Results Analysis

In [None]:
# Load results file - you can specify the exact file path here
RESULTS_FILE_PATH = "../results/0002_LB_08131412.csv"  # ‚¨ÖÔ∏è Change this to your specific file path

# Alternative: Set to None to auto-load the most recent file
# RESULTS_FILE_PATH = None

if RESULTS_FILE_PATH:
    # Load specific file
    if os.path.exists(RESULTS_FILE_PATH):
        print(f"üìÅ Loading specified file: {os.path.basename(RESULTS_FILE_PATH)}")
        df_results = pd.read_csv(RESULTS_FILE_PATH)
    else:
        print(f"‚ùå Error: File not found: {RESULTS_FILE_PATH}")
        df_results = None
else:
    # Auto-load most recent file (original behavior)
    results_dir = "../results"
    result_files = [f for f in os.listdir(results_dir) if f.endswith('.csv')]
    if result_files:
        latest_file = sorted(result_files)[-1]
        file_path = os.path.join(results_dir, latest_file)
        print(f"üìÅ Auto-loading most recent file: {latest_file}")
        df_results = pd.read_csv(file_path)
    else:
        print("‚ùå No result files found in ../results directory")
        df_results = None

# Continue with analysis if file was loaded successfully
if df_results is not None:
    print(f"\nüìä RESULTS OVERVIEW")
    print("=" * 50)
    print(f"Shape: {df_results.shape}")
    print(f"Columns: {list(df_results.columns)}")
    
    print(f"\nüéØ DECISION SUMMARY")
    print("=" * 30)
    print(df_results['llm_decision'].value_counts())
    
    print(f"\nüìà PERFORMANCE PREVIEW")
    print("=" * 30)
    print("Stage 2 vs LLM:")
    print(pd.crosstab(df_results['stage_2_true'], df_results['llm_decision']))
    print("\nStage 3 vs LLM:")
    print(pd.crosstab(df_results['stage_3_true'], df_results['llm_decision']))
    
    print(f"\nüìã FIRST FEW RESULTS")
    print("=" * 30)
    display(df_results[['example_id', 'stage_2_true', 'stage_3_true', 'llm_decision', 'llm_reasoning']].head())
else:
    print("‚ùå Could not load results file for analysis")

In [None]:
# Display full reasoning for first 5 examples
print("ü§ñ FULL LLM REASONING EXAMPLES")
print("=" * 80)

for idx in range(min(5, len(df_results))):
    row = df_results.iloc[idx]
    print(f"\nüìã EXAMPLE {row['example_id']} - {row['llm_decision']}")
    print(f"üéØ Ground Truth: Stage 2={row['stage_2_true']}, Stage 3={row['stage_3_true']}")
    print(f"üìñ Title: {row['title'][:100]}{'...' if len(row['title']) > 100 else ''}")
    print(f"\nüí≠ FULL REASONING:")
    print("-" * 60)
    print(row['llm_reasoning'])
    print("-" * 60)

## ‚ûï Add experiment info to the results_df

In [10]:
def add_experiment_to_summary(results_dict, summary_file="../results/experiment_summary.csv"):
    """Add new experiment results to the summary DataFrame with confusion matrix metrics and decision counts"""
    
    new_row = pd.DataFrame({
        'experiment_id': [EXPERIMENT_ID],
        'experiment_date': [EXPERIMENT_DATE],
        'experiment_category': [EXPERIMENT_CATEGORY],
        'experiment_goal': [EXPERIMENT_GOAL],
        'system_prompt_id': [SYSTEM_PROMPT_ID],
        'user_prompt_id': [USER_PROMPT_ID],
        'model_name': [MODEL_NAME],
        'temperature': [TEMPERATURE],
        'max_tokens': [MAX_TOKENS],
        'criteria_file': [CRITERIA_FILE],
        'examples_file': [EXAMPLES_FILE],
        'output_format': [OUTPUT_FORMAT],
        'domain': [DOMAIN],
        'topic': [TOPIC],
        'dataset_source': [DATASET_SOURCE],
        'n_total_examples': [results_dict['metrics']['total_examples']],
        'n_successful': [results_dict['metrics']['successful_classifications']],
        'n_errors': [results_dict['metrics']['errors']],
        # Stage 2 metrics
        'stage2_accuracy': [results_dict['metrics']['stage_2_metrics']['accuracy']],
        'stage2_precision': [results_dict['metrics']['stage_2_metrics']['precision']],
        'stage2_recall': [results_dict['metrics']['stage_2_metrics']['recall']],
        'stage2_f1': [results_dict['metrics']['stage_2_metrics']['f1_score']],
        'stage2_tp': [results_dict['metrics']['stage_2_metrics']['tp']],
        'stage2_fp': [results_dict['metrics']['stage_2_metrics']['fp']],
        'stage2_tn': [results_dict['metrics']['stage_2_metrics']['tn']],
        'stage2_fn': [results_dict['metrics']['stage_2_metrics']['fn']],
        # Decision counts (new columns for MAYBE experiments)
        'llm_include_count': [results_dict['metrics']['decision_counts']['INCLUDE']],
        'llm_maybe_count': [results_dict['metrics']['decision_counts']['MAYBE']],
        'llm_exclude_count': [results_dict['metrics']['decision_counts']['EXCLUDE']],
        'llm_error_count': [results_dict['metrics']['decision_counts']['ERROR']],
        # Stage 3 metrics (set to None for MAYBE experiments)
        'stage3_accuracy': [None],
        'stage3_precision': [None],
        'stage3_recall': [None],
        'stage3_f1': [None],
        'stage3_tp': [None],
        'stage3_fp': [None],
        'stage3_tn': [None],
        'stage3_fn': [None],
        'results_filename': [results_dict['filename']],
        'timestamp': [datetime.now().isoformat()]
    })
    
    # Load existing summary or create new one
    if os.path.exists(summary_file):
        existing_summary = pd.read_csv(summary_file)
        updated_summary = pd.concat([existing_summary, new_row], ignore_index=True)
        print(f"‚úÖ Added experiment {EXPERIMENT_ID} to existing summary")
    else:
        updated_summary = new_row
        print(f"‚úÖ Created new summary file with experiment {EXPERIMENT_ID}")
    
    # Save updated summary
    updated_summary.to_csv(summary_file, index=False)
    print(f"üíæ Summary saved to: {summary_file}")
    
    # Display last 5 rows for verification
    print(f"\nüìã LAST 5 EXPERIMENTS IN SUMMARY:")
    print("=" * 50)
    display(updated_summary.tail())
    
    print(f"\nüìä SUMMARY STATS:")
    print(f"   Total experiments: {len(updated_summary)}")
    print(f"   Unique experiment IDs: {updated_summary['experiment_id'].nunique()}")
    print(f"   Datasets used: {updated_summary['dataset_source'].unique().tolist()}")
    
    return updated_summary

# Usage example (uncomment to run):
summary_df = add_experiment_to_summary(results)

‚úÖ Added experiment 004 to existing summary
üíæ Summary saved to: ../results/experiment_summary.csv

üìã LAST 5 EXPERIMENTS IN SUMMARY:


  updated_summary = pd.concat([existing_summary, new_row], ignore_index=True)


Unnamed: 0,experiment_id,experiment_date,experiment_category,experiment_goal,system_prompt_id,user_prompt_id,model_name,temperature,max_tokens,criteria_file,examples_file,output_format,domain,topic,dataset_source,n_total_examples,n_successful,n_errors,stage2_accuracy,stage2_precision,stage2_recall,stage2_f1,stage2_tp,stage2_fp,stage2_tn,stage2_fn,stage3_accuracy,stage3_precision,stage3_recall,stage3_f1,stage3_tp,stage3_fp,stage3_tn,stage3_fn,results_filename,timestamp,llm_include_count,llm_maybe_count,llm_exclude_count,llm_error_count
0,1,2025-08-13,Testing,Test Set Up,SYS_001,USR_001,gpt-4o,0.0,4000,../prompts/Criteria_LB_01.csv,../prompts/exmpl_single_LB_01.csv,Binary,political_communication,media_dviersity,LB,25,25,0,0.88,0.8,0.667,0.727,4,1,18,2,0.92,0.8,0.8,0.8,4.0,1.0,19.0,1.0,001_LB_08131314.csv,2025-08-13T13:14:33.374055,,,,
1,2,2025-08-13,Testing,Test Set Up,SYS_001,USR_002,gpt-4o,0.0,4000,../prompts/Criteria_LB_01.csv,,Binary,political_communication,media_diversity,LB,50,50,0,0.84,0.5,0.5,0.5,4,4,38,4,0.9,0.5,0.8,0.615,4.0,4.0,41.0,1.0,0002_LB_08131412.csv,2025-08-13T14:18:25.943746,,,,
2,3,2025-08-13,Testing,Test Set Up,SYS_001,USR_001,gpt-4o,0.0,4000,../prompts/Criteria_LB_01.csv,../prompts/exmpl_five_LB_01.csv,Binary,political_communication,media_diversity,LB,50,50,0,0.84,0.5,0.625,0.556,5,5,37,3,0.86,0.4,0.8,0.533,4.0,6.0,39.0,1.0,003_LB_08131430.csv,2025-08-13T14:34:00.733833,,,,
3,4,2025-08-13,Testing,Test Set Up,SYS_001,USR_003,gpt-4o,0.0,4000,../prompts/Criteria_LB_01.csv,,Yes/Maybe/No,political_communication,media_diversity,LB,50,50,0,0.82,0.462,0.75,0.571,6,7,35,2,,,,,,,,,004_LB_08131504.csv,2025-08-13T15:08:35.001340,13.0,0.0,37.0,0.0
4,4,2025-08-13,Testing,Test Set Up,SYS_001,USR_004,gpt-4o,0.0,4000,../prompts/Criteria_LB_01.csv,../prompts/exmpl_single_LB_01.csv,Yes/Maybe/No,political_communication,media_diversity,LB,50,50,0,0.82,0.444,0.5,0.471,4,5,37,4,,,,,,,,,004_LB_08131718.csv,2025-08-13T17:20:17.593168,9.0,0.0,41.0,0.0



üìä SUMMARY STATS:
   Total experiments: 5
   Unique experiment IDs: 5
   Datasets used: ['LB']


In [16]:
# Load the experiment summary
summary_df = pd.read_csv("../results/experiment_summary.csv")

# Change experiment_id from 4 to 5 for the row at index 3 (4th row, 0-indexed)
summary_df.loc[4, 'experiment_id'] = '005'

# Save the corrected summary back to file
summary_df.to_csv("../results/experiment_summary.csv", index=False)

# Verify the change
print("‚úÖ Experiment ID corrected")
print(f"üìä Row at index 3 now has experiment_id: {summary_df.loc[3, 'experiment_id']}")

# Display the last 5 rows to confirm
print(f"\nüìã LAST 5 EXPERIMENTS IN SUMMARY:")
print("=" * 50)
display(summary_df.tail())

‚úÖ Experiment ID corrected
üìä Row at index 3 now has experiment_id: 4

üìã LAST 5 EXPERIMENTS IN SUMMARY:


  summary_df.loc[4, 'experiment_id'] = '005'


Unnamed: 0,experiment_id,experiment_date,experiment_category,experiment_goal,system_prompt_id,user_prompt_id,model_name,temperature,max_tokens,criteria_file,examples_file,output_format,domain,topic,dataset_source,n_total_examples,n_successful,n_errors,stage2_accuracy,stage2_precision,stage2_recall,stage2_f1,stage2_tp,stage2_fp,stage2_tn,stage2_fn,stage3_accuracy,stage3_precision,stage3_recall,stage3_f1,stage3_tp,stage3_fp,stage3_tn,stage3_fn,results_filename,timestamp,llm_include_count,llm_maybe_count,llm_exclude_count,llm_error_count
0,1,2025-08-13,Testing,Test Set Up,SYS_001,USR_001,gpt-4o,0.0,4000,../prompts/Criteria_LB_01.csv,../prompts/exmpl_single_LB_01.csv,Binary,political_communication,media_dviersity,LB,25,25,0,0.88,0.8,0.667,0.727,4,1,18,2,0.92,0.8,0.8,0.8,4.0,1.0,19.0,1.0,001_LB_08131314.csv,2025-08-13T13:14:33.374055,,,,
1,2,2025-08-13,Testing,Test Set Up,SYS_001,USR_002,gpt-4o,0.0,4000,../prompts/Criteria_LB_01.csv,,Binary,political_communication,media_diversity,LB,50,50,0,0.84,0.5,0.5,0.5,4,4,38,4,0.9,0.5,0.8,0.615,4.0,4.0,41.0,1.0,0002_LB_08131412.csv,2025-08-13T14:18:25.943746,,,,
2,3,2025-08-13,Testing,Test Set Up,SYS_001,USR_001,gpt-4o,0.0,4000,../prompts/Criteria_LB_01.csv,../prompts/exmpl_five_LB_01.csv,Binary,political_communication,media_diversity,LB,50,50,0,0.84,0.5,0.625,0.556,5,5,37,3,0.86,0.4,0.8,0.533,4.0,6.0,39.0,1.0,003_LB_08131430.csv,2025-08-13T14:34:00.733833,,,,
3,4,2025-08-13,Testing,Test Set Up,SYS_001,USR_003,gpt-4o,0.0,4000,../prompts/Criteria_LB_01.csv,,Yes/Maybe/No,political_communication,media_diversity,LB,50,50,0,0.82,0.462,0.75,0.571,6,7,35,2,,,,,,,,,004_LB_08131504.csv,2025-08-13T15:08:35.001340,13.0,0.0,37.0,0.0
4,5,2025-08-13,Testing,Test Set Up,SYS_001,USR_004,gpt-4o,0.0,4000,../prompts/Criteria_LB_01.csv,../prompts/exmpl_single_LB_01.csv,Yes/Maybe/No,political_communication,media_diversity,LB,50,50,0,0.82,0.444,0.5,0.471,4,5,37,4,,,,,,,,,004_LB_08131718.csv,2025-08-13T17:20:17.593168,9.0,0.0,41.0,0.0


## üìù Conclusions and Next Steps

### Key Findings
- 

### Next Steps
- [Suggest follow-up experiments]
- [List potential improvements]
- [Identify areas for further investigation]