## Summarizing Many Papers

In [4]:
# Import relevant libraries
import getpass
import glob
import os
import time
import warnings

from langchain_community.document_loaders import PyPDFLoader
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Suppress UserWarning-category warnings raised while parsing PDFs
warnings.filterwarnings("ignore", category=UserWarning)

# Keys for tracing and API
os.environ["LANGSMITH_TRACING"] = "true"
if not os.environ.get("LANGSMITH_API_KEY"):
    os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter LangSmith API Key: ")
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter OpenAI API Key: ")

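# Initialize the LLM. OpenAI's o-series reasoning models generally accept only the
# default temperature, so no custom value is passed here.
llm = ChatOpenAI(model="o3")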

def load_pdfs_from_folder(folder):
    """Load every PDF in `folder` and return one dict per paper, with all pages joined into a single text."""
    pdf_paths = glob.glob(f"{folder}/*.pdf")
    docs = []
    for i, path in enumerate(pdf_paths):
        print(f"Loading PDF {i+1}/{len(pdf_paths)}: {os.path.basename(path)}")
        try:
            loader = PyPDFLoader(path)
            pdf_docs = loader.load()
            
            # PyPDFLoader returns one Document per page; join the pages so
            # each paper is sent to the LLM as a single text
            full_text = "".join(page.page_content + "\n" for page in pdf_docs)
            
            # Only add if we successfully extracted text
            if full_text.strip():
                docs.append({
                    "path": path, 
                    "content": full_text,
                    "filename": os.path.basename(path)
                })
            else:
                print(f"Warning: No text extracted from {path}")
                
        except Exception as e:
            print(f"Error loading {path}: {e}")
            continue
    
    return docs

# Field description
MY_FIELD_DESCRIPTION = """My field is Multirobot Adaptive Navigation in Environmental Vector Fields,
combining robotics, differential geometry, information theory, and numerical methods to determine the absolute minimum
information required for navigation in vector and scalar environments. The core discovery is that environmental
vector fields contain sufficient geometric structure for complete navigation using only instantaneous
measurements from 3 or 4 strategically positioned robots, which can extract all necessary second-order field
information (gradients, Jacobians, Hessians, eigenstructure) that traditionally required 6 or more
measurements. The key breakthrough is developing memory-free control laws, also known as reactive or adaptive control laws, that achieve convergence using
only instantaneous measurements without any state estimation or feedback history (Markov). These control laws exploit
fundamental mathematical properties like Hessian symmetry and vector field consistency constraints where
the field structure itself acts as a natural computer providing navigation instructions through classical
optimization geometry. Technical innovations include proving that 4-point multi-robot formations form sufficient
sampling stencils for second-order field properties, creating universal navigation primitives that handle
saddle points in scalar fields, and three-robot primitives that can attract, repel, and maintain a fixed orbit around
critical points in 2D vector fields (foci, nodes, centers, vortices, saddle points) through a single memoryless orbit primitive.
All primitives are tested experimentally with a communication rate of 10 Hz, achieving sub-centimeter formation precision that enables reliable field geometry extraction.
Another contribution is a primitive that moves through a vector field along the separatrix or bifurcation. This approach
enables applications in GPS-denied environments or dynamic environments that change over time. There is also an emphasis on information-minimal navigation.
The experimental verification helps bridge the simulation-to-reality gap. Potential use cases span land, sea, sky, surface, underwater, and space settings, including magnetic fields, gravitational fields, and potentially any vector or scalar field."""

# Analysis questions
ANALYSIS_QUESTIONS = """
Analyze this paper and extract the following information:

0. Citation: Give the citation for the paper in MLA format (be sure to capture the date/year).
1. Main Contributions: What are the 2-4 key contributions claimed by the authors?
2. Objective: What was their objective, and how well did they achieve it? (Give metrics.)
3. Limitations: What limitations did they identify in their work? (Identify 2-5.)
4. Future Work: What future directions do they see their work going in? (Identify 2-5.)
5. Methods: What hardware and software were used (for example ROS, Gazebo, Simulink, Python, OptiTrack)?
6. Field Positioning: How do the authors describe the current state of their own field? Be very verbose here (about 100 words).
7. Multirobot Fields: How do the authors describe the current state of the art in multirobot or multiagent systems? Be very verbose again.
8. Research Gaps: Do the authors identify any outstanding challenges in their field, or areas that are not well explored?
9. Benefits: What benefits are achieved using a multirobot system, and does this paper unlock additional benefits?
10. Applications: Tell me all the applications that are using multirobot systems.
11. Vector Fields: Does the paper talk about naturally occurring vector fields? What about scalar fields? What kinds?
12. Adaptivity: Does the paper have knowledge of the environment, and if so, how much? Or does it use only local measurements to navigate?
13. Memory: In addition to local measurements, does it build a map or maintain history, or is it purely reactive or purely adaptive?
14. Mathematical Disciplines: List all mathematical disciplines used in this research.
15. Minimality: Does the paper make any reference to minimal information? What about optimality or convergence?
16. Scaling: Does the paper talk about how algorithms or systems scale as the number of robots increases? What about robustness, memory complexity, or zero-memory navigation?
17. Field Properties: Does it mention gradients, Hessians, Jacobians, or geometric structure? If so, how?
18. Assumptions: What assumptions does the paper make?
19. Drawbacks: Does it mention other methods and their drawbacks, such as iterative convergence, extensive training data, probabilistic approaches, or extensive a priori modelling?
20. Simulation to Reality: Does it cover simulation only, or real-world tests? If real-world or an indoor testbed, give details. If there is no testbed, see if it says why.
21. Key Words: Does it mention any of these key words, and if so, in what context? Foci, saddle, centre, vortex, sink, source, separatrix, bifurcation, memoryless navigation, eigenvector, eigenvalue, reactive navigation, adaptive navigation, eigenstructure, determinant.
22. Formation Control: Does it use cluster control, particle swarm optimization, or another method? Does it say why it chose that type?
23. Similarities and Differences: Explain how my research addresses any gaps or future work identified in the paper (or n/a with an explanation).
"""

# Create the review prompt
review_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a highly analytical academic researcher. You have a background in robotics. Please be specific and quote directly when the authors make important claims about the field or gaps. Be verbose, as I want to extract as much context as possible from each paper as it relates to my field. \nUser's field: " + MY_FIELD_DESCRIPTION),
    ("human", "Given the following paper text, answer these questions:\n"
     + ANALYSIS_QUESTIONS +
    "\n=== BEGIN PAPER DOCUMENT ===\n{paper_text}\n=== END PAPER DOCUMENT ===")
])

def analyze_papers(folder_path="./PapersForResearch", output_file="all_paper_summaries.md"):
    """Main function to analyze all papers and generate summaries"""
    
    # Load PDFs
    print("Loading PDFs...")
    pdf_docs = load_pdfs_from_folder(folder_path)
    print(f"Successfully loaded {len(pdf_docs)} papers")
    
    if not pdf_docs:
        print("No PDFs found or loaded successfully!")
        return
    
    # Process each paper
    results = []
    
    for i, doc in enumerate(pdf_docs):
        print(f"\n=== Processing paper {i+1}/{len(pdf_docs)}: {doc['filename']} ===")
        
        # Check if content is too long (rough estimate for token limit)
        if len(doc['content']) > 100000:  # Adjust this threshold as needed
            print(f"Warning: {doc['filename']} is very long ({len(doc['content'])} chars). Consider splitting.")
        
        # Prepare input
        paper_input = {"paper_text": doc['content']}
        
        # Create chain
        chain = review_prompt | llm
        
        # Generate analysis
        try:
            print("Sending to LLM...")
            response = chain.invoke(paper_input)
            summary = response.content if hasattr(response, 'content') else str(response)
            print(f"✓ Successfully analyzed {doc['filename']}")
            
        except Exception as e:
            summary = f"ERROR analyzing {doc['filename']}: {e}"
            print(f"✗ Error analyzing {doc['filename']}: {e}")
        
        # Store result
        results.append({
            "file": doc["path"],
            "filename": doc["filename"],
            "summary": summary,
        })
        
        # Rate limiting
        time.sleep(2)  # Adjust based on your OpenAI plan
    
    # Write all results to file once every paper has been processed
    print(f"\nWriting results to {output_file}...")
    with open(output_file, "w", encoding='utf-8') as f:
        f.write("# Academic Paper Analysis Results\n\n")
        f.write(f"Generated on: {time.strftime('%Y-%m-%d %H:%M:%S')}\n\n")
        f.write(f"Total papers analyzed: {len(results)}\n\n")
        f.write("---\n\n")
        
        for r in results:
            f.write(f"# Analysis for {r['filename']}\n\n")
            f.write(f"**File Path:** {r['file']}\n\n")
            f.write(r['summary'])
            f.write("\n\n" + "="*80 + "\n\n")
    
    print(f"✓ Analysis complete! Results saved to {output_file}")
    return results

# Run the analysis
if __name__ == "__main__":
    # You can customize these parameters
    results = analyze_papers(
        folder_path="./PapersForResearch",  # Change this to your PDF folder
        output_file="all_paper_summaries.md"
    )

Loading PDFs...
Loading PDF 1/68: [2] INDOOR TESTBED FOR VECTOR FIELD MULTIROBOT ADAPTIVE NAVIGATION (2).pdf
Loading PDF 2/68: [27] Augmented_Kalman_Filter_Design_in_a_Localization_System_Using_Onboard_Sensors_With_Intrinsic_Delays.pdf
Loading PDF 3/68: [23] Safe multiagent reinforcement learning.pdf
Loading PDF 4/68: [57] Matroid s10514-018-9778-6.pdf
Loading PDF 5/68: [66] Cooperative_Distributed_Source_Seeking_by_Multiple_Robots_Algorithms_and_Experiments.pdf
Loading PDF 6/68: [64] 3-D_Adaptive_Navigation_Multirobot_Formation_Control_for_Seeking_and_Tracking_of_a_Moving_Source.pdf
Loading PDF 7/68: [13] Multi-Robot_Dynamical_Source_Seeking_in_Unknown_Environments.pdf
Loading PDF 8/68: [44] Fully_Decentralized_Controller_for_Multi-Robot_Collective_Transport_in_Space_Applications.pdf
Loading PDF 9/68: [58] Differential_analysis_of_bifurcations_and_isolated_singularities_for_robots_and_mechanisms.pdf
Loading PDF 10/68: [22] Mobile_Robot_Navigation_Functions_Tuned_by_Sensor_Readings_in_

Ignoring wrong pointing object 6 0 (offset 0)
Ignoring wrong pointing object 9 0 (offset 0)
Ignoring wrong pointing object 11 0 (offset 0)
Ignoring wrong pointing object 13 0 (offset 0)
Ignoring wrong pointing object 15 0 (offset 0)
Ignoring wrong pointing object 17 0 (offset 0)
Ignoring wrong pointing object 19 0 (offset 0)
Ignoring wrong pointing object 21 0 (offset 0)
Ignoring wrong pointing object 27 0 (offset 0)
Ignoring wrong pointing object 29 0 (offset 0)
Ignoring wrong pointing object 31 0 (offset 0)
Ignoring wrong pointing object 38 0 (offset 0)
Ignoring wrong pointing object 40 0 (offset 0)
Ignoring wrong pointing object 42 0 (offset 0)
Ignoring wrong pointing object 44 0 (offset 0)
Ignoring wrong pointing object 51 0 (offset 0)
Ignoring wrong pointing object 53 0 (offset 0)
Ignoring wrong pointing object 55 0 (offset 0)
Ignoring wrong pointing object 57 0 (offset 0)
Ignoring wrong pointing object 77 0 (offset 0)
Ignoring wrong pointing object 79 0 (offset 0)
Ignoring wrong 

Loading PDF 36/68: [30] Final Paper.pdf
Loading PDF 37/68: [67] Cooperative_control_of_mobile_sensor_networksAdaptive_gradient_climbing_in_a_distributed_environment.pdf
Loading PDF 38/68: [7] Guiding_Vector_Fields_for_the_Distributed_Motion_Coordination_of_Mobile_Robots.pdf
Loading PDF 39/68: [14] A Survey of Distributed Relative Localization Algorithms.pdf
Loading PDF 40/68: [51] Optimizing_Topologies_for_Probabilistically_Secure_Multi-Robot_Systems.pdf
Loading PDF 41/68: [3] Initial_Study_of_Multirobot_Adaptive_Navigation_for_Exploring_Environmental_Vector_Fields.pdf
Loading PDF 42/68: [24] Simultaneous_Position_and_Orientation_Planning_of_Nonholonomic_Multirobot_Systems_A_Dynamic_Vector_Field_Approach.pdf
Loading PDF 43/68: [37] Distributed_Nonlinear_Trajectory_Optimization_for_Multi-Robot_Motion_Planning.pdf
Loading PDF 44/68: [12] A_Distributed_Multi-Robot_Framework_for_Exploration_Information_Acquisition_and_Consensus.pdf
Loading PDF 45/68: [28] Highly_Efficient_Observation_Proce

Ignoring wrong pointing object 22 0 (offset 0)
Ignoring wrong pointing object 52 0 (offset 0)
Ignoring wrong pointing object 56 0 (offset 0)
Ignoring wrong pointing object 141 0 (offset 0)
Ignoring wrong pointing object 249 0 (offset 0)
Ignoring wrong pointing object 280 0 (offset 0)
Ignoring wrong pointing object 293 0 (offset 0)
Ignoring wrong pointing object 300 0 (offset 0)
Ignoring wrong pointing object 391 0 (offset 0)
Ignoring wrong pointing object 416 0 (offset 0)
Ignoring wrong pointing object 531 0 (offset 0)
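
The `Ignoring wrong pointing object` lines still appear even though the cell filters `UserWarning`. A likely reason is that recent `pypdf` releases report recoverable parse problems through Python's `logging` module rather than the `warnings` module. The sketch below assumes that `PyPDFLoader` is backed by `pypdf` and that the messages come from loggers under the `pypdf` name; `quiet_pdf_parser_logs` is a hypothetical helper, not part of the original notebook.

```python
import logging


def quiet_pdf_parser_logs(logger_name="pypdf", level=logging.ERROR):
    """Raise the threshold of the PDF parser's logger so recoverable-parse warnings are hidden.

    Assumption: the parser logs under `logger_name` (for example "pypdf._reader"),
    so setting the level on the parent logger also covers its child loggers.
    """
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    return logger


# Call once before loading PDFs; messages at ERROR level and above are still shown.
quiet_pdf_parser_logs()
```

If the installed PDF backend uses a different logger name, pass it as `logger_name` instead.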


Loading PDF 52/68: [49] NeurIPS-2019-necessary-and-sufficient-geometries-for-gradient-methods-Paper.pdf
Loading PDF 53/68: [61] Recent Advances s43154-021-00049-2.pdf
Loading PDF 54/68: [54] Multirobot_Symmetric_Formations_for_Gradient_and_Hessian_Estimation_With_Application_to_Source_Seeking.pdf
Loading PDF 55/68: [48] Importance Sampling1608.08814v1.pdf
Loading PDF 56/68: [62] Multiple_UAV_Adaptive_Navigation_for_Three-Dimensional_Scalar_Fields.pdf
Loading PDF 57/68: [60] swarm intelligence a review.pdf
Loading PDF 58/68: [32] khatib-1986-real-time-obstacle-avoidance-for-manipulators-and-mobile-robots.pdf
Loading PDF 59/68: [15] Distributing_Collaborative_Multi-Robot_Planning_With_Gaussian_Belief_Propagation.pdf
Loading PDF 60/68: [43] partial eigenstructure math-06-10-647.pdf
Loading PDF 61/68: [9] Multirobot_Field_of_View_Control_With_Adaptive_Decentralization.pdf
Loading PDF 62/68: [38] Distributed_Competition_of_Multi-Robot_Coordination_Under_Variable_and_Switching_Topologies.pdf

KeyboardInterrupt: 
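
Because `analyze_papers` writes its output only after every paper has been processed, an interrupt like the one above discards any summaries produced so far. One possible mitigation is to append each result to a checkpoint file as soon as it is available and to skip finished papers on the next run. The following is a minimal sketch under that assumption; the JSON Lines format and the helper names `load_completed` and `append_result` are illustrative and do not appear in the original code.

```python
import json
from pathlib import Path


def load_completed(checkpoint_path):
    """Return {filename: summary} for papers already recorded in the checkpoint file."""
    done = {}
    path = Path(checkpoint_path)
    if not path.exists():
        return done
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                record = json.loads(line)
                done[record["filename"]] = record["summary"]
    return done


def append_result(checkpoint_path, filename, summary):
    """Append one finished summary as a single JSON line and flush it to disk."""
    with open(checkpoint_path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"filename": filename, "summary": summary}) + "\n")
        f.flush()
```

Inside the processing loop, a call such as `append_result("summaries.jsonl", doc["filename"], summary)` after each LLM response, combined with a check against `load_completed("summaries.jsonl")` before sending a paper, would let an interrupted run resume where it stopped. The file name `summaries.jsonl` is likewise only an example.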

I have attached summaries of 60+ papers, and I want you to do some analysis on them for me and include some useful statistics.
Below is a sample of what I want, but please feel free to include additional analysis if you believe it would be helpful. Some of these may be similar, so no need to include all. These are just suggestions.

General Summary
List the 5 most common themes in order. What research direction appears most frequently?
Give me a one-pager on how the field has evolved over time.
Create a taxonomy of the main research streams in the field.
Tell me about all the gaps that exist in the field of multirobot systems. What unsolved problems still remain in the field, ranked by consensus and impact?
Tell me about common methodologies: what is becoming standard vs. novel (for example, using MuJoCo or Python)?

Comparison to my work
How does my work (excerpt below) differ from the existing approaches? Which approach is closest to it? Include a list of the ones that involve Chris Kitts, and another list of the ones that don't.
What specific gaps in literature does my work cover? Is it mentioned elsewhere?
Create a table comparing my methods to 5 most similar approaches
What would be top real life use cases for my work? Which papers suggest so?
Is my work novel? Or is it covered by someone else?
What potential bottlenecks would reviewers have about my work or field?


My research:
My field is Multirobot Adaptive Navigation in Environmental Vector Fields, an interdisciplinary area combining robotics, control theory, and environmental sensing. This field addresses the challenge of deploying teams of autonomous robots to explore and navigate through naturally occurring vector fields (such as ocean currents, atmospheric flows, or electromagnetic fields) using only local, distributed measurements. Unlike traditional path planning where robots navigate to predetermined locations, adaptive navigation requires robots to discover and track dynamic features of interest (critical points, bifurcations, extrema) in real time using collaborative sensing and estimation. The core technical challenges include: extracting meaningful information (gradients, Jacobians, Hessians, eigenvectors) from noisy distributed sensors, maintaining precise multi-robot formations while executing adaptive maneuvers, developing control primitives that exploit topological features of vector fields (sources, sinks, centers, critical points, saddles, bifurcations, separatrices), and bridging the gap between theoretical algorithms and physical implementations. The key to the research is that it uses only current sensor data to update the control law. No memory stores and no feedback loops are needed, yet the robots still reach their target or trajectory of choice.

This field has critical applications in environmental monitoring (tracking pollution plumes, mapping ocean dynamics), search and rescue (following scent gradients), and scientific exploration (characterizing atmospheric phenomena), where understanding and navigating complex flow patterns is essential for mission success. Some of the math is attached as Paper II.

My field is Multirobot Adaptive Navigation in Environmental Vector Fields, 
combining robotics, differential geometry, and information theory to determine the absolute minimum 
information required for navigation in complex environments. The core discovery is that environmental 
vector fields contain sufficient geometric structure for complete navigation using only instantaneous 
measurements from 3 or 4 strategically positioned robots, which can extract all necessary second-order field 
information (gradients, Jacobians, Hessians, eigenstructure) that traditionally required 6 or more 
measurements. The key breakthrough is developing memory-free control laws that achieve convergence using 
only instantaneous 10Hz measurements without any state estimation or feedback history, exploiting 
fundamental mathematical properties like Hessian symmetry and vector field consistency constraints where 
the field structure itself acts as a natural computer providing navigation instructions through classical 
optimization geometry. Technical innovations include proving that 4-point multi-robot formations form sufficient 
sampling stencils for second-order field properties, creating universal navigation primitives that handle 
saddle points in scalar fields, three-robot primitives that can attract, repel, and maintain a fixed orbit around
critical points in 2D vector fields (foci, nodes, centers, vortices, saddle points) through a single memoryless orbit primitive, and 
achieving sub-centimeter formation precision enabling reliable field geometry extraction.
Another contribution is a primitive that moves through a vector field along the separatrix or bifurcation. This approach 
enables applications in GPS-denied environments like ocean robots tracking pollution plumes with minimal 
battery power, aerial swarms characterizing atmospheric phenomena with limited communication, and 
search-and-rescue teams following chemical gradients, with all theoretical results validated on a 
physical testbed marking the first successful implementation of truly information-minimal navigation 
bridging the simulation-to-reality gap.

After providing this analysis, use the attached documents and your analysis to write a good introduction for an IEEE Transactions on Mechatronics paper. No em-dashes.
Give me the outline: start with multirobot systems, explain the field and the gaps needed to introduce my work, and explain why solving these gaps is important.
Then introduce my novel contributions. Keep it in LaTeX and to about 3-4 pages (about 30-50% longer than what I have below).
Add citations like this: "...that single robots cannot achieve [1]." Use plain text, not \cite. I'll have a list of numbered references for you to compare to.
Add about 40 citations; more than one reference can be used in the same sentence.
Attached is an example of what I have written, but this was before I had all the paper summaries.

