# Research Agent Bot Development & Testing Notebook

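This notebook exercises each class, function, and workflow step of the Research Agent Bot in order:

1. **Setup and Configuration Testing** - Verify environment setup
2. **Gemini Client Testing** - Test AI model connectivity
3. **Google Scholar API Testing** - Test paper search functionality
4. **Document Processing Testing** - Test PDF/HTML processing
5. **Vector Store Testing** - Test embeddings and similarity search
6. **Research Agent Integration Testing** - Test each workflow step individually
7. **Full Workflow Testing** - Test the end-to-end research workflow
8. **Error Handling and Edge Cases Testing** - Test edge cases and failures
9. **Performance and Resource Testing** - Measure timing and memory usage
10. **Test Summary and Results** - Report which components were exercised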

## 1. Setup and Configuration Testing

In [1]:
# Import all necessary libraries
import os
import sys
import asyncio
import logging
from datetime import datetime
import json

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

print(f"Python version: {sys.version}")
print(f"Current working directory: {os.getcwd()}")
print(f"Test started at: {datetime.now()}")
print("=" * 50)

Python version: 3.12.10 (tags/v3.12.10:0cc8128, Apr  8 2025, 12:21:36) [MSC v.1943 64 bit (AMD64)]
Current working directory: e:\Research_Agent_Bot
Test started at: 2025-07-24 08:48:57.507358


In [2]:
# Test configuration loading
try:
    from config import config
    print("✅ Configuration loaded successfully")
    print(f"   - Gemini Model: {config.GEMINI_MODEL}")
    print(f"   - Max Results: {config.MAX_RESULTS}")
    print(f"   - Temperature: {config.TEMPERATURE}")
    print(f"   - Vector DB Path: {config.VECTOR_DB_PATH}")
    print(f"   - Cache Dir: {config.CACHE_DIR}")
    
    # Check API keys (without revealing them)
    print(f"   - Gemini API Key: {'✅ Present' if config.GEMINI_API_KEY else '❌ Missing'}")
    print(f"   - SerpAPI Key: {'✅ Present' if config.SERPAPI_KEY else '❌ Missing'}")
    
except Exception as e:
    print(f"❌ Configuration failed: {e}")

✅ Configuration loaded successfully
   - Gemini Model: gemini-2.5-flash
   - Max Results: 10
   - Temperature: 0.3
   - Vector DB Path: ./chroma_db
   - Cache Dir: ./cache
   - Gemini API Key: ✅ Present
   - SerpAPI Key: ✅ Present
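
The cell above only prints whether each key is present. If later cells should not run with a missing key, a fail-fast check may help. The following is a minimal sketch, assuming a config object that exposes `GEMINI_API_KEY` and `SERPAPI_KEY` as attributes (as the cell above does); `REQUIRED_KEYS`, `missing_settings`, and `demo_config` are hypothetical names used only for illustration.

```python
from types import SimpleNamespace

# Attribute names are taken from the cell above; the helper itself is hypothetical.
REQUIRED_KEYS = ("GEMINI_API_KEY", "SERPAPI_KEY")

def missing_settings(cfg, required=REQUIRED_KEYS):
    """Return the names of required settings that are absent or empty."""
    return [name for name in required if not getattr(cfg, name, None)]

# Stand-in for the real `config` object so the sketch runs on its own.
demo_config = SimpleNamespace(GEMINI_API_KEY="dummy-key", SERPAPI_KEY="")

missing = missing_settings(demo_config)
if missing:
    print(f"❌ Missing settings: {', '.join(missing)}")
else:
    print("✅ All required settings present")
```

In the notebook, `missing_settings(config)` could be called once after the import, raising an error instead of printing if any name is returned.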


## 2. Gemini Client Testing

In [3]:
# Test Gemini Client initialization and basic functionality
try:
    from gemini_client import GeminiClient
    
    print("🧪 Testing Gemini Client...")
    
    # Initialize client
    gemini_client = GeminiClient(
        api_key=config.GEMINI_API_KEY,
        model=config.GEMINI_MODEL,
        temperature=config.TEMPERATURE
    )
    print("✅ Gemini client initialized")
    
    # Test connection
    connection_test = gemini_client.test_connection()
    print(f"   - Connection test: {'✅ Passed' if connection_test else '❌ Failed'}")
    
    # Get model info
    model_info = gemini_client.get_model_info()
    print(f"   - Model info: {json.dumps(model_info, indent=2)}")
    
    # Test simple invocation
    test_message = "Hello, this is a test message. Please respond with 'Test successful'"
    response = gemini_client.invoke([test_message])
    print(f"   - Test response: {response.content[:100]}...")
    
except Exception as e:
    print(f"❌ Gemini Client test failed: {e}")

2025-07-24 08:49:07,411 - gemini_client - INFO - Initialized Gemini model: gemini-2.5-flash


🧪 Testing Gemini Client...
✅ Gemini client initialized


2025-07-24 08:49:08,840 - gemini_client - INFO - Gemini API connection test successful


   - Connection test: ✅ Passed
   - Model info: {
  "name": "models/gemini-2.5-flash",
  "display_name": "Gemini 2.5 Flash",
  "description": "Stable version of Gemini 2.5 Flash, our mid-size multimodal model that supports up to 1 million tokens, released in June of 2025.",
  "supported_generation_methods": [
    "generateContent",
    "countTokens",
    "createCachedContent",
    "batchGenerateContent"
  ]
}
   - Test response: Test successful...


## 3. Google Scholar API Testing

In [4]:
# Test Google Scholar API functionality
try:
    from scholar_api import GoogleScholarAPI, ScholarResult
    
    print("🧪 Testing Google Scholar API...")
    
    # Initialize API
    scholar_api = GoogleScholarAPI(config.SERPAPI_KEY)
    print("✅ Scholar API initialized")
    
    # Test query refinement
    test_query = "machine learning algorithms"
    refined_query = scholar_api.refine_query(test_query)
    print(f"   - Original query: {test_query}")
    print(f"   - Refined query: {refined_query}")
    
    # Test paper search (async)
    async def test_scholar_search():
        results = await scholar_api.search_papers(refined_query, max_results=3)
        return results
    
    search_results = await test_scholar_search()
    print(f"   - Found {len(search_results)} papers")
    
    # Display first result details
    if search_results:
        first_paper = search_results[0]
        print(f"   - First paper: {first_paper.title[:80]}...")
        print(f"   - Authors: {', '.join(first_paper.authors[:3]) if first_paper.authors else 'Unknown'}")
        print(f"   - Year: {first_paper.year or 'Unknown'}")
        print(f"   - Citations: {first_paper.citation_count}")
        print(f"   - PDF URL: {'✅ Available' if first_paper.pdf_url else '❌ Not available'}")
    
except Exception as e:
    print(f"❌ Scholar API test failed: {e}")

🧪 Testing Google Scholar API...
✅ Scholar API initialized
   - Original query: machine learning algorithms
   - Refined query: machine learning algorithms
   - Found 3 papers
   - First paper: A review of supervised machine learning algorithms...
❌ Scholar API test failed: sequence item 0: expected str instance, dict found
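
The failure above comes from `', '.join(first_paper.authors[:3])`: `str.join` accepts only strings, and the message shows that `authors` holds dictionaries. A minimal sketch of a tolerant formatter follows. It assumes each author dictionary stores the display name under a `name` key, which is not confirmed by the output above; `author_names` and `sample_authors` are hypothetical.

```python
def author_names(authors, limit=3):
    """Return up to `limit` author names from a list of strings and/or dicts."""
    names = []
    for author in authors or []:
        if isinstance(author, dict):
            # Assumption: the dict stores the display name under "name".
            name = author.get("name")
        else:
            name = str(author)
        if name:
            names.append(name)
    return names[:limit]

# Hypothetical sample mixing both shapes.
sample_authors = [{"name": "A. Author", "link": "https://example.org"}, "B. Writer"]
print(f"   - Authors: {', '.join(author_names(sample_authors)) or 'Unknown'}")
```

With such a helper, the `Authors` line in the cell could call `author_names(first_paper.authors)` so that the year, citation count, and PDF checks after it are still reached.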


## 4. Document Processing Testing

In [5]:
# Test Document Processor functionality
try:
    from document_processor import DocumentProcessor, DocumentChunk
    
    print("🧪 Testing Document Processor...")
    
    # Initialize processor
    doc_processor = DocumentProcessor(config.CACHE_DIR)
    print("✅ Document processor initialized")
    print(f"   - Cache directory: {config.CACHE_DIR}")
    
    # Test with a paper that has PDF (if available from search results)
    if 'search_results' in locals() and search_results:
        paper_with_pdf = None
        for paper in search_results:
            if paper.pdf_url:
                paper_with_pdf = paper
                break
        
        if paper_with_pdf:
            print(f"   - Testing with paper: {paper_with_pdf.title[:50]}...")
            
            # Process the paper (async)
            async def test_document_processing():
                chunks = await doc_processor.process_paper(
                    paper_with_pdf.pdf_url, 
                    paper_with_pdf.title
                )
                return chunks
            
            document_chunks = await test_document_processing()
            print(f"   - Extracted {len(document_chunks)} chunks")
            
            if document_chunks:
                first_chunk = document_chunks[0]
                print(f"   - First chunk preview: {first_chunk.text[:100]}...")
                print(f"   - Chunk metadata: {first_chunk.metadata}")
        else:
            print("   - No papers with PDF URLs found for testing")
    else:
        print("   - No search results available for document processing test")
    
except Exception as e:
    print(f"❌ Document processor test failed: {e}")

🧪 Testing Document Processor...
✅ Document processor initialized
   - Cache directory: ./cache
   - Testing with paper: Types of machine learning algorithms...
   - Extracted 170 chunks
   - First chunk preview: Types of Machine Learning Algorithms 19 Types of Machine Learning Algorithms Taiwo Oladipupo Ayodele...
   - Chunk metadata: {'title': 'Types of machine learning algorithms', 'page': 1, 'chunk_id': 'page_1_chunk_0', 'total_pages': 32}
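
The metadata above suggests page-based chunking with ids of the form `page_<n>_chunk_<i>`. The internals of `DocumentProcessor` are not shown in this notebook, so the following is only a sketch of one way such chunks could be produced. The word-window size, the overlap, and the `chunk_pages` function are assumptions.

```python
def chunk_pages(pages, title, chunk_words=200, overlap=20):
    """Split per-page text into overlapping word windows with metadata."""
    if overlap >= chunk_words:
        raise ValueError("overlap must be smaller than chunk_words")
    step = chunk_words - overlap
    chunks = []
    for page_number, page_text in enumerate(pages, start=1):
        words = page_text.split()
        # An empty page yields no chunks because the range is empty.
        for chunk_index, start in enumerate(range(0, len(words), step)):
            chunks.append({
                "text": " ".join(words[start:start + chunk_words]),
                "metadata": {
                    "title": title,
                    "page": page_number,
                    "chunk_id": f"page_{page_number}_chunk_{chunk_index}",
                    "total_pages": len(pages),
                },
            })
    return chunks

# Toy input: one long page and one short page.
demo_pages = ["alpha " * 250, "beta " * 50]
demo_chunks = chunk_pages(demo_pages, "Demo paper")
print(len(demo_chunks), demo_chunks[0]["metadata"])
```

The overlap keeps a sentence that straddles a window boundary visible in both neighbouring chunks, at the cost of some duplicated text in the store.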


## 5. Vector Store Testing

In [6]:
# Test Vector Store functionality
try:
    from vector_store import VectorStore
    
    print("🧪 Testing Vector Store...")
    
    # Initialize vector store
    vector_store = VectorStore(config.VECTOR_DB_PATH, config.EMBEDDING_MODEL)
    print("✅ Vector store initialized")
    
    # Get current stats
    stats = vector_store.get_collection_stats()
    print(f"   - Current stats: {stats}")
    
    # Test adding documents (if we have chunks from previous step)
    if 'document_chunks' in locals() and document_chunks:
        print(f"   - Adding {len(document_chunks)} chunks to vector store...")
        success = vector_store.add_documents(document_chunks)
        print(f"   - Add documents: {'✅ Success' if success else '❌ Failed'}")
        
        # Test similarity search
        test_search_query = "machine learning algorithms"
        search_results_vector = vector_store.similarity_search(test_search_query, k=3)
        print(f"   - Similarity search results: {len(search_results_vector)} found")
        
        if search_results_vector:
            for i, (text, metadata, score) in enumerate(search_results_vector):
                print(f"     Result {i+1}: Score={score:.3f}, Text={text[:80]}...")
    else:
        print("   - No document chunks available for testing")
        
        # Test with sample data
        sample_chunks = [
            DocumentChunk(
                text="Machine learning is a subset of artificial intelligence that focuses on algorithms.",
                metadata={"title": "Sample Paper 1", "page": 1},
                source="test",
                page_number=1
            ),
            DocumentChunk(
                text="Deep learning uses neural networks with multiple layers to process data.",
                metadata={"title": "Sample Paper 2", "page": 1},
                source="test",
                page_number=1
            )
        ]
        
        success = vector_store.add_documents(sample_chunks)
        print(f"   - Add sample documents: {'✅ Success' if success else '❌ Failed'}")
        
        # Test search with sample data
        search_results_vector = vector_store.similarity_search("neural networks", k=2)
        print(f"   - Sample search results: {len(search_results_vector)} found")
    
    # Updated stats
    updated_stats = vector_store.get_collection_stats()
    print(f"   - Updated stats: {updated_stats}")
    
except Exception as e:
    print(f"❌ Vector store test failed: {e}")

🧪 Testing Vector Store...


2025-07-24 08:50:25,684 - sentence_transformers.SentenceTransformer - INFO - Use pytorch device_name: cpu
2025-07-24 08:50:25,685 - sentence_transformers.SentenceTransformer - INFO - Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2
2025-07-24 08:50:38,203 - vector_store - INFO - Initialized vector store with 11290 documents


✅ Vector store initialized
   - Current stats: {'total_chunks': 11290, 'unique_papers': 17, 'papers': ['Exploring large language model based intelligent agents: Definitions, methods, and prospects', 'Logic-based technologies for multi-agent systems: a systematic literature review', 'Exploring autonomous agents through the lens of large language models: A review', 'AI and Agents: State of the Art', 'Longheads: Multi-head attention is secretly a long context processor', 'Exploring self-attention mechanisms for speech separation', 'EEG-transformer: Self-attention from transformer architecture for decoding EEG of imagined speech', 'Architectures and applications of intelligent agents: A survey', 'Agents: An open-source framework for autonomous language agents', 'Multi-modal and multi-agent systems meet rationality: A survey', 'Moh: Multi-head attention as mixture-of-head attention', "Machine learning-based virtual screening and its applications to Alzheimer's drug discovery: a review", 'Un

Batches: 100%|██████████| 6/6 [00:08<00:00,  1.39s/it]
2025-07-24 08:50:48,869 - vector_store - INFO - Added 170 document chunks to vector store


   - Add documents: ✅ Success


Batches: 100%|██████████| 1/1 [00:00<00:00, 31.25it/s]


   - Similarity search results: 3 found
     Result 1: Score=0.634, Text=Types of Machine Learning Algorithms 19 Types of Machine Learning Algorithms Tai...
     Result 2: Score=0.581, Text=Otherwise, it wouldnt be easy for whoever requires that input to figure out what...
     Result 3: Score=0.569, Text=Types of Machine Learning Algorithms 25 Unsupervised learning has produced many ...
   - Updated stats: {'total_chunks': 11460, 'unique_papers': 18, 'papers': ['Exploring large language model based intelligent agents: Definitions, methods, and prospects', 'Logic-based technologies for multi-agent systems: a systematic literature review', 'Exploring autonomous agents through the lens of large language models: A review', 'AI and Agents: State of the Art', 'Longheads: Multi-head attention is secretly a long context processor', 'Exploring self-attention mechanisms for speech separation', 'EEG-transformer: Self-attention from transformer architecture for decoding EEG of imagined speech', '
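
The scores printed by the similarity search are consistent with a cosine-style similarity, although the notebook does not confirm which metric `VectorStore` uses. As an illustration only, the sketch below ranks toy vectors by cosine similarity in pure Python. `cosine_similarity`, `top_k`, and `toy_docs` are hypothetical and unrelated to the real store.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, docs, k=3):
    """docs is a list of (text, vector); returns [(text, score)], best first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in docs]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]

toy_docs = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
for text, score in top_k([1.0, 0.2], toy_docs, k=2):
    print(f"     Result: Score={score:.3f}, Text={text}")
```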

## 6. Research Agent Integration Testing

In [7]:
# Test full Research Agent integration
try:
    from research_agent import ResearchAgent, ResearchState
    
    print("🧪 Testing Research Agent Integration...")
    
    # Initialize research agent
    research_agent = ResearchAgent()
    print("✅ Research agent initialized")
    
    # Test individual components
    test_query = "What are the latest developments in transformer architectures?"
    print(f"   - Test query: {test_query}")
    
    # Test query refinement
    refined_query = research_agent.query_refinement_tool.refine_query(test_query)
    print(f"   - Refined query: {refined_query}")
    
    # Test each step of the workflow
    initial_state = {
        'original_query': test_query,
        'refined_query': '',
        'search_results': [],
        'processed_documents': [],
        'context_chunks': [],
        'final_answer': '',
        'error': None,
        'step': 'initialized'
    }
    
    # Step 1: Query refinement
    state = await research_agent._refine_query(initial_state)
    print(f"   - Step 1 (Refine Query): {state['step']}")
    print(f"     Refined: {state.get('refined_query', 'N/A')[:80]}...")
    
    # Step 2: Search papers
    if not state.get('error'):
        state = await research_agent._search_papers(state)
        print(f"   - Step 2 (Search Papers): {state['step']}")
        print(f"     Found {len(state.get('search_results', []))} papers")
    
    # Step 3: Process documents (limited for testing)
    if not state.get('error') and state.get('search_results'):
        # Limit to first paper for testing
        state['search_results'] = state['search_results'][:1]
        state = await research_agent._process_documents(state)
        print(f"   - Step 3 (Process Documents): {state['step']}")
        print(f"     Processed {len(state.get('processed_documents', []))} papers")
    
    # Step 4: Extract context
    if not state.get('error'):
        state = await research_agent._extract_context(state)
        print(f"   - Step 4 (Extract Context): {state['step']}")
        print(f"     Context chunks: {len(state.get('context_chunks', []))}")
    
    # Step 5: Generate answer
    if not state.get('error'):
        state = await research_agent._generate_answer(state)
        print(f"   - Step 5 (Generate Answer): {state['step']}")
        print(f"     Answer preview: {state.get('final_answer', 'N/A')[:100]}...")
    
    # Print any errors
    if state.get('error'):
        print(f"   - ❌ Error occurred: {state['error']}")
    
except Exception as e:
    print(f"❌ Research agent integration test failed: {e}")

2025-07-24 08:50:50,953 - gemini_client - INFO - Initialized Gemini model: gemini-2.5-flash
2025-07-24 08:50:50,995 - sentence_transformers.SentenceTransformer - INFO - Use pytorch device_name: cpu
2025-07-24 08:50:50,995 - sentence_transformers.SentenceTransformer - INFO - Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


🧪 Testing Research Agent Integration...


2025-07-24 08:50:54,718 - vector_store - INFO - Initialized vector store with 11460 documents
2025-07-24 08:50:56,809 - gemini_client - INFO - Gemini API connection test successful
2025-07-24 08:50:57,201 - research_agent - INFO - Connected to Gemini model: Gemini 2.5 Flash


✅ Research agent initialized
   - Test query: What are the latest developments in transformer architectures?


2025-07-24 08:51:02,857 - research_agent - INFO - Refining query: What are the latest developments in transformer architectures?


   - Refined query: ("transformer architectures" OR "attention mechanisms") AND ("recent advances" OR "novel" OR "state-of-the-art" OR "efficient" OR "scalable")


2025-07-24 08:51:07,304 - research_agent - INFO - Refined query: ("transformer architectures" OR "attention mechanisms" OR "large language models" OR "vision transformers" OR "multimodal transformers") AND ("recent advancements" OR "novel architectures" OR "state-of-the-art" OR "survey" OR "review")
2025-07-24 08:51:07,305 - research_agent - INFO - Searching papers for: ("transformer architectures" OR "attention mechanisms" OR "large language models" OR "vision transformers" OR "multimodal transformers") AND ("recent advancements" OR "novel architectures" OR "state-of-the-art" OR "survey" OR "review")


   - Step 1 (Refine Query): query_refined
     Refined: ("transformer architectures" OR "attention mechanisms" OR "large language models...


2025-07-24 08:51:07,988 - research_agent - INFO - Found 10 papers
2025-07-24 08:51:07,990 - research_agent - INFO - Processing documents


   - Step 2 (Search Papers): papers_found
     Found 10 papers


Batches: 100%|██████████| 98/98 [02:05<00:00,  1.28s/it]
2025-07-24 08:53:31,337 - vector_store - INFO - Added 3110 document chunks to vector store
2025-07-24 08:53:31,349 - research_agent - INFO - Processed paper: Transformers in vision: A survey
2025-07-24 08:53:31,351 - research_agent - INFO - Extracting context


   - Step 3 (Process Documents): documents_processed
     Processed 1 papers


Batches: 100%|██████████| 1/1 [00:00<00:00, 27.79it/s]
2025-07-24 08:53:32,025 - research_agent - INFO - Extracted 10 relevant chunks
2025-07-24 08:53:32,026 - research_agent - INFO - Generating answer


   - Step 4 (Extract Context): context_extracted
     Context chunks: 10


2025-07-24 08:53:44,411 - research_agent - INFO - Answer generated successfully


   - Step 5 (Generate Answer): answer_generated
     Answer preview: I apologize, but the provided context from the research papers is heavily corrupted and unreadable. ...
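
Although every step reported success, the answer preview says the extracted context was unreadable, which points at garbled text coming out of PDF extraction. One possible safeguard is to screen chunks before they are stored or passed to the model. The sketch below is a rough heuristic, not part of the tested code: `readable_ratio`, `looks_readable`, and the 0.85 threshold are assumptions, and the ASCII-based rule would wrongly reject legitimate non-English text.

```python
def readable_ratio(text):
    """Fraction of characters that are ASCII letters, digits, whitespace, or common punctuation."""
    if not text:
        return 0.0
    allowed = sum(
        1 for ch in text
        if ch.isascii() and (ch.isalnum() or ch.isspace() or ch in ".,;:!?'\"()-")
    )
    return allowed / len(text)

def looks_readable(text, threshold=0.85):
    """Heuristic check; the threshold is an arbitrary starting point."""
    return readable_ratio(text) >= threshold

samples = [
    "Transformers use self-attention to model long-range context.",
    "ÿþ§§¤¤\x00\x13%%^^~~||",
]
for sample in samples:
    print(f"{looks_readable(sample)!s:5} {sample[:40]!r}")
```

Counting how many chunks fail such a check per paper would also make it visible when a whole document, rather than a few pages, was extracted badly.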


## 7. Full Workflow Testing

In [8]:
# Test the complete research workflow using the graph
try:
    print("🧪 Testing Complete Research Workflow...")
    
    # Use the compiled graph for end-to-end testing
    if 'research_agent' in locals():
        # Simple test query
        simple_query = "machine learning applications"
        
        # Run the complete workflow
        print(f"   - Running complete workflow for: {simple_query}")
        
        initial_state = {
            'original_query': simple_query,
            'refined_query': '',
            'search_results': [],
            'processed_documents': [],
            'context_chunks': [],
            'final_answer': '',
            'error': None,
            'step': 'start'
        }
        
        # This would run the full graph - for testing, we'll run abbreviated version
        # result = research_agent.graph.invoke(initial_state)
        
        # Instead, let's test the research method if available
        if hasattr(research_agent, 'research'):
            result = await research_agent.research(simple_query)
            print("   - ✅ Workflow completed successfully")
            print(f"   - Final answer: {result.get('final_answer', 'No answer')[:200]}...")
        else:
            print("   - Research method not available, skipping full workflow test")
    else:
        print("   - Research agent not initialized, skipping workflow test")
    
except Exception as e:
    print(f"❌ Full workflow test failed: {e}")

2025-07-24 08:53:44,533 - research_agent - INFO - Refining query: machine learning applications


🧪 Testing Complete Research Workflow...
   - Running complete workflow for: machine learning applications


2025-07-24 08:53:51,946 - research_agent - INFO - Refined query: "machine learning" AND ("applications" OR "use cases" OR "case studies" OR "real-world implementation" OR "industry solutions")
2025-07-24 08:53:51,948 - research_agent - INFO - Searching papers for: "machine learning" AND ("applications" OR "use cases" OR "case studies" OR "real-world implementation" OR "industry solutions")
2025-07-24 08:53:53,262 - research_agent - INFO - Found 10 papers
2025-07-24 08:53:53,264 - research_agent - INFO - Processing documents
2025-07-24 08:53:57,081 - document_processor - ERROR - Error processing PDF: EOF marker not found
Batches: 100%|██████████| 2/2 [00:01<00:00,  1.34it/s]
2025-07-24 08:54:02,673 - vector_store - INFO - Added 40 document chunks to vector store
2025-07-24 08:54:02,674 - research_agent - INFO - Processed paper: An Introduction to Optimization: With Applications to Machine Learning
Batches: 100%|██████████| 11/11 [00:13<00:00,  1.22s/it]
2025-07-24 08:54:19,817 - vector_

   - ✅ Workflow completed successfully
   - Final answer: No answer...
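
The output above reports success even though the fallback text `No answer` was printed, so the cell cannot distinguish a completed run from one that produced nothing. A minimal sketch of a stricter check follows. It assumes the result is a dict using the `final_answer` and `error` keys from the state dictionary in the cell; the actual return shape of `research()` is not shown here, and `summarize_result` is a hypothetical helper.

```python
def summarize_result(result):
    """Return (ok, message) for a workflow result dict."""
    if not isinstance(result, dict):
        return False, f"unexpected result type: {type(result).__name__}"
    if result.get("error"):
        return False, f"workflow reported an error: {result['error']}"
    answer = (result.get("final_answer") or "").strip()
    if not answer:
        return False, "workflow returned no final answer"
    return True, answer[:200]

# Stand-in result mimicking a run that finished without an answer.
ok, message = summarize_result({"final_answer": "", "error": None})
print(f"   - {'✅' if ok else '❌'} {message}")
```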


## 8. Error Handling and Edge Cases Testing

In [9]:
# Test error handling and edge cases
try:
    print("🧪 Testing Error Handling and Edge Cases...")
    
    # Test 1: Empty query
    print("   - Test 1: Empty query")
    try:
        empty_result = research_agent.query_refinement_tool.refine_query("")
        print(f"     Empty query result: {empty_result}")
    except Exception as e:
        print(f"     Expected error for empty query: {e}")
    
    # Test 2: Very long query
    print("   - Test 2: Very long query")
    long_query = "machine learning " * 100  # Very long query
    try:
        long_result = research_agent.query_refinement_tool.refine_query(long_query)
        print(f"     Long query handled: {len(long_result)} chars")
    except Exception as e:
        print(f"     Error with long query: {e}")
    
    # Test 3: Invalid characters in query
    print("   - Test 3: Invalid characters in query")
    invalid_query = "machine learning 🤖 @#$%^&*()"
    try:
        invalid_result = research_agent.query_refinement_tool.refine_query(invalid_query)
        print(f"     Invalid chars handled: {invalid_result[:50]}...")
    except Exception as e:
        print(f"     Error with invalid characters: {e}")
    
    # Test 4: Vector store with no documents
    print("   - Test 4: Vector store search with no documents")
    try:
        empty_vector_store = VectorStore("./test_empty_db", config.EMBEDDING_MODEL)
        empty_search = empty_vector_store.similarity_search("test query", k=5)
        print(f"     Empty vector store search: {len(empty_search)} results")
    except Exception as e:
        print(f"     Error with empty vector store: {e}")
    
    # Test 5: Network error simulation (invalid URL)
    print("   - Test 5: Document processing with invalid URL")
    try:
        invalid_chunks = await doc_processor.process_paper(
            "https://invalid-url-that-does-not-exist.com/paper.pdf",
            "Test Paper"
        )
        print(f"     Invalid URL handled: {len(invalid_chunks)} chunks")
    except Exception as e:
        print(f"     Error with invalid URL: {e}")
    
    print("✅ Error handling tests completed")
    
except Exception as e:
    print(f"❌ Error handling test failed: {e}")

🧪 Testing Error Handling and Edge Cases...
   - Test 1: Empty query
     Empty query result: ("academic search query formulation" OR "scholarly literature search strategy" OR "research query optimization") AND ("information retrieval" OR "literature search methodology")
   - Test 2: Very long query
     Long query handled: 162 chars
   - Test 3: Invalid characters in query


2025-07-24 08:54:54,836 - sentence_transformers.SentenceTransformer - INFO - Use pytorch device_name: cpu
2025-07-24 08:54:54,837 - sentence_transformers.SentenceTransformer - INFO - Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


     Invalid chars handled: ("machine learning" OR "deep learning" OR "artific...
   - Test 4: Vector store search with no documents


2025-07-24 08:54:58,938 - vector_store - INFO - Initialized vector store with 0 documents
Batches: 100%|██████████| 1/1 [00:00<00:00, 62.51it/s]


     Empty vector store search: 0 results
   - Test 5: Document processing with invalid URL




     Invalid URL handled: 0 chunks
✅ Error handling tests completed
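
Test 1 shows that an empty query is not rejected: the refinement step returns a query about search methodology that the user never asked for. If that is not the intended behaviour, input can be validated before refinement. The sketch below is one possible guard; `validate_query` and the 500-character cap are assumptions, not part of the tested code.

```python
def validate_query(query, max_chars=500):
    """Normalize a user query; raise ValueError if it is unusable."""
    if not isinstance(query, str):
        raise ValueError("query must be a string")
    cleaned = " ".join(query.split())  # collapse runs of whitespace
    if not cleaned:
        raise ValueError("query is empty")
    return cleaned[:max_chars]

for candidate in ["", "  machine   learning  "]:
    try:
        print(f"     accepted: {validate_query(candidate)!r}")
    except ValueError as exc:
        print(f"     rejected: {exc}")
```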


## 9. Performance and Resource Testing

In [10]:
# Test performance and resource usage
import time
import psutil
import gc

try:
    print("🧪 Testing Performance and Resource Usage...")
    
    # Memory usage before
    process = psutil.Process()
    memory_before = process.memory_info().rss / 1024 / 1024  # MB
    print(f"   - Memory before tests: {memory_before:.2f} MB")
    
    # Test query refinement speed
    start_time = time.time()
    for i in range(3):
        test_query = f"machine learning test query {i}"
        refined = research_agent.query_refinement_tool.refine_query(test_query)
    refinement_time = time.time() - start_time
    print(f"   - Query refinement (3 queries): {refinement_time:.2f} seconds")
    
    # Test vector search speed
    if 'vector_store' in locals():
        start_time = time.time()
        for i in range(5):
            # Use a separate name so the Scholar `search_results` are not overwritten
            vector_hits = vector_store.similarity_search(f"test query {i}", k=3)
        search_time = time.time() - start_time
        print(f"   - Vector search (5 searches): {search_time:.2f} seconds")
    
    # Memory usage after
    gc.collect()  # Force garbage collection
    memory_after = process.memory_info().rss / 1024 / 1024  # MB
    print(f"   - Memory after tests: {memory_after:.2f} MB")
    print(f"   - Memory difference: {memory_after - memory_before:.2f} MB")
    
    # CPU usage (psutil's first cpu_percent() call with no interval always returns 0.0)
    cpu_percent = process.cpu_percent()
    print(f"   - Current CPU usage: {cpu_percent:.1f}%")
    
except Exception as e:
    print(f"❌ Performance test failed: {e}")

🧪 Testing Performance and Resource Usage...
   - Memory before tests: 776.25 MB
   - Query refinement (3 queries): 21.08 seconds


Batches: 100%|██████████| 1/1 [00:00<00:00, 58.19it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 45.45it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 55.57it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 45.68it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 52.60it/s]

   - Vector search (5 searches): 0.14 seconds





   - Memory after tests: 748.22 MB
   - Memory difference: -28.03 MB
   - Current CPU usage: 0.0%
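
The `0.0%` CPU reading is expected rather than informative: psutil documents that `cpu_percent()` called without an interval compares against the previous call, and that the first such call returns a meaningless `0.0`. As an alternative that needs only the standard library, the sketch below relates CPU time to wall-clock time around a piece of work. `cpu_percent_during` and `busy_work` are hypothetical names.

```python
import time

def cpu_percent_during(func, *args, **kwargs):
    """Run func and return (result, CPU time as a percentage of wall time)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args, **kwargs)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    percent = 100.0 * cpu_used / wall_used if wall_used > 0 else 0.0
    return result, percent

def busy_work(n=200_000):
    """CPU-bound stand-in for the code being measured."""
    return sum(i * i for i in range(n))

total, percent = cpu_percent_during(busy_work)
print(f"   - CPU usage during busy_work: {percent:.1f}%")
```

Because `time.process_time()` sums CPU time across all threads of the process, the figure can exceed 100% for multi-threaded work such as embedding batches.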


## 10. Test Summary and Results

In [11]:
# Generate test summary
print("=" * 60)
print("                    TEST SUMMARY")
print("=" * 60)

test_results = []

# Check which components were created (existence only; this does not detect tests that failed part-way)
components = [
    ('Configuration', 'config' in locals()),
    ('Gemini Client', 'gemini_client' in locals()),
    ('Scholar API', 'scholar_api' in locals()),
    ('Document Processor', 'doc_processor' in locals()),
    ('Vector Store', 'vector_store' in locals()),
    ('Research Agent', 'research_agent' in locals()),
    ('Search Results', 'search_results' in locals() and len(search_results) > 0),
    ('Document Chunks', 'document_chunks' in locals() and len(document_chunks) > 0),
]

for component, status in components:
    status_icon = '✅' if status else '❌'
    print(f"{status_icon} {component}")

print("" + "-" * 60)

# Overall system health
passed_tests = sum(1 for _, status in components if status)
total_tests = len(components)
success_rate = (passed_tests / total_tests) * 100

print(f"Overall Success Rate: {success_rate:.1f}% ({passed_tests}/{total_tests})")

if success_rate >= 80:
    print("🎉 System is functioning well!")
elif success_rate >= 60:
    print("⚠️  System has some issues that need attention.")
else:
    print("🚨 System has significant issues that need immediate attention.")

print("" + "=" * 60)
print(f"Test completed at: {datetime.now()}")
print("" + "=" * 60)

                    TEST SUMMARY
✅ Configuration
✅ Gemini Client
✅ Scholar API
✅ Document Processor
✅ Vector Store
✅ Research Agent
✅ Search Results
✅ Document Chunks
------------------------------------------------------------
Overall Success Rate: 100.0% (8/8)
🎉 System is functioning well!
Test completed at: 2025-07-24 08:55:21.575358
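
The 100% figure above reflects only that each variable exists. The Scholar API cell, for example, raised an exception after `scholar_api` and `search_results` had already been assigned, so it still counts as passed. A minimal sketch of recording outcomes explicitly follows; `test_outcomes`, `record_test`, and the two demo functions are hypothetical and stand in for the notebook's test cells.

```python
test_outcomes = {}

def record_test(name, func):
    """Run func, store True/False under name, and return the outcome."""
    try:
        func()
        test_outcomes[name] = True
    except Exception as exc:  # any exception counts as a failed test
        test_outcomes[name] = False
        print(f"❌ {name} failed: {exc}")
    return test_outcomes[name]

def passing_demo():
    assert 1 + 1 == 2

def failing_demo():
    # Reproduces the error type seen in the Scholar API cell.
    ", ".join([{"name": "A. Author"}])

record_test("Passing demo", passing_demo)
record_test("Failing demo", failing_demo)

passed = sum(test_outcomes.values())
total = len(test_outcomes)
print(f"Overall Success Rate: {100 * passed / total:.1f}% ({passed}/{total})")
```

Wrapping the body of each test cell in a function and passing it to `record_test` would let the summary cell iterate over `test_outcomes` instead of inspecting `locals()`.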
