[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/viincci/Viincci-RAG/blob/main/Test.ipynb)

# Viincci-RAG Complete Testing Notebook

**Comprehensive testing** across all domains (poetry, medical research, botany, art history, carpentry).

Click the badge above to **open in Google Colab** or run locally with Jupyter.

In [None]:

# Viincci-RAG Complete Testing Notebook
# =====================================
# Test all domains: poetry, medical research, botany, art history, and carpentry

# ============================================================================
# SECTION 1: Installation & Setup
# ============================================================================

print("📦 Installing Viincci-RAG from GitHub...")
!pip install -q git+https://github.com/viincci/Viincci-RAG.git

# Install additional dependencies if needed
!pip install -q torch transformers faiss-cpu sentence-transformers

print("✅ Installation complete!")

# ============================================================================
# SECTION 2: Environment Setup
# ============================================================================

import os
from getpass import getpass

# Set up SerpAPI key (required for web research)
print("\n" + "="*80)
print("⚠️  SERPAPI KEY REQUIRED")
print("="*80)
print("You need a SerpAPI key from https://serpapi.com/")
print("Free tier includes 100 searches/month")
print()

serp_api_key = getpass("Enter your SerpAPI key: ")
os.environ['SERP_API_KEY'] = serp_api_key

# ============================================================================
# SECTION 3: Import and Verify
# ============================================================================

print("\n" + "="*80)
print("IMPORTING VIINCCI-RAG MODULES")
print("="*80)

# Import the public classes from the V4 package
from V4 import (
    ConfigManager,
    UniversalResearchSpider,
    RAGSystem,
    UniversalArticleGenerator
)

print("✅ Viincci-RAG imported successfully!")
print("   - ConfigManager")
print("   - UniversalResearchSpider")
print("   - RAGSystem")
print("   - UniversalArticleGenerator")

# ============================================================================
# SECTION 4: Helper Functions
# ============================================================================

def print_section_header(title, test_num):
    """Print a formatted section header"""
    print("\n" + "="*80)
    print(f"TEST {test_num}: {title}")
    print("="*80)

def save_and_preview(content, filename, preview_length=1500):
    """Save content to file and show preview"""
    with open(filename, "w", encoding='utf-8') as f:
        f.write(content)

    print(f"\n📄 Preview of {filename}:")
    print("-" * 80)
    print(content[:preview_length])
    if len(content) > preview_length:
        print("...\n[Content truncated]")
    print("-" * 80)
    print(f"✅ Full content saved to: {filename}")

# ============================================================================
# SECTION 5: Test Case 1 - Poetry Generation
# ============================================================================

print_section_header("Poetry Generation - Edgar Allan Poe Style", 1)

try:
    # Configure for literature domain
    poetry_config = ConfigManager(domain="literature", verbose=False)

    # Research a poet
    spider = UniversalResearchSpider(poetry_config)
    print("\n🔍 Researching Edgar Allan Poe's poetry style and themes...")
    poet_sources = spider.research("Edgar Allan Poe poetry style themes Gothic")

    print(f"✅ Found {len(poet_sources)} sources")

    # Build RAG system with research
    rag_poetry = RAGSystem(poetry_config)
    texts = [s.get('text', '') for s in poet_sources if s.get('text')]
    # Same filter as texts, so each text keeps its own metadata entry
    metadata = [s.get('metadata', {}) for s in poet_sources if s.get('text')]

    if texts:
        rag_poetry.build_index(texts, metadata)
        rag_poetry.load_llm()

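        # Generate a poem
        # NOTE: the recorded run below fails on this call with
        # TypeError: unexpected keyword argument 'content_type'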
        generator = UniversalArticleGenerator(poetry_config, rag_system=rag_poetry)
        poem = generator.generate_full_article(
            "Edgar Allan Poe",
            poet_sources,
            content_type="poem"
        )

        save_and_preview(poem, "gothic_poem.txt", 1000)
    else:
        print("⚠️ No text content found in sources")

except Exception as e:
    print(f"❌ Error in poetry generation: {e}")
    import traceback
    traceback.print_exc()

# ============================================================================
# SECTION 6: Test Case 2 - Medical Research Paper
# ============================================================================

print_section_header("Medical Research Paper - Diabetes Mellitus", 2)

try:
    # Configure for medical domain
    medical_config = ConfigManager(domain="medical", verbose=False)

    # Research diabetes
    spider_medical = UniversalResearchSpider(medical_config)
    print("\n🔍 Researching diabetes mellitus type 2...")
    disease_sources = spider_medical.research("diabetes mellitus type 2 pathophysiology treatment")

    print(f"✅ Found {len(disease_sources)} sources")

    # Build RAG system
    rag_medical = RAGSystem(medical_config)
    texts = [s.get('text', '') for s in disease_sources if s.get('text')]
    # Same filter as texts, so each text keeps its own metadata entry
    metadata = [s.get('metadata', {}) for s in disease_sources if s.get('text')]

    if texts:
        rag_medical.build_index(texts, metadata)
        rag_medical.load_llm()

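        # Generate research paper
        # NOTE: the recorded run below fails on this call with
        # TypeError: unexpected keyword argument 'content_type'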
        generator_medical = UniversalArticleGenerator(medical_config, rag_system=rag_medical)
        research_paper = generator_medical.generate_full_article(
            "diabetes mellitus type 2",
            disease_sources,
            content_type="essay"
        )

        save_and_preview(research_paper, "diabetes_research.txt", 1500)
    else:
        print("⚠️ No text content found in sources")

except Exception as e:
    print(f"❌ Error in medical research: {e}")
    import traceback
    traceback.print_exc()

# ============================================================================
# SECTION 7: Test Case 3 - Botanical Blog Article (HTML)
# ============================================================================

print_section_header("Botanical Blog Article - Rosa rubiginosa (HTML)", 3)

try:
    # Configure for botany domain
    botany_config = ConfigManager(domain="botany", verbose=False)

    # Research a plant
    spider_botany = UniversalResearchSpider(botany_config)
    print("\n🔍 Researching Rosa rubiginosa (Sweet Briar Rose)...")
    plant_sources = spider_botany.research("Rosa rubiginosa Sweet Briar Rose characteristics habitat")

    print(f"✅ Found {len(plant_sources)} sources")

    # Build RAG system
    rag_botany = RAGSystem(botany_config)
    texts = [s.get('text', '') for s in plant_sources if s.get('text')]
    # Same filter as texts, so each text keeps its own metadata entry
    metadata = [s.get('metadata', {}) for s in plant_sources if s.get('text')]

    if texts:
        rag_botany.build_index(texts, metadata)
        rag_botany.load_llm()

        # Generate HTML article
        generator_botany = UniversalArticleGenerator(botany_config, rag_system=rag_botany)
        html_article = generator_botany.generate_full_article(
            "Rosa rubiginosa",
            plant_sources,
            format="html"
        )

        save_and_preview(html_article, "rosa_rubiginosa_blog.html", 1000)

        # Display in Colab
        from IPython.display import HTML, display
        print("\n🌿 Rendering HTML preview in notebook:")
        display(HTML(html_article[:2000]))
    else:
        print("⚠️ No text content found in sources")

except Exception as e:
    print(f"❌ Error in botanical article: {e}")
    import traceback
    traceback.print_exc()

# ============================================================================
# SECTION 8: Test Case 4 - Artist Background Research
# ============================================================================

print_section_header("Artist Biography - Vincent van Gogh", 4)

try:
    # Configure for art history domain
    art_config = ConfigManager(domain="art_history", verbose=False)

    # Research artist
    spider_art = UniversalResearchSpider(art_config)
    print("\n🔍 Researching Vincent van Gogh's life and artistic journey...")
    artist_sources = spider_art.research("Vincent van Gogh biography life artistic journey")

    print(f"✅ Found {len(artist_sources)} sources")

    # Build RAG system
    rag_art = RAGSystem(art_config)
    texts = [s.get('text', '') for s in artist_sources if s.get('text')]
    # Same filter as texts, so each text keeps its own metadata entry
    metadata = [s.get('metadata', {}) for s in artist_sources if s.get('text')]

    if texts:
        rag_art.build_index(texts, metadata)
        rag_art.load_llm()

        # Generate biography
        generator_art = UniversalArticleGenerator(art_config, rag_system=rag_art)
        artist_bio = generator_art.generate_full_article(
            "Vincent van Gogh",
            artist_sources,
            content_type="essay"
        )

        save_and_preview(artist_bio, "van_gogh_biography.txt", 1500)
    else:
        print("⚠️ No text content found in sources")

except Exception as e:
    print(f"❌ Error in artist biography: {e}")
    import traceback
    traceback.print_exc()

# ============================================================================
# SECTION 9: Test Case 5 - Art Movement Article (Plain Text)
# ============================================================================

print_section_header("Art Movement Article - Impressionism", 5)

try:
    # Reuse art history config
    print("\n🔍 Researching Impressionism art movement...")
    movement_sources = spider_art.research("Impressionism art movement history characteristics")

    print(f"✅ Found {len(movement_sources)} sources")

    # Build RAG system
    rag_movement = RAGSystem(art_config)
    texts = [s.get('text', '') for s in movement_sources if s.get('text')]
    # Same filter as texts, so each text keeps its own metadata entry
    metadata = [s.get('metadata', {}) for s in movement_sources if s.get('text')]

    if texts:
        rag_movement.build_index(texts, metadata)
        rag_movement.load_llm()

        # Generate article
        generator_movement = UniversalArticleGenerator(art_config, rag_system=rag_movement)
        movement_article = generator_movement.generate_full_article(
            "Impressionism",
            movement_sources,
            format="text"
        )

        save_and_preview(movement_article, "impressionism_article.txt", 1500)
    else:
        print("⚠️ No text content found in sources")

except Exception as e:
    print(f"❌ Error in art movement article: {e}")
    import traceback
    traceback.print_exc()

# ============================================================================
# SECTION 10: Test Case 6 - Carpentry Content
# ============================================================================

print_section_header("Carpentry Guide - Dovetail Joints", 6)

try:
    # Try carpentry domain, fallback to default if not available
    try:
        carpentry_config = ConfigManager(domain="carpentry", verbose=False)
        print("✅ Using carpentry domain configuration")
    except Exception:
        print("ℹ️  Carpentry domain not found, using default configuration")
        carpentry_config = ConfigManager(verbose=False)

    # Research carpentry topic
    spider_carpentry = UniversalResearchSpider(carpentry_config)
    print("\n🔍 Researching dovetail joints in woodworking...")
    carpentry_sources = spider_carpentry.research("dovetail joints woodworking techniques tutorial")

    print(f"✅ Found {len(carpentry_sources)} sources")

    # Build RAG system
    rag_carpentry = RAGSystem(carpentry_config)
    texts = [s.get('text', '') for s in carpentry_sources if s.get('text')]
    # Same filter as texts, so each text keeps its own metadata entry
    metadata = [s.get('metadata', {}) for s in carpentry_sources if s.get('text')]

    if texts:
        rag_carpentry.build_index(texts, metadata)
        rag_carpentry.load_llm()

        # Generate guide
        generator_carpentry = UniversalArticleGenerator(carpentry_config, rag_system=rag_carpentry)
        carpentry_guide = generator_carpentry.generate_full_article(
            "dovetail joints",
            carpentry_sources,
            content_type="essay"
        )

        save_and_preview(carpentry_guide, "dovetail_joints_guide.txt", 1500)
    else:
        print("⚠️ No text content found in sources")

except Exception as e:
    print(f"❌ Error in carpentry guide: {e}")
    import traceback
    traceback.print_exc()

# ============================================================================
# SECTION 11: Summary
# ============================================================================

print("\n" + "="*80)
print("🎉 TESTING COMPLETE!")
print("="*80)

print("\n📁 Generated Files:")
print("  1. gothic_poem.txt - Poetry inspired by Edgar Allan Poe")
print("  2. diabetes_research.txt - Medical research paper on diabetes")
print("  3. rosa_rubiginosa_blog.html - Botanical blog article (HTML)")
print("  4. van_gogh_biography.txt - Vincent van Gogh artist biography")
print("  5. impressionism_article.txt - Art movement article (plain text)")
print("  6. dovetail_joints_guide.txt - Carpentry guide on dovetail joints")

print("\n💡 Tips:")
print("  - Download files from the Colab file browser (left sidebar)")
print("  - Adjust verbose=True for more detailed output")
print("  - Modify content_type: 'poem', 'essay', 'article', 'guide'")
print("  - Modify format: 'html', 'text', 'json'")

print("\n📊 Available Domains:")
try:
    # List available domains
    config = ConfigManager()
    print("  - botany, medical, mathematics, art_history")
    print("  - literature, music, carpentry (if configured)")
    print("  - See V4/config/domains.json for full list")
except Exception:
    pass

# ============================================================================
# SECTION 12: Interactive Query Example
# ============================================================================

print("\n" + "="*80)
print("BONUS: Interactive RAG Query Example")
print("="*80)

try:
    if 'rag_movement' in locals() and texts:
        query = "What are the main characteristics of Impressionism?"
        print(f"\n❓ Query: {query}")
        result = rag_movement.query(query, k=5)
        print("\n✅ Answer:")
        print("-" * 80)
        print(result[:800] + "..." if len(result) > 800 else result)
        print("-" * 80)
        print("\n💡 You can query any of the RAG systems created above!")
    else:
        print("⚠️ RAG system not available for interactive query")
except Exception as e:
    print(f"⚠️ Could not run interactive query: {e}")

print("\n" + "="*80)
print("✨ All tests completed! Check the generated files above.")
print("="*80)
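
Each test above derives `texts` and `metadata` from the same list of sources before calling `build_index`. The following is a minimal sketch of a helper that does this in one place and keeps the two lists the same length. The name `split_sources` and the sample data are illustrative assumptions and not part of Viincci-RAG; only the `text` and `metadata` keys come from the notebook.

```python
from typing import Any


def split_sources(sources: list[dict[str, Any]]) -> tuple[list[str], list[dict[str, Any]]]:
    """Return (texts, metadata) of equal length, skipping sources without text."""
    usable = [s for s in sources if s.get("text")]
    texts = [s["text"] for s in usable]
    # A source with no metadata key gets an empty dict as a placeholder
    metadata = [s.get("metadata", {}) for s in usable]
    return texts, metadata


# Made-up sources for illustration (not real spider output)
sample = [
    {"text": "First document", "metadata": {"url": "https://example.com/a"}},
    {"text": "", "metadata": {"url": "https://example.com/b"}},
    {"text": "Third document"},
]
texts, metadata = split_sources(sample)
print(len(texts), len(metadata))
```

In the tests, a call such as `texts, metadata = split_sources(poet_sources)` could stand in for the two list comprehensions.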

📦 Installing Viincci-RAG from GitHub...
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
✅ Installation complete!

⚠️  SERPAPI KEY REQUIRED
You need a SerpAPI key from https://serpapi.com/
Free tier includes 100 searches/month


IMPORTING VIINCCI-RAG MODULES
✅ Viincci-RAG imported successfully!
   - ConfigManager
   - UniversalResearchSpider
   - RAGSystem
   - UniversalArticleGenerator

TEST 1: Poetry Generation - Edgar Allan Poe Style

🔍 Researching Edgar Allan Poe's poetry style and themes...

🔬 UNIVERSAL RESEARCH SYSTEM V4
Domain: LITERATURE - Literature Research
Query: Edgar Allan Poe poetry style themes Gothic


💰 Research Cost Estimate: Edgar Allan Poe poetry style themes Gothic
Web Searches: 3
AI Questions: 4
Total Searches Needed: 7

Searches Available: 250
Remaining After: 243
Estimated Operations Remaining: 35

✅ Sufficient credits t

ERROR:V4.Spider:PDF extraction error: 404 Client Error: Not Found for url: https://cilexlawschool.ac.uk/fulldisplay/mJivXA/2S9054/EdgarAllanPoeWritingStyle.pdf


  [3/20] Edgar Allan Poe Style...


ERROR:V4.Spider:PDF extraction error: EOF marker not found
ERROR:V4.Spider:PDF extraction error: HTTPSConnectionPool(host='do-server1.sfs.uwm.edu', port=443): Max retries exceeded with url: /search/D542E04199/slide/D458E35/edgar-allan_poes-complete__poetical-works.pdf (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x787033c4acf0>: Failed to resolve 'do-server1.sfs.uwm.edu' ([Errno -2] Name or service not known)"))


  [4/20] Edgar Allan Poes Complete Poetical Works...
  [5/20] Analysis of Unreliable Narration in Edgar Allan Poe's The .....
    ✓ Extracted from core.ac.uk
  [6/20] Edgar Allan Poe and the Economy of Horror...
    ✓ Extracted from warwick.ac.uk
  [7/20] English I, Grade 9 Intro to Gothic Literature Through Poe...
    ✓ Extracted from minio.la.utexas.edu
  [8/20] an analysis of edgar allan poe and nathaniel hawthorne...




  [9/20] Edgar Allan Poe: Friend of Fear and Master of Madness...
    ✓ Extracted from scalar.usc.edu
  [10/20] The haunting power of Edgar Allan Poe - UChicago News...
    ✓ Extracted from news.uchicago.edu
  [11/20] Eldorado: The Poes in Norfolk...


ERROR:V4.Spider:Error extracting from https://core.ac.uk/outputs/228316674/: HTTPSConnectionPool(host='core.ac.uk', port=443): Read timed out. (read timeout=30)


  [12/20] in Poe...
  [13/20] The Grotesque In Edgar Allan Poe's Fiction - UVM ScholarWork...




  [14/20] Poe's Challenge to Sentimental Literature through Themes of ...




  [15/20] Exploring Allan Poe's Stylistic Distinctiveness from a ......
    ✓ Extracted from www.birmingham.ac.uk
  [16/20] 'Edgar Allan Poe: On the Value of the Popular' by Edward O ....
    ✓ Extracted from blog.bham.ac.uk
  [17/20] Gothic Writing Technique and Yin-Yang Theory in The Fall ......
    ✓ Extracted from core.ac.uk
  [18/20] Introduction: Some Remarks On Poe and His Critics...
    ✓ Extracted from core.ac.uk
  [19/20] 19th Century Gothic Inspiration...




  [20/20] Poe's Gothic Soul in "Metzengerstein"...





✓ Successfully extracted 10 sources

💾 Results saved to: Edgar_Allan_Poe_poetry_style_themes_Gothic_literature_research.json

📊 RESEARCH SUMMARY: Edgar Allan Poe poetry style themes Gothic
Domain: literature
Total Sources: 10

Reliability Distribution:
  • low: 10


✅ Found 10 sources
Loading embedding model: all-MiniLM-L6-v2
Generating embeddings...


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

✓ Index built with 10 vectors
Loading LLM on device: cpu


tokenizer_config.json: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

special_tokens_map.json:   0%|          | 0.00/434 [00:00<?, ?B/s]

chat_template.jinja: 0.00B [00:00, ?B/s]

config.json: 0.00B [00:00, ?B/s]

`torch_dtype` is deprecated! Use `dtype` instead!


model.safetensors:   0%|          | 0.00/2.34G [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/137 [00:00<?, ?B/s]

Device set to use cpu
Traceback (most recent call last):
  File "/tmp/ipython-input-359307281.py", line 108, in <cell line: 0>
    poem = generator.generate_full_article(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: UniversalArticleGenerator.generate_full_article() got an unexpected keyword argument 'content_type'


❌ Error in poetry generation: UniversalArticleGenerator.generate_full_article() got an unexpected keyword argument 'content_type'
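
The `TypeError` above means the installed `generate_full_article` does not accept a `content_type` keyword. A minimal sketch of one way to guard against this is to inspect the callable's signature and drop keywords it does not declare. The helper `supported_kwargs` and the stand-in `fake_generate` are hypothetical and written only for illustration; the real method's parameters are not shown in this notebook and may differ.

```python
import inspect


def supported_kwargs(func, kwargs):
    """Return only the entries of kwargs that func can accept as keywords."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means every keyword is acceptable
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    allowed = {
        name
        for name, p in params.items()
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                      inspect.Parameter.KEYWORD_ONLY)
    }
    return {k: v for k, v in kwargs.items() if k in allowed}


# Stand-in with a made-up signature; not the real generate_full_article
def fake_generate(topic, sources, format="text"):
    return f"{topic} ({format}, {len(sources)} sources)"


kwargs = supported_kwargs(fake_generate, {"content_type": "poem", "format": "html"})
print(kwargs)
print(fake_generate("Edgar Allan Poe", [], **kwargs))
```

Against the real generator, `print(inspect.signature(generator.generate_full_article))` shows which keywords the installed version accepts. Silently dropping a keyword changes what is requested, so logging the dropped names is usually worthwhile.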

TEST 2: Medical Research Paper - Diabetes Mellitus

🔍 Researching diabetes mellitus type 2...

🔬 UNIVERSAL RESEARCH SYSTEM V4
Domain: MEDICAL - Medical Research
Query: diabetes mellitus type 2 pathophysiology treatment


💰 Research Cost Estimate: diabetes mellitus type 2 pathophysiology treatment
Web Searches: 3
AI Questions: 4
Total Searches Needed: 7

Searches Available: 248
Remaining After: 241
Estimated Operations Remaining: 35

✅ Sufficient credits to proceed


📊 SerpAPI Account Status - 2025-11-11 13:59:25
Account: miguelmehgoss@gmail.com
Plan: Free Plan

Status: 🟢 OK
Message: Sufficient searches available: 248 remaining

📊 Usage Statistics:
  • Searches Used: -248
  • Searches Remaining: 248
  • Plan Limit: 0
  • Usage: 0.0%
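
The statistics above show `Searches Used: -248` next to `Plan Limit: 0`. One reading, which is an inference from the output and not something the source confirms, is that the used count is derived as limit minus remaining while the account reports no limit. The sketch below shows how such a calculation could treat a missing or implausible limit as unknown instead of printing a negative number. The function name `usage_stats` and its dictionary keys are hypothetical, and 250 in the second call is only an illustrative limit.

```python
from typing import Optional


def usage_stats(remaining: int, plan_limit: Optional[int]) -> dict:
    """Derive used/percent figures, treating a missing or too-small limit as unknown."""
    if not plan_limit or plan_limit < remaining:
        # Cannot compute usage without a trustworthy limit
        return {"remaining": remaining, "used": None, "limit": None, "percent": None}
    used = plan_limit - remaining
    return {
        "remaining": remaining,
        "used": used,
        "limit": plan_limit,
        "percent": round(100 * used / plan_limit, 1),
    }


unknown = usage_stats(248, 0)
known = usage_stats(248, 250)
print(unknown)
print(known)
```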


📚 Step 1: Searching for sources...
✓ Found 20 potential sources

📄 Step 2: Extracting c

ERROR:V4.Spider:Error extracting from https://0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk/pubmed/23625211?dopt=Abstract: HTTPSConnectionPool(host='0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk', port=443): Max retries exceeded with url: /pubmed/23625211?dopt=Abstract (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x787033b7bd70>, 'Connection to 0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk timed out. (connect timeout=30)'))


  [2/20] Implications for Type 2 Diabetes Mellitus Management...
    ✓ Extracted from pages.ucsd.edu
  [3/20] Genetic drivers of heterogeneity in type 2 diabetes ......
    ✓ Extracted from eprints.gla.ac.uk
  [4/20] Type 2 Diabetes Mellitus: A Pathophysiologic Perspective...


ERROR:V4.Spider:PDF extraction error: EOF marker not found


  [5/20] Diabetes (PDF)...
    ✓ Extracted from www.medschool.lsuhsc.edu
  [6/20] Pathophysiology and Clinical Manifestations | Type 2 diabete...
    ✓ Extracted from u.osu.edu
  [7/20] Pharmacological treatment of hyperglycemia in type 2 diabete...
    ✓ Extracted from search.lib.uconn.edu
  [8/20] Pathophysiology and Treatment of Type 2 Diabetes...
    ✓ Extracted from depts.washington.edu
  [9/20] Prediabetes and Cardiovascular Disease: Pathophysiology ......


ERROR:V4.Spider:Error extracting from https://0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk/pmc/articles/PMC5806140: HTTPSConnectionPool(host='0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk', port=443): Max retries exceeded with url: /pmc/articles/PMC5806140 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x787033b7a600>, 'Connection to 0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk timed out. (connect timeout=30)'))


  [10/20] Type 2 Diabetes Mellitus - Harvard Health...
    ✓ Extracted from www.health.harvard.edu
  [11/20] Pathophysiology of Insulin Resistance and Type II Diabetes ....




  [12/20] Diagnosis and Classification of Diabetes Mellitus - PMC...


ERROR:V4.Spider:Error extracting from https://0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk/pmc/articles/PMC2613584/: HTTPSConnectionPool(host='0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk', port=443): Max retries exceeded with url: /pmc/articles/PMC2613584/ (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7870337d6420>, 'Connection to 0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk timed out. (connect timeout=30)'))


  [13/20] Pathophysiology of Type 2 Diabetes in Children and ......


ERROR:V4.Spider:Error extracting from https://0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk/pmc/articles/PMC7516333: HTTPSConnectionPool(host='0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk', port=443): Max retries exceeded with url: /pmc/articles/PMC7516333 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7870337d6780>, 'Connection to 0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk timed out. (connect timeout=30)'))


  [14/20] A Review of Current Trends with Type 2 Diabetes ......
    ✓ Extracted from onesearch.cumbria.ac.uk
  [15/20] Pathophysiology of Diabetic Dyslipidemia - PMC...


ERROR:V4.Spider:Error extracting from https://0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk/pmc/articles/PMC6143775: HTTPSConnectionPool(host='0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk', port=443): Max retries exceeded with url: /pmc/articles/PMC6143775 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x787023d27320>, 'Connection to 0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk timed out. (connect timeout=30)'))


  [16/20] Pathophysiology of Type 2 Diabetes Mellitus...
    ✓ Extracted from search.library.ucsf.edu
  [17/20] 2. Classification and Diagnosis of Diabetes...
  [18/20] Type 2/Adult-Onset Diabetes < Endocrinology & Metabolism...
    ✓ Extracted from medicine.yale.edu
  [19/20] Type 2 diabetes mellitus - PubMed - NIH...


ERROR:V4.Spider:Error extracting from https://0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk/pubmed/27189025: HTTPSConnectionPool(host='0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk', port=443): Max retries exceeded with url: /pubmed/27189025 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x787023ca4da0>, 'Connection to 0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk timed out. (connect timeout=30)'))


  [20/20] ADDRESSING HYPERTENSION IN THE PATIENT WITH ......


ERROR:V4.Spider:Error extracting from https://0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk/pmc/articles/PMC5679430: HTTPSConnectionPool(host='0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk', port=443): Max retries exceeded with url: /pmc/articles/PMC5679430 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x787023ca5850>, 'Connection to 0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk timed out. (connect timeout=30)'))



✓ Successfully extracted 10 sources

💾 Results saved to: diabetes_mellitus_type_2_pathophysiology_treatment_medical_research.json

📊 RESEARCH SUMMARY: diabetes mellitus type 2 pathophysiology treatment
Domain: medical
Total Sources: 10

Reliability Distribution:
  • low: 10


✅ Found 10 sources
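
In the extraction log above, seven attempts against the same host (`0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk`) each end in a 30-second connect timeout. A minimal sketch of one way to avoid repeating that wait is to remember which hosts have already failed and skip their remaining URLs. This is not how the V4 spider is known to work; `extract_all`, `fake_fetch`, and the example URLs are hypothetical and exist only for illustration.

```python
from urllib.parse import urlparse


def extract_all(urls, fetch, max_failures_per_host=1):
    """Call fetch(url) for each URL, skipping hosts that have already failed."""
    failures = {}
    results, skipped = [], []
    for url in urls:
        host = urlparse(url).netloc
        if failures.get(host, 0) >= max_failures_per_host:
            skipped.append(url)  # do not wait on a host known to be unreachable
            continue
        try:
            results.append(fetch(url))
        except Exception:
            failures[host] = failures.get(host, 0) + 1
    return results, skipped


# Stub fetcher for illustration; a real one would issue the HTTP request
def fake_fetch(url):
    if "unreachable.example" in url:
        raise TimeoutError("connect timeout")
    return f"text from {url}"


urls = [
    "https://unreachable.example/a",
    "https://ok.example/page",
    "https://unreachable.example/b",
]
results, skipped = extract_all(urls, fake_fetch)
print(results)
print(skipped)
```

Raising `max_failures_per_host` tolerates transient errors at the cost of more waiting on hosts that stay down.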
Loading embedding model: all-MiniLM-L6-v2
Generating embeddings...


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

✓ Index built with 10 vectors
Loading LLM on device: cpu


Device set to use cpu
Traceback (most recent call last):
  File "/tmp/ipython-input-359307281.py", line 151, in <cell line: 0>
    research_paper = generator_medical.generate_full_article(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: UniversalArticleGenerator.generate_full_article() got an unexpected keyword argument 'content_type'


❌ Error in medical research: UniversalArticleGenerator.generate_full_article() got an unexpected keyword argument 'content_type'

TEST 3: Botanical Blog Article - Rosa rubiginosa (HTML)

🔍 Researching Rosa rubiginosa (Sweet Briar Rose)...

🔬 UNIVERSAL RESEARCH SYSTEM V4
Domain: BOTANY - Botanical Research
Query: Rosa rubiginosa Sweet Briar Rose characteristics habitat


💰 Research Cost Estimate: Rosa rubiginosa Sweet Briar Rose characteristics habitat
Web Searches: 3
AI Questions: 4
Total Searches Needed: 7

Searches Available: 246
Remaining After: 239
Estimated Operations Remaining: 35

✅ Sufficient credits to proceed


📊 SerpAPI Account Status - 2025-11-11 14:04:08
Account: miguelmehgoss@gmail.com
Plan: Free Plan

Status: 🟢 OK
Message: Sufficient searches available: 246 remaining

📊 Usage Statistics:
  • Searches Used: -246
  • Searches Remaining: 246
  • Plan Limit: 0
  • Usage: 0.0%


📚 Step 1: Searching for sources...
✓ Found 22 potential sources

ERROR:V4.Spider:PDF extraction error: EOF marker not found


  [2/22] Alien plants and their impact on Tristan da Cunha Part 1...
    ✓ Extracted from herbaria.plants.ox.ac.uk
  [3/22] Riparian Assessment and Management Report...
    ✓ Extracted from cpfm.uoregon.edu
  [4/22] Plants of Iowa: A Preliminary List of the Native and Introdu...
    ✓ Extracted from scholarworks.uni.edu
  [5/22] Environmental DNA reveals diversity and abundance of ......
    ✓ Extracted from eprints.worc.ac.uk
  [6/22] Genetic diversity of wild roses (Rosa spp.) in Europe, with ...
    ✓ Extracted from core.ac.uk
  [7/22] AND ROSE SHOWING...
  [8/22] Journal Pre-proof...


ERROR:V4.Spider:PDF extraction error: 404 Client Error: Not Found for url: https://files.core.ac.uk/download/614857469.pdf


  [9/22] tracking the origin of invasive rosa rubiginosa populations ...
    ✓ Extracted from core.ac.uk
  [10/22] Mod 3 Hedgerows...
    ✓ Extracted from nora.nerc.ac.uk
  [11/22] Plants of Iowa: A Preliminary List of the Native and ......


ERROR:V4.Spider:PDF extraction error: EOF marker not found


  [12/22] rural households' perceptions of an invasive alien species r...




  [13/22] THE TSITSA PROJECT Integrated Restoration and ......
    ✓ Extracted from www.ru.ac.za
  [14/22] Rosa agrestis; Small-leaved Sweet Briar - CalPhotos...


ERROR:V4.Spider:Error extracting from https://calphotos.berkeley.edu/cgi/img_query?seq_num=979219&one=T: HTTPSConnectionPool(host='calphotos.berkeley.edu', port=443): Max retries exceeded with url: /cgi/img_query?seq_num=979219&one=T (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x78705bfaf1a0>, 'Connection to calphotos.berkeley.edu timed out. (connect timeout=30)'))


  [15/22] Woody native and exotic species respond differently to ......
    ✓ Extracted from core.ac.uk
  [16/22] MSU Extension Publication Archive Scroll down to view the .....
    ✓ Extracted from archive.lib.msu.edu
  [17/22] (PDF) Shrub management is the principal driver of differing ...
    ✓ Extracted from www.academia.edu
  [18/22] characterizing salinity tolerance in greenhouse roses - OAKT...


ERROR:V4.Spider:PDF extraction error: 403 Client Error: Forbidden for url: https://oaktrust.library.tamu.edu/bitstream/1969.1/ETD-TAMU-2009-05-725/3/SOLIS-PEREZ-DISSERTATION.pdf


  [19/22] N006758RE.pdf - NERC Open Research Archive...
    ✓ Extracted from nora.nerc.ac.uk
  [20/22] the potential conflict of interest associated with the ......





✓ Successfully extracted 12 sources

💾 Results saved to: Rosa_rubiginosa_Sweet_Briar_Rose_characteristics_habitat_botany_research.json

📊 RESEARCH SUMMARY: Rosa rubiginosa Sweet Briar Rose characteristics habitat
Domain: botany
Total Sources: 12

Reliability Distribution:
  • low: 12


✅ Found 12 sources
Loading embedding model: all-MiniLM-L6-v2
Generating embeddings...


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

✓ Index built with 12 vectors
Loading LLM on device: cpu
