# ☑️ ScholarMind - AI Research Assistant

## Team: Sterling Syntax
**Developers:** Suprava Saha Dibya, Abdulla Al Noman  
**Track:** Academic Research Automation  
**Course:** 5-Day AI Agents Intensive with Google  
**Date:** November 2025

---

## ✔️ Architecture Overview

This notebook demonstrates a production-ready multi-agent system for academic research assistance.

### ✔️ System Components

| Component | Purpose |
|-----------|---------|
| **Coordinator Agent** | Orchestrates tasks and maintains conversation context |
| **Paper Search Agent** | Searches arXiv & academic databases for relevant papers |
| **Summarization Agent** | Extracts key findings & methodologies from papers |
| **Comparison Agent** | Compares research methodologies across papers |
| **Literature Review Agent** | Synthesizes findings & generates comprehensive reviews |
| **Citation Manager** | Manages citations (APA, MLA, Chicago, IEEE, Harvard) |

### ✔️ Key Concepts Demonstrated
1. ✅ Function Calling & Custom Tools (6+ specialized tools)
2. ✅ Multi-Agent Architecture with Coordinator
3. ✅ Memory & Context Management
4. ✅ Agent Orchestration & Dynamic Routing
5. ✅ Observability & Comprehensive Logging
6. ✅ Session Export & Persistence (New)
7. ✅ Agent Reset & State Management (New)
8. ✅ Conversation Search & Retrieval (New)
9. ✅ Dynamic Agent Configuration (New)
10. ✅ Batch Query Processing (New)
11. ✅ Memory Summarization & Auto-Management (New)
12. ✅ Feedback Collection & Continuous Improvement (New)
13. ✅ Response Validation & Quality Assurance (New)
14. ✅ Performance Monitoring & Analytics (New)

---

In [30]:
import sys
import os
import time
import json
import re
from datetime import datetime
from typing import Dict, List, Any, Optional
from dataclasses import dataclass, field
import warnings

warnings.filterwarnings('ignore')

import google.generativeai as genai
from google.generativeai.types import FunctionDeclaration, Tool

from IPython.display import display, HTML, clear_output, Markdown

print("✓ Libraries Loaded for Google Colab")

✓ Libraries Loaded for Google Colab


## **API Configuration**

In [None]:
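# Configure Gemini API for Google Colab.
# Assumption: the key is stored as a Colab secret named GOOGLE_API_KEY,
# with an environment variable of the same name as a fallback.
try:
    from google.colab import userdata
    GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
except ImportError:
    GOOGLE_API_KEY = os.environ.get('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_API_KEY)
print("✓ Gemini API Key Configured for Google Colab")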

# Agent Configuration
CONFIG = {
    "team": "Sterling Syntax",
    "developer": "Suprava Saha Dibya, Abdulla Al Noman",
    "model": "models/gemini-2.0-flash",
    "max_tokens": 3000,
    "temperature": 0.4,
    "version": "1.0.0",
    "rate_limit_delay": 2.0
}

print(f"\n{'='*60}")
print(f"{'AGENT CONFIGURATION':^60}")
print(f"{'='*60}")
for k, v in CONFIG.items():
    print(f"{k:.<25} {v}")
print(f"{'='*60}")

✓ Gemini API Key Configured for Google Colab

                    AGENT CONFIGURATION                     
team..................... Sterling Syntax
developer................ Suprava Saha Dibya, Abdulla Al Noman
model.................... models/gemini-2.0-flash
max_tokens............... 3000
temperature.............. 0.4
version.................. 1.0.0
rate_limit_delay......... 2.0
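
`CONFIG` stores `temperature` and `max_tokens`, but the Gemini SDK reads sampling settings from a `generation_config` rather than from arbitrary dictionaries. The following is a minimal sketch of one way to translate between the two, assuming the dict form of `generation_config` with the `temperature` and `max_output_tokens` keys. The helper name `build_generation_config` is hypothetical and is not part of the notebook.

```python
from typing import Any, Dict


def build_generation_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Map the notebook's CONFIG keys onto Gemini generation settings.

    Hypothetical helper: only keys that are present in `config` are copied.
    """
    key_map = {"temperature": "temperature", "max_tokens": "max_output_tokens"}
    return {target: config[source] for source, target in key_map.items() if source in config}


# Values copied from the CONFIG shown above.
example_config = {"model": "models/gemini-2.0-flash", "max_tokens": 3000, "temperature": 0.4}
generation_config = build_generation_config(example_config)
print(generation_config)
```

The resulting dict could then be passed along when constructing the model, for example `genai.GenerativeModel(CONFIG['model'], generation_config=generation_config)`.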


## **Tool Functions**

In [32]:
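def search_arxiv_papers(query: str, max_results: int = 10, category: str = "all") -> str:
    """Ask Gemini to list arXiv papers for a query.

    Note: this prompts the model rather than calling the arXiv API, so the
    returned papers and IDs should be verified before citing them.
    """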
    time.sleep(CONFIG.get('rate_limit_delay', 2.0))  # Rate limiting
    prompt = (
        f"As an academic research assistant, search arXiv and provide relevant papers for:\n\n"
        f"Query: {query}\n"
        f"Category: {category}\n"
        f"Max Results: {max_results}\n\n"
        f"Provide: Paper titles, authors, publication dates, arXiv IDs, abstracts (brief), "
        f"and relevance scores. Format as a structured list with key details."
    )
    model = genai.GenerativeModel(CONFIG['model'])
    return model.generate_content(prompt).text


def summarize_paper(paper_title: str, paper_abstract: str, focus_areas: str = "general") -> str:
    """Summarize a paper from its title and abstract via a Gemini prompt"""
    time.sleep(CONFIG.get('rate_limit_delay', 2.0))  # Rate limiting
    prompt = (
        f"Provide a comprehensive academic summary of this paper:\n\n"
        f"Title: {paper_title}\n"
        f"Abstract: {paper_abstract}\n"
        f"Focus Areas: {focus_areas}\n\n"
        f"Include: Main research question, methodology overview, key findings, "
        f"contributions to the field, limitations, and future work suggestions. "
        f"Use academic language and be precise."
    )
    model = genai.GenerativeModel(CONFIG['model'])
    return model.generate_content(prompt).text


def compare_methodologies(paper1_info: str, paper2_info: str, comparison_aspect: str = "general") -> str:
    """Compare the methodologies of two papers via a Gemini prompt"""
    time.sleep(CONFIG.get('rate_limit_delay', 2.0))  # Rate limiting
    prompt = (
        f"Compare the methodologies of these two research papers:\n\n"
        f"Paper 1: {paper1_info}\n\n"
        f"Paper 2: {paper2_info}\n\n"
        f"Comparison Aspect: {comparison_aspect}\n\n"
        f"Analyze: Research design, data collection methods, analytical techniques, "
        f"experimental setup, validation approaches, strengths and weaknesses of each, "
        f"and which methodology is more appropriate for specific research contexts."
    )
    model = genai.GenerativeModel(CONFIG['model'])
    return model.generate_content(prompt).text


def generate_literature_review(topic: str, papers_summary: str, review_length: str = "medium") -> str:
    """Generate a literature review on a topic from paper summaries via a Gemini prompt"""
    time.sleep(CONFIG.get('rate_limit_delay', 2.0))  # Rate limiting
    prompt = (
        f"Generate a comprehensive literature review on:\n\n"
        f"Topic: {topic}\n"
        f"Papers Summary: {papers_summary}\n"
        f"Length: {review_length}\n\n"
        f"Structure: Introduction to the topic, thematic organization of literature, "
        f"synthesis of key findings, identification of research gaps, critical analysis, "
        f"trends and patterns, and future research directions. Use formal academic style."
    )
    model = genai.GenerativeModel(CONFIG['model'])
    return model.generate_content(prompt).text


def manage_citations(papers_list: str, citation_style: str = "APA", action: str = "generate") -> str:
    """Format citations for the given papers in the requested style via a Gemini prompt"""
    time.sleep(CONFIG.get('rate_limit_delay', 2.0))  # Rate limiting
    prompt = (
        f"Citation Management Task:\n\n"
        f"Papers: {papers_list}\n"
        f"Citation Style: {citation_style}\n"
        f"Action: {action}\n\n"
        f"Generate properly formatted citations in {citation_style} style. "
        f"Include: Author names, publication year, title, journal/conference, "
        f"DOI/arXiv ID. Organize alphabetically and follow {citation_style} guidelines precisely."
    )
    model = genai.GenerativeModel(CONFIG['model'])
    return model.generate_content(prompt).text


def extract_research_insights(papers_text: str, insight_type: str = "trends") -> str:
    """Extract themes, trends, and gaps from the text of several papers via a Gemini prompt"""
    time.sleep(CONFIG.get('rate_limit_delay', 2.0))  # Rate limiting
    prompt = (
        f"Analyze multiple papers and extract insights:\n\n"
        f"Papers Content: {papers_text}\n"
        f"Insight Type: {insight_type}\n\n"
        f"Extract: Common themes, emerging trends, contradicting findings, "
        f"consensus areas, research gaps, methodological innovations, "
        f"and potential future directions. Provide evidence-based analysis."
    )
    model = genai.GenerativeModel(CONFIG['model'])
    return model.generate_content(prompt).text


print("✓ 6 Tool Functions Defined")
print("  • search_arxiv_papers")
print("  • summarize_paper")
print("  • compare_methodologies")
print("  • generate_literature_review")
print("  • manage_citations")
print("  • extract_research_insights")


✓ 6 Tool Functions Defined
  • search_arxiv_papers
  • summarize_paper
  • compare_methodologies
  • generate_literature_review
  • manage_citations
  • extract_research_insights
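
All six tools delegate to Gemini, so `search_arxiv_papers` does not query arXiv itself. For comparison, here is a rough sketch of how a query URL for the public arXiv API could be built and how its Atom response could be parsed with the standard library. The helper names `build_arxiv_url` and `parse_arxiv_feed` are hypothetical, and the sample feed is made up purely for illustration; none of this is part of the notebook.

```python
import urllib.parse
import xml.etree.ElementTree as ET
from typing import Dict, List

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"


def build_arxiv_url(query: str, max_results: int = 10, category: str = "all") -> str:
    """Build an arXiv API query URL (hypothetical helper)."""
    search = f"all:{query}" if category == "all" else f"cat:{category} AND all:{query}"
    params = {"search_query": search, "start": 0, "max_results": max_results}
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_arxiv_feed(xml_text: str) -> List[Dict[str, object]]:
    """Extract id, title, publication date, and authors from an Atom feed."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.findall(f"{ATOM}entry"):
        papers.append({
            "id": entry.findtext(f"{ATOM}id", default="").strip(),
            # Titles may be wrapped across lines, so collapse the whitespace.
            "title": " ".join(entry.findtext(f"{ATOM}title", default="").split()),
            "published": entry.findtext(f"{ATOM}published", default="").strip(),
            "authors": [a.findtext(f"{ATOM}name", default="") for a in entry.findall(f"{ATOM}author")],
        })
    return papers


# Made-up sample feed for illustration only. A real response would come from
# fetching build_arxiv_url(...) over HTTP.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <title>An Example
      Paper Title</title>
    <published>2025-01-01T00:00:00Z</published>
    <author><name>A. Author</name></author>
    <author><name>B. Author</name></author>
  </entry>
</feed>"""

papers = parse_arxiv_feed(SAMPLE_FEED)
url = build_arxiv_url("transformer architectures", max_results=5, category="cs.CL")
print(url)
print(papers)
```

Grounding the search tool in real API results, and using Gemini only to rank or summarize them, would reduce the risk of fabricated titles and arXiv IDs.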


## **Function Declarations**

In [33]:
function_declarations = [
    FunctionDeclaration(
        name="search_arxiv_papers",
        description="Searches arXiv and academic databases for relevant research papers",
        parameters={
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query for papers"},
                "max_results": {"type": "integer", "description": "Maximum number of results (default: 10)"},
                "category": {"type": "string", "description": "Academic category (e.g., cs.AI, cs.LG, stat.ML, all)"}
            },
            "required": ["query"]
        }
    ),
    FunctionDeclaration(
        name="summarize_paper",
        description="Summarizes key findings and contributions from a research paper",
        parameters={
            "type": "object",
            "properties": {
                "paper_title": {"type": "string", "description": "Title of the paper"},
                "paper_abstract": {"type": "string", "description": "Abstract or full text of the paper"},
                "focus_areas": {"type": "string", "description": "Specific aspects to focus on (e.g., methodology, results, implications)"}
            },
            "required": ["paper_title", "paper_abstract"]
        }
    ),
    FunctionDeclaration(
        name="compare_methodologies",
        description="Compares research methodologies across multiple papers",
        parameters={
            "type": "object",
            "properties": {
                "paper1_info": {"type": "string", "description": "First paper details (title, methodology)"},
                "paper2_info": {"type": "string", "description": "Second paper details (title, methodology)"},
                "comparison_aspect": {"type": "string", "description": "Aspect to compare (e.g., data collection, analysis, experimental design)"}
            },
            "required": ["paper1_info", "paper2_info"]
        }
    ),
    FunctionDeclaration(
        name="generate_literature_review",
        description="Generates a comprehensive literature review from multiple papers",
        parameters={
            "type": "object",
            "properties": {
                "topic": {"type": "string", "description": "Research topic for the literature review"},
                "papers_summary": {"type": "string", "description": "Summary of papers to include"},
                "review_length": {"type": "string", "description": "Length of review (short, medium, comprehensive)"}
            },
            "required": ["topic", "papers_summary"]
        }
    ),
    FunctionDeclaration(
        name="manage_citations",
        description="Manages citations and generates bibliographies in various styles",
        parameters={
            "type": "object",
            "properties": {
                "papers_list": {"type": "string", "description": "List of papers with details"},
                "citation_style": {"type": "string", "description": "Citation style (APA, MLA, Chicago, IEEE, Harvard)"},
                "action": {"type": "string", "description": "Action to perform (generate, format, organize)"}
            },
            "required": ["papers_list"]
        }
    ),
    FunctionDeclaration(
        name="extract_research_insights",
        description="Extracts insights, trends, and patterns from multiple research papers",
        parameters={
            "type": "object",
            "properties": {
                "papers_text": {"type": "string", "description": "Text content from multiple papers"},
                "insight_type": {"type": "string", "description": "Type of insights to extract (trends, gaps, consensus, contradictions)"}
            },
            "required": ["papers_text"]
        }
    )
]

tools = Tool(function_declarations=function_declarations)
print(f"✓ Function Declarations Created ({len(function_declarations)} tools)")

✓ Function Declarations Created (6 tools)
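
Each declaration has to stay in sync with the Python function it describes, because the agent later calls the function with the arguments the model supplies. The sketch below shows one possible consistency check based on `inspect.signature`. The helper `check_schema_matches` and the stand-in `search_stub` are hypothetical and work on plain dict schemas, not on `FunctionDeclaration` objects.

```python
import inspect
from typing import Any, Callable, Dict, List


def check_schema_matches(func: Callable[..., Any], schema: Dict[str, Any]) -> List[str]:
    """Return mismatches between a function signature and a JSON-style schema."""
    problems = []
    params = inspect.signature(func).parameters
    declared = set(schema.get("properties", {}))
    required = set(schema.get("required", []))

    for name in declared - set(params):
        problems.append(f"declared but not accepted by function: {name}")
    for name in set(params) - declared:
        problems.append(f"accepted by function but not declared: {name}")
    # Parameters without a default value must be listed as required.
    no_default = {n for n, p in params.items() if p.default is inspect.Parameter.empty}
    for name in no_default - required:
        problems.append(f"has no default but is not required: {name}")
    return sorted(problems)


def search_stub(query: str, max_results: int = 10, category: str = "all") -> str:
    """Stand-in with the same signature as search_arxiv_papers."""
    return query


search_schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "max_results": {"type": "integer"},
        "category": {"type": "string"},
    },
    "required": ["query"],
}

problems = check_schema_matches(search_stub, search_schema)
print(problems)
```

An empty list means the schema and the signature agree; any entry points to an argument that would either be rejected by the function or never offered to the model.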


## **Memory System**

In [34]:
@dataclass
class ConversationMemory:
    """Manages conversation history and context"""
    messages: List[Dict[str, str]] = field(default_factory=list)
    max_history: int = 20

    def add_message(self, role: str, content: str):
        self.messages.append({
            "role": role,
            "content": content,
            "timestamp": datetime.now().isoformat()
        })
        if len(self.messages) > self.max_history:
            self.messages = self.messages[-self.max_history:]

    def get_context(self) -> str:
        if not self.messages:
            return "No previous conversation."
        context = "Recent conversation:\n"
        for msg in self.messages[-5:]:
            context += f"{msg['role']}: {msg['content'][:100]}...\n"
        return context

    def clear(self):
        self.messages.clear()

    def get_stats(self) -> Dict[str, Any]:
        return {
            "total_messages": len(self.messages),
            "user_messages": sum(1 for m in self.messages if m['role'] == 'user'),
            "agent_messages": sum(1 for m in self.messages if m['role'] == 'agent')
        }

memory = ConversationMemory(max_history=20)
print(f"✓ Memory System Initialized (Max: {memory.max_history} messages)")

✓ Memory System Initialized (Max: 20 messages)
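
The trimming in `add_message` keeps only the most recent `max_history` messages. The same sliding-window behaviour can be sketched with `collections.deque` and its `maxlen` argument. This is a standalone illustration with a window of 3, not the notebook's `ConversationMemory` class.

```python
from collections import deque

# Keep only the 3 most recent messages; older ones are dropped automatically.
history = deque(maxlen=3)
for i in range(5):
    history.append({"role": "user", "content": f"message {i}"})

contents = [m["content"] for m in history]
print(contents)  # ['message 2', 'message 3', 'message 4']
```

One trade-off: a `deque` does not support slicing, so a `get_context`-style lookup of the last five messages would need `list(history)[-5:]`.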


## **Logging System**

In [35]:
@dataclass
class AgentLogger:
    """Comprehensive logging for agent operations"""
    logs: List[Dict[str, Any]] = field(default_factory=list)

    def log(self, level: str, event: str, details: Optional[Dict[str, Any]] = None):
        self.logs.append({
            "timestamp": datetime.now().isoformat(),
            "level": level,
            "event": event,
            "details": details or {}
        })

    def info(self, event: str, **kwargs):
        self.log("INFO", event, kwargs)

    def error(self, event: str, **kwargs):
        self.log("ERROR", event, kwargs)

    def warning(self, event: str, **kwargs):
        self.log("WARNING", event, kwargs)

    def get_recent_logs(self, count: int = 10) -> List[Dict]:
        return self.logs[-count:]

    def get_stats(self) -> Dict[str, Any]:
        return {
            "total_logs": len(self.logs),
            "info_count": sum(1 for log in self.logs if log['level'] == 'INFO'),
            "error_count": sum(1 for log in self.logs if log['level'] == 'ERROR'),
            "warning_count": sum(1 for log in self.logs if log['level'] == 'WARNING')
        }

    def export_logs(self, filename: str = "agent_logs.json"):
        with open(filename, 'w') as f:
            json.dump(self.logs, f, indent=2)
        print(f"✓ Logs exported to {filename}")

logger = AgentLogger()
logger.info("Logger initialized")
print("✓ Logging System Ready")

✓ Logging System Ready


## **Main Agent Class**

In [36]:
class ScholarMindAgent:
    """Main orchestrating agent for ScholarMind research assistance"""

    def __init__(self, config: Dict, tools: Tool, memory: ConversationMemory, logger: AgentLogger):
        self.config = config
        self.tools = tools
        self.memory = memory
        self.logger = logger
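        # Apply the sampling settings from CONFIG; without generation_config
        # the temperature and max_tokens values would be ignored.
        self.model = genai.GenerativeModel(
            model_name=config['model'],
            tools=[tools],
            generation_config={
                "temperature": config['temperature'],
                "max_output_tokens": config['max_tokens'],
            },
        )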

        self.stats = {
            "queries_processed": 0,
            "tools_called": 0,
            "total_response_time": 0.0,
            "errors": 0
        }
        self.logger.info("Agent initialized", model=config['model'])

    def _call_function(self, function_call) -> str:
        """Execute tool function and return result"""
        function_name = function_call.name
        function_args = dict(function_call.args)

        self.logger.info("Function called", function=function_name, args=str(function_args)[:100])

        function_map = {
            "search_arxiv_papers": search_arxiv_papers,
            "summarize_paper": summarize_paper,
            "compare_methodologies": compare_methodologies,
            "generate_literature_review": generate_literature_review,
            "manage_citations": manage_citations,
            "extract_research_insights": extract_research_insights
        }

        if function_name in function_map:
            try:
                result = function_map[function_name](**function_args)
                self.stats["tools_called"] += 1
                return result
            except Exception as e:
                self.logger.error("Function execution failed", error=str(e))
                return f"Error executing {function_name}: {str(e)}"
        return f"Unknown function: {function_name}"

    def run(self, user_query: str) -> str:
        start_time = time.time()
        max_retries = 3
        retry_delay = 3.0

        try:
            self.logger.info("Query received", query=user_query[:100])
            self.memory.add_message("user", user_query)

            system_prompt = f"""You are an expert Academic Research Assistant.

Capabilities: Paper search (arXiv, Google Scholar), summarization, methodology comparison, literature review generation, citation management, research insights extraction

Context: {self.memory.get_context()}

Provide precise, academic-quality responses with proper citations and structured analysis."""

            # Start chat with retry logic
            for attempt in range(max_retries):
                try:
                    chat = self.model.start_chat()
                    response = chat.send_message(f"{system_prompt}\n\nUser Query: {user_query}")
                    break
                except Exception as e:
                    if '429' in str(e) and attempt < max_retries - 1:
                        wait_time = retry_delay * (2 ** attempt)
                        print(f"⏳ Rate limit hit. Waiting {wait_time:.0f}s before retry {attempt + 2}/{max_retries}...")
                        time.sleep(wait_time)
                    else:
                        raise

            # Check if function was called
            function_calls = []
            if response.candidates and response.candidates[0].content.parts:
                for part in response.candidates[0].content.parts:
                    if hasattr(part, 'function_call') and part.function_call:
                        function_calls.append(part.function_call)

            # Execute functions and get results
            if function_calls:
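                function_responses = []
                for fc in function_calls:
                    result = self._call_function(fc)
                    # Wrap each result as a FunctionResponse part so the model
                    # treats it as tool output rather than as a new user message.
                    function_responses.append(
                        genai.protos.Part(
                            function_response=genai.protos.FunctionResponse(
                                name=fc.name, response={"result": result}
                            )
                        )
                    )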

                # Send function results back to model with retry logic
                for attempt in range(max_retries):
                    try:
                        response = chat.send_message(function_responses)
                        break
                    except Exception as e:
                        if '429' in str(e) and attempt < max_retries - 1:
                            wait_time = retry_delay * (2 ** attempt)
                            print(f"⏳ Rate limit hit. Waiting {wait_time:.0f}s before retry {attempt + 2}/{max_retries}...")
                            time.sleep(wait_time)
                        else:
                            raise

            # Extract final text response
            try:
                response_text = response.text
            except Exception:
                if hasattr(response, 'candidates') and response.candidates:
                    parts = response.candidates[0].content.parts
                    response_text = ""
                    for part in parts:
                        if hasattr(part, 'text') and part.text:
                            response_text += part.text
                    if not response_text:
                        response_text = "Response generated successfully."
                else:
                    response_text = "Unable to extract response."

            self.memory.add_message("agent", response_text)

            elapsed = time.time() - start_time
            self.stats["queries_processed"] += 1
            self.stats["total_response_time"] += elapsed

            self.logger.info("Query completed", response_time=f"{elapsed:.2f}s")
            return response_text

        except Exception as e:
            self.stats["errors"] += 1
            self.logger.error("Query failed", error=str(e))
            return f"Error: {str(e)}"

    def get_stats(self) -> Dict[str, Any]:
        avg_response_time = (
            self.stats["total_response_time"] / self.stats["queries_processed"]
            if self.stats["queries_processed"] > 0 else 0
        )

        return {
            **self.stats,
            "avg_response_time": round(avg_response_time, 2),
            "memory_stats": self.memory.get_stats(),
            "logger_stats": self.logger.get_stats()
        }

    def reset(self):
        self.memory.clear()
        self.stats = {"queries_processed": 0, "tools_called": 0, "total_response_time": 0.0, "errors": 0}
        self.logger.info("Agent reset")

if GOOGLE_API_KEY:
    agent = ScholarMindAgent(config=CONFIG, tools=tools, memory=memory, logger=logger)
    print("✓ ScholarMind Agent Initialized")
    print("✓ Ready for Academic Research Assistance")
    print("✓ Powered by Team @Sterling Syntax")
else:
    agent = None
    print("⚠ Agent initialization skipped - Configure API key")

✓ ScholarMind Agent Initialized
✓ Ready for Academic Research Assistance
✓ Powered by Team @Sterling Syntax
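
`run` repeats the same retry loop twice: once for the initial message and once when sending tool results back. The backoff logic could be factored into a single helper, sketched below. The name `call_with_backoff` and the injectable `sleep` parameter are hypothetical additions for illustration; the "429" check mirrors the notebook's own heuristic.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    max_retries: int = 3,
    retry_delay: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on rate-limit errors with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as exc:
            # Same heuristic as the notebook: "429" in the message means rate limited.
            if "429" in str(exc) and attempt < max_retries - 1:
                sleep(retry_delay * (2 ** attempt))
            else:
                raise
    raise RuntimeError("max_retries must be at least 1")


# Demonstration with a fake call that is rate limited twice before succeeding.
waits = []
attempts = {"count": 0}


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"


# Record the requested waits instead of actually sleeping.
result = call_with_backoff(flaky, sleep=waits.append)
print(result, waits)  # ok [3.0, 6.0]
```

Inside the agent, a call such as `call_with_backoff(lambda: chat.send_message(message))` could replace each of the two loops.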


## **Test the Agent**

In [37]:
def test_agent(query: str):
    """Test agent with a query"""
    if not agent:
        print("⚠ Agent not initialized")
        return

    print(f"\n{'='*60}")
    print(f"USER: {query}")
    print(f"{'='*60}\n")

    response = agent.run(query)

    print("AGENT RESPONSE:")
    print(f"{'-'*60}")
    display(Markdown(response))
    print(f"{'='*60}\n")

print("✓ Test function ready")
print("📌 Usage: test_agent('your research question here')")

✓ Test function ready
📌 Usage: test_agent('your research question here')


In [54]:
# Example test query
test_agent("Find recent papers on transformer architectures in natural language processing")


USER: Find recent papers on transformer architectures in natural language processing

AGENT RESPONSE:
------------------------------------------------------------


Okay, that's a good starting point. Based on the relevance scores you provided, the most relevant papers appear to be:

1.  **A Survey of Transformers** (Relevance Score: 5) - A comprehensive overview of Transformer architectures.
2.  **Efficient Reasoning for Unseen Question Answering via Memory-Augmented Transformers** (Relevance Score: 4) - Focuses on memory-augmented Transformers for question answering.
3.  **An Empirical Study of Prompt Engineering for Transformer-Based Models in Source Code Understanding** (Relevance Score: 4) - Examines prompt engineering techniques for Transformers in source code understanding.
4.  **Instruction Tuning with Retrieval Augmentation for Long-Form Question Answering** (Relevance Score: 4) - Explores instruction tuning and retrieval augmentation for Transformers in question answering.

To provide a more useful response, let's focus on these top 4 papers.  I will now provide summaries of each and then perform a comparison of their methodologies. Would you like me to proceed in that order?





## **Statistics Dashboard**

In [39]:
def display_statistics():
    """Display agent performance metrics"""
    if not agent:
        print("⚠ Agent not initialized")
        return

    stats = agent.get_stats()

    print(f"\n{'='*60}")
    print(f"{'AGENT PERFORMANCE DASHBOARD':^60}")
    print(f"{'='*60}")

    print(f"\n📊 Query Statistics:")
    print(f"  Total Queries: {stats['queries_processed']}")
    print(f"  Tools Called: {stats['tools_called']}")
    print(f"  Avg Response Time: {stats['avg_response_time']:.2f}s")
    print(f"  Errors: {stats['errors']}")

    print(f"\n💭 Memory Statistics:")
    mem = stats['memory_stats']
    print(f"  Total Messages: {mem['total_messages']}")
    print(f"  User Messages: {mem['user_messages']}")
    print(f"  Agent Messages: {mem['agent_messages']}")

    print(f"\n📝 Logger Statistics:")
    log = stats['logger_stats']
    print(f"  Total Logs: {log['total_logs']}")
    print(f"  Info: {log['info_count']} | Warning: {log['warning_count']} | Error: {log['error_count']}")

    print(f"{'='*60}\n")

if agent:
    display_statistics()


                AGENT PERFORMANCE DASHBOARD                 

📊 Query Statistics:
  Total Queries: 1
  Tools Called: 1
  Avg Response Time: 34.93s
  Errors: 0

💭 Memory Statistics:
  Total Messages: 2
  User Messages: 1
  Agent Messages: 1

📝 Logger Statistics:
  Total Logs: 5



## **Utility Functions**

In [40]:
def export_conversation_history(filename="research_conversation.txt"):
    """Export conversation history to file"""
    if not agent:
        print("⚠ Agent not initialized")
        return None
    try:
        stats = agent.get_stats()
        memory_stats = stats['memory_stats']
        with open(filename, 'w', encoding='utf-8') as f:
            f.write("=" * 60 + "\n")
            f.write("SCHOLARMIND AI - CONVERSATION HISTORY\n")
            f.write(f"Team: {CONFIG['team']} | Developers: {CONFIG['developer']}\n")
            f.write("=" * 60 + "\n\n")
            f.write(f"Session Statistics:\n")
            f.write(f"  Total Queries: {stats['queries_processed']}\n")
            f.write(f"  Tools Called: {stats['tools_called']}\n")
            f.write(f"  Average Response Time: {stats['avg_response_time']:.2f}s\n")
            f.write(f"  Errors: {stats['errors']}\n\n")
            f.write("=" * 60 + "\n")
            f.write("CONVERSATION LOG\n")
            f.write("=" * 60 + "\n\n")
            for msg in agent.memory.messages:
                role = msg['role'].upper()
                timestamp = msg.get('timestamp', 'N/A')
                content = msg['content']
                f.write(f"[{timestamp}] {role}:\n")
                f.write(f"{content}\n")
                f.write("-" * 60 + "\n\n")
            f.write("=" * 60 + "\n")
            f.write(f"Total Messages: {memory_stats['total_messages']}\n")
            f.write(f"User Messages: {memory_stats['user_messages']}\n")
            f.write(f"Agent Messages: {memory_stats['agent_messages']}\n")
            f.write("=" * 60 + "\n")
        print(f"✓ Conversation history exported to: {filename}")
        print(f"📊 Total messages: {memory_stats['total_messages']}")
        return filename
    except Exception as e:
        print(f"❌ Error exporting conversation: {str(e)}")
        return None

def reset_agent():
    """Reset agent memory and statistics"""
    if not agent:
        print("⚠ Agent not initialized")
        return False
    try:
        old_stats = agent.get_stats()
        agent.reset()
        print("=" * 60)
        print("AGENT RESET SUCCESSFUL")
        print("=" * 60)
        print(f"\n📊 Previous Session Stats:")
        print(f"  Total Queries: {old_stats['queries_processed']}")
        print(f"  Tools Called: {old_stats['tools_called']}")
        print(f"  Avg Response Time: {old_stats['avg_response_time']:.2f}s")
        print(f"  Errors: {old_stats['errors']}")
        print(f"\n🔄 Agent memory and statistics cleared")
        print(f"✓ Ready for new research session\n")
        return True
    except Exception as e:
        print(f"❌ Error resetting agent: {str(e)}")
        return False

def search_conversation(keyword):
    """Search conversation history for keyword"""
    if not agent:
        print("⚠ Agent not initialized")
        return []
    try:
        keyword_lower = keyword.lower()
        results = []
        for idx, msg in enumerate(agent.memory.messages):
            if keyword_lower in msg['content'].lower():
                results.append({
                    'index': idx,
                    'role': msg['role'],
                    'timestamp': msg.get('timestamp', 'N/A'),
                    'content': msg['content']
                })
        if results:
            print(f"🔍 Found {len(results)} message(s) containing '{keyword}':\n")
            for r in results:
                print(f"Message #{r['index']} [{r['role'].upper()}]:")
                print(f"  Timestamp: {r['timestamp']}")
                print(f"  Preview: {r['content'][:200]}...")
                print("-" * 60)
        else:
            print(f"❌ No messages found containing '{keyword}'")
        return results
    except Exception as e:
        print(f"❌ Error searching conversation: {str(e)}")
        return []

def batch_query(questions):
    """Process multiple queries in batch"""
    if not agent:
        print("⚠ Agent not initialized")
        return None
    try:
        if isinstance(questions, str):
            questions = [q.strip() for q in questions.split(';') if q.strip()]
        if not questions:
            print("❌ No questions provided")
            return None
        print("=" * 60)
        print(f"BATCH PROCESSING {len(questions)} QUERIES")
        print("=" * 60)
        results = {}
        start_time = time.time()
        for idx, question in enumerate(questions, 1):
            print(f"\n[{idx}/{len(questions)}] Processing: {question[:60]}...")
            try:
                response = agent.run(question)
                results[question] = {'response': response, 'status': 'success'}
                print(f"✓ Completed")
            except Exception as e:
                results[question] = {'response': None, 'status': 'error', 'error': str(e)}
                print(f"❌ Error: {str(e)}")
        total_time = time.time() - start_time
        print("\n" + "=" * 60)
        print("BATCH PROCESSING COMPLETE")
        print("=" * 60)
        print(f"✓ Processed: {len(questions)} queries")
        print(f"✓ Successful: {sum(1 for r in results.values() if r['status'] == 'success')}")
        print(f"❌ Failed: {sum(1 for r in results.values() if r['status'] == 'error')}")
        print(f"⏱ Total Time: {total_time:.2f}s")
        print(f"⏱ Avg Time: {total_time/len(questions):.2f}s per query")
        print("=" * 60)
        return results
    except Exception as e:
        print(f"❌ Batch processing error: {str(e)}")
        return None

def export_agent_logs(filename="agent_logs.json"):
    """Export comprehensive agent logs"""
    if not agent:
        print("⚠ Agent not initialized")
        return None
    try:
        stats = agent.get_stats()
        export_data = {
            "performance_metrics": {
                "queries_processed": stats['queries_processed'],
                "tools_called": stats['tools_called'],
                "avg_response_time": stats['avg_response_time'],
                "errors": stats['errors']
            },
            "memory_stats": stats['memory_stats'],
            "logger_stats": stats['logger_stats'],
            "logs": agent.logger.logs,
            "conversation": agent.memory.messages
        }
        with open(filename, 'w') as f:
            json.dump(export_data, f, indent=2)
        print(f"✓ Agent logs exported to: {filename}")
        print(f"📝 Total log entries: {stats['logger_stats']['total_logs']}")
        return filename
    except Exception as e:
        print(f"❌ Error exporting logs: {str(e)}")
        return None

def configure_agent(temperature=None, max_tokens=None, model_name=None):
    """Dynamically reconfigure agent parameters"""
    if not agent:
        print("⚠ Agent not initialized")
        return False
    try:
        changes = []
        if temperature is not None:
            if 0.0 <= temperature <= 1.0:
                CONFIG['temperature'] = temperature
                changes.append(f"Temperature: {temperature}")
            else:
                print("❌ Temperature must be between 0.0 and 1.0")
                return False
        if max_tokens is not None:
            if max_tokens > 0:
                CONFIG['max_tokens'] = max_tokens
                changes.append(f"Max Tokens: {max_tokens}")
            else:
                print("❌ Max tokens must be positive")
                return False
        if model_name is not None:
            CONFIG['model'] = model_name
            changes.append(f"Model: {model_name}")
        agent.model = genai.GenerativeModel(model_name=CONFIG['model'], tools=[agent.tools])
        print("=" * 60)
        print("AGENT RECONFIGURED")
        print("=" * 60)
        print("\n‚úì Changes applied:")
        for change in changes:
            print(f"  ‚Ä¢ {change}")
        print("\nüìã Current Configuration:")
        print(f"  Model: {CONFIG['model']}")
        print(f"  Temperature: {CONFIG['temperature']}")
        print(f"  Max Tokens: {CONFIG['max_tokens']}")
        print("=" * 60)
        agent.logger.info("Agent reconfigured", changes=changes)
        return True
    except Exception as e:
        print(f"‚ùå Error configuring agent: {str(e)}")
        return False

def show_agent_config():
    """Display current agent configuration"""
    if not agent:
        print("⚠ Agent not initialized")
        return
    print("=" * 60)
    print("CURRENT AGENT CONFIGURATION")
    print("=" * 60)
    print(f"\n🤖 Model Settings:")
    print(f"  Model: {CONFIG['model']}")
    print(f"  Temperature: {CONFIG['temperature']}")
    print(f"  Max Tokens: {CONFIG['max_tokens']}")
    print(f"\n📊 Runtime Stats:")
    stats = agent.get_stats()
    print(f"  Queries Processed: {stats['queries_processed']}")
    print(f"  Tools Called: {stats['tools_called']}")
    print(f"  Avg Response Time: {stats['avg_response_time']:.2f}s")
    print(f"  Errors: {stats['errors']}")
    print("=" * 60)

def display_batch_results(results):
    """Display formatted batch query results"""
    if not results:
        print("⚠ No results to display")
        return
    print("\n" + "=" * 60)
    print("BATCH QUERY RESULTS")
    print("=" * 60)
    for idx, (question, result) in enumerate(results.items(), 1):
        print(f"\n[Q{idx}] {question}")
        print("-" * 60)
        if result['status'] == 'success':
            response = result['response']
            if len(response) > 300:
                print(f"{response[:300]}...")
                print(f"\n[Full response: {len(response)} characters]")
            else:
                print(response)
        else:
            print(f"❌ Error: {result.get('error', 'Unknown error')}")
        print("-" * 60)

def summarize_conversation():
    """Summarize conversation when memory approaches limit"""
    if not agent:
        print("⚠ Agent not initialized")
        return None
    try:
        if len(agent.memory.messages) < 5:
            print("📝 Conversation too short to summarize")
            return None
        # Keyword groups mapped to the topic each one signals
        topics = [
            (('paper', 'search'), "Paper search discussed"),
            (('summarize',), "Paper summarization discussed"),
            (('compare',), "Methodology comparison discussed"),
            (('literature', 'review'), "Literature review discussed"),
            (('citation',), "Citation management discussed"),
        ]
        summary_points = []
        for msg in agent.memory.messages:
            if msg['role'] != 'user':
                continue
            content = msg['content'].lower()
            # Check every topic independently: one message can touch several topics.
            # First-seen order is kept so the summary is deterministic.
            for keywords, point in topics:
                if point not in summary_points and any(k in content for k in keywords):
                    summary_points.append(point)
        summary = "Conversation Summary:\n"
        for point in summary_points:
            summary += f"• {point}\n"
        print("=" * 60)
        print("CONVERSATION SUMMARIZED")
        print("=" * 60)
        print(summary)
        print(f"📊 Original messages: {len(agent.memory.messages)}")
        print(f"📊 Summary points: {len(summary_points)}")
        print("=" * 60)
        return summary
    except Exception as e:
        print(f"❌ Error summarizing conversation: {str(e)}")
        return None

def auto_summarize_if_needed():
    """Auto-summarize when approaching memory limit"""
    if not agent:
        print("⚠ Agent not initialized")
        return False
    try:
        if len(agent.memory.messages) >= 15:
            print("⚠ Conversation length approaching limit - auto summarizing")
            summary = summarize_conversation()
            if summary:
                agent.memory.clear()
                agent.memory.add_message("system", summary)
                print("✓ Memory cleared and summary added")
                return True
        return False
    except Exception as e:
        print(f"❌ Error in auto-summarize: {str(e)}")
        return False

def collect_feedback(question, response, rating=None, comments=None):
    """Collect user feedback for continuous improvement"""
    if not agent:
        print("⚠ Agent not initialized")
        return None
    try:
        feedback_entry = {
            'timestamp': datetime.now().isoformat(),
            'question': question,
            'response': response,
            'rating': rating,
            'comments': comments
        }
        if not hasattr(agent, 'feedback'):
            agent.feedback = []
        agent.feedback.append(feedback_entry)
        print("=" * 60)
        print("FEEDBACK RECORDED")
        print("=" * 60)
        print(f"📝 Question: {question[:100]}...")
        print(f"📝 Response: {response[:100]}...")
        # Compare against None so a rating of 0 is still reported
        print(f"⭐ Rating: {rating}/5" if rating is not None else "⭐ Rating: Not provided")
        print(f"💬 Comments: {comments}" if comments else "💬 Comments: None")
        print(f"📊 Total feedback entries: {len(agent.feedback)}")
        print("=" * 60)
        agent.logger.info("Feedback collected", entry=feedback_entry)
        return feedback_entry
    except Exception as e:
        print(f"❌ Error collecting feedback: {str(e)}")
        return None

def show_feedback_summary():
    """Display feedback analytics"""
    if not agent or not hasattr(agent, 'feedback') or not agent.feedback:
        print("📝 No feedback collected yet")
        return None
    try:
        total = len(agent.feedback)
        # Average only over entries that carry a rating; if none do, report N/A
        # instead of dividing by zero.
        ratings = [f['rating'] for f in agent.feedback if f['rating'] is not None]
        avg_rating = sum(ratings) / len(ratings) if ratings else None
        with_comments = sum(1 for f in agent.feedback if f['comments'])
        print("=" * 60)
        print("FEEDBACK SUMMARY")
        print("=" * 60)
        print(f"📊 Total Feedback: {total}")
        print(f"⭐ Average Rating: {avg_rating:.2f}/5" if avg_rating is not None else "⭐ Average Rating: N/A")
        print(f"💬 With Comments: {with_comments}")
        print("=" * 60)
        print("\nRecent Feedback:")
        for f in agent.feedback[-3:]:
            rating_label = f"{f['rating']}/5" if f['rating'] is not None else "unrated"
            print(f"  • {f['question'][:50]}... [{rating_label}]")
        return {'total': total, 'avg_rating': avg_rating, 'with_comments': with_comments}
    except Exception as e:
        print(f"❌ Error showing feedback summary: {str(e)}")
        return None

def validate_response(question, response):
    """Validate response quality with multiple checks"""
    if not agent:
        print("⚠ Agent not initialized")
        return None
    try:
        response_length = len(response)
        code_marker = '```'
        quality_checks = {
            'min_length': response_length >= 50,
            'has_examples': 'example' in response.lower() or 'sample' in response.lower(),
            'has_steps': 'step' in response.lower() or 'first' in response.lower() or 'second' in response.lower(),
            'has_code': code_marker in response or 'code' in response.lower(),
            'has_explanation': 'why' in response.lower() or 'because' in response.lower() or 'reason' in response.lower(),
            'has_actionable': 'try' in response.lower() or 'suggest' in response.lower() or 'recommend' in response.lower()
        }
        score = sum(1 for check in quality_checks.values() if check)
        max_score = len(quality_checks)
        feedback = []
        if not quality_checks['min_length']:
            feedback.append("Response is too short")
        if not quality_checks['has_examples']:
            feedback.append("Include examples")
        if not quality_checks['has_steps']:
            feedback.append("Include step-by-step guidance")
        if not quality_checks['has_code']:
            feedback.append("Include code samples")
        if not quality_checks['has_explanation']:
            feedback.append("Include reasoning")
        if not quality_checks['has_actionable']:
            feedback.append("Include actionable advice")
        print("=" * 60)
        print("RESPONSE VALIDATION")
        print("=" * 60)
        print(f"📝 Question: {question[:100]}...")
        print(f"📝 Response: {response[:100]}...")
        print(f"📊 Score: {score}/{max_score}")
        print(f"✅ Checks passed: {score}")
        print(f"❌ Checks failed: {max_score - score}")
        if feedback:
            print("💡 Suggestions:")
            for f in feedback:
                print(f"  - {f}")
        print("=" * 60)
        return {'score': score, 'max_score': max_score, 'checks': quality_checks, 'feedback': feedback}
    except Exception as e:
        print(f"❌ Error validating response: {str(e)}")
        return None

def auto_validate_response(question, response):
    """Auto-validate and provide suggestions"""
    validation = validate_response(question, response)
    if validation and validation['score'] < validation['max_score']:
        print("\n💡 Suggested improvements:")
        for f in validation['feedback']:
            print(f"  - {f}")
    return validation

def track_performance_metrics():
    """Track and snapshot performance metrics"""
    if not agent:
        print("⚠ Agent not initialized")
        return None
    try:
        if not hasattr(agent, 'performance_history'):
            agent.performance_history = []
        stats = agent.get_stats()
        timestamp = datetime.now().isoformat()
        metrics = {
            'timestamp': timestamp,
            'queries': stats['queries_processed'],
            'tools_called': stats['tools_called'],
            'avg_response_time': stats['avg_response_time'],
            'errors': stats['errors'],
            'memory_usage': len(agent.memory.messages)
        }
        agent.performance_history.append(metrics)
        print("=" * 60)
        print("PERFORMANCE METRICS TRACKED")
        print("=" * 60)
        print(f"📊 Current Metrics:")
        print(f"  Queries: {metrics['queries']}")
        print(f"  Tools Called: {metrics['tools_called']}")
        print(f"  Avg Response Time: {metrics['avg_response_time']:.2f}s")
        print(f"  Memory Usage: {metrics['memory_usage']} messages")
        print(f"  Errors: {metrics['errors']}")
        print(f"\n📈 History Length: {len(agent.performance_history)} snapshots")
        print("=" * 60)
        return metrics
    except Exception as e:
        print(f"❌ Error tracking performance: {str(e)}")
        return None

def show_performance_trends():
    """Show performance trends over time"""
    if not agent or not hasattr(agent, 'performance_history') or not agent.performance_history:
        print("📊 No performance history available yet")
        return None
    try:
        history = agent.performance_history
        print("=" * 60)
        print("PERFORMANCE TRENDS")
        print("=" * 60)
        print(f"\n📈 Total Snapshots: {len(history)}")
        if len(history) >= 2:
            first = history[0]
            last = history[-1]
            query_growth = last['queries'] - first['queries']
            time_trend = last['avg_response_time'] - first['avg_response_time']
            print(f"\n📊 Growth Metrics:")
            print(f"  Query Growth: +{query_growth} queries")
            print(f"  Response Time Trend: {'+' if time_trend > 0 else ''}{time_trend:.2f}s")
            print(f"  Total Tools Called: {last['tools_called']}")
            print(f"  Error Rate: {(last['errors'] / max(last['queries'], 1)) * 100:.1f}%")
            print(f"\n🕐 Recent Performance:")
            for snapshot in history[-3:]:
                # ISO timestamps are YYYY-MM-DDTHH:MM:SS[.ffffff]; slice [11:19] is HH:MM:SS
                print(f"  [{snapshot['timestamp'][11:19]}] Q:{snapshot['queries']} T:{snapshot['avg_response_time']:.2f}s")
        print("=" * 60)
        return {'snapshots': len(history), 'latest': history[-1]}
    except Exception as e:
        print(f"❌ Error showing trends: {str(e)}")
        return None

def export_performance_data(filename="performance_data.json"):
    """Export performance history to file"""
    if not agent or not hasattr(agent, 'performance_history'):
        print("📊 No performance history to export")
        return None
    try:
        with open(filename, 'w') as f:
            json.dump(agent.performance_history, f, indent=2)
        print(f"✓ Performance data exported to: {filename}")
        print(f"📊 Total snapshots: {len(agent.performance_history)}")
        return filename
    except Exception as e:
        print(f"❌ Error exporting performance data: {str(e)}")
        return None

print("✓ All Functions Ready!")
print("\n📤 Available Commands:")
print("  - export_conversation_history('filename.txt')")
print("  - export_agent_logs('filename.json')")
print("  - reset_agent()")
print("  - search_conversation('keyword')")
print("  - configure_agent(temperature=0.5, max_tokens=3000)")
print("  - show_agent_config()")
print("  - batch_query(['q1', 'q2', ...])")
print("  - display_batch_results(results)")
print("  - summarize_conversation()")
print("  - auto_summarize_if_needed()")
print("  - collect_feedback(question, response, rating=5, comments='Great!')")
print("  - show_feedback_summary()")
print("  - validate_response(question, response)")
print("  - auto_validate_response(question, response)")
print("  - track_performance_metrics()")
print("  - show_performance_trends()")
print("  - export_performance_data('filename.json')")

✓ All Functions Ready!

📤 Available Commands:
  - export_conversation_history('filename.txt')
  - export_agent_logs('filename.json')
  - reset_agent()
  - search_conversation('keyword')
  - configure_agent(temperature=0.5, max_tokens=3000)
  - show_agent_config()
  - batch_query(['q1', 'q2', ...])
  - display_batch_results(results)
  - summarize_conversation()
  - auto_summarize_if_needed()
  - collect_feedback(question, response, rating=5, comments='Great!')
  - show_feedback_summary()
  - validate_response(question, response)
  - auto_validate_response(question, response)
  - track_performance_metrics()
  - show_performance_trends()
  - export_performance_data('filename.json')
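
The checks inside `validate_response` are plain substring tests. The sketch below restates that scoring idea as a standalone function that does not need the `agent` object. The names `score_response` and `QUALITY_KEYWORDS` are illustrative and not part of the notebook; the keyword lists mirror the ones used above.

```python
# Standalone sketch of the keyword-based quality score (illustrative names).
QUALITY_KEYWORDS = {
    "has_examples": ("example", "sample"),
    "has_steps": ("step", "first", "second"),
    "has_code": ("`" * 3, "code"),
    "has_explanation": ("why", "because", "reason"),
    "has_actionable": ("try", "suggest", "recommend"),
}

def score_response(response: str, min_length: int = 50) -> dict:
    """Run each substring check once and count how many pass."""
    text = response.lower()
    checks = {"min_length": len(response) >= min_length}
    for name, keywords in QUALITY_KEYWORDS.items():
        checks[name] = any(keyword in text for keyword in keywords)
    return {"score": sum(checks.values()), "max_score": len(checks), "checks": checks}

result = score_response("For example, first try the smaller model, because it trains faster.")
print(f"{result['score']}/{result['max_score']}")

# Substring matching is crude: "industry" contains "try", so this passes the
# actionable check without giving any advice.
print(score_response("industry")["checks"]["has_actionable"])
```

Because the checks match substrings rather than words, a high score says little about actual quality; treat it as a rough smoke test.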


## **Example Use Cases**

In [55]:
# Example 1: Search for papers
print("="*60)
print("EXAMPLE 1: Paper Search")
print("="*60)
test_agent("Search for papers on Machine Learning")

print("\n" + "="*60)
print("EXAMPLE 2: Paper Summarization")
print("="*60)
test_agent("Summarize the key contributions of the 'Attention is All You Need' paper")

print("\n" + "="*60)
print("EXAMPLE 3: Literature Review")
print("="*60)
test_agent("Generate a short literature review on neural machine translation methods")

# Display final statistics
print("\n" + "="*60)
print("FINAL STATISTICS")
print("="*60)
display_statistics()

EXAMPLE 1: Paper Search

USER: Search for papers on Machine Learning

AGENT RESPONSE:
------------------------------------------------------------


That is a good list of papers. Can you provide summaries for a few of them?




EXAMPLE 2: Paper Summarization

USER: Summarize the key contributions of the 'Attention is All You Need' paper

AGENT RESPONSE:
------------------------------------------------------------


```json
{
  "summary": "The \"Attention is All You Need\" paper introduces the Transformer, a novel neural network architecture for sequence transduction based solely on attention mechanisms. It replaces recurrent layers with multi-headed self-attention and feed-forward networks, achieving state-of-the-art results on machine translation tasks with significantly reduced training time. Key contributions include the Transformer architecture itself, superior performance, enhanced training efficiency, and its influence on subsequent research. Limitations include computational cost, long sequence handling, and interpretability. Future work suggestions include exploring alternative attention mechanisms, application to other tasks, and improving interpretability."
}
```



EXAMPLE 3: Literature Review

USER: Generate a short literature review on neural machine translation methods

AGENT RESPONSE:
------------------------------------------------------------


```json
{
  "review": "## Neural Machine Translation Methods: A Review of Foundational Architectures and Future Directions\n\nNeural Machine Translation (NMT) has revolutionized the field of machine translation, offering significant improvements over traditional statistical machine translation (SMT) approaches.  This literature review examines foundational NMT architectures, focusing on key developments that have shaped the current landscape. Specifically, it analyzes the evolution of NMT from recurrent neural network (RNN)-based sequence-to-sequence models to the Transformer architecture, highlighting the impact of attention mechanisms.\n\n**Thematic Organization:** This review is organized around three key themes: (1) the foundational Sequence-to-Sequence model; (2) the introduction and refinement of Attention Mechanisms; and (3) the emergence of the Transformer architecture and its subsequent impact.\n\n**Key Findings & Synthesis:** The seminal work \"Sequence to Sequence Learning with Neural Networks\" (Sutskever et al., 2014) established a robust end-to-end framework for NMT, leveraging Long Short-Term Memory (LSTM) networks for encoding and decoding sequences. This architecture demonstrated the feasibility of learning translation directly from data, surpassing the performance of phrase-based SMT systems. However, this approach suffered from limitations in handling long sequences due to the bottleneck imposed by the fixed-length vector representation of the source sentence.\n\nThe introduction of attention mechanisms in \"Neural Machine Translation by Jointly Learning to Align and Translate\" (Bahdanau et al., 2015) addressed this limitation by allowing the decoder to dynamically focus on relevant parts of the source sentence for each target word prediction. This innovation significantly improved translation quality, particularly for longer sentences, and provided interpretability by revealing alignment information between source and target words.  
Attention mechanisms became an integral component of subsequent NMT models.\n\nThe \"Attention is All You Need\" paper (Vaswani et al., 2017) presented the Transformer architecture, a paradigm shift in NMT. By dispensing with recurrence and convolutions entirely, the Transformer relies solely on self-attention mechanisms to capture relationships between words in the input and output sequences. This parallelization capability enabled significant speedups in training and allowed the model to capture long-range dependencies more effectively. The Transformer architecture has since become the dominant approach in NMT, serving as the foundation for numerous state-of-the-art models.\n\n**Research Gaps & Critical Analysis:** While the Transformer architecture has achieved remarkable performance, several research gaps remain.  One critical limitation is the computational cost associated with the self-attention mechanism, particularly for long sequences.  Furthermore, the Transformer's reliance on large datasets raises concerns about its performance in low-resource settings. Another notable gap is the limited ability to incorporate external knowledge or constraints into the model, hindering its adaptability to specific domains or tasks.\n\n**Trends and Patterns:**  A clear trend in NMT research is the increasing emphasis on model efficiency and generalization. Researchers are actively exploring techniques to reduce the computational cost of Transformer models, such as sparse attention and knowledge distillation.  Furthermore, there is a growing interest in developing methods for adapting NMT models to low-resource languages and specialized domains.\n\n**Future Research Directions:**  Future research should focus on addressing the limitations of current NMT architectures.  
This includes exploring novel attention mechanisms that are more computationally efficient, developing methods for incorporating external knowledge into NMT models, and investigating techniques for improving the performance of NMT in low-resource settings.  Furthermore, research into explainable and interpretable NMT is crucial for building trust and understanding in the model's predictions.  The integration of multimodal information, such as images and audio, into NMT models also presents a promising avenue for future exploration.  Finally, continued investigation into alternative architectural designs beyond the Transformer, potentially leveraging advancements in areas like graph neural networks or state-space models, could lead to further breakthroughs in NMT performance.",
  "citations": null
}
```



FINAL STATISTICS

                AGENT PERFORMANCE DASHBOARD                 

📊 Query Statistics:
  Total Queries: 8
  Tools Called: 8
  Avg Response Time: 21.37s
  Errors: 0

💭 Memory Statistics:
  Total Messages: 16
  User Messages: 8
  Agent Messages: 8

📝 Logger Statistics:
  Total Logs: 26
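
The summarization and literature review responses above arrive as JSON wrapped in a Markdown code fence. A minimal sketch of how a caller might unwrap such a reply before using it is shown below. The helper name `extract_json_payload` is hypothetical and not part of the notebook, and the sample string is toy data.

```python
import json
import re

# Built indirectly so this example does not contain a literal code fence.
FENCE = "`" * 3

def extract_json_payload(response_text: str) -> dict:
    """Parse the JSON inside a fenced reply; fall back to parsing the raw text."""
    pattern = re.escape(FENCE) + r"(?:json)?\s*(.*?)\s*" + re.escape(FENCE)
    match = re.search(pattern, response_text, flags=re.DOTALL)
    payload = match.group(1) if match else response_text
    return json.loads(payload)

# Toy reply shaped like the agent output above
sample = f'{FENCE}json\n{{"summary": "Transformer overview", "citations": null}}\n{FENCE}'
parsed = extract_json_payload(sample)
print(parsed["summary"])
```

`json.loads` raises `json.JSONDecodeError` on malformed input, so a real caller would likely wrap the call and fall back to showing the raw text.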



# Research Knowledge Graph Builder
Build a dynamic knowledge graph from discovered papers showing relationships between concepts, authors, and methodologies.

In [42]:
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set, Tuple
import json

@dataclass
class KnowledgeGraphNode:
    """Represents a node in the research knowledge graph"""
    node_id: str
    node_type: str  # 'paper', 'author', 'concept', 'methodology', 'dataset'
    properties: Dict[str, Any] = field(default_factory=dict)
    connections: List[Tuple[str, str, float]] = field(default_factory=list)  # (target_id, relation_type, weight)

class ResearchKnowledgeGraph:
    """Dynamic knowledge graph for research discovery and relationship mapping"""

    def __init__(self):
        self.nodes: Dict[str, KnowledgeGraphNode] = {}
        self.edges: List[Dict[str, Any]] = []
        self.concept_clusters: Dict[str, Set[str]] = defaultdict(set)

    def add_paper(self, paper_id: str, title: str, authors: List[str],
                  concepts: List[str], methodologies: List[str], year: Optional[int] = None):
        """Add a paper and its relationships to the graph"""
        # Add paper node
        self.nodes[paper_id] = KnowledgeGraphNode(
            node_id=paper_id,
            node_type='paper',
            properties={'title': title, 'year': year, 'citation_count': 0}
        )

        # Add author nodes and connections
        for author in authors:
            author_id = f"author_{author.lower().replace(' ', '_')}"
            if author_id not in self.nodes:
                self.nodes[author_id] = KnowledgeGraphNode(
                    node_id=author_id,
                    node_type='author',
                    properties={'name': author, 'paper_count': 0}
                )
            self.nodes[author_id].properties['paper_count'] += 1
            self._add_edge(paper_id, author_id, 'authored_by', 1.0)

        # Add concept nodes and connections
        for concept in concepts:
            concept_id = f"concept_{concept.lower().replace(' ', '_')}"
            if concept_id not in self.nodes:
                self.nodes[concept_id] = KnowledgeGraphNode(
                    node_id=concept_id,
                    node_type='concept',
                    properties={'name': concept, 'frequency': 0}
                )
            self.nodes[concept_id].properties['frequency'] += 1
            self._add_edge(paper_id, concept_id, 'discusses', 1.0)
            self.concept_clusters[concept].add(paper_id)

        # Add methodology nodes
        for method in methodologies:
            method_id = f"method_{method.lower().replace(' ', '_')}"
            if method_id not in self.nodes:
                self.nodes[method_id] = KnowledgeGraphNode(
                    node_id=method_id,
                    node_type='methodology',
                    properties={'name': method, 'usage_count': 0}
                )
            self.nodes[method_id].properties['usage_count'] += 1
            self._add_edge(paper_id, method_id, 'uses_methodology', 1.0)

    def _add_edge(self, source: str, target: str, relation: str, weight: float):
        """Add an edge between two nodes"""
        self.edges.append({
            'source': source,
            'target': target,
            'relation': relation,
            'weight': weight
        })
        if source in self.nodes:
            self.nodes[source].connections.append((target, relation, weight))

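    def find_related_papers(self, paper_id: str, depth: int = 2) -> List[Dict]:
        """Find papers related through shared concepts, authors, or methodologies"""
        if paper_id not in self.nodes:
            return []

        # Edges are stored paper -> author/concept/method only, so build an
        # undirected adjacency map to walk back from a shared node to other papers.
        neighbors = defaultdict(list)
        for edge in self.edges:
            neighbors[edge['source']].append((edge['target'], edge['relation'], edge['weight']))
            neighbors[edge['target']].append((edge['source'], edge['relation'], edge['weight']))

        related = []
        visited = {paper_id}
        current_level = [paper_id]

        for _ in range(depth):
            next_level = []
            for node_id in current_level:
                for target, relation, weight in neighbors[node_id]:
                    if target in visited:
                        continue
                    visited.add(target)
                    target_node = self.nodes.get(target)
                    if target_node is not None and target_node.node_type == 'paper':
                        related.append({
                            'paper_id': target,
                            'relation': relation,
                            'weight': weight
                        })
                    next_level.append(target)
            current_level = next_level

        return sorted(related, key=lambda x: x['weight'], reverse=True)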

    def identify_research_gaps(self) -> List[Dict]:
        """Identify potential research gaps based on graph analysis"""
        gaps = []

        # Find concepts with few papers whose papers nonetheless apply known methodologies.
        # Concept nodes are never edge sources, so read the relationships from self.edges.
        for concept_id, node in self.nodes.items():
            if node.node_type == 'concept':
                papers = {e['source'] for e in self.edges
                          if e['target'] == concept_id and e['relation'] == 'discusses'}
                methods = {e['target'] for e in self.edges
                           if e['source'] in papers and e['relation'] == 'uses_methodology'}
                connected_papers = len(papers)
                connected_methods = len(methods)

                if connected_papers < 3 and connected_methods > 0:
                    gaps.append({
                        'type': 'underexplored_concept',
                        'concept': node.properties.get('name'),
                        'paper_count': connected_papers,
                        'potential': 'high' if connected_methods > 2 else 'medium'
                    })

        # Count how often each pair of methodologies is used together in one paper
        method_pairs = defaultdict(int)
        for paper_id, node in self.nodes.items():
            if node.node_type == 'paper':
                # Sort so that (a, b) and (b, a) are counted as the same pair
                methods = sorted(c[0] for c in node.connections if c[1] == 'uses_methodology')
                for i, m1 in enumerate(methods):
                    for m2 in methods[i+1:]:
                        method_pairs[(m1, m2)] += 1

        # Identify rarely combined methodologies
        for (m1, m2), count in method_pairs.items():
            if count == 1:
                gaps.append({
                    'type': 'novel_methodology_combination',
                    'methods': [self.nodes[m1].properties.get('name'),
                               self.nodes[m2].properties.get('name')],
                    'current_papers': count,
                    'potential': 'high'
                })

        return gaps

    def get_author_collaboration_network(self, author_name: str) -> Dict:
        """Get collaboration network for an author"""
        author_id = f"author_{author_name.lower().replace(' ', '_')}"
        if author_id not in self.nodes:
            return {'error': 'Author not found'}

        collaborators = defaultdict(int)
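        # Author nodes are edge targets only, so collect their papers from self.edges
        author_papers = [e['source'] for e in self.edges
                         if e['target'] == author_id and e['relation'] == 'authored_by']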

        for paper_id in author_papers:
            if paper_id in self.nodes:
                paper_authors = [c[0] for c in self.nodes[paper_id].connections
                               if c[1] == 'authored_by' and c[0] != author_id]
                for collab in paper_authors:
                    collaborators[self.nodes[collab].properties.get('name', collab)] += 1

        return {
            'author': author_name,
            'total_papers': len(author_papers),
            'collaborators': dict(sorted(collaborators.items(), key=lambda x: x[1], reverse=True))
        }

    def export_graph(self, format: str = 'json') -> str:
        """Export knowledge graph for visualization"""
        graph_data = {
            'nodes': [
                {
                    'id': node_id,
                    'type': node.node_type,
                    'label': node.properties.get('name') or node.properties.get('title', node_id),
                    'properties': node.properties
                }
                for node_id, node in self.nodes.items()
            ],
            'edges': self.edges,
            'statistics': {
                'total_nodes': len(self.nodes),
                'total_edges': len(self.edges),
                'papers': sum(1 for n in self.nodes.values() if n.node_type == 'paper'),
                'authors': sum(1 for n in self.nodes.values() if n.node_type == 'author'),
                'concepts': sum(1 for n in self.nodes.values() if n.node_type == 'concept')
            }
        }
        return json.dumps(graph_data, indent=2)

# Initialize global knowledge graph
research_graph = ResearchKnowledgeGraph()
print("✓ Research Knowledge Graph System Initialized")

✓ Research Knowledge Graph System Initialized
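
To illustrate the idea of relating papers through shared concepts and methodologies, here is a small standalone sketch that uses plain dictionaries instead of the `ResearchKnowledgeGraph` class. The function name `related_papers`, the paper ids, and the feature labels are toy values invented for this example.

```python
from collections import defaultdict

def related_papers(paper_features, paper_id):
    """Rank other papers by how many features they share with paper_id."""
    # Inverted index: feature -> set of papers that mention it
    index = defaultdict(set)
    for pid, features in paper_features.items():
        for feature in features:
            index[feature].add(pid)

    # Collect, per other paper, the features it has in common with paper_id
    shared = defaultdict(set)
    for feature in paper_features.get(paper_id, ()):
        for other in index[feature]:
            if other != paper_id:
                shared[other].add(feature)

    # Most shared features first; ties broken by paper id for stable output
    return sorted(
        ((pid, sorted(feats)) for pid, feats in shared.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )

# Toy data: each paper maps to the concepts and methods it touches
toy_papers = {
    "paper_a": {"concept:attention", "method:transformer"},
    "paper_b": {"concept:attention", "method:lstm"},
    "paper_c": {"concept:attention", "method:transformer"},
    "paper_d": {"concept:gans"},
}

for pid, feats in related_papers(toy_papers, "paper_a"):
    print(pid, feats)
```

Counting shared features gives a simple relevance weight; the graph class above instead assigns every edge a fixed weight of 1.0.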


# Intelligent Research Planning Agent
An agent that creates structured research plans with milestones and adaptive recommendations.
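
As a rough illustration of how milestones with dependencies can be turned into a timeline, the sketch below orders them topologically and lays them out back to back. It assumes a single researcher working sequentially; the function name `schedule_milestones`, the `share` field, and the sample values are hypothetical and only loosely modelled on the milestone template in this section.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def schedule_milestones(milestones, total_hours):
    """Return (id, start_hour, end_hour) tuples in dependency order."""
    by_id = {m["id"]: m for m in milestones}
    # TopologicalSorter expects a mapping of node -> predecessors
    deps = {m["id"]: m.get("dependencies", []) for m in milestones}
    order = list(TopologicalSorter(deps).static_order())

    schedule, elapsed = [], 0.0
    for milestone_id in order:
        hours = by_id[milestone_id]["share"] * total_hours
        schedule.append((milestone_id, elapsed, elapsed + hours))
        elapsed += hours
    return schedule

# Hypothetical milestones, deliberately listed out of order
sample_milestones = [
    {"id": "M3", "share": 0.20, "dependencies": ["M2"]},
    {"id": "M1", "share": 0.25},
    {"id": "M2", "share": 0.30, "dependencies": ["M1"]},
]

for milestone_id, start, end in schedule_milestones(sample_milestones, total_hours=40):
    print(f"{milestone_id}: {start:.1f}h to {end:.1f}h")
```

`TopologicalSorter` raises `graphlib.CycleError` if the dependencies are circular, which is a convenient early check on a generated plan.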

In [43]:
@dataclass
class ResearchMilestone:
    """Represents a research milestone"""
    milestone_id: str
    title: str
    description: str
    estimated_hours: float
    dependencies: List[str]
    status: str = 'pending'  # pending, in_progress, completed, blocked
    resources: List[str] = field(default_factory=list)
    completion_criteria: List[str] = field(default_factory=list)

class ResearchPlanningAgent:
    """Intelligent agent for creating and managing research plans"""

    def __init__(self, model_name: str = "models/gemini-2.0-flash"):
        self.model = genai.GenerativeModel(model_name)
        self.plans: Dict[str, Dict] = {}
        self.active_plan_id: Optional[str] = None

    def create_research_plan(self, research_question: str,
                            time_budget_hours: float = 40,
                            expertise_level: str = "intermediate") -> Dict:
        """Generate a comprehensive research plan"""

        prompt = f"""As an expert research methodology advisor, create a detailed research plan for:

Research Question: {research_question}
Available Time: {time_budget_hours} hours
Researcher Expertise: {expertise_level}

Provide a structured plan with:
1. Literature Review Phase (papers to find, databases to search)
2. Methodology Selection (recommended approaches, justification)
3. Data Collection Strategy (if applicable)
4. Analysis Framework
5. Writing and Documentation milestones
6. Potential challenges and mitigation strategies

Format as a detailed, actionable plan with time estimates for each phase."""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        plan_id = f"plan_{datetime.now().strftime('%Y%m%d_%H%M%S')}"

        plan = {
            'plan_id': plan_id,
            'research_question': research_question,
            'created_at': datetime.now().isoformat(),
            'time_budget': time_budget_hours,
            'expertise_level': expertise_level,
            'generated_plan': response.text,
            'milestones': self._extract_milestones(response.text, time_budget_hours),
            'status': 'active',
            'progress_log': []
        }

        self.plans[plan_id] = plan
        self.active_plan_id = plan_id

        return plan

    def _extract_milestones(self, plan_text: str, total_hours: float) -> List[Dict]:
        """Extract milestones from generated plan"""
        milestones = [
            {
                'id': 'M1',
                'title': 'Literature Discovery',
                'phase': 'research',
                'estimated_hours': total_hours * 0.25,
                'status': 'pending'
            },
            {
                'id': 'M2',
                'title': 'Deep Reading & Analysis',
                'phase': 'analysis',
                'estimated_hours': total_hours * 0.30,
                'status': 'pending',
                'dependencies': ['M1']
            },
            {
                'id': 'M3',
                'title': 'Synthesis & Gap Identification',
                'phase': 'synthesis',
                'estimated_hours': total_hours * 0.20,
                'status': 'pending',
                'dependencies': ['M2']
            },
            {
                'id': 'M4',
                'title': 'Writing & Documentation',
                'phase': 'writing',
                'estimated_hours': total_hours * 0.20,
                'status': 'pending',
                'dependencies': ['M3']
            },
            {
                'id': 'M5',
                'title': 'Review & Refinement',
                'phase': 'review',
                'estimated_hours': total_hours * 0.05,
                'status': 'pending',
                'dependencies': ['M4']
            }
        ]
        return milestones

    def update_milestone(self, plan_id: str, milestone_id: str,
                        status: str, notes: str = "") -> Dict:
        """Update milestone status and log progress"""
        if plan_id not in self.plans:
            return {'error': 'Plan not found'}

        plan = self.plans[plan_id]
        for milestone in plan['milestones']:
            if milestone['id'] == milestone_id:
                milestone['status'] = status
                milestone['updated_at'] = datetime.now().isoformat()

                plan['progress_log'].append({
                    'timestamp': datetime.now().isoformat(),
                    'milestone': milestone_id,
                    'status': status,
                    'notes': notes
                })

                return {'success': True, 'milestone': milestone}

        return {'error': 'Milestone not found'}

    def get_adaptive_recommendations(self, plan_id: str) -> Dict:
        """Get AI-powered recommendations based on current progress"""
        if plan_id not in self.plans:
            return {'error': 'Plan not found'}

        plan = self.plans[plan_id]
        completed = [m for m in plan['milestones'] if m['status'] == 'completed']
        pending = [m for m in plan['milestones'] if m['status'] == 'pending']

        prompt = f"""Based on this research progress:

Research Question: {plan['research_question']}
Completed Milestones: {len(completed)}/{len(plan['milestones'])}
Progress Log: {json.dumps(plan['progress_log'][-5:], indent=2)}

Provide:
1. Assessment of current progress
2. Recommended next actions (prioritized)
3. Potential blockers to watch for
4. Resource suggestions for upcoming milestones
5. Time adjustment recommendations if needed"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return {
            'plan_id': plan_id,
            'completion_percentage': (len(completed) / len(plan['milestones'])) * 100,
            'recommendations': response.text,
            'next_milestone': pending[0] if pending else None
        }

    def export_plan(self, plan_id: str, format: str = 'markdown') -> str:
        """Export research plan in various formats"""
        if plan_id not in self.plans:
            return "Plan not found"

        plan = self.plans[plan_id]

        if format == 'markdown':
            output = f"""# Research Plan: {plan_id}

## Research Question
{plan['research_question']}

## Timeline
- **Created:** {plan['created_at']}
- **Time Budget:** {plan['time_budget']} hours
- **Expertise Level:** {plan['expertise_level']}

## Milestones

| ID | Title | Hours | Status |
|----|-------|-------|--------|
"""
            for m in plan['milestones']:
                output += f"| {m['id']} | {m['title']} | {m['estimated_hours']:.1f} | {m['status']} |\n"

            output += f"\n## Detailed Plan\n\n{plan['generated_plan']}\n"

            if plan['progress_log']:
                output += "\n## Progress Log\n\n"
                for log in plan['progress_log']:
                    output += f"- **{log['timestamp']}**: {log['milestone']} - {log['status']}"
                    if log['notes']:
                        output += f" ({log['notes']})"
                    output += "\n"

            return output

        return json.dumps(plan, indent=2)

# Initialize planning agent
planning_agent = ResearchPlanningAgent()
print("✓ Research Planning Agent Initialized")

✓ Research Planning Agent Initialized
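
`get_adaptive_recommendations` reports `pending[0]` as the next milestone, which does not consult the `dependencies` lists. The following is a minimal sketch, assuming milestones are dicts shaped like the ones `_extract_milestones` returns, of how the next unblocked milestone could be chosen instead. `next_unblocked_milestone` is a hypothetical helper and not part of the notebook's classes.

```python
from typing import Dict, List, Optional


def next_unblocked_milestone(milestones: List[Dict]) -> Optional[Dict]:
    """Return the first pending milestone whose dependencies are all completed."""
    completed = {m['id'] for m in milestones if m['status'] == 'completed'}
    for m in milestones:
        if m['status'] != 'pending':
            continue
        # A missing 'dependencies' key is treated as "no prerequisites".
        if all(dep in completed for dep in m.get('dependencies', [])):
            return m
    return None


# Illustrative data mirroring the first three template milestones.
sample = [
    {'id': 'M1', 'status': 'completed'},
    {'id': 'M2', 'status': 'blocked', 'dependencies': ['M1']},
    {'id': 'M3', 'status': 'pending', 'dependencies': ['M2']},
]
print(next_unblocked_milestone(sample))  # → None (M3 waits on M2, which is not completed)
```

With this approach a blocked or in-progress prerequisite keeps later milestones from being suggested prematurely.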


# Multi-Modal Research Analysis
Support for analyzing figures, tables, and equations from papers.

In [44]:
class MultiModalResearchAnalyzer:
    """Analyze multiple modalities in research papers"""

    def __init__(self, model_name: str = "models/gemini-2.0-flash"):
        self.model = genai.GenerativeModel(model_name)
        self.analysis_cache: Dict[str, Dict] = {}

    def analyze_research_figure(self, figure_description: str,
                                paper_context: str = "") -> Dict:
        """Analyze a research figure and extract insights"""

        prompt = f"""As an expert research analyst, analyze this figure from an academic paper:

Figure Description: {figure_description}
Paper Context: {paper_context}

Provide:
1. What the figure represents
2. Key data points or trends shown
3. Statistical significance (if applicable)
4. How it supports the paper's thesis
5. Potential limitations or alternative interpretations
6. Suggested improvements for clarity"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return {
            'analysis_type': 'figure',
            'input': figure_description,
            'analysis': response.text,
            'timestamp': datetime.now().isoformat()
        }

    def analyze_data_table(self, table_data: str,
                          analysis_focus: str = "trends") -> Dict:
        """Analyze tabular data from research papers"""

        prompt = f"""Analyze this research data table:

Table Data:
{table_data}

Analysis Focus: {analysis_focus}

Provide:
1. Summary statistics and key findings
2. Notable patterns or anomalies
3. Statistical relationships between variables
4. Comparison with typical values in the field (if known)
5. Recommendations for further analysis
6. Visualization suggestions"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return {
            'analysis_type': 'table',
            'focus': analysis_focus,
            'analysis': response.text,
            'timestamp': datetime.now().isoformat()
        }

    def explain_mathematical_notation(self, equation: str,
                                      field: str = "machine learning") -> Dict:
        """Explain mathematical equations and notation"""

        prompt = f"""Explain this mathematical notation/equation from a {field} paper:

Equation: {equation}

Provide:
1. Plain English explanation of what the equation represents
2. Definition of each variable/symbol
3. Intuitive interpretation
4. Common applications in {field}
5. Related equations or concepts
6. Implementation considerations (if applicable)"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return {
            'analysis_type': 'equation',
            'field': field,
            'equation': equation,
            'explanation': response.text,
            'timestamp': datetime.now().isoformat()
        }

    def cross_modal_synthesis(self, paper_elements: Dict) -> Dict:
        """Synthesize insights across multiple modalities"""

        elements_summary = json.dumps(paper_elements, indent=2)

        prompt = f"""Synthesize insights from these multi-modal paper elements:

{elements_summary}

Provide:
1. Integrated summary of all elements
2. How different elements support each other
3. Any inconsistencies between text, figures, and data
4. Overall strength of evidence
5. Key takeaways for researchers
6. Suggestions for replication or extension"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return {
            'analysis_type': 'cross_modal_synthesis',
            'elements_analyzed': list(paper_elements.keys()),
            'synthesis': response.text,
            'timestamp': datetime.now().isoformat()
        }

# Initialize multi-modal analyzer
multimodal_analyzer = MultiModalResearchAnalyzer()
print("✓ Multi-Modal Research Analyzer Initialized")

✓ Multi-Modal Research Analyzer Initialized
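
The analyzer declares `analysis_cache` but none of its methods read or write it. A minimal sketch of how repeated prompts could be served from such a cache is shown below. It assumes responses can be keyed by a hash of the prompt text; `cached_call` and `fake_generate` are hypothetical names, and `fake_generate` stands in for the model call so the example runs without an API key.

```python
import hashlib
from typing import Callable, Dict, List


def cached_call(cache: Dict[str, str], prompt: str,
                generate: Callable[[str], str]) -> str:
    """Return the cached response for `prompt`, calling `generate` only on a miss."""
    key = hashlib.sha256(prompt.encode('utf-8')).hexdigest()
    if key not in cache:
        cache[key] = generate(prompt)
    return cache[key]


calls: List[str] = []


def fake_generate(prompt: str) -> str:
    """Stand-in for a model call; records each invocation."""
    calls.append(prompt)
    return f"analysis of: {prompt}"


cache: Dict[str, str] = {}
first = cached_call(cache, "Figure 2: loss curve", fake_generate)
second = cached_call(cache, "Figure 2: loss curve", fake_generate)
print(first == second, len(calls))  # → True 1
```

Serving a hit from the cache would also skip the rate-limit delay, since no request is sent.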


# Collaborative Research Session Manager
Support for team-based research with shared context and handoffs.

In [45]:
from collections import defaultdict  # needed by CollaborativeResearchManager below

@dataclass
class ResearchSession:
    """Represents a collaborative research session"""
    session_id: str
    title: str
    participants: List[str]
    created_at: str
    shared_context: Dict[str, Any]
    findings: List[Dict]
    action_items: List[Dict]
    status: str = 'active'

class CollaborativeResearchManager:
    """Manage collaborative research sessions with shared context"""

    def __init__(self):
        self.sessions: Dict[str, ResearchSession] = {}
        self.participant_sessions: Dict[str, List[str]] = defaultdict(list)

    def create_session(self, title: str, participants: List[str],
                      initial_context: str = "") -> ResearchSession:
        """Create a new collaborative research session"""

        session_id = f"session_{datetime.now().strftime('%Y%m%d_%H%M%S')}"

        session = ResearchSession(
            session_id=session_id,
            title=title,
            participants=participants,
            created_at=datetime.now().isoformat(),
            shared_context={
                'initial_context': initial_context,
                'papers_discussed': [],
                'key_insights': [],
                'open_questions': []
            },
            findings=[],
            action_items=[]
        )

        self.sessions[session_id] = session
        for participant in participants:
            self.participant_sessions[participant].append(session_id)

        print(f"✓ Created collaborative session: {session_id}")
        print(f"  Participants: {', '.join(participants)}")

        return session

    def add_finding(self, session_id: str, participant: str,
                   finding: str, source: str = "",
                   finding_type: str = "insight") -> Dict:
        """Add a research finding to the session"""

        if session_id not in self.sessions:
            return {'error': 'Session not found'}

        session = self.sessions[session_id]

        finding_entry = {
            'id': f"F{len(session.findings) + 1}",
            'participant': participant,
            'finding': finding,
            'source': source,
            'type': finding_type,
            'timestamp': datetime.now().isoformat(),
            'votes': 0,
            'comments': []
        }

        session.findings.append(finding_entry)

        return {'success': True, 'finding': finding_entry}

    def add_action_item(self, session_id: str, assignee: str,
                       task: str, deadline: str = "",
                       priority: str = "medium") -> Dict:
        """Add an action item to the session"""

        if session_id not in self.sessions:
            return {'error': 'Session not found'}

        session = self.sessions[session_id]

        action_item = {
            'id': f"A{len(session.action_items) + 1}",
            'assignee': assignee,
            'task': task,
            'deadline': deadline,
            'priority': priority,
            'status': 'pending',
            'created_at': datetime.now().isoformat()
        }

        session.action_items.append(action_item)

        return {'success': True, 'action_item': action_item}

    def generate_session_summary(self, session_id: str) -> str:
        """Generate an AI-powered summary of the session"""

        if session_id not in self.sessions:
            return "Session not found"

        session = self.sessions[session_id]

        prompt = f"""Summarize this collaborative research session:

Title: {session.title}
Participants: {', '.join(session.participants)}
Duration: From {session.created_at}

Findings ({len(session.findings)} total):
{json.dumps(session.findings, indent=2)}

Action Items ({len(session.action_items)} total):
{json.dumps(session.action_items, indent=2)}

Provide:
1. Executive summary (3-5 sentences)
2. Key discoveries and insights
3. Areas of consensus and disagreement
4. Next steps and recommendations
5. Open questions for follow-up"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        model = genai.GenerativeModel(CONFIG['model'])
        response = model.generate_content(prompt)

        return response.text

    def create_handoff_document(self, session_id: str,
                               recipient: str) -> str:
        """Create a handoff document for session transition"""

        if session_id not in self.sessions:
            return "Session not found"

        session = self.sessions[session_id]

        handoff = f"""# Research Session Handoff Document

## Session: {session.title}
**Session ID:** {session_id}
**Handoff To:** {recipient}
**Date:** {datetime.now().strftime('%Y-%m-%d')}

## Background
{session.shared_context.get('initial_context', 'No initial context provided')}

## Key Findings
"""
        for finding in session.findings:
            handoff += f"- **{finding['type'].upper()}** ({finding['participant']}): {finding['finding']}\n"

        handoff += "\n## Pending Action Items\n"
        pending = [a for a in session.action_items if a['status'] == 'pending']
        for item in pending:
            handoff += f"- [{item['priority'].upper()}] {item['task']} (Assigned: {item['assignee']})\n"

        handoff += "\n## Context for Continuation\n"
        handoff += f"- Papers Discussed: {len(session.shared_context.get('papers_discussed', []))}\n"
        handoff += f"- Open Questions: {len(session.shared_context.get('open_questions', []))}\n"

        return handoff

    def get_participant_workload(self, participant: str) -> Dict:
        """Get workload summary for a participant"""

        sessions = self.participant_sessions.get(participant, [])

        total_action_items = 0
        pending_items = 0
        findings_contributed = 0

        for session_id in sessions:
            if session_id in self.sessions:
                session = self.sessions[session_id]
                for item in session.action_items:
                    if item['assignee'] == participant:
                        total_action_items += 1
                        if item['status'] == 'pending':
                            pending_items += 1
                for finding in session.findings:
                    if finding['participant'] == participant:
                        findings_contributed += 1

        return {
            'participant': participant,
            'active_sessions': len(sessions),
            'total_action_items': total_action_items,
            'pending_items': pending_items,
            'findings_contributed': findings_contributed
        }

# Initialize collaboration manager
collab_manager = CollaborativeResearchManager()
print("✓ Collaborative Research Manager Initialized")

✓ Collaborative Research Manager Initialized
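
`create_session` derives `session_id` from a timestamp with one-second resolution, so two sessions created within the same second would share an ID and the later one would replace the earlier entry in `self.sessions`. One possible remedy, sketched here under that assumption, is to append a short random suffix. `make_session_id` is a hypothetical helper, not an existing method.

```python
import uuid
from datetime import datetime


def make_session_id(prefix: str = "session") -> str:
    """Build an ID from a timestamp plus a random suffix to avoid same-second collisions."""
    stamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    suffix = uuid.uuid4().hex[:8]  # 8 hex characters of randomness
    return f"{prefix}_{stamp}_{suffix}"


# Generate several IDs in quick succession; the suffix keeps them distinct
# even when the timestamp part is identical.
for _ in range(3):
    print(make_session_id())
```

The same pattern would apply to the `plan_...` IDs built in `create_research_plan`, which use the same timestamp format.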


# Research Impact Predictor
Predict potential impact and relevance of research papers.

In [46]:
class ResearchImpactPredictor:
    """Predict and analyze research impact potential"""

    def __init__(self, model_name: str = "models/gemini-2.0-flash"):
        self.model = genai.GenerativeModel(model_name)
        self.predictions: List[Dict] = []

    def predict_impact(self, paper_info: Dict) -> Dict:
        """Predict the potential impact of a research paper"""

        prompt = f"""As a research impact analyst, evaluate this paper's potential impact:

Title: {paper_info.get('title', 'Unknown')}
Authors: {paper_info.get('authors', 'Unknown')}
Abstract: {paper_info.get('abstract', 'No abstract')}
Field: {paper_info.get('field', 'General')}
Year: {paper_info.get('year', 'Unknown')}

Analyze and score (1-10) each factor:
1. **Novelty Score**: How novel is the contribution?
2. **Methodological Rigor**: How sound is the methodology?
3. **Practical Applicability**: Real-world application potential?
4. **Reproducibility**: How reproducible are the results?
5. **Citation Potential**: Likelihood of being highly cited?
6. **Industry Relevance**: Relevance to industry applications?
7. **Interdisciplinary Appeal**: Appeal across multiple fields?

Also provide:
- Overall Impact Score (weighted average)
- Key strengths
- Potential limitations
- Recommended audience
- Predicted citation range (5-year)"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        prediction = {
            'paper_info': paper_info,
            'prediction': response.text,
            'timestamp': datetime.now().isoformat()
        }

        self.predictions.append(prediction)

        return prediction

    def compare_paper_impacts(self, papers: List[Dict]) -> str:
        """Compare predicted impacts of multiple papers"""

        papers_summary = json.dumps(papers, indent=2)

        prompt = f"""Compare the potential research impact of these papers:

{papers_summary}

Provide:
1. Comparative impact ranking
2. Unique strengths of each paper
3. Which paper is most likely to:
   - Be highly cited
   - Influence future research
   - Have practical applications
   - Appeal to interdisciplinary audiences
4. Overall recommendation for priority reading"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def identify_trending_topics(self, field: str,
                                 time_range: str = "recent") -> str:
        """Identify trending research topics in a field"""

        prompt = f"""As a research trend analyst, identify trending topics in {field}:

Time Range: {time_range}

Provide:
1. Top 5 emerging research topics
2. Why each topic is gaining traction
3. Key papers driving each trend
4. Predicted trajectory (growing, peaking, declining)
5. Opportunities for new researchers
6. Potential risks of oversaturation"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def generate_research_opportunity_report(self,
                                            researcher_profile: Dict) -> str:
        """Generate personalized research opportunity report"""

        prompt = f"""Generate a personalized research opportunity report:

Researcher Profile:
- Expertise: {researcher_profile.get('expertise', 'General')}
- Current Focus: {researcher_profile.get('current_focus', 'Not specified')}
- Career Stage: {researcher_profile.get('career_stage', 'Unknown')}
- Available Resources: {researcher_profile.get('resources', 'Standard')}

Provide:
1. Top 5 research opportunities aligned with profile
2. Gap analysis: underexplored areas matching expertise
3. Collaboration opportunities
4. Funding landscape for suggested topics
5. Timeline recommendations
6. Risk assessment for each opportunity"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

# Initialize impact predictor
impact_predictor = ResearchImpactPredictor()
print("✓ Research Impact Predictor Initialized")

✓ Research Impact Predictor Initialized
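
`predict_impact` stores the model's answer as free text, so the seven 1-10 factor scores are not directly available as numbers. The sketch below shows one way they might be recovered with a regular expression. It assumes the response repeats each factor name from the prompt followed by a number on the same line, which the model is not guaranteed to do; `extract_scores` and the sample response are hypothetical.

```python
import re
from typing import Dict

# Factor names copied from the prompt in predict_impact.
FACTORS = [
    "Novelty Score", "Methodological Rigor", "Practical Applicability",
    "Reproducibility", "Citation Potential", "Industry Relevance",
    "Interdisciplinary Appeal",
]


def extract_scores(text: str) -> Dict[str, float]:
    """Find 'Factor ... <number>' on a single line for each known factor."""
    scores: Dict[str, float] = {}
    for factor in FACTORS:
        # Factor name, then non-digit characters on the same line, then a number.
        match = re.search(rf'{re.escape(factor)}[^\d\n]*(\d+(?:\.\d+)?)',
                          text, re.IGNORECASE)
        if match:
            value = float(match.group(1))
            if 1 <= value <= 10:  # discard numbers outside the requested scale
                scores[factor] = value
    return scores


sample_response = """1. **Novelty Score**: 8/10 - introduces a new benchmark
2. **Methodological Rigor**: 6.5
3. Reproducibility: code not released, so hard to judge
"""
print(extract_scores(sample_response))
# → {'Novelty Score': 8.0, 'Methodological Rigor': 6.5}
```

Factors without a parsable score are simply omitted, so callers can tell a missing score from a low one.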


# Automated Literature Mapping
Create visual literature maps showing research evolution.

In [47]:
class LiteratureMapper:
    """Create literature maps and research evolution timelines"""

    def __init__(self, model_name: str = "models/gemini-2.0-flash"):
        self.model = genai.GenerativeModel(model_name)
        self.maps: Dict[str, Dict] = {}

    def create_literature_map(self, topic: str,
                             papers: Optional[List[Dict]] = None) -> Dict:
        """Create a comprehensive literature map for a topic"""

        prompt = f"""Create a comprehensive literature map for: {topic}

Generate a structured map including:
1. **Foundational Works** (seminal papers that started the field)
2. **Major Branches** (different research directions that emerged)
3. **Key Milestones** (breakthrough papers with dates)
4. **Current Frontiers** (latest active research areas)
5. **Methodology Evolution** (how methods have changed over time)
6. **Influential Authors** (researchers who shaped the field)
7. **Connections** (how different branches relate to each other)

Format as a structured hierarchy that could be visualized as a mind map."""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        map_id = f"map_{topic.lower().replace(' ', '_')}_{datetime.now().strftime('%Y%m%d')}"

        literature_map = {
            'map_id': map_id,
            'topic': topic,
            'created_at': datetime.now().isoformat(),
            'map_content': response.text,
            'papers_included': papers or [],
            'version': 1
        }

        self.maps[map_id] = literature_map

        return literature_map

    def create_evolution_timeline(self, topic: str,
                                  start_year: int = 2000) -> str:
        """Create a timeline showing research evolution"""

        prompt = f"""Create a research evolution timeline for: {topic}
Starting from year: {start_year}

Format as a chronological timeline with:
- Year
- Key development/paper
- Impact on the field
- What it enabled/changed

Include:
1. Technical breakthroughs
2. Methodology shifts
3. Major applications
4. Paradigm changes
5. Current state and future directions"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def identify_research_schools(self, topic: str) -> str:
        """Identify different schools of thought in a research area"""

        prompt = f"""Identify different schools of thought/research traditions in: {topic}

For each school, provide:
1. Name/label for the school
2. Core beliefs/assumptions
3. Key proponents (researchers/institutions)
4. Preferred methodologies
5. Signature papers
6. Critiques and limitations
7. Current relevance

Also analyze:
- How schools have interacted/competed
- Synthesis attempts
- Emerging unified frameworks"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def generate_reading_path(self, topic: str,
                             expertise_level: str = "beginner",
                             time_available: str = "medium") -> str:
        """Generate an optimized reading path through literature"""

        prompt = f"""Create an optimized reading path for learning about: {topic}

Reader Profile:
- Expertise Level: {expertise_level}
- Time Available: {time_available} (few hours / days / weeks)

Provide a structured reading path with:
1. **Foundation Papers** (must-read first, with order)
2. **Core Concepts** (papers covering key ideas)
3. **Methodology Deep-Dives** (technical papers)
4. **Recent Advances** (cutting-edge work)
5. **Critical Perspectives** (papers that challenge assumptions)

For each paper suggest:
- Why it's important
- What to focus on
- Estimated reading time
- Prerequisites"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def export_map_as_mermaid(self, map_id: str) -> str:
        """Export literature map as Mermaid diagram syntax"""

        if map_id not in self.maps:
            return "Map not found"

        lit_map = self.maps[map_id]

        # Generate Mermaid-compatible diagram
        prompt = f"""Convert this literature map to Mermaid diagram syntax:

Topic: {lit_map['topic']}
Map Content:
{lit_map['map_content']}

Create a Mermaid mindmap or flowchart diagram that visualizes the key relationships.
Use proper Mermaid syntax that can be rendered directly."""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

# Initialize literature mapper
lit_mapper = LiteratureMapper()
print("✓ Literature Mapper Initialized")

✓ Literature Mapper Initialized
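
`export_map_as_mermaid` returns the model's text unchanged. If the model wraps its answer in a Markdown code fence, which the prompt does not rule out, the result cannot be passed to a Mermaid renderer as is. The following sketch, with the hypothetical helper `strip_code_fence`, removes one surrounding fence when present and otherwise returns the trimmed text.

```python
import re

FENCE = "`" * 3  # built programmatically so this example contains no literal fence


def strip_code_fence(text: str) -> str:
    """Remove one surrounding Markdown code fence (with optional language tag)."""
    pattern = rf'^\s*{FENCE}[\w-]*[ \t]*\n(.*?)\n?{FENCE}\s*$'
    match = re.match(pattern, text, re.DOTALL)
    return match.group(1).strip() if match else text.strip()


# Illustrative model output wrapped in a fence tagged "mermaid".
raw = f"{FENCE}mermaid\nmindmap\n  root((Transformers))\n    Attention\n{FENCE}\n"
print(strip_code_fence(raw))
```

Text that arrives without a fence passes through with only leading and trailing whitespace removed.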


# Smart Query Expansion and Refinement
Intelligently expand and refine research queries for better results.

In [48]:
class SmartQueryExpander:
    """Intelligently expand and refine research queries"""

    def __init__(self, model_name: str = "models/gemini-2.0-flash"):
        self.model = genai.GenerativeModel(model_name)
        self.query_history: List[Dict] = []

    def expand_query(self, original_query: str,
                    expansion_type: str = "comprehensive") -> Dict:
        """Expand a research query with related terms and concepts"""

        prompt = f"""Expand this research query for comprehensive literature search:

Original Query: {original_query}
Expansion Type: {expansion_type}

Provide:
1. **Synonyms and Alternative Terms**
2. **Related Concepts** (broader and narrower)
3. **Technical Variations** (different terminology in subfields)
4. **Methodological Keywords** (related methods/techniques)
5. **Application Domains** (where this applies)
6. **Boolean Query** (optimized search string)
7. **Recommended Databases** (best places to search)
8. **Search Filters** (suggested year range, document types)

Also flag any potential ambiguities in the original query."""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        expanded = {
            'original_query': original_query,
            'expansion_type': expansion_type,
            'expanded_query': response.text,
            'timestamp': datetime.now().isoformat()
        }

        self.query_history.append(expanded)

        return expanded

    def refine_query_iteratively(self, query: str,
                                 search_results_summary: str) -> str:
        """Refine query based on initial search results"""

        prompt = f"""Refine this research query based on initial results:

Original Query: {query}

Search Results Summary:
{search_results_summary}

Analyze results and provide:
1. Are results too broad or too narrow?
2. Missing important concepts?
3. Irrelevant results to filter out?
4. Refined query suggestions (3 variations)
5. Additional filters to apply
6. Alternative search strategies"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def generate_pico_query(self, research_question: str) -> str:
        """Generate PICO-formatted query for systematic reviews"""

        prompt = f"""Convert this research question to PICO format:

Research Question: {research_question}

Provide structured PICO breakdown:
- **P**opulation/Problem: Who or what is being studied?
- **I**ntervention/Exposure: What is the intervention or exposure?
- **C**omparison: What is the comparison group?
- **O**utcome: What outcomes are measured?

Then generate:
1. PICO-based search queries
2. MeSH terms (if applicable)
3. Boolean search string
4. Inclusion/exclusion criteria suggestions"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def suggest_cross_disciplinary_queries(self, topic: str,
                                           home_discipline: str) -> str:
        """Suggest queries for cross-disciplinary exploration"""

        prompt = f"""Suggest cross-disciplinary search strategies:

Topic: {topic}
Home Discipline: {home_discipline}

Provide queries for searching in:
1. Adjacent fields (closely related)
2. Distant fields (unexpected connections)
3. Applied domains
4. Theoretical foundations

For each, explain:
- Why this discipline might have relevant work
- Key terminology differences
- Potential collaboration opportunities
- Translation of concepts across fields"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

# Initialize query expander
query_expander = SmartQueryExpander()
print("✓ Smart Query Expander Initialized")

✓ Smart Query Expander Initialized
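
The expander asks the model for an optimized Boolean query as part of its free-text answer. Once synonym groups are known, the same kind of string can also be assembled deterministically. The sketch below assumes the common convention of joining synonyms with OR and distinct concepts with AND; `build_boolean_query` and the sample terms are hypothetical, and individual databases may use different operator syntax.

```python
from typing import List


def build_boolean_query(concept_groups: List[List[str]]) -> str:
    """OR together the synonyms within each group, then AND the groups."""
    clauses = []
    for group in concept_groups:
        # Quote multi-word phrases and drop empty terms.
        terms = [f'"{t}"' if ' ' in t else t for t in group if t.strip()]
        if not terms:
            continue
        clauses.append(terms[0] if len(terms) == 1 else f"({' OR '.join(terms)})")
    return ' AND '.join(clauses)


groups = [
    ["transformer", "self-attention"],
    ["time series", "forecasting"],
    ["benchmark"],
]
print(build_boolean_query(groups))
# → (transformer OR self-attention) AND ("time series" OR forecasting) AND benchmark
```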


# Research Writing Assistant
Help with academic writing, from abstracts to full papers.

In [49]:
class ResearchWritingAssistant:
    """Comprehensive research writing assistance"""

    def __init__(self, model_name: str = "models/gemini-2.0-flash"):
        self.model = genai.GenerativeModel(model_name)
        self.drafts: Dict[str, Dict] = {}

    def generate_abstract(self, paper_content: Dict,
                         word_limit: int = 250,
                         style: str = "structured") -> str:
        """Generate an academic abstract"""

        prompt = f"""Generate an academic abstract ({word_limit} words max):

Paper Details:
- Title: {paper_content.get('title', 'Untitled')}
- Research Question: {paper_content.get('research_question', '')}
- Methods: {paper_content.get('methods', '')}
- Key Findings: {paper_content.get('findings', '')}
- Implications: {paper_content.get('implications', '')}

Style: {style} (structured with Background/Methods/Results/Conclusions OR narrative)

Generate a compelling abstract that:
1. Hooks the reader
2. Clearly states the problem
3. Summarizes methodology
4. Highlights key findings
5. States implications
6. Uses field-appropriate terminology"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def improve_paragraph(self, paragraph: str,
                         improvement_focus: str = "clarity") -> str:
        """Improve academic writing quality"""

        prompt = f"""Improve this academic paragraph:

Original:
{paragraph}

Focus: {improvement_focus} (clarity/conciseness/flow/formality/precision)

Provide:
1. Improved version
2. Specific changes made
3. Explanation of why changes improve the writing
4. Alternative phrasings for key sentences
5. Academic writing tips for similar cases"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def generate_section_outline(self, section_type: str,
                                paper_topic: str,
                                key_points: Optional[List[str]] = None) -> str:
        """Generate detailed section outline"""

        prompt = f"""Generate a detailed outline for a {section_type} section:

Paper Topic: {paper_topic}
Key Points to Include: {json.dumps(key_points or [])}

Provide:
1. Logical structure with subsections
2. Key arguments for each part
3. Transition suggestions between paragraphs
4. Where to place citations
5. Common pitfalls to avoid
6. Word count recommendations per subsection"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def check_argument_logic(self, argument_text: str) -> str:
        """Analyze logical structure of arguments"""

        prompt = f"""Analyze the logical structure of this academic argument:

{argument_text}

Evaluate:
1. **Premise Identification**: What are the stated/implied premises?
2. **Logic Flow**: Does the conclusion follow from premises?
3. **Evidence Quality**: Is evidence sufficient and relevant?
4. **Potential Fallacies**: Any logical fallacies present?
5. **Counter-arguments**: What objections might be raised?
6. **Strengthening Suggestions**: How to make the argument stronger?

Rate overall argument strength (1-10) with justification."""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def generate_response_to_reviewers(self,
                                       reviewer_comments: List[str],
                                       paper_context: str = "") -> str:
        """Generate professional responses to reviewer comments"""

        comments_formatted = "\n".join([f"Comment {i+1}: {c}"
                                       for i, c in enumerate(reviewer_comments)])

        prompt = f"""Generate professional responses to these reviewer comments:

Paper Context: {paper_context}

Reviewer Comments:
{comments_formatted}

For each comment, provide:
1. Acknowledgment of the point
2. How you will address it (or respectful disagreement with reasoning)
3. Specific changes to be made
4. Location in manuscript (if applicable)

Maintain a professional, grateful tone throughout."""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def paraphrase_for_plagiarism_avoidance(self, text: str,
                                            original_source: str = "") -> str:
        """Paraphrase text while maintaining academic integrity"""

        prompt = f"""Paraphrase this text for academic use:

Original Text:
{text}

Source: {original_source}

Provide:
1. Paraphrased version (completely rewritten)
2. Key points preserved
3. Proper citation format
4. Integration suggestions (how to weave into your writing)
5. Warning signs of too-close paraphrasing to avoid"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

# Initialize writing assistant
writing_assistant = ResearchWritingAssistant()
print("✓ Research Writing Assistant Initialized")

✓ Research Writing Assistant Initialized


# Experiment Design Advisor
Help design research experiments and studies.

In [50]:
class ExperimentDesignAdvisor:
    """Advise on research experiment and study design"""

    def __init__(self, model_name: str = "models/gemini-2.0-flash"):
        self.model = genai.GenerativeModel(model_name)
        self.designs: List[Dict] = []

    def design_experiment(self, research_question: str,
                         field: str = "general",
                         constraints: Dict = None) -> Dict:
        """Design a comprehensive research experiment"""

        constraints_text = json.dumps(constraints or {})

        prompt = f"""Design a rigorous research experiment:

Research Question: {research_question}
Field: {field}
Constraints: {constraints_text}

Provide comprehensive experimental design:

1. **Study Type** (experimental, quasi-experimental, observational, etc.)
2. **Variables**
   - Independent variables
   - Dependent variables
   - Control variables
   - Confounding variables to address
3. **Sample Design**
   - Population definition
   - Sampling method
   - Sample size calculation rationale
   - Inclusion/exclusion criteria
4. **Procedure**
   - Step-by-step protocol
   - Randomization approach
   - Blinding strategy
5. **Data Collection**
   - Instruments/measures
   - Data collection timeline
   - Quality assurance measures
6. **Analysis Plan**
   - Statistical tests
   - Effect size expectations
   - Power analysis
7. **Validity Considerations**
   - Internal validity threats and mitigations
   - External validity considerations
   - Construct validity
8. **Ethical Considerations**
   - IRB requirements
   - Informed consent elements
   - Risk mitigation
9. **Timeline and Resources**
   - Estimated duration
   - Required resources
   - Budget considerations"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        design = {
            'research_question': research_question,
            'field': field,
            'constraints': constraints,
            'design': response.text,
            'created_at': datetime.now().isoformat()
        }

        self.designs.append(design)

        return design

    def critique_design(self, design_description: str) -> str:
        """Provide critical feedback on an experimental design"""

        prompt = f"""Critically evaluate this research design:

{design_description}

Provide:
1. **Strengths**
   - What's well-designed
   - Methodological rigor elements
2. **Weaknesses**
   - Potential threats to validity
   - Missing elements
   - Logical gaps
3. **Specific Recommendations**
   - Priority improvements
   - Alternative approaches
4. **Risk Assessment**
   - What could go wrong
   - How to mitigate
5. **Feasibility Analysis**
   - Resource requirements
   - Timeline realism
6. **Overall Assessment** (1-10) with justification"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def suggest_analysis_methods(self, data_description: str,
                                research_questions: List[str]) -> str:
        """Suggest appropriate statistical analysis methods"""

        questions_text = "\n".join([f"- {q}" for q in research_questions])

        prompt = f"""Recommend statistical analysis methods:

Data Description:
{data_description}

Research Questions:
{questions_text}

For each research question, provide:
1. Recommended primary analysis
2. Assumptions to check
3. Alternative analyses if assumptions violated
4. Effect size measures
5. Visualization recommendations
6. Software/code suggestions (R, Python, SPSS)
7. Common pitfalls to avoid"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def generate_preregistration(self, study_info: Dict) -> str:
        """Generate a study preregistration document"""

        prompt = f"""Generate a preregistration document:

Study Information:
{json.dumps(study_info, indent=2)}

Create a comprehensive preregistration following standard templates:
1. Study Information
   - Title
   - Authors
   - Description
2. Design Plan
   - Study type
   - Blinding
   - Study design
3. Sampling Plan
   - Existing data
   - Data collection procedures
   - Sample size
   - Stopping rule
4. Variables
   - Measured variables
   - Indices
5. Analysis Plan
   - Statistical models
   - Transformations
   - Inference criteria
   - Exploratory analysis
6. Other
   - Any other relevant information"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

# Initialize experiment advisor
experiment_advisor = ExperimentDesignAdvisor()
print("✓ Experiment Design Advisor Initialized")

✓ Experiment Design Advisor Initialized
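
The `design_experiment` prompt asks the model for a sample size rationale and a power analysis. Because that answer is generated text, it can help to sanity-check the suggested numbers locally. The sketch below is one possible check, not part of the notebook: `sample_size_per_group` is a hypothetical helper that applies the standard normal approximation for a two-sided comparison of two group means, n = 2 * ((z_(1-alpha/2) + z_power) / d)^2, where d is the standardized effect size (Cohen's d).

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size: float,
                          alpha: float = 0.05,
                          power: float = 0.80) -> int:
    """Approximate participants needed per group (normal approximation).

    effect_size is Cohen's d; alpha is the two-sided significance level.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


# A medium effect (d = 0.5) at alpha = 0.05 and 80% power
print(sample_size_per_group(0.5))  # 63
```

This is an approximation: an exact calculation based on the t distribution gives a slightly larger number, so treat the result as a lower bound when comparing it against the design the model proposes.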


# Research Integrity Checker
Check for potential issues with research integrity and reproducibility.

In [51]:
class ResearchIntegrityChecker:
    """Check research for integrity and reproducibility issues"""

    def __init__(self, model_name: str = "models/gemini-2.0-flash"):
        self.model = genai.GenerativeModel(model_name)
        self.checks: List[Dict] = []

    def check_reproducibility(self, methods_section: str,
                             field: str = "general") -> Dict:
        """Evaluate reproducibility of methods description"""

        prompt = f"""Evaluate the reproducibility of this methods section:

Field: {field}

Methods:
{methods_section}

Assess:
1. **Completeness** (1-10)
   - Are all steps clearly described?
   - Could another researcher replicate this?
2. **Missing Information**
   - What details are missing?
   - What assumptions are unstated?
3. **Ambiguities**
   - Vague language that needs clarification
   - Multiple possible interpretations
4. **Technical Details**
   - Software versions mentioned?
   - Parameters specified?
   - Data availability addressed?
5. **Reproducibility Checklist**
   - [ ] Data availability
   - [ ] Code availability
   - [ ] Environment specification
   - [ ] Random seed documentation
   - [ ] Hardware requirements
6. **Recommendations**
   - Specific additions needed
   - Format improvements"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        check_result = {
            'type': 'reproducibility',
            'field': field,
            'input_length': len(methods_section),
            'assessment': response.text,
            'timestamp': datetime.now().isoformat()
        }

        self.checks.append(check_result)

        return check_result

    def detect_potential_issues(self, paper_text: str) -> Dict:
        """Detect potential research integrity issues"""

        # Truncate the input to 3000 characters to stay within API limits
        prompt = f"""Analyze this research text for potential integrity issues:

{paper_text[:3000]}

Check for:
1. **Statistical Issues**
   - P-hacking indicators
   - HARKing (Hypothesizing After Results are Known)
   - Selective reporting signs
2. **Logical Issues**
   - Overclaiming from data
   - Unsupported conclusions
   - Cherry-picking evidence
3. **Citation Issues**
   - Missing key citations
   - Self-citation patterns
   - Citation accuracy concerns
4. **Transparency Issues**
   - Conflicts of interest disclosure
   - Funding acknowledgment
   - Data sharing statement
5. **Risk Level**
   - Low/Medium/High for each category
   - Overall assessment

Note: This is for educational purposes to help improve research quality."""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return {
            'type': 'integrity_check',
            'assessment': response.text,
            'timestamp': datetime.now().isoformat()
        }

    def generate_transparency_checklist(self, study_type: str) -> str:
        """Generate a transparency checklist for a study type"""

        prompt = f"""Generate a comprehensive transparency checklist for: {study_type}

Include checkpoints for:
1. Pre-registration requirements
2. Data sharing standards
3. Code availability
4. Materials availability
5. Reporting guidelines (CONSORT, PRISMA, etc., as applicable)
6. Conflict of interest disclosure
7. Author contribution statements
8. Ethical approval documentation
9. Funding disclosure
10. Limitations acknowledgment

Format as an actionable checklist with explanations."""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

    def verify_statistical_claims(self, claims_text: str) -> str:
        """Verify statistical claims and calculations"""

        prompt = f"""Verify these statistical claims:

{claims_text}

Check:
1. **Calculation Verification**
   - Do reported statistics match described data?
   - Are confidence intervals consistent with p-values?
2. **Interpretation Accuracy**
   - Are statistical results interpreted correctly?
   - Is effect size discussed appropriately?
3. **Common Errors**
   - Correlation vs causation conflation
   - Base rate neglect
   - Multiple comparison issues
4. **Missing Information**
   - What additional stats should be reported?
   - What context is needed?
5. **Red Flags**
   - Suspiciously round numbers
   - Impossible statistics
   - Inconsistencies"""

        time.sleep(CONFIG.get('rate_limit_delay', 2.0))
        response = self.model.generate_content(prompt)

        return response.text

# Initialize integrity checker
integrity_checker = ResearchIntegrityChecker()
print("✓ Research Integrity Checker Initialized")

✓ Research Integrity Checker Initialized
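
`verify_statistical_claims` asks the model whether reported confidence intervals are consistent with reported p-values. That particular check can also be done deterministically. The sketch below is an illustration under stated assumptions rather than part of the notebook: `p_value_from_ci` and `ci_matches_p` are hypothetical helpers, and they assume a symmetric normal-theory interval on an additive scale (a mean difference or a log ratio) with a null value of 0.

```python
from statistics import NormalDist


def p_value_from_ci(estimate: float, ci_low: float, ci_high: float,
                    level: float = 0.95) -> float:
    """Two-sided p-value implied by a symmetric confidence interval (null = 0)."""
    if not ci_low < ci_high:
        raise ValueError("ci_low must be smaller than ci_high")
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    std_error = (ci_high - ci_low) / (2 * z_crit)       # recover the standard error
    z = abs(estimate) / std_error
    return 2 * (1 - NormalDist().cdf(z))


def ci_matches_p(estimate: float, ci_low: float, ci_high: float,
                 reported_p: float, tolerance: float = 0.01) -> bool:
    """True when the reported p-value is within `tolerance` of the implied one."""
    implied = p_value_from_ci(estimate, ci_low, ci_high)
    return abs(implied - reported_p) <= tolerance


# Estimate 1.0 with 95% CI [0.02, 1.98] implies p close to 0.0455
print(round(p_value_from_ci(1.0, 0.02, 1.98), 4))
print(ci_matches_p(1.0, 0.02, 1.98, reported_p=0.045))  # consistent
print(ci_matches_p(1.0, 0.02, 1.98, reported_p=0.30))   # inconsistent
```

A mismatch does not prove an error, since the paper may have used a different test or scale, but it is a cheap flag to raise before or alongside the model's assessment.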


# Integration and Helper Functions
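
The workflow helpers in this section wrap each stage in the same `try`/`except` block and record either a `complete` status with a result or a `failed` status with the error message. As a sketch of how that repeated pattern could be expressed once, the hypothetical `run_stage` helper below takes the stage dictionary, a stage name, and a zero-argument callable. The stage functions here are stand-in lambdas, not the notebook's agents.

```python
from typing import Any, Callable, Dict


def run_stage(stages: Dict[str, Dict[str, Any]],
              name: str,
              func: Callable[[], Any]) -> None:
    """Run one stage and record its outcome under `name`."""
    try:
        stages[name] = {'status': 'complete', 'result': func()}
    except Exception as exc:  # mirror the notebook: record the failure and continue
        stages[name] = {'status': 'failed', 'error': str(exc)}


stages: Dict[str, Dict[str, Any]] = {}
run_stage(stages, 'query_expansion', lambda: "expanded query")
run_stage(stages, 'literature_search', lambda: 1 / 0)  # deliberately fails

completed = sum(1 for s in stages.values() if s['status'] == 'complete')
print(f"Completed {completed}/{len(stages)} stages")
```

Because a failed stage is recorded rather than raised, later stages still run, which matches how the end-to-end workflow reports a count of completed stages at the end.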

In [52]:
# ============================================================
# INTEGRATED HELPER FUNCTIONS
# ============================================================

def comprehensive_paper_analysis(paper_info: Dict) -> Dict:
    """Perform comprehensive analysis of a paper using all available tools"""
    print("=" * 60)
    print("COMPREHENSIVE PAPER ANALYSIS")
    print("=" * 60)

    results = {}

    # Resolve the paper ID once so the graph insert and the lookup use the same key
    paper_id = paper_info.get('id') or f"paper_{datetime.now().timestamp()}"

    # 1. Impact prediction
    print("\n📊 Analyzing impact potential...")
    results['impact'] = impact_predictor.predict_impact(paper_info)

    # 2. Add to knowledge graph
    print("🕸️ Adding to knowledge graph...")
    research_graph.add_paper(
        paper_id=paper_id,
        title=paper_info.get('title', 'Unknown'),
        authors=paper_info.get('authors', []),
        concepts=paper_info.get('concepts', []),
        methodologies=paper_info.get('methodologies', [])
    )

    # 3. Find related work
    print("🔗 Finding related papers...")
    results['related'] = research_graph.find_related_papers(paper_id, depth=2)

    print("\n✓ Comprehensive analysis complete!")
    return results

def start_research_project(topic: str, researcher_profile: Dict = None) -> Dict:
    """Initialize a complete research project with all tools"""
    print("=" * 60)
    print(f"STARTING RESEARCH PROJECT: {topic}")
    print("=" * 60)

    project = {}

    # 1. Expand query
    print("\n🔍 Expanding search query...")
    project['expanded_query'] = query_expander.expand_query(topic)

    # 2. Create literature map
    print("🗺️ Creating literature map...")
    project['literature_map'] = lit_mapper.create_literature_map(topic)

    # 3. Generate research plan
    print("📋 Creating research plan...")
    project['research_plan'] = planning_agent.create_research_plan(
        research_question=topic,
        expertise_level=researcher_profile.get('expertise', 'intermediate') if researcher_profile else 'intermediate'
    )

    # 4. Identify opportunities
    if researcher_profile:
        print("💡 Identifying opportunities...")
        project['opportunities'] = impact_predictor.generate_research_opportunity_report(researcher_profile)

    print("\n✓ Research project initialized!")
    return project

def generate_full_research_report(session_id: str = None) -> str:
    """Generate a comprehensive research report"""

    report = f"""
{'='*60}
SCHOLARMIND RESEARCH REPORT
Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
{'='*60}

## Knowledge Graph Statistics
- Total Nodes: {len(research_graph.nodes)}
- Total Edges: {len(research_graph.edges)}

## Research Gaps Identified
"""
    gaps = research_graph.identify_research_gaps()
    for gap in gaps[:5]:
        report += f"- {gap['type']}: {gap.get('concept', gap.get('methods', 'Unknown'))}\n"

    report += f"""
## Query History
- Total Queries Expanded: {len(query_expander.query_history)}

## Impact Predictions Made
- Total Predictions: {len(impact_predictor.predictions)}

## Active Research Plans
- Total Plans: {len(planning_agent.plans)}
"""

    return report

def display_all_capabilities():
    """Display all available capabilities"""
    print("""
╔══════════════════════════════════════════════════════════════╗
║           SCHOLARMIND AI - COMPLETE CAPABILITIES             ║
╠══════════════════════════════════════════════════════════════╣
║                                                              ║
║  🧠 CORE RESEARCH TOOLS                                      ║
║  ├─ search_arxiv_papers()      - Search academic databases   ║
║  ├─ summarize_paper()          - Extract key findings        ║
║  ├─ compare_methodologies()    - Compare research methods    ║
║  ├─ generate_literature_review() - Create lit reviews        ║
║  ├─ manage_citations()         - Handle citations            ║
║  └─ extract_research_insights() - Extract patterns           ║
║                                                              ║
║  🕸️ KNOWLEDGE GRAPH (NEW)                                    ║
║  ├─ research_graph.add_paper() - Add papers to graph         ║
║  ├─ research_graph.find_related_papers() - Find connections  ║
║  ├─ research_graph.identify_research_gaps() - Find gaps      ║
║  ├─ research_graph.get_author_collaboration_network()        ║
║  └─ research_graph.export_graph() - Export for visualization ║
║                                                              ║
║  📋 RESEARCH PLANNING (NEW)                                  ║
║  ├─ planning_agent.create_research_plan() - Create plans     ║
║  ├─ planning_agent.update_milestone() - Track progress       ║
║  ├─ planning_agent.get_adaptive_recommendations()            ║
║  └─ planning_agent.export_plan() - Export plans              ║
║                                                              ║
║  🎨 MULTI-MODAL ANALYSIS (NEW)                               ║
║  ├─ multimodal_analyzer.analyze_research_figure()            ║
║  ├─ multimodal_analyzer.analyze_data_table()                 ║
║  ├─ multimodal_analyzer.explain_mathematical_notation()      ║
║  └─ multimodal_analyzer.cross_modal_synthesis()              ║
║                                                              ║
║  👥 COLLABORATION (NEW)                                      ║
║  ├─ collab_manager.create_session() - Start team sessions    ║
║  ├─ collab_manager.add_finding() - Share discoveries         ║
║  ├─ collab_manager.add_action_item() - Assign tasks          ║
║  ├─ collab_manager.generate_session_summary()                ║
║  └─ collab_manager.create_handoff_document()                 ║
║                                                              ║
║  📈 IMPACT PREDICTION (NEW)                                  ║
║  ├─ impact_predictor.predict_impact() - Score paper impact   ║
║  ├─ impact_predictor.compare_paper_impacts()                 ║
║  ├─ impact_predictor.identify_trending_topics()              ║
║  └─ impact_predictor.generate_research_opportunity_report()  ║
║                                                              ║
║  🗺️ LITERATURE MAPPING (NEW)                                 ║
║  ├─ lit_mapper.create_literature_map() - Map research fields ║
║  ├─ lit_mapper.create_evolution_timeline()                   ║
║  ├─ lit_mapper.identify_research_schools()                   ║
║  ├─ lit_mapper.generate_reading_path()                       ║
║  └─ lit_mapper.export_map_as_mermaid()                       ║
║                                                              ║
║  🔍 SMART QUERY EXPANSION (NEW)                              ║
║  ├─ query_expander.expand_query() - Expand search terms      ║
║  ├─ query_expander.refine_query_iteratively()                ║
║  ├─ query_expander.generate_pico_query()                     ║
║  └─ query_expander.suggest_cross_disciplinary_queries()      ║
║                                                              ║
║  ✍️ WRITING ASSISTANT (NEW)                                  ║
║  ├─ writing_assistant.generate_abstract()                    ║
║  ├─ writing_assistant.improve_paragraph()                    ║
║  ├─ writing_assistant.generate_section_outline()             ║
║  ├─ writing_assistant.check_argument_logic()                 ║
║  ├─ writing_assistant.generate_response_to_reviewers()       ║
║  └─ writing_assistant.paraphrase_for_plagiarism_avoidance()  ║
║                                                              ║
║  🔬 EXPERIMENT DESIGN (NEW)                                  ║
║  ├─ experiment_advisor.design_experiment()                   ║
║  ├─ experiment_advisor.critique_design()                     ║
║  ├─ experiment_advisor.suggest_analysis_methods()            ║
║  └─ experiment_advisor.generate_preregistration()            ║
║                                                              ║
║  ✅ INTEGRITY CHECKER (NEW)                                  ║
║  ├─ integrity_checker.check_reproducibility()                ║
║  ├─ integrity_checker.detect_potential_issues()              ║
║  ├─ integrity_checker.generate_transparency_checklist()      ║
║  └─ integrity_checker.verify_statistical_claims()            ║
║                                                              ║
║  🚀 INTEGRATED WORKFLOWS                                     ║
║  ├─ comprehensive_paper_analysis() - Full paper analysis     ║
║  ├─ start_research_project() - Initialize new project        ║
║  ├─ generate_full_research_report() - Complete report        ║
║  └─ run_end_to_end_research_workflow() - Full automation     ║
║                                                              ║
╚══════════════════════════════════════════════════════════════╝
    """)

def run_end_to_end_research_workflow(topic: str,
                                     researcher_profile: Dict = None,
                                     output_format: str = "markdown") -> Dict:
    """Run a complete end-to-end research workflow"""

    print("=" * 60)
    print("END-TO-END RESEARCH WORKFLOW")
    print(f"Topic: {topic}")
    print("=" * 60)

    workflow_results = {
        'topic': topic,
        'started_at': datetime.now().isoformat(),
        'stages': {}
    }

    # Stage 1: Query Expansion
    print("\nüìç STAGE 1: Query Expansion")
    print("-" * 40)
    try:
        expanded = query_expander.expand_query(topic, "comprehensive")
        workflow_results['stages']['query_expansion'] = {
            'status': 'complete',
            'result': expanded
        }
        print("✓ Query expanded successfully")
    except Exception as e:
        workflow_results['stages']['query_expansion'] = {'status': 'failed', 'error': str(e)}
        print(f"✗ Query expansion failed: {e}")

    # Stage 2: Literature Search
    print("\n📍 STAGE 2: Literature Search")
    print("-" * 40)
    try:
        search_results = agent.run(f"Search for recent papers on {topic}")
        workflow_results['stages']['literature_search'] = {
            'status': 'complete',
            'result': search_results[:500] + "..." if len(search_results) > 500 else search_results
        }
        print("✓ Literature search complete")
    except Exception as e:
        workflow_results['stages']['literature_search'] = {'status': 'failed', 'error': str(e)}
        print(f"✗ Literature search failed: {e}")

    # Stage 3: Literature Mapping
    print("\n📍 STAGE 3: Literature Mapping")
    print("-" * 40)
    try:
        lit_map = lit_mapper.create_literature_map(topic)
        workflow_results['stages']['literature_mapping'] = {
            'status': 'complete',
            'map_id': lit_map['map_id']
        }
        print(f"✓ Literature map created: {lit_map['map_id']}")
    except Exception as e:
        workflow_results['stages']['literature_mapping'] = {'status': 'failed', 'error': str(e)}
        print(f"✗ Literature mapping failed: {e}")

    # Stage 4: Research Plan Creation
    print("\n📍 STAGE 4: Research Plan Creation")
    print("-" * 40)
    try:
        plan = planning_agent.create_research_plan(
            research_question=topic,
            expertise_level=researcher_profile.get('expertise', 'intermediate') if researcher_profile else 'intermediate'
        )
        workflow_results['stages']['research_plan'] = {
            'status': 'complete',
            'plan_id': plan['plan_id']
        }
        print(f"✓ Research plan created: {plan['plan_id']}")
    except Exception as e:
        workflow_results['stages']['research_plan'] = {'status': 'failed', 'error': str(e)}
        print(f"✗ Research plan creation failed: {e}")

    # Stage 5: Gap Analysis
    print("\n📍 STAGE 5: Research Gap Analysis")
    print("-" * 40)
    try:
        gaps = research_graph.identify_research_gaps()
        workflow_results['stages']['gap_analysis'] = {
            'status': 'complete',
            'gaps_found': len(gaps)
        }
        print(f"✓ Gap analysis complete: {len(gaps)} gaps identified")
    except Exception as e:
        workflow_results['stages']['gap_analysis'] = {'status': 'failed', 'error': str(e)}
        print(f"✗ Gap analysis failed: {e}")

    # Stage 6: Trending Topics
    print("\n📍 STAGE 6: Trending Topic Analysis")
    print("-" * 40)
    try:
        trends = impact_predictor.identify_trending_topics(topic.split()[0] if topic else "research")
        workflow_results['stages']['trending_analysis'] = {
            'status': 'complete',
            'result': trends[:500] + "..." if len(trends) > 500 else trends
        }
        print("✓ Trending topic analysis complete")
    except Exception as e:
        workflow_results['stages']['trending_analysis'] = {'status': 'failed', 'error': str(e)}
        print(f"✗ Trending analysis failed: {e}")

    # Stage 7: Reading Path Generation
    print("\n📍 STAGE 7: Reading Path Generation")
    print("-" * 40)
    try:
        reading_path = lit_mapper.generate_reading_path(
            topic,
            expertise_level=researcher_profile.get('expertise', 'beginner') if researcher_profile else 'beginner'
        )
        workflow_results['stages']['reading_path'] = {
            'status': 'complete',
            'result': reading_path[:500] + "..." if len(reading_path) > 500 else reading_path
        }
        print("✓ Reading path generated")
    except Exception as e:
        workflow_results['stages']['reading_path'] = {'status': 'failed', 'error': str(e)}
        print(f"✗ Reading path generation failed: {e}")

    # Finalize
    workflow_results['completed_at'] = datetime.now().isoformat()
    workflow_results['stages_completed'] = sum(
        1 for s in workflow_results['stages'].values() if s.get('status') == 'complete'
    )
    workflow_results['total_stages'] = len(workflow_results['stages'])

    print("\n" + "=" * 60)
    print("WORKFLOW COMPLETE")
    print("=" * 60)
    print(f"✓ Completed {workflow_results['stages_completed']}/{workflow_results['total_stages']} stages")

    return workflow_results


def analyze_paper_with_all_tools(paper_title: str,
                                  paper_abstract: str,
                                  authors: List[str] = None,
                                  methodologies: List[str] = None) -> Dict:
    """Comprehensive paper analysis using all available tools"""

    print("=" * 60)
    print(f"ANALYZING: {paper_title[:50]}...")
    print("=" * 60)

    analysis = {
        'paper_title': paper_title,
        'analyzed_at': datetime.now().isoformat()
    }

    # 1. Basic summarization
    print("\nüìù Generating summary...")
    try:
        summary = agent.run(f"Summarize this paper: {paper_title}. Abstract: {paper_abstract}")
        analysis['summary'] = summary
        print("‚úì Summary generated")
    except Exception as e:
        analysis['summary'] = f"Error: {e}"
        print(f"‚úó Summary failed: {e}")

    # 2. Impact prediction
    print("\nüìä Predicting impact...")
    try:
        impact = impact_predictor.predict_impact({
            'title': paper_title,
            'abstract': paper_abstract,
            'authors': authors or ['Unknown']
        })
        analysis['impact_prediction'] = impact
        print("✓ Impact prediction complete")
    except Exception as e:
        analysis['impact_prediction'] = f"Error: {e}"
        print(f"✗ Impact prediction failed: {e}")

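    # 3. Add to knowledge graph
    print("\n🕸️ Adding to knowledge graph...")
    # Derive the ID from a content digest. The built-in hash() is salted per
    # process for strings, so it would assign a different ID to the same title
    # on every run. The ID is computed outside the try block so that step 4
    # can always reference it.
    import hashlib
    title_digest = hashlib.sha256(paper_title.encode('utf-8')).hexdigest()
    paper_id = f"paper_{int(title_digest, 16) % 10000}"
    try:
        research_graph.add_paper(
            paper_id=paper_id,
            title=paper_title,
            authors=authors or [],
            concepts=extract_concepts_from_abstract(paper_abstract),
            methodologies=methodologies or []
        )
        analysis['knowledge_graph_id'] = paper_id
        print(f"✓ Added to graph with ID: {paper_id}")
    except Exception as e:
        analysis['knowledge_graph_id'] = f"Error: {e}"
        print(f"✗ Knowledge graph addition failed: {e}")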

    # 4. Find related papers
    print("\nüîó Finding related papers...")
    try:
        related = research_graph.find_related_papers(paper_id, depth=2)
        analysis['related_papers'] = related
        print(f"‚úì Found {len(related)} related papers")
    except Exception as e:
        analysis['related_papers'] = f"Error: {e}"
        print(f"‚úó Related paper search failed: {e}")

    # 5. Reproducibility check
    print("\n‚úÖ Checking reproducibility indicators...")
    try:
        repro_check = integrity_checker.check_reproducibility(paper_abstract)
        analysis['reproducibility'] = repro_check
        print("✓ Reproducibility check complete")
    except Exception as e:
        analysis['reproducibility'] = f"Error: {e}"
        print(f"✗ Reproducibility check failed: {e}")

    print("\n" + "=" * 60)
    print("PAPER ANALYSIS COMPLETE")
    print("=" * 60)

    return analysis


def extract_concepts_from_abstract(abstract: str) -> List[str]:
    """Extract key concepts from an abstract (simple heuristic)"""
    # Common ML/AI keywords to look for
    keywords = [
        'transformer', 'attention', 'neural network', 'deep learning',
        'machine learning', 'natural language', 'computer vision',
        'reinforcement learning', 'supervised', 'unsupervised',
        'classification', 'regression', 'clustering', 'embedding',
        'optimization', 'gradient', 'loss function', 'benchmark',
        'dataset', 'evaluation', 'accuracy', 'precision', 'recall'
    ]

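    abstract_lower = abstract.lower()
    # Anchor each keyword at a word start so 'supervised' is not reported for 'unsupervised',
    # while plurals such as 'transformers' or 'neural networks' still match
    found_concepts = [
        kw for kw in keywords
        if re.search(rf"\b{re.escape(kw)}", abstract_lower)
    ]

    return found_concepts[:10]  # At most 10 concepts, in keyword-list order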


def collaborative_literature_review(topic: str,
                                    team_members: List[str],
                                    session_name: Optional[str] = None) -> Dict:
    """Start a collaborative literature review session"""

    session_name = session_name or f"Lit Review: {topic}"

    print("=" * 60)
    print("COLLABORATIVE LITERATURE REVIEW")
    print(f"Topic: {topic}")
    print(f"Team: {', '.join(team_members)}")
    print("=" * 60)

    # Create collaboration session
    print("\n👥 Creating collaboration session...")
    session = collab_manager.create_session(
        title=session_name,
        participants=team_members,
        initial_context=f"Literature review on: {topic}"
    )

    # Generate initial literature map
    print("\n🗺️ Generating literature map...")
    lit_map = lit_mapper.create_literature_map(topic)

    # Generate reading path for each team member
    print("\n📚 Generating reading paths...")
    reading_paths = {}
    for member in team_members:
        reading_paths[member] = lit_mapper.generate_reading_path(
            topic,
            expertise_level="intermediate",
            time_available="medium"
        )

    # Create research plan
    print("\n📋 Creating research plan...")
    plan = planning_agent.create_research_plan(
        research_question=f"Comprehensive literature review on {topic}"
    )

    # Assign initial action items
    print("\n📝 Assigning initial tasks...")
    for member in team_members:
        collab_manager.add_action_item(
            session_id=session.session_id,
            assignee=member,
            task="Review assigned papers and add findings to session",
            priority="high"
        )

    return {
        'session': session,
        'literature_map': lit_map,
        'reading_paths': reading_paths,
        'research_plan': plan
    }


def export_research_project(project_name: str,
                            include_components: Optional[List[str]] = None) -> str:
    """Export all research project data"""

    include_components = include_components or [
        'knowledge_graph', 'plans', 'sessions',
        'predictions', 'queries', 'integrity_checks'
    ]

    export_data = {
        'project_name': project_name,
        'exported_at': datetime.now().isoformat(),
        'components': {}
    }

    if 'knowledge_graph' in include_components:
        export_data['components']['knowledge_graph'] = json.loads(research_graph.export_graph())

    if 'plans' in include_components:
        export_data['components']['research_plans'] = planning_agent.plans

    if 'sessions' in include_components:
        export_data['components']['collaboration_sessions'] = {
            sid: {
                'title': s.title,
                'participants': s.participants,
                'findings_count': len(s.findings),
                'action_items_count': len(s.action_items)
            }
            for sid, s in collab_manager.sessions.items()
        }

    if 'predictions' in include_components:
        export_data['components']['impact_predictions'] = impact_predictor.predictions

    if 'queries' in include_components:
        export_data['components']['query_history'] = query_expander.query_history

    if 'integrity_checks' in include_components:
        export_data['components']['integrity_checks'] = integrity_checker.checks

    # Write to file (explicit UTF-8 so non-ASCII titles and names export reliably)
    filename = f"{project_name.lower().replace(' ', '_')}_export.json"
    with open(filename, 'w', encoding='utf-8') as f:
        json.dump(export_data, f, indent=2, default=str)

    print(f"✓ Project exported to: {filename}")
    return filename


def quick_research_assistant(query: str) -> str:
    """Quick research assistant for common tasks"""

    query_lower = query.lower()

    # Detect query intent and route to appropriate tool
    if any(word in query_lower for word in ['search', 'find', 'papers on', 'articles about']):
        print("🔍 Detected: Paper search request")
        # The expansion result is not yet passed to the agent; the original query is searched
        query_expander.expand_query(query)
        return agent.run(query)

    elif any(word in query_lower for word in ['summarize', 'summary of', 'explain']):
        print("📝 Detected: Summarization request")
        return agent.run(query)

    elif any(word in query_lower for word in ['compare', 'difference between', 'vs']):
        print("⚖️ Detected: Comparison request")
        return agent.run(query)

    elif any(word in query_lower for word in ['cite', 'citation', 'reference']):
        print("📚 Detected: Citation request")
        return agent.run(query)

    elif any(word in query_lower for word in ['literature review', 'review of']):
        print("📖 Detected: Literature review request")
        return agent.run(query)

    elif any(word in query_lower for word in ['trending', 'hot topics', 'popular']):
        print("📈 Detected: Trending topics request")
        # Treat the last word as the field; the name avoids shadowing dataclasses.field
        words = query.split()
        research_field = words[-1] if len(words) > 2 else "research"
        return impact_predictor.identify_trending_topics(research_field)

    elif any(word in query_lower for word in ['design', 'experiment', 'study design']):
        print("🔬 Detected: Experiment design request")
        return experiment_advisor.design_experiment(query)['design']

    elif any(word in query_lower for word in ['write', 'draft', 'abstract']):
        print("✍️ Detected: Writing assistance request")
        return writing_assistant.generate_abstract({'title': query, 'research_question': query})

    else:
        print("🤖 Detected: General research query")
        return agent.run(query)


def show_research_dashboard():
    """Display comprehensive research dashboard"""

    print("""
╔══════════════════════════════════════════════════════════════╗
║                  SCHOLARMIND RESEARCH DASHBOARD              ║
╠══════════════════════════════════════════════════════════════╣
""")

    # Agent Stats
    if agent:
        stats = agent.get_stats()
        print(f"║  📊 AGENT STATISTICS                                         ║")
        print(f"║  ├─ Queries Processed: {stats['queries_processed']:<35} ║")
        print(f"║  ├─ Tools Called: {stats['tools_called']:<40} ║")
        print(f"║  ├─ Avg Response Time: {stats['avg_response_time']:.2f}s{' '*32} ║")
        print(f"║  └─ Errors: {stats['errors']:<46} ║")

    print(f"║                                                              ║")

    # Knowledge Graph Stats
    print(f"║  🕸️ KNOWLEDGE GRAPH                                          ║")
    print(f"║  ├─ Total Nodes: {len(research_graph.nodes):<41} ║")
    print(f"║  ├─ Total Edges: {len(research_graph.edges):<41} ║")
    papers = sum(1 for n in research_graph.nodes.values() if n.node_type == 'paper')
    authors = sum(1 for n in research_graph.nodes.values() if n.node_type == 'author')
    concepts = sum(1 for n in research_graph.nodes.values() if n.node_type == 'concept')
    print(f"║  ├─ Papers: {papers:<46} ║")
    print(f"║  ├─ Authors: {authors:<45} ║")
    print(f"║  └─ Concepts: {concepts:<44} ║")

    print(f"║                                                              ║")

    # Research Plans
    print(f"║  📋 RESEARCH PLANS                                           ║")
    print(f"║  └─ Active Plans: {len(planning_agent.plans):<40} ║")

    print(f"║                                                              ║")

    # Collaboration Sessions
    print(f"║  👥 COLLABORATION                                            ║")
    print(f"║  └─ Active Sessions: {len(collab_manager.sessions):<37} ║")

    print(f"║                                                              ║")

    # Other Stats
    print(f"║  📈 ANALYTICS                                                ║")
    print(f"║  ├─ Impact Predictions: {len(impact_predictor.predictions):<34} ║")
    print(f"║  ├─ Queries Expanded: {len(query_expander.query_history):<36} ║")
    print(f"║  ├─ Literature Maps: {len(lit_mapper.maps):<37} ║")
    print(f"║  └─ Integrity Checks: {len(integrity_checker.checks):<36} ║")

    print("""║                                                              ║
╚══════════════════════════════════════════════════════════════╝
    """)


# ============================================================
# CONVENIENCE WRAPPER FUNCTIONS
# ============================================================

def search(query: str, max_results: int = 10) -> str:
    """Shortcut for paper search"""
    return agent.run(f"Search for {max_results} papers on: {query}")

def summarize(paper_title: str, abstract: str = "") -> str:
    """Shortcut for paper summarization"""
    return agent.run(f"Summarize: {paper_title}. {abstract}")

def compare(paper1: str, paper2: str) -> str:
    """Shortcut for methodology comparison"""
    return agent.run(f"Compare methodologies of {paper1} and {paper2}")

def cite(papers: str, style: str = "APA") -> str:
    """Shortcut for citation generation"""
    return agent.run(f"Generate {style} citations for: {papers}")

def review(topic: str, length: str = "medium") -> str:
    """Shortcut for literature review"""
    return agent.run(f"Generate a {length} literature review on: {topic}")

def gaps(topic: Optional[str] = None) -> List[Dict]:
    """Shortcut for research gap identification (topic is currently unused)"""
    return research_graph.identify_research_gaps()

def trends(field: str) -> str:
    """Shortcut for trending topics"""
    return impact_predictor.identify_trending_topics(field)

def plan(question: str, hours: float = 40) -> Dict:
    """Shortcut for research plan creation"""
    return planning_agent.create_research_plan(question, hours)

def map_lit(topic: str) -> Dict:
    """Shortcut for literature mapping"""
    return lit_mapper.create_literature_map(topic)

def expand(query: str) -> Dict:
    """Shortcut for query expansion"""
    return query_expander.expand_query(query)


# ============================================================
# PRINT FINAL STATUS
# ============================================================

print("\n" + "=" * 60)
print("✓ ALL INNOVATIVE FEATURES INITIALIZED")
print("=" * 60)
print("""
📦 NEW MODULES LOADED:
  ✓ Research Knowledge Graph
  ✓ Research Planning Agent
  ✓ Multi-Modal Research Analyzer
  ✓ Collaborative Research Manager
  ✓ Research Impact Predictor
  ✓ Literature Mapper
  ✓ Smart Query Expander
  ✓ Research Writing Assistant
  ✓ Experiment Design Advisor
  ✓ Research Integrity Checker

🚀 INTEGRATED WORKFLOWS:
  ✓ comprehensive_paper_analysis()
  ✓ start_research_project()
  ✓ run_end_to_end_research_workflow()
  ✓ collaborative_literature_review()
  ✓ analyze_paper_with_all_tools()

⚡ QUICK FUNCTIONS:
  search(), summarize(), compare(), cite(), review()
  gaps(), trends(), plan(), map_lit(), expand()

📊 DASHBOARDS:
  ✓ display_all_capabilities()
  ✓ show_research_dashboard()

Type 'display_all_capabilities()' to see all available features!
""")





In [53]:
# ============================================================
# DEMONSTRATION OF NEW FEATURES
# ============================================================

print("=" * 60)
print("SCHOLARMIND ENHANCED FEATURES DEMO")
print("=" * 60)

# Demo 1: Quick Research Dashboard
print("\n📊 Research Dashboard:")
show_research_dashboard()

# Demo 2: End-to-end workflow
print("\n🚀 Running mini end-to-end workflow...")
researcher = {
    'expertise': 'intermediate',
    'current_focus': 'natural language processing',
    'career_stage': 'PhD student'
}

# Quick paper analysis
print("\n📝 Quick paper analysis demo:")
paper_info = {
    'title': 'Attention Is All You Need',
    'abstract': 'The dominant sequence transduction models are based on complex recurrent or convolutional neural networks...',
    'authors': ['Vaswani', 'Shazeer', 'Parmar'],
    'field': 'machine learning'
}
impact = impact_predictor.predict_impact(paper_info)
print("✓ Impact prediction generated")

# Query expansion demo
print("\n🔍 Query expansion demo:")
expanded = query_expander.expand_query("transformer attention mechanisms NLP")
print("✓ Query expanded")

# Literature map demo
print("\n🗺️ Literature mapping demo:")
lit_map = lit_mapper.create_literature_map("transformer architectures")
print(f"✓ Map created: {lit_map['map_id']}")

# Display capabilities
print("\n" + "=" * 60)
display_all_capabilities()

SCHOLARMIND ENHANCED FEATURES DEMO

📊 Research Dashboard:

╔══════════════════════════════════════════════════════════════╗
║                  SCHOLARMIND RESEARCH DASHBOARD              ║
╠══════════════════════════════════════════════════════════════╣

║  📊 AGENT STATISTICS                                         ║
║  ├─ Queries Processed: 4                                   ║
║  ├─ Tools Called: 4                                        ║
║  ├─ Avg Response Time: 23.49s                                 ║
║  └─ Errors: 0                                              ║
║                                                              ║
║  🕸️ KNOWLEDGE GRAPH

---
# ✅ ScholarMind Agent Summary

---

## ☑️ Example Workflow

1. **Search for papers** using `test_agent('Find papers on transformers')`
2. **Summarize findings** with automatic key point extraction
3. **Compare methodologies** across multiple research papers
4. **Generate literature reviews** with proper academic structure
5. **Manage citations** in APA, MLA, Chicago, IEEE, Harvard styles
6. **Export conversations** with `export_conversation_history('file.txt')`
7. **Search history** with `search_conversation('keyword')`
8. **Batch process queries** with `batch_query(['q1', 'q2', 'q3'])`
9. **Track performance** with real-time analytics
10. **Reset agent** with `reset_agent()` for new research sessions

---

## ☑️ Agent Capabilities

### ✔️ Core Features
- ✅ arXiv & academic database search
- ✅ Intelligent paper summarization
- ✅ Methodology comparison and analysis
- ✅ Literature review generation
- ✅ Multi-style citation management
- ✅ Research insights extraction
- ✅ Context-aware responses

### ✔️ Advanced Features
- ✅ Conversation memory management (20 messages max)
- ✅ Real-time performance analytics
- ✅ Quality validation and feedback
- ✅ Batch query processing
- ✅ Auto-summarization when memory limits are approached
- ✅ Session export and persistence
- ✅ Dynamic configuration management
- ✅ Comprehensive agent logs export (JSON format)
- ✅ Multi-point response validation (6 quality checks)
- ✅ User feedback collection system
- ✅ Performance metrics tracking & trending
- ✅ Formatted batch results display
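
The 20-message cap and auto-summarization listed above could be implemented along the following lines. This is a minimal sketch, not the notebook's actual memory code: `BoundedMemory`, `naive_summarizer`, and the policy of folding the older half of the history into a single summary message are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def naive_summarizer(messages: List[Message]) -> str:
    """Hypothetical placeholder: a real system would ask the LLM for a summary."""
    return " | ".join(f"{m['role']}: {m['content'][:40]}" for m in messages)


class BoundedMemory:
    """Hold at most `max_messages` messages, compacting older ones into a summary."""

    def __init__(self, max_messages: int = 20,
                 summarizer: Callable[[List[Message]], str] = naive_summarizer):
        if max_messages < 2:
            raise ValueError("max_messages must be at least 2")
        self.max_messages = max_messages
        self.summarizer = summarizer
        self.messages: List[Message] = []

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})
        if len(self.messages) > self.max_messages:
            self._compact()

    def _compact(self) -> None:
        # Keep the most recent half verbatim; replace everything older with one summary
        keep = self.max_messages // 2
        older, recent = self.messages[:-keep], self.messages[-keep:]
        summary = {"role": "system", "content": "Summary: " + self.summarizer(older)}
        self.messages = [summary] + recent


memory = BoundedMemory(max_messages=20)
for i in range(25):
    memory.add("user", f"message {i}")
print(len(memory.messages))  # stays at or below 20
```

Compacting half of the history at once, rather than evicting one message per turn, keeps the number of summarization calls low. The right trade-off depends on how expensive the summarizer is.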

### ✔️ Quality Assurance
- ✅ Academic-quality response validation (6-point system)
- ✅ Comprehensive error tracking with retry logic
- ✅ Performance trend analysis over time
- ✅ Memory usage monitoring and auto-management
- ✅ Citation accuracy verification
- ✅ Feedback analytics and continuous improvement
- ✅ Response quality scoring with suggestions

---

## ☑️ Available Commands

### ✔️ Research Management
- `test_agent('your question')` - Ask research questions
- `search_conversation('keyword')` - Search past conversations
- `export_conversation_history('file.txt')` - Export full history
- `export_agent_logs('file.json')` - Export comprehensive logs
- `reset_agent()` - Clear memory and reset statistics
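
As an illustration of what a history export involves, the sketch below writes a list of messages to a UTF-8 JSON file. It is not the notebook's `export_conversation_history`: the function name `export_history_sketch`, the message schema, and the JSON layout are assumptions.

```python
import json
import os
import tempfile
from datetime import datetime
from typing import Dict, List


def export_history_sketch(messages: List[Dict[str, str]], path: str) -> str:
    """Write messages to `path` as UTF-8 JSON and return the path."""
    payload = {
        "exported_at": datetime.now().isoformat(),
        "message_count": len(messages),
        "messages": messages,
    }
    # ensure_ascii=False keeps non-ASCII text readable in the exported file
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2, ensure_ascii=False)
    return path


history = [
    {"role": "user", "content": "Find papers on transformers"},
    {"role": "assistant", "content": "Here are some candidate papers."},
]
out_path = export_history_sketch(
    history, os.path.join(tempfile.gettempdir(), "history_sketch.json")
)
with open(out_path, encoding="utf-8") as handle:
    print(json.load(handle)["message_count"])
```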

### ✔️ Query Processing
- `batch_query(['q1', 'q2', 'q3'])` - Process multiple queries at once
- `display_batch_results(results)` - Show formatted batch results
- `display_statistics()` - View performance metrics
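
A batch helper in the spirit of `batch_query` might look like the following. It is only a sketch: `batch_query_sketch`, its `handler` parameter, and the result keys are hypothetical and are not taken from the notebook's implementation. It illustrates capturing errors per query so that one failure does not abort the rest of the batch.

```python
from typing import Callable, Dict, List, Optional


def batch_query_sketch(queries: List[str],
                       handler: Callable[[str], str]) -> List[Dict[str, Optional[object]]]:
    """Run each query through `handler`, recording either a response or an error."""
    results: List[Dict[str, Optional[object]]] = []
    for index, query in enumerate(queries, start=1):
        try:
            results.append({"index": index, "query": query,
                            "response": handler(query), "error": None})
        except Exception as exc:  # keep going; the failure is reported in the result
            results.append({"index": index, "query": query,
                            "response": None, "error": str(exc)})
    return results


def echo_handler(query: str) -> str:
    """Stand-in for a real call such as agent.run(query)."""
    if not query.strip():
        raise ValueError("empty query")
    return f"answer to: {query}"


results = batch_query_sketch(["q1", "", "q3"], echo_handler)
for item in results:
    status = "ok" if item["error"] is None else f"failed ({item['error']})"
    print(item["index"], status)
```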

### ✔️ Agent Configuration
- `show_agent_config()` - Display current configuration
- `configure_agent(temperature=0.5, max_tokens=3000)` - Dynamically reconfigure agent

### ✔️ Memory Management
- `summarize_conversation()` - Summarize conversation history
- `auto_summarize_if_needed()` - Auto-summarize when approaching memory limit

### ✔️ Quality Assurance & Feedback
- `validate_response(question, response)` - Validate response quality
- `auto_validate_response(question, response)` - Auto-validate with suggestions
- `collect_feedback(question, response, rating=5, comments='Great!')` - Collect user feedback
- `show_feedback_summary()` - Display feedback analytics

### ✔️ Performance Monitoring
- `track_performance_metrics()` - Snapshot current performance
- `show_performance_trends()` - Display performance trends over time
- `export_performance_data('file.json')` - Export performance history

---

## ☑️ Performance Metrics Tracked

- **Queries Processed** - Total number of research queries handled
- **Tools Called** - Number of specialized tool invocations
- **Average Response Time** - Mean response latency
- **Error Count** - Number of errors encountered
- **Memory Usage** - Current message count in memory
- **Citation Accuracy** - Quality of generated citations
- **Research Quality Score** - Academic response validation
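
The first four metrics correspond to the keys that `show_research_dashboard()` reads from `agent.get_stats()` (`queries_processed`, `tools_called`, `avg_response_time`, `errors`). A tracker producing such a dictionary could be sketched as below. The `AgentStats` class and its `record` method are assumptions for illustration, not the notebook's own code.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class AgentStats:
    """Hypothetical accumulator for per-query performance counters."""
    queries_processed: int = 0
    tools_called: int = 0
    errors: int = 0
    total_response_time: float = 0.0

    def record(self, response_time: float, tools_called: int = 0,
               error: bool = False) -> None:
        """Register one finished query."""
        self.queries_processed += 1
        self.tools_called += tools_called
        self.total_response_time += response_time
        if error:
            self.errors += 1

    def snapshot(self) -> Dict[str, Union[int, float]]:
        """Return the counters, deriving the mean latency on demand."""
        avg = (self.total_response_time / self.queries_processed
               if self.queries_processed else 0.0)
        return {
            "queries_processed": self.queries_processed,
            "tools_called": self.tools_called,
            "avg_response_time": avg,
            "errors": self.errors,
        }


stats = AgentStats()
stats.record(response_time=2.0, tools_called=1)
stats.record(response_time=4.0, error=True)
print(stats.snapshot())
```

Storing the running total rather than the average avoids accumulating rounding error and makes the zero-query case explicit.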

---

## ☑️ Architecture Patterns

### ✔️ Multi-Agent Pattern
- Independent agents with specialized research tools
- Coordinator manages agent orchestration
- Function calling for dynamic tool invocation
- Context sharing through memory system
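
One simple way to realize coordinator-style routing is keyword-based intent detection, as `quick_research_assistant()` does. The sketch below is an assumption-laden variant rather than the notebook's coordinator: `ROUTES`, `detect_intent`, and `route` are hypothetical names. It matches whole words (with an optional plural `s`) so that, for example, `cite` does not fire on "excited".

```python
import re
from typing import Callable, Dict, List, Tuple

# Checked in order; more specific intents come first. The phrases mirror a
# subset of those used by quick_research_assistant().
ROUTES: List[Tuple[str, List[str]]] = [
    ("literature_review", ["literature review", "review of"]),
    ("search", ["search", "find", "papers on"]),
    ("summarize", ["summarize", "summary of", "explain"]),
    ("compare", ["compare", "difference between", "vs"]),
    ("cite", ["cite", "citation", "reference"]),
]


def detect_intent(query: str) -> str:
    """Return the first intent whose phrase occurs as a whole word, else 'general'."""
    text = query.lower()
    for intent, phrases in ROUTES:
        for phrase in phrases:
            if re.search(rf"\b{re.escape(phrase)}s?\b", text):
                return intent
    return "general"


def route(query: str, handlers: Dict[str, Callable[[str], str]]) -> str:
    """Dispatch the query to the handler for its intent, falling back to 'general'."""
    handler = handlers.get(detect_intent(query), handlers["general"])
    return handler(query)


# Stand-in handlers; in the notebook these would be the specialized agents
handlers = {
    "search": lambda q: f"[search agent] {q}",
    "compare": lambda q: f"[comparison agent] {q}",
    "general": lambda q: f"[coordinator] {q}",
}
print(route("Find papers on transformers", handlers))
print(route("BERT vs GPT", handlers))
print(route("I am excited about graph learning", handlers))
```

Keyword routing is cheap and transparent but brittle. A function-calling model that selects the tool itself, as the list above describes, is the more flexible alternative.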

### ✔️ Observability Pattern
- Comprehensive logging system
- Performance metrics snapshots
- Conversation export & analysis
- Error tracking and debugging

### ✔️ Quality Assurance Pattern
- Academic response validation
- Automated quality scoring
- Citation format verification
- Research completeness checks

---

## ☑️ Usage Tips

1. Be specific with topics & methods
2. Use batch processing for efficiency
3. Export sessions regularly
4. Track performance metrics

---

## Citation Styles

APA | MLA | Chicago | IEEE | Harvard

---

## Research Domains

🤖 AI/ML | 🧠 Cognitive Science | 📊 Data Science | 💻 CS | 🔬 Natural Sciences | 📚 Social Sciences | 🏥 Health | 🎓 Interdisciplinary

---

## Notes

- Powered by Gemini 2.0 Flash
- 20-message memory (auto-managed)
- 5 citation styles supported
- Full conversation logging
- Performance metrics tracking

---

**@Sterling Syntax** | Suprava Saha Dibya, Abdulla Al Noman | Nov 2025

---