# 🔬 AI Research Assistant

[![Python](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![LangChain](https://img.shields.io/badge/LangChain-0.1.0-green.svg)](https://github.com/langchain-ai/langchain)
[![OpenAI](https://img.shields.io/badge/OpenAI-GPT--3.5-orange.svg)](https://openai.com/)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)

## A Comprehensive Research Assistant Built with LangChain

> **Transform your research workflow with AI-powered multi-source intelligence gathering, document processing, and analytics.**

### 🌟 **Key Features**

- 🌐 **Multi-source Research**: Integrates web search, arXiv, and Google Scholar
- 🧠 **Intelligent Memory**: Contextual memory system for research history and user preferences
- 📄 **Document Processing**: Splits documents into chunks, stores their embeddings in a vector store, and retrieves them by semantic search
- 🎯 **Interactive Interfaces**: Both notebook and web-based interfaces for flexible usage
- 📊 **Project Management**: Organize and track research projects with detailed reports
- 📈 **Data Export**: Export research data in JSON, CSV, and Excel formats
- 📉 **Analytics Dashboard**: Visualize research trends, costs, and productivity metrics
- 💰 **Cost Tracking**: Estimate token usage and cost for each query
- 🔍 **Quality Analysis**: Heuristic scoring of research findings by length, citation markers, and confidence phrases

### 🚀 **Quick Start**

1. **📦 Install Dependencies**: Run the installation cell below
2. **🔐 Set API Key**: Configure your OpenAI API key
3. **⚙️ Initialize Assistant**: Set up the research assistant
4. **🎯 Start Researching**: Use the various research functions

### 📖 **Table of Contents**

- [Installation](#installation)
- [Configuration](#configuration)
- [Core Components](#core-components)
- [Usage Examples](#usage-examples)
- [Advanced Features](#advanced-features)
- [Analytics & Visualization](#analytics--visualization)
- [Contributing](#contributing)
- [License](#license)

---

> **⚠️ Important**: This notebook requires an OpenAI API key. Make sure to set up your API key before running the cells.
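
One way to make the key available before running the cells is through the process environment. The sketch below is illustrative only: `resolve_api_key` is a hypothetical helper that is not part of this notebook, and it accepts the environment as a plain mapping so it can be tried without a real key.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the OpenAI key from `env`, or None if it is missing or blank.

    Hypothetical helper for illustration; the notebook's own lookup lives in
    setup_api_key() in the configuration cell.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    return key or None


# Placeholder value, not a real key
print(resolve_api_key({"OPENAI_API_KEY": "sk-placeholder"}))
print(resolve_api_key({}))
```

Passing the mapping explicitly keeps the lookup testable; calling `resolve_api_key()` with no argument reads the real environment.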

In [None]:
# 📦 Installation

import subprocess
import sys
import platform

def check_python_version():
    """Check if Python version is compatible"""
    version = sys.version_info
    # Compare as a tuple so that a future major version (e.g. 4.0) is not rejected
    if version >= (3, 8):
        print(f"✅ Python {version.major}.{version.minor}.{version.micro} is compatible")
        return True
    else:
        print(f"❌ Python {version.major}.{version.minor}.{version.micro} is not compatible")
        print("🔄 Please upgrade to Python 3.8 or higher")
        return False

def install_requirements():
    """Install all required packages from requirements.txt"""
    try:
        print("📦 Installing dependencies...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", "-r", "requirements.txt", "--quiet"])
        print("✅ All dependencies installed successfully!")
        return True
    except subprocess.CalledProcessError as e:
        print(f"❌ Error installing dependencies: {e}")
        print("💡 Try running: pip install -r requirements.txt")
        return False

def display_system_info():
    """Display system information"""
    print(f"🖥️  System: {platform.system()} {platform.release()}")
    print(f"🐍 Python: {sys.version.split()[0]}")
    print(f"📍 Platform: {platform.platform()}")

# Check system compatibility
print("🔍 System Compatibility Check")
print("=" * 40)
display_system_info()
python_ok = check_python_version()

if python_ok:
    print("\n📋 Required Packages:")
    try:
        with open("requirements.txt", "r") as f:
            requirements = f.read().strip().split("\n")
            for req in requirements:
                if req.strip() and not req.strip().startswith("#"):
                    print(f"  📌 {req}")
    except FileNotFoundError:
        print("⚠️  requirements.txt file not found")
    
    print("\n🚀 Ready to install dependencies!")
    print("💡 Uncomment the line below to install all packages:")
    print("   # install_requirements()")
else:
    print("❌ Please upgrade Python before proceeding")

# Uncomment the line below to install dependencies
# install_requirements()
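
The compatibility check and the requirements listing above can be reduced to two small pure functions, which makes them easy to test without touching the file system. This is a sketch rather than the notebook's code: `is_supported` and `parse_requirements` are hypothetical names, and the sample requirement lines are made up for the example.

```python
from typing import List, Tuple


def is_supported(version: Tuple[int, int], minimum: Tuple[int, int] = (3, 8)) -> bool:
    """Tuple comparison checks major and minor together, so (4, 0) passes."""
    return version >= minimum


def parse_requirements(text: str) -> List[str]:
    """Keep non-empty lines that are not comments (leading whitespace ignored)."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


# Invented sample content, not the project's real requirements.txt
sample = "# core\nlangchain\n\n  # indented comment\nopenai\n"
print(is_supported((3, 7)), is_supported((4, 0)))  # False True
print(parse_requirements(sample))                  # ['langchain', 'openai']
```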

In [None]:
# Import required libraries
import os
import json
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from datetime import datetime, timedelta
from typing import List, Dict, Any, Optional
import warnings
warnings.filterwarnings('ignore')

# LangChain imports
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.agents import initialize_agent, Tool, AgentType
from langchain.utilities import DuckDuckGoSearchAPIWrapper
from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

# Additional imports for research capabilities
import requests
from bs4 import BeautifulSoup
import arxiv
from scholarly import scholarly
import tiktoken
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

print("🚀 All imports successful!")
print("📊 Setting up plotting style...")

# Set up plotting style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

print("✅ Setup complete!")
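
If the import cell fails, it can help to find out which packages are missing before importing anything. The sketch below uses the standard library's `importlib.util.find_spec`; `missing_modules` is a hypothetical helper, and the module names passed to it are examples only. It is intended for top-level module names (a dotted name whose parent package is absent raises instead of returning `None`).

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found on this system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "json" ships with Python; the second name is deliberately bogus
print(missing_modules(["json", "surely_not_an_installed_module_xyz"]))
```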

In [None]:
# 🔐 API Key Configuration

import os
from pathlib import Path

def setup_api_key():
    """Setup OpenAI API key with multiple methods"""
    
    print("🔐 OpenAI API Key Setup")
    print("=" * 40)
    
    # Method 1: Environment variable (most secure)
    api_key = os.getenv("OPENAI_API_KEY")
    
    if api_key:
        print("✅ API key found in environment variables")
        os.environ["OPENAI_API_KEY"] = api_key
        return api_key
    
    # Method 2: .env file (recommended for development)
    env_file = Path(".env")
    if env_file.exists():
        print("📄 Found .env file, loading...")
        try:
            from dotenv import load_dotenv
            load_dotenv()
            api_key = os.getenv("OPENAI_API_KEY")
            if api_key:
                print("✅ API key loaded from .env file")
                return api_key
        except ImportError:
            print("⚠️  python-dotenv not installed, skipping .env file")
    
    # Method 3: Manual input (for testing only)
    print("⚠️  No API key found in environment variables or .env file")
    print("\n🔧 Setup Options:")
    print("   1. 🌍 Environment Variable (Recommended):")
    print("      export OPENAI_API_KEY='your_api_key_here'")
    print("   2. 📄 .env File (Development):")
    print("      Create .env file with: OPENAI_API_KEY=your_api_key_here")
    print("   3. 🔒 Manual Input (Testing Only):")
    print("      Uncomment and modify the line below")
    print("      # api_key = 'your_api_key_here'")
    
    return None

def setup_langsmith_tracing():
    """Setup optional LangSmith tracing for debugging"""
    
    langsmith_key = os.getenv("LANGCHAIN_API_KEY")
    
    if langsmith_key:
        print("\n🔍 LangSmith Configuration Found")
        print("=" * 30)
        os.environ["LANGCHAIN_TRACING_V2"] = "true"
        os.environ["LANGCHAIN_PROJECT"] = os.getenv("LANGCHAIN_PROJECT", "ai-research-assistant")
        print("✅ LangSmith tracing enabled for debugging")
        print(f"📊 Project: {os.environ['LANGCHAIN_PROJECT']}")
        return True
    else:
        print("\n💡 LangSmith tracing not configured (optional)")
        return False

# Setup API key
OPENAI_API_KEY = setup_api_key()

# Setup optional tracing
setup_langsmith_tracing()

# Final status
print("\n🎯 Configuration Status:")
print("=" * 25)
if OPENAI_API_KEY:
    print("✅ OpenAI API: Configured")
    print("🚀 Ready to proceed!")
else:
    print("❌ OpenAI API: Not configured")
    print("⚠️  Please set up your API key before continuing")
    print("🔗 Get your API key: https://platform.openai.com/api-keys")

# Security reminder
print("\n🔒 Security Reminder:")
print("   • Never commit API keys to version control")
print("   • Use environment variables in production")
print("   • Monitor your API usage regularly")
print("   • Keep your API keys secure and private")

In [None]:
# 🧠 Research Memory System

class ResearchMemory:
    """
    Intelligent memory system for research preferences and history
    """
    
    def __init__(self):
        self.preferences = {
            'summary_style': 'comprehensive',
            'preferred_sources': ['academic', 'web', 'news'],
            'max_results': 5,
            'include_analysis': True
        }
        self.research_history = []
        self.session_stats = {
            'queries_conducted': 0,
            'total_tokens': 0,
            'total_cost': 0.0,
            'session_start': datetime.now()
        }
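        # input_key tells the memory which prompt variable holds the user input;
        # without it, a chain with several input variables fails when saving context
        self.context_memory = ConversationBufferMemory(
            memory_key="chat_history",
            input_key="query",
            return_messages=True
        )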
    
    def update_preferences(self, key: str, value: Any):
        """Update user preferences"""
        self.preferences[key] = value
        print(f"✅ Updated {key} to {value}")
    
    def add_research_entry(self, query: str, result: Dict, cost: float = 0.0):
        """Add a research entry to history"""
        entry = {
            'timestamp': datetime.now(),
            'query': query,
            'result': result,
            'cost': cost,
            'tokens': result.get('tokens_used', 0)
        }
        self.research_history.append(entry)
        
        # Update session stats
        self.session_stats['queries_conducted'] += 1
        self.session_stats['total_tokens'] += result.get('tokens_used', 0)
        self.session_stats['total_cost'] += cost
    
    def get_recent_context(self, limit: int = 3) -> str:
        """Get recent research context for better continuity"""
        if not self.research_history:
            return ""
        
        recent_entries = self.research_history[-limit:]
        context = "Recent research context:\n"
        for entry in recent_entries:
            context += f"- {entry['query'][:100]}...\n"
        return context
    
    def get_session_summary(self) -> Dict:
        """Get current session summary"""
        duration = datetime.now() - self.session_stats['session_start']
        return {
            **self.session_stats,
            'session_duration': str(duration).split('.')[0],
            'avg_cost_per_query': self.session_stats['total_cost'] / max(1, self.session_stats['queries_conducted'])
        }

# Initialize research memory
research_memory = ResearchMemory()
print("🧠 Research memory system initialized!")
print(f"📊 Current preferences: {research_memory.preferences}")

In [None]:
# 📄 Document Processing System

class DocumentProcessor:
    """
    Advanced document processing with vector storage and semantic search
    """
    
    def __init__(self, api_key: str):
        self.embeddings = OpenAIEmbeddings(openai_api_key=api_key)
        self.text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000,
            chunk_overlap=200
        )
        self.vectorstore = None
        self.documents = []
        
    def process_documents(self, texts: List[str], metadata: List[Dict] = None):
        """Process and store documents in vector database"""
        if metadata is None:
            metadata = [{}] * len(texts)
            
        all_chunks = []
        all_metadata = []
        
        for i, text in enumerate(texts):
            chunks = self.text_splitter.split_text(text)
            all_chunks.extend(chunks)
            # Add metadata for each chunk
            chunk_metadata = [
                {**metadata[i], 'chunk_index': j, 'source_index': i}
                for j in range(len(chunks))
            ]
            all_metadata.extend(chunk_metadata)
        
        # Create or update vector store
        if self.vectorstore is None:
            self.vectorstore = Chroma.from_texts(
                all_chunks, 
                self.embeddings,
                metadatas=all_metadata
            )
        else:
            self.vectorstore.add_texts(all_chunks, metadatas=all_metadata)
        
        self.documents.extend(list(zip(texts, metadata)))
        print(f"📄 Processed {len(texts)} documents into {len(all_chunks)} chunks")
    
    def add_document_from_url(self, url: str) -> str:
        """Add document from URL"""
        try:
            loader = WebBaseLoader(url)
            docs = loader.load()
            
            if docs:
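                texts = [doc.page_content for doc in docs]
                # One metadata dict per loaded document, so process_documents()
                # can index metadata[i] for every text
                metadata = [{'source': url, 'type': 'web'} for _ in texts]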
                self.process_documents(texts, metadata)
                return f"✅ Successfully added document from {url}"
            else:
                return f"❌ Could not load content from {url}"
        except Exception as e:
            return f"❌ Error loading document from {url}: {str(e)}"
    
    def query_documents(self, query: str, k: int = 3) -> str:
        """Query documents using semantic search"""
        if self.vectorstore is None:
            return "❌ No documents loaded. Please add documents first."
        
        try:
            # Perform similarity search
            docs = self.vectorstore.similarity_search(query, k=k)
            
            if not docs:
                return "❌ No relevant documents found."
            
            # Format results
            results = []
            for i, doc in enumerate(docs, 1):
                metadata = doc.metadata
                source = metadata.get('source', 'Unknown')
                results.append(f"**Result {i}** (Source: {source}):\n{doc.page_content}\n")
            
            return "\n".join(results)
        except Exception as e:
            return f"❌ Error querying documents: {str(e)}"
    
    def get_document_stats(self) -> Dict:
        """Get statistics about loaded documents"""
        total_chunks = 0
        if self.vectorstore is not None:
            # Count the stored chunk IDs; get() fetches every entry, so this can be slow for large collections
            total_chunks = len(self.vectorstore.get()['ids'])
        
        return {
            'total_documents': len(self.documents),
            'total_chunks': total_chunks,
            'has_vectorstore': self.vectorstore is not None
        }

# Initialize document processor (will be created when API key is available)
doc_processor = None
if OPENAI_API_KEY:
    doc_processor = DocumentProcessor(OPENAI_API_KEY)
    print("📄 Document processor initialized!")
else:
    print("⚠️  Document processor not initialized - API key required")
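
To see what `chunk_size=1000` and `chunk_overlap=200` mean in practice, the sketch below uses a plain fixed-width sliding window. This is a deliberate simplification: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line boundaries, so its real chunks will differ. `sliding_chunks` is a hypothetical helper, not part of the notebook.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Cut `text` into windows of `chunk_size` that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts 800 characters after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


text = "x" * 2500
print([len(chunk) for chunk in sliding_chunks(text)])  # [1000, 1000, 900]
```

The overlap means the last 200 characters of one chunk reappear at the start of the next, which helps a sentence that straddles a boundary stay retrievable.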

In [None]:
# 🔬 Main Research Assistant Class

class ResearchAssistant:
    """
    Comprehensive AI Research Assistant with multi-source capabilities
    """
    
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.llm = ChatOpenAI(
            temperature=0.3,
            openai_api_key=api_key,
            model_name="gpt-3.5-turbo"
        )
        self.memory = research_memory
        self.doc_processor = doc_processor
        self.search_wrapper = DuckDuckGoSearchAPIWrapper()
        
        # Initialize cost tracking
        self.encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
        
        # Research prompt template
        self.research_prompt = PromptTemplate(
            input_variables=["query", "context", "style"],
            template="""
            You are an expert research assistant. Conduct comprehensive research on the following query.
            
            Query: {query}
            
            Context from previous research: {context}
            
            Research Style: {style}
            
            Please provide:
            1. A comprehensive summary of the topic
            2. Key findings and insights
            3. Current trends and developments
            4. Reliable sources and references
            5. Practical applications or implications
            
            Make sure to be thorough, accurate, and cite your sources when possible.
            """
        )
        
        self.research_chain = LLMChain(
            llm=self.llm,
            prompt=self.research_prompt,
            memory=self.memory.context_memory
        )
        
        print("🔬 Research Assistant initialized successfully!")
    
    def estimate_cost(self, text: str, model: str = "gpt-3.5-turbo") -> float:
        """Estimate the cost of a text processing request"""
        tokens = len(self.encoding.encode(text))
        
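        # Approximate flat rate applied to every token. Real pricing changes over time
        # and bills prompt and completion tokens at different rates, so treat the
        # result as a rough estimate.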
        if model == "gpt-3.5-turbo":
            cost_per_token = 0.0015 / 1000  # $0.0015 per 1K tokens
        else:
            cost_per_token = 0.002 / 1000   # Default fallback
        
        return tokens * cost_per_token
    
    def web_search(self, query: str, num_results: int = 5) -> List[Dict]:
        """Perform web search using DuckDuckGo"""
        try:
            search_results = self.search_wrapper.run(query)
            
            # Parse results (simplified)
            results = []
            if search_results:
                # Split by common separators and take first few results
                raw_results = search_results.split('\n')[:num_results]
                for i, result in enumerate(raw_results):
                    if result.strip():
                        results.append({
                            'title': f"Search Result {i+1}",
                            'content': result.strip(),
                            'source': 'web_search'
                        })
            
            return results
        except Exception as e:
            print(f"⚠️ Web search error: {str(e)}")
            return []
    
    def arxiv_search(self, query: str, max_results: int = 3) -> List[Dict]:
        """Search arXiv for academic papers"""
        try:
            client = arxiv.Client()
            search = arxiv.Search(
                query=query,
                max_results=max_results,
                sort_by=arxiv.SortCriterion.SubmittedDate
            )
            
            results = []
            for paper in client.results(search):
                results.append({
                    'title': paper.title,
                    'summary': paper.summary[:500] + "..." if len(paper.summary) > 500 else paper.summary,
                    'authors': [author.name for author in paper.authors],
                    'url': paper.entry_id,
                    'source': 'arxiv'
                })
            
            return results
        except Exception as e:
            print(f"⚠️ arXiv search error: {str(e)}")
            return []
    
    def research(self, query: str, include_analysis: bool = True) -> Dict:
        """
        Main research function that combines multiple sources
        """
        start_time = datetime.now()
        
        try:
            # Get context from memory
            context = self.memory.get_recent_context()
            style = self.memory.preferences['summary_style']
            
            # Gather information from multiple sources
            print(f"🔍 Searching for: {query}")
            
            # Web search
            web_results = self.web_search(query, 3)
            print(f"🌐 Found {len(web_results)} web results")
            
            # Academic search
            arxiv_results = self.arxiv_search(query, 2)
            print(f"📚 Found {len(arxiv_results)} academic papers")
            
            # Combine all source information
            all_sources = []
            source_content = ""
            
            for result in web_results + arxiv_results:
                source_info = f"Source: {result['source']}\n"
                if 'title' in result:
                    source_info += f"Title: {result['title']}\n"
                if 'content' in result:
                    source_info += f"Content: {result['content']}\n"
                elif 'summary' in result:
                    source_info += f"Summary: {result['summary']}\n"
                
                source_content += source_info + "\n---\n"
                all_sources.append(result.get('title', result.get('content', 'Unknown source')))
            
            # Generate comprehensive research summary
            enhanced_query = f"{query}\n\nAvailable information:\n{source_content}"
            
            # Estimate cost
            estimated_cost = self.estimate_cost(enhanced_query)
            
            # Generate research findings
            findings = self.research_chain.run(
                query=enhanced_query,
                context=context,
                style=style
            )
            
            # Calculate actual tokens used (approximation)
            total_tokens = len(self.encoding.encode(enhanced_query + findings))
            actual_cost = self.estimate_cost(enhanced_query + findings)
            
            # Prepare result
            result = {
                'query': query,
                'findings': findings,
                'sources': all_sources,
                'timestamp': datetime.now().isoformat(),
                'tokens_used': total_tokens,
                'cost': actual_cost,
                'processing_time': (datetime.now() - start_time).total_seconds()
            }
            
            # Add analysis if requested
            if include_analysis:
                result['analysis'] = self.analyze_research_quality(findings)
            
            # Update memory
            self.memory.add_research_entry(query, result, actual_cost)
            
            return result
            
        except Exception as e:
            error_result = {
                'query': query,
                'error': True,
                'findings': f"Research failed: {str(e)}",
                'sources': [],
                'timestamp': datetime.now().isoformat(),
                'tokens_used': 0,
                'cost': 0.0
            }
            return error_result
    
    def analyze_research_quality(self, findings: str) -> Dict:
        """Analyze the quality and completeness of research findings"""
        analysis = {
            'word_count': len(findings.split()),
            'has_citations': '(' in findings or '[' in findings,
            'confidence_indicators': [],
            'completeness_score': 0.0
        }
        
        # Look for confidence indicators
        confidence_words = ['studies show', 'research indicates', 'evidence suggests', 
                           'according to', 'data reveals', 'analysis confirms']
        
        for word in confidence_words:
            if word.lower() in findings.lower():
                analysis['confidence_indicators'].append(word)
        
        # Calculate completeness score
        score = 0.0
        if analysis['word_count'] > 100:
            score += 0.3
        if analysis['has_citations']:
            score += 0.3
        if len(analysis['confidence_indicators']) > 0:
            score += 0.2
        if len(findings.split('.')) > 5:  # Multiple sentences
            score += 0.2
        
        analysis['completeness_score'] = min(score, 1.0)
        
        return analysis

# Initialize the research assistant
research_assistant = None
if OPENAI_API_KEY:
    research_assistant = ResearchAssistant(OPENAI_API_KEY)
    print("🎉 Research Assistant ready for use!")
else:
    print("❌ Cannot initialize Research Assistant - API key required")
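
The scoring rule in `analyze_research_quality` is an additive heuristic: 0.3 for more than 100 words, 0.3 for any parenthesis or square bracket (treated as a citation marker), 0.2 for at least one confidence phrase, and 0.2 for more than five period-separated segments. The standalone sketch below reproduces that rule without an LLM call so the thresholds can be explored; `completeness_score` is a hypothetical name and the sample text is invented.

```python
CONFIDENCE_PHRASES = [
    "studies show", "research indicates", "evidence suggests",
    "according to", "data reveals", "analysis confirms",
]


def completeness_score(findings: str) -> float:
    """Re-implementation of the notebook's heuristic for offline experimentation."""
    lowered = findings.lower()
    score = 0.0
    if len(findings.split()) > 100:            # length
        score += 0.3
    if "(" in findings or "[" in findings:      # crude citation marker
        score += 0.3
    if any(phrase in lowered for phrase in CONFIDENCE_PHRASES):
        score += 0.2
    if len(findings.split(".")) > 5:            # several sentences
        score += 0.2
    return min(score, 1.0)


sample = ("Studies show solar costs fell [1]. Wind grew. Storage improved. "
          "Grids adapted. Policy helped.")
print(round(completeness_score(sample), 2))  # 0.7
```

Because the citation test only looks for bracket characters, any parenthetical remark earns the same 0.3 as a real reference, so the score is best read as a rough signal.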

## 🎯 Usage Examples and Demonstrations

Now that we have our Research Assistant set up, let's explore its capabilities with practical examples.

### 📋 **Prerequisites Checklist**

Before running the examples, ensure you have:
- ✅ Installed all dependencies
- ✅ Configured your OpenAI API key
- ✅ System status check passed
- ✅ All components initialized successfully

### 🔧 **System Status Check**

Run the cell below to verify that all components are properly initialized and ready to use.

> **💡 Tip**: If any components show as "Not Ready", please review the setup cells above.

In [None]:
# 🔧 System Status Check

def check_system_status():
    """Check if all components are properly initialized"""
    
    print("🔍 System Status Check")
    print("=" * 50)
    
    # Check components
    status = {
        'Python Version': sys.version_info >= (3, 8),
        'API Key': OPENAI_API_KEY is not None,
        'Research Assistant': 'research_assistant' in globals() and research_assistant is not None,
        'Document Processor': 'doc_processor' in globals() and doc_processor is not None,
        'Memory System': 'research_memory' in globals() and research_memory is not None,
        'Required Libraries': True  # Will be checked below
    }
    
    # Check required libraries
    try:
        import langchain
        import openai
        import tiktoken
        import pandas as pd
        import matplotlib.pyplot as plt
        import plotly.express as px
        status['Required Libraries'] = True
    except ImportError as e:
        status['Required Libraries'] = False
        print(f"⚠️  Missing library: {e}")
    
    # Display status
    all_ready = True
    for component, is_ready in status.items():
        emoji = "✅" if is_ready else "❌"
        status_text = "Ready" if is_ready else "Not Ready"
        print(f"{emoji} {component}: {status_text}")
        if not is_ready:
            all_ready = False
    
    print("=" * 50)
    
    if all_ready:
        print("🎉 All systems ready! You can start researching.")
        print("💡 Try running the example cells below to get started.")
    else:
        print("⚠️  Some components need attention.")
        print("📝 Please check the setup cells above and ensure:")
        print("   • All dependencies are installed")
        print("   • OpenAI API key is configured")
        print("   • All previous cells have been executed")
    
    return all_ready

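def display_system_info():
    """Display detailed session information.

    Note: this redefines the display_system_info() helper from the installation cell.
    """
    
    # Reuse the result of the check already run in this cell instead of
    # re-running it, which would print the status table a second time
    if not globals().get('system_ready', False):
        return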
    
    print("\n📊 System Information")
    print("=" * 30)
    
    # Display current preferences
    if 'research_memory' in globals() and research_memory:
        print("🎯 Current Research Preferences:")
        for key, value in research_memory.preferences.items():
            print(f"   {key}: {value}")
        
        print("\n📈 Session Statistics:")
        stats = research_memory.get_session_summary()
        for key, value in stats.items():
            print(f"   {key}: {value}")
    
    # Display cost estimation
    print("\n💰 Cost Estimation (per query):")
    print("   Small query (~500 tokens): $0.001 - $0.003")
    print("   Medium query (~1500 tokens): $0.003 - $0.007")
    print("   Large query (~3000 tokens): $0.007 - $0.015")
    
    print("\n🚀 Ready to Research!")
    print("💡 Example usage:")
    print("   result = research_assistant.research('your question here')")

# Run the system check
print("🔍 Checking system status...")
system_ready = check_system_status()

# Display additional info if ready
if system_ready:
    display_system_info()
else:
    print("\n🔧 Setup Guide:")
    print("1. Run the installation cell to install dependencies")
    print("2. Configure your OpenAI API key")
    print("3. Execute all setup cells in order")
    print("4. Run this cell again to verify status")
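
As a worked example of the per-token arithmetic used by `estimate_cost`: at the notebook's assumed rate of $0.0015 per 1K tokens, the cost is `tokens * 0.0015 / 1000`. The sketch below takes the token count directly instead of calling `tiktoken`; `cost_for_tokens` is a hypothetical helper, and the rate is the notebook's approximation rather than a quoted price.

```python
def cost_for_tokens(tokens: int, rate_per_1k: float = 0.0015) -> float:
    """Flat-rate estimate: every token is charged at `rate_per_1k` dollars per 1,000."""
    return tokens * rate_per_1k / 1000


for count in (500, 1500, 3000):
    print(f"{count:>5} tokens -> ${cost_for_tokens(count):.5f}")
```

These single-rate figures come out below the ranges printed by the status cell, which are the notebook's own rough estimates for a whole query.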

In [None]:
# 🔍 Example 1: Basic Research Query

def run_basic_research_demo():
    """Demonstrate basic research functionality"""
    
    if not system_ready:
        print("⚠️  System not ready. Please complete the setup first.")
        print("💡 Make sure to set your OpenAI API key in the configuration cell above.")
        return
    
    print("🚀 Basic Research Demo")
    print("=" * 50)
    
    # Example research query
    example_query = "What are the latest developments in renewable energy technology?"
    
    print(f"📝 Research Query: {example_query}")
    print("⏳ Processing... (This may take 30-60 seconds)")
    print("🔍 Searching multiple sources...")
    
    try:
        # Perform research
        result = research_assistant.research(example_query, include_analysis=True)
        
        # Display results
        if not result.get('error'):
            print("\n✅ Research Completed Successfully!")
            print("=" * 50)
            
            # Main findings
            print("📊 **Research Findings:**")
            print("-" * 30)
            print(result['findings'])
            
            # Metadata
            print("\n📈 **Research Metadata:**")
            print(f"   📚 Sources Found: {len(result['sources'])}")
            print(f"   💰 Cost: ${result['cost']:.4f}")
            print(f"   🔢 Tokens Used: {result['tokens_used']:,}")
            print(f"   ⏱️ Processing Time: {result['processing_time']:.2f} seconds")
            
            # Sources
            print("\n📚 **Sources:**")
            for i, source in enumerate(result['sources'], 1):
                print(f"   {i}. {source[:100]}{'...' if len(source) > 100 else ''}")
            
            # Quality analysis
            if 'analysis' in result:
                analysis = result['analysis']
                print("\n🧠 **Quality Analysis:**")
                print(f"   📝 Word Count: {analysis['word_count']}")
                print(f"   🎯 Completeness Score: {analysis['completeness_score']:.2f}/1.0")
                print(f"   📖 Has Citations: {analysis['has_citations']}")
                print(f"   🔍 Confidence Indicators: {len(analysis['confidence_indicators'])}")
                
                # Quality rating
                score = analysis['completeness_score']
                if score >= 0.8:
                    rating = "🌟 Excellent"
                elif score >= 0.6:
                    rating = "👍 Good"
                elif score >= 0.4:
                    rating = "⚠️ Fair"
                else:
                    rating = "❌ Poor"
                print(f"   🏆 Quality Rating: {rating}")
            
            print("\n🎉 Demo completed successfully!")
            
        else:
            print("❌ Research failed:")
            print(result['findings'])
            
    except Exception as e:
        print(f"❌ Error during research: {str(e)}")
        print("💡 Check your API key and internet connection")

# Run the demo
print("🎯 Basic Research Example")
print("This example demonstrates the core research functionality")
print("with a sample query about renewable energy technology.")
print("\n" + "="*60)

# Only run if system is ready
if 'system_ready' in globals():
    run_basic_research_demo()
else:
    print("⚠️  Please run the system status check cell first.")
    print("💡 Make sure all previous cells have been executed successfully.")
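
The rating ladder in the demo maps a completeness score to a label with fixed cut-offs at 0.8, 0.6, and 0.4. Pulled out into a function, it could be reused by other cells. This is a hypothetical refactor: `quality_rating` is not defined anywhere in the notebook.

```python
def quality_rating(score: float) -> str:
    """Map a completeness score in [0, 1] to the demo's four labels."""
    if score >= 0.8:
        return "🌟 Excellent"
    if score >= 0.6:
        return "👍 Good"
    if score >= 0.4:
        return "⚠️ Fair"
    return "❌ Poor"


for value in (0.9, 0.7, 0.5, 0.1):
    print(value, quality_rating(value))
```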

In [None]:
# 💬 Interactive Research Interface

def interactive_research():
    """Interactive research session"""
    if not system_ready:
        print("⚠️  System not ready. Please complete the setup first.")
        return
    
    print("🎯 Interactive Research Session Started!")
    print("=" * 50)
    print("💡 Tips:")
    print("   - Ask specific questions for better results")
    print("   - Use keywords like 'latest', 'trends', 'comparison'")
    print("   - Type 'quit' to exit")
    print("=" * 50)
    
    while True:
        try:
            # Get user input
            user_query = input("\n🔍 Enter your research question: ").strip()
            
            if user_query.lower() in ['quit', 'exit', 'q']:
                print("👋 Research session ended. Thank you!")
                break
            
            if not user_query:
                print("⚠️  Please enter a valid question.")
                continue
            
            print(f"\n⏳ Researching: {user_query}")
            print("Please wait...")
            
            # Perform research
            result = research_assistant.research(user_query, include_analysis=True)
            
            # Display results
            if not result.get('error'):
                print("\n" + "="*80)
                print("📊 RESEARCH FINDINGS:")
                print("="*80)
                print(result['findings'])
                
                print("\n📈 SUMMARY:")
                print(f"   💰 Cost: ${result['cost']:.4f}")
                print(f"   📚 Sources: {len(result['sources'])}")
                print(f"   ⏱️  Time: {result['processing_time']:.2f}s")
                
                if 'analysis' in result:
                    score = result['analysis']['completeness_score']
                    print(f"   🎯 Quality Score: {score:.2f}/1.0")
                
                print("="*80)
            else:
                print(f"❌ Research failed: {result['findings']}")
                
        except KeyboardInterrupt:
            print("\n👋 Research session interrupted.")
            break
        except Exception as e:
            print(f"❌ Error: {str(e)}")

# Note: Uncomment the line below to start an interactive session
# interactive_research()

print("💡 Interactive Research Interface Ready!")
print("📝 Uncomment the last line in this cell to start an interactive session.")
print("🎯 Or use the research_assistant.research() method directly with your queries.")

In [None]:
# 📄 Document Processing Examples

def document_processing_demo():
    """Demonstrate document processing capabilities"""
    if not system_ready:
        print("⚠️  System not ready. Please complete the setup first.")
        return
    
    print("📄 Document Processing Demo")
    print("=" * 40)
    
    # Example 1: Process text documents
    print("📝 Example 1: Processing Sample Documents")
    
    sample_docs = [
        """
        Artificial Intelligence (AI) has revolutionized various industries in recent years. 
        Machine learning algorithms are being used in healthcare to diagnose diseases, 
        in finance for fraud detection, and in transportation for autonomous vehicles. 
        The key benefits include improved accuracy, reduced costs, and enhanced efficiency.
        """,
        """
        Climate change is one of the most pressing challenges of our time. 
        Rising global temperatures are causing sea levels to rise, weather patterns to change, 
        and ecosystems to be disrupted. Renewable energy sources like solar and wind power 
        are crucial for reducing carbon emissions and combating climate change.
        """,
        """
        Quantum computing represents a paradigm shift in computational power. 
        Unlike classical computers that use bits, quantum computers use quantum bits (qubits) 
        which can exist in multiple states simultaneously. This enables them to solve 
        certain problems exponentially faster than classical computers.
        """
    ]
    
    metadata = [
        {"source": "AI Research Paper", "topic": "artificial_intelligence"},
        {"source": "Climate Report", "topic": "climate_change"},
        {"source": "Quantum Computing Review", "topic": "quantum_computing"}
    ]
    
    # Process documents
    print("⏳ Processing sample documents...")
    doc_processor.process_documents(sample_docs, metadata)
    
    # Get document stats
    stats = doc_processor.get_document_stats()
    print(f"✅ Processed {stats['total_documents']} documents into {stats['total_chunks']} chunks")
    
    # Example 2: Query documents
    print("\n🔍 Example 2: Querying Documents")
    
    sample_queries = [
        "What are the benefits of AI?",
        "How does climate change affect the environment?",
        "What makes quantum computing different from classical computing?"
    ]
    
    for i, query in enumerate(sample_queries, 1):
        print(f"\n📝 Query {i}: {query}")
        result = doc_processor.query_documents(query, k=2)
        print("📊 Results:")
        print((result[:300] + "...") if len(result) > 300 else result)
    
    print("\n✅ Document processing demo completed!")

# Example 3: Add a document from a URL (prints usage instructions only; no network request is made)
def add_web_document_example():
    """Example of adding a document from a URL"""
    if not system_ready:
        return
    
    print("🌐 Adding Document from URL Example")
    print("=" * 40)
    
    # Example URLs (listed for reference; nothing is fetched here)
    example_urls = [
        "https://en.wikipedia.org/wiki/Artificial_intelligence",
        "https://en.wikipedia.org/wiki/Climate_change",
        "https://en.wikipedia.org/wiki/Quantum_computing"
    ]
    
    print("📝 Example URLs that can be processed:")
    for i, url in enumerate(example_urls, 1):
        print(f"   {i}. {url}")
    
    print("\n💡 To add a document from URL, use:")
    print("   doc_processor.add_document_from_url('your_url_here')")
    print("   result = doc_processor.query_documents('your_query_here')")

# Run the document processing demo
if system_ready:
    document_processing_demo()
    print("\n" + "="*60)
    add_web_document_example()
else:
    print("⚠️  Please complete the system setup first.")

In [None]:
# 📊 Research Analytics and Visualization

def visualize_research_trends():
    """Create visualizations of research trends and statistics"""
    if not research_memory.research_history:
        print("⚠️  No research history available. Please run some research queries first.")
        return
    
    print("📈 Research Analytics Dashboard")
    print("=" * 50)
    
    # Prepare data
    history = research_memory.research_history
    
    # Create a DataFrame for analysis
    data = []
    for entry in history:
        data.append({
            'timestamp': entry['timestamp'],
            'query_length': len(entry['query']),
            'cost': entry['cost'],
            'tokens': entry['tokens'],
            'hour': entry['timestamp'].hour
        })
    
    if not data:
        print("⚠️  No data to visualize yet.")
        return
    
    df = pd.DataFrame(data)
    
    # Create visualizations
    fig, axes = plt.subplots(2, 2, figsize=(15, 10))
    fig.suptitle('Research Assistant Analytics Dashboard', fontsize=16)
    
    # 1. Research costs over time
    axes[0, 0].plot(df.index + 1, df['cost'], marker='o', linewidth=2)  # 1-based query numbers
    axes[0, 0].set_title('Research Costs per Query')
    axes[0, 0].set_xlabel('Query Number')
    axes[0, 0].set_ylabel('Cost ($)')
    axes[0, 0].grid(True, alpha=0.3)
    
    # 2. Token usage distribution
    axes[0, 1].hist(df['tokens'], bins=10, alpha=0.7, color='skyblue')
    axes[0, 1].set_title('Token Usage Distribution')
    axes[0, 1].set_xlabel('Tokens Used')
    axes[0, 1].set_ylabel('Frequency')
    axes[0, 1].grid(True, alpha=0.3)
    
    # 3. Query length vs cost
    axes[1, 0].scatter(df['query_length'], df['cost'], alpha=0.7, color='green')
    axes[1, 0].set_title('Query Length vs Cost')
    axes[1, 0].set_xlabel('Query Length (characters)')
    axes[1, 0].set_ylabel('Cost ($)')
    axes[1, 0].grid(True, alpha=0.3)
    
    # 4. Research activity by hour
    if len(df) > 1:
        hourly_counts = df['hour'].value_counts().sort_index()
        axes[1, 1].bar(hourly_counts.index, hourly_counts.values, alpha=0.7, color='orange')
        axes[1, 1].set_title('Research Activity by Hour')
        axes[1, 1].set_xlabel('Hour of Day')
        axes[1, 1].set_ylabel('Number of Queries')
        axes[1, 1].grid(True, alpha=0.3)
    else:
        axes[1, 1].text(0.5, 0.5, 'Need more data\nfor hourly analysis', 
                       ha='center', va='center', fontsize=12)
        axes[1, 1].set_title('Research Activity by Hour')
    
    plt.tight_layout()
    plt.show()
    
    # Print summary statistics
    print("\n📊 Research Session Summary:")
    print("=" * 30)
    stats = research_memory.get_session_summary()
    print(f"Total Queries: {stats['queries_conducted']}")
    print(f"Total Cost: ${stats['total_cost']:.4f}")
    print(f"Total Tokens: {stats['total_tokens']:,}")
    print(f"Average Cost per Query: ${stats['avg_cost_per_query']:.4f}")
    print(f"Session Duration: {stats['session_duration']}")
    
    return df

def create_research_report():
    """Generate a comprehensive research report"""
    if not research_memory.research_history:
        print("⚠️  No research history available.")
        return
    
    print("📄 Generating Research Report...")
    print("=" * 50)
    
    # Create report data
    report_data = []
    for i, entry in enumerate(research_memory.research_history, 1):
        report_data.append({
            'Query #': i,
            'Timestamp': entry['timestamp'].strftime('%Y-%m-%d %H:%M:%S'),
            'Query': entry['query'][:100] + "..." if len(entry['query']) > 100 else entry['query'],
            'Cost ($)': f"{entry['cost']:.4f}",
            'Tokens': entry['tokens'],
            'Success': 'Yes' if not entry['result'].get('error') else 'No'
        })
    
    # Create DataFrame and display
    report_df = pd.DataFrame(report_data)
    
    print("📊 Research History Report:")
    print(report_df.to_string(index=False))
    
    # Export options
    print("\n💾 Export Options:")
    print("   • CSV: report_df.to_csv('research_report.csv', index=False)")
    print("   • Excel: report_df.to_excel('research_report.xlsx', index=False)")
    print("   • JSON: report_df.to_json('research_report.json', orient='records')")
    
    return report_df

# Run analytics if we have data
if research_memory.research_history:
    analytics_df = visualize_research_trends()
    report_df = create_research_report()
else:
    print("📊 Analytics Dashboard Ready!")
    print("💡 Run some research queries first to see analytics and visualizations.")
    print("🎯 Use: research_assistant.research('your question here')")

In [None]:
# 🔧 Utility Functions and Advanced Features

def update_research_preferences():
    """Update research preferences interactively"""
    if not system_ready:
        print("⚠️  System not ready.")
        return
    
    print("⚙️  Research Preferences Configuration")
    print("=" * 40)
    
    current_prefs = research_memory.preferences
    print("📊 Current Preferences:")
    for key, value in current_prefs.items():
        print(f"   {key}: {value}")
    
    print("\n🎯 Available Options:")
    print("   1. summary_style: 'brief', 'comprehensive', 'technical'")
    print("   2. max_results: 1-10")
    print("   3. include_analysis: True/False")
    
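    try:
        choice = input("\nEnter preference name to update (or 'skip' to continue): ").strip()
        
        if choice.lower() == 'skip':
            print("✅ Preferences unchanged.")
            return
        
        if choice not in current_prefs:
            print("❌ Invalid preference name.")
            return
        
        new_value = input(f"Enter new value for {choice}: ").strip()
        
        # Convert to the appropriate type and enforce the options listed above
        if choice == 'max_results':
            new_value = int(new_value)  # non-numeric input raises and is reported below
            if not 1 <= new_value <= 10:
                print("❌ max_results must be between 1 and 10.")
                return
        elif choice == 'include_analysis':
            if new_value.lower() not in ('true', 'false'):
                print("❌ include_analysis must be True or False.")
                return
            new_value = new_value.lower() == 'true'
        elif choice == 'summary_style':
            if new_value not in ('brief', 'comprehensive', 'technical'):
                print("❌ summary_style must be 'brief', 'comprehensive', or 'technical'.")
                return
        
        research_memory.update_preferences(choice, new_value)
            
    except Exception as e:
        print(f"❌ Error updating preferences: {e}")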

def export_research_data():
    """Export research data in various formats"""
    if not research_memory.research_history:
        print("⚠️  No research data to export.")
        return
    
    print("💾 Data Export Options")
    print("=" * 30)
    
    # Prepare export data
    export_data = []
    for entry in research_memory.research_history:
        export_data.append({
            'timestamp': entry['timestamp'].isoformat(),
            'query': entry['query'],
            'findings': entry['result'].get('findings', ''),
            'sources': entry['result'].get('sources', []),
            'cost': entry['cost'],
            'tokens': entry['tokens']
        })
    
    # JSON export
    print("📄 Exporting to JSON...")
    with open('research_data.json', 'w', encoding='utf-8') as f:
        # default=str keeps the export from failing on values json cannot encode natively
        json.dump(export_data, f, indent=2, ensure_ascii=False, default=str)
    print("✅ Exported to: research_data.json")
    
    # CSV export
    print("📊 Exporting to CSV...")
    df = pd.DataFrame(export_data)
    df['sources'] = df['sources'].astype(str)  # Convert list to string for CSV
    df.to_csv('research_data.csv', index=False)
    print("✅ Exported to: research_data.csv")
    
    # Excel export
    print("📈 Exporting to Excel...")
    with pd.ExcelWriter('research_data.xlsx', engine='openpyxl') as writer:
        df.to_excel(writer, sheet_name='Research Data', index=False)
        
        # Add a summary sheet
        summary_df = pd.DataFrame([research_memory.get_session_summary()])
        summary_df.to_excel(writer, sheet_name='Summary', index=False)
    
    print("✅ Exported to: research_data.xlsx")
    print("\n📁 Files created in the current directory:")
    print("   • research_data.json")
    print("   • research_data.csv")
    print("   • research_data.xlsx")

import time
from typing import List

def batch_research(queries: List[str], delay: float = 2.0):
    """Research each query in order, pausing `delay` seconds between requests"""
    if not system_ready:
        print("⚠️  System not ready.")
        return
    
    print(f"🔄 Batch Research: {len(queries)} queries")
    print("=" * 40)
    
    results = []
    total_cost = 0.0
    
    for i, query in enumerate(queries, 1):
        print(f"\n📝 Processing Query {i}/{len(queries)}: {query}")
        
        result = research_assistant.research(query, include_analysis=True)
        results.append(result)
        
        if not result.get('error'):
            total_cost += result['cost']
            print(f"✅ Completed - Cost: ${result['cost']:.4f}")
        else:
            print(f"❌ Failed: {result['findings']}")
        
        # Pause between requests (skipped after the final query)
        if i < len(queries):
            print(f"⏳ Waiting {delay} seconds...")
            time.sleep(delay)
    
    print("\n🎉 Batch Research Complete!")
    print(f"💰 Total Cost: ${total_cost:.4f}")
    print(f"✅ Successful: {len([r for r in results if not r.get('error')])}")
    print(f"❌ Failed: {len([r for r in results if r.get('error')])}")
    
    return results

def clear_research_history():
    """Clear research history and reset statistics"""
    if not research_memory:
        print("⚠️  Memory system not available.")
        return
    
    confirm = input("⚠️  Are you sure you want to clear all research history? (yes/no): ").strip().lower()
    
    if confirm == 'yes':
        research_memory.research_history.clear()
        research_memory.session_stats = {
            'queries_conducted': 0,
            'total_tokens': 0,
            'total_cost': 0.0,
            'session_start': datetime.now()
        }
        print("✅ Research history cleared.")
    else:
        print("❌ Operation cancelled.")

# Example usage
print("🔧 Utility Functions Available:")
print("   • update_research_preferences() - Configure research settings")
print("   • export_research_data() - Export research data to files")
print("   • batch_research(queries) - Research multiple topics at once")
print("   • clear_research_history() - Clear all research history")
print("   • research_memory.get_session_summary() - Get current session stats")

# Quick preference update
print("\n⚙️  Quick Preference Update:")
print("   research_memory.update_preferences('summary_style', 'brief')")
print("   research_memory.update_preferences('max_results', 7)")
print("   research_memory.update_preferences('include_analysis', True)")

## 🎉 Conclusion and Next Steps

### 🎯 **What You've Built**

Congratulations! You now have a fully functional AI Research Assistant with comprehensive capabilities:

#### ✅ **Core Features**
- 🌐 **Multi-source Research**: Web search, arXiv, and Google Scholar integration
- 🧠 **Intelligent Memory**: Contextual memory for research history and preferences
- 📄 **Document Processing**: Vector storage and semantic search capabilities
- 💰 **Cost Tracking**: Monitor API usage and costs in real-time
- 🎯 **Quality Analysis**: Automated assessment of research quality and completeness

#### 🔧 **Advanced Features**
- 💬 **Interactive Interface**: Command-line style research sessions
- 🔄 **Batch Processing**: Research multiple topics in one sequential run, with a configurable delay between requests
- 📊 **Data Export**: Export research data in JSON, CSV, and Excel formats
- 📈 **Analytics Dashboard**: Visualize research trends and statistics
- ⚙️ **Preference Management**: Customize research behavior and settings
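
Because a batch run accumulates API cost query by query, it can help to stop once a spending cap is reached. The following is a minimal sketch under stated assumptions: `research_fn` stands in for `research_assistant.research`, and `batch_with_budget`, `max_cost`, and `fake_research` are hypothetical names that are not part of the notebook.

```python
from typing import Any, Callable, Dict, List


def batch_with_budget(
    queries: List[str],
    research_fn: Callable[[str], Dict[str, Any]],
    max_cost: float,
) -> List[Dict[str, Any]]:
    """Run queries in order; skip the remaining ones once the cap is reached."""
    results: List[Dict[str, Any]] = []
    spent = 0.0
    for query in queries:
        if spent >= max_cost:
            break  # budget exhausted; remaining queries are not started
        result = research_fn(query)
        results.append(result)
        # Only successful results are counted, mirroring batch_research().
        if not result.get("error"):
            spent += float(result.get("cost", 0.0))
    return results


def fake_research(query: str) -> Dict[str, Any]:
    """Hypothetical stand-in that charges a flat 0.02 per query (no API call)."""
    return {"findings": f"notes on {query}", "cost": 0.02, "sources": []}


done = batch_with_budget(["a", "b", "c", "d"], fake_research, max_cost=0.03)
print(f"Ran {len(done)} of 4 queries")
```

The cost of a query is only known after it returns, so the cap can be overshot by at most one query's cost.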

### 🚀 **Getting Started Quickly**

```python
# 1. Basic research
result = research_assistant.research("What is quantum computing?")
print(result['findings'])

# 2. Batch research
queries = ["AI in healthcare", "Climate change solutions", "Renewable energy"]
results = batch_research(queries)

# 3. Document processing
doc_processor.add_document_from_url("https://example.com/article")
doc_result = doc_processor.query_documents("key concepts")

# 4. Export your research
export_research_data()
```
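
The interactive cell in this notebook reads `findings`, `cost`, `sources`, and an optional `error` flag from the dictionary that `research()` returns. A minimal sketch of a defensive summary helper, assuming that shape (the name `summarize_result` is hypothetical and not part of the notebook):

```python
from typing import Any, Dict


def summarize_result(result: Dict[str, Any]) -> str:
    """Build a one-line summary from a research result dictionary.

    Assumes the keys used elsewhere in this notebook ('error', 'findings',
    'cost', 'sources'); missing keys fall back to neutral defaults.
    """
    if result.get("error"):
        # On failure the notebook prints the message stored under 'findings'.
        return f"FAILED: {result.get('findings', 'unknown error')}"

    cost = float(result.get("cost", 0.0))
    sources = result.get("sources", [])
    return f"OK: {len(sources)} sources, cost ${cost:.4f}"


# Stand-in dictionaries, so the sketch runs without an API key.
print(summarize_result({"findings": "...", "cost": 0.0123, "sources": ["a", "b"]}))
print(summarize_result({"error": True, "findings": "rate limit reached"}))
```

Using `.get()` with defaults avoids a `KeyError` if a failed call returns a partial dictionary.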

### 🌟 **Community and Contributions**

We welcome contributions from the community! Here's how you can help:

#### 🤝 **How to Contribute**
1. 🍴 **Fork** the repository
2. 🌿 **Create** a feature branch (`git checkout -b feature/amazing-feature`)
3. 💾 **Commit** your changes (`git commit -m 'Add amazing feature'`)
4. 📤 **Push** to the branch (`git push origin feature/amazing-feature`)
5. 🔃 **Open** a Pull Request

#### 🎯 **Areas for Contribution**
- 🔍 **New Search Sources**: Add more academic databases
- 📊 **Enhanced Analytics**: Improve visualization and reporting
- 🌐 **Web Interface**: Enhance the Streamlit application
- 🧠 **AI Models**: Integration with other LLM providers
- 📚 **Documentation**: Improve guides and tutorials
- 🐛 **Bug Fixes**: Help us identify and fix issues

### 🚀 **Next Steps and Enhancements**

1. **🔗 Additional Integrations**
   - Google Scholar API for better academic search
   - Semantic Scholar for computer science papers
   - PubMed for medical research
   - Patent databases for innovation research

2. **📊 Enhanced Analytics**
   - Sentiment analysis of research findings
   - Topic modeling and clustering
   - Research trend predictions
   - Collaboration network analysis

3. **🌐 Web Interface**
   - Deploy the Streamlit app: `streamlit run research_app.py`
   - Add user authentication and project sharing
   - Real-time collaboration features

4. **🤖 AI Improvements**
   - Fine-tuning for specific research domains
   - Multi-language support
   - Voice-to-text research queries
   - Automated report generation

### 📚 **Additional Resources**

- 📖 **Documentation**: [README.md](README.md)
- 🌐 **Web App**: Run `streamlit run research_app.py`
- 🐙 **GitHub Setup**: [GITHUB_SETUP.md](GITHUB_SETUP.md)
- 📄 **License**: [MIT License](LICENSE)
- 🔗 **OpenAI API**: [Get your API key](https://platform.openai.com/api-keys)

### 🛠️ **Troubleshooting**

| Issue | Solution |
|-------|----------|
| 🔑 **API Key Issues** | Ensure OpenAI API key is correctly set in environment variables |
| 🌐 **Network Problems** | Check internet connection for web searches |
| 💾 **Memory Issues** | Clear research history: `clear_research_history()` |
| 💰 **Cost Concerns** | Monitor usage with analytics dashboard |
| 📦 **Import Errors** | Reinstall dependencies: `pip install -r requirements.txt` |
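
The first and last rows of the table can be checked programmatically before running any research cell. This is a minimal sketch, assuming the key is stored in an environment variable named `OPENAI_API_KEY` (the conventional name; the notebook's configuration cell may use another). The helper `check_environment` is hypothetical.

```python
import importlib.util
import os
from typing import Dict, Iterable, Mapping


def check_environment(
    env: Mapping[str, str],
    packages: Iterable[str],
    key_name: str = "OPENAI_API_KEY",  # assumed variable name
) -> Dict[str, bool]:
    """Report whether the API key variable is set and each package is importable."""
    report = {key_name: bool(env.get(key_name, "").strip())}
    for name in packages:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


# 'json' ships with Python; the second name is deliberately bogus.
status = check_environment(os.environ, ["json", "surely_not_installed_pkg"])
for item, ok in status.items():
    print(f"{'OK     ' if ok else 'MISSING'} {item}")
```

Pass the top-level module names your setup actually imports in place of the two placeholder names.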

### 🔒 **Security Best Practices**

- 🔐 **Never commit** `.env` files or API keys to version control
- 🌍 **Use environment variables** for all sensitive configuration
- 📊 **Monitor API usage** regularly to prevent unexpected costs
- 🔄 **Keep dependencies updated** for security patches
- 🛡️ **Use strong authentication** for deployed applications
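
One way to follow the first two points in practice is to read the key from the environment and never print it in full. A minimal sketch, assuming the variable is named `OPENAI_API_KEY`; the helper `mask_secret` is hypothetical and not part of the notebook.

```python
import os


def mask_secret(value: str, visible: int = 4) -> str:
    """Return a log-safe form of a secret that shows only its last characters."""
    if not value:
        return "<not set>"
    if len(value) <= visible:
        # Too short to reveal anything safely; mask every character.
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


# Assumption: the key lives in OPENAI_API_KEY; adjust to match your setup.
api_key = os.environ.get("OPENAI_API_KEY", "")
print(f"API key: {mask_secret(api_key)}")
```

Printing the masked form confirms that a key is loaded without leaving the secret in saved notebook output.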

### 📞 **Support and Community**

- 🐛 **Bug Reports**: [Create an issue](https://github.com/yourusername/ai-research-assistant/issues)
- 💡 **Feature Requests**: [Start a discussion](https://github.com/yourusername/ai-research-assistant/discussions)
- ❓ **Questions**: [Check the FAQ](https://github.com/yourusername/ai-research-assistant/wiki/FAQ)
- 💬 **Chat**: [Join our Discord](https://discord.gg/your-server)

### 🏆 **Acknowledgments**

- 🦜 **LangChain**: For the amazing framework
- 🤖 **OpenAI**: For the powerful language models
- 🌟 **Open Source Community**: For all the amazing libraries
- 👥 **Contributors**: Everyone who helps improve this project

---

<div align="center">

**🔬 Happy Researching! 📊🚀**

[![GitHub stars](https://img.shields.io/github/stars/yourusername/ai-research-assistant?style=social)](https://github.com/yourusername/ai-research-assistant)
[![Twitter Follow](https://img.shields.io/twitter/follow/yourusername?style=social)](https://twitter.com/yourusername)

*Made with ❤️ by the AI Research Assistant community*

</div>

## 🐙 GitHub Repository Information

### 📋 **Repository Structure**

```
ai-research-assistant/
├── 📓 custom_research_assistant.ipynb  # Main notebook (this file)
├── 🌐 research_app.py                  # Streamlit web interface
├── 📦 requirements.txt                 # Python dependencies
├── 📖 README.md                        # Project documentation
├── 🔧 .env.example                     # Environment variables template
├── 🚫 .gitignore                       # Git ignore file
├── 📄 LICENSE                          # MIT license
├── 🐙 GITHUB_SETUP.md                  # GitHub setup instructions
└── 📤 upload_to_github.ps1             # Upload script for Windows
```

### 🔄 **Version Information**

- **Version**: 1.0.0
- **Last Updated**: July 2025
- **Python**: 3.8+
- **LangChain**: 0.1.0+
- **OpenAI**: GPT-3.5-turbo

### 🏷️ **Repository Tags**

`ai` `research` `langchain` `openai` `jupyter` `data-science` `machine-learning` `nlp` `automation` `productivity`

### 📈 **Project Stats**

| Metric | Value |
|--------|-------|
| 📊 **Lines of Code** | 2000+ |
| 🔧 **Functions** | 50+ |
| 📝 **Documentation** | Comprehensive |
| 🧪 **Examples** | 10+ |
| 🌟 **Features** | 15+ |

### 🚀 **Quick Deploy**

#### Run Locally
```bash
git clone https://github.com/yourusername/ai-research-assistant.git
cd ai-research-assistant
pip install -r requirements.txt
streamlit run research_app.py
```

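#### Deploy to Heroku

Heroku needs a `Procfile` that tells it how to start the Streamlit process. The repository structure above does not list one, so add it before pushing.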
```bash
git clone https://github.com/yourusername/ai-research-assistant.git
cd ai-research-assistant
heroku create your-app-name
git push heroku main
```

#### Open in Google Colab
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yourusername/ai-research-assistant/blob/main/custom_research_assistant.ipynb)

---

## 📜 License and Citation

### 📄 **License**

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

```
MIT License

Copyright (c) 2025 AI Research Assistant Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```

### 📚 **Citation**

If you use this work in your research or projects, please cite it as:

```bibtex
@software{ai_research_assistant,
  title = {AI Research Assistant: A Comprehensive Research Tool Built with LangChain},
  author = {AI Research Assistant Contributors},
  year = {2025},
  url = {https://github.com/yourusername/ai-research-assistant},
  version = {1.0.0}
}
```

### 🙏 **Acknowledgments**

This project builds upon the excellent work of:

- **LangChain**: Harrison Chase and the LangChain team for the amazing framework
- **OpenAI**: For providing powerful language models and APIs
- **Streamlit**: For the beautiful web application framework
- **Plotly**: For interactive visualizations
- **ChromaDB**: For vector storage capabilities
- **arXiv**: For academic paper access
- **DuckDuckGo**: For web search capabilities

### 🔗 **Related Projects**

- [LangChain](https://github.com/langchain-ai/langchain) - Framework for developing applications with LLMs
- [Streamlit](https://github.com/streamlit/streamlit) - Web app framework for ML and data science
- [ChromaDB](https://github.com/chroma-core/chroma) - AI-native open-source embedding database
- [arXiv API](https://arxiv.org/help/api) - Interface for accessing arXiv papers

---

<div align="center">

**🌟 Thank you for using AI Research Assistant! 🌟**

*If you find this project helpful, please consider giving it a star on GitHub!*

[![GitHub stars](https://img.shields.io/github/stars/yourusername/ai-research-assistant?style=social)](https://github.com/yourusername/ai-research-assistant/stargazers)

</div>