Devanik21/Perplexity

🌌 Orion Deep Research Engine


An AI-powered research synthesis platform that transforms information overload into actionable intelligence.

Orion uses Google's Gemini API to conduct autonomous web research, analyze multi-modal documents, and generate structured, citation-backed reports at a depth set by the selected synthesis mode.


🎯 Core Innovation

Traditional search engines return links. Orion returns understanding.

By combining large language models with live web retrieval, Orion doesn't just find sources: it reads them, synthesizes perspectives, identifies gaps, and produces coherent narratives that would take human researchers hours to compile.

Key Differentiators

  • Adaptive Depth Control: Three synthesis modes (Précis, Synopsis, Treatise) dynamically adjust research scope and analytical rigor
  • Multi-Modal Intelligence: Seamlessly processes text, PDFs, images, and structured data in a unified analytical framework
  • Perspective-Aware Analysis: Automatically identifies competing viewpoints and presents balanced, evidence-based comparisons
  • Citation-Integrated Output: Inline references and experimental APA formatting maintain academic rigor
  • Real-Time Web Integration: Live SerpAPI connection ensures access to current information beyond training cutoffs

🚀 Technical Architecture

┌─────────────────────────────────────────────────────────┐
│                    User Interface Layer                 │
│            (Streamlit + Custom CSS/HTML)                │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────────┐
│              Application Logic Layer                    │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐ │
│  │ Web Research │  │   Document   │  │ Conversation │ │
│  │   Pipeline   │  │   Analysis   │  │   Manager    │ │
│  └──────────────┘  └──────────────┘  └──────────────┘ │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────────┐
│                 AI Integration Layer                    │
│  ┌──────────────────────────────────────────────────┐  │
│  │     Google Gemini (Gemma 3 27B Instruct)        │  │
│  │  • Dynamic prompt engineering                    │  │
│  │  • Context-aware generation (up to 8K tokens)   │  │
│  │  • Multi-turn conversation management           │  │
│  └──────────────────────────────────────────────────┘  │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────────┐
│              External Service Layer                     │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐ │
│  │   SerpAPI    │  │  BeautifulSoup│  │   WeasyPrint │ │
│  │ (Web Search) │  │  (Scraping)   │  │ (PDF Export) │ │
│  └──────────────┘  └──────────────┘  └──────────────┘ │
└─────────────────────────────────────────────────────────┘
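The layering in the diagram can be sketched as a single function that receives each external service as a callable. This is a minimal illustration, not Orion's actual code: `run_web_research`, `Source`, and the three injected callables (`search`, `fetch_text`, `generate`) are hypothetical names standing in for the SerpAPI, BeautifulSoup, and Gemini layers.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Source:
    url: str
    text: str


def run_web_research(
    query: str,
    search: Callable[[str], List[str]],   # hypothetical: query -> result URLs
    fetch_text: Callable[[str], str],     # hypothetical: URL -> extracted page text
    generate: Callable[[str], str],       # hypothetical: prompt -> generated report
    max_sources: int = 15,
) -> str:
    """Search, scrape, then synthesize; each layer is injected as a callable."""
    sources: List[Source] = []
    for url in search(query)[:max_sources]:
        try:
            text = fetch_text(url)
        except Exception:
            continue  # skip pages that fail to load or parse
        if text.strip():
            sources.append(Source(url=url, text=text))

    # Number the sources so the model can cite them inline as [1], [2], ...
    context = "\n\n".join(
        f"[{i}] {s.url}\n{s.text}" for i, s in enumerate(sources, start=1)
    )
    prompt = f"Research question: {query}\n\nSources:\n{context}"
    return generate(prompt)
```

Passing the services in as arguments keeps the application logic testable without network access or API keys.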

Intelligent Prompt Engineering

Orion employs mode-specific prompt templates that dynamically incorporate user preferences:

  • Précis Mode: Fast-track synthesis with executive summaries (100-500 words, ~4K tokens)
  • Synopsis Mode: Balanced analytical reports with structured sections (1500-2500 words, ~8K tokens)
  • Treatise Mode: Academic-grade research with abstracts and TOC (2000-4000 words, ~8K tokens)

Each mode integrates optional features (future projections, historical context, expert quotes, data visualization suggestions) through modular prompt injection: every option contributes its own instruction to the base template, so no separate template is needed for each combination of options.
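Modular prompt injection can be sketched as one base template per mode plus one snippet per optional feature. The template wording, the dictionary names, and `build_prompt` are assumptions made for illustration; only the mode names and word ranges come from the list above.

```python
from typing import Iterable

# Hypothetical base templates, one per mode (word ranges from the mode list).
MODE_TEMPLATES = {
    "precis": "Write an executive summary of 100-500 words on: {query}",
    "synopsis": "Write a structured analytical report of 1500-2500 words on: {query}",
    "treatise": (
        "Write an academic report with an abstract and table of contents, "
        "2000-4000 words, on: {query}"
    ),
}

# Hypothetical snippets, one per optional feature.
FEATURE_SNIPPETS = {
    "future_projections": "Include a section on likely future developments.",
    "historical_context": "Include a section on historical context.",
    "expert_quotes": "Quote domain experts where the sources provide them.",
    "data_viz": "Suggest charts or graphs for any quantitative data.",
}


def build_prompt(mode: str, query: str, features: Iterable[str] = ()) -> str:
    """Compose the mode template with one instruction line per enabled feature."""
    if mode not in MODE_TEMPLATES:
        raise ValueError(f"unknown mode: {mode!r}")
    parts = [MODE_TEMPLATES[mode].format(query=query)]
    for name in features:
        parts.append(FEATURE_SNIPPETS[name])  # KeyError on an unknown feature
    return "\n".join(parts)


print(build_prompt("precis", "LLMs in medical diagnosis", ["historical_context"]))
```

With this layout, three templates and four snippets cover all 3 × 2⁴ = 48 mode and feature combinations.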


🛠️ Installation & Setup

Prerequisites

  • Python 3.8+
  • pip (Python package manager)

Quick Start

  1. Clone the repository

     git clone https://github.com/yourusername/orion-research-engine.git
     cd orion-research-engine

  2. Install dependencies

     pip install -r requirements.txt

  3. Configure secrets

     Create .streamlit/secrets.toml:

     GEMINI_API_KEY = "your_gemini_api_key_here"
     SERPAPI_KEY = "your_serpapi_key_here"
     research_app_password = "your_secure_password"

  4. Launch the application

     streamlit run app.py

Access at http://localhost:8501
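A startup check for the three secrets might look like the sketch below. The key names come from the secrets.toml example; the helper `missing_secrets` is hypothetical and works on any mapping, so it can be exercised without Streamlit installed.

```python
from typing import List, Mapping

# Key names as they appear in .streamlit/secrets.toml above.
REQUIRED_SECRETS = ("GEMINI_API_KEY", "SERPAPI_KEY", "research_app_password")


def missing_secrets(secrets: Mapping[str, str]) -> List[str]:
    """Return the required keys that are absent or blank."""
    return [
        key
        for key in REQUIRED_SECRETS
        if key not in secrets or not str(secrets[key]).strip()
    ]


print(missing_secrets({"GEMINI_API_KEY": "abc"}))
```

Inside the app, `st.secrets` could be passed as the mapping, with `st.error(...)` followed by `st.stop()` when the returned list is non-empty. That wiring is an assumption about how app.py might use it, not a description of the existing code.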

Dependencies

streamlit>=1.28.0
google-generativeai>=0.3.0
requests>=2.31.0
beautifulsoup4>=4.12.0
markdown2>=2.4.0
weasyprint>=60.0
pypdf>=3.17.0
Pillow>=10.0.0
wordcloud>=1.9.0
matplotlib>=3.7.0

💡 Usage Examples

Web Research Mode

Query: "Ethical implications of large language models in medical diagnosis"

Output: Comprehensive analysis covering:

  • Current deployment landscape
  • Comparative accuracy studies (AI vs. human practitioners)
  • Regulatory frameworks (FDA, EMA perspectives)
  • Patient privacy considerations
  • Future trajectory predictions with confidence intervals

Document Analysis Mode

Scenario: Upload 3 research papers on quantum computing

Capability:

  • Cross-document synthesis
  • Concept extraction and comparison
  • Interactive Q&A with contextual memory
  • Export conversation transcripts
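Contextual memory and transcript export can be sketched as a bounded list of turns. The class below is an illustration only: `ConversationMemory` and its methods are hypothetical, and the character budget is a rough stand-in for a token limit.

```python
from typing import List, Tuple


class ConversationMemory:
    """Keeps recent turns within a character budget (a rough proxy for tokens)."""

    def __init__(self, max_chars: int = 8000) -> None:
        self.max_chars = max_chars
        self.turns: List[Tuple[str, str]] = []  # (role, text) pairs, oldest first

    def _size(self) -> int:
        return sum(len(text) for _, text in self.turns)

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))
        # Drop the oldest turns until the budget fits, always keeping the newest.
        while len(self.turns) > 1 and self._size() > self.max_chars:
            self.turns.pop(0)

    def as_prompt(self, question: str) -> str:
        """Prefix the new question with the retained history."""
        history = "\n".join(f"{role}: {text}" for role, text in self.turns)
        return f"{history}\nuser: {question}" if history else f"user: {question}"

    def export_transcript(self) -> str:
        """Render the retained turns as plain text."""
        return "\n".join(f"{role}: {text}" for role, text in self.turns)
```

Dropping whole turns from the front keeps the most recent exchange intact, at the cost of forgetting early context in long sessions.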

🧪 Advanced Features

Experimental Modules (Beta)

| Feature | Description | Status |
|---|---|---|
| Perspective Analysis | Multi-viewpoint synthesis with proponent attribution | ✅ Active |
| Future Trajectories | Predictive insights based on trend analysis | ✅ Active |
| Data Viz Suggestions | AI-recommended charts/graphs for quantitative data | 🧪 Beta |
| Expert Quotations | Automated extraction of domain expert insights | 🧪 Beta |
| Historical Context | Temporal evolution tracking of research topics | 🧪 Beta |

Security

  • Password Protection: Three-attempt lockout mechanism
  • Session State Management: Secure handling of user data
  • No Data Persistence: Privacy-first architecture (no external storage)
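A three-attempt lockout can be sketched as a small state object. `LoginGate` and `MAX_ATTEMPTS` are hypothetical names, and the constant-time comparison is an added assumption rather than something the README states.

```python
import hmac
from dataclasses import dataclass

MAX_ATTEMPTS = 3  # from the "three-attempt lockout" described above


@dataclass
class LoginGate:
    expected_password: str
    failed_attempts: int = 0
    authenticated: bool = False

    @property
    def locked(self) -> bool:
        return self.failed_attempts >= MAX_ATTEMPTS

    def try_login(self, candidate: str) -> bool:
        """Return True on success; refuse every attempt once locked."""
        if self.locked:
            return False
        # compare_digest avoids leaking how many characters matched via timing.
        if hmac.compare_digest(candidate.encode(), self.expected_password.encode()):
            self.authenticated = True
            self.failed_attempts = 0
            return True
        self.failed_attempts += 1
        return False
```

If the counter lives in Streamlit session state, it resets whenever a new session starts, so a gate like this deters casual guessing rather than a determined attacker.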

🎨 Customization

Theme System

Toggle between light/dark modes with custom CSS injection. Background images and color schemes are fully configurable via set_app_background().

Citation Styles

  • Inline Numbers: Bracketed numeric citations such as [1], [2]
  • Academic (APA): Experimental best-effort APA formatting
  • None: Clean narrative output
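The three styles can be sketched as one formatting function over (title, URL) pairs. `format_references` and its style keys are hypothetical. The APA branch is best-effort only: it uses the "(n.d.)" placeholder because scraped pages often lack author and date metadata.

```python
from typing import List, Tuple


def format_references(sources: List[Tuple[str, str]], style: str = "inline") -> str:
    """Render a reference list for (title, url) pairs given in citation order."""
    if style == "none":
        return ""
    if style == "inline":
        return "\n".join(
            f"[{i}] {title} - {url}"
            for i, (title, url) in enumerate(sources, start=1)
        )
    if style == "apa":
        # Best-effort: no author or date is available, so use "(n.d.)".
        return "\n".join(f"{title}. (n.d.). {url}" for title, url in sources)
    raise ValueError(f"unknown citation style: {style!r}")


refs = [("Paper A", "http://a"), ("Paper B", "http://b")]
print(format_references(refs))
```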

📊 Performance Benchmarks

| Metric | Précis | Synopsis | Treatise |
|---|---|---|---|
| Avg. Generation Time | 45-60s | 90-120s | 180-240s |
| Source Processing | 5-15 | 10-30 | 20-100 |
| Token Output | ~2K | ~5K | ~8K |
| Recommended Use | Quick overviews | Detailed analysis | Academic research |

Tested on standard queries with 15-100 web sources


🔬 Research Applications

Industry Use Cases

  • Academic Research: Literature reviews, gap analysis, hypothesis generation
  • Corporate Intelligence: Market research, competitive analysis, trend forecasting
  • Policy Analysis: Multi-stakeholder perspective synthesis, impact assessment
  • Technical Due Diligence: Technology evaluation, risk analysis, vendor comparison

Validated Domains

✅ Healthcare & Life Sciences
✅ AI/ML & Computer Science
✅ Finance & Economics
✅ Climate & Sustainability
✅ Regulatory & Legal Frameworks


🤝 Contributing

We welcome contributions that enhance Orion's capabilities:

  1. Feature Requests: Open an issue with [FEATURE] tag
  2. Bug Reports: Detailed reproduction steps appreciated
  3. Pull Requests: Follow existing code style, include tests where applicable

Development Roadmap

  • Multi-language support (beyond English)
  • Graph-based source relationship visualization
  • Automated fact-checking with confidence scores
  • Export to LaTeX/Overleaf
  • Integration with academic databases (PubMed, arXiv, IEEE)

📜 License

This project is licensed under the MIT License; see the LICENSE file for details.


🙏 Acknowledgments

Built with:

  • Google Gemini API - Powering the intelligence layer
  • SerpAPI - Enabling real-time web integration
  • Streamlit - Rapid UI prototyping framework
  • Open Source Community - BeautifulSoup, WeasyPrint, and countless other tools

📧 Contact

Project Lead: [Your Name]
Email: your.email@domain.com
GitHub: @yourusername


Transforming information into insight, one query at a time.

⭐ Star this repo if Orion helped your research!
