Skip to content

defog-ai/introspect

Repository files navigation

🔬 Defog Introspect: Deep Research for your internal data

Introspect is a service that does data-focused deep research for structured data. It understands your structured data (databases or CSV/Excel files), unstructured data (PDFs), and can query the web to get additional context.

Demo

Quick Start

  1. Set up environment variables:
# Create a .env file in your root folder
# You need all 3 - not just one
OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
ANTHROPIC_API_KEY="YOUR_ANTHROPIC_API_KEY"
GEMINI_API_KEY="YOUR_GEMINI_API_KEY"
  1. Start all services using Docker Compose:
docker compose up --build
  1. Access the application in your browser:

How it works

We use a simple AI agent with tool use. An LLM attempts to answer a user question with 3 tools – text_to_sql, web_search, and pdf_with_citations.

The model then recursively asks questions using one of these tools until it is satisfied that it has enough context to answer the users question. By default, we use o3-mini for text to SQL, gemini-2.0-flash for web search, and claude-3-7-sonnet for both PDF analysis and orchestration.

image

Development

For development workflows and more detailed instructions, see the README files in the /backend and /frontend directories.

Supported Databases

Defog supports most database connectors including PostgreSQL, MySQL, SQLite, BigQuery, Redshift, Snowflake, and Databricks – and also includes support for CSV and Excel files.

Build/Test/Lint Commands

Backend (Python)

  • Run all tests: docker exec introspect-backend pytest
  • Run single test: docker exec introspect-backend pytest tests/test_file.py::test_function -v
  • Tests use the agents-postgres service for database operations
  • Create admin user: docker exec introspect-backend python create_admin_user.py

Frontend (JavaScript/TypeScript)

  • Development server: cd frontend && npm run dev
  • Build production: cd frontend && npm run build
  • Export static site: cd frontend && npm run export
  • Run frontend tests: cd frontend && npx playwright test
  • Lint (Prettier): cd frontend && npm run lint

Docs

Coming soon

Contributing and Maintainers

This repo is maintained by Defog.ai

To do

  • Create Docs
  • Have an option to replace Google/Gemini based web search tool with Perplexity
  • Let users choose what model they want for which task from the `.env. file
  • Docs and examples for how to add custom tools
  • Docs and exampels for how to integrate with unstructured data sources with search, like Google Drive and OneDrive