AMythicDev/pathway-dataquest

Pathway Dataquest: Intelligent Document Analysis Platform

1. Overview

Pathway Dataquest is a full-stack web application designed for intelligent document management and analysis. It allows users to connect various data sources (local files, Google Drive, S3), interact with documents through a chat interface, and automatically track and analyze changes in document content. The backend is powered by a Retrieval-Augmented Generation (RAG) pipeline using the Pathway framework, enabling natural language queries on your document library.
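To make the retrieval step of a RAG pipeline concrete, here is a minimal, standard-library-only sketch. It is not the project's Pathway pipeline: the bag-of-words scoring, the function names (`retrieve`, `build_prompt`), and the sample documents are all assumptions made for illustration. A real pipeline would typically use embeddings and pass the prompt to a language model.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: dict[str, str], top_k: int = 1) -> list[str]:
    """Return the names of the top_k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    scores = {
        name: cosine_similarity(query_vec, Counter(tokenize(text)))
        for name, text in documents.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, documents: dict[str, str], top_k: int = 1) -> str:
    """Assemble a prompt that contains only the retrieved documents as context."""
    names = retrieve(query, documents, top_k)
    context = "\n\n".join(f"[{name}]\n{documents[name]}" for name in names)
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {query}"


# Hypothetical sample library, used only for this illustration.
documents = {
    "leave_policy.txt": "Employees receive twenty days of paid leave per year.",
    "safety.txt": "Helmets are mandatory on the factory floor at all times.",
}
print(build_prompt("How many days of paid leave do employees get?", documents))
```

The generation step (sending the assembled prompt to a model) is left out on purpose, since it depends on the model provider and configuration.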

2. Key Features

  • Conversational Q&A: Chat with your documents to find information, get summaries, and ask complex questions.
  • Multi-Source Connectivity: Connect and sync documents from your local filesystem, Google Drive, and AWS S3 buckets.
  • Automated Change Detection: Automatically detects, tracks, and displays changes between document versions, highlighting additions and modifications.
  • Risk & Policy Analysis: A specialized AI agent identifies high-impact clauses in legal and policy documents related to legal obligations, safety protocols, and financial penalties.
  • Interactive Dashboard: A modern, responsive UI to view recent files, analyze document activity, and explore changes.
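As an illustration of the change-detection idea listed above, the following sketch compares two versions of a document line by line with Python's standard `difflib` module. It is an assumption about how such a comparison could work, not the project's actual implementation; the function name `summarize_changes` and the sample clauses are hypothetical.

```python
import difflib


def summarize_changes(old_text: str, new_text: str) -> dict[str, list]:
    """Classify line-level differences between two document versions.

    Returns a dict with three lists:
      added    - lines present only in the new version
      removed  - lines present only in the old version
      modified - (old_block, new_block) pairs for replaced line ranges
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)

    changes: dict[str, list] = {"added": [], "removed": [], "modified": []}
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            changes["added"].extend(new_lines[j1:j2])
        elif tag == "delete":
            changes["removed"].extend(old_lines[i1:i2])
        elif tag == "replace":
            # A replaced range may differ in length, so keep both blocks whole.
            changes["modified"].append(
                ("\n".join(old_lines[i1:i2]), "\n".join(new_lines[j1:j2]))
            )
    return changes


# Hypothetical document versions, used only for this illustration.
old_version = "Clause 1: Payment due in 30 days.\nClause 2: Helmets required."
new_version = (
    "Clause 1: Payment due in 15 days.\n"
    "Clause 2: Helmets required.\n"
    "Clause 3: Late fee of 5%."
)
print(summarize_changes(old_version, new_version))
```

Line-level comparison is the simplest choice here; a production system might instead compare parsed sections or chunks.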

3. Technical Stack

  • Frontend:

    • Framework: Next.js
    • Language: TypeScript
    • UI Components: shadcn/ui
    • Styling: Tailwind CSS
    • API Communication: Axios
  • Backend:

    • Framework: FastAPI
    • Language: Python
    • Core Engine: Pathway (for RAG and data processing pipelines)
    • Database: MongoDB (for storing conversations and source configurations)
    • Server: Uvicorn
  • DevOps & Tooling:

    • Package Management: npm (frontend), uv (backend)
    • Containerization: Docker (docker-compose.yaml)

4. Getting Started

Prerequisites

  • Node.js and npm
  • Python 3.12+ and uv
  • Docker and Docker Compose
  • A running MongoDB instance

Installation

  1. Clone the repository:

    git clone https://github.com/AMythicDev/pathway-dataquest/
    cd pathway-dataquest
  2. Set Up Environment Variables: Create a .env file in the project root. The backend requires a MongoDB connection URI, a Gemini API key, and a secret key for encrypting data source credentials.

    You can generate a suitable encryption key with this Python one-liner (it requires the cryptography package to be installed):

    python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"

    Your .env file should look like this:

    # .env
    ENCRYPTION_KEY=<your_generated_secret_key>
    GEMINI_API_KEY=<your_gemini_api_key>
    MONGODB_URI=<mongodb_connection_uri>
    
  3. Install Frontend Dependencies:

    npm install
  4. Install Backend Dependencies:

    uv sync
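Before starting the services, it can help to confirm that the .env file from step 2 is complete. The script below is a small sketch using only the standard library; the helper names (`parse_env`, `missing_vars`) and the placeholder heuristics are assumptions, and the backend itself may load its configuration differently.

```python
import os

# Variable names taken from the .env example in step 2.
REQUIRED_VARS = ("ENCRYPTION_KEY", "GEMINI_API_KEY", "MONGODB_URI")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def is_placeholder(value: str) -> bool:
    """Heuristic: treat empty values and template text as not yet filled in."""
    if not value:
        return True
    return (value.startswith("<") and value.endswith(">")) or value.startswith("your_")


def missing_vars(values: dict[str, str]) -> list[str]:
    """Return the required variables that are absent or still placeholders."""
    return [name for name in REQUIRED_VARS if is_placeholder(values.get(name, ""))]


if __name__ == "__main__":
    env_text = ""
    if os.path.exists(".env"):
        with open(".env", encoding="utf-8") as handle:
            env_text = handle.read()
    missing = missing_vars(parse_env(env_text))
    if missing:
        print("Missing or unset variables:", ", ".join(missing))
    else:
        print("All required variables are set.")
```

Run it from the project root so that the relative .env path resolves correctly.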

5. Running the Application

This project can be run using Docker Compose or by starting the frontend and backend services manually.

Running Services Manually

  1. Create required folders:

    mkdir -p model/state model/out
  2. Start the Backend Server: The FastAPI server is defined in model/main.py.

    uvicorn model.main:app --host 0.0.0.0 --port 8000 --reload
  3. Start the Frontend Development Server: In a separate terminal, run the Next.js app.

    npm run dev

The application will be accessible at http://localhost:3000, with the backend API listening on port 8000.
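To check that both services came up, a quick port probe can be sketched as follows. It only tests whether something accepts TCP connections on the ports used in the commands above (8000 for Uvicorn, 3000 for the Next.js dev server); it does not call any application endpoint, and the helper name `is_port_open` is an assumption for illustration.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    services = (("backend (Uvicorn)", 8000), ("frontend (Next.js)", 3000))
    for name, port in services:
        status = "up" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {status}")
```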

6. Project Structure

/
├── app/            # Next.js frontend pages and routing
├── components/     # React components (UI elements, charts, etc.)
├── hooks/          # Custom React hooks (e.g., use-google-drive)
├── lib/            # Frontend utilities and context providers
├── model/          # Python backend (FastAPI, Pathway RAG pipeline)
│   ├── main.py     # FastAPI application entrypoint
│   ├── rag.py      # Core RAG logic
│   └── connectors.py # Logic for connecting to data sources
├── public/         # Static assets (images, icons)
├── pyproject.toml  # Backend Python dependencies (for uv)
├── package.json    # Frontend Node.js dependencies (for npm)
└── docker-compose.yaml # Docker configuration for services
