DurmazDev/AgenticAI-Hackathon

AgenticAI System

An intelligent survey and document processing system powered by LLM agents with RAG (Retrieval-Augmented Generation) capabilities.

Overview

This system provides an AI-powered platform for processing surveys, analyzing documents, and answering questions using advanced language models. It features:

  • Multi-Agent Architecture: Specialized agents for retrieval, reasoning, and answer composition
  • Document Processing: Support for PDF, Excel, JSON, and web scraping
  • RAG System: Vector-based search with FAISS for accurate information retrieval
  • Survey Processing: Automated survey form filling with intelligent question answering
  • Web Interface: User-friendly UI for chat and survey interactions

System Requirements

  • Python 3.8 or higher
  • pip (Python package installer)
  • Tesseract OCR (for PDF image text extraction)

Installation & Setup

1. Clone the Repository

git clone https://github.com/DurmazDev/AgenticAI-Hackathon.git
cd AgenticAI-Hackathon

2. Create Virtual Environment

python -m venv venv

3. Activate Virtual Environment

macOS/Linux:

source venv/bin/activate

Windows:

venv\Scripts\activate

4. Install Dependencies

pip install -r requirements.txt

5. Configure Environment Variables

Copy the example environment file:

cp .env.example .env

Edit .env and set your API credentials:

OPENAI_API_KEY=your_actual_api_key_here
OPENAI_ENDPOINT=https://api.openai.com/v1

6. Review Configuration

The system configuration is managed in config.yaml. Default settings include:

  • LLM Model: gpt-oss-120b
  • Embedding Model: qwen3-embedding-8b
  • Chunk Size: 700-1200 tokens
  • Max Concurrent Threads: 100

You can modify these settings based on your requirements.
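As a concrete illustration, a config.yaml for the defaults above might look like the excerpt below; the key names here are hypothetical, so check the actual file in the repository:

```yaml
# Hypothetical config.yaml excerpt -- key names are illustrative
llm:
  model: gpt-oss-120b
embedding:
  model: qwen3-embedding-8b
chunking:
  min_size: 700
  max_size: 1200
  overlap: 100
concurrency:
  max_threads: 100
```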

Running the System

Start Both Servers

The system runs two servers simultaneously:

python run_server.py

This will start:

  • API Server: http://localhost:8000 (backend API with hot-reload)
  • UI Server: http://localhost:5500 (static file server for UI)

Access the Application

Once both servers are running, open your browser and navigate to the UI at http://localhost:5500.

API Documentation

With the API server running, you can access the interactive API documentation. FastAPI serves Swagger UI at http://localhost:8000/docs and ReDoc at http://localhost:8000/redoc by default.

Modular Code Structure

AgenticAI uses a layered and modular architecture. Each module has specific responsibilities and works in a loosely coupled manner.

Project Organization

agenticai/
├── src/                          # Main source code
│   ├── agents/                   # Specialized LLM agents
│   │   ├── retrieval_agent.py   # Hybrid retrieval + LLM reranking
│   │   ├── reasoning_agent.py   # Question classification and intent analysis
│   │   ├── answer_composer_agent.py  # Answer composition and formatting
│   │   ├── answer_selector_agent.py  # Multiple choice answer selection
│   │   ├── conflict_resolver_agent.py  # Conflict resolution
│   │   ├── document_ingestion_agent.py  # Document processing and parsing
│   │   ├── search_agent.py      # Web search coordination
│   │   └── survey_agent.py      # Survey processing
│   ├── orchestrator/             # Agent coordination
│   │   └── agent_orchestrator.py  # Multi-agent pipeline management
│   ├── chunking/                 # Document chunking
│   │   └── chunker.py           # Semantic and token-based chunking
│   ├── indexing/                 # Vector indexing
│   │   └── vector_store.py      # FAISS-based vector store
│   ├── tools/                    # LLM tools
│   │   ├── calculator.py        # Mathematical calculations
│   │   └── unit_converter.py   # Unit conversions
│   ├── utils/                    # Utility modules
│   ├── openai_manager.py        # OpenAI client and session management
│   ├── session_manager.py       # Session isolation
│   ├── error_handler.py         # Centralized error handling
│   ├── logging_config.py        # Structured logging
│   └── api_server.py            # FastAPI backend server
├── ui/                          # Frontend interface
│   ├── index.html               # Chat interface
│   ├── app.js                   # Main application logic
│   └── styles.css               # Style definitions
├── data/                        # Document storage
├── logs/                        # Application logs
├── tests/                       # Test files
├── config.yaml                  # System configuration
├── requirements.txt             # Python dependencies
├── run_server.py               # Server startup script
└── .env                        # Environment variables (not in git)

Module Responsibilities

1. Agents Layer

Each agent is responsible for a specific task and interacts with the LLM:

  • Retrieval Agent: Performs semantic search over the vector store with LLM-based reranking. Guarantees at least one chunk from each document source.
  • Reasoning Agent: Classifies questions, performs intent analysis, and decides which tools to use.
  • Answer Composer Agent: Creates natural language answers using reasoning outputs and context.
  • Document Ingestion Agent: Multi-format document parsing (PDF, Excel, CSV, TXT, URL) and batch processing.

2. Orchestrator Layer

  • Agent Orchestrator: Coordinates all agents, enforces session isolation, and manages the question-answer pipeline.

3. Chunking Layer

  • Document Chunker: Token-based and semantic boundary-aware chunking. Preserves paragraph and sentence boundaries, supports overlap.

4. Indexing Layer

  • Vector Store: FAISS-based vector indexing, embedding search, document filtering.

5. Tools Layer

Tools available for the LLM to use:

  • Calculator: Safe Python mathematical calculations
  • Unit Converter: Unit conversions (TL/USD, ton/kg, etc.)
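The calculator's "safe" evaluation presumably avoids raw eval(); a common approach is to walk a restricted AST, as in this sketch (illustrative, not the repository's code):

```python
import ast
import operator

# Whitelisted operators; anything outside this table is rejected.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def safe_calc(expression: str) -> float:
    """Evaluate a pure-arithmetic expression without calling eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("Disallowed expression")
    return _eval(ast.parse(expression, mode="eval"))
```

Function calls, attribute access, and names all fall through to the ValueError branch, so expressions like `__import__('os')` are rejected rather than executed.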

6. Core Services

  • OpenAI Manager: One client per session, strict isolation
  • Session Manager: UUID-based session management
  • Error Handler: Centralized error handling with error codes
  • Logging Config: Structured JSON logging

Document Reading Process and Agent Architecture

Multi-Format Document Processing

The system uses specialized parsers to process documents in different formats:

graph LR
    A[Document Upload] --> B{Format Detection}
    B -->|PDF| C[PDF Parser<br/>OCR Fallback]
    B -->|Excel| D[Excel Parser<br/>Multi-Sheet]
    B -->|CSV| E[CSV Parser<br/>Encoding Detection]
    B -->|TXT| F[TXT Parser]
    B -->|URL| G[URL Scraper<br/>Selenium]
    
    C --> H[Text Extraction]
    D --> H
    E --> H
    F --> H
    G --> H
    
    H --> I[Document Chunker]
    I --> J[Token-based<br/>Chunking]
    I --> K[Semantic<br/>Boundary Detection]
    
    J --> L[Vector<br/>Embedding]
    K --> L
    
    L --> M[FAISS<br/>Index]
    M --> N[Vector Store]

Chunking Strategy

Document Chunker uses an intelligent chunking strategy:

  1. Token-Based Sizing: Chunks ranging from 700-1200 tokens
  2. Semantic Boundary Preservation: Preserves paragraph and sentence boundaries
  3. Overlap Support: 100 token overlap for context continuity
  4. Source Prefix: Each chunk is tagged with source information
# Example Chunk Metadata
{
    "content": "...",
    "doc_id": "unique_doc_id",
    "source_type": "pdf|excel|url",
    "chunk_index": 0,
    "page_or_position": "Page 5",
    "token_count": 850,
    "metadata": {
        "filename": "report.pdf",
        "title": "Sustainability Report"
    }
}
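The token-based sizing and overlap above can be sketched as follows; this illustration splits on whitespace, whereas the real chunker counts model tokens and respects semantic boundaries:

```python
def chunk_tokens(text, max_tokens=1200, overlap=100):
    """Split text into overlapping chunks of at most max_tokens 'tokens'.

    Whitespace splitting stands in for real tokenization here.
    """
    tokens = text.split()
    chunks = []
    step = max_tokens - overlap          # advance by chunk size minus overlap
    for start in range(0, len(tokens), step):
        window = tokens[start:start + max_tokens]
        chunks.append(" ".join(window))
        if start + max_tokens >= len(tokens):
            break                        # last window reached the end
    return chunks
```

With the defaults, consecutive chunks share their last/first 100 tokens, which preserves context across chunk boundaries.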

Document Ingestion Pipeline

  1. Upload & Validation: File format and size validation
  2. Parsing: Text extraction with format-specific parsers
  3. Cleaning: Garbled text cleanup and normalization
  4. Chunking: Semantic and token-aware chunking
  5. Embedding: Vectorization with OpenAI embedding model
  6. Indexing: Insertion into FAISS vector store
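A minimal sketch of the pipeline above, assuming injected helpers (parse, chunk, embed, and index are placeholder callables, not the repository's actual API; upload validation is omitted):

```python
import hashlib

def ingest_document(path, parse, chunk, embed, index):
    """Run one document through the ingestion stages.

    parse/chunk/embed/index are injected callables; the names are
    illustrative, not the repository's actual API.
    """
    text = parse(path)                        # 2. parsing
    text = " ".join(text.split())             # 3. cleaning (normalization)
    chunks = chunk(text)                      # 4. chunking
    doc_id = hashlib.sha256(text.encode()).hexdigest()[:12]
    records = [
        {"doc_id": doc_id, "chunk_index": i,
         "content": c, "embedding": embed(c)}  # 5. embedding
        for i, c in enumerate(chunks)
    ]
    index(records)                            # 6. indexing
    return doc_id, records
```

Keeping each stage behind a callable makes the pipeline easy to test with stubs and mirrors the loosely coupled module layout described earlier.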

Agentic Architecture: Decision Making and Task Processing

Multi-Agent Pipeline

AgenticAI uses multiple coordinated agents to answer questions:

graph TD
    A[User Question] --> B[Agent Orchestrator]
    B --> C[Session Creation<br/>UUID Generation]
    C --> D[Reasoning Agent]
    
    D --> E{Question<br/>Classification}
    E -->|Numerical| F[Calculator Tool]
    E -->|Contextual| G[Retrieval Agent]
    E -->|Open-ended| G
    
    G --> H[Vector Search<br/>FAISS]
    H --> I[Semantic Retrieval<br/>Top-2K Chunks]
    I --> J[LLM Reranking<br/>Relevance Scoring]
    J --> K[Doc Coverage<br/>Guarantee]
    
    K --> L[Context Assembly]
    F --> L
    
    L --> M[Reasoning Agent<br/>Context Analysis]
    M --> N[Answer Composer]
    N --> O[Formatted Response]
    
    O --> P{Question Type}
    P -->|Multiple Choice| Q[Answer Selector Agent]
    P -->|Numeric| R[Direct Answer]
    P -->|Open-ended| R
    
    Q --> S[Final Answer]
    R --> S
    
    S --> T[Session Cleanup]

Agent Roles and Decision Mechanisms

1. Reasoning Agent (Thinking and Decision)

Responsibilities:

  • Classifies question type (open-ended, multiple-choice, numerical, calculation)
  • Performs intent analysis
  • Decides which tools to use
  • Performs reasoning with context

Decision Mechanism:

# Question analysis using LLM
{
    "type": "NUMERICAL",
    "needs_calculation": True,
    "needs_search": True,
    "intent": "Find total emission value",
    "clarified_question": "..."
}
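One way to drive this decision mechanism is to request JSON from the LLM and validate it before use. The helper below is a sketch with a stubbed llm_call; only "NUMERICAL" appears in the example above, so the other type labels are assumptions based on the classification list:

```python
import json

# Assumed labels, derived from the classification list above.
VALID_TYPES = {"OPEN_ENDED", "MULTIPLE_CHOICE", "NUMERICAL", "CALCULATION"}

def classify_question(question, llm_call):
    """Ask the LLM for a JSON classification and validate the result.

    llm_call is a stand-in for the real per-session client.
    """
    raw = llm_call(
        "Classify the question and reply with JSON containing "
        "type, needs_calculation, needs_search, intent.\n" + question
    )
    decision = json.loads(raw)
    if decision.get("type") not in VALID_TYPES:
        raise ValueError(f"Unexpected question type: {decision.get('type')}")
    return decision
```

Validating the parsed JSON up front keeps malformed LLM output from propagating into the tool-selection and retrieval stages.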

2. Retrieval Agent (Information Retrieval)

Responsibilities:

  • Hybrid retrieval (semantic search + LLM reranking)
  • Guarantees at least one chunk from each document source
  • Relevance scoring

Processing Steps:

  1. Create query embedding
  2. Semantic search with FAISS (top-2K)
  3. Reranking with LLM (top-K)
  4. Ensure document coverage guarantee
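Assuming chunks carry a doc_id and a precomputed embedding, the coverage guarantee in step 4 can be sketched like this (cosine similarity in pure Python stands in for FAISS, and the LLM reranking step is omitted):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_emb, chunks, top_k=5):
    """Rank chunks by similarity, then ensure every doc_id appears at least once."""
    ranked = sorted(chunks, key=lambda c: cosine(query_emb, c["embedding"]),
                    reverse=True)
    selected = ranked[:top_k]
    covered = {c["doc_id"] for c in selected}
    for c in ranked[top_k:]:
        if c["doc_id"] not in covered:   # coverage guarantee: pull in the best
            selected.append(c)           # remaining chunk of any missing doc
            covered.add(c["doc_id"])
    return selected
```

Because the tail of the ranked list is scanned in similarity order, each missing document contributes its single best chunk.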

3. Answer Composer Agent (Answer Composition)

Responsibilities:

  • Expresses reasoning results in natural language
  • Source attribution with citations
  • Provenance tracking (page/line information)

Session Isolation

A new session is created for each question:

graph LR
    A[Question Arrives] --> B[Generate UUID Session]
    B --> C[Create OpenAI Client]
    C --> D[Pipeline Execution]
    D --> E[Answer Generated]
    E --> F[Session Cleanup]
    F --> G[Client Destroyed]

Session Management:

  • Unique UUID for each session
  • Single OpenAI client per session
  • Automatic cleanup when pipeline completes
  • Prevents cross-session contamination

Tool Coordination

LLM calls tools when needed:

graph TD
    A[Reasoning Agent] --> B{Tool Needed?}
    B -->|Calculation| C[Calculator Tool]
    B -->|Unit Conversion| D[Unit Converter]
    B -->|No Tool| E[Direct Reasoning]
    
    C --> F[Safe Python Eval]
    D --> G[Pint Library]
    
    F --> H[Result Integration]
    G --> H
    E --> I[Answer Composition]
    H --> I

Overall Architectural Design

High-Level System Architecture

graph TB
    subgraph "Frontend Layer"
        UI[Web UI<br/>index.html + app.js]
    end
    
    subgraph "API Layer"
        API[FastAPI Server<br/>Port 8000]
        UISRV[UI Server<br/>Port 5500]
    end
    
    subgraph "Orchestration Layer"
        ORCH[Agent Orchestrator]
        SESS[Session Manager]
    end
    
    subgraph "Agent Layer"
        RET[Retrieval Agent]
        REAS[Reasoning Agent]
        COMP[Answer Composer]
        SURV[Survey Agent]
        DOC[Document Ingestion]
    end
    
    subgraph "Core Services"
        OAI[OpenAI Manager]
        CHUNK[Document Chunker]
        VEC[Vector Store<br/>FAISS]
    end
    
    subgraph "Tools"
        CALC[Calculator]
        CONV[Unit Converter]
    end
    
    subgraph "External Services"
        OPENAI[OpenAI API]
    end
    
    UI --> API
    UISRV -.Serves.-> UI
    API --> ORCH
    ORCH --> SESS
    ORCH --> RET
    ORCH --> REAS
    ORCH --> COMP
    ORCH --> SURV
    ORCH --> DOC
    
    RET --> VEC
    REAS --> CALC
    REAS --> CONV
    DOC --> CHUNK
    CHUNK --> VEC
    
    RET --> OAI
    REAS --> OAI
    COMP --> OAI
    
    OAI --> OPENAI
    
    style UI fill:#e1f5ff
    style ORCH fill:#fff4e1
    style OAI fill:#ffe1e1

Question Answering Workflow

sequenceDiagram
    participant User
    participant UI
    participant API
    participant Orchestrator
    participant Reasoning
    participant Retrieval
    participant Composer
    participant OpenAI
    
    User->>UI: Ask Question
    UI->>API: POST /process-question
    API->>Orchestrator: answer_question()
    
    Orchestrator->>Orchestrator: Create Session (UUID)
    Orchestrator->>OpenAI: Create Client
    
    Orchestrator->>Reasoning: classify_question()
    Reasoning->>OpenAI: LLM Classification
    OpenAI-->>Reasoning: Question Type
    
    Orchestrator->>Retrieval: retrieve()
    Retrieval->>Retrieval: Vector Search (FAISS)
    Retrieval->>OpenAI: LLM Reranking
    OpenAI-->>Retrieval: Ranked Chunks
    
    Orchestrator->>Reasoning: reason_with_context()
    Reasoning->>OpenAI: LLM Reasoning
    OpenAI-->>Reasoning: Reasoning Result
    
    Orchestrator->>Composer: compose_answer()
    Composer->>OpenAI: Generate Answer
    OpenAI-->>Composer: Formatted Answer
    
    Composer-->>Orchestrator: Final Answer
    Orchestrator->>Orchestrator: Cleanup Session
    
    Orchestrator-->>API: Response with Provenance
    API-->>UI: JSON Response
    UI-->>User: Display Answer

Data Flow Architecture

graph LR
    subgraph "Input"
        A1[Documents]
        A2[URLs]
        A3[Questions]
    end
    
    subgraph "Processing"
        B1[Document Parsers]
        B2[Chunker]
        B3[Embeddings]
    end
    
    subgraph "Storage"
        C1[Vector Store<br/>FAISS Index]
        C2[Metadata Store]
    end
    
    subgraph "Retrieval"
        D1[Semantic Search]
        D2[LLM Reranking]
        D3[Doc Coverage]
    end
    
    subgraph "Reasoning"
        E1[Classification]
        E2[Tool Selection]
        E3[Context Analysis]
    end
    
    subgraph "Output"
        F1[Structured Answer]
        F2[Provenance Info]
        F3[Source Attribution]
    end
    
    A1 --> B1
    A2 --> B1
    B1 --> B2
    B2 --> B3
    B3 --> C1
    B2 --> C2
    
    A3 --> D1
    C1 --> D1
    D1 --> D2
    D2 --> D3
    
    D3 --> E1
    E1 --> E2
    E2 --> E3
    C2 --> E3
    
    E3 --> F1
    C2 --> F2
    F2 --> F3
    F1 --> F3


Features

Document Upload & Processing

  • Upload PDF, Excel, or JSON files
  • Automatic text extraction and chunking
  • OCR support for scanned documents

Web Scraping

  • Extract content from URLs
  • Batch URL processing
  • Duplicate detection and content deduplication
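Content deduplication is commonly done by hashing normalized text; the sketch below illustrates that approach, which may differ from the repository's implementation:

```python
import hashlib

def deduplicate(pages):
    """Drop (url, text) pairs whose normalized text was already seen."""
    seen = set()
    unique = []
    for url, text in pages:
        # Normalize whitespace and case so trivial variants hash identically.
        digest = hashlib.sha256(
            " ".join(text.split()).lower().encode()
        ).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append((url, text))
    return unique
```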

Intelligent Question Answering

  • Context-aware responses using RAG
  • Source attribution with provenance tracking
  • Multi-turn conversation support

Survey Processing

  • Automatic survey form generation from JSON
  • Support for multiple question types:
    • Single choice
    • Multiple choice
    • Numeric input
    • Open text
  • Dependency-based question flow
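Dependency-based flow can be modeled by gating each question on a prior answer. The schema below (id, depends_on) is illustrative, not the project's actual survey JSON format:

```python
def next_question(questions, answers):
    """Return the first unanswered question whose dependency is satisfied.

    Each question is a dict; 'depends_on' holds a (question_id, value) pair.
    """
    for q in questions:
        if q["id"] in answers:
            continue                      # already answered
        dep = q.get("depends_on")
        if dep is None or answers.get(dep[0]) == dep[1]:
            return q
    return None                           # survey complete

questions = [
    {"id": "q1", "type": "single_choice", "text": "Do you track emissions?"},
    {"id": "q2", "type": "numeric", "text": "Annual CO2 (tons)?",
     "depends_on": ("q1", "yes")},
    {"id": "q3", "type": "open_text", "text": "Any comments?"},
]
```

With this gating, q2 is only ever presented to respondents who answered "yes" to q1; everyone else skips straight to q3.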
