# Archivist Usage Guide

**Author:** Raptopoulos Petros [petrosrapto@gmail.com]  
**Date:** 2025/03/10

## Overview
This notebook demonstrates how to use the Archivist agent for document indexing. The Archivist supports multiple indexing backends including VectorDB (Pinecone/ChromaDB) and LightRAG.

## How to Use

### 1. Installation
First, install the Archivist package (see cell below).

### 2. Configuration
Configure the agent by enabling the desired indexing backends:
- `enable_vectordb`: Enable vector database indexing (Pinecone/ChromaDB)
- `enable_lightrag`: Enable LightRAG graph-based indexing
- `run_name`: Optional name for the indexing run
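
The three keys above can be assembled with a small helper. This is a minimal sketch, not part of the Archivist package: `make_config` is a hypothetical name, and rejecting a config with no backend enabled is an assumption about sensible usage rather than documented Archivist behavior.

```python
from typing import Any, Dict, Optional


def make_config(
    enable_vectordb: bool = False,
    enable_lightrag: bool = False,
    run_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a config dict using the three keys described above."""
    # Assumption: a config with no backend enabled is not useful, so reject it.
    if not (enable_vectordb or enable_lightrag):
        raise ValueError("enable at least one indexing backend")
    config: Dict[str, Any] = {
        "enable_vectordb": enable_vectordb,
        "enable_lightrag": enable_lightrag,
    }
    # run_name is optional, so it is only included when provided.
    if run_name is not None:
        config["run_name"] = run_name
    return config


if __name__ == "__main__":
    print(make_config(enable_lightrag=True, run_name="Example Index"))
```

The resulting dictionary has the same shape as the `config` passed to `Archivist(config)` in the example at the end of this guide.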

### 3. Initialize and Index
Create an `Archivist` instance with your configuration, then call its `index()` method with the path of the file to index (the `filePath` argument).

### 4. Supported File Types
- PDF files (.pdf)
- Word documents (.doc, .docx)  
- Text files (.txt)
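
Since only these extensions are listed as supported, it may help to check a path before passing it to `index()`. The sketch below is illustrative only: `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical helpers, not part of the Archivist package, and matching extensions case-insensitively is an assumption.

```python
from pathlib import Path

# Extensions listed in this guide. The constant is a hypothetical helper,
# not something exported by Archivist.
SUPPORTED_EXTENSIONS = {".pdf", ".doc", ".docx", ".txt"}


def is_supported(file_path: str) -> bool:
    """Return True if the path ends in one of the extensions listed above."""
    # Assumption: extensions are matched case-insensitively.
    return Path(file_path).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("example.docx", "notes.TXT", "slides.pptx"):
        print(name, is_supported(name))
```

A caller could skip or report unsupported files up front instead of discovering the problem during indexing.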

## Example Usage
The example below shows how to index a document using the LightRAG backend.

In [1]:
!pip install /Users/petrosrapto/Desktop/DiplomaThesis/DiplomaThesis/MultiAgentFramework/Archivist

Processing /Users/petrosrapto/Desktop/DiplomaThesis/DiplomaThesis/MultiAgentFramework/Archivist
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: Archivist
  Building wheel for Archivist (pyproject.toml) ... done
  Created wheel for Archivist: filename=archivist-0.1.2-py3-none-any.whl size=31343 sha256=f877d29a62023ce85c7cdc28e59899bb156a8892bd143f6f959923f83de64db5
  Stored in directory: /private/var/folders/0p/9p8jk1wj4tdcm7jypv1nxwv00000gn/T/pip-ephem-wheel-cache-oa9scmh3/wheels/01/6e/a5/4ae2665af66e49e31143d55e15798807563885dd35810a8dca
Successfully built Archivist
Installing collected packages: Archivist
  Attempting uninstall: Archivist
    Found existing installation: Archivist 0.1.2
    Uninstalling Archivist-0.1.2:
      Successfully uninstalled Archivist-0.1.2
Successfully installed Archivist-0.1.2

In [None]:
"""
Simple usage example of the Archivist agent.
"""

from Archivist import Archivist

# Configure the agent
# This config sets the nodes of the graph; when querying, you can pick a subset
# of the retrievers to use.
config = {
    "enable_vectordb": False,
    "enable_lightrag": True,
    "run_name": "Example Index"
}

# Initialize the archivist agent
archivist = Archivist(config)

# Example 1: Direct index
print("\n=== Example 1: Direct Index ===\n")
results = archivist.index(filePath="example.docx")

[2025-05-14 19:26:54,823]  [AppLogger] [INFO] [Initializing GraphBuilder...]
[2025-05-14 19:26:54,826]  [AppLogger] [INFO] [GraphBuilder initialized successfully.]
[2025-05-14 19:26:54,827]  [AppLogger] [INFO] [Initializing Archivist with config: {'enable_vectordb': False, 'enable_lightrag': True, 'run_name': 'Example Index'}]
[2025-05-14 19:26:54,827]  [AppLogger] [INFO] [Setting up indexers...]
[2025-05-14 19:26:54,828]  [AppLogger] [INFO] [Initializing LightRAGIndexer with base URL: http://localhost:9621]
[2025-05-14 19:26:54,829]  [AppLogger] [INFO] [LightRAGIndexer configuration: clear_existing=True, max_polling_time=300]
[2025-05-14 19:26:54,829]  [AppLogger] [INFO] [LightRAGIndexer initialized successfully.]
[2025-05-14 19:26:54,829]  [AppLogger] [INFO] [Adding indexer: lightrag]
[2025-05-14 19:26:54,832]  [AppLogger] [INFO] [Indexer lightrag added successfully.]
[2025-05-14 19:26:54,834]  [AppLogger] [INFO] [LightRAGIndexer added successfully.]
[2025-05-14 19:26:54,834]  [AppLo

=== Example 1: Direct Index ===

[2025-05-14 19:26:56,888]  [AppLogger] [INFO] [Delete result: {'status': 'success', 'message': 'All documents cleared successfully. Deleted 0 files.'}]
[2025-05-14 19:26:56,889]  [AppLogger] [INFO] [Processing document 1/77]
[2025-05-14 19:26:56,891]  [AppLogger] [INFO] [Uploading document to LightRAG: tmpu21mn4kb.txt]
[2025-05-14 19:26:56,898]  [AppLogger] [INFO] [Document uploaded successfully: {'status': 'success', 'message': "File 'tmpu21mn4kb.txt' uploaded successfully. Processing will continue in background."}]
[2025-05-14 19:26:58,904]  [AppLogger] [INFO] [Processing document 2/77]
[2025-05-14 19:26:58,909]  [AppLogger] [INFO] [Uploading document to LightRAG: tmpftr82g25.txt]
[2025-05-14 19:26:58,923]  [AppLogger] [INFO] [Document uploaded successfully: {'status': 'success', 'message': "File 'tmpftr82g25.txt' uploaded successfully. Processing will continue in background."}]
[2025-05-14 19:27:00,930]  [AppLogger] [INFO] [Processing document 3/77]
[2025-05-14 19:27:00,934]  [AppLo