A high-performance, privacy-focused AI agent built with FastAPI and the BeeAI Framework. This tool performs autonomous security research and analysis using local LLMs (via Ollama) and Wikipedia, streaming real-time thought processes and final reports to a responsive web dashboard.
- 🔒 Local LLM Processing: All AI inference runs on your hardware using Ollama. Note: Search tools (DuckDuckGo, Wikipedia) send queries to external services.
- 🧠 Agentic Workflow: Uses the BeeAI Framework to Plan (`ThinkTool`) > Research (`DuckDuckGoSearchTool`, `WikipediaTool`, `OpenMeteoTool`) > Synthesize.
- ⚡ High Performance:
  - Built on FastAPI & Uvicorn for async concurrency.
  - Uses `orjson` for ultra-fast JSON serialization (Rust-based).
  - Server-Sent Events (SSE) for low-latency streaming.
- 🛡️ Resource Management: Implements Async Semaphores to prevent GPU OOM (Out of Memory) errors by queuing concurrent agent requests.
- 💻 Integrated Dashboard: Clean, responsive HTML/JS frontend included. No separate build step required.
- 📄 RAG Document Upload (Optional): Upload documents (PDF, DOCX, PPTX, images, etc.) for context-aware conversations using docling and ChromaDB.
- Python 3.10+
- Ollama installed and running locally (Download Ollama).
- Hardware: A GPU with at least 8GB VRAM is recommended (24GB recommended for larger context windows and models).
- Clone the repository:

  ```bash
  git clone https://github.com/odevstudio/beeai-analyst.git
  cd beeai-analyst
  ```

- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```
- Install core dependencies:

  ```bash
  pip install fastapi uvicorn orjson "beeai-framework[duckduckgo]"
  ```

  Pinned versions used in this project, in case you run into trouble:

  ```bash
  pip install orjson==3.11.5 fastapi==0.128.0 uvicorn==0.40.0 "beeai-framework[duckduckgo]==0.1.74"
  ```

- Install RAG dependencies (optional, for document upload):

  ```bash
  pip install docling chromadb langchain-text-splitters openai
  ```

  These enable the document upload feature. The app will work without them (file upload will be disabled).
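That graceful degradation can be sketched as a simple optional-import guard at startup (an illustrative pattern, not necessarily the exact code in this project):

```python
# Detect optional RAG dependencies at import time; if they are missing,
# disable the upload feature instead of crashing the app.
try:
    import docling    # noqa: F401  (text extraction)
    import chromadb   # noqa: F401  (vector store)
    RAG_AVAILABLE = True
except ImportError:
    RAG_AVAILABLE = False

print(f"RAG available: {RAG_AVAILABLE}")
```

Endpoints that depend on these packages can then check `RAG_AVAILABLE` and return a clear error instead of an ImportError traceback.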
The agent is configured to use a custom model named `gemma-agent` via the OpenAI-compatible endpoint. You need to create this model in Ollama so that the system prompt works correctly. This project was built with `gemma-3-27b-it-q4_k_m.gguf`.
- Pull the base model:

  ```bash
  ollama pull yourmodel
  ```

- Create a `Modelfile`: Create a file named `Modelfile` in your model folder (an example `Modelfile` for `gemma-3-27b-it-q4_k_m.gguf` is included).

- Create the custom model:

  ```bash
  ollama create gemma-agent -f Modelfile
  ```
If you want to use the document upload feature, you also need the `nomic-embed-text` embedding model:

```bash
ollama pull nomic-embed-text
```

This model is used to generate embeddings for document chunks and semantic search queries.
- Start the Server:

  ```bash
  python main.py
  ```

  Note: Ensure Ollama is running in the background (`ollama serve`).

- Access the Dashboard: Open your browser and navigate to http://localhost:5000.

- Start an Analysis: Type a query like "Analyze the security risks of Quantum Computing for banking encryption" and hit Start.

- Upload Documents (Optional): Use the file upload section in the dashboard to upload documents. The agent will use their content to inform its responses.
The application supports uploading documents for Retrieval-Augmented Generation (RAG). When you upload documents:
- Text Extraction: Documents are processed using `docling` to extract text from various formats.
- Chunking: Text is split into overlapping chunks (500 chars, 50 char overlap) using LangChain's text splitter.
- Embedding: Chunks are embedded using the `nomic-embed-text` model via Ollama.
- Storage: Embeddings are stored in a session-based ChromaDB vector store.
- Retrieval: When you ask a question, the top 5 most relevant chunks are retrieved and injected into the agent's context.
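The chunking step above can be sketched with a simple sliding window. This is a simplified stand-in for LangChain's `RecursiveCharacterTextSplitter`, which additionally prefers splitting at separators such as paragraph breaks rather than at fixed offsets:

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks where consecutive chunks share
    chunk_overlap characters, so a sentence cut at a boundary still
    appears whole in at least one chunk."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - chunk_overlap, 1), step)]

document = "abcdefghij" * 100             # 1000 characters of sample text
chunks = chunk_text(document)
print(len(chunks))                        # 3 chunks: 0-500, 450-950, 900-1000
print(chunks[0][-50:] == chunks[1][:50])  # True: 50-character overlap
```

Each chunk is then embedded independently, so the overlap is what keeps boundary-straddling sentences retrievable.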
| Format | Extensions |
|---|---|
| PDF | .pdf |
| Word Documents | .doc, .docx |
| PowerPoint | .pptx |
| Excel | .xlsx |
| HTML | .html |
| Markdown | .md |
| Plain Text | .txt |
| CSV | .csv |
| Images (OCR) | .png, .jpg, .jpeg |
When uploaded documents are consulted, you will see a [RAG] entry in the Live Terminal showing:
- Number of document chunks retrieved
- Source files and their relevance scores
The application follows a modern asynchronous architecture:
- Frontend: Sends a POST request to `/stream`.
- FastAPI Endpoint:
  - Acquires a GPU semaphore (locks execution to 1 active agent to save VRAM).
  - Spawns an asynchronous background task.
- RAG Context Retrieval (if documents are uploaded):
  - Embeds the query using `nomic-embed-text`.
  - Retrieves the top-k relevant chunks from ChromaDB.
  - Injects the context into the agent's system instructions.
- BeeAI Agent:
  - Receives the prompt (with RAG context if available).
  - `ThinkTool`: Plans the research steps.
  - `DuckDuckGoSearchTool`: Searches the internet.
  - `WikipediaTool`: Fetches factual data.
  - `OpenMeteoTool`: Fetches weather data if needed.
  - LLM: Synthesizes findings into a structured report.
- Streaming:
  - Events are serialized immediately using `orjson`.
  - Data is pushed to an `asyncio.Queue`.
  - The frontend receives events via Server-Sent Events (SSE) and updates the UI in real time.
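The streaming half of this flow can be sketched as a plain `asyncio.Queue` producer/consumer pair. Stdlib `json` stands in for `orjson` here to keep the sketch dependency-free, and the generator is the kind of object FastAPI's `StreamingResponse` would consume:

```python
import asyncio
import json  # the real app uses orjson; stdlib json keeps this sketch dependency-free

def to_sse_frame(event: dict) -> str:
    # SSE wire format: a "data:" line per event, terminated by a blank line.
    return f"data: {json.dumps(event)}\n\n"

async def agent_task(queue: asyncio.Queue) -> None:
    # Stand-in for the background agent: push events as they occur,
    # then a None sentinel to signal the end of the stream.
    for step in ("thinking", "searching", "done"):
        await queue.put({"type": step})
    await queue.put(None)

async def event_stream(queue: asyncio.Queue):
    # In the real app, a generator like this feeds StreamingResponse.
    while (event := await queue.get()) is not None:
        yield to_sse_frame(event)

async def main() -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    asyncio.create_task(agent_task(queue))
    return [frame async for frame in event_stream(queue)]

frames = asyncio.run(main())
print(frames[0])
```

Because the producer and consumer only share the queue, slow clients never block the agent, and the first event can be flushed before the report is finished.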
You can adjust the following variables in `beefast.py` to fit your hardware:
| Variable | Default | Description |
|---|---|---|
| `OLLAMA_NUM_CTX` | 8192 | Context window size. Reduce to 4096 if you have low VRAM. |
| `gpu_semaphore` | 1 | Number of concurrent agents allowed. Increase if you have multiple GPUs. |
| `ChatModel` | `gemma-agent` | Change this string to use Llama3 or Mistral (`openai:llama3`, etc.). |
| Variable | Default | Description |
|---|---|---|
| `EMBEDDING_MODEL` | `nomic-embed-text` | Ollama model for generating embeddings. |
| `CHUNK_SIZE` | 500 | Size of text chunks in characters. |
| `CHUNK_OVERLAP` | 50 | Overlap between consecutive chunks. |
| `TOP_K_RESULTS` | 5 | Number of relevant chunks to retrieve per query. |
| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Main dashboard UI |
| `/stream` | POST | Start agent analysis (SSE stream) |
| `/upload` | POST | Upload document for RAG |
| `/files/{session_id}` | GET | List uploaded files for session |
| `/files/{session_id}/(unknown)` | DELETE | Remove file from session |
| `/rag-status` | GET | Check RAG system availability |
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the project.
- Create your feature branch (`git checkout -b feature/AmazingFeature`).
- Commit your changes (`git commit -m 'Add some AmazingFeature'`).
- Push to the branch (`git push origin feature/AmazingFeature`).
- Open a Pull Request.
Distributed under the MIT License. See LICENSE for more information.