A fully local distributed AI research system with orchestration, ensemble inference, reproducible benchmarking, statistical testing, visual analytics, and publication-ready artifacts.
This repository implements a complete local distributed AI network for research.
The workflow is simple:
- A user query goes to one orchestrator node.
- The orchestrator sends that query to multiple local model agents.
- Each agent runs a different LLM and returns answer + latency + token metadata.
- The orchestrator aggregates the responses using switchable ensemble strategies.
- All metrics are logged and exported into publication-ready outputs.
This project is designed for reproducibility and IEEE-style reporting.
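The fan-out step in the workflow above can be sketched in a few lines (a stdlib-only illustration, not the project's actual orchestrator code; the agent addresses and response fields are taken from this README, while the helper names are hypothetical):

```python
# Sketch of the orchestrator fan-out: one prompt sent to all agents in
# parallel, each reply timed and normalized into a common record.
# Hypothetical helpers; the real orchestrator is async FastAPI code
# with timeouts, retries, and strategy-aware aggregation.
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

AGENTS = {
    "llama3.2:3b": "http://172.16.185.209:11434",
    "qwen2.5:3b": "http://172.16.185.218:11434",
    "phi3:mini": "http://172.16.185.220:11434",
    "gemma2:2b": "http://172.16.185.222:11434",
}

def query_agent(model: str, base_url: str, prompt: str) -> dict:
    """Call one agent's Ollama /api/generate endpoint and time the round trip."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(f"{base_url}/api/generate", data=payload,
                                 headers={"Content-Type": "application/json"})
    start = time.perf_counter()
    with urllib.request.urlopen(req, timeout=120) as resp:
        data = json.load(resp)
    return {
        "model_id": model,
        "response": data.get("response", ""),
        "latency_ms": (time.perf_counter() - start) * 1000.0,
        "token_count": data.get("eval_count", 0),
    }

def fan_out(prompt: str) -> list:
    """Send the same prompt to every agent in parallel and collect results."""
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = [pool.submit(query_agent, m, u, prompt) for m, u in AGENTS.items()]
        return [f.result() for f in futures]
```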
For a visual explanation of the full research pipeline, watch:
https://youtu.be/Zybs0Omg600
- Md Anisur Rahman Chowdhury¹\*, Kefei Wang¹
- ¹ Dept. of Computer and Information Science, Gannon University, USA
- Emails: engr.aanis@gmail.com, wang039@gannon.edu
- Why: test multi-model collaboration instead of single-model output.
- Built: one orchestrator VM + four model-agent VMs on a private LAN.
- Why: compare aggregation algorithms under the same benchmark pipeline.
- Built: Majority, Weighted, ISP, Topic Routing, Debate.
- Why: produce repeatable, statistically valid evaluation.
- Built: MMLU, GSM8K, TruthfulQA runners with deterministic controls.
- Why: directly support conference submission workflow.
- Built: CSV/JSON exports, PNG plots, IEEE LaTeX tables, IEEE paper source, Overleaf package, and PDF.
- Why: make results easy to explore and present.
- Built: `docs/` website + interactive dashboard with charts.
- Orchestrator: 172.16.185.223
- Agent 1: 172.16.185.209 (llama3.2:3b)
- Agent 2: 172.16.185.218 (qwen2.5:3b)
- Agent 3: 172.16.185.220 (phi3:mini)
- Agent 4: 172.16.185.222 (gemma2:2b)
```text
User Query
    |
    v
Orchestrator (FastAPI)
    |---> Agent-1 (Ollama)
    |---> Agent-2 (Ollama)
    |---> Agent-3 (Ollama)
    |---> Agent-4 (Ollama)
    v
Aggregation + Metrics + Statistical Testing + Artifact Export
```
```mermaid
flowchart TD
    U[User Query] --> O[Orchestrator FastAPI]
    O --> A1[Agent 1 llama3.2:3b]
    O --> A2[Agent 2 qwen2.5:3b]
    O --> A3[Agent 3 phi3:mini]
    O --> A4[Agent 4 gemma2:2b]
    A1 --> G[Aggregation + Metrics + Statistics]
    A2 --> G
    A3 --> G
    A4 --> G
    G --> R[Final Response + Artifacts]
```
Interactive public version:
https://anis151993.github.io/Distributed-AI/#architecture
- `GET /health`
- `GET /agents`
- `POST /query`
Each agent exposes the Ollama REST API on port 11434 and returns:
`response`, `latency_ms`, `token_count`, `model_id`
Configured model set in this run:
`llama3.2:3b`, `qwen2.5:3b`, `phi3:mini`, `gemma2:2b`
You can swap in other models (CPU/GPU permitting) through `agents/agent_config.yaml`.
Implemented and switchable at runtime:
`majority`, `weighted`, `isp`, `topic`, `debate`
All strategies are modular in orchestrator code:
`orchestrator/aggregator.py`, `orchestrator/router.py`, `orchestrator/debate.py`
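As an illustration of the simplest strategy, a majority vote over normalized answers might look like this (a hypothetical helper; the project's actual implementation lives in `orchestrator/aggregator.py` and may differ):

```python
# Illustrative majority-vote aggregation: count normalized answers and
# report the winner plus the fraction of agents that agreed with it.
from collections import Counter

def majority_vote(responses: list) -> dict:
    """Pick the most common normalized answer and report the agreement rate."""
    answers = [r["response"].strip().lower() for r in responses]
    winner, count = Counter(answers).most_common(1)[0]
    return {"answer": winner, "agreement_rate": count / len(answers)}

votes = [{"response": "4"}, {"response": "4"}, {"response": " 4 "}, {"response": "5"}]
print(majority_vote(votes))  # {'answer': '4', 'agreement_rate': 0.75}
```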
To reduce bottlenecks in CPU-friendly settings:
- Shared direct-strategy fan-out (single agent call reused by majority/weighted/ISP/topic)
- Debate early-stop when round-1 agreement is complete
- Conservative token budgets and timeout tuning for local hardware
- Better answer normalization for benchmark parsing stability
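The normalization idea from the last bullet can be sketched as follows (a hypothetical helper; the project's actual parsing rules may differ):

```python
# Sketch of answer normalization for benchmark parsing: collapse a raw
# model reply into a canonical short answer so votes can be compared.
import re

def normalize_answer(raw: str) -> str:
    """Return the last number, an A-D choice letter, or lowercased text."""
    text = raw.strip()
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    if numbers:
        return numbers[-1]               # GSM8K-style: keep the final number
    letter = re.fullmatch(r"\(?([A-Da-d])\)?\.?", text)
    if letter:
        return letter.group(1).upper()   # MMLU-style multiple choice
    return text.lower().rstrip(".")

print(normalize_answer("The answer is 1,024."))  # -> "1024"
print(normalize_answer("(b)"))                   # -> "B"
```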
- MMLU
- GSM8K
- TruthfulQA
- Accuracy
- F1
- Mean latency
- Latency std
- Agreement rate
- CPU / GPU usage
- Paired t-test
- Wilcoxon signed-rank
- Confidence intervals
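As a sketch of the paired-testing idea (stdlib-only; the pipeline may instead use `scipy.stats.ttest_rel` and `scipy.stats.wilcoxon`):

```python
# Paired t-test on per-item score differences between two strategies.
# Assumes scores_a and scores_b are aligned per benchmark item and the
# differences have nonzero variance.
import math
import statistics

def paired_t(scores_a, scores_b):
    """Return (t statistic, degrees of freedom) for paired samples."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)        # sample standard deviation
    t = mean / (sd / math.sqrt(n))
    return t, n - 1

t, df = paired_t([0.8, 0.7, 0.9, 0.85], [0.6, 0.65, 0.7, 0.6])
# t ≈ 4.04, df = 3
```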
Outputs are generated as:
- CSV
- JSON
- PNG plots
- IEEE LaTeX tables
- Protected manuscript package (`.tar.gpg`)
- Decryption guide (`.txt`)
Primary artifact folders:
- `artifacts/benchmark_runs/run_20260226_193331/`
- `artifacts/optimization_runs/`
Key files:
- `overall_summary.csv`
- `overall_summary.json`
- `overall_significance.csv`
- `paper/tables/aggregate_strategy_results.tex`
- `paper/tables/per_benchmark_results.tex`
- `paper/tables/significance_highlights.tex`
- `paper/IEEE_Distributed_AI_Ensemble_Protected.tar.gpg`
- `paper/PAPER_ACCESS_INSTRUCTIONS.txt`
Web UI files:
- `docs/index.html` (public research portal)
- `docs/interactive_dashboard.html` (interactive charts)
- `docs/styles.css`
- `docs/script.js`
- `docs/assets/data/report_data.js` (embedded chart data)
- `docs/assets/paper/IEEE_Distributed_AI_Ensemble_Protected.tar.gpg` (encrypted paper package)
- `docs/assets/paper/PAPER_ACCESS_INSTRUCTIONS.txt` (public decryption guide)
- `docs/assets/results/*.csv|*.json` (public downloadable benchmark outputs)
Public portal features:
- Interactive architecture map (click nodes + flow simulation)
- Research video section with embedded YouTube walkthrough
- Live AI playground (`/query` API caller with strategy controls)
- Individual model selection (choose any subset of agents per query)
- Per-query visual analytics (per-model latency/tokens and answer distribution charts)
- Password-gated protected paper package access
- Direct links to project repository and GitHub profile
Enable on GitHub:
- Open repository Settings.
- Go to Pages.
- Source: `main` branch, folder `/docs`.
- Save.
Local preview:
```bash
cd docs
python3 -m http.server 8080
# open http://localhost:8080
```

Public URL:
https://anis151993.github.io/Distributed-AI/
If you want browser visitors to run live queries from GitHub Pages, set CORS on the orchestrator:
```bash
export CORS_ALLOW_ORIGINS="https://anis151993.github.io,http://localhost:8080,http://127.0.0.1:8080"
uvicorn orchestrator.main:app --host 0.0.0.0 --port 8000
```

Then, in the website playground, set your public orchestrator endpoint (for example, your tunnel URL) and run a query.
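One plausible way the orchestrator could read that variable (an assumption: the real app presumably feeds this list into Starlette's `CORSMiddleware`; only the parsing step is shown here):

```python
# Parse CORS_ALLOW_ORIGINS into a list of origins, with a local-dev default.
# Hypothetical helper; the project's actual wiring may differ.
import os

def allowed_origins(default: str = "http://localhost:8080") -> list:
    """Split the comma-separated env var into a clean list of origins."""
    raw = os.environ.get("CORS_ALLOW_ORIGINS", default)
    return [origin.strip() for origin in raw.split(",") if origin.strip()]

os.environ["CORS_ALLOW_ORIGINS"] = "https://anis151993.github.io, http://localhost:8080"
print(allowed_origins())  # ['https://anis151993.github.io', 'http://localhost:8080']
```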
```bash
curl -s http://127.0.0.1:8000/health | jq .
curl -s http://127.0.0.1:8000/agents | jq .
```

```bash
curl -s -X POST http://127.0.0.1:8000/query \
  -H 'Content-Type: application/json' \
  -d '{
    "prompt": "What is 2+2? Return only the final numeric answer.",
    "strategy": "majority",
    "seed": 42,
    "temperature": 0.0,
    "deterministic": true,
    "max_tokens": 24
  }' | jq .
```

Smoke run:

```bash
python run_experiments.py \
  --orchestrator-url http://127.0.0.1:8000 \
  --benchmarks mmlu,gsm8k,truthfulqa \
  --strategies majority,weighted,isp,topic,debate \
  --repetitions 1 \
  --samples-per-benchmark 2 \
  --seed 42 \
  --deterministic \
  --max-agents 2
```

Full run:

```bash
python run_experiments.py \
  --orchestrator-url http://127.0.0.1:8000 \
  --benchmarks mmlu,gsm8k,truthfulqa \
  --strategies majority,weighted,isp,topic,debate \
  --repetitions 5 \
  --samples-per-benchmark 20 \
  --seed 42 \
  --deterministic
```

Validated from the public GitHub Pages portal and public orchestrator endpoint.
- Public site: https://anis151993.github.io/Distributed-AI/
- Public orchestrator endpoint: https://ai.marcbd.site
- CORS status: verified for https://anis151993.github.io
- Agent health: 4/4 healthy
- Playground `majority` query: passed
- Returned answer: `4`
- Query timestamp (UTC): `2026-02-27T06:47:49.561498+00:00`
Evidence snapshot (from public Playground response):
```json
{
  "strategy": "majority",
  "aggregate": {
    "answer": "4",
    "agreement_rate": 1
  },
  "agent_count": 4,
  "total_latency_ms": 33847.143
}
```

- Fixed random seed support (`--seed`)
- Deterministic mode (`--deterministic`)
- Configurable temperature
- Fixed benchmark repetitions
- Explicit independent/dependent variables in the paper
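These controls plausibly map onto an Ollama request body like the following (the option names `seed`, `temperature`, and `num_predict` are standard Ollama generation options; the exact wiring inside this project is an assumption):

```python
# Sketch: how the reproducibility flags might translate into one
# Ollama /api/generate request body. Hypothetical helper.
def build_request(prompt: str, seed: int = 42, temperature: float = 0.0,
                  max_tokens: int = 24) -> dict:
    """Build a deterministic, token-budgeted Ollama request."""
    return {
        "model": "llama3.2:3b",
        "prompt": prompt,
        "stream": False,
        "options": {
            "seed": seed,                # fixed seed -> repeatable sampling
            "temperature": temperature,  # 0.0 -> greedy decoding
            "num_predict": max_tokens,   # conservative token budget
        },
    }

req = build_request("What is 2+2? Return only the final numeric answer.")
```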
- Add agent entries in `agents/agent_config.yaml`.
- Ensure each agent endpoint is reachable from the orchestrator.
- Increase worker/fan-out limits in the orchestrator config.
- Re-run the health check (`/agents`) and the benchmark runner.
No algorithm rewrite is needed; strategies consume dynamic agent lists.
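For illustration, a new agent entry might look like the following (the field names here are hypothetical; check the existing entries in `agents/agent_config.yaml` for the actual schema):

```yaml
# Hypothetical agent entry -- field names are an assumption.
agents:
  - name: agent5
    host: 172.16.185.225   # must be reachable from the orchestrator
    port: 11434            # default Ollama REST port
    model: mistral:7b      # any locally pulled Ollama model
```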
- Keep agents on private LAN only.
- Restrict inbound firewall rules to required ports.
- Avoid exposing Ollama ports directly to the public internet.
- Use reverse proxy auth/TLS if remote access is required.
- Rotate tokens/credentials and avoid committing secrets.
- Keep VM packages updated.
```text
Distributed-AI/
├── orchestrator/
├── agents/
├── benchmarks/
├── deploy/
├── scripts/
├── artifacts/
│   ├── benchmark_runs/
│   └── optimization_runs/
├── docs/
├── paper/
│   ├── IEEE_Distributed_AI_Ensemble_Protected.tar.gpg
│   ├── PAPER_ACCESS_INSTRUCTIONS.txt
│   ├── references.bib
│   ├── figures/
│   ├── tables/
│   └── overleaf/
├── run_experiments.py
└── docker-compose.yml
```
```bibtex
@article{chowdhury2026distributedai,
  title={A Local Distributed Multi-Agent LLM Ensemble: Architecture, Optimization, and Reproducible Evaluation},
  author={Chowdhury, Md Anisur Rahman and Wang, Kefei},
  year={2026},
  institution={Gannon University}
}
```

Md Anisur Rahman Chowdhury¹\*, Kefei Wang¹

¹ Dept. of Computer and Information Science, Gannon University, USA
Emails: engr.aanis@gmail.com, wang039@gannon.edu
Copyright (c) 2026 Md Anisur Rahman Chowdhury and Kefei Wang
Emails: engr.aanis@gmail.com, wang039@gannon.edu
Affiliation: Dept. of Computer and Information Science, Gannon University, USA
All rights reserved.