"Museums are not silent repositories of Memory; they are living, thinking organisms, where imagination and knowledge, tradition and innovation meet." — Gayane Umerova, UNESCO, 2025
Version: 10.8
Author: Rob Graham · FAMTEC (Fine Art Media Tech) / RMIT University
Status: Working prototype — multi-institution semantic search + LLM object chat + NFC visitor pages
Target: ISEA2026 Dubai, 6th Summit on New Media Art Archiving (April 11–12)
Paper: docs/ARCHAI_ISEA2026_Rob_Graham.pdf
Licence: MPL-2.0 (code) · CC BY 4.0 (MV data) · CC0 (Met data) · V&A Open Access — see NOTICE for IP and trademark details
Trademark: ARCHAI™ is a trademark of Rob Graham / FAMTEC. See NOTICE for usage terms.
Three museum collections in Qdrant, searchable simultaneously:
| Collection | Source | Objects | Licence | Status |
|---|---|---|---|---|
archai_pilot |
Museums Victoria | ~194 | CC BY 4.0 | ✅ Live |
archai_met |
The Metropolitan Museum of Art, NYC | ~100 | CC0 | ✅ Live |
archai_va |
Victoria and Albert Museum, London | ~80 | V&A Open Access | ✅ Live |
archai_curator |
All of the above + comments | Built on demand | Mixed | ✅ Live |
- Query → embedded via nomic-embed-text → vector searched across all 3 collections → results merged by cosine similarity
- Results colour-tagged: MV (teal), Met (gold), V&A (purple)
- Text fallback when Ollama offline
- Sort by: name, date, discipline, source
- Filter: with images (default), all, MV/Met/V&A only
- Deduplicated by canonical_id across collections
Each object speaks in first person via llama3, grounded in verified metadata:
- System prompt built from ALL metadata fields
- Dynamic institution name per object
- Hallucination prevention: "That's not in my record"
- Metadata fallback when Ollama offline — no LLM required
- Full metadata, image, curatorial description
- Live llama3 chat with question chips
- Semantically related objects across all collections
- Source-specific links: "View on The Met →", "View on V&A →"
- Visitor comment thread — all comments (including flagged) with approve/remove/reply actions for curators
Comments submitted by visitors are AI-screened in real time:
- Ollama classifies each comment as safe / suspicious / harmful
- Safe comments visible immediately on the object page
- Suspicious/harmful comments hidden — sent to curator review queue
- Human curator has final say — approve or remove
- Threaded replies supported (staff can respond to visitors)
- Stored in SQLite — becomes part of the object's collection record
- Comments included in curator vector collection for semantic search
Safe proxy layer for exposing ARCHAI publicly via Cloudflare Tunnel:
- Rate limiting per IP (15 chat/min, 30 search/min)
- Prompt injection pattern blocking (regex filter)
- Safety wrapper prepended to all LLM system prompts
- Token and prompt length caps (512 tokens, 500 chars)
- All frontend fetch calls route through
qdrantFetch()/ollamaFetch()wrappers
Enriched archai_curator Qdrant collection combining:
- All object metadata from all 3 source collections
- Visitor comments attached to each object
- Rebuilt on demand via
POST /api/proxy/curator/build - Semantic search across everything via
POST /api/proxy/curator/search
194 standalone HTML pages from all 3 collections:
- Object image, metadata, description, LLM chat over LAN or via proxy
- Share: native iOS sheet, email, copy link, X/Twitter
- Comment submission with AI moderation (localStorage fallback offline)
- Related objects with cross-collection links
- Captive portal for exhibition WiFi
- Tags from objects with images, mixed across MV/Met/V&A
- 3-column layout: tag list → editor → phone preview
- Search, filter, publish/unpublish
| Role | Access |
|---|---|
| Admin | All tabs |
| Curator | Curator, Nodel, NFC, Vocab, Visitor, FAMTEC |
| Collections | Curator, NFC, Vocab, Visitor, FAMTEC |
| Technician | Nodel, Visitor, FAMTEC |
| Volunteer | Curator, NFC, Visitor, FAMTEC |
| Visitor | Visitor only |
- Select All / Export CSV / Batch Tag — fully wired
- CSV export with all metadata fields, scoped to selection or full collection
- Batch tagging applies keywords to selected objects and rebuilds vocab index
- Test space for interaction design, workflow feel, and interface prototyping inside ARCHAI
- Placeholder institution names are used to simulate exchange activity and help evaluate the app experience
- Feed includes loan, rental, skills, and crew-availability scenarios
- Post listings (hardware, skills, requests), send enquiries, view details
- Enquiries route to institution chat threads where available
- Chip filters and institution chat are functional prototype interactions
- This is not the final FAMTEC platform: production development will be handled separately by FAMTEC outside the PhD work, with potential later integration into ARCHAI once developed
- Gallery cards with status indicators
- Node table, fault log, schedule
- Refresh status polling, emergency stop with confirmation
- Direct links to Nodel web UI and Directus admin
- Live vocabulary index built from Qdrant payloads across all 3 collections (9 facets: discipline, category, object type, classifications, collecting areas, keywords, culture, period, medium)
- CHIN/AAT reference terms — 26 curated terms from Getty AAT with scope notes, broader/narrower hierarchies, and AAT IDs
- DOCAM Glossaurus — media art preservation terminology (emulation, migration, variable media, documentation strategies)
- Nomenclature for Museum Cataloging — Parks Canada/CHIN object naming and classification
- CHIN Discipline Authority List (2006) — bilingual EN/FR discipline headings
- Term search across all sources with scope notes, provider badges, and language tags
- Term detail panel: path, scope note, broader/narrower terms, related terms, collection usage with example objects
- Apply to Search (jumps to Curator with search), Add Local Mapping (creates institution-specific terms)
- Indigenous protocol layer with governance notice
Desktop views:
Mobile views:
LLM currently only sees one object's metadata. Needs RAG: embed user question → search Qdrant → inject related objects into LLM context → synthesise connections. Curators get full cross-collection access, visitors get bounded single-object responses.
Use llava to extract colours, text, objects from images → searchable metadata.
AAT, LCSH, TGN, and ULAN are listed but inactive — currently using curated reference terms rather than live API lookups. Getty AAT LOD endpoint integration is architecturally ready.
Current in-app FAMTEC Exchange uses prototype data and in-memory arrays only. The production FAMTEC Exchange platform will be developed separately by FAMTEC outside the PhD work, with potential later integration into ARCHAI once developed.
Health-checked only. NFC save attempts backend sync but falls back to local confirmation.
Static prototype data. Needs WebSocket to real Nodel instance. UI links and emergency stop are wired.
Date extraction from titles, better Met filtering, incremental harvest. Run Harvesters button verifies collection counts and triggers reload.
┌──────────────────────────────────────────────────────────┐
│ ARCHAI Frontend │
│ (ARCHAI_v10_8.html · browser) │
│ │
│ Search ──→ Ollama embed ──→ Qdrant (4 collections) │
│ Chat ──→ Ollama llama3 ──→ grounded response │
│ NFC ──→ Ollama llama3 ──→ chat over LAN / proxy │
│ Sort ──→ client-side on loaded objects │
│ Comments ──→ Backend API ──→ AI moderation ──→ SQLite │
└────────┬──────────────┬──────────────┬──────────────────┘
│ │ │
localhost:6333 localhost:11434 localhost:8787
Qdrant Ollama Backend API
├── Safe proxy (rate limit + injection block)
├── Comments (AI moderation → SQLite)
├── Curator vectors (build + search)
└── Directus bridge (optional)
Public access (Cloudflare Tunnel):
Visitor phone ──→ tunnel ──→ Backend proxy ──→ Ollama/Qdrant
└──→ Comments API (AI screened)
archai/
├── ARCHAI_v10_8.html ← Main frontend (single-file app)
├── README.md ← This file
├── ARCHAI_OPERATIONS_GUIDE.md ← Full ops guide (startup, testing, APIs, adding objects)
├── REMOTE_TESTING_GUIDE.md ← Tailscale setup for iPad/iPhone testing
├── start-archai.sh ← One-command startup + health checks
├── backend-archai/
│ ├── src/
│ │ ├── server.js ← Express entry point
│ │ ├── data/db.js ← SQLite database (comments)
│ │ ├── middleware/rateLimit.js ← Rate limiter
│ │ ├── routes/
│ │ │ ├── proxy.js ← Safe Qdrant/Ollama/curator proxy
│ │ │ ├── comments.js ← AI-moderated threaded comments
│ │ │ └── ... ← Other route modules
│ │ └── services/
│ │ ├── moderation.js ← Ollama comment screening
│ │ └── curator-vectors.js ← Curator collection builder
│ ├── scripts/
│ │ ├── met-harvester.js ← Met NYC → Qdrant
│ │ └── va-harvester.js ← V&A London → Qdrant
│ └── data/archai.db ← SQLite (created at runtime)
├── nfc-pages/
│ ├── generate-nfc-pages.js ← All collections → HTML per tag
│ ├── nfc-visitor-template.html ← Mobile template
│ ├── captive-portal.html
│ └── v/ ← Generated pages (~194, not in git)
├── docs/
│ └── ARCHAI_ISEA2026_Rob_Graham.pdf
└── docker-compose.yml
cd ~/Desktop/APPS/ARCHAI\ APP
./start-archai.shStarts Docker, Qdrant, Ollama (with LAN+CORS), backend API, frontend server. Runs 7 health checks, shows loaded models, Qdrant collections, comment count, and prints all URLs.
Main app: http://localhost:8000/ARCHAI_v10_8.html NFC index: http://localhost:8000/nfc-pages/v/index.html Backend API: http://localhost:8787/api/health Tailscale: http://100.109.26.39:8000/ARCHAI_v10_8.html
See ARCHAI_OPERATIONS_GUIDE.md for full setup, testing, and API reference.
Mac Studio M2 Max · 64GB · 1TB. Base institutional deployment: ~$3,500–5,000 USD one-time. No subscriptions, no cloud dependency.
| Version | Changes |
|---|---|
| v6 | Initial prototype, mock objects |
| v7 | Role switcher, FAMTEC, Nodel, NFC, vocabulary |
| v10.4 | MV-only, live Qdrant + Ollama, LLM chat |
| v10.5 | Restored all panels, NFC page generator |
| v10.6 | Multi-collection (MV+Met+V&A), sort/filter, dedup, harvesters, NFC share+comments, 200 pages, dynamic institutions |
| v10.7 | Live CHIN-aligned thesaurus (AAT+DOCAM+Nomenclature+CHIN Disciplines), all buttons wired, responsive thumbnail scaling, vocab search with scope notes and provider badges |
| v10.8 | Backend proxy for safe public hosting (rate limiting, prompt injection blocking), AI-moderated threaded comments (Ollama screening → curator review queue), curator vector collection (all metadata + comments searchable), SQLite persistence, NFC pages wired to backend API, object detail comment thread with approve/remove/reply, startup script with health checks, operations guide |
Rob Graham · FAMTEC / RMIT · rob@fineartmedia.tech GitHub: github.com/rob-e-graham/archai
ARCHAI™ is a trademark of Rob Graham / FAMTEC (Fine Art Media Tech). Use of the source code under MPL-2.0 does not grant trademark rights. See NOTICE for details.













