diabhey/mossaic

Mossaic

Voice-first research recall, powered by Moss semantic search. Ask a question aloud and hear an answer within a second, grounded in real user research and cited to the source session or document.

The use case shipped in this repo is a fictional business-spend product called Outlay. The corpus is 207 documents of user research: interviews, an onboarding survey, persona documents, customer success notes, quarterly reports, and an affinity map. Anyone on the product team queries it by voice on the web, through an agent built on Moss Voice Agents; the WebRTC transport, speech-to-text, language model, and text-to-speech all run on Moss-managed infrastructure.

The enterprise side runs on OpenClaw in Slack. Researchers add, edit, and remove documents by talking to a bot in a private channel, the same place they already write up findings. The admin/ gateway is an OpenClaw process listening over Socket Mode; its moss-knowledge extension writes straight into the same Moss index the voice agent reads. No ingestion pipeline to maintain, no content management system to log into, no engineer to ticket, no redeploy. A transcript uploaded at 14:02 is answerable by voice at 14:03 because Moss is the source of truth, not a cached copy of one.

Live demo: https://mossaic.vercel.app/

Deploy

A voice agent grounded in Moss, served from Vercel, can be live in about thirty minutes. Three steps, in order:

  1. Create the Moss index. cd corpus && cp sample.env .env, add MOSS_PROJECT_ID and MOSS_PROJECT_KEY from https://portal.usemoss.dev, then npm install && npm start. Lands 207 documents in an index named outlay-research. Details.
  2. Ship the voice agent. cd voice-agent, populate .env with the four Moss values, install moss-agent-cli, then ./deploy.sh deploy. The worker registers with the Moss runtime and picks up rooms. Details.
  3. Push the web client. cd web, vercel link, add MOSS_PROJECT_ID, MOSS_PROJECT_KEY, and MOSS_VOICE_AGENT_ID as production env vars, then vercel --prod. The landing page mints participant tokens and connects users to the voice agent over WebRTC. Details.
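
Before running the steps above, a small preflight check can catch a missing credential early. This is a minimal sketch, not part of the repo: the variable names come from steps 1 and 3, while the `REQUIRED_ENV` table and the `missing_env` helper are hypothetical. The voice-agent surface is left out because its four values are not named here.

```python
import os
from typing import Mapping

# Variable names are taken from deploy steps 1 and 3. The voice-agent
# surface is omitted: this README does not name its four Moss values.
REQUIRED_ENV = {
    "corpus": ("MOSS_PROJECT_ID", "MOSS_PROJECT_KEY"),
    "web": ("MOSS_PROJECT_ID", "MOSS_PROJECT_KEY", "MOSS_VOICE_AGENT_ID"),
}


def missing_env(surface: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or blank for a surface."""
    return [name for name in REQUIRED_ENV[surface] if not env.get(name, "").strip()]


if __name__ == "__main__":
    for surface in REQUIRED_ENV:
        gaps = missing_env(surface)
        status = "ok" if not gaps else "missing " + ", ".join(gaps)
        print(f"{surface}: {status}")
```

Passing a plain dict instead of `os.environ` makes the check easy to run against a parsed `.env` file.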

The optional surfaces light up the full enterprise story:

  • admin/ on your own VM runs the OpenClaw + Slack curation gateway (argus-automated or manual scripts/bootstrap.sh). Details.
  • voice-agent-livekit/ on LiveKit Cloud is the secondary voice path with explicit speech-to-text, language model, and text-to-speech providers. Details.

Full deploy reference (env vars, sops-age workflow, per-surface troubleshooting): docs/deploy.md.

Once it is up, docs/demo.md is the fifteen-to-twenty-minute end-to-end script: three acts (voice surface, admin surface, the closing question), the eight hero queries with what each one is meant to surface, and the Slack ingest walkthrough where a dropped document is queryable on the very next voice turn.

Two audiences, one architecture

For OpenClaw developers. admin/ is a vanilla OpenClaw gateway with three extensions you can read end-to-end: moss-knowledge for corpus writes, neon-sql for read-only session queries, rate-limiter for per-user budgets. The voice worker ships twice in the same shape: voice-agent/ on the Moss-managed runtime, voice-agent-livekit/ on raw LiveKit Cloud with explicit speech-to-text, language-model, and text-to-speech providers. Same retrieval path, no proprietary glue.

For enterprise. Curation happens inside Slack, where research is already written up. An OpenClaw allowlist restricts bot access to the private channel's members; the bot is the only thing that writes to the index, and the voice surface only reads. Every session is logged to Neon Postgres for audit. Each surface deploys independently (web/ to Vercel, voice-agent/ to Moss-managed infra, admin/ to your own VM with sops-encrypted secrets and systemd), but all three hit one Moss project, so the boundary between public and employee surfaces sits in the capability layer (which SDK methods each may call), not the data layer.
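
The capability-layer boundary can be pictured with a toy model. This is an illustrative sketch only: `SharedIndex`, `ReadOnlyHandle`, and `CuratorHandle` are hypothetical names, the substring match stands in for real retrieval, and nothing here reflects the actual Moss SDK or how Moss enforces permissions. It shows one data store with two handles that expose different method sets.

```python
from dataclasses import dataclass, field


@dataclass
class SharedIndex:
    """In-memory stand-in for the single shared index (hypothetical)."""
    docs: dict[str, str] = field(default_factory=dict)


class ReadOnlyHandle:
    """What the public voice surface would hold: query only."""

    def __init__(self, index: SharedIndex) -> None:
        self._index = index

    def query(self, term: str) -> list[str]:
        # Naive substring match; a placeholder for semantic retrieval.
        needle = term.lower()
        return sorted(
            doc_id
            for doc_id, body in self._index.docs.items()
            if needle in body.lower()
        )


class CuratorHandle(ReadOnlyHandle):
    """What the admin bot would hold: query plus writes."""

    def upsert(self, doc_id: str, body: str) -> None:
        self._index.docs[doc_id] = body

    def delete(self, doc_id: str) -> None:
        self._index.docs.pop(doc_id, None)


index = SharedIndex()
voice = ReadOnlyHandle(index)
admin = CuratorHandle(index)

admin.upsert("interview-01", "Approvals take too long for small purchases")
hits = voice.query("approvals")  # the write is visible to the reader at once
```

Because both handles wrap the same store, a write through the curator handle is visible to the next read with no copy or sync step, which is the property the shared index gives the two surfaces.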

Architecture

Two trust surfaces, one shared Moss project. Subgraphs are grouped by deploy target - each is where the code physically runs. Moss-owned nodes are highlighted.

flowchart LR
    Customer(["Customer<br/>(browser)"])
    Employee(["Employee<br/>(Slack)"])

    subgraph Vercel["Vercel · web/"]
        direction TB
        Landing["Landing page<br/>voice widget"]
        TokenAPI["/api/connection-details<br/>token mint"]
    end

    subgraph MossPlatform["Moss platform (managed)"]
        direction TB
        VoiceRT["Voice runtime<br/>WebRTC · STT · LLM · TTS"]
        VA["voice-agent/<br/>MossAgentSession<br/>(Python worker)"]
        MossIdx[("Moss index<br/>hybrid · sub-10ms")]
    end

    subgraph Droplet["argus-provisioned VM · admin/"]
        direction TB
        OC["OpenClaw gateway<br/>systemd · Socket Mode"]
        subgraph Plugins["extensions/"]
            direction LR
            RL["rate-limiter"]
            MK["moss-knowledge"]
            NS["neon-sql"]
        end
        OC --> RL
        OC --> MK
        OC --> NS
    end

    Slack(["Slack workspace"])
    Neon[("Neon Postgres<br/>voice_sessions")]

    Customer -->|"V1 · open"| Landing
    Landing -->|"V2 · request token"| TokenAPI
    TokenAPI -->|"V3 · mint"| VoiceRT
    Landing <-->|"V4 · WebRTC (audio)"| VoiceRT
    VoiceRT <-->|"V5 · dispatch + turns"| VA
    VA -->|"V6 · retrieve · alpha=0.8"| MossIdx
    VA -->|"V7 · log session"| Neon

    Employee -->|"A1 · message"| Slack
    Slack <-->|"A2 · Socket Mode"| OC
    MK -->|"A3 · read + write"| MossIdx
    NS -->|"A4 · read-only SQL"| Neon

    classDef moss fill:#0f3d2e,color:#fff,stroke:#2bbc8a,stroke-width:2px
    class MossIdx,VoiceRT,VA moss

Reading the arrows. Two independent flows share the same Moss index, numbered per flow rather than across the whole graph:

  • V1–V7 - voice turn. Serial within the turn. Steps V1–V5 are the one-time connection setup; V5–V7 repeat on every utterance.
  • A1–A4 - admin. Independent of the public flow entirely. A3 writes to the same Moss index that V6 reads, so changes land with no redeploy.
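
The V6 edge is labelled alpha=0.8 and the index is described as hybrid, but this README does not define alpha. One common convention, assumed here rather than confirmed for Moss, is a linear blend in which alpha weights the semantic score and 1 - alpha weights the keyword score. The sketch below uses that assumption; `hybrid_score` and `rank` are hypothetical helpers and the scores are made up for illustration.

```python
def hybrid_score(semantic: float, keyword: float, alpha: float = 0.8) -> float:
    """Blend two relevance scores; alpha is the semantic weight (assumed)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * semantic + (1.0 - alpha) * keyword


def rank(candidates: dict[str, tuple[float, float]], alpha: float = 0.8) -> list[str]:
    """Order document ids by blended score, best first."""
    return sorted(
        candidates,
        key=lambda doc_id: hybrid_score(*candidates[doc_id], alpha),
        reverse=True,
    )


# (semantic, keyword) scores per document; illustrative values only.
candidates = {
    "doc-a": (0.9, 0.1),  # strong semantic match, weak keyword overlap
    "doc-b": (0.5, 1.0),  # weaker semantic match, exact keyword hit
}
semantic_heavy = rank(candidates, alpha=0.8)
keyword_heavy = rank(candidates, alpha=0.2)
```

Under this reading, a high alpha favours paraphrased spoken questions over exact term matches, which would suit voice input where wording rarely matches the source text.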

Per-turn sequence

The flowchart shows topology; the sequence diagram shows turn-by-turn timing:

sequenceDiagram
    actor C as Customer
    participant L as Landing (web/)
    participant T as /api/connection-details
    participant VR as Moss Voice runtime
    participant VA as voice-agent worker
    participant MI as Moss index

    C->>L: 1. open page
    L->>T: 2. request token
    T->>VR: 3. mint LiveKit token
    T-->>L: 4. token
    L->>VR: 5. WebRTC connect
    VR->>VA: 6. dispatch session

    loop per voice turn
        C->>VR: 7. speech (audio)
        VR->>VA: 8. transcript (STT)
        VA->>MI: 9. retrieve (alpha=0.8)
        MI-->>VA: 10. grounded docs (<10ms)
        VA->>VA: 11. LLM generate
        VA->>VR: 12. TTS audio
        VR-->>C: 13. agent speech
    end

See docs/system-design.md for the full trust-boundary discussion and component map.
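
Steps 8 through 12 of the per-turn loop can be sketched as a single function with each stage injected as a callable. This is a hedged illustration, not the worker's code: `run_turn`, `Turn`, and the stage signatures are hypothetical, and the lambdas are stubs standing in for the managed STT, retrieval, LLM, and TTS services.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Turn:
    transcript: str
    sources: list[str]
    reply: str


def run_turn(
    audio: bytes,
    stt: Callable[[bytes], str],
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str, list[str]], str],
    tts: Callable[[str], bytes],
) -> tuple[Turn, bytes]:
    """Run one voice turn: transcribe, retrieve, generate, synthesize."""
    transcript = stt(audio)             # step 8: speech to text
    sources = retrieve(transcript)      # steps 9-10: grounded documents
    reply = generate(transcript, sources)  # step 11: answer from sources
    speech = tts(reply)                 # step 12: text to speech
    return Turn(transcript, sources, reply), speech


# Stub stages so the sketch runs without any external service.
turn, speech = run_turn(
    b"\x00\x01",
    stt=lambda audio: "why do approvals stall",
    retrieve=lambda text: ["interview-01"],
    generate=lambda text, docs: f"Answer citing {', '.join(docs)}",
    tts=lambda reply: reply.encode("utf-8"),
)
```

Injecting the stages keeps the ordering testable on its own and mirrors how the two worker variants share one retrieval path while swapping providers.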

Project layout

mossaic/
├── web/                  # Next.js 16 landing page + voice widget (Vercel)
├── voice-agent/          # Primary: Python LiveKit worker on Moss-managed runtime
├── voice-agent-livekit/  # Secondary: same retrieval, raw LiveKit Cloud worker with explicit providers
├── admin/                # OpenClaw gateway - host-level systemd unit on an argus VM
│   ├── agents/           #   concierge agent (SOUL.md)
│   ├── extensions/       #   moss-knowledge, neon-sql, rate-limiter
│   ├── openclaw.config.json
│   └── slack-app-manifest.yml
├── corpus/               # Outlay research dataset + import scripts (npm run import)
├── scripts/              # bootstrap.sh, mossaic-admin.service, seed-neon.{sh,sql}
├── docs/                 # deploy, slack-setup, system-design, maintenance, diagrams/
├── .sops.yaml            # sops+age creation rules (age recipient for sealed env)
└── .env.enc              # sops-encrypted admin secrets (decrypted on the VM by bootstrap.sh)

Per-component READMEs cover setup, environment, and deploy specifics.

Corpus

The voice agent and Slack concierge both query the outlay-research Moss index. The dataset is fictional but coherent: 55 interviews, a 142-respondent onboarding survey, four personas, an enterprise customer success snapshot, two quarterly research reports, an affinity map, and an eight-query evaluation set. See corpus/DATASET.md for the schema, embedded patterns, and hero queries, or corpus/README.md for the import script and operational details.

To rebuild the index from scratch:

cd corpus
npm install
cp sample.env .env       # add your Moss credentials
npm run import           # default Layout 1: single index "outlay-research"

Two alternative layouts exist: Layouts 2 and 3 split the same documents across two or three indexes for "voice vs synthesized findings" or router-pattern demos. Set OUTLAY_LAYOUT=2 or OUTLAY_LAYOUT=3 to select one.

Documentation

  • docs/deploy.md - full deploy reference (all three surfaces, argus and manual paths, sops walkthrough, troubleshooting).
  • docs/slack-setup.md - Slack app manifest, scopes, and bot troubleshooting.
  • docs/system-design.md - architecture, trust boundaries, component map.
  • docs/maintenance.md - post-deploy tuning, hardening (bind modes), concurrency knobs, moss-knowledge tuning.
  • docs/demo.md - end-to-end demo script with sample voice queries and the Slack ingest walkthrough.

License

MIT

About

Moss + OpenClaw showcase - voice-first developer experience for semantic search
