sadeezy/mind

mind-runtime

mind-runtime is an experimental, cross-platform runtime for audio-native embodied-agent research. It treats audio as the primary substrate, uses learned token and latent models throughout the runtime graph, and exposes inspection tools for debugging internal state.

Audio In -> AudioTokenizer -> TemporalBinder -> WorkspaceCore
         -> {FastLoop, DeliberativeLoop, SalienceExecutive, MemoryRetriever}
         -> IntentionState -> SpeechDecoder -> Audio Out

flowchart TD
    audio[Audio In] --> tokenizer[AudioTokenizer]
    tokenizer --> binder[TemporalBinder]
    binder --> workspace[WorkspaceCore]
    workspace --> world[AudioWorldModel]
    workspace --> memoryQuery[MemoryRetriever]
    workspace --> goalModel[GoalStore / AgentModelBank]
    binder --> fastLoop[FastLoop]
    world --> deliberative[DeliberativeLoop]
    fastLoop --> executive[SalienceExecutive]
    deliberative --> executive
    memoryQuery --> executive
    goalModel --> executive
    executive --> intention[IntentionState]
    intention --> selfModel[SelfModel]
    intention --> speech[SpeechDecoder]
    speech --> audioOut[Audio Out]
    workspace --> stores[Persistent Side Stores]
    stores --> runtime[Runtime Engine]
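
The flowchart can also be read as a plain dependency graph. The sketch below copies its edges into Python and derives one valid evaluation order with the standard-library `graphlib`. The helper names `evaluation_order` and `inputs_of` are illustrative and are not part of mind-runtime; the runtime's actual scheduling may differ.

```python
from graphlib import TopologicalSorter

# Edges copied from the flowchart (source -> destination).
EDGES = [
    ("Audio In", "AudioTokenizer"),
    ("AudioTokenizer", "TemporalBinder"),
    ("TemporalBinder", "WorkspaceCore"),
    ("WorkspaceCore", "AudioWorldModel"),
    ("WorkspaceCore", "MemoryRetriever"),
    ("WorkspaceCore", "GoalStore / AgentModelBank"),
    ("TemporalBinder", "FastLoop"),
    ("AudioWorldModel", "DeliberativeLoop"),
    ("FastLoop", "SalienceExecutive"),
    ("DeliberativeLoop", "SalienceExecutive"),
    ("MemoryRetriever", "SalienceExecutive"),
    ("GoalStore / AgentModelBank", "SalienceExecutive"),
    ("SalienceExecutive", "IntentionState"),
    ("IntentionState", "SelfModel"),
    ("IntentionState", "SpeechDecoder"),
    ("SpeechDecoder", "Audio Out"),
    ("WorkspaceCore", "Persistent Side Stores"),
    ("Persistent Side Stores", "Runtime Engine"),
]


def evaluation_order(edges):
    """Return one topological order in which every node follows its inputs."""
    predecessors = {}
    for src, dst in edges:
        predecessors.setdefault(src, set())
        predecessors.setdefault(dst, set()).add(src)
    return list(TopologicalSorter(predecessors).static_order())


def inputs_of(node, edges):
    """Return the sorted names of the nodes that feed directly into `node`."""
    return sorted(src for src, dst in edges if dst == node)
```

For example, `inputs_of("SalienceExecutive", EDGES)` lists the four components whose outputs the executive arbitrates over.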

Persistent side stores:

  • SelfModel
  • AgentModelBank
  • GoalStore
  • EpisodicMemory
  • SemanticMemory
  • SkillMemory
  • CalibrationStore

Status

  • Experimental research codebase, not production software.
  • Trainable PyTorch implementations are included for the tokenizer, binder, workspace, world model, executive, self model, goal manager, agent-state model, semantic/skill distillers, calibration model, memory query path, and speech decoder.
  • No pretrained checkpoints are bundled with the repository.
  • By default, the runtime requires a trained checkpoint bundle and fails fast if one is not available.

Highlights

  • Streaming audio ingestion with a 40 ms analysis window and 20 ms hop
  • Dual synchronized token streams:
    • acoustic stream at 50 Hz with 256-d embeddings and 4 residual codebooks
    • semantic/event stream at 12.5 Hz with 512-d embeddings and 2 residual codebooks
  • TemporalBinder for event boundaries, turn structure, speaker/source continuity, and local causal lag estimates
  • WorkspaceCore with explicit entity, event, hypothesis, and background-thread slots
  • AudioWorldModel with 200 ms, 1 s, and 4 s horizon-aware prediction heads and rollout features
  • Learned GoalManagerModel, AgentStateModel, SemanticMemoryDistiller, SkillMemoryDistiller, and CalibrationModel components that feed the runtime side stores, with no hand-seeded skill routines and no runtime calibration reweighting
  • SalienceExecutive / ArbitrationExecutive covering the full action space:
    • continue
    • continue_with_more_compute
    • monitor_background
    • defer
    • partial_interrupt
    • full_interrupt
    • checkpoint_and_suspend
    • terminate
    • retrieve_memory
    • resume_prior_thread
    • spawn_new_thread
  • SpeechDecoder that generates semantic and acoustic logits for every codebook plus waveform output
  • Inspectable silent-state, inner-audio, and thread views through a FastAPI server and browser dashboard
  • Human-operated curriculum management plus concrete stage trainers via mind-curriculum and mind-train
  • Stage-aware training data flow with horizon-aligned audio targets for Stage 2 and trajectory / synthetic-scenario support for Stages 4 and 5
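
The timing figures in this list are related by simple arithmetic, sketched below. The 16 kHz sample rate and the no-padding framing rule are assumptions made for illustration; the README does not state either, and the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 16_000  # assumption: not stated in the README
WINDOW_MS = 40           # analysis window from the highlights
HOP_MS = 20              # hop from the highlights
ACOUSTIC_RATE_HZ = 50.0  # acoustic token stream
SEMANTIC_RATE_HZ = 12.5  # semantic/event token stream
HORIZONS_MS = (200, 1000, 4000)  # AudioWorldModel prediction horizons


def frame_rate_hz(hop_ms):
    """Frames per second implied by a fixed hop."""
    return 1000.0 / hop_ms


def num_full_frames(num_samples, sample_rate_hz=SAMPLE_RATE_HZ,
                    window_ms=WINDOW_MS, hop_ms=HOP_MS):
    """Count complete analysis windows, assuming no padding at the edges."""
    window = sample_rate_hz * window_ms // 1000
    hop = sample_rate_hz * hop_ms // 1000
    if num_samples < window:
        return 0
    return 1 + (num_samples - window) // hop


def horizon_in_acoustic_frames(horizon_ms, rate_hz=ACOUSTIC_RATE_HZ):
    """Convert a prediction horizon to a whole number of acoustic frames."""
    return int(horizon_ms * rate_hz / 1000)


# A 20 ms hop gives the 50 Hz acoustic rate quoted above.
acoustic_rate = frame_rate_hz(HOP_MS)
# One semantic token spans this many acoustic frames (80 ms).
acoustic_per_semantic = ACOUSTIC_RATE_HZ / SEMANTIC_RATE_HZ
horizon_frames = {h: horizon_in_acoustic_frames(h) for h in HORIZONS_MS}
```

Under these assumptions the three horizons correspond to 10, 50, and 200 acoustic frames, and each semantic token covers four acoustic frames.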

Installation

Python 3.11+ is required.

Create and activate a Python 3.11+ virtual environment first. On a fresh checkout, verify the active interpreter before installing:

python --version
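
The same check can be scripted, for example in a setup helper. This is a minimal sketch; `interpreter_ok` is an illustrative name and is not part of mind-runtime.

```python
import sys

REQUIRED = (3, 11)  # minimum version stated in this README


def interpreter_ok(version_info=sys.version_info, required=REQUIRED):
    """Return True when the (major, minor) version meets the requirement."""
    return tuple(version_info[:2]) >= required
```

Calling `interpreter_ok()` with no arguments checks the interpreter that is currently running.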

Install the base package:

python -m pip install -e .

Install optional extras for live audio, local ANN storage, and development tooling:

python -m pip install -e ".[audio,memory,dev]"

Quick Start

1. Initialize the training workspace

mind-curriculum init
mind-curriculum status
mind-curriculum stage-brief stage0_tokenizer_pretraining

2. Register datasets and create a run

Use the templates and workflow documented in docs/training-curriculum.md:

mind-curriculum register-dataset /path/to/dataset_manifest.yaml
mind-curriculum create-run stage0_tokenizer_pretraining "tokenizer-pretrain-v1" --owner "you" --dataset your_dataset_id

3. Execute training

mind-train execute-run <run_id> --root ops/curriculum

4. Launch the inspection server with a trained bundle

mind-server --host 127.0.0.1 --port 8000 --checkpoint-dir /path/to/promoted/bundle

Then open:

  • http://127.0.0.1:8000/
  • http://127.0.0.1:8000/inspect/full

Inspection API

The server exposes the main inspection surfaces used by the dashboard:

  • GET /inspect/full
  • GET /inspect/state
  • GET /inspect/inner-audio
  • GET /inspect/threads
  • GET /inspect/stores
  • POST /inject/audio
  • POST /threads/resume
  • POST /threads/terminate
  • POST /sleep/consolidate
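
A small client for the GET surfaces might look like the sketch below, which uses only the standard library. It assumes the server is running at the Quick Start address and that responses are JSON, which is the FastAPI default but is not confirmed here. The function names are illustrative, and the POST endpoints are omitted because their request bodies are not documented in this README.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000/"  # address from the Quick Start
GET_SURFACES = ("full", "state", "inner-audio", "threads", "stores")


def inspect_url(surface, base_url=BASE_URL):
    """Build the URL for one of the GET /inspect/* surfaces listed above."""
    if surface not in GET_SURFACES:
        raise ValueError(f"unknown inspection surface: {surface!r}")
    return urljoin(base_url, f"inspect/{surface}")


def fetch_inspection(surface, base_url=BASE_URL, timeout=5.0):
    """Fetch one surface and decode it, assuming a JSON response body."""
    with urlopen(inspect_url(surface, base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `fetch_inspection("threads")` would return whatever structure the server emits for GET /inspect/threads; the schema is not described here, so inspect the result before relying on specific keys.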

Training Workflow

The repository includes a YAML-backed, human-operated training workflow for staged curriculum execution.

Notable contract details in the current trainer stack:

  • Stage 2 consumes distinct horizon-aligned future targets instead of reusing one future clip for every head.
  • Stage 3 supervises all semantic and acoustic codebooks and includes loopback/self-other losses.
  • Stage 4 uses trajectory rollouts with runtime-consistent fast-loop, deliberative, goal, agent, and calibration features.
  • Stage 5 now has a real non-zero training path for goal promotion, persistence, retirement, G0 alignment, memory distillation, and agent-state learning.
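
To illustrate what "distinct horizon-aligned future targets" means for Stage 2, the sketch below picks a different future position for each prediction head instead of sharing one clip. It assumes 50 Hz acoustic frames and a single-frame target per horizon, both simplifications for illustration. The function name is hypothetical and the repository's dataset code may build targets differently.

```python
HORIZONS_MS = (200, 1000, 4000)  # AudioWorldModel horizons from the highlights
FRAME_RATE_HZ = 50               # acoustic stream rate from the highlights


def horizon_targets(tokens, t, horizons_ms=HORIZONS_MS,
                    frame_rate_hz=FRAME_RATE_HZ):
    """Map each horizon (ms) to the token that many milliseconds after index t.

    A horizon that runs past the end of the sequence maps to None, so the
    caller can mask that head's loss instead of reusing a nearer target.
    """
    targets = {}
    for horizon_ms in horizons_ms:
        offset = horizon_ms * frame_rate_hz // 1000
        index = t + offset
        targets[horizon_ms] = tokens[index] if index < len(tokens) else None
    return targets
```

For a 300-frame sequence and `t = 20`, the three heads would be supervised with the frames at indices 30, 70, and 220 respectively.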

Main commands:

mind-curriculum init
mind-curriculum status
mind-curriculum next-stage
mind-curriculum stage-brief stage0_tokenizer_pretraining
mind-train execute-run <run_id> --root ops/curriculum

See docs/training-curriculum.md for dataset manifest expectations, stage ordering, checkpoint logging, evaluation, and promotion flow.

Runtime Boundary

This repository intentionally does not hide missing training behind heuristic runtime fallbacks.

  • The default runtime path expects trained checkpoints.
  • If config.model.strict_checkpoint_loading is enabled, booting without a bundle raises an error.
  • mind-server --allow-random-init and config.model.allow_random_init_for_training = True are available for debugging and test coverage only. They are not substitutes for trained or promoted checkpoints.
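
The fail-fast policy described above can be sketched as a single decision function. This illustrates the behavior only and is not the repository's implementation: `resolve_checkpoint` and `MissingCheckpointError` are hypothetical names, and the rule that an explicit random-init opt-in overrides strict loading is an assumption.

```python
from pathlib import Path


class MissingCheckpointError(RuntimeError):
    """Raised when a trained bundle is required but not found (hypothetical)."""


def resolve_checkpoint(checkpoint_dir, *, strict_checkpoint_loading=True,
                       allow_random_init=False):
    """Return the bundle directory, or None when booting without one is allowed.

    Assumption: the random-init opt-in takes precedence over strict loading.
    """
    path = Path(checkpoint_dir) if checkpoint_dir else None
    if path is not None and path.is_dir():
        return path
    if strict_checkpoint_loading and not allow_random_init:
        raise MissingCheckpointError(
            f"no checkpoint bundle at {checkpoint_dir!r}; train and promote "
            "one, or opt in to random init for debugging only"
        )
    return None  # caller falls back to randomly initialized weights
```

Raising at boot rather than degrading silently keeps untrained behavior from being mistaken for model output, which matches the intent stated in this section.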

Project Layout

  • src/mind/models/v1.py: trainable v1 model bundle
  • src/mind/runtime_engine/: runtime orchestration package and subsystem side-store integration
  • src/mind/api/server.py: inspection API and dashboard server
  • src/mind/audio_io.py: optional live audio I/O
  • src/mind/training/: curriculum admin, stage-aware datasets, trainers, and runner
  • docs/training-curriculum.md: operator runbook
  • tests/: unit tests

Platform Notes

  • Apple Silicon: prefer a native arm64 Python environment; PyTorch can run on CPU or mps.
  • Linux: install PortAudio if you want live audio via sounddevice.
  • Windows: sounddevice can use ASIO if SD_ENABLE_ASIO=1 is set before import and the installed PortAudio build supports it.
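
For the Windows note, the ordering matters: the environment variable has to be set before `sounddevice` is first imported in the process. A minimal sketch, with an illustrative helper name that is not part of mind-runtime:

```python
import os


def import_sounddevice_with_asio():
    """Set SD_ENABLE_ASIO=1, then import sounddevice.

    Returns the module, or None when sounddevice or PortAudio is unavailable.
    Has no effect if sounddevice was already imported earlier in the process.
    """
    os.environ["SD_ENABLE_ASIO"] = "1"  # must happen before the import below
    try:
        import sounddevice
    except (ImportError, OSError):  # package missing or PortAudio not found
        return None
    return sounddevice
```

Whether ASIO host APIs actually appear still depends on the installed PortAudio build, as noted above.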

Development

Run the test suite:

python -m unittest discover -s tests

Use the python from the active Python 3.11+ environment. If you want pytest and ruff, install the development extra:

python -m pip install -e ".[dev]"
python -m ruff check .

Useful Python extras and libraries used in this project include:
