Skip to content

sonicfieldlabs/oram

Repository files navigation

oram 04

ORAM

a speech-operated terminal looper for synthetic sound studies.

oram is not a DAW, not a chatbot, and not a prompt-to-song app. it is a minimal instrument where you record sound, loop it, give spoken commands, and an agent translates those commands into constrained audio actions.

a terminal instrument that listens before it plays.

install

requires python 3.11+.

cd oram
pip install -e ".[dev]"

for whisper STT support:

pip install -e ".[dev,stt]"

for the optional dashboard:

pip install -e ".[dev,web]"

run

oram                              # start with defaults
oram run                          # equivalent explicit subcommand
oram --list-devices               # show audio devices
oram --mock-audio                 # run without audio hardware
oram --input-device 2             # select input device
oram --session-name grey_chapel   # name this session
oram --session-dir ./sessions     # choose archive directory
oram --no-stt                     # keyboard only, no voice
oram dashboard --mock-audio       # browser dashboard (localhost only)
oram dashboard --allow-lan        # expose on LAN (requires token)
oram daemon --mock-audio          # local app/plugin daemon on 127.0.0.1
oram export ./oram_sessions/oram_0001

keyboard controls

space     push-to-talk (toggle: start/stop command capture)
r         record selected layer
o         overdub selected layer
1-4       select layer
m         mute selected layer
M         solo selected layer
x         clear selected layer (repeat to confirm)
s         save session
e         export mix
l         generate listening report
i         toggle prompt/audio input mode
tab       cycle mode
q         quit

core loop

audio input
-> record
-> loop
-> listen
-> voice command / STT
-> intent parser
-> structured action
-> DSP / SFX / generated texture
-> mixer
-> audio output
-> re-listening
-> archive

architecture

see architecture.md for module responsibilities and the threading model.

commands

see commands.md for the full command grammar.

session archive

each performance creates a session folder with mix, stems, metadata, command log, text waveform, and listening report. see session_format.md.

configuration

environment variables:

ORAM_SAMPLE_RATE=48000
ORAM_BLOCK_SIZE=512
ORAM_INPUT_DEVICE=
ORAM_OUTPUT_DEVICE=
ORAM_SESSION_DIR=./oram_sessions
ORAM_STT_BACKEND=mock
ORAM_GENERATOR_BACKEND=mock
ORAM_LLM_BACKEND=none
ORAM_AUTO_LISTEN=false
ORAM_DEFAULT_LISTENING_ROUTE=hybrid
ORAM_DEFAULT_ENGINE=auto
ORAM_DASHBOARD_TOKEN=
ELEVENLABS_API_KEY=

precedence: cli args > env vars > config file > defaults.

safe defaults: generator_backend=mock, llm_backend=none, auto_listen=false. cloud credentials are never required for the core looper.

see .env.example for a complete template.

macOS app

ORAM now includes a native SwiftUI app shell in apps/macos. It wraps the local Python engine through oram daemon, stores provider keys in macOS Keychain, and writes generated material to ~/Music/ORAM Library.

The repository includes an unsigned development DMG:

releases/macos/ORAM.dmg
apps/macos/script/build_and_run.sh --no-open
open apps/macos/dist/ORAM.app

To refresh the repository DMG:

apps/macos/script/package_unsigned.sh

The terminal instrument remains available for manual local runs.

BYOK provider setup

Packaged app usage stores keys in Keychain:

service: wtf.momoto.oram
account: provider:elevenlabs
account: provider:stability

Terminal setup:

oram credentials set elevenlabs
oram credentials set stability
oram credentials status
oram credentials test elevenlabs
oram credentials test stability

Mock mode remains the default; ElevenLabs and Stability AI are optional. See docs/providers/elevenlabs.md and docs/providers/stability.md.

security

ORAM is local-first: no Momoto server is required, no telemetry is enabled by default, and provider credentials are never exposed in local state responses.

The dashboard and daemon bind to 127.0.0.1 by default (localhost only).

to expose on the local network:

# set a token first
export ORAM_DASHBOARD_TOKEN=<local-token>

# then allow LAN access
oram dashboard --allow-lan --mock-audio
  • token authentication: when ORAM_DASHBOARD_TOKEN is set, all POST endpoints require Authorization: Bearer <token>. GET endpoints (state polling) remain open.
  • WebSocket auth: when a token is set, WebSocket connections must include ?token=<token> in the URL.
  • origin checking: WebSocket connections from non-localhost origins are rejected unless --allow-lan is used.
  • no secrets in state: /api/state never exposes API keys or tokens.

Run oram doctor --privacy for a local privacy diagnostic. See docs/security/local-first-privacy.md.

manual smoke test

  1. oram --list-devices
  2. oram --mock-audio
  3. oram with a microphone
  4. press r, make sound, stop recording
  5. confirm loop playback
  6. press 2, record a second layer
  7. mute layer 1
  8. reverse layer 2
  9. generate a mock bed
  10. export session
  11. open exported WAV
  12. inspect session.json, commands.log, and listening_report.md

dashboard

the optional dashboard is a compact browser control surface for local testing:

oram dashboard --mock-audio

it exposes the same command parser as the terminal app. the terminal instrument remains the primary interface.

security: the dashboard binds to localhost only. use --allow-lan with ORAM_DASHBOARD_TOKEN to expose on LAN.

troubleshooting

no audio devices found: check that your system audio permissions allow terminal access to microphone. on macOS, grant terminal access in System Settings > Privacy & Security > Microphone.

audio dropouts: try increasing block size with --block-size 1024.

STT not working: run with --no-stt for keyboard-only mode. check that whisper is installed with pip install -e ".[stt]".

import errors: ensure you installed with pip install -e ".[dev]" from the project root.

About

Agentic looper

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors