Skip to content

ecooxai/auri

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

23 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Auri

Common commands

Watch native development (starts a new independent Auri app process):

npm run native:watch

Build a release bundle:

npm run tauri:build

Build and run the release app in the current terminal session on macOS or Linux:

npm run app

Auri is a terminal-centered assistant workspace for macOS and Linux, built with Rust and Tauri. Its interface combines browser-like workspaces, a synchronized folder pane, terminal and AI composer, file inspection and viewing, webviews, clipboard history, settings, and audio/video capture.

The project follows a command-first MVC design: every meaningful GUI action maps to a text command. A user types the public form (auri tab new), while internal GUI code calls the same command without the auri prefix (tab new). This keeps behavior testable and makes automation predictable.

Current status

This repository is a working application foundation, not a claim that every advanced desktop feature is complete.

Implemented now:

  • Vertical main workspaces and horizontal subtabs.
  • Per-workspace folder pane synchronized with terminal cwd.
  • File inspection with size, type, image dimensions, and optional ffprobe codec/bitrate metadata.
  • Text, image, audio, and video viewer paths.
  • Shell command execution and explicit cd synchronization.
  • Terminal composer where Enter inserts a newline and Command/Ctrl+Enter runs.
  • OpenAI-compatible and Gemini-compatible text/image requests, including the current screenshot when enabled.
  • Assistant replies can expose allowlisted shell-command and input-ready actions in a floating panel, with Run, Insert, and Copy controls.
  • Local model/settings management and an Info tab for errors, notices, and sanitized AI request details with text plus playable image/audio previews.
  • Clipboard text history with the 100 + 100 character long-text preview rule.
  • Audio/video recording through the WebView media APIs, with native persistence under ~/auri/media.
  • Native screenshot capture, workspace creation, local configuration, file access, and external file opening.
  • A real external auri CLI that sends commands to the running desktop app over a user-only Unix socket.
  • Browser preview mode with safe capability fallbacks.

Not complete yet:

  • The terminal backend is process-based, not a PTY. Interactive full-screen programs such as top, htop, editors, and password prompts need a PTY session layer.
  • The Alt+Space hold gesture works while Auri receives keyboard events; OS-global shortcut registration is not yet implemented.
  • Gemini Live wake sessions support persistent multi-turn voice interaction. Repeated Alt+Space activation or hold-to-talk reuses the active connection, sends a fresh screenshot, resumes suspended audio after app switching, and uses the configured no-reply timeout before disconnecting.
  • OpenAI Live remains configurable but does not yet provide a native bidirectional realtime session; its requests currently use the normal completion path.
  • Clipboard persistence captures text and images through native polling. Event-driven clipboard monitoring and broader cross-platform deduplication remain to be added.
  • Recording prefers MP4/M4A when the WebView supports it and falls back to WebM. Guaranteed 64 kbps M4A transcoding requires a native encoder/transcode stage.
  • Linux screenshot capture expects gnome-screenshot or grim; JPEG conversion uses ffmpeg when available.

Architecture

src/
  model/                 Pure state and parsing logic
    commands.js          Command registry, parser, raw-tail extraction
    state.js             Immutable application reducer and workspace model
    assistant.js         Safe allowlisted assistant action parsing and streaming cleanup
    clipboard.js         Clipboard serialization and preview rules
    presentation.js      Pure display helpers
  controllers/
    command-controller.js  Single command execution path for GUI and automation
    app-controller.js      Event orchestration and capability adapters
  services/
    backend.js           Browser/Tauri bridge and AI HTTP clients
    media-recorder.js    MediaRecorder capability wrapper
  views/
    app-view.js          Top-level view lifecycle
    panels.js            Modular panel renderers

src-tauri/src/core/
  workspace.rs           ~/auri and ~/.config/auri initialization
  files.rs               Directory, metadata, preview, attachment, media storage
  shell.rs               Shell process execution and cwd synchronization
  capture.rs             macOS/Linux screenshot capture
  clipboard.rs           Persistent clipboard text/image history, pinning, retention, and paste-back
  ipc.rs                 macOS/Linux external CLI command bridge
  util.rs                Dependency-light base64, MIME, path, and CLI helpers

src-tauri/src/bin/
  auri.rs                External terminal command client

The model never imports DOM or Tauri APIs. Controllers dispatch model events. Views render state and emit intents. Native operations stay behind Backend and Tauri commands.

Test-driven workflow

  1. Add or change a behavior test first.
  2. Run the smallest relevant test and confirm that it fails for the expected reason.
  3. Implement the pure model or controller behavior.
  4. Wire GUI controls to the same underlying command—do not duplicate the logic in the view.
  5. Run the focused test, then the complete suite.
  6. Run the production frontend build and Rust compiler check.
  7. Render the browser preview for visual and interaction sanity checks.

Common commands:

npm test
npm run test:rust
npm run check
cargo check --manifest-path src-tauri/Cargo.toml

npm run check runs JavaScript tests, dependency-light Rust core tests, and the frontend production build.

Development

Prerequisites:

  • Node.js 20 or newer.
  • A current stable Rust toolchain.
  • Tauri 2 CLI for native development: cargo install tauri-cli --version '^2'.
  • macOS: Xcode Command Line Tools.
  • Linux: the Tauri/WebKitGTK development packages for your distribution.

Browser preview:

npm run dev

Open http://localhost:4173. Preview mode supports the full interface and safe simulated/basic commands, but native filesystem, screenshot, global OS integration, and unrestricted shell features require Tauri.

Native development:

npm run tauri:dev

For restart-on-change development, npm run native:watch launches an independent Auri process with its own frontend server, temporary build identity, and command socket. Starting it again does not stop or replace another running watcher or Auri window.

Agent development launch rule:

Before starting a new development app or watcher, check for existing Auri development instances and stop only those dev/watch processes when they would conflict with the new run. Do not kill or replace release-version Auri processes. Prefer a build plus a manually started dev app for task verification when continuous watch mode is not needed.

Install the external CLI on your PATH:

npm run cli:install
auri tab new Research

The desktop app must be running. Each process owns a socket under ~/.config/auri/instances/; the CLI selects the most recently started reachable instance, restores quoting for arguments containing spaces, focuses that window, and sends the command through the same controller used by GUI clicks. Set AURI_COMMAND_SOCKET to a specific socket path when several instances are running and you need an exact target. Source builds install the CLI separately; release packaging should install or expose the auri binary on the user's PATH.

Production build:

npm run tauri:build

Tauri places platform bundles under src-tauri/target/release/bundle/. The npm build wrapper assigns every packaged build an independent application identifier so opening one build does not activate or terminate another. Set AURI_BUILD_ID when a stable, repeatable identifier is required for a release channel. Sign and notarize macOS builds, and package/test Linux bundles on each target distribution before release.

Data layout

~/auri/
  media/
    picture/             Screenshots and images
    audio/               Audio recordings
    video/               Video recordings
  subtabs/               Reserved for persisted subtab data
  clipboard/
    history.json         Clipboard history

~/.config/auri/
  settings.json          Models, API keys, and preferences
  instances/
    command-<pid>.sock   User-only external CLI socket for each Auri process

API keys are stored locally in the settings file. They are not committed by this project, but the file is currently plain JSON; protect your user account and filesystem permissions. Supplying a key in a terminal command can also leave it in shell or terminal history, so the Settings UI is preferable for secrets.

Command interface

The public form starts with auri and works from the embedded terminal or the installed external CLI. Inside the application, GUI code calls the same command without that prefix.

auri tab new [title]                                                           Create and focus a new main workspace tab.
auri tab close [id]                                                            Close a main tab (the active tab by default).
auri tab select <id>                                                           Focus a main tab.
auri subtab new <terminal|webview|viewer|clipboard|audio|video|settings|system|info>  Create and focus a horizontal subtab.
auri subtab close [id]                                                         Close a horizontal subtab.
auri subtab select <id>                                                        Focus a horizontal subtab.
auri folder cd <path>                                                          Change both folder and terminal working directory.
auri folder list [path]                                                        List a directory.
auri folder sort <name|date|type>                                             Sort the active folder listing.
auri folder create-file <name>                                                Create an empty file in the active folder.
auri folder create-folder <name>                                              Create a folder in the active folder.
auri folder info [path]                                                        Show folder size, disk, owner, and permission details.
auri file inspect <path>                                                       Show file metadata; repeat to open it.
auri file open <path>                                                          Open a file in the viewer.
auri file external [path]                                                      Open a file with the operating system.
auri terminal run <command...>                                                 Run a shell command in the active workspace.
auri ai ask <prompt...>                                                        Ask the selected AI with the current screenshot.
auri ai model add <name> <type> <model> <url> <key>                            Add an AI provider configuration.
auri ai model select <id>                                                      Select the AI model for the terminal.
auri ai model update <id> <name> <type> <model> <url> <key>                    Update an AI provider configuration.
auri ai model delete <id>                                                      Delete an AI provider configuration.
auri clipboard list                                                            Open clipboard history.
auri clipboard insert <id>                                                     Paste a clipboard item into the previously active application.
auri clipboard pin <id>                                                       Pin a clipboard item.
auri clipboard unpin <id>                                                     Unpin a clipboard item.
auri clipboard remove <id>                                                    Remove a clipboard item.
auri clipboard copy <text>                                                     Copy text to the system clipboard.
auri attachment add <path>                                                     Attach a local file to the next AI request.
auri attachment remove <id>                                                    Remove a prompt attachment.
auri input insert <text>                                                       Insert text into the focused prompt input.
auri transcript dismiss                                                        Close the completed voice-input text popup.
auri web open <url>                                                            Navigate the active webview.
auri web reload                                                                Reload the active webview.
auri web back                                                                  Go back in the active webview.
auri web forward                                                               Go forward in the active webview.
auri web external                                                              Open the active web URL externally.
auri web download                                                                Download the active page.
auri web zoom-in                                                                 Increase the active page zoom.
auri web zoom-out                                                                Decrease the active page zoom.
auri web zoom-reset                                                              Reset the active page zoom to 100%.
auri web bookmark [add [name] [url]|remove <id>]                                 Open the add-bookmark dialog or manage bookmarks.
auri web bookmarks                                                               Show saved bookmarks.
auri web history [clear]                                                         Show or clear browser history.
auri web devtools                                                                Open developer tools for the active page.
auri live record start                                                        Start push-to-talk microphone input for the selected Live model.
auri live record stop                                                         Stop microphone input and send the captured turn to the Live API.
auri record audio                                                              Open audio recording.
auri record video                                                              Open video recording.
auri record start <audio|video>                                                Start media capture.
auri record stop                                                               Stop the active media capture.
auri media attach <audio|video>                                                Attach the latest recording to the prompt.
auri settings open                                                             Open Settings.
auri settings set <key> <value>                                                Update an application setting.
auri permission status                                                         Refresh macOS media permission status.
auri permission request <microphone|screen-recording>                       Request or open macOS settings for a media permission.
auri system open                                                              Open the System monitor.
auri system sort <cpu|port|name|pid|ram|net>                                      Sort System monitor processes.
auri system refresh                                                           Refresh System monitor statistics.
auri system select <pid>                                                    Select a System monitor process.
auri system kill <pid>                                                      Kill the selected System monitor process.
auri system open-path <pid>                                                 Open the selected process path externally.
auri system tunnel start <port> [--install]                              Start a Cloudflare HTTPS tunnel for a process port.
auri system tunnel stop <port>                                           Stop the Cloudflare HTTPS tunnel for a process port.
auri info show                                                                 Open the Info subtab.
auri info clear                                                                Clear notifications and errors.
auri help                                                                      Show all available commands.

System monitor UI note: keep the process table Name column at twice the previous width (minmax(240px, 3fr)) so short process names stay readable before truncation; disk and network process tables share the same grid. The selected-process detail shows listening ports below the path field, one port per row. Each port can start or stop a confirmed Cloudflare cloudflared HTTPS tunnel; Auri checks PATH and ~/.local/bin, can install supported macOS/Linux binaries to ~/.local/bin after confirmation, closes the confirmation prompt into a per-port starting/stopping state, shows the generated trycloudflare.com URL beside the port, and copies that URL when clicked.

Examples:

auri tab new Research
auri subtab new webview
auri folder cd ~/Projects
auri terminal run printf '%s %s\\n' hello Auri
auri web open https://example.org
auri clipboard list
auri attachment add ~/Pictures/reference.png
auri ai ask Summarize the current screen and attached file
auri info show

Paths or values containing spaces should be quoted. Shell command tails and AI prompt tails preserve their original punctuation and quoting.

Error handling

Expected failures are shown both near the current interaction and in the Info subtab. Network errors, missing API keys, unsupported media capabilities, file access failures, malformed assistant output, and native capability failures should never silently disappear. A fallback message belongs in Info whenever content cannot be rendered in its intended panel.

Performance guidelines

  • Keep state transformations pure and incremental.
  • Truncate clipboard text longer than 150 characters to the first 100 and last 50 characters before rendering.
  • Avoid reading text files larger than 2 MB into the viewer.
  • Limit inline binary attachments to 32 MB and saved recordings to 256 MB.
  • Keep panel rendering modular; do not add unrelated behavior to panels.js.
  • Move long-running native operations off the UI thread when adding PTY, transcoding, or realtime streaming.

Visual design

  • Keep the System table scan-friendly: compact Name, RAM constrained to an about-seven-character display, Port wide enough for two port badges, Net/Disk metrics next, PID at the far right, and reset table scroll to top after sort changes.

The interface uses a light aurora palette, translucent surfaces, minimal separators, Unicode symbols with system-font fallbacks, and explicit press/hover/focus feedback. Buttons use icons where their meaning is standard; text remains where an icon alone would be ambiguous or unsafe. auri live record toggle Connect and record, or disconnect the active Live chat.

About

the all-in-one app that combine terminal with web will does everything more smooth for you on linux and mac, built with tauri

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors