Presentation control, reimagined.
moves is a presentation control system that uses offline speech recognition and a hybrid similarity engine to advance slides automatically as you speak.
- Hands-Free: Automatically advances slides based on your speech for a seamless, hands-free presentation.
- Intelligent: Utilizes an LLM to analyze and segment transcripts, mapping them accurately to presentation slides.
- Private: Performs all speech-to-text and similarity matching locally. No internet connection is required, ensuring data privacy and low latency.
- Accurate: A hybrid similarity engine combines semantic and phonetic analysis for precise speech-to-slide alignment, accommodating variations in speech.
- Controlled: Provides full manual override via keyboard controls to pause, resume, or navigate slides at any time.
- Configurable: A command-line interface for managing speaker profiles, processing presentation data, and configuring system settings.
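To give a rough feel for how a hybrid engine can blend semantic and phonetic signals, here is a minimal Python sketch. It is not the project's actual implementation: `difflib` string ratios stand in for an embedding-based semantic score, the phonetic side uses a simplified Soundex encoding, and the function names and 0.6 weighting are illustrative.

```python
from difflib import SequenceMatcher

def soundex(word: str) -> str:
    """Simplified Soundex: encode a word by how it sounds, not how it is spelled."""
    mapping = {}
    for letters, digit in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")]:
        for ch in letters:
            mapping[ch] = digit
    word = word.upper()
    prev, encoded = mapping.get(word[0], ""), []
    for ch in word[1:]:
        digit = mapping.get(ch, "")
        if digit and digit != prev:
            encoded.append(digit)
        if ch not in "HW":          # H/W do not break a run of equal codes
            prev = digit
    return (word[0] + "".join(encoded) + "000")[:4]

def phonetic_similarity(a: str, b: str) -> float:
    """Compare the Soundex code sequences of two phrases."""
    codes_a = [soundex(w) for w in a.split()]
    codes_b = [soundex(w) for w in b.split()]
    return SequenceMatcher(None, codes_a, codes_b).ratio()

def semantic_similarity(a: str, b: str) -> float:
    """Stand-in for an embedding-based score (a real engine would use a vector model)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def hybrid_score(spoken: str, expected: str, w_semantic: float = 0.6) -> float:
    """Blend both signals so misheard but similar-sounding words still match."""
    return (w_semantic * semantic_similarity(spoken, expected)
            + (1 - w_semantic) * phonetic_similarity(spoken, expected))
```

With a blend like this, a recognizer output such as "their are for lights" still scores high against the intended "there are four lights", because the phonetic codes agree even where the spelling does not.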
moves requires:

- Python 3.13+
- uv, a fast Python package installer and resolver.

Install moves-cli using `uv tool install`:

```shell
uv tool install moves-cli --python 3.13
```

This will install moves and make it available as a command-line tool.
Using moves consists of three main steps: configuring the AI model, processing the presentation data, and starting the control session.
Configure moves with the desired Large Language Model (LLM) and provide an API key.
Note: A list of compatible models is available at LiteLLM Supported Models.
```shell
# Set the desired model (e.g., Google's Gemini 2.5 Flash-Lite)
moves settings set model gemini/gemini-2.5-flash-lite

# Set your API key
moves settings set key YOUR_API_KEY_HERE
```

Create a speaker profile by providing a presentation and its corresponding transcript (both in PDF format). Then, process the data to align the transcript with the slides.
```shell
# Add a speaker with their presentation and transcript
moves speaker add john ./path/to/presentation.pdf ./path/to/transcript.pdf

# Process the speaker's data
moves speaker process john
```

This step uses the configured LLM and may take a few moments to complete.
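The LLM call behind this step might look roughly like the sketch below. The prompt wording, the `build_segmentation_messages` helper, and the JSON response shape are assumptions for illustration; only `litellm.completion` itself mirrors the actual LiteLLM API.

```python
import json

def build_segmentation_messages(transcript: str, slide_count: int) -> list[dict]:
    """Hypothetical prompt asking an LLM to split a transcript into per-slide segments."""
    system = ("You split a presentation transcript into exactly N segments, "
              "one per slide, and reply with a JSON list of strings.")
    user = f"N = {slide_count}\n\nTranscript:\n{transcript}"
    return [{"role": "system", "content": system},
            {"role": "user", "content": user}]

# The call itself would go through LiteLLM, e.g. (not executed here):
#   import litellm
#   resp = litellm.completion(model="gemini/gemini-2.5-flash-lite",
#                             messages=build_segmentation_messages(text, 12))
#   segments = json.loads(resp.choices[0].message.content)
```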
Open the presentation and execute the control command.
```shell
moves presentation control john
```

Once started, moves listens for your speech and sends Right Arrow key presses to advance the slides at the appropriate times.
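Conceptually, the advance decision reduces to comparing what was just spoken against the tail of the current slide's transcript segment. The sketch below is illustrative, not the project's code: `difflib` stands in for the real similarity engine, and the 0.7 threshold is an assumed default.

```python
from difflib import SequenceMatcher

def should_advance(spoken: str, segment_tail: str, threshold: float = 0.7) -> bool:
    """Advance once the speaker's words match the end of the current segment."""
    score = SequenceMatcher(None, spoken.lower(), segment_tail.lower()).ratio()
    return score >= threshold

# When this returns True, a Right Arrow key press would be emitted,
# e.g. with pyautogui.press("right").
```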
moves provides a command-line interface for managing presentations.
| Command | Description |
|---|---|
| `moves speaker` | Manage speaker profiles, files, and AI processing. |
| `moves presentation` | Start a live, voice-controlled presentation session. |
| `moves settings` | Configure the LLM model and API key. |
For more details, please refer to the CLI Commands.
Documentation is currently not up to date. It will be updated soon.
For a detailed explanation of the system's architecture, components, and design, please refer to the Documentation, which covers:
- Architecture: A high-level overview of the system's structure.
- Technical Details: In-depth explanations of key components like the similarity engine, data models, and STT pipeline.
Contributions are welcome. To contribute to the project, please follow these steps:
- Fork the repository on GitHub.
- Clone your fork locally.
- Create a new branch for your feature or bug fix (`git checkout -b feature/my-new-feature`).
- Set up the environment using `uv venv` and `uv sync`.
- Make your changes and commit them with a clear message.
- Push your branch to your fork (`git push origin feature/my-new-feature`).
- Open a pull request to the main repository.
This project is licensed under the terms of the GNU General Public License v3.0. For more details, see the LICENSE file.