Generate .funscript motion files from an audio track using a hybrid of pretrained neural networks and acoustic signal processing.
The headline feature: it distinguishes speech from physical action, so a stroker device (e.g. The Handy, or anything Buttplug.io-compatible) holds still during dialogue and intro lines instead of flailing around.
18+ / adults only. This is a signal-processing tool intended for use with legally obtained content you own. It does not bundle, host, download, or distribute any audio. Use responsibly and in accordance with your local laws.
Naive audio-to-motion converters map "loud = move", so they twitch during speech, moaning, and ambient noise. Hand-written spectral heuristics improve on this but hit a ceiling — they cannot really tell what a sound is.
funscript-ai solves this by combining three signal sources and letting each do
what it is best at:
| Source | Model / method | Good at |
|---|---|---|
| Speech / voice content | PANNs CNN14 (AudioSet, 527 classes) | Recognising Speech / Narration / Whispering / Moan / Pant / Breathing |
| Precise speech timing | Silero VAD (ONNX) | Exact "this is a person speaking" timestamps |
| Impact / rhythm | Multi-band STFT + RMS-dB silence gate | Detecting rhythmic impact sounds |
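
To make the "RMS-dB silence gate" in the last row concrete, here is a minimal sketch in plain Python. The function names, frame sizes, and the -40 dB threshold are illustrative assumptions, not funscript-ai's actual code, and a real pipeline would more likely vectorise this with NumPy or librosa.

```python
import math

def rms_db(samples, frame_len=2048, hop=512, floor=1e-10):
    """Per-frame RMS level in dB relative to full scale (amplitude 1.0)."""
    if not samples:
        return []
    levels = []
    last_start = max(0, len(samples) - frame_len)
    for start in range(0, last_start + 1, hop):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        levels.append(20.0 * math.log10(max(rms, floor)))  # floor avoids log10(0)
    return levels

def silence_mask(samples, threshold_db=-40.0, **frame_args):
    """True for frames quieter than threshold_db, i.e. frames to gate out."""
    return [level < threshold_db for level in rms_db(samples, **frame_args)]

# One second of a very quiet tone followed by one second of a loud tone (16 kHz).
sr = 16000
quiet = [0.001 * math.sin(2 * math.pi * 220 * n / sr) for n in range(sr)]
loud = [0.5 * math.sin(2 * math.pi * 220 * n / sr) for n in range(sr)]
mask = silence_mask(quiet + loud)
print(mask[0], mask[-1])  # prints: True False
```

Frames flagged `True` would be treated as silence regardless of what the other two sources report, which is what keeps low-level ambient noise from producing motion.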
A joint decision tree labels every segment as `holding`, `gentle`, `intense`, or `climax`.
A motion generator then produces physically feasible stroke points and applies
device speed/interval limits.
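
The labelling step can be pictured as a small rule cascade. The sketch below is hypothetical: the feature names, thresholds, and rule order are assumptions chosen for illustration, and the real tree is described in docs/DESIGN.md.

```python
from dataclasses import dataclass

@dataclass
class SegmentFeatures:
    # All four fields are hypothetical stand-ins for the real per-segment features.
    speech_prob: float      # confidence (0..1) that someone is speaking
    is_silent: bool         # result of the RMS-dB silence gate
    impact_strength: float  # strength (0..1) of rhythmic impact
    impact_density: float   # impacts per second

def label_segment(f, speech_thresh=0.5, weak=0.2, strong=0.7, dense=3.0):
    """Return one of 'holding', 'gentle', 'intense', 'climax' (illustrative thresholds)."""
    # Speech or silence wins first, so dialogue never produces motion.
    if f.is_silent or f.speech_prob >= speech_thresh:
        return "holding"
    # Strong and dense impact is treated as climax.
    if f.impact_strength >= strong and f.impact_density >= dense:
        return "climax"
    # Clear but less dense impact is intense; anything weaker is gentle.
    if f.impact_strength >= weak:
        return "intense"
    return "gentle"

print(label_segment(SegmentFeatures(0.9, False, 0.9, 5.0)))  # prints: holding
print(label_segment(SegmentFeatures(0.1, False, 0.9, 5.0)))  # prints: climax
```

Checking speech before impact is the design choice that matters here: a loud spoken line still resolves to `holding` even if the impact detector fires.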
See docs/DESIGN.md for the full story of how this evolved
from a naive heuristic (v1) to the current hybrid AI approach (v9).
Windows: double-click start.bat
macOS / Linux: bash start.sh
The launcher checks/installs dependencies, downloads model weights on first run,
and opens a browser UI at http://127.0.0.1:7860. Drop in an audio file, click
Generate, download the .funscript.
To install and run from the command line instead:

    pip install -r requirements.txt
    python scripts/download_models.py   # one-time, ~312 MB
    python -m funscript_ai.cli input.wav

Useful flags:

    python -m funscript_ai.cli input.mp3 -o out.funscript --debug
    python -m funscript_ai.cli input.wav --max-speed 400 --gentle-high 40 --invert
    python -m funscript_ai.cli input.wav --climax-sensitivity 0.10   # more climax
    python -m funscript_ai.cli input.wav --device cuda               # GPU

From Python:

    from funscript_ai import generate_funscript, Config

    cfg = Config(handy_max_speed=470, pos_gentle_high=45)
    result = generate_funscript("input.wav", "out.funscript", config=cfg)
    print(result["classes"])   # {'holding': 91, 'intense': 34, 'climax': 13, 'gentle': 3}

| Label | When | Motion |
|---|---|---|
| `holding` | Dialogue, narration, silence (VAD/PANNs confirmed) | Stays at pos=100 (held), no movement |
| `gentle` | Light activity / breathing, weak impact | Irregular shallow strokes (0–45) |
| `intense` | Clear rhythmic impact | Irregular full-range strokes (0–100) |
| `climax` | Dense, strong impact | Regular full-range strokes (~220 ms/cycle) |
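
As an illustration of the `holding` and `climax` rows, the following sketch generates stroke points as `at` (milliseconds) / `pos` (0 to 100) pairs. The helper names `hold` and `regular_strokes` are hypothetical, and the sketch ignores the irregular timing and device limits that the real motion generator applies.

```python
def hold(start_ms, end_ms, pos=100):
    """Two points at the same position: the device stays still in between."""
    return [{"at": start_ms, "pos": pos}, {"at": end_ms, "pos": pos}]

def regular_strokes(start_ms, end_ms, cycle_ms=220, low=0, high=100):
    """Alternate between high and low every half cycle (one cycle = down and back up)."""
    half = cycle_ms // 2
    points = []
    pos = high
    t = start_ms
    while t <= end_ms:
        points.append({"at": t, "pos": pos})
        pos = low if pos == high else high
        t += half
    return points

points = regular_strokes(0, 440)
print([(p["at"], p["pos"]) for p in points])
# prints: [(0, 100), (110, 0), (220, 100), (330, 0), (440, 100)]
```

A `gentle` segment could reuse the same idea with `low=55`-style shallow bounds and jittered timing, but the exact ranges and randomisation are not specified here.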
Position convention: pos=100 = deepest / held, pos=0 = withdrawn.
Holding at 100 (rather than a mid-point) mirrors how professional human-made
scripts behave during pauses — see the design doc.
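
For orientation, a funscript is a JSON document whose `actions` list holds `at` (milliseconds) and `pos` (0 to 100) pairs. The sketch below builds one with a dialogue pause held at pos=100. The top-level `version`, `inverted`, and `range` keys are the ones commonly seen in funscript files; exactly which fields funscript-ai writes is an assumption, and `build_funscript` is a hypothetical helper.

```python
import json

def build_funscript(actions, inverted=False):
    """Validate positions, sort by time, and wrap the actions in a funscript-style dict."""
    for action in actions:
        if not 0 <= action["pos"] <= 100:
            raise ValueError(f"pos out of range 0..100: {action['pos']}")
    ordered = sorted(actions, key=lambda a: a["at"])
    return {"version": "1.0", "inverted": inverted, "range": 100, "actions": ordered}

# A 5-second dialogue pause held at pos=100, then one stroke out and back.
script = build_funscript([
    {"at": 0, "pos": 100},
    {"at": 5000, "pos": 100},
    {"at": 5300, "pos": 0},
    {"at": 5600, "pos": 100},
])
text = json.dumps(script, indent=2)
print(text)
```

Because the pause starts and ends at the same position, a player interpolating between points keeps the device still for the whole five seconds.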
- Python 3.10+
- ~1.5 GB disk for dependencies (PyTorch) + ~312 MB for the PANNs model
- CPU is fine (~15 s for a 12-minute track); CUDA optional
Model weights are not committed to the repo; they download automatically on
first run to ~/panns_data/.
- Toy moves during talking → raise `--climax-sensitivity` slightly, or lower `--silence-db` (e.g. `-40`) to gate more aggressively.
- Not enough climax detected → lower `--climax-sensitivity` (e.g. `0.10`).
- Strokes too deep/shallow for your anatomy/device → adjust `--gentle-high` and device limits.
- Run with `--debug` to get `*.debug.txt` (per-segment reasons) and `*.features.csv` (probability time-series you can plot).
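
When tuning device limits, it can help to check how fast a generated script actually moves. The sketch below computes the speed of each move from a list of funscript-style actions. It measures speed in position units per second, which is an assumption: this README does not state the unit used by `--max-speed`, and `move_speeds` and `fast_moves` are hypothetical helpers.

```python
def move_speeds(actions):
    """Speed of each move between consecutive actions, in position units per second."""
    speeds = []
    for prev, cur in zip(actions, actions[1:]):
        dt_ms = cur["at"] - prev["at"]
        if dt_ms <= 0:
            raise ValueError("actions must have strictly increasing timestamps")
        speeds.append(abs(cur["pos"] - prev["pos"]) * 1000.0 / dt_ms)
    return speeds

def fast_moves(actions, max_speed):
    """Return (start_ms, end_ms, speed) for every move faster than max_speed."""
    pairs = zip(actions, actions[1:])
    return [
        (prev["at"], cur["at"], speed)
        for (prev, cur), speed in zip(pairs, move_speeds(actions))
        if speed > max_speed
    ]

actions = [
    {"at": 0, "pos": 100},
    {"at": 500, "pos": 100},   # held still
    {"at": 700, "pos": 0},     # 100 units in 200 ms
    {"at": 800, "pos": 100},   # 100 units in 100 ms
]
print(move_speeds(actions))      # prints: [0.0, 500.0, 1000.0]
print(fast_moves(actions, 600))  # prints: [(700, 800, 1000.0)]
```

To run it on real output, load the file with `json.load` and pass its `actions` list.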
- PANNs — Q. Kong et al., PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition (2020).
- Silero VAD — Silero Team.
- librosa — audio analysis.
MIT.
Generated scripts are heuristic and may be imperfect — always review before use. The authors take no responsibility for how the output is used. No copyrighted or personal media is included in this repository.