This is a research issue for now, meant to produce a plan and track the idea.
An earlier analysis showed that we use the Qt Multimedia framework for a single use case: playing an uncompressed WAV file when someone is calling via Nextcloud Talk. That adds a large overhead to our app for the most basic requirement of playing a sound in a loop. Please draft a plan to remove the whole Qt Multimedia framework and replace it with a custom C++ API that abstracts the OS specifics of Windows, Linux and macOS and plays such a sound file through their native APIs.
The plan Claude Code (Opus 4.7 with high effort) came up with:
Replace Qt Multimedia with a native audio abstraction
Context
Qt Multimedia is pulled into the desktop client for exactly one purpose: playing the call-notification.wav ringtone in a loop when an incoming Nextcloud Talk call arrives. The framework (and its platform backends — WMF on Windows, AVFoundation on macOS, GStreamer/ALSA/Pulse on Linux) plus its QML plugin add significant installer/bundle size for a feature that amounts to "play this 898 KB PCM WAV a few times, then stop." The goal is to delete this dependency and replace it with a thin in-tree C++ class that wraps native OS audio APIs.
Expected outcomes:
- Smaller Windows installer (no Qt6Multimedia.dll + Qt Multimedia QML plugin), smaller macOS bundle (no QtMultimedia.framework or qml/QtMultimedia/*), smaller Linux distribution (no GStreamer plugin bring-in).
- No behavioral change to the call notification UX.
- One project-owned audio abstraction in case a future feature needs another sound.
Scope
Qt Multimedia usage is confined to:
- src/gui/tray/CallNotificationDialog.qml:11 (import QtMultimedia)
- src/gui/tray/CallNotificationDialog.qml:71-75 (SoundEffect { source; loops: 9 })
- Start: Component.onCompleted at line 59 (ringSound.play()).
- Stop: closeNotification() at line 39 (ringSound.stop()).
- Asset: theme/call-notification.wav (PCM 16-bit stereo 44.1 kHz, 898 KB), embedded via theme.qrc.in:293 as qrc:///client/theme/call-notification.wav.
- Qt Multimedia is NOT declared in any find_package/target_link_libraries. It is resolved at QML-import time and deployed by macdeployqt via QML scanning. Removing the import is the whole story on macOS. Windows deployment is NSIS-template driven at cmake/modules/NSIS.template.in; lines 436-437 reference Qt5Multimedia.dll / Qt5MultimediaWidgets.dll, which are dead code in a Qt6 build; delete them as part of this change.
Public C++ API
OCC::NotificationSoundPlayer — QObject with properties/methods API-shape-compatible with the used subset of SoundEffect, so the QML diff is one import and one element name.
// src/gui/notificationsoundplayer.h
#include <QObject>
#include <QString>

#include <memory>

namespace OCC {

class NotificationSoundPlayer : public QObject
{
    Q_OBJECT
    Q_PROPERTY(QString source READ source WRITE setSource NOTIFY sourceChanged)
    Q_PROPERTY(int loops READ loops WRITE setLoops NOTIFY loopsChanged)
    Q_PROPERTY(bool playing READ isPlaying NOTIFY playingChanged)

public:
    explicit NotificationSoundPlayer(QObject *parent = nullptr);
    ~NotificationSoundPlayer() override; // must stop playback

    QString source() const;
    int loops() const; // number of plays; default 1
    bool isPlaying() const;

public slots:
    void setSource(const QString &source); // accepts qrc:///..., file:///..., plain path
    void setLoops(int loops);
    void play();
    void stop();

signals:
    void sourceChanged();
    void loopsChanged();
    void playingChanged();

private:
    class Backend; // defined per platform in the _win/_mac/_linux source files
    QString resolveToFilesystemPath(const QString &source);

    std::unique_ptr<Backend> _backend;
    QString _source;
    int _loops = 1;
    bool _playing = false;
    QString _resolvedFilePath;
};

} // namespace OCC
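The three source spellings that setSource() is meant to accept can be told apart with a few prefix checks. The following is a minimal sketch in plain standard C++, so it builds without Qt; parseSource, ParsedSource and SourceKind are hypothetical names, not part of the plan. In the real class, QUrl (for example QUrl::toLocalFile()) would be the more natural tool and would also handle percent-decoding, which this sketch skips.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical helper types; the plan only names resolveToFilesystemPath().
enum class SourceKind { QtResource, LocalFile };

struct ParsedSource {
    SourceKind kind;
    std::string path; // ":/..." for resources, a filesystem path otherwise
};

inline bool startsWith(std::string_view text, std::string_view prefix)
{
    return text.substr(0, prefix.size()) == prefix;
}

// Drops a prefix that the caller has already checked for.
inline std::string_view dropPrefix(std::string_view text, std::string_view prefix)
{
    assert(startsWith(text, prefix));
    text.remove_prefix(prefix.size());
    return text;
}

// Classifies the source spellings the player is meant to accept.
// Percent-decoding is left out to keep the sketch short.
inline ParsedSource parseSource(std::string_view source)
{
    if (startsWith(source, "qrc:")) {
        std::string_view rest = dropPrefix(source, "qrc:");
        // Collapse "///client/x.wav" to "/client/x.wav".
        while (startsWith(rest, "//")) {
            rest.remove_prefix(1);
        }
        return {SourceKind::QtResource, ":" + std::string(rest)};
    }
    if (startsWith(source, ":/")) {
        return {SourceKind::QtResource, std::string(source)};
    }
    if (startsWith(source, "file://")) {
        std::string_view rest = dropPrefix(source, "file://");
        // "file:///C:/x.wav" leaves "/C:/x.wav"; drop the slash before the drive letter.
        if (rest.size() >= 3 && rest[0] == '/' && rest[2] == ':'
            && std::isalpha(static_cast<unsigned char>(rest[1]))) {
            rest.remove_prefix(1);
        }
        return {SourceKind::LocalFile, std::string(rest)};
    }
    return {SourceKind::LocalFile, std::string(source)};
}
```

Only the QtResource case needs the extraction step described in the next section; LocalFile paths can go to the backend unchanged.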
QRC → filesystem path
libcanberra (CA_PROP_MEDIA_FILENAME) and AVAudioPlayer (initWithContentsOfURL:error:) need a real filesystem path; XAudio2 only needs the PCM bytes, but reading the same extracted file keeps all three backends uniform. Resolve QRC/qrc: sources once per process by copying to QStandardPaths::writableLocation(QStandardPaths::CacheLocation) + "/sounds/<hash>.wav", where <hash> derives from the resource path + size, so repeated extraction is idempotent and an app update that changes the file size gets a fresh cache entry. Static map guarded by QMutex; extractor writes to <path>.tmp then renames atomically. Fall back to QTemporaryFile if the cache dir is not writable (sandbox). Keep the WAV embedded in QRC, so there is no install-path divergence per OS.
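The extraction scheme can be sketched in plain standard C++. This is an illustration under assumptions, not the planned Qt code: std::filesystem, std::mutex and std::map stand in for QStandardPaths, QMutex and the static map; the function names are hypothetical; FNV-1a is chosen here only because it is small and stable across runs. The QTemporaryFile fallback is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <map>
#include <mutex>
#include <stdexcept>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// FNV-1a: a small hash whose result is stable across runs, unlike std::hash.
inline std::uint64_t fnv1a(const std::string &text)
{
    std::uint64_t hash = 14695981039346656037ull;
    for (unsigned char c : text) {
        hash ^= c;
        hash *= 1099511628211ull;
    }
    return hash;
}

// Cache file name derived from resource path + payload size.
inline fs::path cachedSoundPath(const fs::path &cacheDir, const std::string &resourcePath, std::size_t size)
{
    const std::string key = resourcePath + ":" + std::to_string(size);
    return cacheDir / "sounds" / (std::to_string(fnv1a(key)) + ".wav");
}

// Extracts `data` at most once per process; later calls return the memoised path.
inline fs::path extractOnce(const fs::path &cacheDir, const std::string &resourcePath,
                            const std::vector<char> &data)
{
    assert(!resourcePath.empty());
    static std::mutex mutex;
    static std::map<std::string, fs::path> extracted; // keyed by source path
    std::lock_guard<std::mutex> lock(mutex);

    if (auto it = extracted.find(resourcePath); it != extracted.end()) {
        return it->second;
    }

    const fs::path target = cachedSoundPath(cacheDir, resourcePath, data.size());
    if (!fs::exists(target) || fs::file_size(target) != data.size()) {
        fs::create_directories(target.parent_path());
        fs::path tmp = target;
        tmp += ".tmp";
        {
            std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
            out.write(data.data(), static_cast<std::streamsize>(data.size()));
            out.flush();
            if (!out) {
                throw std::runtime_error("cannot write " + tmp.string());
            }
        } // stream is closed before the rename
        fs::rename(tmp, target); // replaces the target in one step on POSIX
    }
    extracted.emplace(resourcePath, target);
    return target;
}
```

A second dialog that asks for the same resource while the first is still extracting blocks on the mutex and then gets the memoised path, so no reader ever sees a half-written file.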
Platform backends
Windows — XAudio2 (not PlaySound)
Use XAudio2. PlaySound is rejected: SND_LOOP loops forever (needs a duration-based QTimer; duration requires parsing the WAV header anyway), only one PlaySound can play per process, and PlaySound(NULL, NULL, 0) stops process-wide.
XAudio2 provides exact loop counts via XAUDIO2_BUFFER::LoopCount (submit one buffer, LoopCount = loops - 1, LoopBegin = 0, LoopLength = 0), per-instance lifecycle (source voice), and no global state. Link xaudio2.lib. Inline WAV header parser (~50 LoC) for RIFF/fmt /data chunks; accept PCM 8/16/24-bit, mono/stereo, any sample rate; fail loudly on anything else (the theme WAV could be rebranded later — silent failure is worse than a logged error).
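For a sense of scale, the inline WAV header parser could look roughly like the sketch below. It is platform-neutral standard C++ with hypothetical names (WavInfo, parseWav); it walks the RIFF chunks, accepts only format tag 1 (WAVE_FORMAT_PCM) at 8/16/24 bits with one or two channels, and returns an empty optional for everything else so the caller can log and refuse to play. WAVE_FORMAT_EXTENSIBLE files are rejected too, which is an assumption of this sketch rather than a decision stated in the plan.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

struct WavInfo {
    std::uint16_t channels = 0;
    std::uint32_t sampleRate = 0;
    std::uint16_t bitsPerSample = 0;
    std::size_t dataOffset = 0; // byte offset of the PCM payload
    std::uint32_t dataSize = 0; // payload length in bytes
};

inline std::uint16_t readLe16(const std::uint8_t *p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

inline std::uint32_t readLe32(const std::uint8_t *p)
{
    return static_cast<std::uint32_t>(p[0]) | (static_cast<std::uint32_t>(p[1]) << 8)
        | (static_cast<std::uint32_t>(p[2]) << 16) | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Returns the format and payload location, or nullopt for anything unsupported.
inline std::optional<WavInfo> parseWav(const std::vector<std::uint8_t> &bytes)
{
    if (bytes.size() < 12 || std::memcmp(bytes.data(), "RIFF", 4) != 0
        || std::memcmp(bytes.data() + 8, "WAVE", 4) != 0) {
        return std::nullopt;
    }

    WavInfo info;
    bool haveFmt = false;
    std::size_t pos = 12;
    while (pos + 8 <= bytes.size()) {
        const std::uint8_t *chunk = bytes.data() + pos;
        const std::uint32_t size = readLe32(chunk + 4);
        const std::size_t body = pos + 8;
        assert(body <= bytes.size());
        if (size > bytes.size() - body) {
            return std::nullopt; // truncated chunk
        }
        if (std::memcmp(chunk, "fmt ", 4) == 0) {
            if (size < 16) {
                return std::nullopt;
            }
            const std::uint16_t formatTag = readLe16(chunk + 8);
            info.channels = readLe16(chunk + 10);
            info.sampleRate = readLe32(chunk + 12);
            info.bitsPerSample = readLe16(chunk + 22);
            const bool depthOk = info.bitsPerSample == 8 || info.bitsPerSample == 16
                || info.bitsPerSample == 24;
            const bool channelsOk = info.channels == 1 || info.channels == 2;
            if (formatTag != 1 || !depthOk || !channelsOk || info.sampleRate == 0) {
                return std::nullopt; // not plain PCM in a supported layout
            }
            haveFmt = true;
        } else if (std::memcmp(chunk, "data", 4) == 0) {
            if (!haveFmt) {
                return std::nullopt; // data before fmt: cannot interpret it
            }
            info.dataOffset = body;
            info.dataSize = size;
            return info;
        }
        pos = body + size + (size & 1u); // chunks are padded to an even length
    }
    return std::nullopt;
}
```

The Windows backend would copy channels, sampleRate and bitsPerSample into a WAVEFORMATEX and point XAUDIO2_BUFFER::pAudioData at dataOffset.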
macOS — AVAudioPlayer via Objective-C++
.mm file wrapping AVAudioPlayer initWithContentsOfURL:error: with a file:// URL, numberOfLoops = loops - 1 (AVAudioPlayer semantic: additional plays after the first; -1 for infinite). -prepareToPlay, -play, -stop on destruction inside @autoreleasepool. Frameworks: AVFoundation, Foundation. Exact loop count, no timer, clean lifecycle.
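Both native APIs count repeats after the first pass, while the QML-facing loops property counts total plays. A small sketch of that mapping follows; the helper names are hypothetical, the constant is copied from xaudio2.h so the snippet builds on any platform, and it assumes setLoops() never lets a value below 1 through.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Mirrors XAUDIO2_MAX_LOOP_COUNT from <xaudio2.h>; 255 (XAUDIO2_LOOP_INFINITE)
// would loop forever, so finite counts are clamped below it.
constexpr std::uint32_t kXAudio2MaxLoopCount = 254;

// XAUDIO2_BUFFER::LoopCount counts repeats after the first pass.
inline std::uint32_t xaudio2LoopCount(int loops)
{
    assert(loops >= 1 && "loops is the total number of plays");
    return std::min(static_cast<std::uint32_t>(loops - 1), kXAudio2MaxLoopCount);
}

// AVAudioPlayer.numberOfLoops has the same meaning; a negative value loops forever.
inline int avAudioPlayerNumberOfLoops(int loops)
{
    assert(loops >= 1 && "loops is the total number of plays");
    return loops - 1;
}
```

With loops: 9 from the dialog, both helpers return 8, which gives nine audible plays.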
Linux — libcanberra (committed)
libcanberra is the freedesktop event-sound API. An incoming-call ringtone is literally its intended use (CA_PROP_EVENT_ID = "phone-incoming-call" is a defined XDG sound theme event). It multiplexes PulseAudio, PipeWire (via pipewire-pulse or pipewire-alsa), and pure ALSA behind a single API. Packaged on every major distro, usually preinstalled on GNOME/KDE.
Rejected alternatives:
- libpulse-simple: breaks on pure ALSA / pipewire-without-pulse; more code (WAV parser + worker thread + re-feed loop).
- Direct ALSA: does not coexist with desktop sound servers.
- Fork paplay/aplay: hacky.
Loop counting: libcanberra has no native loop count. Subscribe to ca_finish_callback_t; on CA_SUCCESS, if remainingLoops > 0, decrement and ca_context_play_full again. The callback fires on the libcanberra worker thread — marshal to the main thread via QMetaObject::invokeMethod(this, ..., Qt::QueuedConnection) before touching members. stop() sets remainingLoops = 0 first (in-flight callback becomes a no-op), then ca_context_cancel.
Pass CA_PROP_MEDIA_FILENAME (extracted path), CA_PROP_MEDIA_ROLE = "event", CA_PROP_CANBERRA_CACHE_CONTROL = "permanent", CA_PROP_EVENT_ID = "phone-incoming-call".
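The loop bookkeeping itself can be modelled without libcanberra, which also makes it unit-testable. In this sketch LoopSequencer is a hypothetical name, the injected startPlay function stands in for ca_context_play_full(), and onFinished() is assumed to be called on the owning thread, after the queued invokeMethod hop described above.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Models the callback-chained loop count only; no audio API is involved.
class LoopSequencer
{
public:
    // startPlay stands in for ca_context_play_full(): it returns immediately
    // and completion is reported later through onFinished().
    explicit LoopSequencer(std::function<void()> startPlay)
        : _startPlay(std::move(startPlay))
    {
        assert(_startPlay);
    }

    void play(int loops)
    {
        if (loops < 1) {
            return;
        }
        _remainingLoops = loops - 1;
        _playing = true;
        _startPlay();
    }

    // Must run on the owning thread; success corresponds to CA_SUCCESS.
    void onFinished(bool success)
    {
        if (!_playing) {
            return; // late callback after stop(): nothing to do
        }
        if (success && _remainingLoops > 0) {
            --_remainingLoops;
            _startPlay();
            return;
        }
        _remainingLoops = 0; // sequence complete, or playback failed
        _playing = false;
    }

    void stop()
    {
        _remainingLoops = 0; // an in-flight callback becomes a no-op
        _playing = false;
        // The real backend would call ca_context_cancel() at this point.
    }

    bool isPlaying() const { return _playing; }

private:
    std::function<void()> _startPlay;
    int _remainingLoops = 0;
    bool _playing = false;
};
```

One case the sketch does not cover: a stale callback that arrives after stop() followed by an immediate new play() would be counted against the new sequence. A per-sequence id checked in onFinished() is one way to rule that out.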
Build-time optionality: pkg_check_modules(CANBERRA IMPORTED_TARGET libcanberra). If absent, compile a no-op backend that logs a warning on first play(). Do not hard-fail the build. Soft-dep libcanberra0 in distro packaging metadata.
File layout
Compile-switched backend files (mirrors the existing folderwatcher_{linux,win,mac}.cpp / systray_mac_common.mm convention) with PIMPL so platform headers never leak into the public header.
src/gui/notificationsoundplayer.h # public QObject, no platform headers
src/gui/notificationsoundplayer.cpp # dispatcher: source resolution, QRC extraction,
# loop-count bookkeeping for backends that don't
# do it natively (Linux), signals
src/gui/notificationsoundplayer_win.cpp # XAudio2 Backend impl + WAV header parser
src/gui/notificationsoundplayer_mac.mm # AVAudioPlayer Backend impl
src/gui/notificationsoundplayer_linux.cpp # libcanberra Backend impl (+ no-op fallback)
The Backend inner class has a uniform interface (setSource, play(remainingLoops), stop(), a destructor that stops playback, and a finished signal). Windows and macOS backends handle loops natively and only signal finished at sequence end. Linux backend signals after each play; the common file issues the next play().
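A rough picture of that interface, together with the no-op backend used for scaffolding and for builds without libcanberra. This sketch deliberately swaps in a virtual base class so it is self-contained and testable; the plan's compile-switched files would instead define one concrete NotificationSoundPlayer::Backend per platform. A std::function callback stands in for the finished signal, and all names besides setSource, play and stop are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <memory>
#include <string>
#include <utility>

class Backend
{
public:
    using FinishedCallback = std::function<void()>;

    virtual ~Backend() = default; // real backends stop playback here
    virtual void setSource(const std::string &filesystemPath) = 0;
    virtual bool play(int remainingLoops) = 0; // false if nothing was started
    virtual void stop() = 0;

    void setFinishedCallback(FinishedCallback callback) { _finished = std::move(callback); }

protected:
    void notifyFinished() const
    {
        if (_finished) {
            _finished();
        }
    }

private:
    FinishedCallback _finished;
};

// Used while scaffolding, and on Linux when libcanberra is absent.
class NoOpBackend final : public Backend
{
public:
    void setSource(const std::string &filesystemPath) override { _source = filesystemPath; }

    bool play(int remainingLoops) override
    {
        assert(remainingLoops >= 0);
        if (_warningCount == 0) {
            std::clog << "NotificationSoundPlayer: no audio backend, cannot play " << _source << '\n';
            ++_warningCount; // warn on the first play() only
        }
        return false;
    }

    void stop() override {}

    int warningCount() const { return _warningCount; }

private:
    std::string _source;
    int _warningCount = 0;
};

inline std::unique_ptr<Backend> makeBackend()
{
    return std::make_unique<NoOpBackend>();
}
```

Returning false from play() lets the common file leave the playing property unset, so the dialog stays visible and silent.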
CMake / build changes
src/gui/CMakeLists.txt
- Add notificationsoundplayer.h and notificationsoundplayer.cpp to client_SRCS near callstatechecker (unconditional).
- In the IF(APPLE) block (~line 289): append notificationsoundplayer_mac.mm.
- In the IF(WIN32) block (~line 356): append notificationsoundplayer_win.cpp.
- In the IF(NOT WIN32 AND NOT APPLE) block (~line 353): append notificationsoundplayer_linux.cpp.
- Linking:
  - Windows: add target_link_libraries(nextcloudCore PRIVATE xaudio2) inside the Windows block after ~line 364 (winmm is not needed; only the rejected PlaySound route would use it).
  - macOS: extend the existing -framework … list near line 684 with -framework AVFoundation -framework Foundation.
  - Linux: in the Linux block near line 677, add pkg_check_modules(CANBERRA IMPORTED_TARGET libcanberra); conditionally target_link_libraries(nextcloudCore PRIVATE PkgConfig::CANBERRA) and target_compile_definitions(nextcloudCore PRIVATE HAVE_LIBCANBERRA).
src/gui/owncloudgui.cpp
- Add #include "notificationsoundplayer.h" with the sibling GUI includes.
- Append next to line 138: qmlRegisterType<NotificationSoundPlayer>("com.nextcloud.desktopclient", 1, 0, "NotificationSoundPlayer");
cmake/modules/NSIS.template.in:436-437
Delete both lines (Qt5Multimedia.dll, Qt5MultimediaWidgets.dll). They cannot resolve in a Qt6 build, so they are dead code. Do not scrub the rest of the Qt5 block; out of scope.
Deployment audit (no changes expected; verify)
- macdeployqt invocation at src/gui/CMakeLists.txt:733-744 scans -qmldir=${CMAKE_SOURCE_DIR}/src/gui for QML imports. Once import QtMultimedia is removed, it stops copying QtMultimedia.framework and qml/QtMultimedia/libquickmultimedia.dylib. Verify by diffing Contents/Frameworks and Contents/Resources/qml/ before/after.
- No windeployqt is used anywhere; Windows bundling is entirely NSIS-template driven.
- admin/osx/make_universal.py: no Multimedia references (confirmed clean).
QML migration
src/gui/tray/CallNotificationDialog.qml:
- Delete line 11: import QtMultimedia.
- Replace lines 71-75 with:
    NotificationSoundPlayer {
        id: ringSound
        source: root.ringtonePath
        loops: 9
    }
The com.nextcloud.desktopclient import at line 9 already exposes the new type. Call sites at lines 39 (ringSound.stop()) and 59 (ringSound.play()) are unchanged.
Implementation sequence
- Scaffold notificationsoundplayer.{h,cpp} with the full public API, QRC extractor, PIMPL Backend interface, and a no-op Backend for all platforms.
- Register the type in owncloudgui.cpp and wire sources into src/gui/CMakeLists.txt. Build Win/macOS/Linux green with the no-op backend — this confirms wiring before any native code.
- Change CallNotificationDialog.qml. Dialog now works silently on all platforms; Qt Multimedia is fully unreferenced.
- Implement the macOS backend (smallest, cleanest). Verify end-to-end with a real Talk call.
- Implement the Windows backend (XAudio2 + WAV header parser). Verify end-to-end.
- Implement the Linux backend (libcanberra + main-thread-marshaled loop continuation). Verify on a PulseAudio host and a PipeWire host.
- Delete the two dead Qt5Multimedia*.dll lines from the NSIS template.
- Rebuild installers on each platform; diff against a pre-change build to confirm Qt Multimedia framework / QML plugin are no longer shipped.
- Add unit tests for path resolution and Linux loop bookkeeping (see Verification).
Verification
Manual end-to-end (on each platform)
Triggered by a real Talk call: two accounts on a Talk-enabled Nextcloud server, callee logged into the desktop client on the platform under test, caller initiates a call from the Talk web UI. Expected: CallNotificationDialog appears, ringtone plays.
Cases to confirm on Windows, macOS, and Linux:
- Audible playback at normal volume through multiple cycles.
- Decline button stops playback immediately (tests stop() from QML).
- 60-second timeout stops playback (tests CallStateChecker-driven stop).
- Natural completion after 9 plays stops cleanly (tests native loop count on Win/macOS; callback-chained loop on Linux).
- Closing the app while ringing stops playback and releases resources (tests the backend destructor, which is critical for XAudio2 voice release, AVAudioPlayer release, and ca_context_destroy).
- Audio device unplug during playback does not crash or hang.
Unit tests (worth writing)
- resolveToFilesystemPath(): feed qrc:///client/theme/call-notification.wav, a file:// URL, and a plain path. Assert a readable filesystem result and idempotency.
- Linux loop bookkeeping: mock the finished hook; assert play() is issued loops - 1 additional times and that stop() mid-sequence cancels remaining plays.
Full native-playback integration tests are not worth CI effort (real audio device, flaky). Keep those manual.
Risks / edge cases
- Overlapping notifications: each dialog owns its own player; all three backends handle concurrent instances. No special handling. If the product later wants "one ringtone at a time," route through a singleton — do not put that complexity in the player.
- No audio device / hotplug: all backends degrade gracefully (error return, error callback). Log once at warn; leave the dialog visible with no sound.
- Headless Linux / no sound server: libcanberra returns an error per-play. Soft failure, logged once.
- Build without libcanberra: HAVE_LIBCANBERRA compile switch → no-op backend. Build still succeeds.
- App termination while playing: destructor must stop playback. Windows: stop + destroy source voice, release mastering voice, release IXAudio2 (leaking the voice keeps a background callback thread alive past exec()). macOS: [player stop]; player = nil; in @autoreleasepool. Linux: remainingLoops = 0; ca_context_cancel; ca_context_destroy (destroy joins the worker thread).
- QRC extraction race: two dialogs constructed in quick succession both miss the cache. Mutex-guarded extraction keyed by source path; write to <path>.tmp + atomic rename.
- Cache dir not writable (sandbox): fall back to QTemporaryFile for that session; log once.
- WAV format assumptions (Windows): inline parser accepts PCM 8/16/24-bit mono/stereo at any sample rate; log + fail loudly on anything else. The current file qualifies; a future rebrand that swaps in a non-PCM WAV (ADPCM, IEEE float) will surface an actionable error instead of failing silently.
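Critical files
- src/gui/notificationsoundplayer.{h,cpp}, notificationsoundplayer_win.cpp, notificationsoundplayer_mac.mm, notificationsoundplayer_linux.cpp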
Reused existing patterns
- QML type registration pattern: owncloudgui.cpp:138 (CallStateChecker as sibling shape).
- Compile-switched platform sources: existing folderwatcher_linux.cpp / folderwatcher_win.cpp / folderwatcher_mac.cpp and systray_mac_common.mm convention in src/gui/CMakeLists.txt.
- #ifdef Q_OS_* point-of-use pattern: systray.cpp, openfilemanager.cpp, socketapi.cpp.
- Objective-C++ .mm for macOS backends: systray_mac_common.mm, fileprovider_mac.mm.