Differences Across Cemu Audio APIs in Measurements and Subjective Listening #1520

Open
@tempehnoodle

Current Behavior

Each audio API in Cemu appears to show different characteristics in both measurements and subjective listening. These differences occur both when running Cemu.exe directly and when launching it through Retrobat. Similar variation between audio backends may be present in other emulators as well. The three audio APIs tested are Cubeb, DirectSound, and Xaudio2. All other Windows and emulation settings are identical.

Only one audio device, an external DAC, is connected to the PC for audio playback. In the Cemu audio settings, the TV Device dropdown menu reads Default Device for Cubeb and Primary Audio Driver for both Xaudio2 and DirectSound. These are the default settings. Manually selecting 3- USB Sound Device in the same dropdown menu produced no noticeable change in measurements or sound compared to the default settings.

The Cemu volume is set to 100 for testing so that the results are easier to read on the spectrogram. At the default volume of 50, the characteristics in sound and measurements are analogous but much harder to see on the spectrogram. The sound samples were recorded in Audacity as WAV files with a bit depth of 16 bits and a sample rate of 44,100 Hz. The audio playback format on the PC is also 16-bit, 44,100 Hz. At this sample rate, the recordings span 0 Hz to 22,050 Hz. Peak frequency spectrograms of each sound sample are provided.

Windows and Cemu Sound Settings, and Audacity Recording Settings

Image
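
For anyone reproducing the recordings, the stated format can be checked directly from the WAV header. This is a minimal sketch using Python's standard `wave` module. The function name `describe_wav` is hypothetical and not part of the original report.

```python
import wave

def describe_wav(path):
    """Return the basic format of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()      # samples per second, e.g. 44100
        bit_depth = wav.getsampwidth() * 8    # bytes per sample -> bits
        channels = wav.getnchannels()
        frames = wav.getnframes()
    return {
        "sample_rate_hz": sample_rate,
        "bit_depth": bit_depth,
        "channels": channels,
        "duration_s": frames / sample_rate,
        # Highest frequency representable at this sample rate (Nyquist limit).
        "nyquist_hz": sample_rate / 2,
    }
```

For a 16-bit, 44,100 Hz recording this should report a `nyquist_hz` of 22,050, which matches the upper limit of the spectrograms.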

The peak frequency spectrograms show the amplitude of sound energy across a range of frequencies over a period of time. The x-axis shows time in seconds. The midpoint of the currently visible timeline is marked by a thick vertical line down the center of the x-axis. The bottom left of the window shows the lowest and highest time values of the currently displayed portion of the graph. The y-axis shows frequency in Hertz (Hz).

The color scale is mapped to sound amplitude in decibels (dB). The color and intensity at a given frequency and moment in time correspond to the amplitude of the sound at that point. When viewing only a section of the recording rather than the whole, the relationship between color intensity and amplitude adjusts to the sound level of the currently visible portion instead of the sound level of the entire recording. This means that a quiet section that would appear dark blue and black when displayed alongside louder parts of the recording appears much brighter, possibly red and yellow, when the window shows only that quiet section.
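
The effect of this per-window normalization can be illustrated with a small sketch. It assumes a simple linear min-max mapping from dB to brightness. The actual color mapping used by the spectrogram tool is not known, and the function name and dB values below are hypothetical.

```python
def normalize_to_window(levels_db):
    """Map dB levels in the visible window to 0.0 (darkest) .. 1.0 (brightest)."""
    lo, hi = min(levels_db), max(levels_db)
    if hi == lo:
        return [0.0 for _ in levels_db]  # flat window: nothing to contrast
    return [(v - lo) / (hi - lo) for v in levels_db]

# Hypothetical levels: a loud passage followed by a quiet one.
loud = [-12.0, -6.0, -9.0]
quiet = [-48.0, -42.0, -45.0]

whole = normalize_to_window(loud + quiet)   # quiet part viewed with the loud part
quiet_only = normalize_to_window(quiet)     # quiet part viewed on its own

print(max(whole[3:]))    # brightest quiet frame within the full view
print(max(quiet_only))   # the same frame when only the quiet part is visible
```

With these made-up numbers, the brightest quiet frame sits at roughly 0.14 of the scale in the full view and at 1.0 when viewed alone, which is why zoomed-in figures look brighter than wide ones.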

Figure 1 shows peak frequency spectrograms of the Cubeb, DirectSound, and Xaudio2 recordings of Mario Kart 8 from 11-20 kHz. This frequency range is within the range of human hearing and focuses on the upper harmonics above the vocal range. The recorded section includes car sound effects for the first 2 seconds, followed by a voice clip for the next 2 seconds, and music for the remainder of the timeline. During the first four seconds, Xaudio2 shows fewer broad dips in amplitude across the 11-18 kHz band than Cubeb and DirectSound. During these sound effects, the difference between amplitude peaks and dips from 17-20 kHz is greater in Cubeb and DirectSound than in Xaudio2, and their sound energy in this region is less evenly distributed across the frequency band. Xaudio2 shows stronger peaks than either from 11-14 kHz during the initial sound effects, with fewer dips in sound energy across the frequency band. Similar behavior continues from the 4-5 second mark during the brass section's introduction. The section from 5-10 seconds contains a bass guitar solo with percussion. In this section, all APIs show similar readings from 11-14 kHz. Around 15-20 kHz, Cubeb and DirectSound show a greater number of high-amplitude peaks, with a longer decay, than Xaudio2. Xaudio2 shows peaks and dips more evenly distributed along the time axis, corresponding to the quarter-note beats of the bass solo.

Figure 1

mk8 opening cubeb, directsound, xaudio2 sample 1 100 Vol (11-20kHz) ~0-10 sec

Image
Image
Image

Figure 2 shows high frequencies from 10-22.05 kHz. Cubeb shows a reduction in high frequencies above 20 kHz. DirectSound shows a rolloff at around 21 kHz. Xaudio2 shows the greatest extension, up to 22.05 kHz, which is the highest frequency representable at a sampling rate of 44.1 kHz. Because the color scale is normalized in these images, they appear much brighter than they would if viewed in a wider window of the frequency range.

Figure 2

mk8 opening cubeb, directsound, xaudio2 sample 1 100 Vol (10-22.05 kHz) full length

Image
Image
Image
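
One way to put numbers on the rolloff seen in Figure 2 is to measure the level at a few high frequencies relative to a midrange reference and compare the results across the three recordings. The sketch below uses the Goertzel algorithm in plain Python. It was not used for the measurements in this report, the function names are hypothetical, and `samples` is assumed to be one channel of a recording already decoded to a list of numbers.

```python
import math

def goertzel_power(samples, sample_rate, freq_hz):
    """Signal power at the DFT bin nearest freq_hz (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * freq_hz / sample_rate)        # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def level_db(samples, sample_rate, freq_hz, ref_hz):
    """Level at freq_hz relative to the level at ref_hz, in dB."""
    p = max(goertzel_power(samples, sample_rate, freq_hz), 1e-30)
    p_ref = max(goertzel_power(samples, sample_rate, ref_hz), 1e-30)
    return 10.0 * math.log10(p / p_ref)

# Synthetic check: a 1 kHz tone plus a 21 kHz tone at one tenth the amplitude.
rate = 44100
signal = [
    math.sin(2 * math.pi * 1000 * i / rate)
    + 0.1 * math.sin(2 * math.pi * 21000 * i / rate)
    for i in range(4410)
]
print(round(level_db(signal, rate, 21000, 1000), 2))  # close to -20 dB
```

Because music is not a steady test tone, such readings are only meaningful when the same passage of each recording is compared.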

Figure 3 shows the majority of the frequency range, from 40-22,050 Hz. The varying high-frequency cutoffs between the APIs are still visible on the high-frequency spike at 5 seconds, but are much less visible than on the color-normalized scale of Figure 2.

Figure 3

mk8 opening cubeb, directsound, xaudio2 sample 1 100 Vol (40 Hz-22.05 kHz) full length

Image
Image
Image

Subjective Listening

The recordings start with a sound effect of cars zooming by, followed by a voice line and a long music track. The brass section introduces the title theme, which leads into a bass solo. During this bass solo, the percussionist outlines each quarter note with the kick drum and hi-hat. There are four quarter-note beats per measure, and the tempo is around 144 bpm. A useful listening aid for making sense of the music recording is clapping at certain intervals. Clapping only on the off-beats, which are beats 2 and 4, or alternatively only on beats 1 and 3, can help the listener get a general sense of how these recordings differ from each other.
The opening voice line sounds slightly coarser in timbre in the Cubeb and DirectSound samples than in Xaudio2. Xaudio2 produces a more metallic and expressive-sounding brass section than the other APIs. Xaudio2 also provides more of a rhythmic impulse leading into the off-beats (beats 2 and 4). It's got more groove. Cubeb places this emphasis on beats 1 and 3 rather than on the off-beats. When listening to the Cubeb sample, clapping on the off-beats feels more like an interruption in the flow of the music. DirectSound is a bit closer to Xaudio2 in its brighter tone, but not quite as lively in its sense of rhythm. During the G-funk whistle synth solo at about 30 seconds, the percussion sounds more articulate in Xaudio2 than in the other APIs, which adds nuance and rhythmic variety to the performance. In the Cubeb and DirectSound samples, the brass section sounds restrained in dynamics during ascending scales and crescendos. The Xaudio2 sample sounds more dynamic in these same sections.
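
To line the clapping aid up with the recordings, the approximate beat positions can be computed from the tempo. This is a rough sketch: the 144 bpm figure is only approximate, the 5 second starting point for the bass solo is an assumption taken from the spectrogram description, and the function name is hypothetical.

```python
def beat_times(bpm, start_s, num_beats, beats_per_measure=4):
    """Return (time_s, beat_number_in_measure) for each quarter-note beat."""
    beat_len = 60.0 / bpm  # seconds per quarter note
    return [
        (start_s + i * beat_len, i % beats_per_measure + 1)
        for i in range(num_beats)
    ]

# Assumed values: ~144 bpm, bass solo starting near the 5 second mark.
for t, beat in beat_times(bpm=144, start_s=5.0, num_beats=8):
    marker = "off-beat" if beat in (2, 4) else "on-beat"
    print(f"{t:6.3f} s  beat {beat}  {marker}")
```

At 144 bpm a quarter note lasts about 0.417 seconds, so a full measure takes about 1.67 seconds. The printed times can be used as label positions in an audio editor, adjusted by ear to the actual downbeat.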

Sound Samples

mk8 opening cubeb sample 1 100.zip

mk8 opening cubeb sample 2 100.zip

mk8 opening cubeb sample 3 100.zip

mk8 opening directsound sample 1 100.zip

mk8 opening directsound sample 2 100.zip

mk8 opening directsound sample 3 100.zip

mk8 opening xaudio2 sample 1 100.zip

mk8 opening xaudio2 sample 2 100.zip

mk8 opening xaudio2 sample 3 100.zip

Listed below are images of all samples used, showing the same frequency and time window as Figure 1.

Image
Image
Image
Image
Image
Image
Image
Image
Image

Expected Behavior

All audio APIs are expected to produce identical sound.

Steps to Reproduce

  1. Open Cemu.exe
  2. Click on Options Tab
  3. Click on General Settings
  4. Click on Audio Tab
  5. Select Audio API
  6. Set Volume to either 50 or 100
  7. Close General Settings Window
  8. Double click on Mario Kart 8 icon

System Info (Optional)

OS: Windows 10 Home
GPU: Sapphire Pulse RX 6600 Lite Edition

Emulation Settings (Optional)

No response

Logs (Optional)

log.txt

Labels: bug (Something isn't working)