Inconsistent AL_SAMPLE_OFFSET updates on Raspberry Pi OS leading to audio synchronization issues (too slow playing position update with too large sample steps) #6422

@qrp73

Describe the bug

I am debugging code that synchronizes tightly with the sound card in order to keep a continuous audio stream running without interruptions and with minimal latency. On Raspberry Pi OS the code fails to stay in sync with the sound card, while the same code works fine on other systems (Windows, Arch Linux).

A detailed investigation showed that on Raspberry Pi OS the AL_SAMPLE_OFFSET position is updated in excessively large steps at long intervals, and the step size is unusual (not a power of two).

For example, at a sample rate of 48 kHz the step is typically 1114 samples with an update interval of about 30 ms. That is too coarse to maintain synchronization with a buffer length of 1/50 or 1/60 of a second (chosen to match the display refresh rate). In contrast, on Arch Linux with the same USB sound card I see a stable update step of 512 samples and an update interval of about 10 ms, which is sufficient for most use cases.

Could you please fix this so that, at a 48 kHz sample rate, the position updates every 10 ms in 512-sample steps, as it does on other operating systems?

Steps to reproduce the behaviour

  1. sudo apt install libopenal-dev
  2. create test-al.cpp with the following test code:
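// g++ -o test-al test-al.cpp -lopenal
#include <AL/al.h>
#include <AL/alc.h>
#include <iostream>
#include <thread>
#include <vector>
#include <tuple>      // std::tuple, std::get
#include <string>     // std::stoi
#include <cstdint>    // uint16_t, uint32_t, int16_t
#include <cstdio>     // printf
#include <cmath>
#include <algorithm>
#include <chrono>
#include <numeric>
#include <iomanip>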

const int SampleRate = 48000;
const int BufferSize = SampleRate / 50;
int BufferCount = 8;
bool isFinished = false;

double phase = 0;
double phaseStep = 2 * M_PI * 700 / SampleRate;

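// 700 Hz sine generator; each 32-bit word holds one stereo frame (L and R, 16 bits each)
void RequestData(uint32_t* buffer) {
    for (int i = 0; i < BufferSize; i++) {
        // Convert to signed 16-bit first: casting a negative double directly to an unsigned type is undefined
        auto sample = static_cast<int16_t>(32760 * sin(phase));
        auto bits = static_cast<uint32_t>(static_cast<uint16_t>(sample));
        buffer[i] = bits | (bits << 16);
        phase += phaseStep;
        if (phase >= 2 * M_PI)
            phase -= 2 * M_PI;
    }
}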

void BufferStreamingThread() {
    ALuint source = 0;
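    // BufferCount is set at run time, so use a vector instead of a non-standard variable-length array
    std::vector<ALuint> bufferStorage(BufferCount, 0);
    ALuint* buffers = bufferStorage.data();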

    try {
        alGenSources(1, &source);
        alGenBuffers(BufferCount, buffers);

        uint32_t data[BufferSize];
        for (int i = 0; i < BufferCount; i++) {
            RequestData(data);
            alBufferData(buffers[i], AL_FORMAT_STEREO16, data, sizeof(data), SampleRate);
            alSourceQueueBuffers(source, 1, &buffers[i]);
        }

        printf("SampleRate:  %d\n", SampleRate);
        printf("BufferSize:  %d\n", BufferSize);
        printf("BufferCount: %d\n", BufferCount);

        alSourcePlay(source);
        std::vector<std::tuple<long long, int, int>> records;

        while (!isFinished) {
            ALint processedBuffers = 0, sampleOffset = 0;
            alGetSourcei(source, AL_BUFFERS_PROCESSED, &processedBuffers);
            alGetSourcei(source, AL_SAMPLE_OFFSET, &sampleOffset);

            auto now = std::chrono::high_resolution_clock::now();
            auto timestamp = std::chrono::duration_cast<std::chrono::microseconds>(now.time_since_epoch()).count();
            records.emplace_back(timestamp, sampleOffset, processedBuffers);

            if (records.size() > 1 && std::get<1>(records[records.size() - 2]) > sampleOffset) {
                std::vector<double> stat_offdt;
                std::vector<int> stat_offds;
                std::vector<double> stat_bufdt;
                auto t0 = std::get<0>(records.front());
                int prevOffset = -1;
                long long prevOfTime = 0;
                int prevBuffer = -1;
                long long prevBuTime = 0;

                for (size_t i = 0; i < records.size(); ++i) {
                    auto t1 = std::get<0>(records[i]);
                    int offset = std::get<1>(records[i]);
                    int buffer = std::get<2>(records[i]);

                    double timef = (t1 - t0) / 1000.0;
                    double offdt = NAN;
                    double bufdt = NAN; 
                    int offds = 0;
                    if (prevOffset != offset) {
                        if (prevOffset != -1) {
                            offdt = (t1 - prevOfTime) / 1000.0;
                            stat_offdt.push_back(offdt);

                            offds = offset - prevOffset;                            
                            if (i != records.size()-1)  // do not include last record into statistics
                            {
                                stat_offds.push_back(offds);
                            }
                        }
                        prevOffset = offset;
                        prevOfTime = t1;
                    }
                    if (prevBuffer != buffer) {
                        if (prevBuffer != -1) {
                            bufdt = (t1 - prevBuTime) / 1000.0;
                            stat_bufdt.push_back(bufdt);
                        }
                        prevBuffer = buffer;
                        prevBuTime = t1;
                    }

                    printf("%zu: time=%.1f ms, offset=%d, buffers=%d", i, timef, offset, buffer);
                    if (!std::isnan(offdt))
                        printf(" => offset_dt=%.2f ms, delta=%d", offdt, offds);
                    if (!std::isnan(bufdt))
                        printf(" => buffer_dt=%.2f ms", bufdt);
                    printf("\n");
                }

                printf("Statistics: [sampleOffset] | [processedBuffers]\n");
                printf("min dt [ms]: %6.2f | %6.2f\n",
                    *std::min_element(stat_offdt.begin(), stat_offdt.end()),
                    *std::min_element(stat_bufdt.begin(), stat_bufdt.end()));
                printf("max dt [ms]: %6.2f | %6.2f\n",
                    *std::max_element(stat_offdt.begin(), stat_offdt.end()),
                    *std::max_element(stat_bufdt.begin(), stat_bufdt.end()));
                printf("avg dt [ms]: %6.2f | %6.2f\n",
                    std::accumulate(stat_offdt.begin(), stat_offdt.end(), 0.0) / stat_offdt.size(),
                    std::accumulate(stat_bufdt.begin(), stat_bufdt.end(), 0.0) / stat_bufdt.size());
                printf("min delta samples: %d\n", *std::min_element(stat_offds.begin(), stat_offds.end()));
                printf("max delta samples: %d\n", *std::max_element(stat_offds.begin(), stat_offds.end()));
                // Cast the size to int so the result matches the %d format specifier
                printf("avg delta samples: %d\n", std::accumulate(stat_offds.begin(), stat_offds.end(), 0) / static_cast<int>(stat_offds.size()));
                
                isFinished = true;
                break;
            }

            std::this_thread::sleep_for(std::chrono::microseconds(50));
        }
    } catch (...) {
        std::cerr << "Error occurred in buffer streaming thread." << std::endl;
    }

    // Clean up resources
    alSourceStop(source);
    alDeleteSources(1, &source);
    alDeleteBuffers(BufferCount, buffers);
}

int main(int argc, char* argv[]) {
    if (argc < 2) {
        std::cerr << "Usage: " << argv[0] << " <BufferCount>" << std::endl;
        return -1;
    }
    BufferCount = std::stoi(argv[1]);

    ALCdevice* device = nullptr;
    // Attempt to open the default device
    device = alcOpenDevice(nullptr);
    if (!device) {
        std::cerr << "Failed to open device" << std::endl;
        return -1;
    }

    ALCint attributes[] = { 
        ALC_FREQUENCY, SampleRate, 
        ALC_REFRESH, 0,
        ALC_SYNC, 0,
        0 
    };
    ALCcontext* context = alcCreateContext(device, attributes);
    if (!context || alcMakeContextCurrent(context) == ALC_FALSE) {
        std::cerr << "Failed to create or set context" << std::endl;
        if (context) alcDestroyContext(context);
        alcCloseDevice(device);
        return -1;
    }

    std::thread workerThread(BufferStreamingThread);
    workerThread.join();

    alcDestroyContext(context);
    alcCloseDevice(device);

    return 0;
}
  3. compile: g++ -o test-al test-al.cpp -lopenal
  4. run the test for 500 buffers (500/50 = 10 seconds) to collect statistics: ./test-al 500

Expected result: max delta samples of 512 and max dt of about 10 ms ± jitter

Actual result:

Statistics: [sampleOffset] | [processedBuffers]
min dt [ms]:   0.16 |   0.23
max dt [ms]:  60.41 |  60.41
avg dt [ms]:  25.11 |  30.34
min delta samples: 180
max delta samples: 2228
avg delta samples: 1200

For comparison, here are the results from Arch Linux with the same USB sound card:

Statistics: [sampleOffset] | [processedBuffers]
min dt [ms]:  10.36 |  10.45
max dt [ms]:  10.91 |  21.54
avg dt [ms]:  10.65 |  19.98
min delta samples: 512
max delta samples: 512
avg delta samples: 512

As the output shows, Raspberry Pi OS can advance the sample position by as much as 2228 samples in a single step (and sometimes more), while the buffer size is only 960 samples. As a result, the sample position is occasionally updated only once per three buffers, which causes significant lag and breaks in the audio stream.

Device (s)

Raspberry Pi 4 Mod. B

System

$ uname -a && cat /etc/rpi-issue && vcgencmd version
Linux raspi 6.6.51+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.6.51-1+rpt3 (2024-10-08) aarch64 GNU/Linux
Raspberry Pi reference 2023-09-22
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 40f37458ae7cadea1aec913ae10b5e7008ebce0a, stage4
Aug 30 2024 19:17:39 
Copyright (c) 2012 Broadcom
version 2808975b80149bbfe86844655fe45c7de66fc078 (clean) (release) (start)

Logs

Raspberry Pi OS:

$ ./test-al 500
SampleRate:  48000
BufferSize:  960
BufferCount: 500
<...>
81001: time=9946.5 ms, offset=475678, buffers=495 => offset_dt=31.66 ms, delta=1114 => buffer_dt=31.66 ms
81002: time=9946.7 ms, offset=475678, buffers=495
...
81247: time=9978.1 ms, offset=475678, buffers=495
81248: time=9978.5 ms, offset=476792, buffers=496 => offset_dt=31.99 ms, delta=1114 => buffer_dt=31.99 ms
81249: time=9978.7 ms, offset=476792, buffers=496
...
81493: time=10010.3 ms, offset=476792, buffers=496
81494: time=10010.5 ms, offset=478840, buffers=498 => offset_dt=31.93 ms, delta=2048 => buffer_dt=31.93 ms
81495: time=10010.6 ms, offset=478840, buffers=498
81497: time=10010.9 ms, offset=478840, buffers=498
81498: time=10011.0 ms, offset=479020, buffers=498 => offset_dt=0.52 ms, delta=180
81499: time=10011.1 ms, offset=479020, buffers=498
81500: time=10011.3 ms, offset=479020, buffers=498
...
81735: time=10042.2 ms, offset=479020, buffers=498
81736: time=10042.4 ms, offset=0, buffers=500 => offset_dt=31.37 ms, delta=-479020 => buffer_dt=31.89 ms
Statistics: [sampleOffset] | [processedBuffers]
min dt [ms]:   0.16 |   0.23
max dt [ms]:  60.41 |  60.41
avg dt [ms]:  25.11 |  30.34
min delta samples: 180
max delta samples: 2228
avg delta samples: 1200

Arch Linux with the same USB sound card:

$ ./test-al 500
SampleRate:  48000
BufferSize:  960
BufferCount: 500
<...>
62494: time=9947.5 ms, offset=478208, buffers=498 => offset_dt=10.65 ms, delta=512 => buffer_dt=21.37 ms
62495: time=9947.7 ms, offset=478208, buffers=498
...
62559: time=9958.0 ms, offset=478208, buffers=498
62560: time=9958.1 ms, offset=478720, buffers=498 => offset_dt=10.62 ms, delta=512
62561: time=9958.3 ms, offset=478720, buffers=498
...
62626: time=9968.6 ms, offset=478720, buffers=498
62627: time=9968.7 ms, offset=479232, buffers=499 => offset_dt=10.61 ms, delta=512 => buffer_dt=21.24 ms
62628: time=9969.0 ms, offset=479232, buffers=499
...
62692: time=9979.3 ms, offset=479232, buffers=499
62693: time=9979.5 ms, offset=479744, buffers=499 => offset_dt=10.76 ms, delta=512
62694: time=9979.7 ms, offset=479744, buffers=499
...
62758: time=9990.0 ms, offset=479744, buffers=499
62759: time=9990.1 ms, offset=0, buffers=500 => offset_dt=10.62 ms, delta=-479744 => buffer_dt=21.39 ms
Statistics: [sampleOffset] | [processedBuffers]
min dt [ms]:  10.36 |  10.45
max dt [ms]:  10.91 |  21.54
avg dt [ms]:  10.65 |  19.98
min delta samples: 512
max delta samples: 512
avg delta samples: 512

Additional context

Tested with a CX31993 USB sound card. Other sound cards show the same issue.
