feat(audio,task): Worker primitive + Processor::setCoreAffinity offload #2357

Closed
zackees wants to merge 1 commit into master from feat/audio-processor-core-affinity-2308

Conversation

@zackees
Member

@zackees zackees commented Apr 19, 2026

Summary

Phase 2 of #2308 option (B) — portable way to offload audio pump work off the core running `FastLED.show()`. Builds on the `CoroutineConfig::core_id` plumbing shipped in #2337.

Two new pieces:

1. `fl::task::Worker` — one-shot functor worker

| Platform | `Worker::run(fn)` behaviour |
| --- | --- |
| ESP32 dual-core (dev/S3/P4) | Runs fn on a FreeRTOS task pinned to `core`. |
| ESP32 single-core (S2/C2/C3/...) | Synchronous call on the calling thread. |
| Non-ESP32 (AVR / ARM / host) | Synchronous call on the calling thread. |

Semaphore handshake: binary semaphore to wake the worker on a pending functor pointer, a second binary semaphore to signal the caller on completion. Caller blocks during run() but the scheduler can service OS/WiFi on the caller's host core in the meantime — so the audio cost lands on the other physical core.

The audio pump functor stays one-shot (no resume, no coroutine loop). The coroutine / FreeRTOS task is the pinning mechanism, nothing more — matching the clarification in the issue thread.
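The two-semaphore handshake described above can be sketched on a host machine with the C++ standard library. This is a minimal analogue under stated assumptions, not the PR's FreeRTOS implementation: `BinarySemaphore` and `SketchWorker` are hypothetical names, `std::thread` stands in for the pinned FreeRTOS task, and no core pinning takes place.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Tiny binary semaphore built from the standard library. It stands in for a
// FreeRTOS binary semaphore; this helper is hypothetical, not FastLED code.
class BinarySemaphore {
public:
    void give() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mAvailable = true;
        }
        mCond.notify_one();
    }
    void take() {
        std::unique_lock<std::mutex> lock(mMutex);
        mCond.wait(lock, [this] { return mAvailable; });
        mAvailable = false;
    }

private:
    std::mutex mMutex;
    std::condition_variable mCond;
    bool mAvailable = false;
};

// Hypothetical host-side analogue of a one-shot functor worker.
class SketchWorker {
public:
    SketchWorker() : mThread([this] { loop(); }) {}
    ~SketchWorker() {
        mStop.store(true);
        mWorkAvailable.give();  // wake the worker so it can observe mStop
        mThread.join();
    }
    SketchWorker(const SketchWorker&) = delete;
    SketchWorker& operator=(const SketchWorker&) = delete;

    // Runs fn on the worker thread and blocks the caller until it returns.
    void run(const std::function<void()>& fn) {
        // Calling run() from the worker itself would deadlock.
        assert(std::this_thread::get_id() != mThread.get_id());
        std::lock_guard<std::mutex> lock(mRunMutex);  // one caller at a time
        mPending = &fn;
        mWorkAvailable.give();  // first semaphore: wake the worker
        mWorkDone.take();       // second semaphore: wait for completion
    }

private:
    void loop() {
        for (;;) {
            mWorkAvailable.take();
            if (mStop.load()) return;
            if (mPending && *mPending) (*mPending)();  // skip empty functors
            mPending = nullptr;
            mWorkDone.give();
        }
    }

    std::mutex mRunMutex;
    BinarySemaphore mWorkAvailable;
    BinarySemaphore mWorkDone;
    const std::function<void()>* mPending = nullptr;
    std::atomic<bool> mStop{false};
    std::thread mThread;  // declared last so every other member is ready first
};
```

Because `run()` blocks until the functor finishes, the caller can safely hand over a pointer to a functor that lives on its own stack, which is what keeps the worker one-shot rather than a long-running loop with its own state.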

2. `Processor::setCoreAffinity(int core)`

Portable opt-in. `0` / `1` / `-1` accepted. Silently a no-op on platforms that can't express the concept so user code compiles unchanged everywhere:

```cpp
auto proc = AudioManager::instance().add(config);
proc->setCoreAffinity(0); // offload audio to core 0 when dual-core
```

Scheduler still drives `every_ms(1)`; only the inner pump body hops to the worker. Threading caveat documented: when offloaded, `onBass` / `onBeat` / `onVibeLevels` callbacks fire on the pinned core, so user code must not call `FastLED.show()` or touch LED arrays from inside those callbacks.
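One way to respect the threading caveat is to have the audio callback only record that something happened, and let the render loop act on it later. The sketch below shows that hand-off with an atomic counter. `BeatMailbox` and `demoHandOff` are hypothetical names, the callback registration itself is not shown because its exact signature is not given here, and `std::thread` stands in for the pinned core.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical hand-off slot shared between an audio callback (producer, on
// the pinned core) and the render loop (consumer, on the FastLED.show() core).
class BeatMailbox {
public:
    // Call from inside the audio callback: records one event, touches no LEDs.
    void post() { mPending.fetch_add(1, std::memory_order_release); }

    // Call from the render loop: returns the events seen since the last drain.
    std::uint32_t drain() {
        return mPending.exchange(0, std::memory_order_acquire);
    }

private:
    std::atomic<std::uint32_t> mPending{0};
};

// Simulates two callbacks firing on another thread, then one render pass.
void demoHandOff() {
    BeatMailbox mailbox;
    std::thread audio([&mailbox] {
        mailbox.post();
        mailbox.post();
    });
    audio.join();
    assert(mailbox.drain() == 2);  // render loop sees both events
    assert(mailbox.drain() == 0);  // and nothing is delivered twice
}
```

In a sketch, the render loop would call `drain()` once per frame and update the LED array itself, so all LED writes and `FastLED.show()` calls stay on one core.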

Autoresearch hooks for hardware validation

Three new RPC handlers in `examples/AutoResearch/AutoResearchRemote.cpp`, all added to `testCoroutineAll`:

  • `testCoroutineCoreAffinity` — spawns a coroutine per core with `core_id=n`, asserts `xPortGetCoreID()` inside the lambda matches the request. Validates phase 1 plumbing.
  • `testWorkerAffinity` — end-to-end check that `fl::task::Worker` actually runs work on the requested core. Reports `isPinned()` and observed core per worker. Single-core path verifies synchronous fallback.
  • `testFftBackendTiming` — times the currently-compiled FFT backend (`kiss_fftr` default; ESP-DSP with `-DFL_FFT_USE_ESP_DSP=1`) against a single-tone sine and reports µs/call + peak bin. Runtime half of the hw-vs-software FFT comparison asked for in the issue. Two autoresearch runs (with and without the define) give the head-to-head.
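The measurement pattern behind `testFftBackendTiming` (feed a single-tone sine, time repeated transforms, report µs/call and the peak bin) can be illustrated on a host machine. This sketch is an assumption-laden stand-in: it uses a naive O(N²) DFT instead of `kiss_fftr` or ESP-DSP, and `dftMagnitudes`, `timeSingleTone`, and `TimingResult` are hypothetical names that do not appear in the PR.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

const double kTwoPi = 6.283185307179586;

struct TimingResult {
    double usPerCall;     // mean wall-clock microseconds per transform
    std::size_t peakBin;  // index of the largest magnitude bin
};

// Naive real-input DFT returning magnitudes for bins 0..n/2.
// Stand-in for the compiled FFT backend; far slower than a real FFT.
std::vector<double> dftMagnitudes(const std::vector<double>& x) {
    const std::size_t n = x.size();
    std::vector<double> mag(n / 2 + 1, 0.0);
    for (std::size_t k = 0; k < mag.size(); ++k) {
        double re = 0.0, im = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            const double angle =
                kTwoPi * static_cast<double>(k * i) / static_cast<double>(n);
            re += x[i] * std::cos(angle);
            im -= x[i] * std::sin(angle);
        }
        mag[k] = std::sqrt(re * re + im * im);
    }
    return mag;
}

// Times `iterations` transforms of an n-sample sine centred on `toneBin`.
TimingResult timeSingleTone(std::size_t n, std::size_t toneBin, int iterations) {
    assert(toneBin > 0 && toneBin < n / 2 && iterations > 0);
    std::vector<double> signal(n);
    for (std::size_t i = 0; i < n; ++i) {
        signal[i] = std::sin(kTwoPi * static_cast<double>(toneBin * i) /
                             static_cast<double>(n));
    }
    std::vector<double> mag;
    const auto start = std::chrono::steady_clock::now();
    for (int it = 0; it < iterations; ++it) {
        mag = dftMagnitudes(signal);
    }
    const auto stop = std::chrono::steady_clock::now();
    const double totalUs =
        std::chrono::duration<double, std::micro>(stop - start).count();

    std::size_t peak = 0;
    for (std::size_t k = 1; k < mag.size(); ++k) {
        if (mag[k] > mag[peak]) peak = k;
    }
    return TimingResult{totalUs / iterations, peak};
}
```

Checking the peak bin alongside the timing guards against a backend that is fast only because it produces the wrong spectrum; the two builds being compared should agree on the bin before their µs/call figures are weighed against each other.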

Test plan — local verification

  • `bash test fl_task_worker --debug` → pass (new Worker unit test)
  • `bash test fl_audio_audio_processor --debug` → pass
  • `bash test fl_audio_audio_reactive --debug` → pass (same test flakes in quick mode, not caused by this change)
  • `bash compile esp32dev --examples Blink` → pass
  • `bash compile esp32s3 --examples Blink` → pass
  • `bash compile esp32c3 --examples Blink` → pass (synchronous fallback)
  • `bash compile esp32s3 --examples AutoResearch --no-filter` → pass
  • `bash lint` → pass
  • Hardware run of `bash autoresearch esp32s3 --coroutine` to get `testCoroutineAll` results including the three new handlers.
  • FPS measurement with audio reactive sketch + setCoreAffinity(0) vs. default, on ESP32-S3.

Refs #2308

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

New Features

  • Added CPU core affinity support for audio processing tasks, enabling workload distribution across processor cores on multicore platforms
  • Introduced a new Worker API for flexible task execution with core pinning capabilities

Tests

  • Added comprehensive test suite validating Worker functionality and core affinity behavior across different platform configurations

Phase 2 of issue #2308 option (B) — give portable user code a way to
move audio pump work off the core that runs FastLED.show(), without
forcing the core-pinning concept into every sketch on platforms where
it can't be expressed.

New: fl::task::Worker

One-shot functor worker. Caller calls worker.run(fn); it runs fn on the
worker and blocks the caller until it returns. Platform behaviour:

  - ESP32 dual-core (FL_HAS_MULTICORE_AFFINITY) — spawns a persistent
    FreeRTOS task pinned to the requested core, wakes it on a binary
    semaphore, runs fn, signals completion on a second semaphore. The
    caller blocks during run() but the scheduler can service OS/WiFi on
    the caller's host core in the meantime, so audio cost lands on the
    other physical core.
  - Single-core ESP32 / AVR / ARM / host — run() is synchronous. Same
    API, same calling code; no-op on pinning. Keeps setCoreAffinity()
    compilable and meaningful everywhere.

Audio integration: Processor::setCoreAffinity(int core)

Portable: accepts 0/1/-1. On dual-core ESP32 creates a pinned Worker
and routes the per-tick pump closure through it. On single-core /
non-ESP32 it's a silent no-op for core==0 and an error for any other
positive core value. The scheduler still drives every_ms(1); only the
inner pump body hops to the worker.

Audio task remains a one-shot functor — we're not running a long
coroutine loop. The coroutine (as FreeRTOS task) is the pinning
mechanism, nothing more, which matches the clarification in issue
#2308 review comments.

Threading caveat is documented on setCoreAffinity: when offloaded,
onBass/onBeat/onVibeLevels callbacks fire on the pinned core, so user
code must not call FastLED.show() or touch LED arrays from inside.

Autoresearch validation

- testCoroutineCoreAffinity (AutoResearchRemote.cpp): spawns coroutines
  with core_id=0..FL_CPU_CORES-1 and asserts xPortGetCoreID() inside
  the lambda matches the request. Validates the phase 1 plumbing.
- testWorkerAffinity: end-to-end check that fl::task::Worker actually
  runs work on the requested core. Reports isPinned() and observed
  core per worker. On single-core platforms verifies synchronous
  fallback runs the functor.
- testFftBackendTiming: times the currently-compiled FFT backend
  (kiss_fftr by default, ESP-DSP with -DFL_FFT_USE_ESP_DSP=1) against
  a single-tone sine, reports us-per-call + peak bin for side-by-side
  comparison across builds. This is the runtime half of the
  hw-vs-software FFT comparison in issue #2308 option (A).
All three are registered in testCoroutineAll so the existing
--coroutine autoresearch mode picks them up.

Test plan verified locally

- bash test fl_task_worker --debug            pass
- bash test fl_audio_audio_processor --debug  pass
- bash test fl_audio_audio_reactive --debug   pass
- bash compile esp32dev  --examples Blink     pass
- bash compile esp32s3   --examples Blink     pass
- bash compile esp32c3   --examples Blink     pass (synchronous fallback)
- bash compile esp32s3   --examples AutoResearch --no-filter  pass
- bash lint                                   pass

Refs #2308

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 19, 2026

Walkthrough

This PR adds CPU core affinity support for audio processor tasks on ESP32 platforms via a new fl::task::Worker class that pins tasks to specific cores. Includes worker implementation with FreeRTOS backend, public API methods for setting core affinity, unit tests, and RPC test handlers validating core pinning behavior.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Worker Task Infrastructure**<br>`src/fl/task/worker.h`, `src/fl/task/worker.cpp.hpp`, `src/fl/task/_build.cpp.hpp` | New `fl::task::Worker` class for pinned one-shot functor execution. Header defines public API with constructor, `run()`, `isPinned()`, and `coreId()`. Implementation uses FreeRTOS binary semaphores and task pinning on ESP32 multicore; falls back to synchronous execution on non-multicore platforms. Includes platform-specific shutdown handling via destructor. |
| **Audio Processor Core Affinity**<br>`src/fl/audio/audio_processor.h`, `src/fl/audio/audio_processor.cpp.hpp` | Added public `setCoreAffinity(int core)` and `getCoreAffinity()` methods to enable CPU pinning of the audio pump task. Refactored pump task creation into `buildAutoPumpTask()` helper to support dynamic worker reconfiguration. Added private state: `mSelfWeak`, `mCoreAffinity`, and `mWorker` unique pointer. Validates core IDs against `FL_CPU_CORES` on ESP32; treats core pinning as no-op on non-ESP32 for portability. |
| **Worker Unit Tests**<br>`tests/fl/task/worker.cpp` | Five test cases validating synchronous execution, re-entrancy, empty functor handling, `isPinned()` return value, and `coreId()` getter across host/ESP32 configurations. |
| **AutoResearch RPC Test Handlers**<br>`examples/AutoResearch/AutoResearchRemote.cpp` | Added three new RPC-exposed test functions: `testCoroutineCoreAffinity` (verifies `CoroutineConfig::core_id` pinning), `testFftBackendTiming` (measures FFT performance over 100 iterations), and `testWorkerAffinity` (validates `Worker::run` core pinning). Updated `testCoroutineAll` batch to include new tests and increased `num_tests` from 8 to 10. |

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant Proc as Processor
    participant Worker as fl::task::Worker
    participant FreeRTOS as FreeRTOS Kernel
    participant Core as Target CPU Core

    App->>Proc: setCoreAffinity(core_id)
    activate Proc
    Proc->>Worker: create Worker(core_id)
    activate Worker
    Worker->>FreeRTOS: xTaskCreatePinnedToCore(...)
    FreeRTOS->>Core: spawn pinned task
    Worker-->>Proc: return Worker instance
    deactivate Worker
    Proc->>Proc: buildAutoPumpTask()
    Proc-->>App: success
    deactivate Proc

    Note over Proc,Core: Audio pump execution

    App->>Proc: audio update (every 1ms)
    activate Proc
    Proc->>Proc: pump_fn: read samples, call update()
    Proc->>Worker: run(pump_fn)
    activate Worker
    Worker->>Worker: acquire mutex
    Worker->>FreeRTOS: xSemaphoreGive(mWorkAvailable)
    Core->>Core: execute pump_fn on pinned core
    FreeRTOS->>FreeRTOS: xSemaphoreGive(mWorkDone)
    Worker->>Worker: release mutex
    Worker-->>Proc: return
    deactivate Worker
    Proc-->>App: complete
    deactivate Proc

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

  • FastLED/FastLED#2308: Directly implements the proposed audio-core offload mechanism and worker core pinning infrastructure referenced in this issue.

Possibly related PRs

  • FastLED/FastLED#2337: Uses the same ESP32 multicore affinity feature flags (FL_CPU_CORES, FL_HAS_MULTICORE_AFFINITY) and implements complementary CoroutineConfig::core_id pinning alongside this PR's Worker core affinity.

Poem

🐰 A worker hops to each core with care,
Tasks pinned tight in the FreeRTOS air,
Audio pumps now dance where you say,
Cross-core affinity leads the way,
Fast and fair on the dual-core display! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 20.83% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the two main additions in the changeset: a new Worker primitive and Processor::setCoreAffinity for offloading audio processing, directly reflecting the primary changes across multiple files. |

@github-actions
Contributor

📊 Header Compilation Performance

Total Time: 786.84ms

Top 5 Slowest Headers

  1. FastLED.h - 599.71ms
  2. fl/fastled.h - 250.72ms
  3. fl/task/executor.h - 197.70ms
  4. crgb.h - 137.96ms
  5. fl/gfx/crgb.h - 136.93ms

Warnings

⚠️ Header FastLED.h exceeds 150ms (599.7ms)
⚠️ Header fl/fastled.h exceeds 150ms (250.7ms)
⚠️ Header fl/task/executor.h exceeds 150ms (197.7ms)
⚠️ Header crgb.h exceeds 50ms (138.0ms)
⚠️ Header fl/gfx/crgb.h exceeds 50ms (136.9ms)
⚠️ Header fl/math/ease.h exceeds 50ms (133.6ms)
⚠️ Header fl/math/fixed_point.h exceeds 50ms (132.2ms)
⚠️ Header fl/task/scheduler.h exceeds 50ms (99.6ms)
⚠️ Header fl/stl/singleton.h exceeds 50ms (92.8ms)
⚠️ Header fl/stl/thread_local.h exceeds 50ms (92.3ms)
⚠️ Header fl/stl/thread.h exceeds 50ms (91.5ms)
⚠️ Header platforms/thread.h exceeds 50ms (91.4ms)
⚠️ Header platforms/stub/thread_stub.h exceeds 50ms (91.3ms)
⚠️ Header platforms/stub/thread_stub_stl.h exceeds 50ms (91.3ms)
⚠️ Header fl/task/promise.h exceeds 50ms (88.7ms)
⚠️ Header fl/ui.h exceeds 50ms (84.6ms)
⚠️ Header fl/stl/json.h exceeds 50ms (52.4ms)
⚠️ Template instantiation time is high (44.9%)

Full Report
================================================================================
FASTLED COMPILATION PERFORMANCE REPORT
================================================================================
Generated: 2026-04-19 10:43:29
Compiler: clang++-17
File: ci/perf/test_compile.cpp

SUMMARY
--------------------------------------------------------------------------------
Total Compilation Time:    1,361.09 ms
Frontend Time:             11.15 ms (0.8%)
Backend Time:              36.20 ms (2.7%)

COMPILATION PHASES
--------------------------------------------------------------------------------
Source                    10792.38 ms (792.9%)
Templates                   348.30 ms ( 25.6%)

FASTLED HEADERS (Level 1 - Direct Includes)
--------------------------------------------------------------------------------
  1172.8 ms - FastLED.h (SLOW)
   684.6 ms - fl/task/executor.h (SLOW)
   475.0 ms - fl/task/scheduler.h (SLOW)
   468.3 ms - fl/stl/singleton.h (SLOW)
   467.8 ms - fl/stl/thread_local.h (SLOW)
   466.8 ms - fl/stl/thread.h (SLOW)
   466.7 ms - platforms/thread.h (SLOW)
   466.7 ms - platforms/stub/thread_stub.h (SLOW)
   466.7 ms - platforms/stub/thread_stub_stl.h (SLOW)
   336.5 ms - fl/fastled.h (SLOW)
   199.8 ms - fl/task/promise.h (SLOW)
   194.2 ms - crgb.h (SLOW)
   193.2 ms - fl/gfx/crgb.h (SLOW)
   189.9 ms - fl/math/ease.h (SLOW)
   188.3 ms - fl/math/fixed_point.h (SLOW)
   167.7 ms - fl/stl/function.h (SLOW)
    95.8 ms - fl/math/fixed_point/s4x12.h (SLOW)
    90.8 ms - fl/stl/vector.h (SLOW)
    83.6 ms - fl/ui.h (SLOW)
    82.6 ms - fl/math/sin32.h (SLOW)

FASTLED HEADERS (Nested Includes - Up to 10 Levels)
--------------------------------------------------------------------------------
FastLED.h (1172.8ms)
  ├─  684.6 ms - fl/task/executor.h
  │ ├─  475.0 ms - fl/task/scheduler.h
  │ │ ├─  468.3 ms - fl/stl/singleton.h
  │ │ │ └─  467.8 ms - fl/stl/thread_local.h
  │ │ │   └─  466.8 ms - fl/stl/thread.h
  │ │ │     └─  466.7 ms - platforms/thread.h
  │ │ │       └─  466.7 ms - platforms/stub/thread_stub.h
  │ │ │         └─  466.7 ms - platforms/stub/thread_stub_stl.h
  │ │ │           └─  452.1 ms - thread
  │ │ │             ├─  220.1 ms - std_thread.h
  │ │ │             └─  210.7 ms - this_thread_sleep.h
  │ │ └─    6.1 ms - fl/task/task.h
  │ │   └─    2.4 ms - fl/trace.h
  │ │     ├─    1.5 ms - fl/stl/chrono.h
  │ │     └─    0.6 ms - fl/stl/tuple.h
  │ ├─  199.8 ms - fl/task/promise.h
  │ │ ├─  167.7 ms - fl/stl/function.h
  │ │ │ ├─   90.8 ms - fl/stl/vector.h
  │ │ │ │ ├─   66.2 ms - fl/math/math.h
  │ │ │ │ │ └─   62.8 ms - fl/stl/undef.h
  │ │ │ │ │   └─  103.7 ms - stdlib.h
  │ │ │ │ │     ├─   52.4 ms - cstdlib
  │ │ │ │ │     │ └─  103.7 ms - stdlib.h
  │ │ │ │ │     │   ├─   52.4 ms - cstdlib
  │ │ │ │ │     │   └─    9.8 ms - types.h
  │ │ │ │ │     └─    9.8 ms - types.h
  │ │ │ │ │       ├─    9.8 ms - types.h
  │ │ │ │ │       │ ├─    9.8 ms - types.h
  │ │ │ │ │       │ ├─    1.9 ms - pthreadtypes.h
  │ │ │ │ │       │ ├─    1.6 ms - select.h
  │ │ │ │ │       │ └─    1.6 ms - endian.h
  │ │ │ │ │       ├─    1.9 ms - pthreadtypes.h
  │ │ │ │ │       │ └─    1.6 ms - thread-shared-types.h
  │ │ │ │ │       ├─    1.6 ms - select.h
  │ │ │ │ │       └─    1.6 ms - endian.h
  │ │ │ │ ├─   11.8 ms - fl/stl/allocator.h
  │ │ │ │ │ └─    8.0 ms - fl/stl/bitset.h
  │ │ │ │ │   └─    5.2 ms - fl/stl/bitset_dynamic.h
  │ │ │ │ │     └─    1.5 ms - fl/stl/unique_ptr.h
  │ │ │ │ ├─    5.7 ms - fl/stl/initializer_list.h
  │ │ │ │ └─    1.8 ms - fl/stl/basic_vector.h
  │ │ │ ├─   61.6 ms - fl/stl/variant.h
  │ │ │ │ └─   60.3 ms - fl/stl/new.h
  │ │ │ │   └─   60.3 ms - platforms/new.h
  │ │ │ │     └─   60.3 ms - platforms/shared/new.h
  │ │ │ │       └─   48.5 ms - new
  │ │ │ │         └─   21.3 ms - c++config.h
  │ │ │ │           └─   10.3 ms - os_defines.h
  │ │ │ ├─    8.4 ms - fl/stl/shared_ptr.h
  │ │ │ │ ├─    4.1 ms - fl/stl/type_traits.h
  │ │ │ │ ├─    1.0 ms - fl/stl/atomic.h
  │ │ │ │ └─    0.5 ms - fl/stl/bit_cast.h
  │ │ │ ├─    2.9 ms - fl/stl/algorithm.h
  │ │ │ │ └─    1.3 ms - fl/math/random.h
  │ │ │ │   └─    0.6 ms - fl/math/random8.h
  │ │ │ └─    0.9 ms - fl/stl/pair.h
  │ │ └─   28.4 ms - fl/stl/string.h
  │ │   ├─    8.5 ms - fl/stl/basic_string.h
  │ │   │ ├─    0.6 ms - fl/stl/not_null.h
  │ │   │ └─    0.6 ms - fl/stl/iterator.h
  │ │   ├─    5.4 ms - fl/stl/span.h
  │ │   │ └─    2.3 ms - fl/math/geometry.h
  │ │   ├─    3.3 ms - fl/stl/string_view.h
  │ │   └─    1.0 ms - fl/stl/optional.h
  │ ├─    8.1 ms - platforms/await.h
  │ │ └─    7.8 ms - platforms/coroutine_runtime.h
  │ │   └─    6.6 ms - fl/system/engine_events.h
  │ │     ├─    3.3 ms - fl/math/screenmap.h
  │ │     │ └─    2.3 ms - fl/stl/flat_map.h
  │ │     └─    2.2 ms - fl/math/xymap.h
  │ │       └─    0.5 ms - fl/math/lut.h
  │ └─    0.5 ms - fl/task/promise_result.h
  ├─  336.5 ms - fl/fastled.h
  │ ├─  194.2 ms - crgb.h
  │ │ ├─  193.2 ms - fl/gfx/crgb.h
  │ │ │ └─  189.9 ms - fl/math/ease.h
  │ │ │   └─  188.3 ms - fl/math/fixed_point.h
  │ │ │     ├─   95.8 ms - fl/math/fixed_point/s4x12.h
  │ │ │     │ ├─   82.6 ms - fl/math/sin32.h
  │ │ │     │ │ └─   80.6 ms - fl/math/simd.h
  │ │ │     │ │   └─   80.2 ms - fl/math/simd/types.h
  │ │ │     │ │     └─   80.0 ms - platforms/simd.h
  │ │ │     │ ├─    0.7 ms - fl/math/fixed_point/traits.h
  │ │ │     │ └─    0.5 ms - fl/math/fixed_point/isqrt.h
  │ │ │     ├─   11.6 ms - fl/math/fixed_point/s16x16.h
  │ │ │     ├─   11.4 ms - fl/math/fixed_point/s8x8.h
  │ │ │     ├─   11.3 ms - fl/math/fixed_point/s8x24.h
  │ │ │     └─   11.2 ms - fl/math/fixed_point/s24x8.h
  │ │ └─    0.6 ms - chsv.h
  │ │   └─    0.6 ms - fl/gfx/hsv.h
  │ ├─   70.2 ms - platforms.h
  │ │ └─   70.2 ms - platforms/stub/fastled_stub.h
  │ │   ├─   44.9 ms - platforms/stub/fastspi_stub.h
  │ │   │ └─   44.8 ms - platforms/stub/fastspi_stub_generic.h
  │ │   │   ├─   40.3 ms - platforms/shared/active_strip_data/active_strip_data.h
  │ │   │   │ └─   38.7 ms - fl/id_tracker.h
  │ │   │   │   └─   38.0 ms - fl/stl/mutex.h
  │ │   │   │     └─   37.9 ms - platforms/mutex.h
  │ │   │   │       └─   37.8 ms - platforms/stub/mutex_stub.h
  │ │   │   └─    4.3 ms - platforms/shared/active_strip_tracker/active_strip_tracker.h
  │ │   └─   25.3 ms - platforms/stub/clockless_stub.h
  │ │     └─   25.2 ms - platforms/stub/clockless_channel_stub.h
  │ │       ├─    9.0 ms - fl/channels/data.h
  │ │       │ └─    6.6 ms - fl/channels/config.h
  │ │       │   ├─    1.4 ms - fl/chipsets/chipset_timing_config.h
  │ │       │   │ └─    0.9 ms - fl/chipsets/led_timing.h
  │ │       │   └─    1.1 ms - fl/chipsets/spi.h
  │ │       ├─    8.6 ms - fl/system/log.h
  │ │       │ └─    8.3 ms - fl/stl/strstream.h
  │ │       ├─    1.7 ms - fl/channels/manager.h
  │ │       ├─    1.1 ms - fl/stl/weak_ptr.h
  │ │       └─    0.8 ms - fl/channels/driver.h
  │ ├─   27.3 ms - colorutils.h
  │ │ └─   26.7 ms - fl/gfx/colorutils.h
  │ │   ├─    5.9 ms - fl/gfx/blur.h
  │ │   │ └─    3.2 ms - fl/gfx/canvas.h
  │ │   │   └─    1.9 ms - fl/math/alpha.h
  │ │   └─    2.3 ms - fl/gfx/fill.h
  │ ├─   15.5 ms - pixel_controller.h
  │ │ ├─   11.1 ms - pixel_iterator.h
  │ │ │ └─   11.0 ms - fl/chipsets/encoders/pixel_iterator.h
  │ │ │   ├─    3.1 ms - fl/chipsets/encoders/apa102.h
  │ │ │   │ └─    2.4 ms - fl/chipsets/encoders/encoder_utils.h
  │ │ │   │   └─    1.9 ms - fl/gfx/gamma_lut.h
  │ │ │   ├─    3.1 ms - fl/chipsets/encoders/pixel_iterator_adapters.h
  │ │ │   ├─    0.6 ms - fl/chipsets/encoders/sk9822.h
  │ │ │   └─    0.5 ms - fl/chipsets/encoders/hd108.h
  │ │ └─    1.8 ms - rgbw.h
  │ │   └─    1.7 ms - fl/gfx/rgbw.h
  │ └─   14.2 ms - pixeltypes.h
  │   ├─   11.6 ms - lib8tion.h
  │   │ ├─    5.8 ms - fl/math/math8.h
  │   │ │ ├─    3.0 ms - fl/math/intmap.h
  │   │ │ │ └─    2.3 ms - platforms/intmap.h
  │   │ │ ├─    1.5 ms - fl/math/scale8.h
  │   │ │ │ └─    1.5 ms - platforms/scale8.h
  │   │ │ │   └─    0.6 ms - platforms/shared/scale8.h
  │   │ │ └─    1.2 ms - platforms/math8.h
  │   │ │   └─    0.9 ms - platforms/shared/math8.h
  │   │ └─    1.2 ms - fl/math/beat.h
  │   │   └─    0.6 ms - fl/math/trig8.h
  │   │     └─    0.6 ms - platforms/trig8.h
  │   │       └─    0.5 ms - platforms/shared/trig8.h
  │   └─    2.3 ms - crgb.hpp
  ├─   83.6 ms - fl/ui.h
  │ ├─   51.6 ms - fl/stl/json.h
  │ │ └─   40.6 ms - fl/stl/json/types.h
  │ │   └─    4.8 ms - fl/stl/limits.h
  │ ├─    7.7 ms - fl/ui_impl.h
  │ │ └─    7.3 ms - platforms/ui_defs.h
  │ │   ├─    1.6 ms - platforms/shared/ui/json/audio.h
  │ │   │ └─    0.8 ms - platforms/shared/ui/json/audio_internal.h
  │ │   ├─    1.2 ms - platforms/shared/ui/json/button.h
  │ │   │ └─    0.5 ms - platforms/shared/ui/json/ui_internal.h
  │ │   ├─    0.7 ms - platforms/shared/ui/json/checkbox.h
  │ │   ├─    0.6 ms - platforms/shared/ui/json/help.h
  │ │   └─    0.6 ms - platforms/shared/ui/json/number_field.h
  │ └─    6.1 ms - fl/asset.h
  │   └─    5.4 ms - fl/stl/url.h
  ├─   19.1 ms - chipsets.h
  │ ├─    7.8 ms - fl/chipsets/ucs7604.h
  │ │ └─    5.8 ms - fl/chipsets/encoders/ucs7604.h
  │ ├─    4.6 ms - fl/chipsets/apa102.h
  │ ├─    1.2 ms - fl/chipsets/lpd880x.h
  │ ├─    0.6 ms - fl/chipsets/p9813.h
  │ └─    0.6 ms - fl/chipsets/encoders/ws2816.h
  └─   16.5 ms - fl/audio/audio_processor.h
    ├─    4.0 ms - fl/audio/audio_context.h
    │ └─    1.1 ms - fl/audio/fft/fft.h
    ├─    1.5 ms - fl/audio/detector/vibe.h
    └─    0.7 ms - fl/audio/signal_conditioner.h
fl/task/executor.h (684.6ms)
  ├─  475.0 ms - fl/task/scheduler.h
  │ ├─  468.3 ms - fl/stl/singleton.h
  │ │ └─  467.8 ms - fl/stl/thread_local.h
  │ │   └─  466.8 ms - fl/stl/thread.h
  │ │     └─  466.7 ms - platforms/thread.h
  │ │       └─  466.7 ms - platforms/stub/thread_stub.h
  │ │         └─  466.7 ms - platforms/stub/thread_stub_stl.h
  │ │           └─  452.1 ms - thread
  │ │             ├─  220.1 ms - std_thread.h
  │ │             │ ├─   59.1 ms - gthr.h
  │ │             │ ├─   54.2 ms - tuple
  │ │             │ ├─   34.3 ms - iosfwd
  │ │             │ ├─   20.5 ms - refwrap.h
  │ │             │ └─    4.0 ms - unique_ptr.h
  │ │             └─  210.7 ms - this_thread_sleep.h
  │ │               ├─   82.3 ms - chrono.h
  │ │               └─   45.1 ms - cerrno
  │ └─    6.1 ms - fl/task/task.h
  │   └─    2.4 ms - fl/trace.h
  │     ├─    1.5 ms - fl/stl/chrono.h
  │     └─    0.6 ms - fl/stl/tuple.h
  ├─  199.8 ms - fl/task/promise.h
  │ ├─  167.7 ms - fl/stl/function.h
  │ │ ├─   90.8 ms - fl/stl/vector.h
  │ │ │ ├─   66.2 ms - fl/math/math.h
  │ │ │ │ └─   62.8 ms - fl/stl/undef.h
  │ │ │ │   └─  103.7 ms - stdlib.h
  │ │ │ │     ├─   52.4 ms - cstdlib
  │ │ │ │     │ └─  103.7 ms - stdlib.h
  │ │ │ │     │   ├─   52.4 ms - cstdlib
  │ │ │ │     │   │ └─  103.7 ms - stdlib.h
  │ │ │ │     │   └─    9.8 ms - types.h
  │ │ │ │     │     ├─    9.8 ms - types.h
  │ │ │ │     │     ├─    1.9 ms - pthreadtypes.h
  │ │ │ │     │     ├─    1.6 ms - select.h
  │ │ │ │     │     └─    1.6 ms - endian.h
  │ │ │ │     └─    9.8 ms - types.h
  │ │ │ │       ├─    9.8 ms - types.h
  │ │ │ │       │ ├─    9.8 ms - types.h
  │ │ │ │       │ │ ├─    9.8 ms - types.h
  │ │ │ │       │ │ ├─    1.9 ms - pthreadtypes.h
  │ │ │ │       │ │ ├─    1.6 ms - select.h
  │ │ │ │       │ │ └─    1.6 ms - endian.h
  │ │ │ │       │ ├─    1.9 ms - pthreadtypes.h
  │ │ │ │       │ │ └─    1.6 ms - thread-shared-types.h
  │ │ │ │       │ ├─    1.6 ms - select.h
  │ │ │ │       │ └─    1.6 ms - endian.h
  │ │ │ │       ├─    1.9 ms - pthreadtypes.h
  │ │ │ │       │ └─    1.6 ms - thread-shared-types.h
  │ │ │ │       ├─    1.6 ms - select.h
  │ │ │ │       └─    1.6 ms - endian.h
  │ │ │ ├─   11.8 ms - fl/stl/allocator.h
  │ │ │ │ └─    8.0 ms - fl/stl/bitset.h
  │ │ │ │   └─    5.2 ms - fl/stl/bitset_dynamic.h
  │ │ │ │     └─    1.5 ms - fl/stl/unique_ptr.h
  │ │ │ ├─    5.7 ms - fl/stl/initializer_list.h
  │ │ │ └─    1.8 ms - fl/stl/basic_vector.h
  │ │ ├─   61.6 ms - fl/stl/variant.h
  │ │ │ └─   60.3 ms - fl/stl/new.h
  │ │ │   └─   60.3 ms - platforms/new.h
  │ │ │     └─   60.3 ms - platforms/shared/new.h
  │ │ │       └─   48.5 ms - new
  │ │ │         └─   21.3 ms - c++config.h
  │ │ │           └─   10.3 ms - os_defines.h
  │ │ │             └─    9.9 ms - features.h
  │ │ ├─    8.4 ms - fl/stl/shared_ptr.h
  │ │ │ ├─    4.1 ms - fl/stl/type_traits.h
  │ │ │ ├─    1.0 ms - fl/stl/atomic.h
  │ │ │ └─    0.5 ms - fl/stl/bit_cast.h
  │ │ ├─    2.9 ms - fl/stl/algorithm.h
  │ │ │ └─    1.3 ms - fl/math/random.h
  │ │ │   └─    0.6 ms - fl/math/random8.h
  │ │ └─    0.9 ms - fl/stl/pair.h
  │ └─   28.4 ms - fl/stl/string.h
  │   ├─    8.5 ms - fl/stl/basic_string.h
  │   │ ├─    0.6 ms - fl/stl/not_null.h
  │   │ └─    0.6 ms - fl/stl/iterator.h
  │   ├─    5.4 ms - fl/stl/span.h
  │   │ └─    2.3 ms - fl/math/geometry.h
  │   ├─    3.3 ms - fl/stl/string_view.h
  │   └─    1.0 ms - fl/stl/optional.h
  ├─    8.1 ms - platforms/await.h
  │ └─    7.8 ms - platforms/coroutine_runtime.h
  │   └─    6.6 ms - fl/system/engine_events.h
  │     ├─    3.3 ms - fl/math/screenmap.h
  │     │ └─    2.3 ms - fl/stl/flat_map.h
  │     └─    2.2 ms - fl/math/xymap.h
  │       └─    0.5 ms - fl/math/lut.h
  └─    0.5 ms - fl/task/promise_result.h
fl/fastled.h (336.5ms)
  ├─  194.2 ms - crgb.h
  │ ├─  193.2 ms - fl/gfx/crgb.h
  │ │ └─  189.9 ms - fl/math/ease.h
  │ │   └─  188.3 ms - fl/math/fixed_point.h
  │ │     ├─   95.8 ms - fl/math/fixed_point/s4x12.h
  │ │     │ ├─   82.6 ms - fl/math/sin32.h
  │ │     │ │ └─   80.6 ms - fl/math/simd.h
  │ │     │ │   └─   80.2 ms - fl/math/simd/types.h
  │ │     │ │     └─   80.0 ms - platforms/simd.h
  │ │     │ │       └─   79.9 ms - platforms/shared/simd_x86.hpp
  │ │     │ ├─    0.7 ms - fl/math/fixed_point/traits.h
  │ │     │ └─    0.5 ms - fl/math/fixed_point/isqrt.h
  │ │     ├─   11.6 ms - fl/math/fixed_point/s16x16.h
  │ │     ├─   11.4 ms - fl/math/fixed_point/s8x8.h
  │ │     ├─   11.3 ms - fl/math/fixed_point/s8x24.h
  │ │     └─   11.2 ms - fl/math/fixed_point/s24x8.h
  │ └─    0.6 ms - chsv.h
  │   └─    0.6 ms - fl/gfx/hsv.h
  ├─   70.2 ms - platforms.h
  │ └─   70.2 ms - platforms/stub/fastled_stub.h
  │   ├─   44.9 ms - platforms/stub/fastspi_stub.h
  │   │ └─   44.8 ms - platforms/stub/fastspi_stub_generic.h
  │   │   ├─   40.3 ms - platforms/shared/active_strip_data/active_strip_data.h
  │   │   │ └─   38.7 ms - fl/id_tracker.h
  │   │   │   └─   38.0 ms - fl/stl/mutex.h
  │   │   │     └─   37.9 ms - platforms/mutex.h
  │   │   │       └─   37.8 ms - platforms/stub/mutex_stub.h
  │   │   │         └─   37.7 ms - platforms/stub/mutex_stub_stl.h
  │   │   └─    4.3 ms - platforms/shared/active_strip_tracker/active_strip_tracker.h
  │   └─   25.3 ms - platforms/stub/clockless_stub.h
  │     └─   25.2 ms - platforms/stub/clockless_channel_stub.h
  │       ├─    9.0 ms - fl/channels/data.h
  │       │ └─    6.6 ms - fl/channels/config.h
  │       │   ├─    1.4 ms - fl/chipsets/chipset_timing_config.h
  │       │   │ └─    0.9 ms - fl/chipsets/led_timing.h
  │       │   └─    1.1 ms - fl/chipsets/spi.h
  │       ├─    8.6 ms - fl/system/log.h
  │       │ └─    8.3 ms - fl/stl/strstream.h
  │       ├─    1.7 ms - fl/channels/manager.h
  │       ├─    1.1 ms - fl/stl/weak_ptr.h
  │       └─    0.8 ms - fl/channels/driver.h
  ├─   27.3 ms - colorutils.h
  │ └─   26.7 ms - fl/gfx/colorutils.h
  │   ├─    5.9 ms - fl/gfx/blur.h
  │   │ └─    3.2 ms - fl/gfx/canvas.h
  │   │   └─    1.9 ms - fl/math/alpha.h
  │   └─    2.3 ms - fl/gfx/fill.h
  ├─   15.5 ms - pixel_controller.h
  │ ├─   11.1 ms - pixel_iterator.h
  │ │ └─   11.0 ms - fl/chipsets/encoders/pixel_iterator.h
  │ │   ├─    3.1 ms - fl/chipsets/encoders/apa102.h
  │ │   │ └─    2.4 ms - fl/chipsets/encoders/encoder_utils.h
  │ │   │   └─    1.9 ms - fl/gfx/gamma_lut.h
  │ │   ├─    3.1 ms - fl/chipsets/encoders/pixel_iterator_adapters.h
  │ │   ├─    0.6 ms - fl/chipsets/encoders/sk9822.h
  │ │   └─    0.5 ms - fl/chipsets/encoders/hd108.h
  │ └─    1.8 ms - rgbw.h
  │   └─    1.7 ms - fl/gfx/rgbw.h
  └─   14.2 ms - pixeltypes.h
    ├─   11.6 ms - lib8tion.h
    │ ├─    5.8 ms - fl/math/math8.h
    │ │ ├─    3.0 ms - fl/math/intmap.h
    │ │ │ └─    2.3 ms - platforms/intmap.h
    │ │ ├─    1.5 ms - fl/math/scale8.h
    │ │ │ └─    1.5 ms - platforms/scale8.h
    │ │ │   └─    0.6 ms - platforms/shared/scale8.h
    │ │ └─    1.2 ms - platforms/math8.h
    │ │   └─    0.9 ms - platforms/shared/math8.h
    │ └─    1.2 ms - fl/math/beat.h
    │   └─    0.6 ms - fl/math/trig8.h
    │     └─    0.6 ms - platforms/trig8.h
    │       └─    0.5 ms - platforms/shared/trig8.h
    └─    2.3 ms - crgb.hpp
fl/ui.h (83.6ms)
  ├─   51.6 ms - fl/stl/json.h
  │ └─   40.6 ms - fl/stl/json/types.h
  │   └─    4.8 ms - fl/stl/limits.h
  ├─    7.7 ms - fl/ui_impl.h
  │ └─    7.3 ms - platforms/ui_defs.h
  │   ├─    1.6 ms - platforms/shared/ui/json/audio.h
  │   │ └─    0.8 ms - platforms/shared/ui/json/audio_internal.h
  │   ├─    1.2 ms - platforms/shared/ui/json/button.h
  │   │ └─    0.5 ms - platforms/shared/ui/json/ui_internal.h
  │   ├─    0.7 ms - platforms/shared/ui/json/checkbox.h
  │   ├─    0.6 ms - platforms/shared/ui/json/help.h
  │   └─    0.6 ms - platforms/shared/ui/json/number_field.h
  └─    6.1 ms - fl/asset.h
    └─    5.4 ms - fl/stl/url.h
chipsets.h (19.1ms)
  ├─    7.8 ms - fl/chipsets/ucs7604.h
  │ └─    5.8 ms - fl/chipsets/encoders/ucs7604.h
  ├─    4.6 ms - fl/chipsets/apa102.h
  ├─    1.2 ms - fl/chipsets/lpd880x.h
  ├─    0.6 ms - fl/chipsets/p9813.h
  └─    0.6 ms - fl/chipsets/encoders/ws2816.h

TOP 20 SLOWEST OPERATIONS
--------------------------------------------------------------------------------
  1.  1172.8 ms - Source: FastLED.h
  2.  1172.8 ms - Total Source: 
  3.   684.6 ms - Source: fl/task/executor.h
  4.   475.0 ms - Source: fl/task/scheduler.h
  5.   468.3 ms - Source: fl/stl/singleton.h
  6.   467.8 ms - Source: fl/stl/thread_local.h
  7.   466.8 ms - Source: fl/stl/thread.h
  8.   466.7 ms - Source: platforms/thread.h
  9.   466.7 ms - Source: platforms/stub/thread_stub.h
 10.   466.7 ms - Source: platforms/stub/thread_stub_stl.h
 11.   452.1 ms - Source: thread
 12.   423.5 ms - Total ParseClass: 
 13.   336.5 ms - Source: fl/fastled.h
 14.   220.1 ms - Source: std_thread.h
 15.   210.7 ms - Source: this_thread_sleep.h
 16.   199.8 ms - Source: fl/task/promise.h
 17.   194.2 ms - Source: crgb.h
 18.   193.2 ms - Source: fl/gfx/crgb.h
 19.   189.9 ms - Source: fl/math/ease.h
 20.   188.3 ms - Source: fl/math/fixed_point.h

PERFORMANCE FLAGS
--------------------------------------------------------------------------------
⚠️  SLOW HEADERS (>50.0ms):
    - FastLED.h (1172.8ms)
    - fl/task/executor.h (684.6ms)
    - fl/task/scheduler.h (475.0ms)
    - fl/stl/singleton.h (468.3ms)
    - fl/stl/thread_local.h (467.8ms)
    - fl/stl/thread.h (466.8ms)
    - platforms/thread.h (466.7ms)
    - platforms/stub/thread_stub.h (466.7ms)
    - platforms/stub/thread_stub_stl.h (466.7ms)
    - fl/fastled.h (336.5ms)

⚠️  Template instantiation time is high (25.6% of total)
✓  Lexing time is acceptable (<10% of total)

RECOMMENDATIONS
--------------------------------------------------------------------------------
1. Consider optimizing FastLED.h (1172.8ms)
2. Consider optimizing fl/task/executor.h (684.6ms)
3. Consider optimizing fl/task/scheduler.h (475.0ms)

================================================================================

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/AutoResearch/AutoResearchRemote.cpp`:
- Around line 2357-2530: The help output currently omits the newly-bound RPCs;
update the help handler (the mRemote->bind("help", ...) response) to include
entries for testCoroutineCoreAffinity, testFftBackendTiming, and
testWorkerAffinity with short descriptions and indicate any platform-specific
skips (e.g., ESP32/multicore conditions) so operators can discover these RPCs;
ensure the exact symbol names match the bound handlers
testCoroutineCoreAffinity, testFftBackendTiming, and testWorkerAffinity when
adding them to the help JSON/listing.

In `@src/fl/task/worker.cpp.hpp`:
- Around line 60-72: Worker objects created with core == -1 or invalid cores are
being reported as pinned because mImpl/mTask is non-null even when affinity ==
tskNO_AFFINITY; update the WorkerImpl construction logic (around
WorkerImpl::taskEntry / the code that calls xTaskCreatePinnedToCore in Worker or
WorkerImpl) to record actual pinning state and make isPinned() check that
recorded flag (or affinity != tskNO_AFFINITY) rather than just mImpl != nullptr;
only set the "pinned" flag (or mark the instance as pinned) when affinity !=
tskNO_AFFINITY and the task creation succeeds, and ensure the alternative path
for unpinned creation leaves the pinned flag false so isPinned() returns correct
result.

In `@src/fl/task/worker.h`:
- Around line 83-86: Replace the owning raw pointer mImpl with
fl::unique_ptr<WorkerImpl> to ensure RAII ownership: change the declaration of
mImpl in class Worker to fl::unique_ptr<WorkerImpl>, update construction sites
in worker.cpp.hpp to use fl::make_unique<WorkerImpl>(...) instead of new, remove
manual delete calls and replace any explicit reset logic with mImpl.reset() or
rely on destructor, and keep the Worker destructor definition out-of-line (after
WorkerImpl is fully defined) so unique_ptr can be destructed correctly; update
any usage that assumed raw pointer (e.g., mImpl->...) to work unchanged since
unique_ptr supports operator->.

In `@tests/fl/task/worker.cpp`:
- Around line 40-46: The test unconditionally asserts Worker(0).isPinned() is
false which breaks on dual-core ESP32 where Worker may create a pinned task;
guard the assertion so it only runs for host/non-multicore builds. Modify the
FL_TEST_CASE for fl::task::Worker to wrap the FL_CHECK_FALSE(worker.isPinned())
with a preprocessor guard (e.g. `#ifndef` FL_HAS_MULTICORE_AFFINITY / `#endif`) or
an equivalent build-time check so the assertion is compiled only when
FL_HAS_MULTICORE_AFFINITY is not defined.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0158bec8-6229-415d-9fa9-8fe58557b18a

📥 Commits

Reviewing files that changed from the base of the PR and between a843234 and b5f284d.

📒 Files selected for processing (7)
  • examples/AutoResearch/AutoResearchRemote.cpp
  • src/fl/audio/audio_processor.cpp.hpp
  • src/fl/audio/audio_processor.h
  • src/fl/task/_build.cpp.hpp
  • src/fl/task/worker.cpp.hpp
  • src/fl/task/worker.h
  • tests/fl/task/worker.cpp

Comment on lines +2357 to +2530
// Test: CoroutineConfig::core_id actually pins the FreeRTOS task.
// Validates the plumbing shipped in issue #2308 phase 1 (#2337) and used
// by Processor::setCoreAffinity() in phase 2. ESP32-only because
// uxTaskGetCoreID() and multi-core affinity are FreeRTOS-SMP features.
mRemote->bind("testCoroutineCoreAffinity", [](const fl::json& args) -> fl::json {
    (void)args;
    fl::json r = fl::json::object();

#if defined(FL_IS_ESP32) && defined(FL_HAS_MULTICORE_AFFINITY) && FL_HAS_MULTICORE_AFFINITY
    // Drive one test per valid core; report the observed core for each.
    fl::json per_core = fl::json::array();
    bool all_passed = true;
    for (int requested = 0; requested < FL_CPU_CORES; ++requested) {
        fl::atomic<int> observed(-1);
        fl::atomic<bool> task_ran(false);
        fl::task::CoroutineConfig cfg;
        cfg.func = [&observed, &task_ran]() {
            observed.store(static_cast<int>(xPortGetCoreID()));
            task_ran.store(true);
        };
        cfg.name = "affinity_probe";
        cfg.core_id = requested;
        auto t = fl::task::coroutine(cfg);

        uint32_t start = millis();
        while (!task_ran.load() && (millis() - start) < 2000) {
            delay(10);
        }

        fl::json entry = fl::json::object();
        entry.set("requested", static_cast<int64_t>(requested));
        entry.set("observed", static_cast<int64_t>(observed.load()));
        bool ok = task_ran.load() && (observed.load() == requested);
        entry.set("ok", ok);
        per_core.push_back(entry);
        if (!ok) {
            all_passed = false;
        }
    }
    r.set("success", all_passed);
    r.set("cores", per_core);
    r.set("cpuCores", static_cast<int64_t>(FL_CPU_CORES));
#else
    // Single-core platform (ESP32-S2/C2/C3/C5/C6/H2) or non-ESP32.
    // Report skip as a pass so the suite doesn't red on these variants.
    r.set("success", true);
    r.set("skipped", true);
    r.set("reason", "FL_HAS_MULTICORE_AFFINITY is 0 on this platform");
#endif
    return r;
});

// Test: FFT backend timing — hardware (ESP-DSP) vs software (kiss_fftr).
// Runs the currently-compiled backend against a known single-tone sine
// signal, reports µs-per-call and peak bin magnitudes so autoresearch
// can compare two runs (one with -DFL_FFT_USE_ESP_DSP=1, one without).
// The comparison lives in the post-processor, not here — this handler
// only measures ONE backend per build.
mRemote->bind("testFftBackendTiming", [](const fl::json& args) -> fl::json {
    (void)args;
    fl::json r = fl::json::object();

    constexpr int N = 512;
    constexpr int SAMPLE_RATE = 44100;
    constexpr int TONE_HZ = 1000;
    constexpr int ITERATIONS = 100;

    // Generate a single-tone sine signal at TONE_HZ, 50 % full-scale.
    fl::vector<fl::i16> samples(N);
    for (int i = 0; i < N; ++i) {
        float phase = 2.0f * 3.14159265358979323846f *
                      static_cast<float>(TONE_HZ) *
                      static_cast<float>(i) /
                      static_cast<float>(SAMPLE_RATE);
        samples[i] = static_cast<fl::i16>(16000.0f * sinf(phase));
    }

    fl::audio::fft::FFT fft;
    fl::audio::fft::Args fft_args;
    fft_args.samples = N;
    fft_args.bands = 16;
    fft_args.sample_rate = SAMPLE_RATE;

    fl::audio::fft::Bins bins(fft_args.bands);

    // Warm-up: prime the FFT kernel cache so the first measured call
    // doesn't pay for ImplCache allocation.
    fft.run(fl::span<const fl::i16>(samples.data(), N), &bins, fft_args);

    uint32_t t_start = micros();
    for (int iter = 0; iter < ITERATIONS; ++iter) {
        fft.run(fl::span<const fl::i16>(samples.data(), N), &bins, fft_args);
    }
    uint32_t t_end = micros();

    float total_us = static_cast<float>(t_end - t_start);
    float us_per_call = total_us / static_cast<float>(ITERATIONS);

    // Find the peak CQ bin and its neighbours for a sanity signal.
    fl::span<const float> raw_bins = bins.raw();
    int peak_bin = 0;
    float peak_mag = 0.0f;
    for (int b = 0; b < static_cast<int>(raw_bins.size()); ++b) {
        if (raw_bins[b] > peak_mag) {
            peak_mag = raw_bins[b];
            peak_bin = b;
        }
    }

    r.set("success", us_per_call > 0.0f);
    r.set("usPerCall", static_cast<double>(us_per_call));
    r.set("iterations", static_cast<int64_t>(ITERATIONS));
    r.set("samples", static_cast<int64_t>(N));
    r.set("bands", static_cast<int64_t>(fft_args.bands));
    r.set("sampleRate", static_cast<int64_t>(SAMPLE_RATE));
    r.set("toneHz", static_cast<int64_t>(TONE_HZ));
    r.set("peakBin", static_cast<int64_t>(peak_bin));
    r.set("peakMagnitude", static_cast<double>(peak_mag));
#if FL_FFT_ESP_DSP_ACTIVE
    r.set("backend", "esp-dsp");
#else
    r.set("backend", "kiss_fftr");
#endif
    return r;
});

// Test: fl::task::Worker end-to-end.
// Validates that Worker::run(fn) executes fn on the requested core (on
// dual-core ESP32) or synchronously (everywhere else). Report observed
// core per core_id so the autoresearch post-processor can confirm
// phase 2 of #2308 actually offloads across cores.
mRemote->bind("testWorkerAffinity", [](const fl::json& args) -> fl::json {
    (void)args;
    fl::json r = fl::json::object();

#if defined(FL_IS_ESP32) && defined(FL_HAS_MULTICORE_AFFINITY) && FL_HAS_MULTICORE_AFFINITY
    fl::json per_core = fl::json::array();
    bool all_passed = true;
    for (int requested = 0; requested < FL_CPU_CORES; ++requested) {
        fl::task::Worker worker(requested, "ar_probe");
        fl::atomic<int> observed(-1);
        fl::atomic<bool> ran(false);
        worker.run([&observed, &ran]() {
            observed.store(static_cast<int>(xPortGetCoreID()));
            ran.store(true);
        });

        fl::json entry = fl::json::object();
        entry.set("requested", static_cast<int64_t>(requested));
        entry.set("observed", static_cast<int64_t>(observed.load()));
        entry.set("isPinned", worker.isPinned());
        bool ok = ran.load() && (observed.load() == requested);
        entry.set("ok", ok);
        per_core.push_back(entry);
        if (!ok) {
            all_passed = false;
        }
    }
    r.set("success", all_passed);
    r.set("cores", per_core);
    r.set("cpuCores", static_cast<int64_t>(FL_CPU_CORES));
#else
    // Single-core / non-ESP32: Worker::run is synchronous. Verify that
    // the fallback actually executes the functor.
    fl::task::Worker worker(0);
    fl::atomic<bool> ran(false);
    worker.run([&ran]() { ran.store(true); });
    r.set("success", ran.load());
    r.set("skipped", true);
    r.set("reason", "FL_HAS_MULTICORE_AFFINITY is 0 — verified synchronous fallback");
    r.set("isPinned", worker.isPinned());
#endif
    return r;
});
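
For readers without the PR summary at hand: the worker path exercised by `testWorkerAffinity` is described there as a two-semaphore handshake (one signal to wake the worker with a pending functor, one to tell the blocked caller it finished). The following is only a host-side sketch of that idea, not the code in `worker.cpp.hpp`: `HandshakeWorker` is a hypothetical name, and a `std::thread` with a mutex and two condition variables stands in for the pinned FreeRTOS task and its binary semaphores.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical host-side stand-in for a one-shot functor worker.
class HandshakeWorker {
  public:
    HandshakeWorker() : mThread([this] { loop(); }) {}

    ~HandshakeWorker() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mQuit = true;
        }
        mWake.notify_one();
        mThread.join();
    }

    // Hands fn to the worker thread and blocks until it has finished.
    void run(std::function<void()> fn) {
        assert(fn);  // an empty functor would never complete the handshake
        std::unique_lock<std::mutex> lock(mMutex);
        mPending = std::move(fn);
        mDone = false;
        mWake.notify_one();                              // "wake" signal
        mFinished.wait(lock, [this] { return mDone; });  // "done" signal
    }

  private:
    void loop() {
        std::unique_lock<std::mutex> lock(mMutex);
        for (;;) {
            mWake.wait(lock, [this] {
                return mQuit || static_cast<bool>(mPending);
            });
            if (mQuit) {
                return;
            }
            std::function<void()> fn = std::move(mPending);
            mPending = nullptr;
            lock.unlock();
            fn();  // the functor runs outside the lock, on the worker thread
            lock.lock();
            mDone = true;
            mFinished.notify_one();
        }
    }

    std::mutex mMutex;
    std::condition_variable mWake;
    std::condition_variable mFinished;
    std::function<void()> mPending;
    bool mDone = false;
    bool mQuit = false;
    std::thread mThread;  // declared last so the members above exist first
};
```

Calling `worker.run([&] { ... })` returns only after the lambda has executed on the worker thread, so captured locals can be read safely afterwards, which is the same property the handler above relies on when it reads `observed` and `ran` right after `run()`.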

⚠️ Potential issue | 🟡 Minor

Add the new RPCs to help.

testCoroutineCoreAffinity, testFftBackendTiming, and testWorkerAffinity are now bound, but the custom help response still omits them, so operators using the sketch’s help output won’t discover the new autoresearch hooks.

Proposed fix outline
+        fl::json testCoroutineCoreAffinity_fn = fl::json::object();
+        testCoroutineCoreAffinity_fn.set("name", "testCoroutineCoreAffinity");
+        testCoroutineCoreAffinity_fn.set("phase", "Phase 4: Utility");
+        testCoroutineCoreAffinity_fn.set("args", "[]");
+        testCoroutineCoreAffinity_fn.set("returns", "{success, skipped?, cores?, cpuCores?}");
+        testCoroutineCoreAffinity_fn.set("description", "Validate CoroutineConfig core affinity plumbing");
+        functions.push_back(testCoroutineCoreAffinity_fn);
+
+        fl::json testFftBackendTiming_fn = fl::json::object();
+        testFftBackendTiming_fn.set("name", "testFftBackendTiming");
+        testFftBackendTiming_fn.set("phase", "Phase 4: Utility");
+        testFftBackendTiming_fn.set("args", "[]");
+        testFftBackendTiming_fn.set("returns", "{success, usPerCall, backend, peakBin, peakMagnitude}");
+        testFftBackendTiming_fn.set("description", "Measure the active FFT backend timing");
+        functions.push_back(testFftBackendTiming_fn);
+
+        fl::json testWorkerAffinity_fn = fl::json::object();
+        testWorkerAffinity_fn.set("name", "testWorkerAffinity");
+        testWorkerAffinity_fn.set("phase", "Phase 4: Utility");
+        testWorkerAffinity_fn.set("args", "[]");
+        testWorkerAffinity_fn.set("returns", "{success, skipped?, cores?, cpuCores?, isPinned?}");
+        testWorkerAffinity_fn.set("description", "Validate Worker core pinning or synchronous fallback");
+        functions.push_back(testWorkerAffinity_fn);
🧰 Tools
🪛 GitHub Actions: linux_native

[error] 2431-2431: C++ compilation failed: use of undeclared identifier 'sinf'; did you mean 'sin8'? (examples/AutoResearch/AutoResearchRemote.cpp:2431:58)
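
The compile error is an undeclared `sinf`, which on a host toolchain is normally declared by `<cmath>` / `<math.h>`; whether FastLED prefers its own math wrapper here is not shown in this review, so treat the include choice as an assumption. As a standalone sketch, the same 1 kHz test tone can be generated with `std::sin`, and a naive DFT gives the expected linear peak bin for cross-checking (a linear DFT bin, not the 16-band output the handler reports). `makeTone` and `peakLinearBin` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>    // declares std::sin / std::cos
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: n samples of a sine at toneHz, about 50% of int16 full scale.
std::vector<std::int16_t> makeTone(int n, int sampleRate, int toneHz) {
    assert(n > 0 && sampleRate > 0);
    const double kTwoPi = 6.283185307179586;
    std::vector<std::int16_t> out(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        double phase = kTwoPi * toneHz * i / sampleRate;
        out[static_cast<std::size_t>(i)] =
            static_cast<std::int16_t>(16000.0 * std::sin(phase));
    }
    return out;
}

// Naive O(N^2) DFT; returns the index of the strongest bin in [1, N/2).
int peakLinearBin(const std::vector<std::int16_t>& samples) {
    const int n = static_cast<int>(samples.size());
    assert(n >= 4);
    const double kTwoPi = 6.283185307179586;
    int bestBin = 1;
    double bestPower = -1.0;
    for (int k = 1; k < n / 2; ++k) {
        double re = 0.0;
        double im = 0.0;
        for (int i = 0; i < n; ++i) {
            double angle = kTwoPi * k * i / n;
            double s = samples[static_cast<std::size_t>(i)];
            re += s * std::cos(angle);
            im -= s * std::sin(angle);
        }
        double power = re * re + im * im;
        if (power > bestPower) {
            bestPower = power;
            bestBin = k;
        }
    }
    return bestBin;
}
```

With N = 512 at 44100 Hz the tone sits at 1000 · 512 / 44100 ≈ 11.6 bins, so `peakLinearBin(makeTone(512, 44100, 1000))` lands on the nearest integer bin.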

Comment on lines +60 to +72
        BaseType_t affinity = (core >= 0 && core < FL_CPU_CORES)
            ? static_cast<BaseType_t>(core)
            : tskNO_AFFINITY;
        // Slightly elevated priority so the worker preempts idle work but
        // doesn't starve user tasks on the same core.
        BaseType_t rc = xTaskCreatePinnedToCore(
            &WorkerImpl::taskEntry,
            name,
            static_cast<configSTACK_DEPTH_TYPE>(stack_size),
            this,
            tskIDLE_PRIORITY + 5,
            &mTask,
            affinity);

⚠️ Potential issue | 🟠 Major

Don’t report unpinned workers as pinned.

core == -1 and invalid core IDs are converted to tskNO_AFFINITY, but a real task is still created; then isPinned() returns true because mImpl != nullptr. That makes Worker() look pinned on dual-core ESP32 even though it is explicitly “no pinning”.

Proposed fix
-        BaseType_t affinity = (core >= 0 && core < FL_CPU_CORES)
-            ? static_cast<BaseType_t>(core)
-            : tskNO_AFFINITY;
+        if (core < 0 || core >= FL_CPU_CORES) {
+            mValid = false;
+            return;
+        }
+        BaseType_t affinity = static_cast<BaseType_t>(core);

Also applies to: 187-192
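
An alternative to the early return is to keep creating the unpinned task but record whether pinning actually happened, and have `isPinned()` read that record. A minimal sketch of that bookkeeping, with no FreeRTOS dependency: `kNoAffinity`, `resolveAffinity`, and `PinState` are hypothetical names, and `taskCreated` stands in for a successful `xTaskCreatePinnedToCore` return.

```cpp
#include <cassert>

// Hypothetical stand-in for the FreeRTOS "no affinity" sentinel.
constexpr int kNoAffinity = -1;

// Maps a requested core to an affinity; anything out of range means "unpinned".
constexpr int resolveAffinity(int core, int cpuCores) {
    return (core >= 0 && core < cpuCores) ? core : kNoAffinity;
}

// Records the pinning outcome instead of inferring it from a non-null impl.
class PinState {
  public:
    PinState(int core, int cpuCores, bool taskCreated)
        : mAffinity(resolveAffinity(core, cpuCores)),
          mPinned(taskCreated && mAffinity != kNoAffinity) {
        assert(cpuCores > 0);
    }

    // True only when a task exists AND it was given a real core.
    bool isPinned() const { return mPinned; }
    int affinity() const { return mAffinity; }

  private:
    int mAffinity;  // initialized first; mPinned depends on it
    bool mPinned;
};
```

Under this scheme `PinState(-1, 2, true)` and `PinState(2, 2, true)` both report unpinned even though a task was created, which is the behaviour the comment asks for.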

Comment thread src/fl/task/worker.h
Comment on lines +83 to +86
private:
    int mCoreId;
    WorkerImpl* mImpl;  ///< Owned; platform-specific payload. Null on
                        ///< synchronous-fallback platforms.

🛠️ Refactor suggestion | 🟠 Major

Use RAII for the owned worker implementation.

mImpl is an owning raw pointer, which forces manual allocation/deletion in worker.cpp.hpp. Prefer fl::unique_ptr<WorkerImpl> here and keep the destructor out-of-line after WorkerImpl is complete. As per coding guidelines, “Use proper RAII patterns with smart pointers (fl::shared_ptr, fl::unique_ptr) or moveable wrapper objects instead of raw pointers.”

Proposed direction
+#include "fl/stl/unique_ptr.h"
@@
-    WorkerImpl* mImpl;  ///< Owned; platform-specific payload. Null on
-                        ///< synchronous-fallback platforms.
+    fl::unique_ptr<WorkerImpl> mImpl;  ///< Platform-specific payload. Null on
+                                       ///< synchronous-fallback platforms.

Then replace manual new/delete in worker.cpp.hpp with fl::make_unique<WorkerImpl>(...) and mImpl.reset().
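
The reason the destructor has to stay out-of-line is that a `unique_ptr` to an incomplete type can be declared but not destroyed until the type is complete. A minimal sketch of the pattern, using `std::unique_ptr` as a stand-in for `fl::unique_ptr` (assumed here to behave the same way); `WorkerSketch`, `WorkerImplSketch`, and `gLiveImpls` are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <memory>

// ---- what would live in the header ----
struct WorkerImplSketch;  // intentionally incomplete at this point

class WorkerSketch {
  public:
    explicit WorkerSketch(bool spawn);
    ~WorkerSketch();  // declared only; defined once the impl type is complete
    WorkerSketch(const WorkerSketch&) = delete;
    WorkerSketch& operator=(const WorkerSketch&) = delete;

    bool hasImpl() const { return static_cast<bool>(mImpl); }

  private:
    std::unique_ptr<WorkerImplSketch> mImpl;  // null on the synchronous path
};

// ---- what would live in the implementation file ----
static int gLiveImpls = 0;  // counts live impl objects, to observe cleanup

struct WorkerImplSketch {
    WorkerImplSketch() { ++gLiveImpls; }
    ~WorkerImplSketch() {
        assert(gLiveImpls > 0);
        --gLiveImpls;
    }
};

WorkerSketch::WorkerSketch(bool spawn) {
    if (spawn) {
        mImpl.reset(new WorkerImplSketch());
    }
}

// Defined after WorkerImplSketch is complete, so unique_ptr can delete it.
WorkerSketch::~WorkerSketch() = default;
```

No manual `delete` is needed: when a `WorkerSketch` goes out of scope, `gLiveImpls` drops back to its previous value.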

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 private:
     int mCoreId;
-    WorkerImpl* mImpl;  ///< Owned; platform-specific payload. Null on
-                        ///< synchronous-fallback platforms.
+    fl::unique_ptr<WorkerImpl> mImpl;  ///< Platform-specific payload. Null on
+                                       ///< synchronous-fallback platforms.

Comment thread tests/fl/task/worker.cpp
Comment on lines +40 to +46
FL_TEST_CASE("Worker: isPinned() is false on host / non-ESP32 builds") {
    // Host tests build without FL_HAS_MULTICORE_AFFINITY so the worker
    // falls through to the synchronous path and does not spawn a pinned
    // FreeRTOS task. This encodes that contract.
    fl::task::Worker worker(0);
    FL_CHECK_FALSE(worker.isPinned());
}

⚠️ Potential issue | 🟡 Minor

Gate the host-only isPinned() assertion.

On dual-core ESP32, Worker(0) is expected to create a pinned implementation, so this unconditional FL_CHECK_FALSE will fail if the worker tests run on hardware.

Proposed fix
 FL_TEST_CASE("Worker: isPinned() is false on host / non-ESP32 builds") {
@@
     fl::task::Worker worker(0);
+#if defined(FL_IS_ESP32) && defined(FL_HAS_MULTICORE_AFFINITY) && FL_HAS_MULTICORE_AFFINITY
+    FL_CHECK(worker.isPinned());
+#else
     FL_CHECK_FALSE(worker.isPinned());
+#endif
 }
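
One way to keep the preprocessor branching out of the test body is to fold the guard into a single compile-time constant and compare against it. This is only a sketch of that variation, not part of the proposed fix: `kExpectPinned`, `FakeWorker`, and `verifyPinnedExpectation` are hypothetical, and plain `assert` stands in for `FL_CHECK` / `FL_CHECK_FALSE`.

```cpp
#include <cassert>

// Same condition as the guard in the proposed fix; on a host build neither
// macro is defined, so the expectation is "not pinned".
#if defined(FL_IS_ESP32) && defined(FL_HAS_MULTICORE_AFFINITY) && FL_HAS_MULTICORE_AFFINITY
constexpr bool kExpectPinned = true;
#else
constexpr bool kExpectPinned = false;
#endif

// Hypothetical stand-in for fl::task::Worker, exposing only isPinned().
struct FakeWorker {
    bool pinned;
    bool isPinned() const { return pinned; }
};

// Single assertion that is valid on every platform.
inline void verifyPinnedExpectation(const FakeWorker& worker) {
    assert(worker.isPinned() == kExpectPinned);
}
```

The trade-off is that the test name would need to stop claiming "false on host" and instead describe the platform-dependent expectation.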
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 FL_TEST_CASE("Worker: isPinned() is false on host / non-ESP32 builds") {
     // Host tests build without FL_HAS_MULTICORE_AFFINITY so the worker
     // falls through to the synchronous path and does not spawn a pinned
     // FreeRTOS task. This encodes that contract.
     fl::task::Worker worker(0);
+#if defined(FL_IS_ESP32) && defined(FL_HAS_MULTICORE_AFFINITY) && FL_HAS_MULTICORE_AFFINITY
+    FL_CHECK(worker.isPinned());
+#else
     FL_CHECK_FALSE(worker.isPinned());
+#endif
 }

@zackees
Member Author

zackees commented Apr 20, 2026

Closing unmerged. Parked pending option (A) of #2308 landing first and a hardware-measured FPS gap to justify this work. Full lessons-learned writeup and unparking criteria in the tracking issue: #2359.

Phase 1 (#2337, CoroutineConfig::core_id plumbing) stays in master as the foundation. This prototype remains browsable on the branch and in the PR diff for when we pick it back up.

@zackees zackees closed this Apr 20, 2026