
Conversation

@jserv (Collaborator) commented on Oct 31, 2025

This reduces CPU overhead from timer interrupt checking by implementing:

  1. Lazy timer checking: Cache next_interrupt_at (the earliest interrupt time across all harts) to skip expensive checks when no interrupt is due (see the sketch below).
  2. Event-driven wait: Replace the fixed 1ms periodic timer with dynamic one-shot timers (kqueue on macOS, timerfd on Linux) that wake exactly when the next interrupt is due.
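A minimal, self-contained sketch of the caching scheme in item 1. The struct layout, array sizing, and helper names here are illustrative assumptions modeled on this description (the PR's actual helper is named aclint_mtimer_recalc_next_interrupt); they are not the real semu definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define N_HARTS 4 /* illustrative fixed size; semu sizes this via n_harts */

typedef struct {
    uint64_t mtimecmp[N_HARTS]; /* per-hart compare registers               */
    uint32_t n_harts;
    uint64_t next_interrupt_at; /* cached min(mtimecmp[i]) across all harts */
} mtimer_state_t;

/* Recompute the cached earliest deadline. Called whenever any mtimecmp
 * changes (MMIO write, SBI set_timer) or after an interrupt is taken. */
static void mtimer_recalc_next_interrupt(mtimer_state_t *mt)
{
    uint64_t min = UINT64_MAX;
    for (uint32_t i = 0; i < mt->n_harts; i++)
        if (mt->mtimecmp[i] < min)
            min = mt->mtimecmp[i];
    mt->next_interrupt_at = min;
}

/* Fast path on every emulation step: a single comparison against the
 * cached deadline instead of scanning every hart's mtimecmp. */
static bool mtimer_interrupt_due(const mtimer_state_t *mt, uint64_t now)
{
    return now >= mt->next_interrupt_at;
}
```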

Summary by cubic

Cuts CPU usage from timer checks by caching the next scheduled interrupt and switching from a 1ms polling timer to event-driven one-shot timers (with a 1ms fallback during boot). Improves idle performance without changing observable behavior.

  • New Features

    • Cache next_interrupt_at (min mtimecmp across harts) in mtimer_state_t; added n_harts.
    • New helper aclint_mtimer_recalc_next_interrupt; called on mtimecmp writes, SBI set_timer, and after interrupts.
    • Lazy timer check in emu_update_timer_interrupt: skip work if time < next_interrupt_at; still sync hart->time.
    • Event-driven wait in semu_run: compute wait_ns (capped at 100ms) and arm one-shot timer (kqueue NOTE_USECONDS on macOS, timerfd on Linux); during boot, use a fixed 1ms timeout to avoid fake-timer vs wall-clock mismatch.
  • Bug Fixes

    • Overflow-safe wait_ns calculation and UINT64_MAX handling to prevent zero-wait loops when the timer is disabled or far in the future.

Written for commit 90aa27d. Summary will update automatically on new commits.
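A rough sketch of how the event-driven one-shot wait summarized above might be armed on Linux with timerfd; on macOS the equivalent would be a kqueue EVFILT_TIMER event with NOTE_USECONDS. The function names and the wait_ns parameter are illustrative, not semu's actual interface.

```c
#include <stdint.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

static int timer_fd = -1;

static int event_timer_init(void)
{
    timer_fd = timerfd_create(CLOCK_MONOTONIC, 0);
    return timer_fd;
}

/* Arm a one-shot expiration wait_ns nanoseconds from now, then block until
 * it fires. No periodic 1ms tick is needed: the host wakes the emulator
 * exactly when the next guest interrupt is due. */
static void event_timer_wait(uint64_t wait_ns)
{
    if (!wait_ns)
        return; /* it_value of zero would disarm the timer entirely */

    struct itimerspec its = {
        .it_interval = {0, 0}, /* no reload: one-shot */
        .it_value = {
            .tv_sec = (time_t) (wait_ns / 1000000000ULL),
            .tv_nsec = (long) (wait_ns % 1000000000ULL),
        },
    };
    timerfd_settime(timer_fd, 0, &its, NULL);

    uint64_t expirations;
    if (read(timer_fd, &expirations, sizeof(expirations)) < 0)
        return; /* e.g. interrupted by a signal */
}
```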

cubic-dev-ai[bot]

This comment was marked as resolved.

The previous event-driven timer implementation caused hrtimer warnings
and boot delays on macOS CI because it calculated wait times from the
emulator's fake incremental timer but waited on host OS real-time timers.

Root cause:
- During boot, semu_timer_get() returns fake ticks (slow linear growth)
- calc_ns_until_next_interrupt() converted these to nanoseconds
- kqueue/timerfd waited using wall-clock time

Fix:
Use a conservative 1ms fixed timeout during the boot phase. After boot
completes and the timer switches to real time, use the dynamic
calculation for optimal CPU efficiency.
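A small sketch of the boot-phase fallback described in this fix. The boot_complete flag, the constants, and the function name are illustrative assumptions; the dynamic wait value would come from an overflow-safe calculation like the one sketched further below.

```c
#include <stdbool.h>
#include <stdint.h>

#define BOOT_FALLBACK_NS 1000000ULL   /* fixed 1ms timeout during boot   */
#define MAX_WAIT_NS      100000000ULL /* cap dynamic waits at 100ms      */

static uint64_t choose_wait_ns(bool boot_complete, uint64_t dynamic_wait_ns)
{
    /* During boot the guest-visible timer advances as fake incremental
     * ticks, so converting it to wall-clock nanoseconds would mislead the
     * host timer. Fall back to a conservative 1ms wait instead. */
    if (!boot_complete)
        return BOOT_FALLBACK_NS;

    /* After boot the timer tracks real time: wait until the next deadline,
     * capped so the emulator still wakes up periodically. */
    return dynamic_wait_ns < MAX_WAIT_NS ? dynamic_wait_ns : MAX_WAIT_NS;
}
```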
@sysprog21 sysprog21 deleted a comment from cubic-dev-ai bot Oct 31, 2025
@sysprog21 sysprog21 deleted a comment from cubic-dev-ai bot Oct 31, 2025
cubic-dev-ai[bot]

This comment was marked as resolved.

When next_interrupt_at equals UINT64_MAX (disabled timer) or is very
large, the calculation '(ticks_remaining * 1000000000ULL) / freq'
overflows, resulting in wait_ns = 0. This prevents the sleep mechanism
from working, eliminating CPU efficiency gains.
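A hedged sketch of an overflow-safe version of the calculation flagged above. The signature of calc_ns_until_next_interrupt and the 100ms cap are assumptions taken from the PR summary, not semu's actual code.

```c
#include <stdint.h>

#define MAX_WAIT_NS 100000000ULL /* 100ms cap, per the PR summary */

static uint64_t calc_ns_until_next_interrupt(uint64_t now,
                                             uint64_t next_interrupt_at,
                                             uint64_t freq_hz)
{
    /* Timer disabled: nothing is scheduled, so sleep the maximum slice
     * instead of letting the naive formula overflow into a zero wait. */
    if (next_interrupt_at == UINT64_MAX)
        return MAX_WAIT_NS;

    /* Interrupt already due (or overdue): do not sleep at all. */
    if (next_interrupt_at <= now)
        return 0;

    uint64_t ticks_remaining = next_interrupt_at - now;

    /* The naive (ticks_remaining * 1000000000ULL) / freq_hz overflows for
     * large deadlines; detect that case and clamp to the cap. */
    if (ticks_remaining > UINT64_MAX / 1000000000ULL)
        return MAX_WAIT_NS;

    uint64_t wait_ns = (ticks_remaining * 1000000000ULL) / freq_hz;
    return wait_ns < MAX_WAIT_NS ? wait_ns : MAX_WAIT_NS;
}
```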
@jserv (Collaborator, Author) commented on Nov 2, 2025

Closing this PR due to architectural incompatibility with the current master branch.

Analysis

After merging PR #110 (event-driven UART coroutine), the master branch has fundamentally changed from a polling-based architecture to a coroutine-based architecture. The refine-timer branch optimization was designed for the old polling architecture and cannot be directly rebased onto the new architecture.

Key Findings

  1. Architectural conflict: The timer optimization in refine-timer assumes synchronous polling loops, while master now uses asynchronous coroutine scheduling.
  2. SMP=4 regression: Testing revealed that the master branch currently has a regression where SMP=4 hangs during boot at "smp: Bringing up secondary CPUs ..." (likely introduced by PR #110, "Implement event-driven UART coroutine with CPU optimization").
  3. refine-timer works correctly: The refine-timer branch itself has working SMP=4 support and boots successfully.

@jserv jserv closed this Nov 2, 2025