Introduce max concurrency to DynamicFlushScheduler #1792

mpxr · 2025-03-13T17:56:23Z

Closes #1789

✅ Checklist

I have followed every step in the contributing guide
The PR title follows the convention.
I ran and tested the code works

Testing

Added tests to the DynamicFlushScheduler
Manual testing by running tasks locally

Changelog

Added the maxConcurrency setting to the DynamicFlushScheduler, limiting the number of concurrent requests made to the callback.
When the flush queue contains more than batchSize items, invoke the callback with only batchSize events.
Added tests for the DynamicFlushScheduler
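
The batching rule in the second changelog item can be illustrated with a small sketch. This is not the code from the PR: `splitIntoBatches` is a hypothetical helper name, and it only shows the idea that a queue longer than `batchSize` is handed to the callback in slices of at most `batchSize` items.

```typescript
// Hypothetical helper (not from the PR): cut a flush queue into slices of
// at most `batchSize` items, preserving order.
function splitIntoBatches<T>(queue: T[], batchSize: number): T[][] {
  if (batchSize < 1 || Math.floor(batchSize) !== batchSize) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < queue.length; start += batchSize) {
    // slice() clamps the end index, so the last batch may be shorter.
    batches.push(queue.slice(start, start + batchSize));
  }
  return batches;
}

const batches = splitIntoBatches([1, 2, 3, 4, 5, 6, 7], 3);
// batches is [[1, 2, 3], [4, 5, 6], [7]]
```

Each inner array would then be passed to the callback as one invocation, so no single call receives more than `batchSize` events.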

TODOs:

Consider using Array.concat when adding items to currentBatch in addToBatch, as the spread operator is not performant when adding a large number of items.
Report pLimit.activeCount, pLimit.pendingCount, and pLimit.concurrency to /metrics.
Add a test case to validate both batchSize and flushInterval in a single test.
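
For the first TODO, a minimal sketch of the two ways to append items, assuming `currentBatch` is a plain array. The function names are hypothetical and the actual `addToBatch` body is not shown in this PR page. With `push(...items)` every element becomes a function argument, which can be slow and may hit an engine's argument limit for very large arrays; `concat` takes the array as a single argument.

```typescript
// Hypothetical stand-ins for the append step inside addToBatch.

// Spread variant: each element of `items` becomes a separate call argument.
function appendWithSpread<T>(currentBatch: T[], items: T[]): T[] {
  currentBatch.push(...items);
  return currentBatch;
}

// concat variant: `items` is passed as one argument and a new array is returned.
function appendWithConcat<T>(currentBatch: T[], items: T[]): T[] {
  return currentBatch.concat(items);
}

const merged = appendWithConcat(["a", "b"], ["c", "d"]);
// merged is ["a", "b", "c", "d"]
```

Note that `concat` returns a new array, so the caller has to reassign the field (for example `this.currentBatch = this.currentBatch.concat(items)`), whereas `push` mutates in place.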

changeset-bot · 2025-03-13T17:56:27Z

⚠️ No Changeset found

Latest commit: 4153e38

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

coderabbitai · 2025-03-13T17:56:32Z

Walkthrough

This update enhances the DynamicFlushScheduler by introducing concurrency control and improved error handling. An optional maxConcurrency property and the pLimit library are integrated to manage concurrent flush operations. Key methods such as addToBatch, checkAndFlush, and flushNextBatch have been updated to be asynchronous, and new shutdown handling logic has been added to ensure pending batches are processed on termination. Additionally, dependency updates in package.json and new tests for batch processing and signal handling have been incorporated.

Changes

| File(s) | Summary of Changes |
| --- | --- |
| `apps/webapp/app/v3/dynamicFlushScheduler.server.ts` | - Added optional `maxConcurrency` to config and new properties (`MAX_CONCURRENCY`, `concurrencyLimiter` using `pLimit`)<br>- Converted `addToBatch`, `checkAndFlush`, and `flushNextBatch` to asynchronous methods<br>- Introduced shutdown handling with `setupShutdownHandlers`, `shutdown`, and timer management methods (`clearTimer`, `resetFlushTimer`) |
| `apps/webapp/package.json` | - Updated dependencies: restored `@depot/sdk-node`, `@internal/run-engine`, `@internal/zod-worker`, and `@opentelemetry/api-logs`<br>- Added new dependency `p-limit` |
| `apps/webapp/test/dynamicFlushScheduler.test.ts` | - New test file added using Vitest<br>- Covers scenarios for batch processing, flush intervals, and handling of the SIGTERM signal to ensure graceful shutdown |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Scheduler
    participant Limiter as pLimit
    User->>Scheduler: addToBatch(items)
    Scheduler->>Scheduler: Check batch threshold
    alt Batch threshold reached
        Scheduler->>Scheduler: flushNextBatch()
        Scheduler->>Limiter: Execute flush for sub-batch
        Limiter-->>Scheduler: Return flush result/error
    end
    Scheduler-->>User: Flush complete

sequenceDiagram
    participant OS
    participant Scheduler
    OS->>Scheduler: SIGTERM Received
    Scheduler->>Scheduler: setupShutdownHandlers()
    Scheduler->>Scheduler: shutdown()
    Scheduler->>Limiter: Process remaining batches
    Scheduler-->>OS: Graceful shutdown complete
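
The shutdown sequence in the second diagram could look roughly like the sketch below. It is an illustration under assumptions, not the PR's implementation: `ShutdownSketch`, `SignalSource`, and the in-memory `flushed` log are hypothetical, and the signal source is injected so the example runs without Node.js typings (in Node.js, `process` should fit this shape).

```typescript
// Minimal shape of something that can deliver signals once.
interface SignalSource {
  once(signal: string, handler: () => void): unknown;
}

class ShutdownSketch {
  private isShuttingDown = false;
  private pending: string[];
  // Records every batch handed to flush(), for inspection in this sketch.
  flushed: string[][] = [];

  constructor(pending: string[], signals: SignalSource) {
    this.pending = pending;
    // Both signals route to the same idempotent shutdown path.
    signals.once("SIGTERM", () => void this.shutdown());
    signals.once("SIGINT", () => void this.shutdown());
  }

  async shutdown(): Promise<void> {
    if (this.isShuttingDown) return; // a second signal is ignored
    this.isShuttingDown = true;
    if (this.pending.length > 0) {
      const batch = this.pending;
      this.pending = [];
      await this.flush(batch);
    }
  }

  // Stand-in for invoking the real flush callback.
  private async flush(batch: string[]): Promise<void> {
    this.flushed.push(batch);
  }
}
```

The `isShuttingDown` guard is what keeps a SIGINT arriving right after a SIGTERM from flushing the same items twice.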

Possibly related PRs

Add internal spans to event repo #1678: Modifications to DynamicFlushSchedulerConfig and callback function align closely with the scheduler updates.
re2: Add attempt metrics in dev (prod WIP). Added max concurrent runs setting to dev using p-limit #1766: Introduces concurrency management using a similar pLimit mechanism and a maxConcurrentRuns property, indicating a shared focus on controlling concurrency.

Poem

I'm a rabbit dancing in code delight,
Hopping through batches both day and night.
With pLimit guiding each flush in line,
Each async step works oh-so-fine.
Shutdowns are graceful, no need to fear –
A bunny cheers for changes made so clear!
🐇✨

Warning

There were issues while running some tools. Please review the errors and either fix the tool’s configuration or disable the tool if it’s a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

Scope: all 2 workspace projects
ERR_PNPM_OPTIONAL_DEPS_REQUIRE_PROD_DEPS Optional dependencies cannot be installed without production dependencies


coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (8)

apps/webapp/package.json (1)
148-148: Introduction of “p-limit” for concurrency control.
This aligns with the concurrency enhancements in DynamicFlushScheduler. Make sure to lock down the version if consistent behavior is critical across environments.
-    "p-limit": "^6.2.0",
+    "p-limit": "6.2.0",
apps/webapp/test/dynamicFlushScheduler.test.ts (2)

37-51: Flush interval-driven tests look solid.
By advancing the fake timers, you're verifying time-based flushes. This effectively ensures that the scheduler initiates a flush once the interval elapses.

Consider adding a test verifying that multiple flush intervals still cause repeated flushes if items keep coming in.

68-89: Signal handling test is comprehensive.
Simulating SIGTERM to confirm a final flush is crucial. This test ensures the scheduler properly returns the pending items.

Consider adding a concurrency test to confirm the maxConcurrency setting’s effect under multiple batches being flushed concurrently.
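
One way to approach the suggested concurrency test is to count how many callbacks are in flight at once. The sketch below is an assumption-laden illustration: `createLimiter` is a tiny stand-in written for this example (it is not p-limit's implementation), and `measurePeakConcurrency` is a hypothetical helper rather than part of the test file.

```typescript
type Task<T> = () => Promise<T>;

// Minimal stand-in for a p-limit style limiter (illustrative only).
function createLimiter(maxConcurrency: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  const release = (): void => {
    const next = waiting.shift();
    if (next) {
      next(); // hand the slot directly to the next waiter; `active` is unchanged
    } else {
      active--;
    }
  };

  return async function limit<T>(task: Task<T>): Promise<T> {
    if (active >= maxConcurrency) {
      // Wait until a finishing task hands over its slot.
      await new Promise<void>((resolve) => waiting.push(resolve));
    } else {
      active++;
    }
    try {
      return await task();
    } finally {
      release();
    }
  };
}

// Runs `batchCount` fake flushes and reports the highest number in flight.
async function measurePeakConcurrency(batchCount: number, maxConcurrency: number): Promise<number> {
  const limit = createLimiter(maxConcurrency);
  let running = 0;
  let peak = 0;
  const callback = async (): Promise<void> => {
    running++;
    peak = Math.max(peak, running);
    await new Promise<void>((resolve) => setTimeout(resolve, 5));
    running--;
  };
  const jobs: Promise<void>[] = [];
  for (let i = 0; i < batchCount; i++) jobs.push(limit(callback));
  await Promise.all(jobs);
  return peak;
}
```

With six batches and a limit of two, `measurePeakConcurrency(6, 2)` resolves to 2. A real test would make the same kind of assertion against the scheduler's callback, using the actual `maxConcurrency` config value.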

apps/webapp/app/v3/dynamicFlushScheduler.server.ts (5)

29-30: Initializing pLimit with user-specified concurrency.
Fallback to 1 avoids unbounded concurrency. This approach is safe for minimal parallelism but might be low for certain workloads.

Would you like to auto-scale concurrency based on system metrics (CPU load, memory usage)? I can help open an enhancement issue if desired.

43-56: Prometheus Gauges for batch size & failed batches.
This instrumentation broadens observability. Consider adding concurrency metrics (like active/queued flush tasks) for deeper insights.

67-77: Comment and logic in addToBatch.
The TODO about using .concat() for improved performance is valid for large arrays. This function is otherwise well-documented.

Use .concat(items) to avoid overhead from spread, especially for large arrays.

100-113: Graceful shutdown method.
The function checks if shutdown is already in progress, flushes pending items, and stops timers. This ensures no events are lost on exit.

Consider adding a timeout fallback to handle stuck callbacks during shutdown.
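
A timeout fallback of this kind is commonly built with `Promise.race`. The following is a sketch under assumptions: `flushWithTimeout` and its return values are hypothetical names, and how the scheduler should react to a timeout (log, exit, drop items) is left open.

```typescript
// Resolves "flushed" if the flush settles in time, otherwise "timed_out".
// A rejection from flush() propagates to the caller unchanged.
async function flushWithTimeout(
  flush: () => Promise<void>,
  timeoutMs: number
): Promise<"flushed" | "timed_out"> {
  let resolveTimeout: (value: "timed_out") => void = () => {};
  const timeout = new Promise<"timed_out">((resolve) => {
    resolveTimeout = resolve;
  });
  const timer = setTimeout(() => resolveTimeout("timed_out"), timeoutMs);
  try {
    return await Promise.race([
      flush().then(() => "flushed" as const),
      timeout,
    ]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

In a shutdown path this would wrap the final flush, for example `await flushWithTimeout(() => scheduler.shutdown(), 5000)`, where `scheduler` and the 5-second budget are placeholders. The stuck callback itself is not cancelled; the race only stops waiting for it.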

138-179: flushNextBatch concurrency logic.
Splitting items into smaller batches and applying pLimit ensures concurrency is well-managed. Error handling logs the batch-ID and increments failedBatchCount. This is robust.

A circuit-breaker approach or retry queue might handle repeated failures gracefully, especially for flakier downstream systems.
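
As a rough illustration of the retry idea (not something this PR implements), a failed batch could be retried a bounded number of times with exponential backoff before it is counted as failed. `flushWithRetry`, `maxAttempts`, and `baseDelayMs` are hypothetical names and defaults.

```typescript
// Retries `callback(batch)` up to `maxAttempts` times, doubling the delay
// between attempts. Returns true on success, false once attempts are exhausted.
async function flushWithRetry<T>(
  batch: T[],
  callback: (batch: T[]) => Promise<void>,
  maxAttempts: number = 3,
  baseDelayMs: number = 100
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await callback(batch);
      return true;
    } catch {
      if (attempt === maxAttempts) return false; // caller can count a failed batch
      const delayMs = baseDelayMs * Math.pow(2, attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

A `false` result is the point where a counter such as `failedBatchCount` would be incremented. A full circuit breaker would additionally stop calling the downstream system for a while after repeated failures, which this sketch does not do.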

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5896165 and 4153e38.

⛔ Files ignored due to path filters (1)

pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (3)

apps/webapp/app/v3/dynamicFlushScheduler.server.ts (1 hunks)
apps/webapp/package.json (2 hunks)
apps/webapp/test/dynamicFlushScheduler.test.ts (1 hunks)

🔇 Additional comments (17)

apps/webapp/package.json (3)

49-49: Good restoration of @depot/sdk-node dependency.
This reintroduces the @depot/sdk-node package at version ^1.0.0. If this is intentional, ensure there's no conflict with any previous or existing use in the code.

55-56: Restoring internal dependencies appears consistent.
Re-adding @internal/run-engine and @internal/zod-worker at workspace:* seems proper for your monorepo setup. Verify that the reintroduction doesn't conflict with prior removal rationale.

Would you like to confirm stable version references using a script snippet? I can provide one if needed.

60-60: Re-added @opentelemetry/api-logs dependency.
Confirm that usage of @opentelemetry/api-logs aligns with the new instrumentation approach. If the module was previously removed for compatibility reasons, validate any issues are resolved now.

apps/webapp/test/dynamicFlushScheduler.test.ts (4)

4-9: Test suite setup is clear.
Using vi.useFakeTimers() and resetting mocks in beforeEach is good practice to ensure each test runs independently.

11-21: Verify no-op behavior for empty batch.
This test confirms that the scheduler doesn't invoke the callback when no items are enqueued. The logic is straightforward and well structured.

23-35: Validating single flush scenario.
This test ensures the scheduler triggers the callback once the batch size is exactly met. This is a core scenario, and the expectations appear correct.

53-66: Batching multiple times is well tested.
You accurately check that two flushes happen for six items with a batch size of three. This confirms the correct partitioning of items.

apps/webapp/app/v3/dynamicFlushScheduler.server.ts (10)

2-3: Importing p-limit and prom-client.
You’ve introduced p-limit for concurrency limiting and prom-client metrics. Ensure any bundling or environment constraints (e.g., serverless) don’t conflict with these packages.

10-11: Optional maxConcurrency prop in the config.
Great addition to control concurrency. Ensure default maxConcurrency does not conflict with user expectations.

18-19: Tracking concurrency limiting fields.
Defining MAX_CONCURRENCY and concurrencyLimiter clarifies concurrency logic. The typed return from pLimit is correct.

22-24: New fields for shutdown logic and metrics.
isShuttingDown and failedBatchCount neatly track terminating conditions and error states.

36-40: Comprehensive initialization logging.
Logging initial config helps debug misconfigurations in production. The structure is concise.

79-84: Threshold-based flush trigger.
Flushing when currentBatch.length >= BATCH_SIZE triggers a flush as soon as a full batch has accumulated, without waiting for the flush interval. This logic is solid for real-time scenarios.

94-98: Setting up graceful shutdown hooks.
Listening for SIGTERM and SIGINT is essential for containerized environments. Logging the handler configuration is beneficial for troubleshooting.

115-120: Timer clearing process.
Clearing the interval is crucial to prevent unwanted flush attempts after we initiate shutdown. Logging helps confirm the timer removal.

122-126: Timer reset ensures fresh intervals.
This pattern restarts the timer after a successful flush, which supports continuous ingestion.

129-134: Periodic flush triggered.
If the batch length is non-zero, we flush. This prevents stale items from accumulating if batch thresholds are never reached.

mpxr added 13 commits March 13, 2025 12:04

flush takes batch size into consideration when calling callback

4f3a820

add tests to DynamicFlushScheduler

04c4950

install p-limit

ec96dd7

take concurrency limit into consideration

df7a484

make sure batches are flushed on SIGTERM

05c0803

report metrics to the /metrics endpoint

ba7dca6

comment out metric sending

a929974

enable SIGINT

09145b3

report failedBatchesCount

6944cda

add logging

4bb1e1d

report failedBatchesCount

3b012df

add some todo comments to the scheduler

6ce9b83

remove commented out code

4153e38

coderabbitai bot reviewed Mar 13, 2025


ericallam closed this May 12, 2025
