Skip to content

Add evalution timeout for expressions#11741

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 5 commits into
masterfrom
jpbempel/cond-eval-timeout
Jul 2, 2026
Merged

Add evalution timeout for expressions#11741
gh-worker-dd-mergequeue-cf854d[bot] merged 5 commits into
masterfrom
jpbempel/cond-eval-timeout

Conversation

@jpbempel

@jpbempel jpbempel commented Jun 25, 2026

Copy link
Copy Markdown
Member

What Does This Do

Refactor TimeoutChecker to be an interface. The former concrete class
is now an interface with a static create(Config, Duration) factory
that returns either CpuTimeoutChecker or WallTimeoutChecker based on a new config flag
(DD_INTERNAL_DYNAMIC_INSTRUMENTATION_TIMEOUT_CHECKER_MODE). Wall clock time can be problematic in case of long GC pauses or sleep or constraint resource (container with less than a CPU). A cpu time driven timeout is more resilient for this. But by default we are using wall clock time for the time being and the cpu time is just an alternative.

Refactor Expression evaluation API: All Expression/BooleanExpression implementations now accept a new EvalContext class that bundles ValueReferenceResolver + TimeoutChecker into a single context object. That way we can check the timeout in any expression.

NB: for some special expressions (contains on strings or List) we have a per item count limit as we cannot introduce the timeout check for those invocations

new config option for the evaluation timeout
(DD_DYNAMIC_INSTRUMENTATION_EVALUATION_TIMEOUT), default is 50ms

The ValueScript json serialization used for tests was revisited to use a visitor.

Tests updated/added across all expression tests,
plus new smoke test (LogProbesIntegrationTest) and a configurable TimeoutCheckerTest.

Motivation

Conditions and expression values (log templates and span decorations) are not capped by a timeout to avoid spending too much time for complex expressions.

Additional Notes

Contributor Checklist

  • Format the title according to the contribution guidelines
  • Assign the type: and (comp: or inst:) labels in addition to any other useful labels
  • Avoid using close, fix, or any linking keywords when referencing an issue
    Use solves instead, and assign the PR milestone to the issue
  • Update the CODEOWNERS file on source file addition, migration, or deletion
  • Update public documentation with any new configuration flags or behaviors
  • Add your completed PR to the merge queue by commenting /merge. You can also:
    • Customize the commit message associated with the merge with /merge --commit-message "..."
    • Remove your PR from the merge queue with /merge -c
    • Skip all merge queue checks with /merge -f --reason "reason"; please use this judiciously, as some checks do not run at the PR-level (note: the PR still needs to be mergeable, this will only skip the pre-merge build)
    • Get more information in this doc

Jira ticket: [PROJ-IDENT]

@pr-commenter

pr-commenter Bot commented Jun 25, 2026

Copy link
Copy Markdown

Debugger benchmarks

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
ci_job_date 1782998010 1782998357
end_time 2026-07-02T13:14:57 2026-07-02T13:20:42
git_branch master jpbempel/cond-eval-timeout
git_commit_sha d3398ea 17c6669
start_time 2026-07-02T13:13:31 2026-07-02T13:19:17
See matching parameters
Baseline Candidate
ci_job_id 1824705449 1824705449
ci_pipeline_id 122352419 122352419
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
git_commit_date 1782997228 1782997228

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 9 metrics, 6 unstable metrics.

See unchanged results
scenario Δ mean agg_http_req_duration_min Δ mean agg_http_req_duration_p50 Δ mean agg_http_req_duration_p75 Δ mean agg_http_req_duration_p99 Δ mean throughput
scenario:noprobe unstable
[-57.514µs; +77.921µs] or [-18.612%; +25.216%]
unstable
[-74.599µs; +107.653µs] or [-20.967%; +30.257%]
unstable
[-163.349µs; +386.500µs] or [-43.770%; +103.565%]
unstable
[-347.825µs; +619.817µs] or [-27.813%; +49.562%]
same
scenario:basic same same same unstable
[-36.736µs; +265.806µs] or [-3.602%; +26.062%]
unstable
[-251.251op/s; +129.300op/s] or [-10.050%; +5.172%]
scenario:loop same same same same same
Request duration reports for reports
gantt
    title reports - request duration [CI 0.99] : candidate=None, baseline=None
    dateFormat X
    axisFormat %s
section baseline
noprobe (355.799 µs) : 300, 412
.   : milestone, 356,
basic (297.133 µs) : 290, 304
.   : milestone, 297,
loop (8.982 ms) : 8977, 8987
.   : milestone, 8982,
section candidate
noprobe (372.326 µs) : 263, 481
.   : milestone, 372,
basic (300.866 µs) : 289, 313
.   : milestone, 301,
loop (8.978 ms) : 8970, 8986
.   : milestone, 8978,
Loading
  • baseline results
Scenario Request median duration [CI 0.99]
noprobe 355.799 µs [299.968 µs, 411.631 µs]
basic 297.133 µs [290.296 µs, 303.97 µs]
loop 8.982 ms [8.977 ms, 8.987 ms]
  • candidate results
Scenario Request median duration [CI 0.99]
noprobe 372.326 µs [263.289 µs, 481.364 µs]
basic 300.866 µs [288.873 µs, 312.859 µs]
loop 8.978 ms [8.97 ms, 8.986 ms]

@dd-octo-sts

dd-octo-sts Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

🟡 Java Benchmark SLOs — Performance SLO warning (near threshold)

Suite Status
Startup 🟡 warning

SLO thresholds are defined here based on automatically generated metrics. A warning is raised when results are within 5% of the threshold.

PR vs. master results
Scenario Candidate master Δ (95% CI of mean)
startup:insecure-bank:iast:Agent 13.96 s 14.01 s [-1.3%; +0.6%] (no difference)
startup:insecure-bank:tracing:Agent 12.95 s 12.98 s [-1.1%; +0.5%] (no difference)
startup:petclinic:appsec:Agent 16.33 s 16.67 s [-6.4%; +2.3%] (no difference)
startup:petclinic:iast:Agent 16.82 s 16.48 s [-2.4%; +6.6%] (no difference)
startup:petclinic:profiling:Agent 16.75 s 16.81 s [-1.6%; +0.8%] (no difference)
startup:petclinic:sca:Agent 16.76 s 16.66 s [-0.4%; +1.6%] (no difference)
startup:petclinic:tracing:Agent 15.56 s 16.05 s [-7.0%; +0.9%] (no difference)

Commit: 17c66699 · CI Pipeline · Benchmarking Platform UI


Load and DaCapo benchmarks can be triggered manually in the GitLab pipeline. Results will appear in the Benchmarking Platform UI after completion.

@jpbempel jpbempel force-pushed the jpbempel/cond-eval-timeout branch 4 times, most recently from 09e48fe to 23714d7 Compare June 26, 2026 08:19
@jpbempel jpbempel marked this pull request as ready for review June 26, 2026 14:28
@jpbempel jpbempel requested review from a team as code owners June 26, 2026 14:28
@jpbempel jpbempel requested review from PerfectSlayer and evanchooly and removed request for a team June 26, 2026 14:28
@dd-octo-sts

dd-octo-sts Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

  • Add at least one type, and one component or instrumentation label to the pull request

If you need help, please check our contributing guidelines.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 23714d72cd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +23 to +24
public Duration getTimeOut() {
return null;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Return the configured timeout from CPU checker

When DD_INTERNAL_DYNAMIC_INSTRUMENTATION_TIMEOUT_CHECKER_MODE=CPU is enabled and an expression actually times out, ExpressionHelper.checkTimeout calls checker.getTimeOut().toMillis() to build the EvaluationException; returning null here turns that path into a NullPointerException instead of the expected timeout error, so CPU mode reports unexpected/null evaluation errors rather than a proper timeout.

Useful? React with 👍 / 👎.

@jpbempel jpbempel added type: enhancement Enhancements and improvements comp: debugger Dynamic Instrumentation labels Jun 26, 2026
jpbempel added 2 commits July 1, 2026 22:50
Conditions and expression values (log templates and span decorations)
are not capped by a timeout to avoid spending too much time for
complex expressions.

Refactor TimeoutChecker to be an interface. The former concrete class
 is now an interface with a static create(Config, Duration) factory
that returns either CpuTimeoutChecker or WallTimeoutChecker
based on a new config flag
(DD_INTERNAL_DYNAMIC_INSTRUMENTATION_TIMEOUT_CHECKER_MODE).
Wall clock time can be problematic in case of long GC pauses or
sleep or constraint resource (container with less than a CPU).
A cpu time driven timeout is more resilient for this. But by default
we are using wall clock time for the time being and the cpu time is
just an alternative.

Refactor Expression evaluation API: All Expression/BooleanExpression
implementations now accept a new EvalContext class that bundles
ValueReferenceResolver + TimeoutChecker into a single context object.
That way we can check the timeout in any expression.

NB: for some special expressions (contains on strings or List) we have
a per item count limit as we cannot introduce the timeout check for
those invocations

new config option for the evaluation timeout
(DD_DYNAMIC_INSTRUMENTATION_EVALUATION_TIMEOUT), default is 50ms

The ValueScript json serialization used for tests was revisited to
use a visitor.

Tests updated/added across all expression tests,
plus new smoke test (LogProbesIntegrationTest) and a configurable
TimeoutCheckerTest.
@jpbempel jpbempel force-pushed the jpbempel/cond-eval-timeout branch from 0f02315 to 09404b0 Compare July 1, 2026 20:50
@datadog-datadog-prod-us1

datadog-datadog-prod-us1 Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

🎯 Code Coverage (details)
Patch Coverage: 51.07%
Overall Coverage: 56.91% (-0.04%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 17c6669 | Docs | Datadog PR Page | Give us feedback!

@jpbempel jpbempel force-pushed the jpbempel/cond-eval-timeout branch from 09404b0 to 255d479 Compare July 2, 2026 08:51

@evanchooly evanchooly left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few minor nits/questions but lgtm


private final long start;
private final Duration timeOut;
String CPU = "CPU";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gross. enum? 😄

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config is coming from Config and put an enum in Config is complicated for nothing. so the config is a String.

ComparisonExpression expression =
new ComparisonExpression(new NumericValue(1, INT), ValueExpression.UNDEFINED, EQ);
assertFalse(expression.evaluate(NoopResolver.INSTANCE));
assertFalse(expression.evaluate(new EvalContext(NoopResolver.INSTANCE, null)));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the inconsistent creation of EvalContexts? some are factory calls, others ctors.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not put a helper method for all creation of EvalContext depending on the arguments

private List<Integer> largeList = new ArrayList<>();

{
for (int i = 0; i < 1_000_001; i++) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is "just a test class" but a proper, named constructor would be cleaner here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initialization code is closer to the field
for static field it would be the same.
the named constructor does not bring any value here.


@Test
void testLargeList() {
List<Integer> largeList = new ArrayList<>();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because code golfing is fun, you could use Collections.nCopies(1_000_001, 0) here.

But that aside, I'd consider using the max constant above to make sure the test stays in sync with the max value.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be interesting for optimization, but unfortunately What I am testing here is the fact that iterating over the list is taking too long and hit the timeout. With the nCopies it will be harder :D

// from line: System.out.println("fullMethod");
// to line: + String.join(",", argVar);
.where(MAIN_CLASS_NAME, 88, 97)
.where(MAIN_CLASS_NAME, 95, 104)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't we add a way to annotate/comment the line in the source and have the test find it to avoid having to maintain line numbers like this?

@jpbempel jpbempel Jul 2, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have something for unit tests, but for integrations tests it's harder to inject. Not worth the pain as I am modifying rarely this

@jpbempel jpbempel enabled auto-merge July 2, 2026 15:20
@jpbempel jpbempel added this pull request to the merge queue Jul 2, 2026
@dd-octo-sts

dd-octo-sts Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

/merge

@gh-worker-devflow-routing-ef8351

gh-worker-devflow-routing-ef8351 Bot commented Jul 2, 2026

Copy link
Copy Markdown

View all feedbacks in Devflow UI.

2026-07-02 15:23:30 UTC ℹ️ Start processing command /merge


2026-07-02 15:23:36 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in master is approximately 2h (p90).


2026-07-02 17:24:13 UTCMergeQueue: The build pipeline has timeout

The merge request has been interrupted because the build 1624692089015068691 took longer than expected. The current limit for the base branch 'master' is 120 minutes.

@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jul 2, 2026
@jpbempel

jpbempel commented Jul 2, 2026

Copy link
Copy Markdown
Member Author

/merge

@gh-worker-devflow-routing-ef8351

gh-worker-devflow-routing-ef8351 Bot commented Jul 2, 2026

Copy link
Copy Markdown

View all feedbacks in Devflow UI.

2026-07-02 18:04:48 UTC ℹ️ Start processing command /merge


2026-07-02 18:04:52 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in master is approximately 1h (p90).


2026-07-02 19:05:46 UTC ℹ️ MergeQueue: This merge request was merged

@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit 4190a5a into master Jul 2, 2026
592 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot deleted the jpbempel/cond-eval-timeout branch July 2, 2026 19:05
@github-actions github-actions Bot added this to the 1.64.0 milestone Jul 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp: debugger Dynamic Instrumentation type: enhancement Enhancements and improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants