Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix dependency resolution for nested JARs (with fix for Spring Boot) #6720

Merged
merged 4 commits into from Mar 1, 2024

Conversation

smola
Copy link
Member

@smola smola commented Feb 22, 2024

What Does This Do

  • Fix dependency resolution for nested JARs, as originally implemented in Fix dependency resolution for nested JARs #6603
  • Now using our own JAR readers which avoid all interactions with Spring Boot loader (or whatever other esoteric URLConnection handlers we might find), and the end result is faster than re-using Spring Boot loader (see benchmarks below).

Motivation

Additional Notes

Jira ticket: APPSEC-51864

Original error reproduction

In the intermediate commits, you will find a test case reproducing the original error as well as a naive fix:

  • Introduce test by @manuel-alvarez-alvarez to reproduce the crash in Spring Boot 3 loader, as reported in Stream closed exception during Spring application startup #6704
  • Apply a fix for the above issue. The issue was caused by JarUrlConnection caching JarFile instances. Under certain order of actions, we would close the same JarFile instance while the application is using it to read something from it, causing an exception. Now we force disable the use of caching on our side with setUseCaches(false).

A test failure here would look like this (e.g. if you revert the last commit in this PR, which contains the actual fix):

java.io.IOException: Stream closed
	at java.util.zip.InflaterInputStream.ensureOpen(InflaterInputStream.java:68)
	at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:122)
	at org.springframework.boot.loader.net.protocol.jar.LazyDelegatingInputStream.read(LazyDelegatingInputStream.java:33)
	at datadog.telemetry.dependency.DependencyResolverSpring3Specification.resolving nested dependency does not interfere with application reading the same jar(DependencyResolverSpring3Specification.groovy:110)

Benchmarks

Master, no full scanning of nested dependencies:

DependencyResolverBenchmark.resolveNestedSpringBootJar  thrpt   25  288.602 ± 11.002  ops/ms
DependencyResolverBenchmark.resolveSimpleJar            thrpt   25   20.920 ±  2.308  ops/ms

Naive solution, re-using Spring's JAR URLConnection handler:

DependencyResolverBenchmark.resolveNestedSpringBootJar  thrpt   25   5.524 ± 0.162  ops/ms
DependencyResolverBenchmark.resolveSimpleJar            thrpt   25  20.918 ± 2.299  ops/ms

Final state, with our own JAR reader:

DependencyResolverBenchmark.resolveNestedSpringBootJar  thrpt   25  10.131 ± 0.364  ops/ms
DependencyResolverBenchmark.resolveSimpleJar            thrpt   25  22.093 ± 2.292  ops/ms

@smola smola changed the title Fix dependency resolution to nested JARs (with fix for Spring Boot) Fix dependency resolution for nested JARs (with fix for Spring Boot) Feb 22, 2024
@smola smola added the comp: telemetry Telemetry label Feb 22, 2024
@pr-commenter
Copy link

pr-commenter bot commented Feb 22, 2024

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master smola/nested-deps-again
git_commit_date 1709229545 1709279159
git_commit_sha a718db9 6c58278
release_version 1.31.0-SNAPSHOT~a718db98d8 1.31.0-SNAPSHOT~6c582788d6
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1709282031 1709282031
ci_job_id 447864320 447864320
ci_pipeline_id 29271131 29271131
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
module Agent Agent
parent None None
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 43 metrics, 11 unstable metrics.

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
end_time 2024-03-01T08:10:14 2024-03-01T08:28:50
git_branch master smola/nested-deps-again
git_commit_date 1709229545 1709279159
git_commit_sha a718db9 6c58278
release_version 1.31.0-SNAPSHOT~a718db98d8 1.31.0-SNAPSHOT~6c582788d6
start_time 2024-03-01T08:10:01 2024-03-01T08:28:37
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1709282031 1709282031
ci_job_id 447864320 447864320
ci_pipeline_id 29271131 29271131
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 14 unstable metrics.

Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.31.0-SNAPSHOT~6c582788d6, baseline=1.31.0-SNAPSHOT~a718db98d8
    dateFormat X
    axisFormat %s
section baseline
no_agent (367.919 µs) : 348, 388
.   : milestone, 368,
iast (469.793 µs) : 449, 490
.   : milestone, 470,
iast_FULL (532.882 µs) : 512, 553
.   : milestone, 533,
iast_GLOBAL (503.251 µs) : 481, 526
.   : milestone, 503,
iast_HARDCODED_SECRET_DISABLED (471.104 µs) : 451, 491
.   : milestone, 471,
iast_INACTIVE (448.893 µs) : 428, 469
.   : milestone, 449,
iast_TELEMETRY_OFF (470.709 µs) : 450, 492
.   : milestone, 471,
tracing (445.491 µs) : 425, 466
.   : milestone, 445,
section candidate
no_agent (366.407 µs) : 346, 386
.   : milestone, 366,
iast (477.585 µs) : 457, 498
.   : milestone, 478,
iast_FULL (537.312 µs) : 517, 558
.   : milestone, 537,
iast_GLOBAL (507.966 µs) : 485, 531
.   : milestone, 508,
iast_HARDCODED_SECRET_DISABLED (468.446 µs) : 448, 489
.   : milestone, 468,
iast_INACTIVE (445.544 µs) : 425, 466
.   : milestone, 446,
iast_TELEMETRY_OFF (464.909 µs) : 444, 486
.   : milestone, 465,
tracing (445.649 µs) : 425, 466
.   : milestone, 446,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 367.919 µs [348.169 µs, 387.67 µs] -
iast 469.793 µs [449.35 µs, 490.237 µs] 101.874 µs (27.7%)
iast_FULL 532.882 µs [512.28 µs, 553.485 µs] 164.963 µs (44.8%)
iast_GLOBAL 503.251 µs [480.962 µs, 525.54 µs] 135.332 µs (36.8%)
iast_HARDCODED_SECRET_DISABLED 471.104 µs [450.764 µs, 491.444 µs] 103.185 µs (28.0%)
iast_INACTIVE 448.893 µs [428.292 µs, 469.494 µs] 80.973 µs (22.0%)
iast_TELEMETRY_OFF 470.709 µs [449.812 µs, 491.607 µs] 102.79 µs (27.9%)
tracing 445.491 µs [425.119 µs, 465.862 µs] 77.571 µs (21.1%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 366.407 µs [346.367 µs, 386.447 µs] -
iast 477.585 µs [457.007 µs, 498.163 µs] 111.178 µs (30.3%)
iast_FULL 537.312 µs [516.844 µs, 557.781 µs] 170.905 µs (46.6%)
iast_GLOBAL 507.966 µs [485.114 µs, 530.817 µs] 141.559 µs (38.6%)
iast_HARDCODED_SECRET_DISABLED 468.446 µs [448.297 µs, 488.596 µs] 102.039 µs (27.8%)
iast_INACTIVE 445.544 µs [424.822 µs, 466.266 µs] 79.137 µs (21.6%)
iast_TELEMETRY_OFF 464.909 µs [444.275 µs, 485.543 µs] 98.502 µs (26.9%)
tracing 445.649 µs [425.198 µs, 466.099 µs] 79.242 µs (21.6%)
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.31.0-SNAPSHOT~6c582788d6, baseline=1.31.0-SNAPSHOT~a718db98d8
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.351 ms) : 1332, 1370
.   : milestone, 1351,
appsec (1.804 ms) : 1780, 1828
.   : milestone, 1804,
iast (1.523 ms) : 1500, 1547
.   : milestone, 1523,
profiling (1.544 ms) : 1521, 1568
.   : milestone, 1544,
tracing (1.505 ms) : 1481, 1528
.   : milestone, 1505,
section candidate
no_agent (1.346 ms) : 1327, 1365
.   : milestone, 1346,
appsec (1.785 ms) : 1761, 1809
.   : milestone, 1785,
iast (1.519 ms) : 1496, 1542
.   : milestone, 1519,
profiling (1.585 ms) : 1561, 1610
.   : milestone, 1585,
tracing (1.513 ms) : 1489, 1536
.   : milestone, 1513,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.351 ms [1.332 ms, 1.37 ms] -
appsec 1.804 ms [1.78 ms, 1.828 ms] 453.162 µs (33.6%)
iast 1.523 ms [1.5 ms, 1.547 ms] 172.768 µs (12.8%)
profiling 1.544 ms [1.521 ms, 1.568 ms] 193.601 µs (14.3%)
tracing 1.505 ms [1.481 ms, 1.528 ms] 153.896 µs (11.4%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.346 ms [1.327 ms, 1.365 ms] -
appsec 1.785 ms [1.761 ms, 1.809 ms] 439.461 µs (32.7%)
iast 1.519 ms [1.496 ms, 1.542 ms] 173.334 µs (12.9%)
profiling 1.585 ms [1.561 ms, 1.61 ms] 239.723 µs (17.8%)
tracing 1.513 ms [1.489 ms, 1.536 ms] 166.853 µs (12.4%)

Copy link
Contributor

@manuel-alvarez-alvarez manuel-alvarez-alvarez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@smola smola added the tag: do not merge Do not merge changes label Feb 27, 2024
@smola smola force-pushed the smola/nested-deps-again branch 3 times, most recently from e70344e to 37f798c Compare February 28, 2024 15:32
@smola smola removed the tag: do not merge Do not merge changes label Feb 28, 2024
try {
JarReader.Extracted metadata = null;
String path = null;
if ("file".equals(scheme)) {
File f;
if (uri.isOpaque()) {
f = new File(uri.getSchemeSpecificPart());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason of doing it like this instead of:

File f = Paths.get(uri).toFile();

I think this part is not covered by any test 😢

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The background story here is tricky, so I'm not touching this logic until I can test it properly. We'll probably require Windows tests for it. See #4820

if (innerJarPath.endsWith("!/")) {
innerJarPath = innerJarPath.substring(0, innerJarPath.length() - 2);
}
try (final JarFile outerJar = new JarFile(outerJarPath, false /* no verify */)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, but is it a thing to have more levels of nested jars?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have seen at least 1 user report at Spring Boot asking about a third level of nesting, but Spring Boot does not support that. It would definitely be possible to do it with a custom URLConnection handler. But until we find some real scenario, I would rather not implement it.

Also note that, while it should be relatively easy to add support for an arbitrary nesting depth, we would always set a limit to nesting for performance reasons since each additional level of nesting needs to decompress a stream.

@smola smola merged commit d41fb9c into master Mar 1, 2024
79 checks passed
@smola smola deleted the smola/nested-deps-again branch March 1, 2024 09:47
@github-actions github-actions bot added this to the 1.31.0 milestone Mar 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants