Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add script engine injection sanitizer with real life example #531

Conversation

gdemarcsek
Copy link
Contributor

This merge request adds a sanitiser to detect data flows of user input into the javax.script.ScriptEngine#eval sink which often results in arbitrary code execution (CWE-94) using a registered scripting engine.

JVM script engines (JSR-223 implementations) can be registered under multiple arbitrary names and they can implement any language (although ECMAScript is probably the most widely used one in real applications).

This sanitizer implementation only confirms data flow, it does not require successful script execution with a registered script engine (similarly to how the OsCommandInjection sanitizer does not guide the fuzzer into running the actual command). As soon as an input produces an execution that reaches the evaluation of the “1+1” expression by a javax.script.ScriptEngine implementation, we report a finding.

I have added a test that demonstrates the sanitizer by rediscovering the CVE-2022-42889 in Apache Commons Text using minimal assumptions of the library and the finding. (It does however limit the length of the input taken from the fuzzer and it adds a hook to avoid the ${const:…} interpolation which can be used to read public static members of any loadable class - not remotely as promising as the ${script:…} payload.)

bazel clean && bazel run --local_cpu_resources=8 //examples:CommonsTextFuzzer
...
INFO: Build completed successfully, 443 total actions
exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //examples:CommonsTextFuzzer
-----------------------------------------------------------------------------
INFO: Loaded 125 hooks from com.code_intelligence.jazzer.runtime.TraceCmpHooks
INFO: Loaded 4 hooks from com.code_intelligence.jazzer.runtime.TraceDivHooks
INFO: Loaded 2 hooks from com.code_intelligence.jazzer.runtime.TraceIndirHooks
INFO: Loaded 4 hooks from com.code_intelligence.jazzer.runtime.NativeLibHooks
INFO: Loaded 1 hooks from com.example.CommonsTextFuzzerHooks
INFO: Loaded 8 hooks from com.code_intelligence.jazzer.sanitizers.Deserialization
INFO: Loaded 5 hooks from com.code_intelligence.jazzer.sanitizers.ExpressionLanguageInjection
INFO: Loaded 70 hooks from com.code_intelligence.jazzer.sanitizers.LdapInjection
INFO: Loaded 46 hooks from com.code_intelligence.jazzer.sanitizers.NamingContextLookup
INFO: Loaded 1 hooks from com.code_intelligence.jazzer.sanitizers.OsCommandInjection
INFO: Loaded 52 hooks from com.code_intelligence.jazzer.sanitizers.ReflectiveCall
INFO: Loaded 8 hooks from com.code_intelligence.jazzer.sanitizers.RegexInjection
INFO: Loaded 16 hooks from com.code_intelligence.jazzer.sanitizers.RegexRoadblocks
INFO: Loaded 19 hooks from com.code_intelligence.jazzer.sanitizers.SqlInjection
INFO: Loaded 5 hooks from com.code_intelligence.jazzer.sanitizers.ScriptEngineInjection
INFO: Instrumented com.example.CommonsTextFuzzer (took 17 ms, size +10%)
INFO: libFuzzer ignores flags that start with '--'
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 3229358036
INFO: Loaded 1 modules   (512 inline 8-bit counters): 512 [0x160278000, 0x160278200),
INFO: Loaded 1 PC tables (512 PCs): 512 [0x15a660e00,0x15a662e00),
INFO: -fork=8: fuzzing in separate process(s)
INFO: -fork=8: 0 seed inputs, starting to fuzz in /var/folders/j7/bry6w71d1jl97rd7s3cy3yrm0000gn/T//libFuzzerTemp.FuzzWithFork68681.dir
#24988: cov: 0 ft: 0 corp: 0 exec/s 12494 oom/timeout/crash: 0/0/0 time: 7s job: 1 dft_time: 0
#73551: cov: 278 ft: 843 corp: 109 exec/s 16187 oom/timeout/crash: 0/0/0 time: 11s job: 2 dft_time: 0
#177135: cov: 308 ft: 1068 corp: 241 exec/s 25896 oom/timeout/crash: 0/0/0 time: 13s job: 3 dft_time: 0
#295530: cov: 338 ft: 1314 corp: 391 exec/s 23679 oom/timeout/crash: 0/0/0 time: 16s job: 4 dft_time: 0
#464996: cov: 341 ft: 1365 corp: 436 exec/s 28244 oom/timeout/crash: 0/0/0 time: 19s job: 5 dft_time: 0
#668737: cov: 415 ft: 1617 corp: 547 exec/s 29105 oom/timeout/crash: 0/0/0 time: 22s job: 6 dft_time: 0
#895854: cov: 637 ft: 2369 corp: 625 exec/s 28389 oom/timeout/crash: 0/0/0 time: 26s job: 7 dft_time: 0
...
#453559	REDUCE cov: 745 ft: 4257 corp: 1474/25Kb lim: 661 exec/s: 5335 rss: 924Mb L: 21/68 MS: 1 EraseBytes-
Warning: Nashorn engine is planned to be removed from a future JDK release
Warning: Nashorn engine is planned to be removed from a future JDK release
#455106	REDUCE cov: 745 ft: 4257 corp: 1474/25Kb lim: 670 exec/s: 5354 rss: 924Mb L: 12/68 MS: 2 CopyPart-EraseBytes-
#455172	REDUCE cov: 745 ft: 4257 corp: 1474/25Kb lim: 670 exec/s: 5354 rss: 924Mb L: 13/68 MS: 1 EraseBytes-
Warning: Nashorn engine is planned to be removed from a future JDK release
Warning: Nashorn engine is planned to be removed from a future JDK release
#456349	REDUCE cov: 745 ft: 4257 corp: 1474/25Kb lim: 679 exec/s: 5368 rss: 924Mb L: 14/68 MS: 2 CopyPart-EraseBytes-
#456545	REDUCE cov: 745 ft: 4257 corp: 1474/25Kb lim: 679 exec/s: 5371 rss: 924Mb L: 12/68 MS: 1 EraseBytes-
Warning: Nashorn engine is planned to be removed from a future JDK release
Warning: Nashorn engine is planned to be removed from a future JDK release

== Java Exception: com.code_intelligence.jazzer.api.FuzzerSecurityIssueCritical: Possible script execution
	at com.code_intelligence.jazzer.sanitizers.ScriptEngineInjection.checkScriptEngineExecute(ScriptEngineInjection.java:114)
	at org.apache.commons.text.lookup.ScriptStringLookup.lookup(ScriptStringLookup.java:86)
	at org.apache.commons.text.lookup.InterpolatorStringLookup.lookup(InterpolatorStringLookup.java:135)
	at org.apache.commons.text.StringSubstitutor.resolveVariable(StringSubstitutor.java:1067)
	at org.apache.commons.text.StringSubstitutor.substitute(StringSubstitutor.java:1433)
	at org.apache.commons.text.StringSubstitutor.substitute(StringSubstitutor.java:1308)
	at org.apache.commons.text.StringSubstitutor.replace(StringSubstitutor.java:816)
	at com.example.CommonsTextFuzzer.fuzzerTestOneInput(CommonsTextFuzzer.java:24)
DEDUP_TOKEN: f10483b167fba3d6
== libFuzzer crashing input ==
MS: 2 ChangeASCIIInt-CMP- DE: "1+1"-; base unit: 6931eff51adba2b80c902e62511ee20b7fe8b3ae
0x24,0x7b,0x73,0x63,0x72,0x69,0x70,0x74,0x3a,0x6a,0x73,0x3a,0x31,0x2b,0x31,0x7d,0x4,0xe7,
${script:js:1+1}\004\347
artifact_prefix='/private/var/tmp/_bazel_gyorgydemarcsek/071a3ae2fb95e15eb05d82d6a2a2dd28/execroot/jazzer/bazel-out/darwin_arm64-opt/testlogs/examples/CommonsTextFuzzer/test.outputs/'; Test unit written to /private/var/tmp/_bazel_gyorgydemarcsek/071a3ae2fb95e15eb05d82d6a2a2dd28/execroot/jazzer/bazel-out/darwin_arm64-opt/testlogs/examples/CommonsTextFuzzer/test.outputs/crash-7b8a90d41dff830e912476dbe9304d2b56e9235e
Base64: JHtzY3JpcHQ6anM6MSsxfQTn
stat::number_of_executed_units: 456787
stat::average_exec_per_sec:     5373
stat::new_units_added:          3165
stat::slowest_unit_time_sec:    0
stat::peak_rss_mb:              924
Warning: Nashorn engine is planned to be removed from a future JDK release
reproducer_path='/private/var/tmp/_bazel_gyorgydemarcsek/071a3ae2fb95e15eb05d82d6a2a2dd28/execroot/jazzer/bazel-out/darwin_arm64-opt/testlogs/examples/CommonsTextFuzzer/test.outputs'; Java reproducer written to /private/var/tmp/_bazel_gyorgydemarcsek/071a3ae2fb95e15eb05d82d6a2a2dd28/execroot/jazzer/bazel-out/darwin_arm64-opt/testlogs/examples/CommonsTextFuzzer/test.outputs/Crash_7b8a90d41dff830e912476dbe9304d2b56e9235e.java
INFO: exiting: 19712 time: 2686s

I have managed to reproduce this a couple of times in reasonable running times on a laptop.

Copy link
Contributor

@fmeum fmeum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great contribution, we certainly don't want to miss detection capabilities for this type of issue.

I will read up on scripting engines and comment on the actual sanitizer a bit more. For the meantime, I left some basic review comments.

@bertschneider
Copy link
Contributor

Hey @gdemarcsek, did you get some time to adapt @fmeum comments? It would be great if we could finish this PR, as it's really valuable and could uncover many bugs.

@gdemarcsek
Copy link
Contributor Author

@bertschneider Got a bit drifted away from this, thx for the reminder, will give it a go this weekend. :)

@gdemarcsek gdemarcsek force-pushed the feat/add-script-engine-sanitizer branch 2 times, most recently from d311fbb to 24cc432 Compare March 7, 2023 21:42
@bertschneider
Copy link
Contributor

I thought about the false positive rate of raising a finding whenever the scripting engine is invoked and ways to reduce that possibility.

As I understand the documentation and other blog posts one would restrict the script engine to only access allowed classes. Using a properly secured script engine would still be fine.

To take that into account we could try to invoke Jazzer::reportFindingFromHook and/or block the execution thread (e.g. with an endless loop) to detect denial of service attacks. This would require some code and would not be appropriate for the guideTowardsContainment approach anymore.

But it should still be possible to guide the fuzzer to invoke the internal load function to execute a provided script, which could trigger the finding. E.g. load('classpath:script.js').

What do you think about this approach?

@bertschneider
Copy link
Contributor

I could also give it a try, if you like 😄

@bertschneider
Copy link
Contributor

I don't want to annoy, but any update on this sanitizer? I think it could be a great addition to the project.

@gdemarcsek
Copy link
Contributor Author

Sorry about the delay. Some thoughts (tl;dr I would ignore sandboxing and would avoid relying on any specific script language or engine but some smaller changes might make sense like checking containment instead of equality and randomizing the injected string to avoid detecting constants that happen to be the same):

I thought about the false positive rate of raising a finding whenever the scripting engine is invoked and ways to reduce that possibility.

If this raises a finding whenever the scripting engine is invoked, then I think I misunderstood the entire project. :( I thought the point of this is to prove that user-supplied input flows into the script engine sink, i.e. no constant string would trigger the detector provided it doesn't happen to be the test script - having said that, it might be better to randomize it, as it can be any string really given we are not checking for actual execution).

In fact, it's already quite low FP IMHO because it checks for equality, not even containment, so it detects only if you can control the entire script - maybe guideTowardsContainment would be more realistic as controlling even some parts can be dangerous, but it's a question of balancing FP and FN. I am happy to make any of these two changes ofc (randomized payload and/or containment instead of equality).

As I understand the documentation and other blog posts one would restrict the script engine to only access allowed classes. Using a properly secured script engine would still be fine.

As far as I understand, such sandboxing capabilities would be quite engine-specific and rely on engine-specific APIs (like the Rhino blog post you linked or this guide for Nashorn: https://docs.oracle.com/javase/10/nashorn/nashorn-java-api.htm#JSNUG125). This feature would only detect injections via the Java Scripting API (JSR 223) as it would be quite difficult to write a difficult to every single script engine out there. (Similarly how you already detect SQLi by hooking JDBC classes, not specific SQL driver classes.) On a more principled note, it's impossible to reason about the environment of the application from the vantage point of the detector IMHO. (Sure, there might be a sandbox, but is it really secure? How "far" should I look for it, for example the Java process might run in a container which may be quite restricted. Similarly, should I not report a confirmed command injection if AppArmor/SELinux/Seccomp/Capabilities might be making it difficult (impossible?) to spawn child processes?) You guys know better, but I think the vulnerability is still there - sandboxes mitigate (reduce risk), but do not remediate (remove root cause). I think the only way to avoid script injection while allowing the scripts to access user-input is via bindings maybe, but even then, user-controlled input should not flow to eval directly so it might make sense actually to even hook the versions of eval that do receive a ScriptContext or Bindings argument but I didn't want to go that far for a first iteration.

To take that into account we could try to invoke Jazzer::reportFindingFromHook and/or block the execution thread (e.g. with an endless loop) to detect denial of service attacks. This would require some code and would not be appropriate for the guideTowardsContainment approach anymore.

But it should still be possible to guide the fuzzer to invoke the internal load function to execute a provided script, which could trigger the finding. E.g. load('classpath:script.js').

What do you think about this approach?

I am not sure I fully understand this part, but indeed, my initial approach was to actually execute the script and confirm evaluation, e,g. execute "1 + 1" with the script engine and check that I get 2 or similar POCs (like load()). Then I thought the problem is that whatever we write there, it will be specific to the engine and its language which we cannot really anticipate and support universally. I think you might have faced the same dilemma with the command injection detector which also doesn't actually verify command execution, only injection. So yeah, it depends on what you really want to detect - my goal was to detect user-controllable data flow into javax.script.ScriptEngine.eval sink and show that it could have found a specific CVE without too much knowledge about the tested library.

@bertschneider
Copy link
Contributor

bertschneider commented Apr 17, 2023

I misphrased that, sorry, and your understanding is correct! I wondered if raising a finding whenever the script engine is invoked with the fixed string would be sufficient or not.

The argument that we would need script engine specific inputs to trigger findings from within the script engines themselves is quite convincing and I would also try to avoid that. I only thought about JS/Nashorn, as that mainly caused problems in the past, but there are plenty of other script engines out there as well.

I agree that user controlled input should not be part of the script content directly and could potentially circumvent the sandbox. Verifying that behavior, should be sufficient for this bug detector.
In that scenario checking for containment would be better, as you mentioned. It's not very likely but 1+1 could be included in a script content, so we should change that to something Jazzer specific before switching to a containment check. Could you change that part as offered?

If only user controlled input in the script content should be detected, the check could be done before the actual invocation. Then the hooks registerEngineName and getEngineByName would not be necessary anymore, wouldn't they?

@bertschneider
Copy link
Contributor

Friendly ping 😃

@bertschneider bertschneider force-pushed the feat/add-script-engine-sanitizer branch from 3c02253 to 2733872 Compare June 1, 2023 06:56
@bertschneider
Copy link
Contributor

@gdemarcsek I took the liberty to rebase your PR an master and also incorporate our last discussion. Switching to a contains check somewhat complicates the detection. The content of Reader parameters can now fully be read, checked if it contains our payload and passed on to the String version of eval. To do so requires some class loading trickery, though. WDYT of this approach?
I also noticed that there are two overloaded versions of eval, which are now hooked as well.

@bertschneider bertschneider force-pushed the feat/add-script-engine-sanitizer branch from 2733872 to d51403d Compare June 2, 2023 08:53
@bertschneider bertschneider force-pushed the feat/add-script-engine-sanitizer branch from d51403d to 9f17243 Compare June 2, 2023 09:43
@bertschneider
Copy link
Contributor

@gdemarcsek Do you want to take another look? Otherwise we will merge this PR in a few days. Further changes are always possible, though.

Check for containment of payload in script content in all overloads of
eval.
@bertschneider bertschneider force-pushed the feat/add-script-engine-sanitizer branch from 9f17243 to af3ae4d Compare June 13, 2023 06:19
@bertschneider bertschneider enabled auto-merge (rebase) June 13, 2023 06:21
@bertschneider bertschneider merged commit 7dcfa35 into CodeIntelligenceTesting:main Jun 13, 2023
9 checks passed
@bertschneider
Copy link
Contributor

Merged the PR now. @gdemarcsek, thanks again for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants