Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New detector for potential XML injection #663

Merged

Conversation

baloghadamsoftware
Copy link
Contributor

The new detector detects cases where an unsafe string is injected into an XML string.
See: https://wiki.sei.cmu.edu/confluence/display/java/IDS16-J.+Prevent+XML+Injection

The new detector detects cases where an unsafe string is injected into an XML string.
See: https://wiki.sei.cmu.edu/confluence/display/java/IDS16-J.+Prevent+XML+Injection
@h3xstream
Copy link
Member

I really like the idea of this detector.
I think it will catch a lot of XML injection and HTML as well (XSS).

XSS tag

If you look at XssServletDetector.java, you will see that you can add a method protected int getPriority(Taint taint) {.
The Taint will include the taintedness of the parameter but also tags that were added to the flow affecting this specific parameter.

One of the tag is XSS_SAFE. In your case, this could be helpful to eliminate false positive when an escape function for XML/HTML is used.

@baloghadamsoftware
Copy link
Contributor Author

XSS tag

If you look at XssServletDetector.java, you will see that you can add a method protected int getPriority(Taint taint) {. The Taint will include the taintedness of the parameter but also tags that were added to the flow affecting this specific parameter.

One of the tag is XSS_SAFE. In your case, this could be helpful to eliminate false positive when an escape function for XML/HTML is used.

Thank you for taking your time to review my PR. I checked the XssServletDetector class. However, I could not found a straightforward way that my class inherits form either AbstractInjectionDetector or BasicInjectionDetector. Maybe I am wrong, but for me the most convenient way was to inherit from AbstractTaintDetector. That is why I self-commented that it should probably go into the taint package and not into the injection one. So I do not have an inherited getPriority() method. What do you propose? Should I somehow find a solution the I inherit from AbstractInjectionDetector or should I somehow copy the behavior of its getPriority() method into my class?

@h3xstream
Copy link
Member

I updated the test cases just to illustrate what I have in mind that would be the basic test cases to support.

https://github.com/find-sec-bugs/find-sec-bugs/pull/663/files#diff-c62f8d7c755aee879348841ed957a0bd9c11e2666640d159278542f45d4d31bb

Quick notes:

  • You don't need to test all taint analysis (source: field, function, constants). All variants are already covered in taint analysis test cases. (This is why I remove some tests.)
  • Test cases are the net to catch error in future changes. Samples can also be refactor by error. If the safe implementation is moved from one or two lines of code. The test case verifying that no defect is reported will not be working properly. In short, line of code assertion is optional. If you can separate, samples in different functions this makes it easy to verify. Line of code assertions can be helpful if we need to verify 10+ function signatures coverage without encapsulating each call.

@h3xstream
Copy link
Member

h3xstream commented Jan 27, 2022

Here is working implementation:

https://github.com/find-sec-bugs/find-sec-bugs/pull/663/files#diff-e491bb406d10a3e9d517ab35b844bab9a7d112f725ae85a6e90bad007b04dd81R62-R101

Just do a git pull origin on your branch to see my changes.

@baloghadamsoftware
Copy link
Contributor Author

Thank you for working on this. Your implementation is more cleaner. Sorry, I am quite new to the taint analysis, I already learned a lot but I did not have enough knowledge to implement the checker based on the InjectionDetector. However, I like your solution much better than mine. Should I rename NewXmlInjectionDetector to XmlInjectionDetector and delete my version?

It seems that the new solution is more general. When testing on 7 different products the previous version found only 4 issues:
https://codechecker-demo.eastus.cloudapp.azure.com/java-static-analysis/reports?expanded=%7B%22278%22%3A%7B%22limit%22%3A10,%22offset%22%3A0%7D,%22279%22%3A%7B%22limit%22%3A10,%22offset%22%3A0%7D,%22280%22%3A%7B%22limit%22%3A10,%22offset%22%3A0%7D,%22281%22%3A%7B%22limit%22%3A10,%22offset%22%3A0%7D,%22282%22%3A%7B%22limit%22%3A10,%22offset%22%3A0%7D,%22283%22%3A%7B%22limit%22%3A10,%22offset%22%3A0%7D,%22284%22%3A%7B%22limit%22%3A10,%22offset%22%3A0%7D%7D&run=FindSecBugs%20on%20MATSim-Libs%20%28user%3A%20find-sec-bugs,%20branch%3A%20master%29&run=FindSecBugs%20on%20SpotBugs%20%28user%3A%20find-sec-bugs,%20branch%3A%20master%29&run=FindSecBugs%20on%20OpenGrok%20%28user%3A%20find-sec-bugs,%20branch%3A%20master%29&run=FindSecBugs%20on%20Bt%20%28user%3A%20find-sec-bugs,%20branch%3A%20master%29&run=FindSecBugs%20on%20Jenkins%20%28user%3A%20find-sec-bugs,%20branch%3A%20master%29&run=FindSecBugs%20on%20Ttorrent%20%28user%3A%20find-sec-bugs,%20branch%3A%20master%29&run=FindSecBugs%20on%20NanoHTTPD%20%28user%3A%20find-sec-bugs,%20branch%3A%20master%29&run-tag-newcheck=1048&run-tag-newcheck=1047&run-tag-newcheck=1046&run-tag-newcheck=1045&run-tag-newcheck=1044&run-tag-newcheck=1042&run-tag-newcheck=1043

However, the new version finds much more: https://codechecker-demo.eastus.cloudapp.azure.com/java-static-analysis/reports?run=FindSecBugs%20on%20MATSim-Libs%20%28user%3A%20find-sec-bugs,%20branch%3A%20master%29&run=FindSecBugs%20on%20SpotBugs%20%28user%3A%20find-sec-bugs,%20branch%3A%20master%29&run=FindSecBugs%20on%20OpenGrok%20%28user%3A%20find-sec-bugs,%20branch%3A%20master%29&run=FindSecBugs%20on%20Bt%20%28user%3A%20find-sec-bugs,%20branch%3A%20master%29&run=FindSecBugs%20on%20Jenkins%20%28user%3A%20find-sec-bugs,%20branch%3A%20master%29&run=FindSecBugs%20on%20Ttorrent%20%28user%3A%20find-sec-bugs,%20branch%3A%20master%29&run=FindSecBugs%20on%20NanoHTTPD%20%28user%3A%20find-sec-bugs,%20branch%3A%20master%29&newcheck=FindSecBugs%20on%20MATSim-Libs%20%28user%3A%20baloghadamsoftware,%20branch%3A%20PotentialXMLInjection%29&newcheck=FindSecBugs%20on%20SpotBugs%20%28user%3A%20baloghadamsoftware,%20branch%3A%20PotentialXMLInjection%29&newcheck=FindSecBugs%20on%20Jenkins%20%28user%3A%20baloghadamsoftware,%20branch%3A%20PotentialXMLInjection%29&newcheck=FindSecBugs%20on%20OpenGrok%20%28user%3A%20baloghadamsoftware,%20branch%3A%20PotentialXMLInjection%29&newcheck=FindSecBugs%20on%20Bt%20%28user%3A%20baloghadamsoftware,%20branch%3A%20PotentialXMLInjection%29&newcheck=FindSecBugs%20on%20NanoHTTPD%20%28user%3A%20baloghadamsoftware,%20branch%3A%20PotentialXMLInjection%29&newcheck=FindSecBugs%20on%20Ttorrent%20%28user%3A%20baloghadamsoftware,%20branch%3A%20PotentialXMLInjection%29&items-per-page=100&is-unique=off&diff-type=New&checker-name=POTENTIAL_XML_INJECTION

Is the false-positive rate acceptable?

I also have a question: is removing injectableMethod from InjectionPoint necessary for this particular detector? Or could it be a separate PR?

@h3xstream
Copy link
Member

I also have a question: is removing injectableMethod from InjectionPoint necessary for this particular detector? Or could it be a separate PR?

No it is a refactor I inserted in the PR. A new PR would makes sense.. but since it is just a small change (unused method) I didn't care too much. 😄

@h3xstream
Copy link
Member

h3xstream commented Feb 3, 2022

I will probably release before we integrate the PR because it need some testing.

The test cases that you linked are excellent!
It seems to be a create sample set to test this new detector.

I look at all the 41 cases, here is what I found.
In most case, it is XML injection. Escaping would be appropriate.

Here are the false positive I found:

  • False positive 1 getUri() is safe because URL encoded.
  • False positive 2 NumberFormat.format( <someint> ) should be consider safe. It is only displaying numeric value.

Supporting those two scenario would improve all other rules that use taint analysis.

@baloghadamsoftware
Copy link
Contributor Author

baloghadamsoftware commented Feb 3, 2022

I will probably release before we integrate the PR because it need some testing.

Not a problem, if the next release is not too far. I am glad that you plan a release now because our users were asking exactly this question. Anyway, what kind of further tests do you suggest?

The test cases that you linked are excellent! It seems to be a create sample set to test this new detector.

We randomly picked some open-source projects which are different enough to test different detectors of SpotBugs and FindSecBugs. We always try our modifications on them before creating a PR.

I look at all the 41 cases, here is what I found. In most case, it is XML injection. Escaping would be appropriate.

Here are the false positive I found:

OK, but how could we detect this automatically? Based on its name?

  • False positive 2 NumberFormat.format( <someint> ) should be consider safe. It is only displaying numeric value.

Strange, because I see this line in taint-config/java-lang.txt:

Ljava/text/NumberFormat;:SAFE#IMMUTABLE

I thought that this meanse that every method of this class is safe.

Supporting those two scenario would improve all other rules that use taint analysis.

@baloghadamsoftware
Copy link
Contributor Author

Any news regarding this, @h3xstream? I intend to fix th two scenarios you mentioned, but please answer my questions in my previous comment to be able to do it. Thus how should I tell the detector that getUri() is safe, maybe put it into a taint-config file? And why do we get false positive for NumberFormat.format() if the whole class already appears in the config file as immutable and safe?

@h3xstream
Copy link
Member

Any news regarding this, @h3xstream? I intend to fix th two scenarios you mentioned, but please answer my questions in my previous comment to be able to do it. Thus how should I tell the detector that getUri() is safe, maybe put it into a taint-config file? And why do we get false positive for NumberFormat.format() if the whole class already appears in the config file as immutable and safe?

I must apolagize for the delay..
I look at the current implementation and I remembered having a failing test case (a True Negative).
Now that I have time again to evaluate the missing piece, I see that every test cases is passing in the latest implementation. I don't see any major change in your last commit. So I am a bit confused.
I added few test cases just to be confident about merging the current implementation.

TLDR: It look ready.

By the way thanks for cleaning up the code in your last commit. 👌

@h3xstream h3xstream merged commit 379fae3 into find-sec-bugs:master Jul 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants