Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support custom resolvers as XXE solution #7607

Open
carlosame opened this issue Jan 15, 2022 · 9 comments
Open

Support custom resolvers as XXE solution #7607

carlosame opened this issue Jan 15, 2022 · 9 comments

Comments

@carlosame
Copy link

I attempted to set up CodeQL in the EchoSVG project (css4j/echosvg#37) but found a false positive claiming a critical Java XXE vulnerability in:

https://github.com/css4j/echosvg/blob/f79f0b9e201ba927745d1645ead1879c8f89e981/echosvg-dom/src/main/java/io/sf/carte/echosvg/dom/util/SAXDocumentFactory.java#L463-L470

The code uses FEATURE_SECURE_PROCESSING (as well as other configurations) together with a custom resolver that, by default, is configured to not retrieve remote DTDs. The approach is described here:

https://css4j.github.io/resolver.html

And that's similar to the resolver approach that OWASP describes, but instead of a no-op it is using a preloaded subset of safe DTDs. That's consistent with SonarQube S2755:

And use an entity resolver (and optionally an XML Catalog) to resolve only trusted entities.

Given that CodeQL uses OWASP as a guide for the identification of security vulnerabilities, it would seem reasonable to have a way to avoid false positives when a custom EntityResolver is being used.

@carlosame carlosame added the question Further information is requested label Jan 15, 2022
@tausbn tausbn added Java false-positive and removed question Further information is requested labels Jan 17, 2022
@tausbn
Copy link
Contributor

tausbn commented Jan 17, 2022

Hello and thanks for your report!

If I understand your description correctly, this is actually a false positive, so I have changed the label accordingly, and I will bring it to the attention of the team in charge of the CodeQL Java analysis.

@carlosame
Copy link
Author

If I understand your description correctly, this is actually a false positive, so I have changed the label accordingly

Yes, the "question" label was somehow put automatically after posting the issue.

and I will bring it to the attention of the team in charge of the CodeQL Java analysis.

Thank you.

@smowton
Copy link
Contributor

smowton commented Jan 28, 2022

Very likely we wouldn't be able to analyse the difference between a safe / no-op / restricted entity resolver and a resolver that can still be dangerous. I've passed your query to the security lab for their opinion on whether we'd see more false positives or negatives if we considered setEntityResolver to be always sanitizing.

@carlosame
Copy link
Author

carlosame commented Jan 28, 2022

Very likely we wouldn't be able to analyse the difference between a safe / no-op / restricted entity resolver and a resolver that can still be dangerous.

I'm not familiar with CodeQL's analytical capabilities, but if it can determine that the resolver never returns null, that would be a good indication that the resolver may be safe.

I've passed your query to the security lab for their opinion on whether we'd see more false positives or negatives if we considered setEntityResolver to be always sanitizing.

Unfortunately there are quite a few cases out there where setEntityResolver is used with an unsafe resolver (because the other resolvers that I've seen can return null).

If the never-null check that I mention above could be done, then when such a resolver was set with setEntityResolver and FEATURE_SECURE_PROCESSING was also set, that would probably justify removing the warning or at least decreasing the severity.

EDIT: the main security issue with resolvers is when EntityResolver.resolveEntity returns null and systemId is not null. If systemId is null, the parser has no URL to open a connection to, so returning null does not matter.

@smowton
Copy link
Contributor

smowton commented Jan 31, 2022

Thanks for the constructive suggestion -- sounds like a workable solution, though a nontrivial one to implement and backlog. We'll take a look in more detail, but likely this won't be an immediate high priority, so in the meantime I'd recommend dismissing the false positive once you're sure it is indeed false.

@carlosame
Copy link
Author

I'd recommend dismissing the false positive once you're sure it is indeed false.

Yes it is a false positive. Unfortunately there are also 29 "high severity" claimed vulnerabilities that are benign integer or floating point overflows. For example, the following is a high severity vulnerability according to CodeQL:

ang += Math.PI;

where ang is a float.

I can't cope with 30 false positives in a single project so I don't think that I'll use CodeQL there. (I have other projects where it can bring a value though, and the effort of the CodeQL team is appreciated)

@smowton
Copy link
Contributor

smowton commented Feb 1, 2022

I see a few of those overflow warnings which we could remove because the RHS is a small constant (e.g., pi). However about 25 of the 30 really are potentially large doubles, and generally we don't know that the overflow is benign, so we'd intend to warn about these in general.

I'll ask someone from the code scanning team to advise how to switch off the information-losing-cast warning if that would be helpful for this project.

@smowton
Copy link
Contributor

smowton commented Feb 2, 2022

OK, got this working:

https://github.com/smowton/echosvg/blob/master/.github/workflows/codeql-analysis.yml#L48 adds a custom config file,

https://github.com/smowton/echosvg/blob/master/.github/workflows/codeql/codeql-config.yml which says "disregard the default queries, and instead use...

https://github.com/smowton/echosvg/blob/master/.github/workflows/codeql/custom-queries.qls which disables the overflow query.

With regard to security relevance: I'd look out for overflows that could lead to non-termination of loops for example, which could present an avenue to DoS a server running your software. However if you want to go down the route of muting this particular query (or queries), this example should do the trick.

@carlosame
Copy link
Author

OK, got this working:

Thank you, I pulled your files to master.

With regard to security relevance: I'd look out for overflows that could lead to non-termination of loops for example, which could present an avenue to DoS a server running your software.

Thanks for the advice. Several of the overflows do not really look exploitable (for example animations whose loops are based on
timing functions and not on the quantities being overflown). A few of the overflows are less clear though, and for example I would be less confident about the possible exploitability of anything involving the long-deprecated SVG fonts (that only Inkscape, EchoSVG and Batik still support).

A full analysis done by a security researcher perhaps could find something exploitable, maybe, the bad news are that there are already known ways to perform a full-blown DoS on a server that uses EchoSVG or Batik for SVG transcoding, and perhaps at some point I should try to fix the known ones first...

Development resources in open source projects are limited, you know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants