Skip to content

Java: CWE-502 Unsafe JSON deserialization with Gson, Flexjson, Jabsorb and JoddJson #5954

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

luchua-bc
Copy link
Contributor

@luchua-bc luchua-bc commented May 25, 2021

Deserialization of untrusted data with JSON frameworks could introduce critical security vulnerabilities such as Remote Code Execution (RCE).

This query checks four popular JSON frameworks - Gson, Flexjson, JoddJson, and Jabsorb and detects the following vulnerabilities:

  • Gson
    • both data and type of deserialized object are controlled by attackers
  • JoddJson
    • both data and type of deserialized object are controlled by attackers
    • whitelisted class types that are overly generic or have vulnerable subclasses
  • Flexjson
    • both data and type of deserialized object are controlled by attackers
    • any configured root class type that is not final or is overly generic
  • Jabsorb
    • both data and type of deserialized object are controlled by attackers

Some known CVEs and issues related to these JSON implementations are:

Please consider to merge the PR. Thanks.

@luchua-bc luchua-bc force-pushed the java/unsafe-gson-flexjson-joddjson-deserialization branch from 40e38ff to ce94c86 Compare June 2, 2021 17:27
@luchua-bc luchua-bc force-pushed the java/unsafe-gson-flexjson-joddjson-deserialization branch from bcf9ed5 to e879ae7 Compare June 14, 2021 03:43
@smowton
Copy link
Contributor

smowton commented Jun 14, 2021

I've started evaluating this as-is, but for future reference, could the new sanititzer be implemented with Configuration::isSanitizer(DataFlow::Node) instead of by not exists side-conditions in the predicate unsafeDeserialization?

@luchua-bc
Copy link
Contributor Author

Thanks @smowton for reviewing this PR. I'll make the change of moving not exists to isSanitizer together with other changes to be made for your suggestions.

@luchua-bc
Copy link
Contributor Author

@smowton I've made the change. Please review and let me know if more changes are needed. Thanks.

@smowton
Copy link
Contributor

smowton commented Jun 24, 2021

@luchua-bc please rebase this now that the other unsafe-deserialization PRs have landed

@luchua-bc luchua-bc force-pushed the java/unsafe-gson-flexjson-joddjson-deserialization branch 2 times, most recently from 0a1e6aa to 1c2c438 Compare June 25, 2021 01:13
@luchua-bc
Copy link
Contributor Author

@smowton I've done the rebase. Please review. Thanks.

@smowton
Copy link
Contributor

smowton commented Jun 25, 2021

File "ql/java/ql/test/stubs/joddjson-6.0.3/jodd/json/JsonParser.java" contains a non-ASCII character at the location marked with | in:

ch opens an actual security hole. Such classes are known as |

File "ql/java/ql/src/semmle/code/java/frameworks/google/Gson.qll" contains a non-ASCII character at the location marked with | in:

Gson and
    this.hasName("fromJson")
  }
}

/** The `toJson|

ASCII check failed!
FAILED: ascii-check.py

These will probably be non-breaking spaces or similar

@luchua-bc
Copy link
Contributor Author

@smowton I've fixed those two issues - one caused by the Windows style double quotes in the stub source file, and the other one was a Unicode character in Gson.qll.

Please run the ASCII check again. Thanks.

@smowton
Copy link
Contributor

smowton commented Jun 30, 2021

Also this needs a rebase

@luchua-bc
Copy link
Contributor Author

I've made the changes on the rebased branch so it seems that another rebase is not required. Please advise if so. Thanks.

@smowton
Copy link
Contributor

smowton commented Jun 30, 2021

@luchua-bc there are still conflicts with main that prevent directly merging this branch. Is there another one somewhere? What is the rebased branch you speak of?

@luchua-bc
Copy link
Contributor Author

I rebased my branch java/unsafe-gson-flexjson-joddjson-deserialization onto the main. Sorry that there are still some conflicts. I will try to rebase again.

@smowton
Copy link
Contributor

smowton commented Jun 30, 2021

A few questions:

  1. I still wonder what is important about the Class argument to deserialize being final? If you saw a version passed a non-final class would you think it was dangerous, and why?
  2. Annoyingly the use methods aren't well documented; do you know if you'd expect x.use(A.class); x.deserialize(...) to be safe, or it the returned deserializer distinct from the qualifier?
  3. What about the other overloads of use(...), do you expect them to be safe? Would it be sensible to consider any use(...) sanitizing?
  4. What about the deserialize overloads that take an ObjectFactory, are those likely to be safe?
  5. Finally because deserializeInto takes an actual object, do you expect that to constrain the deserialized type also and therefore be safe?

@luchua-bc
Copy link
Contributor Author

What I did is on my branch:

git checkout java/unsafe-gson-flexjson-joddjson-deserialization
git fetch https://github.com/github/codeql.git main
git rebase FETCH_HEAD
<resolve conflicted files and add my changes>
git add *
git push origin java/unsafe-gson-flexjson-joddjson-deserialization -f

Please let me know if my procedure is not correct. Thanks.

@smowton
Copy link
Contributor

smowton commented Jun 30, 2021

You need a git rebase --continue after that git add to note the conflicts are resolved.

@luchua-bc
Copy link
Contributor Author

Hi @smowton,

Please find my comments below:

A few questions:

  1. I still wonder what is important about the Class argument to deserialize being final? If you saw a version passed a non-final class would you think it was dangerous, and why?

If a class is non-final and one of its subclasses is vulnerable, attacker can feed a malicious payload with an object of the vulnerable subclass.

  1. Annoyingly the use methods aren't well documented; do you know if you'd expect x.use(A.class); x.deserialize(...) to be safe, or it the returned deserializer distinct from the qualifier?

If x.use(A.class); x.deserialize(...) is used, only an object of the specified class A can be taken (not objects of subclasses).

  1. What about the other overloads of use(...), do you expect them to be safe? Would it be sensible to consider any use(...) sanitizing?

I will test all the other overloads and verify. I have assumed using use(...) will make the class object more specific therefore not vulnerable. Not using use(...) definitely means developers don't have the extra security control.

  1. What about the deserialize overloads that take an ObjectFactory, are those likely to be safe?

I will test and verify.

  1. Finally because deserializeInto takes an actual object, do you expect that to constrain the deserialized type also and therefore be safe?

As this method is not widely used, I have tested it yet. I will test and verify then report back.

@luchua-bc luchua-bc force-pushed the java/unsafe-gson-flexjson-joddjson-deserialization branch from fed7142 to d0854e9 Compare July 2, 2021 02:33
@luchua-bc luchua-bc force-pushed the java/unsafe-gson-flexjson-joddjson-deserialization branch from d0854e9 to 42948c6 Compare July 2, 2021 03:16
@luchua-bc
Copy link
Contributor Author

Hi @smowton,

For questions 3-5:

  1. What about the other overloads of use(...), do you expect them to be safe? Would it be sensible to consider any use(...) sanitizing?

I've verified that the other overloads of use(...) help to put a constraint on the class type to deserialize into as shown in the newly added test case below in FlexjsonServlet.java:

    // GOOD: Specify the concrete class type to deserialize with `ObjectFactory`
    public void doPut3(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String json = req.getParameter("json");
        Person person = new JSONDeserializer<Person>().use(Person.class, new ExistingObjectFactory(new Person())).deserialize(json);
    }
  1. What about the deserialize overloads that take an ObjectFactory, are those likely to be safe?

deserialize overloads that take an ObjectFactory are safe so I've modified the query to accommodate this. A new test case has been added:

    // GOOD: Specify the concrete class type to deserialize with `ObjectFactory`
    public void doPut4(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String json = req.getParameter("json");
        Person person = new JSONDeserializer<Person>().deserialize(json, new ExistingObjectFactory(new Person()));
    }
  1. Finally because deserializeInto takes an actual object, do you expect that to constrain the deserialized type also and therefore be safe?

deserializeInto takes an actual object, which invokes the deserialize overloads that take an ObjectFactory behind the scene. Therefore it's safe as well. I've modified the query and added the following test case:

    // GOOD: Specify the class type to deserialize into
    public void doPut5(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String json = req.getParameter("json");
        Person person = new JSONDeserializer<Person>().deserializeInto(json, new Person());
    }

And I've rebased the code. Please review.

Thanks,
@luchua-bc

@smowton smowton changed the title Java: CWE-502 Unsafe JSON deserialization with Gson, Flexjson, and JoddJson Java: CWE-502 Unsafe JSON deserialization with Gson, Flexjson, Jabsorb and JoddJson Jul 12, 2021
@smowton
Copy link
Contributor

smowton commented Jul 15, 2021

That's not quite what I meant regarding Gson, but at this point it's faster to just write what I mean rather than try to explain further.

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some other improvements, but happy to make these myself. Would be glad to hear your take re: the false-positive vs. false-negative risk and non-final classes.

|
dma.getMethod() instanceof DeserializerUseMethod and
(
ma.getQualifier() = dma or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth noting that

JSONDeserializer<Person> jsd = new flexjson.JSONDeserializer<Person>();
jsd.use(null, Person.class);
jsd.deserialize(json);

is also safe. Without taking on all the cost of a subsidiary Configuration, it might be best to achieve this by modelling all the use(...) methods as having value-flow from qualifier to return value, then using localFlow(any(MethodAccess ma | ma.getMethod() instanceof DeserializerUseMethod).getQualifier(), ...)

This would be best achieved by something like adding a lib file that is imported from ExternalFlow.qll that contains:

class FlexJsonModels extends SummaryModelCsv {
  override predicate row(string row) {
    row =
      [
        "flexjson;JSONDeserializer;false;use;;;Argument[-1];ReturnValue;value"
      ]
  }
}

This says that the use methods are fluent (i.e., they return this), and enables localFlow to do the rest of the work for us.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great. This looks like a nice solution and I will learn a new technique from your final version. Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smowton - for this one, will you make the change or do you want me to make the change following your code snippet? Please advise. Thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll pick it up from here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @smowton:-)

@luchua-bc luchua-bc dismissed a stale review via cff33a6 July 19, 2021 15:20
Copy link
Contributor

@Marcono1234 Marcono1234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding Flexjson, it appears only usage of use(String, ...) where the path is null or use(ObjectFactory, String...) where the string varargs (or array) contains null is safe. In all other cases (even empty String as Path) it applies only to sub elements of the root JSON value. For example, the following has no effect and calls the malicious constructor:

new JSONDeserializer<Safe>()
    // Only applies to JSON path 'abc'
    .use("abc", Safe.class)
    // Only affects *how* type is deserialized, but not *as which* type JSON is deserialized 
    .use(Safe.class, new ExistingObjectFactory(new Safe()))
    .deserialize(json);

Additionally, if I see that correctly even if you explicitly specify a type, the application might still be vulnerable if that type has fields or setter methods with type Object (or similar), but that might be out of scope for this pull request.

And not really relevant for this query, but Flexjson initializes arbitrary classes in case no thread context class loader is set (there might be one by default), see https://sourceforge.net/p/flexjson/code/162/tree/trunk/src/main/java/flexjson/ObjectBinder.java#l245. That code does not appear to be in the latest commits (maybe it was just moved somewhere else), but they have not been released anyway.

@luchua-bc
Copy link
Contributor Author

@Marcono1234 Thanks for reviewing the PR and I've made requested changes. Please review.

@luchua-bc
Copy link
Contributor Author

@smowton and @Marcono1234 - with the most recent change, do we still need to revamp the query using FlexJsonModels?

smowton added a commit to smowton/codeql that referenced this pull request Aug 4, 2021
@smowton
Copy link
Contributor

smowton commented Aug 4, 2021

Jabsorb support factored with existing Jackson support and split off as #6419

@luchua-bc
Copy link
Contributor Author

Thanks @smowton.

@smowton
Copy link
Contributor

smowton commented Aug 5, 2021

Part 2: Jodd JSON: #6434

@smowton
Copy link
Contributor

smowton commented Aug 10, 2021

Part 3: #6456

Part 4: #6463

That last one implements Gson support and takes quite a different approach, introducing a single step from an Intent to the Parcel parameter of a createFromParcel method whenever the result of getParcelableExtra flows to something of the corresponding type. I also changed the approach to look for concurrent user control of the text and type, similar to our existing analysis of Flexjson.

@luchua-bc
Copy link
Contributor Author

Wow! That's a lot of changes. Thanks @smowton a lot for your help. For future PRs, I will submit one PR per library/query.

@smowton
Copy link
Contributor

smowton commented Aug 10, 2021

That's useful, especially when targeting non-experimental which is always going to require extra effort on our part.

@smowton
Copy link
Contributor

smowton commented Oct 12, 2021

Last bits of this are now merged

@smowton smowton closed this Oct 12, 2021
@luchua-bc
Copy link
Contributor Author

Thanks @smowton a lot for all your help with this complex and comprehensive PR, which is literally 4 PRs:-).

@luchua-bc luchua-bc deleted the java/unsafe-gson-flexjson-joddjson-deserialization branch October 12, 2021 14:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants