Skip to content

Java: Adapt unsafe deserialization to SnakeYaml 2.0, which is secure by default #13347

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

jorgectf
Copy link
Contributor

@jorgectf jorgectf commented Jun 1, 2023

SnakeYaml became secure by default against unsafe deserialization in its 2.0 release. Now only a specific configuration can enable the deserialization of arbitrary classes.

This PR adapts the SnakeYaml library to the updated logic: it now only raises an alert when an overly permissive isGlobalTagAllowed is configured in the YAML parser.

@jorgectf jorgectf requested a review from a team as a code owner June 1, 2023 15:13
@github-actions github-actions bot added the Java label Jun 1, 2023
@atorralba atorralba force-pushed the jorgectf/snakeyaml-extend-safeconstructor branch from db47d9c to 75a1dc8 Compare June 2, 2023 10:51
@github-actions
Copy link
Contributor

github-actions bot commented Jun 2, 2023

QHelp previews:

java/ql/src/Security/CWE/CWE-502/UnsafeDeserialization.qhelp

Deserialization of user-controlled data

Deserializing untrusted data using any deserialization framework that allows the construction of arbitrary serializable objects is easily exploitable and in many cases allows an attacker to execute arbitrary code. Even before a deserialized object is returned to the caller of a deserialization method a lot of code may have been executed, including static initializers, constructors, and finalizers. Automatic deserialization of fields means that an attacker may craft a nested combination of objects on which the executed initialization code may have unforeseen effects, such as the execution of arbitrary code.

There are many different serialization frameworks. This query currently supports Kryo, XmlDecoder, XStream, SnakeYaml, JYaml, JsonIO, YAMLBeans, HessianBurlap, Castor, Burlap, Jackson, Jabsorb, Jodd JSON, Flexjson, Gson and Java IO serialization through ObjectInputStream/ObjectOutputStream.

Recommendation

Avoid deserialization of untrusted data if at all possible. If the architecture permits it then use other formats instead of serialized objects, for example JSON or XML. However, these formats should not be deserialized into complex objects because this provides further opportunities for attack. For example, XML-based deserialization attacks are possible through libraries such as XStream and XmlDecoder.

Alternatively, a tightly controlled whitelist can limit the vulnerability of code, but be aware of the existence of so-called Bypass Gadgets, which can circumvent such protection measures.

Recommendations specific to particular frameworks supported by this query:

FastJson - com.alibaba:fastjson

  • Secure by Default: Partially
  • Recommendation: Call com.alibaba.fastjson.parser.ParserConfig#setSafeMode with the argument true before deserializing untrusted data.

FasterXML - com.fasterxml.jackson.core:jackson-databind

  • Secure by Default: Yes
  • Recommendation: Don't call com.fasterxml.jackson.databind.ObjectMapper#enableDefaultTyping and don't annotate any object fields with com.fasterxml.jackson.annotation.JsonTypeInfo passing either the CLASS or MINIMAL_CLASS values to the annotation. Read this guide.

Kryo - com.esotericsoftware:kryo and com.esotericsoftware:kryo5

  • Secure by Default: Yes for com.esotericsoftware:kryo5 and for com.esotericsoftware:kryo >= v5.0.0
  • Recommendation: Don't call com.esotericsoftware.kryo(5).Kryo#setRegistrationRequired with the argument false on any Kryo instance that may deserialize untrusted data.

ObjectInputStream - Java Standard Library

  • Secure by Default: No
  • Recommendation: Use a validating input stream, such as org.apache.commons.io.serialization.ValidatingObjectInputStream.

SnakeYAML - org.yaml:snakeyaml

  • Secure by Default: Yes since version 2.0.
  • Recommendation: Don't deserialize untrusted data with an instance of org.yaml.snakeyaml.Yaml with unsafe org.yaml.snakeyaml.LoaderOptions, that is, using a org.yaml.snakeyaml.inspector.TagInspector that allows global tags with isGlobalTagAllowed.

XML Decoder - Standard Java Library

  • Secure by Default: No
  • Recommendation: Do not use with untrusted user input.

Example

The following example calls readObject directly on an ObjectInputStream that is constructed from untrusted data, and is therefore inherently unsafe.

public MyObject {
  public int field;
  MyObject(int field) {
    this.field = field;
  }
}

public MyObject deserialize(Socket sock) {
  try(ObjectInputStream in = new ObjectInputStream(sock.getInputStream())) {
    return (MyObject)in.readObject(); // unsafe
  }
}

Rewriting the communication protocol to only rely on reading primitive types from the input stream removes the vulnerability.

public MyObject deserialize(Socket sock) {
  try(DataInputStream in = new DataInputStream(sock.getInputStream())) {
    return new MyObject(in.readInt());
  }
}

References

@atorralba atorralba changed the title Java: Make SafeSnakeYamlConstruction also find custom extensions of SafeConstructor Java: Adapt unsafe deserialization to SnakeYaml 2.0, which is secure by default Jun 2, 2023
@pwntester
Copy link
Contributor

@atorralba @jorgectf should we keep the old version around (maybe in experimental) to be used by security researchers? We can always adopt it in the securitylab QLPacks

We can inspect pom files (not gradle ones) so we can also keep this version around and enable it when we know for sure the version used is lower than 2.0

@JLLeitschuh
Copy link
Contributor

should we keep the old version around (maybe in experimental) to be used by security researchers? We can always adopt it in the securitylab QLPacks

If CodeQL captured dependency information, the old query could remain with a version check. See: #7837

@atorralba
Copy link
Contributor

atorralba commented Jun 6, 2023

Thanks @pwntester and @JLLeitschuh for the feedback!

should we keep the old version around (maybe in experimental) to be used by security researchers? We can always adopt it in the securitylab QLPacks

We can inspect pom files (not gradle ones) so we can also keep this version around and enable it when we know for sure the version used is lower than 2.0

I agree that, for security research purposes, assuming users are always using the latest version of their dependencies can lead to FNs. On the other hand, vulnerabilities caused by using outdated/vulnerable dependencies are not CodeQL's responsibility (but Dependabot's). The current direction is that CodeQL cares about misues (vulnerable patterns) of APIs, rather than normal uses of vulnerable APIs that have been fixed (e.g. Log4Shell). And I'm aware we have an experimental Log4Shell query, but that was created as a response to such an exceptional case, and it's usefulness was limited in time anyway.

IMHO, the grey area is in cases like #7837, which isn't about an outdated version, but rather a totally different library being used as an interface implementation. In such cases, having an experimental query to flag this "conditional" alerts that vary according to dependencies seems like an acceptable short term solution, but it sounds like a problem that we'd want to support better. I'll share this conversation internally to see what can be done.

Still, I think this PR falls in the first category, so I'm not sure there's value in having an experimental query for this (specially when security researchers could just remove the condition in the charpred of UnsafeSnakeYamlParse to get the same effect).

@JLLeitschuh
Copy link
Contributor

On the other hand, vulnerabilities caused by using outdated/vulnerable dependencies are not CodeQL's responsibility (but Dependabot's).

This has two assumptions:

  1. That it's easy to migrate from a 1.x release to something like a 2.x release.
  2. That a CVE is assigned to end-of-life software. In my experience, often vendors don't assign CVEs to end of life software. https://portswigger.net/daily-swig/cnas-and-cves-can-allowing-vendors-to-assign-their-own-vulnerability-ids-actually-hinder-security

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants