New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
For security, prevent Function0 execution during LazyList deserialization
#10118
Conversation
| @@ -1370,7 +1377,7 @@ object LazyList extends SeqFactory[LazyList] { | |||
| case a => init += a.asInstanceOf[A] | |||
| } | |||
| val tail = in.readObject().asInstanceOf[LazyList[A]] | |||
| coll = init ++: tail | |||
| coll = tail.withNullLazyState(tail.prependedAll(init)) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't we use a Builder[LazyList] for init instead of an ArrayBuffer, and just call lazyAppendedAll on the result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in fact, because tail is also a LazyList, I believe we can use LazyBuilder and just call addAll with tail as an argument, and we don't even rebuild the LazyList ever (not that I think it has a huge performance impact)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
side note: there was a discussion at some point somewhere suggesting that no Builders should be lazy (don't remember where), and that LazyList.newBuilder should return an eager Builder (probably for List) that calls mapResult to create a LazyList. On the off-chance that ever happens, we should explicitly create a new LazyBuilder rather than calling LazyList.newBuilder if we go with the change I suggested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to put it differently: evaluating tail there does not represent a fundamental flaw, shortcoming or gap in the design of LazyList, it represents a careless programming error likely due to copy-pasting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at LazyBuilder, I actually think we'd run into the same issue. If we change the code to
val tail = in.readObject().asInstanceOf[LazyList[A]]
coll = lazyBuilder.addAll(tail).result()
we have
override def addAll(xs: IterableOnce[A]): this.type = {
if (xs.knownSize != 0) {
...
override def knownSize: Int = if (knownIsEmpty) 0 else -1
...
private[this] def knownIsEmpty: Boolean = stateEvaluated && isEmpty
...
override def isEmpty: Boolean = state eq State.Empty
So calling addAll could evaluate the state lazy val, which calls the lazyState function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, a forged serialization stream can have that field set to true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
point. I guess we fall back to lazyAppendedAll then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lrytz I whipped up my own fix that uses stateFromIteratorConcatSuffix. I was going to push an alternate PR to consider, but I haven't figured out how to actually test that the fix works. Been having issues deserializing the lambda.
If you'd like to use my code, the only changes are to the SerializationProxy (below)
final class SerializationProxy[A](@transient protected var coll: LazyList[A]) extends Serializable {
private[this] def writeObject(out: ObjectOutputStream): Unit = {
out.defaultWriteObject()
var these = coll
while(these.knownNonEmpty) {
out.writeObject(these.head)
these = these.tail
}
out.writeObject(SerializeEnd)
out.writeObject(these)
}
private[this] def readObject(in: ObjectInputStream): Unit = {
in.defaultReadObject()
val init = new ListBuffer[A]
var initRead = false
while (!initRead) in.readObject match {
case SerializeEnd => initRead = true
case a => init += a.asInstanceOf[A]
}
val tail = in.readObject().asInstanceOf[LazyList[A]]
coll = newLL(stateFromIteratorConcatSuffix(init.toList.iterator)(tail.state))
}
private[this] def readResolve(): Any = coll
}and to import ListBuffer instead of ArrayBuffer.
Edit: I think the only change is to readObject?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, this looks good, nicer than my workaround
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also added a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it feels weird to me to add a new method used just for this when we already have lazy methods. I think the main issue was using an ArrayBuffer (which is on me)
a8aacb0
to
a40c17a
Compare
This PR ensures that LazyList deserialization will not execute an arbitrary Function0 when being passed a forged serialization stream. See the PR description for a detailed explanation.
Function0 execution during LazyList deserialization
|
|
This PR ensures that LazyList deserialization will not execute an arbitrary
Function0when being passed a forged serialization stream.Overview
The most important take-away is: Deserializing untrusted data in a Scala (or Java) application is always a security risk. This recommendation does not change after this PR is merged.
A forged serialization stream can be used to execute unrelated code, which can be exploited. The next section explains the case that is being addressed in this PR in detail. However, any class on an application's classpath with similar code can be exploited. Therefore, untrusted data should never be deserialized.
A presentation about the topic: https://www.youtube.com/watch?v=MTfE2OgUlKc
Technical details
The following details play together:
LazyListhas avar lazyState: () => LazyList.State[A]field, which is of typeFunction0in bytecodetail.prependedAll(init)to reconstruct the lazy list (initare the evaluated elements)prependedAllmay invoke thelazyState()function of thetailobject; this does not happen under normal circumstances, but a forged serialization stream can force the control flow into this situationlazyStateobject is created when deserializing thetailobjectFunction0subclasses; for example, every argument to a by-name parameter is encoded into aFunction0Function0instance for thelazyStatefield, for example one that interacts with the file system or the networkIn this scenario, the invocation of the
lazyStatefunction happens before the deserializedLazyListis actually used (for example assigned to a local variable). So even if theLazyListobject would lead to aClassCastExceptionlater on, the customlazyStatefunction is already executed. In other words, a serialization stream containing a forgedLazyListwould lead to executing theFunction0at any deserialization of the application, no matter what object type the application expects to read.Change in this PR
In this PR, we change the
LazyListdeserialization method to ensure that this process cannot evaluate thestateof the lazytail.The forged
Function0can still be executed later when evaluating the tail of the lazy list. However, this is only possible if the application actually expects a serializedLazyListand evaluates its tail. This makes an exploit much less likely.Credits
We would like to thank Marc Bohler for finding this issue and reporting it to us.
This issue is reported as CVE-2022-36944.