New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
For security, prevent Function0
execution during LazyList
deserialization
#10118
Conversation
@@ -1370,7 +1377,7 @@ object LazyList extends SeqFactory[LazyList] { | |||
case a => init += a.asInstanceOf[A] | |||
} | |||
val tail = in.readObject().asInstanceOf[LazyList[A]] | |||
coll = init ++: tail | |||
coll = tail.withNullLazyState(tail.prependedAll(init)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't we use a Builder[LazyList]
for init
instead of an ArrayBuffer
, and just call lazyAppendedAll
on the result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in fact, because tail
is also a LazyList
, I believe we can use LazyBuilder
and just call addAll
with tail
as an argument, and we don't even rebuild the LazyList
ever (not that I think it has a huge performance impact)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
side note: there was a discussion at some point somewhere suggesting that no Builder
s should be lazy (don't remember where), and that LazyList.newBuilder
should return an eager Builder
(probably for List
) that calls mapResult
to create a LazyList
. On the off-chance that ever happens, we should explicitly create a new LazyBuilder
rather than calling LazyList.newBuilder
if we go with the change I suggested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to put it differently: evaluating tail
there does not represent a fundamental flaw, shortcoming or gap in the design of LazyList
, it represents a careless programming error likely due to copy-pasting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at LazyBuilder
, I actually think we'd run into the same issue. If we change the code to
val tail = in.readObject().asInstanceOf[LazyList[A]]
coll = lazyBuilder.addAll(tail).result()
we have
override def addAll(xs: IterableOnce[A]): this.type = {
if (xs.knownSize != 0) {
...
override def knownSize: Int = if (knownIsEmpty) 0 else -1
...
private[this] def knownIsEmpty: Boolean = stateEvaluated && isEmpty
...
override def isEmpty: Boolean = state eq State.Empty
So calling addAll
could evaluate the state
lazy val, which calls the lazyState
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, a forged serialization stream can have that field set to true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
point. I guess we fall back to lazyAppendedAll
then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lrytz I whipped up my own fix that uses stateFromIteratorConcatSuffix
. I was going to push an alternate PR to consider, but I haven't figured out how to actually test that the fix works. Been having issues deserializing the lambda.
If you'd like to use my code, the only changes are to the SerializationProxy
(below)
final class SerializationProxy[A](@transient protected var coll: LazyList[A]) extends Serializable {
private[this] def writeObject(out: ObjectOutputStream): Unit = {
out.defaultWriteObject()
var these = coll
while(these.knownNonEmpty) {
out.writeObject(these.head)
these = these.tail
}
out.writeObject(SerializeEnd)
out.writeObject(these)
}
private[this] def readObject(in: ObjectInputStream): Unit = {
in.defaultReadObject()
val init = new ListBuffer[A]
var initRead = false
while (!initRead) in.readObject match {
case SerializeEnd => initRead = true
case a => init += a.asInstanceOf[A]
}
val tail = in.readObject().asInstanceOf[LazyList[A]]
coll = newLL(stateFromIteratorConcatSuffix(init.toList.iterator)(tail.state))
}
private[this] def readResolve(): Any = coll
}
and to import ListBuffer
instead of ArrayBuffer
.
Edit: I think the only change is to readObject
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, this looks good, nicer than my workaround 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also added a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it feels weird to me to add a new method used just for this when we already have lazy methods. I think the main issue was using an ArrayBuffer
(which is on me)
a8aacb0
to
a40c17a
Compare
This PR ensures that LazyList deserialization will not execute an arbitrary Function0 when being passed a forged serialization stream. See the PR description for a detailed explanation.
Function0
execution during LazyList
deserialization
👍 thank you for the pointer! |
This PR ensures that LazyList deserialization will not execute an arbitrary
Function0
when being passed a forged serialization stream.Overview
The most important take-away is: Deserializing untrusted data in a Scala (or Java) application is always a security risk. This recommendation does not change after this PR is merged.
A forged serialization stream can be used to execute unrelated code, which can be exploited. The next section explains the case that is being addressed in this PR in detail. However, any class on an application's classpath with similar code can be exploited. Therefore, untrusted data should never be deserialized.
A presentation about the topic: https://www.youtube.com/watch?v=MTfE2OgUlKc
Technical details
The following details play together:
LazyList
has avar lazyState: () => LazyList.State[A]
field, which is of typeFunction0
in bytecodetail.prependedAll(init)
to reconstruct the lazy list (init
are the evaluated elements)prependedAll
may invoke thelazyState()
function of thetail
object; this does not happen under normal circumstances, but a forged serialization stream can force the control flow into this situationlazyState
object is created when deserializing thetail
objectFunction0
subclasses; for example, every argument to a by-name parameter is encoded into aFunction0
Function0
instance for thelazyState
field, for example one that interacts with the file system or the networkIn this scenario, the invocation of the
lazyState
function happens before the deserializedLazyList
is actually used (for example assigned to a local variable). So even if theLazyList
object would lead to aClassCastException
later on, the customlazyState
function is already executed. In other words, a serialization stream containing a forgedLazyList
would lead to executing theFunction0
at any deserialization of the application, no matter what object type the application expects to read.Change in this PR
In this PR, we change the
LazyList
deserialization method to ensure that this process cannot evaluate thestate
of the lazytail
.The forged
Function0
can still be executed later when evaluating the tail of the lazy list. However, this is only possible if the application actually expects a serializedLazyList
and evaluates its tail. This makes an exploit much less likely.Credits
We would like to thank Marc Bohler for finding this issue and reporting it to us.
This issue is reported as CVE-2022-36944.