Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scala.collection.immutable.StreamIterator[+A] has O(n) memory consumption #4835

Closed
scabug opened this issue Jul 25, 2011 · 9 comments
Closed

scala.collection.immutable.StreamIterator[+A] has O(n) memory consumption #4835

scabug opened this issue Jul 25, 2011 · 9 comments
Assignees
Labels
Milestone

Comments

@scabug
Copy link

@scabug scabug commented Jul 25, 2011

In the following program using Stream#iterator, OutOfMemoryError should not occur
Because no references to Stream's head are not hold and Stream was converted to Iterator
using Stream[+A]#iterator.

object StreamIteratorProblem {
def main(args: Array[String]) {
val infiniteIterator = Stream.from(1).iterator
for(i <- 1 to 10000000) {
val e = infiniteIterator.next
if(e % 1000000 == 0) println(e)
}
}
}

However, When this program runs, OutOfMemoryError occurs as the following:

scala StreamIteratorProblem
1000000
2000000
3000000
4000000
java.lang.OutOfMemoryError: Java heap space
at scala.collection.immutable.Stream$.from(Stream.scala:657)
at scala.collection.immutable.Stream$$anonfun$from$1.apply(Stream.scala:657)
at scala.collection.immutable.Stream$$anonfun$from$1.apply(Stream.scala:657)
at scala.collection.immutable.Stream$Cons.tail(Stream.scala:630)
at scala.collection.immutable.Stream$Cons.tail(Stream.scala:622)

I found that there exists the reason of this problem in StreamIterator in "Stream.scala`.
In the definition of StreamIterator as the followings,

class StreamIterator[+A](self: Stream[A]) extends Iterator[A] {
// A call-by-need cell.
class LazyCell(st: => Stream[A]) {
lazy val v = st
}
}

an argument self of primary constructor continues to hold the reference to a Stream.
Then, whenever StreamIterator#next is called, memory consumption increases.

I created a patch to this problem and will attach it. It seems works well.

@scabug

This comment has been minimized.

Copy link
Author

@scabug scabug commented Jul 25, 2011

Imported From: https://issues.scala-lang.org/browse/SI-4835?orig=1
Reporter: @kmizu
Affected Versions: 2.9.0
Other Milestones: 2.10.0-M1
Attachments:

@scabug

This comment has been minimized.

Copy link
Author

@scabug scabug commented Jul 25, 2011

@dcsobral said:
This patch breaks the lazyness of stream for the initial element. I think that isn't a big problem, though effects of that can be subtle. Assigning null to self shouldn't be necessary -- if you don't use it in a method, it shouldn't become part of the object.

I'm surprised, however, that "val stream" inside the definition of "var these" doesn't suffer from the same problem -- even though it is in a small scope, it ought to be saved so that it is preserved when the by-name LazyCell gets invoked. It might even be a bug here.

@scabug

This comment has been minimized.

Copy link
Author

@scabug scabug commented Jul 25, 2011

@kmizu said:
Hello. Thanks for commenting to my patch. However, I couldn't understand that my patch breaks the lazyness of stream for the initial element. If you don't mind, I would like to see concrete example of it.

I think that isn't a big problem, though effects of that can be subtle

I think that it isn't big problem, too. However, I think that Iterator#next is not to expected consume O(n) memory generally.

Assigning null to self shouldn't be necessary – if you don't use it in a method, it shouldn't become part of the object.

Because primary constructor's arguments self continues to exist until ths object is garbage collected, assigning null to self is needed now if StreamIterator's API shouldn't be changed.

Thanks again.

Kota Mizushima

@scabug

This comment has been minimized.

Copy link
Author

@scabug scabug commented Jul 26, 2011

@dcsobral said:
Indeed, the first element isn't lazy at all, as it is passed by value. So that shouldn't be a problem.

As for self, it * shouldn't* continue to exist. It only did so because it was used in a by-name parameter.

@scabug

This comment has been minimized.

Copy link
Author

@scabug scabug commented Jul 26, 2011

@kmizu said:

Indeed, the first element isn't lazy at all, as it is passed by value. So that shouldn't be a problem.

I see that because argument self: Stream itself is passed by value, then the first element wasn't lazy
without my patch. Is is right?

As for self, it * shouldn't* continue to exist. It only did so because it was used in a by-name parameter.

OK. I created another version of patch and will attach it as StreamIterator.patch.2. By applying this patch,
no longer, unneeded field self is erased.

@scabug

This comment has been minimized.

Copy link
Author

@scabug scabug commented Aug 31, 2011

huynhjl said:
Note that this used to run in constant memory in 2.8.1 and most likely in 2.8.0 as well. Something happened in the 2.9.0 release.

@scabug

This comment has been minimized.

Copy link
Author

@scabug scabug commented Jan 26, 2012

@kmizu said:
Several days ago, I sent pull request fixing this isssue on git://github.com/scala/scala.git and the pull request was merged into master branch:

commit 39a837c8357e78c949613db881bd060d176f3279
Author: Kota Mizushima mizukota@gmail.com
Date: Mon Jan 23 18:53:05 2012 +0900

Add test case for SI-4835 (https://issues.scala-lang.org/browse/SI-4835)

This test case only confirm that StreamIterator's lazyiness is not broken.
Test case about memory consumption could be created.  However, such a
test cause a greatly increased time of test.

I'm not a commiter. However, because the fix was merged into master branch, I've thought that it should be noted.

@scabug

This comment has been minimized.

Copy link
Author

@scabug scabug commented Apr 15, 2012

Jed Wesley-Smith (jedws) said:
this is tagged as a fix in the 2.9.2 release notes but is unresolved here. What is the source of truth?

@scabug

This comment has been minimized.

Copy link
Author

@scabug scabug commented Apr 15, 2012

@kmizu said:
As far as I see https://github.com/scala/scala/blob/v2.9.2/src/library/scala/collection/immutable/Stream.scala, my fix is included in 2.9.2. However, because I'm just another contributor, I don't know whether or not I have permission to close an Issue myself.

@scabug scabug closed this Apr 15, 2012
@scabug scabug added the performance label Apr 7, 2017
@scabug scabug added this to the 2.9.2 milestone Apr 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.