Should duplicating/splitting iterators use LazyList instead of internal queues/buffers? #11377

Open
NthPortal opened this Issue Jan 22, 2019 · 0 comments


Now that LazyList has a strict head (whenever scala/scala#7558 gets merged), would it be better to implement iterator duplicating/splitting operations using it? With the current implementations, the following scenarios will all leak a tremendous amount of memory, which could otherwise be reclaimed:

```scala
// setup - doesn't leak
type It = Iterator[Int]
def it: It = (0 to Int.MaxValue).iterator
// force the iterator to start (note: `iter`, not the outer `it`)
def init(iter: It): It = { iter.hasNext; iter }
val almostMax = Int.MaxValue - 2

// all leak
def duplicateLeak: It = init { it.duplicate._1.drop(almostMax) }
def spanLeak: It = init { it.span(_ < almostMax)._2 }
def splitAtLeak: It = init { it.splitAt(almostMax)._2 }
def partitionLeak: It = init { it.partition(_ < almostMax)._2 }
def partitionWithLeak: It = init { it.partitionWith(i => if (i < almostMax) Left(i) else Right(i))._2 }
```

Each of these methods could instead be implemented by building a LazyList from the source iterator and returning iterators derived from that LazyList. Then the discarded iterator is the only thing referencing the beginning of the LazyList, so once it is dropped, the traversed prefix can be collected; the iterator that has advanced to the end references only the tail of the LazyList.
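As a rough illustration of the idea, here is a hypothetical helper (not the proposed patch) sketching `duplicate` on top of LazyList. It assumes that LazyList memoizes elements as they are forced and that iterators over a LazyList do not retain a reference to its head as they advance, so the consumed prefix becomes collectable once both halves have moved past it:

```scala
// Sketch only: `duplicate` built on LazyList rather than an internal queue.
// Each returned iterator retains only the suffix it has not yet consumed.
def duplicateViaLazyList[A](it: Iterator[A]): (Iterator[A], Iterator[A]) = {
  val ll = LazyList.from(it) // lazily wraps and memoizes the source iterator
  (ll.iterator, ll.iterator) // `ll` itself is not retained after returning
}
```

The same shape would apply to `span`, `splitAt`, `partition`, etc.: build one LazyList, hand back iterators (or lazy views) over the relevant portions, and let reachability rather than an explicit buffer decide what stays in memory.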
