[ADAM-1308] Fix stack overflow in join with custom iterator impl. #1315

@fnothaft
Member

fnothaft commented Dec 12, 2016

@AmplabJenkins

AmplabJenkins commented Dec 12, 2016

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1679/

@@ -263,6 +263,30 @@ private[rdd] case class ManualRegionPartitioner(partitions: Int) extends Partiti
}
}

private class AppendableIterator[T] extends Iterator[T] {
  var iterators: ListBuffer[Iterator[T]] = ListBuffer.empty

@heuermh

heuermh Dec 13, 2016

Member

Do you need to worry about thread safety? Would immutable copy-on-write help here?

@fnothaft

fnothaft Dec 13, 2016

Author Member

Nah, this is only accessed inside of a single thread.


  def append(iter: Iterator[T]) {
    if (iter.hasNext) {
      iterators += iter

@heuermh

heuermh Dec 13, 2016

Member

Should this make a safe copy? iter could be modified by the caller.

@fnothaft

fnothaft Dec 13, 2016

Author Member

In general, Scala has a pretty strict "advisory" (not quite a contract?) on using Iterators:

> It is of particular importance to note that, unless stated otherwise, one should never use an iterator after calling a method on it.

This is a private class used only in this file, so I think it is OK to make the assumption that iter is not modified after being passed.

@heuermh

heuermh Dec 13, 2016

Member

Thanks for the clarification!

@fnothaft

fnothaft Dec 13, 2016

Author Member

Np! I should make a pass and document this assumption better.
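
The assumption discussed in this thread can be illustrated with a small sketch. It is not code from this PR: `IteratorHandOff` and `store` are hypothetical names, and the example only shows what happens if a caller keeps using an iterator after handing it off, which is what the advisory warns against.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical illustration, not part of the PR.
object IteratorHandOff {
  // Takes ownership of `iter` and stores the reference (no copy is made).
  def store(buffer: ListBuffer[Iterator[Int]], iter: Iterator[Int]): Unit = {
    if (iter.hasNext) buffer += iter
  }

  def main(args: Array[String]): Unit = {
    val buffer = ListBuffer.empty[Iterator[Int]]
    val iter = Iterator(1, 2, 3)
    store(buffer, iter)

    // Breaking the advisory: the caller consumes an element after the hand-off.
    // The buffer holds the same object, so it observes the change.
    iter.next()
    println(buffer.head.toList) // prints List(2, 3)
  }
}
```

Since the stored reference and the caller's reference are the same object, a defensive copy would mean draining the iterator into a collection up front. For a private, single-file class it seems reasonable to rely on the caller not touching `iter` again instead.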

}

  def hasNext: Boolean = {
    iterators.nonEmpty

@heuermh

heuermh Dec 13, 2016

Member

Shouldn't this query all of the iterators in iterators?

@fnothaft

fnothaft Dec 13, 2016

Author Member

To avoid looping over all the iterators, I made one optimization: we never add an empty iterator to the list, and we remove an iterator from the list once it is empty. So a non-empty list always means there is a next element.
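
The invariant described in this thread can be sketched as follows. This is a minimal reconstruction, not the merged code: only `append` and `hasNext` appear in the diff above, so the body of `next` is an assumption, and the class is left non-private here so the example stands alone.

```scala
import scala.collection.mutable.ListBuffer

// Sketch only: the `next` body is an assumption, not taken from the PR.
class AppendableIterator[T] extends Iterator[T] {
  // Invariant: every iterator held in this buffer has at least one element left.
  private val iterators: ListBuffer[Iterator[T]] = ListBuffer.empty

  def append(iter: Iterator[T]): Unit = {
    // Never store an empty iterator, which preserves the invariant on insert.
    if (iter.hasNext) {
      iterators += iter
    }
  }

  // Thanks to the invariant, a non-empty buffer implies a next element,
  // so there is no need to scan every iterator.
  def hasNext: Boolean = iterators.nonEmpty

  def next(): T = {
    // Throws NoSuchElementException if the buffer is empty.
    val value = iterators.head.next()
    // Drop the head as soon as it is exhausted, which preserves the invariant on read.
    if (!iterators.head.hasNext) {
      iterators.remove(0)
    }
    value
  }
}
```

A short usage example under the same assumptions:

```scala
val it = new AppendableIterator[Int]
it.append(Iterator.empty)   // ignored, because it is empty
it.append(Iterator(1, 2))
it.append(Iterator(3))
it.toList                   // List(1, 2, 3)
```

Because the iterators sit in a flat buffer, `hasNext` and `next` only ever touch the head of the buffer, however many iterators have been appended.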

@fnothaft fnothaft requested review from akmorrow13 and devin-petersohn Dec 13, 2016
Member

devin-petersohn left a comment

Looks good to me

@akmorrow13
Contributor

akmorrow13 commented Dec 15, 2016

I was going to try running this on the problem before it's merged unless anyone deems that unnecessary.

@fnothaft
Member Author

fnothaft commented Dec 15, 2016

> I was going to try running this on the problem before it's merged unless anyone deems that unnecessary.

I would really like that so we can confirm that this is a fix before we merge.

@heuermh heuermh modified the milestone: 0.21.0 Dec 15, 2016
@devin-petersohn
Member

devin-petersohn commented Dec 15, 2016

I reran the same code as before and did not run into any stack overflow errors. I think we can go ahead with the merge.

@akmorrow13
Contributor

akmorrow13 commented Dec 16, 2016

I reran my pipeline and it looks good. Thanks, @fnothaft!

@heuermh
Member

heuermh commented Dec 16, 2016

Thank you for confirming, @devin-petersohn @akmorrow13

@heuermh heuermh merged commit 3bb9736 into bigdatagenomics:master Dec 16, 2016
1 check passed
default Merged build finished.
@heuermh
Member

heuermh commented Dec 16, 2016

Thank you, @fnothaft

5 participants