[ADAM-1308] Fix stack overflow in join with custom iterator impl. #1315

Merged
heuermh merged 1 commit into bigdatagenomics:master on Dec 16, 2016

5 participants
@fnothaft
Member

fnothaft commented Dec 12, 2016

@AmplabJenkins

AmplabJenkins Dec 12, 2016

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1679/
Test PASSed.

@@ -263,6 +263,30 @@ private[rdd] case class ManualRegionPartitioner(partitions: Int) extends Partiti
}
}
+private class AppendableIterator[T] extends Iterator[T] {
+  var iterators: ListBuffer[Iterator[T]] = ListBuffer.empty

@heuermh

heuermh Dec 13, 2016

Member

Do you need to worry about thread safety? Would immutable copy-on-write help here?

@fnothaft

fnothaft Dec 13, 2016

Member

Nah, this is only accessed inside of a single thread.

+
+  def append(iter: Iterator[T]) {
+    if (iter.hasNext) {
+      iterators += iter

@heuermh

heuermh Dec 13, 2016

Member

safe copy? iter could be modified by the caller

@fnothaft

fnothaft Dec 13, 2016

Member

In general, Scala has a pretty strict "advisory" (not quite a contract?) on using Iterators:

> It is of particular importance to note that, unless stated otherwise, one should never use an iterator after calling a method on it.

This is a private class used only in this file, so I think it is OK to make the assumption that iter is not modified after being passed.
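
To illustrate the assumption above, here is a minimal sketch (not code from this PR) of what goes wrong if a caller keeps using an iterator after handing it off. The object and method names are hypothetical; only the reference-only hand-off mirrors the `iterators += iter` line in the diff.

```scala
import scala.collection.mutable.ListBuffer

object IteratorAliasingSketch {

  // Hypothetical helper: stores `iter` by reference (no copy), then lets the
  // "caller" keep using it, which is what the advisory tells us not to do.
  def remainingAfterCallerAdvances(): List[Int] = {
    val held: ListBuffer[Iterator[Int]] = ListBuffer.empty
    val iter = Iterator(1, 2, 3)

    held += iter // same reference-only hand-off as `iterators += iter`
    iter.next()  // caller violates the advisory and consumes the first element

    // The stored iterator shares state with `iter`, so that element is gone.
    held.head.toList
  }

  def main(args: Array[String]): Unit = {
    println(remainingAfterCallerAdvances()) // prints List(2, 3)
  }
}
```

A defensive copy would avoid this, but it would also force the whole iterator to be materialized, so relying on the advisory seems like the cheaper choice for a private class.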

@heuermh

heuermh Dec 13, 2016

Member

Thanks for the clarification!

@fnothaft

fnothaft Dec 13, 2016

Member

Np! I should make a pass and document this assumption better.

+  }
+
+  def hasNext: Boolean = {
+    iterators.nonEmpty

@heuermh

heuermh Dec 13, 2016

Member

shouldn't this query all of the iterators in iterators?

@fnothaft

fnothaft Dec 13, 2016

Member

To avoid having to loop over all the iterators, one of the optimizations I made is that we remove empty iterators from the list, and we don't add empty iterators to the list.
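
The invariant described above can be sketched roughly as follows. This is an illustrative reconstruction, not the merged code: the diff excerpt only shows `iterators`, `append`, and `hasNext`, so the class name `AppendableIteratorSketch` and the whole `next()` body are assumptions.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical sketch; only `iterators`, `append`, and `hasNext` appear in the PR excerpt.
class AppendableIteratorSketch[T] extends Iterator[T] {

  // Invariant: every iterator held here has at least one element left.
  private val iterators: ListBuffer[Iterator[T]] = ListBuffer.empty

  def append(iter: Iterator[T]): Unit = {
    // Never store an empty iterator, so the invariant holds on insert.
    if (iter.hasNext) {
      iterators += iter
    }
  }

  // By the invariant, a non-empty buffer means an element is available,
  // so there is no need to loop over the stored iterators.
  def hasNext: Boolean = iterators.nonEmpty

  // Assumed implementation: read from the first iterator and drop it once drained.
  def next(): T = {
    if (iterators.isEmpty) {
      throw new NoSuchElementException("next on empty iterator")
    }
    val head = iterators.head
    val elem = head.next()
    if (!head.hasNext) {
      iterators.remove(0) // restore the invariant
    }
    elem
  }
}

object AppendableIteratorSketchDemo {
  def main(args: Array[String]): Unit = {
    val it = new AppendableIteratorSketch[Int]
    it.append(Iterator(1, 2))
    it.append(Iterator.empty) // skipped by append
    it.append(Iterator(3))
    println(it.toList) // prints List(1, 2, 3)
  }
}
```

Because the stored iterators sit side by side in a buffer rather than being nested inside one another, `hasNext` and `next()` do a fixed amount of work per call no matter how many iterators have been appended.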

@fnothaft fnothaft requested review from akmorrow13 and devin-petersohn Dec 13, 2016

@devin-petersohn

Looks good to me

@akmorrow13

akmorrow13 Dec 15, 2016

Contributor

I was going to try running this on the problem before it's merged unless anyone deems that unnecessary.

@fnothaft

fnothaft Dec 15, 2016

Member

> I was going to try running this on the problem before it's merged unless anyone deems that unnecessary.

I would really like that so we can confirm that this is a fix before we merge.

@heuermh heuermh modified the milestone: 0.21.0 Dec 15, 2016

@devin-petersohn

devin-petersohn Dec 15, 2016

Member

I reran the same code as before and did not run into stack overflow errors. I think we can go forward with the merge.

@akmorrow13

akmorrow13 Dec 16, 2016

Contributor

I reran my pipeline and it looks good. Thanks @fnothaft !

@heuermh

heuermh Dec 16, 2016

Member

Thank you for confirming, @devin-petersohn @akmorrow13

@heuermh heuermh merged commit 3bb9736 into bigdatagenomics:master Dec 16, 2016

1 check passed

default Merged build finished.
@heuermh

heuermh Dec 16, 2016

Member

Thank you, @fnothaft
