[hail] Better scaling on RVD.union #6943

Merged: 1 commit, Aug 26, 2019

@tpoterba (Collaborator) commented Aug 26, 2019

Do a tree reduce instead of a linear reduce. This means that the Java
stack depth is about log2(N) instead of N, which prevents stack overflow
errors when unioning hundreds of tables together.
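
As a rough illustration of the difference (a minimal sketch, not the code from this PR), the snippet below reduces 200 leaves both ways and compares how deeply the results are nested. `Plan`, `Leaf`, `Union`, `depth`, and `treeReduce` are hypothetical names invented for the example; they are not Hail's RVD API.

```scala
object TreeReduceSketch {
  // Toy stand-in for a plan of nested unions; not Hail's RVD type.
  sealed trait Plan
  final case class Leaf(id: Int) extends Plan
  final case class Union(left: Plan, right: Plan) extends Plan

  // Nesting depth: how many unions sit above the deepest leaf.
  def depth(p: Plan): Int = p match {
    case Leaf(_)     => 0
    case Union(l, r) => 1 + math.max(depth(l), depth(r))
  }

  // Combine neighbours pairwise, round after round, until one element is left.
  // An odd element at the end of a round is carried over unchanged.
  @annotation.tailrec
  def treeReduce[A](xs: IndexedSeq[A])(op: (A, A) => A): A = {
    require(xs.nonEmpty, "treeReduce of empty collection")
    if (xs.length == 1) xs.head
    else {
      val next = xs.grouped(2).map { pair =>
        if (pair.length == 2) op(pair(0), pair(1)) else pair(0)
      }.toIndexedSeq
      treeReduce(next)(op)
    }
  }

  val leaves: IndexedSeq[Plan] = (0 until 200).map(i => Leaf(i): Plan)

  // Linear reduce: each step wraps the previous result, so nesting grows with N.
  val linear: Plan = leaves.reduceLeft[Plan]((a, b) => Union(a, b))

  // Tree reduce: nesting grows by one per round, and there are ceil(log2(N)) rounds.
  val tree: Plan = treeReduce(leaves)((a, b) => Union(a, b))

  def main(args: Array[String]): Unit = {
    println(s"linear depth = ${depth(linear)}") // 199
    println(s"tree depth   = ${depth(tree)}")   // 8
  }
}
```

With 200 inputs the rounds shrink the collection as 200, 100, 50, 25, 13, 7, 4, 2, 1, which is 8 rounds, against 199 nested unions for the left-to-right reduce.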

@patrick-schultz (Collaborator) commented Aug 26, 2019

I'm confused by the stack depth problem. reduce isn't recursive; it forwards to reduceLeft:

  def reduceLeft[B >: A](op: (B, A) => B): B = {
    if (isEmpty)
      throw new UnsupportedOperationException("empty.reduceLeft")

    var first = true
    var acc: B = 0.asInstanceOf[B]

    for (x <- self) {
      if (first) {
        acc = x
        first = false
      }
      else acc = op(acc, x)
    }
    acc
  }
@tpoterba (Collaborator, Author) commented Aug 26, 2019

The problem is that in the ordered merge usage, the Spark DAG builds up a stack of 200 RDDs / iterators.
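
To make that concrete, here is a small sketch in which plain Scala iterators stand in for the per-partition iterators (no Spark or Hail API is used). `Probe`, `merge`, `treeReduce`, and `deepestStack` are hypothetical helpers written for this illustration. The reduce call itself returns immediately in both cases; the deep stack only appears later, when a row is pulled through the nested merge iterators.

```scala
object IteratorDepthSketch {
  // Leaf iterator that records the deepest call stack seen while it is pulled.
  final class Probe(values: Iterator[Int]) extends Iterator[Int] {
    var frames = 0
    private def record(): Unit =
      frames = math.max(frames, Thread.currentThread().getStackTrace().length)
    def hasNext: Boolean = { record(); values.hasNext }
    def next(): Int = { record(); values.next() }
  }

  // Ordered merge of two sorted iterators. Every call is forwarded to the
  // inputs, so nested merges turn into nested stack frames.
  def merge(a: Iterator[Int], b: Iterator[Int]): Iterator[Int] = new Iterator[Int] {
    private val l = a.buffered
    private val r = b.buffered
    def hasNext: Boolean = l.hasNext || r.hasNext
    def next(): Int =
      if (!r.hasNext || (l.hasNext && l.head <= r.head)) l.next() else r.next()
  }

  // Pairwise reduction in rounds; an odd trailing element is carried over.
  @annotation.tailrec
  def treeReduce[A](xs: IndexedSeq[A])(op: (A, A) => A): A = {
    require(xs.nonEmpty, "treeReduce of empty collection")
    if (xs.length == 1) xs.head
    else {
      val next = xs.grouped(2).map { pair =>
        if (pair.length == 2) op(pair(0), pair(1)) else pair(0)
      }.toIndexedSeq
      treeReduce(next)(op)
    }
  }

  // Build n single-element inputs, combine them, drain the result, and
  // report the deepest stack any input observed while being read.
  def deepestStack(n: Int)(combine: IndexedSeq[Iterator[Int]] => Iterator[Int]): Int = {
    val probes = (0 until n).map(i => new Probe(Iterator(i)))
    combine(probes).foreach(_ => ())
    probes.map(_.frames).max
  }

  def linearFrames(n: Int): Int =
    deepestStack(n)(xs => xs.reduceLeft[Iterator[Int]]((a, b) => merge(a, b)))

  def treeFrames(n: Int): Int =
    deepestStack(n)(xs => treeReduce(xs)((a, b) => merge(a, b)))

  def main(args: Array[String]): Unit = {
    println(s"linear: ${linearFrames(64)} frames, tree: ${treeFrames(64)} frames")
  }
}
```

The absolute frame counts depend on the JVM and Scala version, so only the comparison is meaningful: the linear chain's deepest input sits under roughly N merges, the tree's under roughly log2(N).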

@patrick-schultz (Collaborator) commented Aug 26, 2019

Ah, right, that stack.

@danking danking merged commit 990e875 into hail-is:master Aug 26, 2019

1 check passed

ci-test success