Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[query] Fix subtle hamming performance regression/bug #8497

Merged
merged 2 commits into from Apr 8, 2020

Conversation

chrisvittal
Copy link
Collaborator

This change alters the signature of IEmitCode's apply method to take
a thunk returning a PCode, rather than a PCode. This was an issue
because the CodeBuilder is built up in strict append order. In an idiom
like:

EmitCode.fromI(mb) { cb =>
  cb += setupIsMissing
  val isMissing = missingExpr
  cb += setupThePCode
  val pc = pcodeExpr
  IEmitCode(cb, isMissing, pc)
}

We were running all of the Code added to the CodeBuilder as a sort of
"setup" before computing the missingness expression. There was no way to
sequence this such that the Code needed to produce the PCode was not
also run in the setup, thus potentially leading to extra work, even when
the result would be missing, like in hamming.

By changing the signature of apply to take a thunk that produces
a PCode, we give callers the ability to control if code is executed
before or after the missingness check.

Co-Authored-By: Patrick Schultz pschultz@broadinstitute.org

@chrisvittal
Copy link
Collaborator Author

This also could have caused a serious buffer overflow if the first string is much much longer than the second.

@chrisvittal
Copy link
Collaborator Author

I tested this by adding an assertion error that would get hit just before the while loop.With the old code, we always hit the assertion, with the new code, we do not.

@chrisvittal chrisvittal changed the title [query] Fix subtle hamming performance regression [query] Fix subtle hamming performance regression/bug Apr 8, 2020
This change alters the signature of IEmitCode's apply method to take
a thunk returning a PCode, rather than a PCode. This was an issue
because the CodeBuilder is built up in strict append order. In an idiom
like:

    EmitCode.fromI(mb) { cb =>
      cb += setupIsMissing
      val isMissing = missingExpr
      cb += setupThePCode
      val pc = pcodeExpr
      IEmitCode(cb, isMissing, pc)
    }

We were running all of the Code added to the CodeBuilder as a sort of
"setup" before computing the missingness expression. There was no way to
sequence this such that the Code needed to produce the PCode was not
also run in the setup, thus potentially leading to extra work, even when
the result would be missing, like in hamming.

By changing the signature of apply to take a thunk that produces
a PCode, we give callers the ability to control if code is executed
before or after the missingness check.

Co-Authored-By: Patrick Schultz <pschultz@broadinstitute.org>
@danking danking merged commit dc349f7 into hail-is:master Apr 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants