Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MakeNDArray OOM on stream data #14559

Closed
ehigham opened this issue May 22, 2024 · 1 comment · Fixed by #14571
Closed

MakeNDArray OOM on stream data #14559

ehigham opened this issue May 22, 2024 · 1 comment · Fixed by #14571
Assignees

Comments

@ehigham
Copy link
Collaborator

ehigham commented May 22, 2024

What happened?

From: https://discuss.hail.is/t/connectionerror-after-mt-aggregate-cols-hl-agg-collect-and-hl-nd-array-in-linear-skat/3839

The following code snippet blows up from OOM. Interestingly, I can only reproduce for _localize=False, indicating we have an problem with our Emit rule for MakeNDArray for data IRs of type TStream.

hl.init()

mt = hl.utils.range_matrix_table(n_rows=7944, n_cols=442075)
covariates = [1.0]

mt = mt.select_cols(covariates=covariates)
covmat = mt.aggregate_cols(
    hl.agg.collect(mt.covariates.map(hl.float)),
    _localize=False,
)

hl.nd.array(covmat).show() # boom

Version

0.2.130

Relevant log output

No response

@ehigham ehigham added needs-triage A brand new issue that needs triaging. query labels May 22, 2024
@daniel-goldstein daniel-goldstein added bug and removed needs-triage A brand new issue that needs triaging. labels Jun 3, 2024
@ehigham
Copy link
Collaborator Author

ehigham commented Jun 4, 2024

Hard to pinpoint what's the true cause of this bug.

  1. We're doing an excessive amount of copying and resizing of the output ndarray because we've lost information about the bounds of the ndarray. Arguably that's a harder problem in real world contexts.
  2. We're computing the aggregation twice owing to a problem in CSE: the same underlying collection ir is used to compute the shape and the flattened array of data. If you modify the code above to be:
hl.bind(covmat, hl.nd.array).show()

the program successfully completes

ehigham added a commit to ehigham/hail that referenced this issue Jun 5, 2024
Fixes: hail-is#14559
`hl.nd.array`s constructed from stream pipelines can cause out of memory
exceptions owing to a limitation in the python CSE algorithm that
does not eliminate partially redundant expressions if if-expressions.
Explicitly `let`-binding the input collection prevents it from being
evaluated twice: once for the flattened data stream and once for the
original shape.
ehigham added a commit to ehigham/hail that referenced this issue Jun 5, 2024
Fixes: hail-is#14559
hl.nd.arrays constructed from stream pipelines can cause out of memory
exceptions owing to a limitation in the python CSE algorithm that does
not eliminate partially redundant expressions if if-expressions.

Explicitly let-binding the input collection prevents it from being evaluated
twice: once for the flattened data stream and once for the original shape.
hail-ci-robot pushed a commit that referenced this issue Jun 6, 2024
Fixes: #14559
`hl.nd.array`s constructed from stream pipelines can cause out of memory
exceptions owing to a limitation in the python CSE algorithm that does
not eliminate partially redundant expressions in if-expressions.
Explicitly `let`-binding the input collection prevents it from being
evaluated twice: once for the flattened data stream and once for the
original shape.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants