Skip to content

[hail] Fix memory leak in BM and make cache size configurable#9501

Merged
danking merged 8 commits intohail-is:mainfrom
danking:bm-fixes
Sep 25, 2020
Merged

[hail] Fix memory leak in BM and make cache size configurable#9501
danking merged 8 commits intohail-is:mainfrom
danking:bm-fixes

Conversation

@danking
Copy link
Contributor

@danking danking commented Sep 24, 2020

CHANGELOG: Remove memory leak in BlockMatrix.to_matrix_table_row_major and BlockMatrix.to_table_row_major. Also, make the cache size used in both methods configurable.

The ref variable was holding entire blocks in memory for no reason. It was a
vestiage of debugging. Moreover, the configurable cache permits users to fine tune
memory usage.

CHANGELOG: Remove memory leak in `BlockMatrix.to_matrix_table_row_major` and `BlockMatrix.to_table_row_major`. Also, make the cache size used in both methods configurable.

The `ref` variable was holding entire blocks in memory for no reason. It was a
vestiage of debugging. Moreover, the configurable cache permits users to fine tune
memory usage.
Daniel King added 3 commits September 24, 2020 11:23
Copy link
Contributor

@johnc1231 johnc1231 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs fixes, otherwise good.

matrix in memory. This value must be at least large enough to hold
one row of th matrix in memory. If this value is exactly the size of
one row, then a partition makes a network request for every row of
every block. Larger values reduce the number of network requests. If
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused by the wording. Is:

If this value is exactly the size of
            one row, then a partition makes a network request for every row of
            every block.

saying the same thing as:

If memory permits, setting this value to the size of one row permits
            one network request per block per partition.

If so, why are we saying it twice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed the verbiage. Latter should have said one partition.

maximum_cache_memory_in_bytes : int or None
The amount of memory to reserve, per partition, to cache rows of the
matrix in memory. This value must be at least large enough to hold
one row of th matrix in memory. If this value is exactly the size of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

th => the

----------
n_partitions : int or None
Number of partitions of the matrix table.
maximum_cache_memory_in_bytes : int or None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comments as above, as this is the same paragraph.

@johnc1231
Copy link
Contributor

KING tests are failing

Copy link
Contributor

@johnc1231 johnc1231 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are failing

@danking danking merged commit d766d78 into hail-is:main Sep 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants