Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Free ephemeral resources used for scanning #70

Merged
merged 2 commits into from
Jun 11, 2020

Conversation

jcharum
Copy link
Contributor

@jcharum jcharum commented Jun 11, 2020

When we scan results, we get a slice of *openerAtReaders, one for each result task. We read from each reader sequentially. When we are done with a given reader, we retain some of the resources used when reading from it, most notably gob decode buffers. As we scan, we accumulate these defunct buffers, and our memory footprint grows.

This happens for two reasons:

  1. We pop off the slice of readers to iterate, i.e. q = q[1:]. However, we do not clear the backing array reference to the reader.
  2. When we close the reader, we don't clear the sliceioReader, which in turn holds the gob decoder.

Fixing either would eliminate the specific scan leak. Fix both, as I think it's the correct behavior.

@jcharum jcharum requested a review from josh-newman June 11, 2020 15:36
Copy link

@jschellenberger jschellenberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Did you try running? Should I try running?

@jcharum
Copy link
Contributor Author

jcharum commented Jun 11, 2020

LGTM. Did you try running? Should I try running?

I diagnosed and tested this using a test program.

You can patch this if you have urgent needs and don't want to wait for this to land internally.

@prasadgopal
Copy link
Collaborator

is it possible to write a test to make sure we don't undo this fix at some point? (if it is not too difficult)

@jcharum
Copy link
Contributor Author

jcharum commented Jun 11, 2020

I did not come up with a good way of testing, as it is really an internal resource usage detail. Two possibilities:

  1. Write a test that accesses internals. This does not meet my personal judgment for utility to fragility ratio.
  2. Write a test that creates, uses, and closes many readers, forces a GC, and checks memory usage. This seems possibly fragile and I again question the utility to fragility ratio.

This goes for both the multiReader and openerAtReader changes. I'm open to debate and suggestion though.

Copy link
Contributor

@josh-newman josh-newman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with landing this and improving testing as we think of better alternatives (as measured by utility vs. fragility).

Just to toss in another idea, we could write a test that scans some notably large objects, then uses a recursive, reflective object size estimator [1] to walk the scanner's references and make sure total size drops after close.

[1] For GRAILers: @jschellenberger made a prototype of something like this: D35649.

@jcharum jcharum merged commit 5a96188 into grailbio:master Jun 11, 2020
@jcharum jcharum deleted the scan-leak branch June 11, 2020 23:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants