Triage high memory usage when running many tests concurrently #12662
Comments
We don’t have any built-in facilities for this, so the best bet currently is a combination of native and python memory profiling tools. To tackle this more deeply though, I expect that we’ll need to add support to pants for reporting the top-memory-consuming |
Some good news: it looks like the Rust EDIT: Almost! nnethercote/dhat-rs#19 |
#14638 implements the memory analysis tooling I mentioned above. There are some peculiarities immediately obvious with the new |
As described in #12662, some use cases have surprising memory usage. To enable tracking those cases down, this change adds a `--memory-summary` option which summarizes the deep sizes of live objects in the `Graph`.

On the Python side, the deep size is calculated using a very basic deduping walk of `gc.get_referents` (after having investigated [pympler](https://pypi.org/project/Pympler/), [guppy3](https://pypi.org/project/guppy3/), [objsize](https://pypi.org/project/objsize/)). On the Rust side, the `deepsize` crate is used, with sizes derived for all types reachable from `NodeKey` and `NodeOutput`.

Example output:
```
Memory summary:
64       1     pants.backend.docker.subsystems.dockerfile_parser.DockerfileParser
64       1     pants.backend.docker.subsystems.dockerfile_parser.ParserSetup
64       1     pants.backend.java.dependency_inference.java_parser_launcher.JavaParserCompiledClassfiles
64       1     pants.backend.java.dependency_inference.symbol_mapper.FirstPartyJavaTargetsMappingRequest
<snip>
1588620  957   (native) pants.engine.internals.graph.hydrate_sources
2317920  2195  (native) pants.backend.python.dependency_inference.module_mapper.map_module_to_address
2774760  2434  (native) pants.engine.internals.graph.determine_explicitly_provided_dependencies
4446900  1458  (native) pants.engine.internals.graph.resolve_dependencies
```
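To make the "deduping walk of `gc.get_referents`" concrete, here is a minimal sketch of the general technique. It is not the code from #14638: the function name `deep_size` and the choice to skip types, modules, and functions are assumptions made for illustration.

```python
import gc
import sys
import types

# Assumption: shared infrastructure objects are skipped so the walk does not
# wander from an instance into its class, module globals, and so on.
_SKIPPED = (type, types.ModuleType, types.FunctionType)


def deep_size(root):
    """Sum sys.getsizeof over every object reachable from root, once each."""
    seen = set()  # ids of objects already counted (the deduping part)
    stack = [root]
    total = 0
    while stack:
        obj = stack.pop()
        if id(obj) in seen or isinstance(obj, _SKIPPED):
            continue
        seen.add(id(obj))
        total += sys.getsizeof(obj)
        # gc.get_referents returns the objects directly referenced by obj.
        stack.extend(gc.get_referents(obj))
    return total


if __name__ == "__main__":
    shared = ["x" * 1000]
    # The shared list is counted once even though it is referenced twice.
    print(deep_size([shared]), deep_size([shared, shared]))
```

Because each object is counted only the first time it is reached, an object shared between two roots is attributed to whichever root is walked first, which is worth keeping in mind when reading per-entry sizes.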
…rrypick of #14638) (#14644) [ci skip-build-wheels]
A quick summary of why what As described on #14638: these are de-duplicated "deep" sizes for these objects. Items from the rust side (items prefixed with That means that the To triage this, someone should poke at adding debug output that uses |
The `--stats-memory-summary` added in #14638/#14652 was [reporting surprisingly large sizes](#12662 (comment)) for native `NodeKey` structs, even when excluding the actual Python values that they held. Investigation showed that both the `Task` and `Entry` structs were contributing significantly to the size of the `NodeKey` struct.

The [`internment` crate](https://crates.io/crates/internment) used here (and in #14654) is an alternative to giving these values integer IDs. They become pointers to a unique, shared (technically: leaked) copy of the value. They are consequently 1) much smaller, 2) much faster to compare.

The `top`-reported memory usage of `./pants dependencies --transitive ::`:
* `313M` before (summary [before.txt](https://github.com/pantsbuild/pants/files/8175461/before.txt))
* `220M` after (summary [after.txt](https://github.com/pantsbuild/pants/files/8175462/after.txt))

[ci skip-build-wheels]
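The interning idea can be sketched in a few lines. This is a Python analogy only, not the `internment` crate's API and not the Rust change itself: the `Interner` class and its `intern` method are hypothetical names, and the tuples stand in for values such as `Task` or `Entry`.

```python
class Interner:
    """Keeps one canonical copy of each distinct hashable value."""

    def __init__(self):
        # Maps each value to its canonical instance. Entries are never
        # removed, which mirrors the "technically: leaked" note above.
        self._pool = {}

    def intern(self, value):
        # Return the existing canonical copy, or register this one.
        return self._pool.setdefault(value, value)


if __name__ == "__main__":
    interner = Interner()
    # Two equal values built independently (hypothetical stand-ins).
    a = interner.intern(tuple(["task", "hydrate_sources"]))
    b = interner.intern(tuple(["task", "hydrate_sources"]))
    # Both handles refer to the same object, so equality reduces to an
    # identity check and only one copy of the value is kept.
    print(a is b)
```

Holders then store a reference to the shared copy instead of the value itself, which is where both the size and the comparison savings come from.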
#13483 and #14683 reduce memory usage by 15% and 30% (respectively), for a combined ~40%. #14683 will be cherry-picked to Based on those changes, I'm going to call this closed. Please definitely use the |
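The "combined ~40%" figure is consistent with the two reductions compounding rather than adding. A quick check, assuming each percentage applies to the memory remaining after the previous change (the helper name `combined_reduction` is made up for this sketch):

```python
def combined_reduction(*reductions):
    """Combine successive fractional reductions into one overall reduction."""
    remaining = 1.0
    for r in reductions:
        remaining *= 1.0 - r  # each change shrinks what is left
    return 1.0 - remaining


if __name__ == "__main__":
    # 15% then 30%: 0.85 * 0.70 = 0.595 remaining, about a 40% reduction.
    print(round(combined_reduction(0.15, 0.30), 3))
    # The `top` figures quoted for #14683 (313M -> 220M) are about 30%.
    print(round((313 - 220) / 313, 3))
```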
…pick of #14683) (#14689) [ci skip-build-wheels]
`./pants test ::` uses a surprising amount of memory: >1GB for ~200 tests has been observed, with ~400MB for `./pants dependencies --transitive ::` in the same repository.