Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow aggregating allocations in memory in the tracked process #277

Merged
merged 21 commits into from
Jan 19, 2023

Conversation

godlygeek
Copy link
Contributor

Relates-to: #68

@godlygeek godlygeek self-assigned this Dec 23, 2022
@godlygeek godlygeek force-pushed the record_aggregated_allocations branch 9 times, most recently from 4c4716e to bbe7644 Compare December 24, 2022 06:31
@godlygeek godlygeek marked this pull request as ready for review December 30, 2022 23:49
@godlygeek godlygeek force-pushed the record_aggregated_allocations branch 2 times, most recently from 4afec07 to 127874a Compare December 31, 2022 01:45
@pablogsal
Copy link
Member

Some profiling data of hotspots:

overhead1
overhead2
overhead3
overhead4

@pablogsal
Copy link
Member

Seems that locking continues to be a pain (this is self time):

overhead5

@godlygeek godlygeek force-pushed the record_aggregated_allocations branch 4 times, most recently from e21c1ab to 1e805fe Compare January 4, 2023 23:39
src/memray/_memray/snapshot.cpp Show resolved Hide resolved
src/memray/_memray/snapshot.cpp Outdated Show resolved Hide resolved
src/memray/_memray/snapshot.cpp Show resolved Hide resolved
src/memray/_memray/snapshot.cpp Outdated Show resolved Hide resolved
src/memray/_memray/snapshot.cpp Show resolved Hide resolved
src/memray/_memray/record_writer.cpp Outdated Show resolved Hide resolved
src/memray/_memray/record_writer.cpp Show resolved Hide resolved
src/memray/_memray/record_writer.cpp Outdated Show resolved Hide resolved
src/memray/_memray/record_writer.cpp Show resolved Hide resolved
src/memray/_memray/record_writer.cpp Outdated Show resolved Hide resolved
@pablogsal
Copy link
Member

You need to fix the conflicts in src/memray/commands/run.py from your other PR :)

Previously we handled this by allowing clients of the `RecordWriter` to
write each of the 3 types of records used for describing the memory
mappings in the capture file. Instead, provide a method that takes all
loaded segments and writes them. This is slightly less efficient, but
this is a better abstraction and prevents the tracker from needing to
know implementation details of the representation of memory mappings
used by the writer.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
Cut and paste some functions into different locations in the file. This
has no effect on the compiled code, but this prefactoring will make the
next diff easier to read.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
This refactoring lays the ground work for introducing an abstract base
class that `RecordWriter` extends, and thereby a new type of writer.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
A future commit will make `RecordWriter` an abstract base class with
multiple implementations to choose from. This means that every call will
necessarily indirect through a vtable, and there will no longer be any
advantage to having declared these methods as inline. In preparation for
that change, move the methods into the .cpp file.

There are no changes to the contents of any of these methods. The only
changes are moving them and removing the `inline` declarations.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
Going forward, we want `RecordWriter` to be the name of an abstract base
class with multiple implementations. Call our original implementation
`StreamingRecordWriter`, as it streams each record to the provided sink
immediately, without any aggregations or pre-processing.

Keep the original `RecordWriter` name as an alias for now, to avoid
needing to change clients of the `RecordWriter`.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
This lays the groundwork for the reader to recognize and decode
different file formats.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
Keep the utility methods for writing objects of different types to the
sink in the base class, so that they can be shared by different writers.
Make `d_sink` a protected member of the parent class, so that these
utility methods in the parent class have a sink to work with.

Rather than construct instances of subclasses directly, expose
a `createRecordWriter` factory function that creates and configures an
appropriate `RecordWriter` instance based on the provided `FileFormat`.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
Our existing aggregators don't need much information about what effects
any given munmap() call resulted in. One cares only about how it changed
the total number of heap bytes tracked by memray, and one doesn't even
care about that as long as the interval tree is updated to reflect the
new set of allocations.

We'll be introducing a new aggregator that needs additional information,
though. Specifically, it needs to know which allocations were affected
by the `munmap` (so that it can allocate the freed bytes to the correct
location), and whether they were completely freed, or shrunk, or split
into two separate pieces (so that it can adjust the number of
allocations associated with each location).

Make `removeInterval` expose extra information about what effects the
`munmap` call had on existing allocations, to support this new use case.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
By tracking the last known high water mark as of the last allocation or
deallocation at any given location, we can report records contributing
to the high water mark or records indicating leaked allocations by just
looping over the history associated with each allocation.

Tracking all of the information that we need takes time proportional to
the number of allocations, and reporting the high water mark or leaked
records takes time proportional to the number of distinct allocation
locations.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
This allows callers to iterate over every node in the tree by calling
`nextNode` for each valid index.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
This implementation of the `RecordWriter` interface processes the
allocations fed to it in-memory in order to detect the high water mark
records or the leaked records. Once all allocations have been processed,
it writes information about either the high water mark records or the
leaked records to the capture file. It does not retain history of each
allocation that was performed, only the number of allocations and number
of bytes that each allocation location contributed to the high water
mark or to the leaked allocations.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
No interface changes are required to support returning aggregated
allocations, as we were already returning an allocation object which
included `n_allocations` as a field. However, interface changes are
required to support the heap memory size over time graph. Up until now
the `RecordReader` has emitted only the rss size and time, and the
caller needed to augment those by tracking the heap size on its own and
adding that field. With aggregated capture files the caller can't do
this (since it has no way of knowing the heap size over time), so
instead we need to emit a different type of record, `MemorySnapshot`.
The new record type will come only from aggregated capture files, since
capture files with all allocations don't contain enough information for
the record reader to provide the heap size without tracking it itself.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
Expose the `FileFormat` enum from the `memray._memray` module, and let
users specify a file format to use when constructing a `Tracker`.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
The `AggregatingRecordWriter` throws the thread records it gets into an
`unordered_map`, which can allocate memory. Don't try to capture that
allocation.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
Update `TestHighWatermark` and `TestLeaks` to exercise both capture file
formats.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
Apparently Alpine implements `calloc` in terms of `malloc`, and
`posix_memalign` in terms of `aligned_alloc`. Ensure that we don't
record duplicate allocations on Alpine.

Our `ALL_ALLOCATIONS` capture file format incorrectly recorded both the
`calloc` and the `malloc` it is implemented in terms of, but because the
`malloc` happens first and returns the same pointer as the `calloc`
later returns, the aggregator wound up overwriting the allocation record
for the `malloc` with the one for the `calloc`, silently dropping it.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
Add a test helper to `memray._memray` to help us exercise
`HighWaterMarkAggregator`. I originally intended to put this in
`_memray_test_utils.pyx`, but because that doesn't link in any `.cpp`
files or `libunwind`, that turned out to require a lot of changes, and
it was easier to just put this in `_memray.pyx`.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
Some operations aren't supported when working with aggregated capture
files, and we raise if you attempt them. Add tests exercising this.

Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>
Copy link
Member

@pablogsal pablogsal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This PR is already gigantic and I have reviewed it 2 times and could not find anything immediate that appears to be incorrect so I prefer to land it now and then fix things later if we discover that we missed something.

Excellent, job @godlygeek! 🚀

@pablogsal pablogsal merged commit 95e8210 into bloomberg:main Jan 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants