Report more information for memory profiling #61282
Let's see what happens if we change the API function prototype.
It seems to be OK to change the API.
The comment at the head of this PR should be updated. I suggest explaining why the relative size is being changed to an absolute size.
Also, the downstream profiler's processing of memory events will need to be updated once alloc_size changes from a relative size to an absolute size.
Seems reasonable to me. cc @ilia-cher @ngimel
c10/core/CPUAllocator.cpp
Outdated
@@ -323,7 +324,7 @@ void ProfiledCPUMemoryReporter::Delete(void* ptr) {
   }
   if (profile_memory) {
     reportMemoryUsageToProfiler(
-        ptr, -nbytes, c10::Device(c10::DeviceType::CPU));
+        ptr, nbytes, allocated, 0, c10::Device(c10::DeviceType::CPU));
why did nbytes sign change?
whoops, a mistake when bringing it back. fixed
Hm, a mistake like this that's not caught by the tests is making me very nervous. Can you please add a test that would catch this?
Tests are added for CPU and CUDA. Some minor bugs were found during testing.
I changed the CUDA DeviceCachingAllocator to report the Block-wrapped ptr. It is unwrapped by THCCachingAllocator and then wrapped again as a DataPtr. This unifies the semantics of the reported ptr for the CPU and CUDA allocators.
One bug was caused by free_block modifying the Block that is passed in.
See the commit for details.
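The kind of regression test requested above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the test added in this PR: `TrackingAllocator`, `Event`, and `sizes_balance` are hypothetical names, and the sketch assumes that an allocation reports a positive `alloc_size` while the matching free reports the negated value.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <unordered_map>
#include <vector>

// Hypothetical record of one memory report (not the PyTorch type).
struct Event {
  void* ptr;
  int64_t alloc_size;       // assumed signed: > 0 on allocate, < 0 on free
  int64_t total_allocated;  // bytes live after this event
};

// Hypothetical allocator that records a report for every allocate/free.
class TrackingAllocator {
 public:
  void* allocate(int64_t nbytes) {
    void* ptr = std::malloc(static_cast<std::size_t>(nbytes));
    sizes_[ptr] = nbytes;
    allocated_ += nbytes;
    events_.push_back({ptr, nbytes, allocated_});
    return ptr;
  }

  void deallocate(void* ptr) {
    const int64_t nbytes = sizes_.at(ptr);
    sizes_.erase(ptr);
    allocated_ -= nbytes;
    // The free is reported with the negated size; dropping the minus sign
    // here is exactly the mistake a test should catch.
    events_.push_back({ptr, -nbytes, allocated_});
    std::free(ptr);
  }

  const std::vector<Event>& events() const { return events_; }

 private:
  std::unordered_map<void*, int64_t> sizes_;
  std::vector<Event> events_;
  int64_t allocated_ = 0;
};

// True when the reported sizes for every pointer sum to zero, i.e. each
// free reported the negation of its allocation.
bool sizes_balance(const std::vector<Event>& events) {
  std::unordered_map<void*, int64_t> net;
  for (const Event& e : events) net[e.ptr] += e.alloc_size;
  for (const auto& kv : net) {
    if (kv.second != 0) return false;
  }
  return true;
}
```

If the sign on the free path were flipped, `sizes_balance` would return false after a single allocate/free pair, so the check does not depend on any particular allocation size.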
@ngimel I added CPU and CUDA tests for … There is a lint error asking me to use angle brackets for …
@gdankel Ping....
c10/core/Allocator.h
Outdated
virtual void reportMemoryUsage(
    void* ptr,
    int64_t alloc_size,
    int64_t allocated_size,
The naming is confusing. How about total_allocated and total_reserved?
New names will be much better. Will change.
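For illustration, the hook with the suggested names might look like the sketch below. It is not the actual declaration in c10/core/Allocator.h: `MemoryReporterSketch`, `Report`, and `RecordingReporter` are hypothetical, and the device argument is left out so the example compiles on its own.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the reporting hook, using the suggested
// parameter names total_allocated and total_reserved.
struct MemoryReporterSketch {
  virtual ~MemoryReporterSketch() = default;
  virtual void reportMemoryUsage(
      void* ptr,
      int64_t alloc_size,           // size of this one allocation
      int64_t total_allocated,      // bytes currently handed out
      int64_t total_reserved) = 0;  // bytes held by the allocator
};

// One recorded call, kept as plain data.
struct Report {
  void* ptr;
  int64_t alloc_size;
  int64_t total_allocated;
  int64_t total_reserved;
};

// Hypothetical implementation that simply stores every report.
struct RecordingReporter : MemoryReporterSketch {
  std::vector<Report> reports;

  void reportMemoryUsage(
      void* ptr,
      int64_t alloc_size,
      int64_t total_allocated,
      int64_t total_reserved) override {
    reports.push_back({ptr, alloc_size, total_allocated, total_reserved});
  }
};
```

With these names the two running totals are hard to confuse with the per-pointer `alloc_size`, which seems to be the point of the rename.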
record both the ptr size, total allocated size and total reserved size
… DefaultCPUAllocator
…rmation before the change
LGTM!
@gdankel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Report the size of the pointed-to memory, the total allocated memory, and the total reserved memory all in one report.
ptr and alloc_size will be used for associating with the op trace. allocated_size and reserved_size will be used for the memory trace.
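As a rough sketch of how a consumer could use the four values in one report, the example below keeps a per-pointer table (the part that would be associated with ops) separately from the running totals (the part that would feed a memory trace). `MemoryEvent`, `Summary`, and `summarize` are hypothetical names, not the profiler's actual types, and the sketch assumes a free arrives with a non-positive `alloc_size`.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical flattened form of one report.
struct MemoryEvent {
  void* ptr;
  int64_t alloc_size;       // assumed: > 0 on allocate, <= 0 on free
  int64_t total_allocated;  // the description's allocated_size
  int64_t total_reserved;   // the description's reserved_size
};

struct Summary {
  std::unordered_map<void*, int64_t> live;  // ptr -> size, for op association
  int64_t peak_allocated = 0;               // from the memory trace values
  int64_t peak_reserved = 0;
};

Summary summarize(const std::vector<MemoryEvent>& events) {
  Summary s;
  for (const MemoryEvent& e : events) {
    // ptr and alloc_size identify one allocation and its lifetime.
    if (e.alloc_size > 0) {
      s.live[e.ptr] = e.alloc_size;
    } else {
      s.live.erase(e.ptr);
    }
    // The totals need no bookkeeping: each event carries the current value.
    s.peak_allocated = std::max(s.peak_allocated, e.total_allocated);
    s.peak_reserved = std::max(s.peak_reserved, e.total_reserved);
  }
  return s;
}
```

Because every event carries the totals, a consumer does not have to reconstruct them by summing deltas, which would go wrong if any event were missed.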