Delete methods for sequences inside managers #113
Conversation
Why is this PR trying to completely delete and renew
Thank you for the contribution @aliPMPAINT - I think I'll need to dive into it a bit further, as we need to confirm all the subsequent resources are being freed (so there are no memory leaks). It's also important to consider that some asynchronous sequences may still be running, so these should either be awaited or an issue should be raised. In regards to your question, this is most likely due to the line endings, as the output file would have different line endings if built on Windows vs Linux (I think it would be good to standardise this with post-processing / regex, along the lines of the sketch below).
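For illustration, a minimal sketch of the kind of post-processing mentioned above; the `normaliseLineEndings` helper is a hypothetical name, not part of the repository:

```cpp
#include <regex>
#include <string>

// A minimal sketch, assuming a hypothetical helper: collapse Windows (\r\n)
// and bare carriage-return (\r) line endings into Unix (\n) so that output
// files built on different platforms compare equal.
std::string normaliseLineEndings(const std::string& text)
{
    return std::regex_replace(text, std::regex("\r\n?"), "\n");
}
```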
Ok just had a quick look, it seems there is a failing test:
I can guess what made the test fail: I changed the manager's destructor (the first commit), and as the sequence was created outside of the manager's scope, it was not destroyed thoroughly.
Ok, I see. I think it would still be important to ensure the destructor is called correctly in the manager to avoid lingering resources, as ultimately the GPU memory is managed hierarchically (a rough sketch of the teardown chain follows below).
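For reference, a simplified sketch of that hierarchy (Manager -> Sequence -> Op -> Tensor); the class names are placeholders and do not mirror the repository's actual API exactly:

```cpp
#include <memory>
#include <string>
#include <unordered_map>

class Sequence {
  public:
    ~Sequence()
    {
        // Destroying the sequence frees its ops, which in turn free any
        // tensors they own, so GPU memory is released down the chain.
    }
};

class Manager {
  public:
    ~Manager()
    {
        // Clearing the map drops the last references to the managed
        // sequences, triggering the rest of the teardown chain.
        mManagedSequences.clear();
    }

  private:
    std::unordered_map<std::string, std::shared_ptr<Sequence>> mManagedSequences;
};
```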
Yeah sure, and sorry for my silly mistake.
No worries, the memory management hierarchy is not too complex once you get your head around it, but it has certainly gotten me into some long debugging sessions. Using memory profilers does help with understanding which objects / code components may be causing issues; more detail via #15.
@axsaucedo Bump, this is quite important.
@alexander-g hmm, that is actually a pretty good idea... ok, I agree this is quite important. Thank you @aliPMPAINT for the initial work. I will pull the branch, do some deeper testing, and expose it via the Python interface to make sure we can merge. Thanks both for driving this area forward.
@alexander-g Currently looking at this, I now remember why we don't delete the sequence right after it is executed. This is by design, because the sequence actually acquires GPU memory ownership of the Tensors when running OpCreateTensor. More specifically, in the current memory hierarchy the OpBase has a choice of whether or not to free the Tensors it owns: OpCreateTensor takes ownership of the tensors it uses by setting the last constructor value to true, and a separate line specifies the ownership of the tensors themselves (a rough sketch of this pattern follows below).

This is tied to the discussion in your #130 PR (#130 (comment)), where I mention that there may be a better way to think about the OpTensorCreate operation and the hierarchy of the Tensor memory management overall. This would require a deeper dive, primarily as I need to check whether it can make sense for the memory ownership to no longer be Manager -> Sequence -> Op -> Tensor, updating it to remove Tensor as a dependency of the operation. That could require a less trivial refactor, as it's not clear what the hierarchy would be otherwise.

This is the reason why anonymous Sequences are actually kept. It could be changed so that anonymous sequences are destroyed after execution, which is what we had at the very beginning, but as you can imagine this led to tensor memory being destroyed right after an OpTensorCreate in an anonymous sequence. It would also require specific awareness from users, who would always have to run the OpTensorCreate in a non-anonymous sequence and then the rest of the operations in an anonymous one. Does this make sense?
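To illustrate, here is a hedged sketch of that ownership pattern; all names and signatures are simplified placeholders rather than the exact Kompute code:

```cpp
#include <memory>
#include <utility>
#include <vector>

class Tensor {
  public:
    void freeMemoryDestroyGPUResources()
    {
        // Release the underlying GPU buffers and memory.
    }
};

// The op records whether it should free the tensors it manages; the choice
// is the last constructor argument, as described above.
class OpBase {
  public:
    OpBase(std::vector<std::shared_ptr<Tensor>> tensors, bool freeTensors)
      : mTensors(std::move(tensors))
      , mFreeTensors(freeTensors)
    {}

    virtual ~OpBase()
    {
        if (mFreeTensors) {
            for (auto& tensor : mTensors) {
                tensor->freeMemoryDestroyGPUResources();
            }
        }
    }

  protected:
    std::vector<std::shared_ptr<Tensor>> mTensors;
    bool mFreeTensors;
};

// OpCreateTensor opts into ownership by passing true, so destroying the
// sequence that holds the op also destroys the tensors' GPU memory.
class OpCreateTensor : public OpBase {
  public:
    explicit OpCreateTensor(std::vector<std::shared_ptr<Tensor>> tensors)
      : OpBase(std::move(tensors), /*freeTensors=*/true)
    {}
};
```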
For completeness, it may be worth sharing another piece of insight to give the full picture. When I was initially designing the tensors, I had an idea where, instead of having two tensors (a Staging and a Device tensor), all the logic would be contained in a single Tensor holding both the staging and device memories. The reason this was not pursued in the end is that the idea was to provide further access and granularity into how the memory is made available to the user; this way users can decide to build their own OpTensor operations, which may own different tensors, sometimes perhaps destroying the staging tensor completely to avoid the memory overhead.

With this in mind, it could be revisited, but there is the disadvantage that for any device tensor there would always be the overhead of an extra memory component obscured from the user. The advantage is that the staging tensor wouldn't be created directly by the OpCreate operation, which means the tensors could be owned by the top-level manager. For all of these there are various tradeoffs that could be reassessed.

For now, the simplest way to approach this is to enable a flag that allows anonymous sequences to be deleted as soon as they are executed (a minimal sketch below). This could be used by more advanced users who know that using OpCreateTensor involves memory management, and hence should be executed in a non-anonymous sequence that is managed manually (and then deleted explicitly with the
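A minimal sketch of what such an opt-in flag could look like; the names and signatures here are illustrative assumptions, not the repository's API:

```cpp
#include <memory>
#include <utility>
#include <vector>

class Sequence {
  public:
    void eval()
    {
        // Submit the recorded operations to the GPU and wait for them.
    }
};

class Manager {
  public:
    explicit Manager(bool destroyAnonymousSequences = false)
      : mDestroyAnonymousSequences(destroyAnonymousSequences)
    {}

    void evalAnonymous(std::shared_ptr<Sequence> sequence)
    {
        sequence->eval();
        if (!mDestroyAnonymousSequences) {
            // Default behaviour: retain the sequence so tensors created by
            // an OpCreateTensor inside it keep their GPU memory alive.
            mAnonymousSequences.push_back(std::move(sequence));
        }
        // Otherwise the sequence (and anything it owns) is released here,
        // so OpCreateTensor must have run in a managed, named sequence.
    }

  private:
    bool mDestroyAnonymousSequences;
    std::vector<std::shared_ptr<Sequence>> mAnonymousSequences;
};
```

The default keeps today's behaviour; only users who enable the flag take on the responsibility of creating tensors in sequences they manage themselves.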
I will update #36 with the discussion from this issue as well.
Closing in favour of #113
This partially solves #36; the only thing that remains is creating a method that gives the ability to delete a given anonymous sequence.
It's my first time making a PR for a C++ project, so it might not be good. I have provided comments for each commit explaining what each does.