Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to torch.load(mmap=True) to do MAP_SHARED rather than MAP_PRIVATE #124528

Closed
mikaylagawarecki opened this issue Apr 19, 2024 · 3 comments
Assignees
Labels
module: serialization Issues related to serialization (e.g., via pickle, or otherwise) of PyTorch objects triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Comments

@mikaylagawarecki
Copy link
Contributor

mikaylagawarecki commented Apr 19, 2024

🚀 The feature, motivation and pitch

Currently torch.load(mmap=True) calls mmap(2) with MAP_PRIVATE, this ensures that changes to the loaded checkpoint are not written back to the file.

As suggested by @albanD there might be a way to do postprocessing on checkpoints in CPU RAM-constrained environments if we pass MAP_SHARED instead

This would just involve passing shared=True to UntypedStorage.from_file

Alternatives

No response

Additional context

No response

cc @mruberry

@mikaylagawarecki mikaylagawarecki added module: serialization Issues related to serialization (e.g., via pickle, or otherwise) of PyTorch objects triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module labels Apr 19, 2024
@colinacassidy
Copy link

Can we claim this task? Looking at getting into contributing to this project and this seems like a good first task. Should have a PR up in a day or so.

@mikaylagawarecki
Copy link
Contributor Author

Hey @colinacassidy, thanks for your interest. Unfortunately I plan to pick this one up :( You could check out the actionable label for a list of issues open to be picked up!

@mikaylagawarecki mikaylagawarecki self-assigned this Apr 22, 2024
@albanD
Copy link
Collaborator

albanD commented Apr 22, 2024

I am not a fan of adding an extra argument to load() just for this. So maybe a torch.serialization.default_mmap_visibility would be the best API? Not sure if there are other settings of the mmap call we could expose the same way that might be useful?

mikaylagawarecki added a commit that referenced this issue Apr 24, 2024
mikaylagawarecki added a commit that referenced this issue Apr 25, 2024
Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)




cc albanD

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this issue Apr 25, 2024
Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future




cc albanD

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this issue Apr 25, 2024
…True)"

Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future




cc albanD

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this issue Apr 25, 2024
Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future




cc albanD

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this issue Apr 26, 2024
…True)"

Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future




cc albanD

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this issue Apr 26, 2024
Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future




cc albanD

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this issue Apr 29, 2024
…True)"

Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future




cc albanD

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this issue Apr 29, 2024
Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future




cc albanD

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this issue Apr 29, 2024
…True)"

Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future




cc albanD

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this issue Apr 29, 2024
Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future




cc albanD

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this issue Apr 30, 2024
…True)"

Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future




cc albanD

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this issue Apr 30, 2024
Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future




cc albanD

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this issue Apr 30, 2024
…True)"

Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future




cc albanD

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this issue Apr 30, 2024
Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future




cc albanD

[ghstack-poisoned]
andoorve pushed a commit to andoorve/pytorch that referenced this issue May 1, 2024
Fixes pytorch#124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future

Pull Request resolved: pytorch#124889
Approved by: https://github.com/albanD
pytorch-bot bot pushed a commit that referenced this issue May 3, 2024
Fixes #124528

Going over the options for our MapAllocator and what they do, I don't think any other of them need to be piped up to `torch.load`

https://github.com/pytorch/pytorch/blob/4f29103749c5011529f1abb10b1508a682588909/aten/src/ATen/MapAllocator.h#L8-L16

~However, I wonder if this `MmapVisibility(Enum)` is a good way to represent "or-ing" together of `mmap` flags if we want to extend it in the future. I looked over the flags for [`mmap(2)`](https://man7.org/linux/man-pages/man2/mmap.2.html), and could not immediately see how most of them would be useful for `torch.load` (would maybe `MAP_LOCKED` (like `mlock`) or `MAP_HUGE` ever be worthwhile?)~

Using the flags provided by the python `mmap` library so that we can extend the allowed flags and pipe them down to the cpp `mmap` call if there is a need for other flags in the future

Pull Request resolved: #124889
Approved by: https://github.com/albanD
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: serialization Issues related to serialization (e.g., via pickle, or otherwise) of PyTorch objects triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

No branches or pull requests

3 participants