Add `weights_only` option to `torch.load` #86812

malfet · 2022-10-12T18:03:33Z

This addresses the security issue in Python's default unpickler, which allows arbitrary code execution while unpickling.
Restrict the classes allowed to be unpickled to None, int, bool, str, float, list, tuple, dict/OrderedDict, torch.Size, torch.nn.Parameter, and the torch.Tensor and torch.Storage variants.
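
To illustrate the allowlist idea in isolation, here is a minimal sketch that uses only the standard library. It overrides `pickle.Unpickler.find_class`, which is the approach documented in the Python `pickle` docs, and is not the custom unpickler this PR adds. The names `_ALLOWED_GLOBALS`, `AllowlistUnpickler`, `restricted_loads`, and `Exploit` are hypothetical and exist only for this example.

```python
import collections
import io
import pickle

# Hypothetical allowlist for this sketch: (module, name) pairs that may be resolved.
_ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"): collections.OrderedDict,
}


class AllowlistUnpickler(pickle.Unpickler):
    """Refuses to resolve any global that is not explicitly allowlisted."""

    def find_class(self, module, name):
        try:
            return _ALLOWED_GLOBALS[(module, name)]
        except KeyError:
            raise pickle.UnpicklingError(
                f"Unsupported global: {module}.{name}"
            ) from None


def restricted_loads(data: bytes):
    """Unpickle `data`; only plain containers and allowlisted globals are accepted."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


class Exploit:
    """Stand-in for a malicious payload: a plain unpickle would call `print`."""

    def __reduce__(self):
        return (print, ("arbitrary code ran",))


# Plain containers and scalars never go through find_class, so they load fine.
safe_bytes = pickle.dumps({"weight": [1.0, 2.0], "shape": (2,), "bias": None})
print(restricted_loads(safe_bytes))

# The payload references builtins.print, which is not on the allowlist.
try:
    restricted_loads(pickle.dumps(Exploit()))
except pickle.UnpicklingError as err:
    print("rejected:", err)
```

With the standard `pickle.loads`, the `Exploit` payload would run `print` during loading; the restricted loader raises before any callable is resolved.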

By default weights_only is set to False, but the TORCH_FORCE_WEIGHTS_ONLY_LOAD environment variable provides a global override that forces safe-only loading.
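
A rough sketch of how such an override could be resolved is shown below. Only the variable name comes from this PR; the helper `resolve_weights_only` and the set of accepted truthy spellings are assumptions made for illustration.

```python
import os

# Assumption: these spellings count as "enabled". The PR description does not
# say which values the real implementation accepts.
_TRUTHY = {"1", "y", "yes", "true"}


def resolve_weights_only(weights_only: bool = False, environ=os.environ) -> bool:
    """Return the effective flag; the environment can only force safe loading on."""
    raw = environ.get("TORCH_FORCE_WEIGHTS_ONLY_LOAD", "0")
    forced = raw.strip().lower() in _TRUTHY
    return weights_only or forced


# The caller's default stays False unless the environment forces it.
print(resolve_weights_only(False, environ={}))
print(resolve_weights_only(False, environ={"TORCH_FORCE_WEIGHTS_ONLY_LOAD": "1"}))
```

In this sketch the override is one-directional: it can turn safe loading on globally but cannot turn off a `weights_only=True` requested by the caller.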

To some extent, addresses #52596

pytorch-bot · 2022-10-12T18:03:35Z

❌ 3 Failures, 3 Pending

As of commit 2a2912f:

dzhulgakov

Nice!

Can you also add some tests please?

torch/_weights_only_unpickler.py

torch/serialization.py

McPatate

Kudos on reimplementing the unpickler yourselves, this looks very nice!

torch/_weights_only_unpickler.py

ezyang · 2022-10-17T15:05:21Z

torch/serialization.py

@@ -742,6 +746,15 @@ def load(
        # Load a module with 'ascii' encoding for unpickling
        >>> torch.load('module.pt', encoding='ascii')
    """
+    UNSAFE_MESSAGE = (
+        "*WARNING* Safe load failed, defaulting to standard Python pickle module, "


It seems weird to have a mode which tries safe load, and then AUTOMATICALLY tries the unsafe load after. Like, what's the point? It's not any more secure, and it also gives you a false sense of security when there is no warning because the safe load succeeded.

I also thought it was strange to fall back to the unsafe load if the safe load failed; I suppose it's to avoid breaking backward compatibility.

I guess the question is whether we care about backward compatibility or not. I'm of two minds, to be frank, but I'm fine with flipping the logic so that load has to be called with weights_only set to False.

We do care about BC... but if you want to deprecate the old insecure API, you should raise the warning unconditionally.

It seems weird to have a mode which tries safe load, and then AUTOMATICALLY tries the unsafe load after.

@ezyang - note that this warning as written currently in the diff gets triggered only if weights_only=False. So it doesn't violate safety.

This behavior (as implemented) makes sense only as a temporary "one release" workaround to drive people to put explicit weights_only=False in the needed places. I.e. those would be places where this logic fails and generates a warning. Then in the follow up release we'd switch default to be weights_only=True. Places that were handling tensors only anyway (presumably the majority) won't need to change the code, will never see the warning and will become secure in the future.

Of course we can jump to weights_only=True right away but I'm a bit hesitant about breaking BC
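
The interim behavior described in this thread (try the safe path first, and fall back with a warning only when `weights_only=False`) can be sketched with the standard library alone. This is an illustration of the discussion, not the code that was merged; `_NoGlobalsUnpickler`, `_safe_loads`, and `load_with_transition` are hypothetical names.

```python
import collections
import io
import pickle
import warnings


class _NoGlobalsUnpickler(pickle.Unpickler):
    """Stand-in for a restricted unpickler: rejects every global lookup."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"Unsupported global: {module}.{name}")


def _safe_loads(data: bytes):
    return _NoGlobalsUnpickler(io.BytesIO(data)).load()


def load_with_transition(data: bytes, weights_only: bool = False):
    """Try the restricted path first; fall back only if the caller allows it."""
    try:
        return _safe_loads(data)
    except pickle.UnpicklingError as err:
        if weights_only:
            # Strict mode: never touch the unrestricted unpickler.
            raise
        warnings.warn(
            f"Safe load failed ({err}); falling back to the standard pickle module",
            stacklevel=2,
        )
        return pickle.loads(data)


# Plain containers load through the safe path and produce no warning.
print(load_with_transition(pickle.dumps({"w": [1.0, 2.0]})))

# An OrderedDict needs a global lookup, so the default mode warns and falls back.
payload = pickle.dumps(collections.OrderedDict(a=1))
print(load_with_transition(payload))
```

Callers that only handle plain data never see the warning, which is the property the comment relies on for later switching the default to `weights_only=True`.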

ezyang · 2022-10-17T15:10:38Z

torch/_weights_only_unpickler.py

+                    "torch.LongStorage": torch.FloatStorage,
+                    "torch.nn.parameter.Parameter": torch.nn.Parameter,
+                    "torch._utils._rebuild_parameter": torch._utils._rebuild_parameter,
+                    "torch._utils._rebuild_tensor_v2": torch._utils._rebuild_tensor_v2,


There is also an implicit invariant, which is that each of these functions/constructors must be able to take arbitrary user input and "construct" it safely, without triggering arbitrary code execution. This is not obvious when implementing these functions and needs to be noted at every such spot.
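
As a hedged illustration of that invariant, the sketch below shows what a defensively written rebuild function might look like. `rebuild_buffer` and `MAX_ELEMENTS` are hypothetical and do not correspond to any function in `torch._utils`; the point is only that every argument is treated as attacker-controlled.

```python
import pickle

# Hypothetical cap so a crafted pickle cannot request an enormous allocation.
MAX_ELEMENTS = 1 << 20


def rebuild_buffer(length, fill):
    """Hypothetical allowlisted rebuild function.

    NOTE: reachable from untrusted pickles. Every argument must be validated
    before use, and nothing passed in may ever be called or imported.
    """
    # Exact type checks: subclasses could override operators with arbitrary code.
    if type(length) is not int or not 0 <= length <= MAX_ELEMENTS:
        raise pickle.UnpicklingError(f"Invalid buffer length: {length!r}")
    if type(fill) not in (int, float):
        raise pickle.UnpicklingError(f"Invalid fill value: {fill!r}")
    return [fill] * length


print(rebuild_buffer(3, 0.0))
```

The docstring note is the kind of marker the comment asks for, so that a later change to the function does not silently widen what an untrusted pickle can trigger.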

torch/_weights_only_unpickler.py

torch/serialization.py

dzhulgakov · 2022-10-18T07:51:18Z

torch/serialization.py

To validate that loading pickled object raises an error

malfet · 2022-10-20T00:03:06Z

Incorporating more feedback from the discussions:

- Allowlist all torch._tensor_types and torch._storage_types
- Add a global override of the behavior via an environment variable, defaulting to false
- Enable safe unpickling for dict_only use cases in unit tests

ezyang · 2022-10-20T23:09:02Z

torch/serialization.py

+                    try:
+                        return _load(opened_zipfile, map_location, _weights_only_unpickler, **pickle_load_args)
+                    except RuntimeError as e:
+                        raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None


Are we sure we definitely want to suppress the internal exception?

All error messages raised by _weights_only_unpickler are unique, but on the other hand, the error message is much cleaner this way.
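
The trade-off being discussed is standard Python exception chaining. The sketch below contrasts `from None` with `from e`; the `PREFIX` string and the function names are hypothetical, since the full `UNSAFE_MESSAGE` text is not shown here.

```python
import pickle

# Hypothetical message prefix used only in this example.
PREFIX = "Weights only load failed: "


def _inner():
    raise RuntimeError("Unsupported class os.system")


def load_suppressed():
    try:
        _inner()
    except RuntimeError as e:
        # `from None` hides the original exception in the printed traceback.
        raise pickle.UnpicklingError(PREFIX + str(e)) from None


def load_chained():
    try:
        _inner()
    except RuntimeError as e:
        # `from e` keeps the original exception as the explicit cause.
        raise pickle.UnpicklingError(PREFIX + str(e)) from e


def describe(fn):
    """Run `fn` and report what the raised exception carries with it."""
    try:
        fn()
    except pickle.UnpicklingError as err:
        return {"message": str(err), "cause": err.__cause__, "context": err.__context__}


for fn in (load_suppressed, load_chained):
    print(fn.__name__, describe(fn))
```

Both variants produce the same message. With `from None` the traceback shows only the `UnpicklingError`, although the original exception is still reachable through `__context__`; with `from e` the traceback also prints the inner `RuntimeError` as the direct cause.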

malfet · 2022-10-21T01:08:03Z

@pytorchbot merge -f "Mac failures are spurious"

pytorchmergebot · 2022-10-21T01:09:45Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).


github-actions · 2022-10-21T01:10:21Z

Hey @malfet.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

This addresses the security issue in Python's default `unpickler`, which allows arbitrary code execution while unpickling. Restrict the classes allowed to be unpickled to `None`, `int`, `bool`, `str`, `float`, `list`, `tuple`, `dict`/`OrderedDict`, `torch.Size`, `torch.nn.Parameter`, and the `torch.Tensor` and `torch.Storage` variants.

By default `weights_only` is set to `False`, but the `TORCH_FORCE_WEIGHTS_ONLY_LOAD` environment variable provides a global override that forces safe-only loading.

To some extent, addresses #52596

Pull Request resolved: #86812
Approved by: https://github.com/ezyang
(cherry picked from commit 961ebca)

* Tweak several test serialization to store models state_dict (#87143)

Namely, change:
- `test_meta_serialization`
- `test_serialization_2gb_file`
- `test_pathlike_serialization`

Pull Request resolved: #87143
Approved by: https://github.com/ezyang
(cherry picked from commit 4a533f1)

* Add `weights_only` option to `torch.load` (#86812)

This addresses the security issue in Python's default `unpickler`, which allows arbitrary code execution while unpickling. Restrict the classes allowed to be unpickled to `None`, `int`, `bool`, `str`, `float`, `list`, `tuple`, `dict`/`OrderedDict`, `torch.Size`, `torch.nn.Parameter`, and the `torch.Tensor` and `torch.Storage` variants.

By default `weights_only` is set to `False`, but the `TORCH_FORCE_WEIGHTS_ONLY_LOAD` environment variable provides a global override that forces safe-only loading.

To some extent, addresses #52596

Pull Request resolved: #86812
Approved by: https://github.com/ezyang
(cherry picked from commit 961ebca)

dzhulgakov · 2022-10-22T05:37:39Z

torch/_weights_only_unpickler.py

+    for ts in torch._storage_classes:
+        rc[f"{ts.__module__}.{ts.__name__}"] = ts
+    # Rebuild functions
+    for f in [


Is `_rebuild_qtensor` missing? How can we ensure this list stays up to date?

…98479) This adds a `weights_only` option to torch.hub.load_state_dict_from_url which is helpful for loading pretrained models from potentially untrusted sources.

Ex: https://github.com/d4l3k/torchdrive/blob/main/torchdrive/models/simple_bev.py#L618-L621

See #86812 for more info on weights_only

Test plan:
```
pytest test/test_hub.py
```

Pull Request resolved: #98479
Approved by: https://github.com/NicolasHug

malfet force-pushed the malfet/safer-unpickler branch 3 times, most recently from e5c20ff to 7af2388 Compare October 14, 2022 22:52

malfet changed the title ~~[WIP] Safer unpickler~~ Add load_weights_only option to torch.load Oct 14, 2022

malfet marked this pull request as ready for review October 14, 2022 23:26

malfet requested review from ezyang and soumith October 14, 2022 23:32

dzhulgakov self-requested a review October 15, 2022 04:23

dzhulgakov requested changes Oct 15, 2022

McPatate reviewed Oct 15, 2022

ezyang reviewed Oct 17, 2022

torch/_weights_only_unpickler.py (outdated)

ezyang reviewed Oct 17, 2022

torch/_weights_only_unpickler.py (outdated)

ezyang reviewed Oct 17, 2022

torch/_weights_only_unpickler.py

ezyang reviewed Oct 17, 2022

torch/_weights_only_unpickler.py (outdated)

malfet force-pushed the malfet/safer-unpickler branch from 43f7b75 to a7ceca3 Compare October 17, 2022 21:13

malfet changed the title ~~Add load_weights_only option to torch.load~~ Add weights_only option to torch.load Oct 18, 2022

dzhulgakov reviewed Oct 18, 2022

malfet added 8 commits October 19, 2022 20:54

[WIP] Safer unpickler

895557d

Fix lint, further restrict unpickler

0ae754e

Improve unsafe warning

86af160

Address some review feedback

9f1d492

And tweak one tests

428c7f9

To validate that loading pickled object raises an error

Another typo

60ad125

More changes, but need to add validation

f00f088

Add testing, more datatypes, docs etc

77fb38d

malfet force-pushed the malfet/safer-unpickler branch from 6d90cd5 to 77fb38d Compare October 19, 2022 23:58

malfet requested review from dzhulgakov and ezyang October 20, 2022 00:03

malfet added 5 commits October 20, 2022 01:43

Add unittest for trivial exploit

414069b

More checks in NEWOBJ

c669de6

More tests

8794620

Audit persistent_load args

36ff87e

Fix lint

2a2912f

ezyang approved these changes Oct 20, 2022

pytorch-bot bot added the ciflow/trunk label Oct 20, 2022

pytorchmergebot added the Merged label Oct 21, 2022

pytorchmergebot closed this in 961ebca Oct 21, 2022

malfet mentioned this pull request Oct 21, 2022

[v.1.13.0] Release Tracker #86312

Closed

dzhulgakov reviewed Oct 22, 2022

kshitij12345 mentioned this pull request Oct 26, 2022

[fix] allow saving python attr on Tensor and Parameter via torch.save #81616

Closed

1 task

malfet deleted the malfet/safer-unpickler branch November 4, 2022 04:30

lopho mentioned this pull request Nov 4, 2022

torch.load overwrites pickle_module parameter indescriminately #88438

Closed

d4l3k mentioned this pull request Apr 6, 2023

torch.hub: add safe weights_only option to load_state_dict_from_url #98479

Closed

mvesin mentioned this pull request Aug 11, 2023

torch.load() secure unpickling fedbiomed/fedbiomed#832

Open

kit1980 mentioned this pull request Feb 29, 2024

torch.load without weights_only parameter is unsafe DLR-RM/stable-baselines3#1852

Open
