[TorchFix] Add weights_only to torch.load #8105

kit1980 · 2023-11-09T01:38:02Z

torch.load without weights_only is a potential security issue (see pytorch/test-infra#4671)

Adding weights_only=True is potentially unsafe correctness-wise if full pickling functionality is needed, but that should be rare and the tests should catch it (I changed a couple of places to weights_only=False after test failures).

The "unittests-macos (3.8, macos-m1-12)" failure is preexisting.

pytorch-bot · 2023-11-09T01:38:05Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/8105

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit cd17926 with merge base 01dca0e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch/test-infra#4671 added the linter-only `TorchUnsafeLoadVisitor`, but the issue turned out to be so widespread that manual fixes would be tedious. The codemod is somewhat unsafe correctness-wise because full pickling functionality may still be needed even without `pickle_module`, but I think that's OK because it fixes a security-related issue and codemods need to be verified anyway. Maybe later we should add something like Ruff's recently added `--unsafe-fixes`: https://docs.astral.sh/ruff/linter/#fix-safety

I used this for pytorch/vision#8105.
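As a rough illustration of the kind of check described above, here is a minimal sketch that flags `torch.load` calls lacking an explicit `weights_only` argument. It uses only the standard-library `ast` module and is not TorchFix's actual `TorchUnsafeLoadVisitor`; the function name `find_unsafe_torch_load` is hypothetical, and the sketch ignores `pickle_module`, import aliases, and arguments passed via `**kwargs`.

```python
import ast
from typing import List


def find_unsafe_torch_load(source: str) -> List[int]:
    """Return line numbers of `torch.load(...)` calls that omit `weights_only`."""
    unsafe_lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only the literal spelling `torch.load(...)`; aliased imports
        # such as `from torch import load` are not handled in this sketch.
        is_torch_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "torch"
        )
        if not is_torch_load:
            continue
        # Collect explicit keyword names; a `**kwargs` entry has arg=None,
        # so such calls are flagged as well.
        keyword_names = {kw.arg for kw in node.keywords}
        if "weights_only" not in keyword_names:
            unsafe_lines.append(node.lineno)
    return sorted(unsafe_lines)


example = '''
import torch
a = torch.load("a.pt")
b = torch.load("b.pt", weights_only=True)
c = torch.load("c.pt", weights_only=False)
'''
print(find_unsafe_torch_load(example))  # only the first call is flagged
```

A real codemod would also rewrite the flagged call, which is where the correctness risk mentioned above comes from: the tool cannot know whether the file being loaded needs full pickling.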

pmeier

For the other reviewers: the parameter weights_only is a misnomer. Setting weights_only=True means that only certain types are deserialized on unpickling. However, this is not restricted to "weights"; rather, it

> Indicates whether unpickler should be restricted to loading only tensors, primitive types and dictionaries

The use case is to avoid the security issue of unpickling potentially executing arbitrary code.

However, in all instances in this PR that is not an issue. We have generated the pickled file either dynamically at runtime or statically as part of the repository. Thus, there is no security issue since we are not loading third-party stuff.

I guess we could go for this to lead by example, but there is no other benefit to it. @NicolasHug thoughts?
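To make the restriction discussed above concrete, the following sketch uses only the standard-library `pickle` module to show how a plain pickle payload can trigger a call during loading, and how an allow-listing unpickler blocks it. This follows the general "restricting globals" pattern from the Python `pickle` documentation; it is not how PyTorch implements `weights_only`, and `ALLOWED_GLOBALS`, `RestrictedUnpickler`, `restricted_loads`, and `Malicious` are hypothetical names chosen for illustration.

```python
import io
import pickle

# Hypothetical allow-list for this sketch; PyTorch's actual list differs.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a global by name.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Malicious:
    # pickle calls __reduce__ when saving; a plain unpickler would then call
    # the returned callable with the given arguments while loading.
    def __reduce__(self):
        return (print, ("arbitrary code ran during unpickling",))


# Primitive types and dictionaries need no global lookup, so they load fine.
print(restricted_loads(pickle.dumps({"lr": 0.1, "epochs": 3})))

# The malicious payload references builtins.print, which is not allow-listed.
try:
    restricted_loads(pickle.dumps(Malicious()))
except pickle.UnpicklingError as err:
    print(f"blocked: {err}")
```

The same trade-off raised in this thread applies to the sketch: anything outside the allow-list, such as a cached dataset object with custom classes, fails to load under the restricted unpickler.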

test/test_functional_tensor.py

test/test_transforms_v2.py

vfdev-5 · 2023-11-09T09:59:25Z

references/classification/train.py

@@ -127,7 +127,7 @@ def load_data(traindir, valdir, args):
    if args.cache_dataset and os.path.exists(cache_path):
        # Attention, as the transforms are also cached!
        print(f"Loading dataset_train from {cache_path}")
-        dataset, _ = torch.load(cache_path)
+        dataset, _ = torch.load(cache_path, weights_only=True)


Should we check that the reference script still works with this update?

Yes, certainly. Caching the dataset is not that common, I guess, and there was some discussion about removing it (#6727). However, I agree that this is likely something that could break with weights_only.

We can change to weights_only=False if it's required, just be explicit about it.

Suggested change

-        dataset, _ = torch.load(cache_path, weights_only=True)
+        # TODO: this could probably be weights_only=True
+        dataset, _ = torch.load(cache_path, weights_only=False)

Sorry, I don't have much bandwidth to check that at the moment, but to unblock the PR, let's set it to False as suggested, with a TODO.

kit1980 · 2023-11-09T17:50:58Z

> I guess we could go for this to lead by example, but there is no other benefit to it

@pmeier The thing is that weights_only=True will very likely become the default in PyTorch soon, so specifying the appropriate value for weights_only now will also prevent surprises.


malfet · 2023-12-06T15:30:50Z

> However, in all instances in this PR that is not an issue. We have generated the pickled file either dynamically at runtime or statically as part of the repository. Thus, there is no security issue since we are not loading third-party stuff.

I agree that weights_only should not be a concern for training checkpoints (as they are not persistent), but I would argue that any potentially executable binary code checked into the repository/torch.hub is a security threat, and for that code path I would insist on using weights_only.

@kit1980 do you mind updating the checkpoint loading path to weights_only=False as @NicolasHug suggested, or testing that it's fine to use True (and mentioning it in the test plan)?

kit1980 · 2023-12-14T01:14:18Z

Committed the suggestions by @NicolasHug and rebased the PR.

Co-authored-by: Philip Meier <github.pmeier@posteo.de>

github-actions · 2023-12-14T20:15:33Z

Hey @kit1980!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

Reviewed By: vmoens
Differential Revision: D52538999
fbshipit-source-id: 656cea3784918905cdb8db302ae3f081ed3a8b28
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Co-authored-by: Nicolas Hug <nh.nicolas.hug@gmail.com>

facebook-github-bot added the cla signed label Nov 9, 2023

kit1980 mentioned this pull request Nov 9, 2023

[TorchFix] Add codemod for unsafe load pytorch/test-infra#4715

Merged

kit1980 requested review from pmeier, NicolasHug and malfet November 9, 2023 03:01

kit1980 marked this pull request as ready for review November 9, 2023 04:34

pmeier reviewed Nov 9, 2023

test/test_functional_tensor.py (outdated, resolved)

test/test_transforms_v2.py (outdated, resolved)

vfdev-5 reviewed Nov 9, 2023

NicolasHug reviewed Dec 6, 2023

references/classification/train.py (outdated, resolved)

kit1980 force-pushed the sdym/unsafe-load branch from 97ac576 to 00623a4 Compare December 14, 2023 01:11

kit1980 and others added 9 commits December 14, 2023 10:26

Add weights_only to torch.load

bdc08a1

Fix formatting

eac6253

Add weights_only=False

72bcd69

Fix formatting

ada9564

More weights_only=False

1d3cb40

Update test/test_functional_tensor.py

bfa64e1

Co-authored-by: Philip Meier <github.pmeier@posteo.de>

Update test/test_transforms_v2.py

bf01588

Co-authored-by: Philip Meier <github.pmeier@posteo.de>

Update references/classification/train.py

a708255

Update references/classification/train.py

cd17926

kit1980 force-pushed the sdym/unsafe-load branch from 00623a4 to cd17926 Compare December 14, 2023 18:26

kit1980 merged commit c35d385 into main Dec 14, 2023
64 checks passed

kit1980 mentioned this pull request Dec 19, 2023

[TorchFix] Use weights_only for load pytorch/torchtune#108

Merged
