Removed file-extension-based arbitrary code execution attack vector #2946

Merged
merged 2 commits into invoke-ai:main on Mar 14, 2023

Conversation

CodeZombie (Contributor)

The Problem

Pickle files (.pkl, .ckpt, etc.) are extremely unsafe: they can be trivially crafted to execute arbitrary code when parsed with torch.load.
Right now, the conventional wisdom among ML researchers and users is to never run untrusted pickle files at all and to use only safetensors files, which cannot be injected with arbitrary code. This is very good advice.

Unfortunately, I have discovered a vulnerability in InvokeAI that allows an attacker to disguise a pickle file as a safetensors file and have the payload execute when InvokeAI loads it.

How It Works

Within model_manager.py and convert_ckpt_to_diffusers.py there are if-statements that decide which load method to use based on the model file's extension. The logic (written in a slightly more readable form than it appears in the codebase) is as follows:

if Path(file).suffix == '.safetensors':
    safetensor_load(file)        # safe loader
else:
    unsafe_pickle_load(file)     # every other extension falls through to pickle

A malicious actor would only need to create an infected .ckpt file and rename its extension to something that does not pass the == '.safetensors' check but still looks like a safetensors file to the user,
for example .Safetensors, .SAFETENSORS, .SafeTensors, etc.

InvokeAI will happily import the file in the Model Manager and execute the payload.
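To make the flaw concrete, here is a small runnable sketch (illustrative only; choose_loader is not an actual InvokeAI function) showing that the Path.suffix comparison is case-sensitive, so a renamed file silently takes the pickle path:

from pathlib import Path

def choose_loader(file: str) -> str:
    # Mirrors the vulnerable logic: exact, case-sensitive suffix match.
    if Path(file).suffix == '.safetensors':
        return 'safetensors'
    return 'pickle'  # everything else falls back to the unsafe loader

print(choose_loader('model.safetensors'))  # safetensors
print(choose_loader('model.SAFETENSORS'))  # pickle  (looks safe, is not)
print(choose_loader('model.Safetensors'))  # pickle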

Proof of Concept

  1. Create a malicious pickle file (https://gist.github.com/CodeZombie/27baa20710d976f45fb93928cbcfe368); a benign sketch of the underlying mechanism is shown after this list.
  2. Rename the .ckpt extension to some variation of .safetensors, ensuring there is a capital letter anywhere in the extension (e.g. malicious_pickle.SAFETENSORS).
  3. Import the 'model' through the Model Manager as you would any other safetensors file.
  4. Upon trying to select the model in the web UI, it will be loaded (or an attempt will be made to convert it to diffusers format) with torch.load, and the payload will execute.
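For readers unfamiliar with the mechanism, the following is a harmless, self-contained sketch (not the payload from the gist above) of why step 4 executes code: pickle calls an object's __reduce__ during deserialization, so loading a crafted file runs whatever callable the author of the file chose.

import pickle

class Demo:
    # __reduce__ tells pickle how to rebuild the object; an attacker can
    # return any callable plus arguments, and pickle will call it on load.
    def __reduce__(self):
        return (print, ("arbitrary code ran during unpickling",))

blob = pickle.dumps(Demo())
pickle.loads(blob)  # prints the message; a real payload would invoke os.system etc.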


The Fix

This pull request changes the logic InvokeAI uses to decide which model loader to use so that the safe behavior is the default. Instead of loading as a pickle if the extension is not exactly .safetensors, it will now always load as a safetensors file unless the extension is exactly .ckpt.
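A minimal sketch of the reversed decision (function and variable names here are placeholders, not the actual InvokeAI code paths):

from pathlib import Path

import torch
from safetensors.torch import load_file

def load_model_weights(path: str):
    # Default-safe: only an exact '.ckpt' suffix takes the pickle path.
    if Path(path).suffix == '.ckpt':
        return torch.load(path, map_location='cpu')  # still unsafe for untrusted files
    # Everything else is treated as safetensors; the parser rejects files
    # that are not valid safetensors data instead of executing them.
    return load_file(path)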

Notes:

I think support for pickle files should be totally dropped ASAP as a matter of security, but I understand that there are reasons this would be difficult.

In the meantime, I think a RestrictedUnpickler or something similar should be implemented as a replacement for torch.load, as this significantly reduces the set of Python functions an attacker has to work with when crafting malicious payloads inside a pickle file.
Automatic1111 already uses this approach with some success (https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/master/modules/safe.py).
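The general pattern, as a hedged sketch based on the restricting-globals recipe in the Python pickle documentation rather than the exact A1111 code, is to override Unpickler.find_class and allow only an explicit set of globals:

import io
import pickle

# Illustrative allowlist; a real one must cover everything a torch
# checkpoint legitimately references.
ALLOWED_GLOBALS = {
    ('collections', 'OrderedDict'),
    ('torch._utils', '_rebuild_tensor_v2'),
}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Refuse any global not on the allowlist, which blocks the usual
        # os.system / subprocess / builtins.eval payload routes.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()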

Prior to this commit, all models were loaded with the extremely unsafe `torch.load` method, except those with the exact extension `.safetensors`. Even a change in casing (e.g. `saFetensors`, `Safetensors`, etc.) would cause the file to be loaded with `torch.load` instead of the much safer `safetensors.torch.load_file`.
If a malicious actor renamed an infected `.ckpt` to something like `.SafeTensors` or `.SAFETENSORS`, an unsuspecting user would think they were loading a safe .safetensors file but would in fact be parsing an unsafe pickle file and executing an attacker's payload. This commit fixes the vulnerability by reversing the loading-method decision logic so that the unsafe `torch.load` is used only when the file extension is exactly `.ckpt`.
@damian0815 (Contributor) left a comment:

great catch, clean patch, thank you.

blessedcoolant merged commit cb48bbd into invoke-ai:main on Mar 14, 2023
lstein added a commit that referenced this pull request Mar 23, 2023
Two related security fixes:

1. Port #2946 from main to 2.3.2 branch - this closes a hole that
   allows a pickle checkpoint file to masquerade as a safetensors
   file.

2. Add pickle scanning to the checkpoint to diffusers conversion
   script. This will be ported to main in a separate PR.
blessedcoolant added a commit that referenced this pull request Mar 24, 2023
…etensor loading (#3011)

Several related security fixes:

1. Port #2946 from main to 2.3.2 branch - this closes a hole that allows
a pickle checkpoint file to masquerade as a safetensors file.
2. Add pickle scanning to the checkpoint to diffusers conversion script.
3. Pickle scan VAE non-safetensors files
4. Avoid running scanner twice on same file during the probing and
conversion process.
5. Clean up diagnostic messages.
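As an aside, the pickle scanning mentioned above can be approximated with the standard-library pickletools module. The sketch below is an illustration of the idea, not InvokeAI's actual scanner; note that a torch zip-format checkpoint stores its pickle inside the archive (data.pkl), which would need to be extracted first.

import pickletools

def pickle_globals(data: bytes) -> list[str]:
    # Walk the opcode stream without executing it and list the
    # module.name globals the pickle would import on load.
    found = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == 'GLOBAL':
            found.append(arg.replace(' ', '.'))  # e.g. 'os.system'
        elif opcode.name == 'STACK_GLOBAL':
            # Newer protocols push module/name on the stack; a real scanner
            # (e.g. the picklescan package) tracks those strings as well.
            found.append('<STACK_GLOBAL: see preceding string opcodes>')
    return found

# Any hit outside an expected allowlist (collections.OrderedDict,
# torch._utils._rebuild_tensor_v2, ...) is reason to reject the file.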