
Add pickle package for generating exploit payloads #587

Merged
lobsterjerusalem merged 12 commits into main from feat/pickle-pkg on May 14, 2026

Conversation

Collaborator

@vlobstein-vc vlobstein-vc commented May 6, 2026

Summary

The framework already ships dedicated packages for the deserialization sinks it sees most: dotnet/ for .NET BinaryFormatter / SOAPFormatter and java/gadgets/ for Java. Python pickle has not had the same treatment. This PR fills that gap with a pickle/ package modelled on the existing dotnet API.

og-rek already exposes a Call{Callable: Class{Module, Name}, Args: Tuple{...}} type whose encoder emits GLOBAL + tuple + REDUCE, so the byte-level capability has been there. The package this PR ships re-implements that primitive in-tree to keep the framework dep-free and adds an offsec-shaped API on top:

  • Low-level opcode emitters (Proto, Global, BinUnicode, Tuple1/2/3, ...) named after Lib/pickle.py for easy cross-reference, plus a length-prefix helper that drives BinUnicode / BinUnicode8 / ShortBinUnicode / BinBytes / ShortBinBytes from a single function.

  • Typed Fragment composition: primitives Str/Int/Bool/Bytes/None, composites TupleOf/ListOf/DictOf, callables Call/CallFragment/Method/Build, top-level wrapper Dump. Every primitive returns a Fragment that pushes one value onto the unpickler stack, and Fragments nest arbitrarily.

// Quick path:
payload := pickle.CreateOSSystem(cmd)

// Or build directly:
payload := pickle.Dump(pickle.Call("builtins", "exec", pickle.Str(code)))

// Nested chains, e.g. eval(open("/etc/passwd").read()):
payload := pickle.Dump(pickle.Call("builtins", "eval",
    pickle.Method(pickle.Call("builtins", "open", pickle.Str("/etc/passwd")), "read"),
))

Create* shortcuts ship for the canonical gadgets: CreateOSSystem, CreateNTSystem, CreateExec, CreateEval, CreateSubprocessPopen. Method / Build cover the getattr and __setstate__ chain patterns that go-exploit modules will reach for.

A pure-Go Disassemble walks a stream into named opcodes; the test suite uses it instead of calling pickle.loads on attacker-shaped bytes. 17 tests, all golden hex or disassembly round-trip, no Python interpreter required at test time.

The default protocol is 2: it is supported by Python 2.3+ and every 3.x release, and its output is deterministic across CPython versions.

The package mirrors the dotnet package's gadget-creation API but for
Python pickle. Two layers:

  - Low-level opcode emitters (Proto, Global, BinUnicode, Tuple1, ...)
    plus value encoders (EncodeString, EncodeTuple, EncodeAny) named
    after CPython's Lib/pickle.py opcodes for easy cross-reference.
  - Fragment composition API (Str, Int, Bool, TupleOf, ListOf, DictOf,
    Call, CallFragment, Method, Build, Dump) where every primitive
    returns a Fragment that pushes one value onto the unpickler stack.
    Calls compose freely, so nested gadgets like
    eval(open(path).read()) build naturally.

CreateOSSystem / CreateNTSystem / CreateExec / CreateEval /
CreateSubprocessPopen wrap the typical RCE shapes; the legacy
Reduce(module, attr, []any) entry point covers callers that have
untyped Go values.

A pure-Go Disassemble walks a stream into a list of named opcodes. The
test suite uses it to validate gadget structure without ever calling
pickle.loads on attacker-shaped bytes.

The package locks to protocol 2 by default: supported by every Python
2.3+ and 3.x release, no protocol-4-only opcodes (SHORT_BINUNICODE,
STACK_GLOBAL, MEMOIZE, FRAME) needed for typical gadgets, and the
emitted bytes are deterministic across CPython versions.

21 tests covering the low-level emitters with byte-exact golden hex,
the high-level shortcuts, the Fragment composition API including a
nested call chain, dict and BUILD shapes, and the disassembler error
paths.

Signed-off-by: Valentin Lobstein <281638514+vlobstein-vc@users.noreply.github.com>
Audit pass to remove duplication:

- values.go and the legacy Reduce(module, attr, []any) in gadgets.go
  duplicated the Fragment API. They are gone; the typed
  Dump(Call(module, attr, ...)) path is the single way to compose.
- Length-prefixed emitters (BinUnicode, BinUnicode8, ShortBinUnicode,
  BinBytes, ShortBinBytes) now share a single emitLengthPrefixed
  helper instead of repeating the make/copy ceremony five times.
- Single-byte opcode helpers (Stop, Mark, Tuple, Tuple1, Tuple2,
  Tuple3, EmptyTuple, EmptyList, EmptyDict, ReduceOp, Appends) route
  through one tiny `single` constructor.
- Disasm collapses six prefixed string/bytes decoders into two
  (decodeStringArg / decodeBytesArg) backed by a shared
  readPrefixedBytes that itself reuses readInt for the length.
- Fragment composites (TupleOf, ListOf, DictOf, CallFragment, Method,
  Build) drop the bytes.Buffer ceremony in favour of a small
  concatFragments helper that allocates exactly once.

Net diff is roughly -450 lines for the same public API surface (modulo
the dropped Encode* / legacy Reduce, which had no callers). The test
suite stays at 14 functions, all passing; lint stays at zero issues.

Signed-off-by: Valentin Lobstein <281638514+vlobstein-vc@users.noreply.github.com>
Doc strings on every exported symbol were stating the obvious. Tightened
to one-liners where the function name + signature already communicate
the behaviour, kept the why on each helper that has a non-obvious choice
(protocol gating, panic conditions, opcode shape).

Signed-off-by: Valentin Lobstein <281638514+vlobstein-vc@users.noreply.github.com>
@vlobstein-vc vlobstein-vc requested a review from terrorbyte May 6, 2026 11:10
The decoder had a 41-line switch with one error-wrapping branch per
byte count and a separate signed-int path on the 4-byte case. Replace
with a single readUint that reads N little-endian bytes into a uint64
through one io.ReadFull plus a tiny shift loop. Sign-extension moves to
decodeIntArg where it actually matters (BININT). readPrefixedBytes now
shares the same primitive.

Net: disasm.go drops from 253 to 233 lines, all 17 tests pass.
Signed-off-by: Valentin Lobstein <281638514+vlobstein-vc@users.noreply.github.com>
decodeOp had a 30-line switch that mapped each opcode to (name,
byteCount, signed) by hand. Replace with a single argOps map[byte]argSpec
and one tiny dispatcher: lookup, call the spec's reader, wrap the
arg into Op. Adding a new opcode now means one map entry instead of
one switch case plus one helper.

Net: roughly the same LOC (237 vs 233), but the source of truth for
"which opcodes carry args, what shape, what name" is now one block of
data rather than scattered across switch cases and per-shape helper
signatures.

Signed-off-by: Valentin Lobstein <281638514+vlobstein-vc@users.noreply.github.com>
ListOf and DictOf shared the same shape (empty case + MARK..close
case) with only the opener / closer opcodes and the item-flatten step
differing. Extract collection(emptyOp, closeOp, items) so both end up
as 3-line wrappers. A future SetOf (proto-4 EMPTY_SET / ADDITEMS) drops
in as one collection call.

TupleOf's per-arity switch becomes a tupleSmallClosers byte array
indexed by len(elems); the large-arity branch keeps the MARK..TUPLE
fallback. Same byte output, less dispatch code.

Signed-off-by: Valentin Lobstein <281638514+vlobstein-vc@users.noreply.github.com>
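The two refactors in this commit can be illustrated with the following sketch. The names collection, tupleSmallClosers, ListOf, TupleOf, concatFragments, and Fragment come from the commit messages; the signatures, the int1 test helper, and the exact bodies are assumptions. DictOf is omitted because its key/value flattening step is not described in enough detail here.

```go
package main

import "fmt"

// Fragment is a run of pickle bytes that pushes one value on the stack.
type Fragment []byte

// Opcode bytes as defined in CPython's Lib/pickle.py.
const (
	OpMark       byte = '('
	OpTuple      byte = 't'
	OpEmptyTuple byte = ')'
	OpTuple1     byte = 0x85
	OpTuple2     byte = 0x86
	OpTuple3     byte = 0x87
	OpEmptyList  byte = ']'
	OpAppends    byte = 'e'
)

// concatFragments joins fragments with a single allocation.
func concatFragments(parts ...Fragment) Fragment {
	n := 0
	for _, p := range parts {
		n += len(p)
	}
	out := make(Fragment, 0, n)
	for _, p := range parts {
		out = append(out, p...)
	}
	return out
}

// collection emits emptyOp alone for no items, otherwise
// emptyOp MARK items... closeOp.
func collection(emptyOp, closeOp byte, items []Fragment) Fragment {
	if len(items) == 0 {
		return Fragment{emptyOp}
	}
	body := concatFragments(items...)
	out := make(Fragment, 0, len(body)+3)
	out = append(out, emptyOp, OpMark)
	out = append(out, body...)
	return append(out, closeOp)
}

func ListOf(items ...Fragment) Fragment {
	return collection(OpEmptyList, OpAppends, items)
}

// tupleSmallClosers is indexed by element count for arities 0 to 3.
var tupleSmallClosers = [4]byte{OpEmptyTuple, OpTuple1, OpTuple2, OpTuple3}

func TupleOf(elems ...Fragment) Fragment {
	body := concatFragments(elems...)
	if len(elems) < len(tupleSmallClosers) {
		return append(body, tupleSmallClosers[len(elems)])
	}
	out := append(Fragment{OpMark}, body...) // large arity: MARK..TUPLE
	return append(out, OpTuple)
}

// int1 is a hypothetical helper emitting BININT1 for small test values.
func int1(v byte) Fragment { return Fragment{'K', v} }

func main() {
	fmt.Printf("%x\n", ListOf(int1(1), int1(2)))
	fmt.Printf("%x\n", TupleOf(int1(1), int1(2)))
}
```

A set variant would only need its own opener and closer opcodes passed to collection.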
Stop / Mark / ReduceOp / EmptyTuple / EmptyList / EmptyDict / Appends /
Tuple / Tuple1..3 were one-line returns of []byte{OpXxx}. Internal
callers already use Fragment{OpXxx} directly, so these wrappers were
exported-but-dead. Drop them along with TestSingleByteOpcodes; the byte
constants stay and serve any caller that needs the byte form.

BinUnicode8 covers strings >4 GiB; gadget payloads never reach that
size and at protocol 2 the opcode would not even decode. Drop.

Signed-off-by: Valentin Lobstein <281638514+vlobstein-vc@users.noreply.github.com>
Contributor

Copilot AI left a comment


Pull request overview

Adds a new pickle/ package to generate attacker-controlled Python pickle byte streams (protocol 2 by default) and to disassemble pickle streams into named opcodes for inspection/testing, aligning with existing deserialization-sink support in the framework.

Changes:

  • Introduces low-level opcode emitters (Proto, Global, BinUnicode, integer encoders, etc.) and higher-level composable Fragment builders (Str, Int, TupleOf, ListOf, DictOf, Call, Method, Build, Dump).
  • Adds common exploit “shortcut” gadgets (e.g., CreateOSSystem, CreateExec, CreateSubprocessPopen).
  • Adds a pure-Go disassembler and a golden/disassembly-based test suite for deterministic verification without needing Python at test time.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
pickle/pickle.go Package docs and protocol constant for the new pickle payload generator.
pickle/opcodes.go Low-level opcode constants and byte emitters for pickle encoding.
pickle/fragments.go High-level fragment composition API and Dump wrapper.
pickle/gadgets.go Convenience gadget constructors for common RCE primitives.
pickle/disasm.go Pure-Go pickle disassembler used by tests and for safe inspection.
pickle/pickle_test.go Golden-hex and disassembly-shape tests covering emitters, gadgets, and disassembler behavior.


Comment thread pickle/fragments.go Outdated
Comment thread pickle/opcodes.go Outdated
Comment thread pickle/pickle.go Outdated
Comment thread pickle/pickle.go Outdated
…add BinUnicode8 + BinBytes8

Signed-off-by: Valentin Lobstein <281638514+vlobstein-vc@users.noreply.github.com>
…ary-side panics

Signed-off-by: Valentin Lobstein <281638514+vlobstein-vc@users.noreply.github.com>
Signed-off-by: Valentin Lobstein <281638514+vlobstein-vc@users.noreply.github.com>
Collaborator

@lobsterjerusalem lobsterjerusalem left a comment


Just added the bit about standardizing the length checking throughout.

Btw this is awesome, it was actually something on my TODO list that I can knock off now 👍.

Comment thread pickle/opcodes.go Outdated
Comment thread pickle/opcodes.go Outdated
Comment thread pickle/opcodes.go Outdated
Comment thread pickle/opcodes.go Outdated
… (value, bool) through wrappers

Signed-off-by: Valentin Lobstein <281638514+vlobstein-vc@users.noreply.github.com>
Collaborator

@lobsterjerusalem lobsterjerusalem left a comment


Made a slight change to the example usage in the pickle docs to match the update, but this looks good to go.

@lobsterjerusalem lobsterjerusalem merged commit 12e83a4 into main May 14, 2026
5 checks passed
@terrorbyte terrorbyte mentioned this pull request May 21, 2026
4 participants