Skip to content

feat(lint): add encode-packed-collision#14914

Merged
grandizzy merged 13 commits into
masterfrom
mablr/encode-packed-collision_lint
May 29, 2026
Merged

feat(lint): add encode-packed-collision#14914
grandizzy merged 13 commits into
masterfrom
mablr/encode-packed-collision_lint

Conversation

@mablr
Copy link
Copy Markdown
Collaborator

@mablr mablr commented May 26, 2026

Description

Closes OSS-58

Add encode-packed-collision lint.

Detects abi.encodePacked() calls with 2+ dynamic-type arguments (string, bytes, T[]) which can produce hash collisions, e.g. encodePacked("a","bc") == encodePacked("ab","c").

PR Checklist

  • Added Tests
  • Added Documentation
  • Breaking changes

Detects `abi.encodePacked()` calls with 2+ dynamic-type arguments
(`string`, `bytes`, `T[]`) which can produce hash collisions, e.g.
`encodePacked("a","bc") == encodePacked("ab","c")`.
Comment thread crates/lint/src/sol/high/encode_packed_collision.rs Outdated
Comment thread crates/lint/testdata/EncodePackedCollision.sol
figtracer
figtracer previously approved these changes May 26, 2026
Comment thread crates/lint/src/sol/high/encode_packed_collision.rs Outdated
Copy link
Copy Markdown
Member

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Please fix member-call return inference before shipping. The current logic scans inherited contract items by method name and then bails unless there is exactly one match: encode_packed_collision.rs#L110-L126.

For concrete contracts that override inherited methods, e.g. name() / symbol(), both the base function and override can match, so abi.encodePacked(token.name(), token.symbol()) is missed even though both return string.

Suggested fix: resolve the actual called function using the call args / semantic function type instead of collecting by name only, or filter overridden base functions before the uniqueness check. Please also add a regression test with a concrete contract overriding inherited name() and symbol() that still warns.

mablr added 3 commits May 28, 2026 11:07
ternary, and slices

- Extend `contract_id_of` to handle interface/contract casts
  (`IERC20Metadata(addr).name()`) by inspecting the call callee for
  `Ident -> Contract` and `Type -> Contract` resolutions; add a general
  fallback through `expr_type` to cover struct-field and array-index
  receivers (`cfg.token.name()`, `tokens[i].name()`).
- Thread `n_call_args: Option<usize>` through `call_return_type` so
  same-name overloads that differ only in arity are disambiguated before
  the dynamic-agreement check (`f()` vs `f(uint256)` now resolves
  correctly).
- Handle `ExprKind::Slice` in `expr_type`; calldata slice `data[:4]`
  preserves the bytes type of the base expression.
- Handle `ExprKind::Ternary` in `expr_type`; both branches must agree
  on dynamic-ness; if they do, the then-branch type is returned.
@mablr mablr requested a review from mattsse May 28, 2026 10:24
figtracer
figtracer previously approved these changes May 28, 2026
@grandizzy
Copy link
Copy Markdown
Collaborator

A few things worth addressing before merge, mostly around the hand-rolled type inference in expr_type / call_return_type:

Likely false negatives (silent miss on real bugs)

The ExprKind::Member arm in expr_type only handles struct fields and returns None for everything else. That causes the lint to miss several common dynamic expressions:

  • abi.encodePacked(msg.data, s)msg.data is bytes calldata
  • abi.encodePacked(addr.code, s) / address(this).code / type(C).creationCode|runtimeCode
  • abi.encodePacked(abi.encode(a), abi.encode(b)) — works only with an explicit bytes(...) cast today (that's all the fixture covers); the bare nested call resolves as a member call on the abi builtin, and contract_id_of bails because abi is not a contract. Same for nested abi.encodePacked, abi.encodeWithSelector, etc.
  • abi.encodePacked(string.concat(a, b), c) and bytes.concat(...) — receiver is ExprKind::Type
  • abi.encodePacked(new bytes(n), s)
  • abi.encodePacked(token.name(), s) where name is a string public state variable — item.as_function() skips synthesized getters if the HIR represents them as variables (worth checking)
  • using X for string method-style calls
  • Literal inside a ternary: flag ? a : "x"expr_type(Ternary) recurses into expr_type, which has no Lit arm, so the literal branch isn't classified as dynamic
  • Overload selection: call_return_type filters by name+arity only. If two same-arity overloads exist and the chosen one returns string, the bail returns None. The current fixture only exercises the inverse direction (selected overload non-dynamic → no warn); please add the opposite.

Likely false positives worth refining

The literal short-circuit (LitKind::Str → dynamic) over-fires on safe idioms:

  • abi.encodePacked("prefix:", s) and abi.encodePacked(hex"dead", s)
  • abi.encodePacked(uint256(bytes(a).length), a, uint256(bytes(b).length), b) — the canonical safe pattern
  • abi.encodePacked(s, s) — injective in s
  • EIP-191 personal-sign with uintToString(len) before data

A common mitigation other linters use: only count non-literal dynamics, or require ≥2 non-literal dynamic args.

Minor soundness / style notes

  • is_abi_builtin uses reses.iter().any(...) — if abi is ever shadowed and co-resolves with the builtin, the lint still fires. Exact-match would be safer.
  • call_return_type's "all candidates agree on dynamic-ness, return first" comment should be explicit that the result is only meaningful for the dynamic/non-dynamic decision, not as a true type — easy footgun if reused.
  • The Slice arm comment says "preserves the element type of the base"; it actually preserves the base type (correct for bytes slices, just misleading wording).

Suggested additional fixtures

// Currently MISSED — should warn
abi.encodePacked(msg.data, s);
abi.encodePacked(addr.code, s);
abi.encodePacked(abi.encode(a), abi.encode(b));      // no bytes() cast
abi.encodePacked(string.concat(a, b), c);
abi.encodePacked(bytes.concat(a, b), c);
abi.encodePacked(new bytes(10), s);
abi.encodePacked(flag ? a : "x", b);
abi.encodePacked(token.name(), s);                    // public string getter
abi.encodePacked(Strings.toString(x), Strings.toString(y)); // library

// Known FPs
abi.encodePacked("prefix:", s);
abi.encodePacked(s, s);
abi.encodePacked(uint256(bytes(a).length), a, uint256(bytes(b).length), b);

Minimal fix suggestion

If solar exposes a resolved-callee or expression-type API, prefer that over the ad-hoc walker. If not, special-casing the handful of well-known Solidity builtins inside expr_type(Member) / call_return_type (abi.encode*bytes, msg.databytes, *.codebytes, string.concatstring, bytes.concatbytes) plus a Lit arm in expr_type would close most of the gaps above with little added complexity.

Copy link
Copy Markdown
Collaborator Author

@mablr mablr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grandizzy I fixed your findings in 8f60378

@grandizzy
Copy link
Copy Markdown
Collaborator

grandizzy commented May 29, 2026

@grandizzy I fixed your findings in 8f60378

@mablr - looks good, one small new FP worth addressing before merge:

.code / .creationCode / .runtimeCode member match is unconditional

is_dynamic_builtin_member matches these member names without checking the base type, so a struct field with one of those names is wrongly treated as dynamic bytes:

struct S { uint256 code; }

function f(S memory x, string memory s) public pure returns (bytes32) {
    return keccak256(abi.encodePacked(x.code, s)); // currently warns; should pass
}

(msg.data is already safe because is_builtin_named(base, sym::msg) requires the base to resolve to the msg builtin — same pattern would work here, but expr_type-first is simpler.)

Trivial hardening — prefer the resolved type, fall back to the builtin-member check:

ExprKind::Member(base, member) => {
    if let Some(ty) = expr_type(hir, expr) {
        is_dynamic_type(&ty.kind)
    } else {
        is_dynamic_builtin_member(base, member)
    }
}

Worth adding a PASS fixture for the struct-field case too.

Minor doc nit

The "Why is this bad?" section could note that the lint flags by type, not by exploitability — abi.encodePacked(s, s) and length-prefix-safe patterns like abi.encodePacked(uint256(bytes(a).length), a, …, b) will warn even though they're not collidable. That's a fine conservative choice, just worth setting expectations.

Otherwise LGTM.

Copy link
Copy Markdown
Collaborator

@grandizzy grandizzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@grandizzy grandizzy enabled auto-merge (squash) May 29, 2026 12:18
Copy link
Copy Markdown
Collaborator

@figtracer figtracer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@grandizzy grandizzy merged commit 84f6a86 into master May 29, 2026
19 checks passed
@grandizzy grandizzy deleted the mablr/encode-packed-collision_lint branch May 29, 2026 12:47
@github-project-automation github-project-automation Bot moved this to Done in Foundry May 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

5 participants