Optimize Rule memory layout with packed permissions and inline pattern storage by Sewer56 · Pull Request #32 · Reloaded-Project/ReloadedCode

Sewer56 · 2026-02-17T05:45:42Z

Summary

Refactor Rule to store permission+action in PackedPermission and pattern in TinyString<[u8; 14]>.
Remove separate action field and string permission storage from each rule.
Keep behavior the same (case-insensitive matching, last-match-wins), while reducing per-rule memory footprint.

Real-Config Memory Impact

Measured from:

/home/sewer/nixos/users/sewer/home-manager/programs/opencode/config/agent
/home/sewer/nixos/users/sewer/home-manager/programs/opencode/config/command

Total extracted rule entries: 161.

Metric	Before (String-based Rule)	After (`PackedPermission` + `TinyString<[u8; 14]>`)	Delta
Total bytes (161 rules)	10,486 B	5,572 B	-4,914 B
Average bytes / rule	65.1 B	34.6 B	-30.5 B
Reduction	—	—	46.9%

Estimated Heap Payload

Metric	Before	After	Delta
Heap payload bytes	1,470 B	420 B	-1,050 B

The memory model assumes:

BEFORE: 56 bytes base + utf8(permission) + utf8(pattern)
AFTER: 32 bytes base + TinyString heap payload capacity
- 0 for pattern length <= 14
- 28, 56, ... for longer patterns due to TinyVec/Vec growth

Alignment not calculated in heap allocations, so chances are the real number for BEFORE is larger.
See: https://gist.github.com/Sewer56/641221a9a69a9bc0f51d86ea78554d1d

Pattern Distribution

Bucket	Count	Percent
`<= 14` (inline)	151	93.8%
`15-36`	10	6.2%
`> 36`	0	0.0%

Implementation Details

Rule.permission now stores a packed hash+action (PackedPermission).
Rule.pattern now uses inline-first storage (TinyString<[u8; 14]>).
Permission matching uses pre-hashed comparison (Hash63 from Hash64), preserving current semantics.
Added/updated tests including a hard size check (Rule = 32 bytes).

Implement bit-packed permission storage using 63 bits for hash and 1 bit for PermissionAction. Enables compact permission representation with minimal memory overhead. Changes: - Add PackedPermission struct with new(), hash(), and action() methods - Use bit masks (ACTION_MASK, HASH_MASK) for efficient packing/unpacking - Use transmute for zero-cost action extraction from bit 63 - Add pack_unpack_roundtrip test for correctness validation Benefits: - Reduces memory usage for permission storage by packing into single u64 - Enables efficient hash-based permission lookups with compact representation - Compile-time assertions ensure correct PermissionAction size

…tring Complete the Rule struct optimization by storing permission/action in packed form and pattern as inline TinyString<[u8; 14]>. Changes: - Store permission as PackedPermission (63-bit hash + action bit) - Store pattern as TinyString<[u8; 14]> with direct type usage - Use Hash63::from_hash64(hash_u64(permission_lower)) for matching - Remove intermediate pattern capacity constants - Add rule_size_is_32_bytes test for memory layout validation - Add hash case-insensitivity test for permission matching Benefits: - Rule struct now exactly 32 bytes (down from String-based layout) - Faster hashed permission matching vs string comparison - Zero-allocation pattern storage for short patterns - Maintains case-insensitive permission and pattern semantics

codecov · 2026-02-17T05:47:02Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 66.75%. Comparing base (7dd231b) to head (b738be4).
⚠️ Report is 7 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #32      +/-   ##
==========================================
+ Coverage   65.92%   66.75%   +0.82%     
==========================================
  Files          38       41       +3     
  Lines        1130     1158      +28     
==========================================
+ Hits          745      773      +28     
  Misses        385      385

Flag	Coverage Δ
unittests	`66.75% <100.00%> (+0.82%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
src/llm-coding-tools-core/src/internal/hash63.rs	`100.00% <100.00%> (ø)`
src/llm-coding-tools-core/src/internal/hash64.rs	`100.00% <100.00%> (ø)`
...oding-tools-core/src/internal/packed_permission.rs	`100.00% <100.00%> (ø)`
src/llm-coding-tools-core/src/permissions.rs	`94.87% <100.00%> (+0.42%)`	⬆️

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai · 2026-02-17T05:50:23Z

Caution

Review failed

The pull request is closed.

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e379743 and b738be4.

📒 Files selected for processing (1)

src/llm-coding-tools-core/src/permissions.rs

Walkthrough

This PR adds internal 64-bit and 63-bit hashing modules and a packed permission module, and adds ahash and tinyvec_string dependencies. The Rule struct is refactored to store permissions as a PackedPermission (63-bit hash + 1-bit action) and patterns as TinyString, with Rule::new lowercasing inputs. Ruleset evaluation now compares permission hashes and uses pattern.as_str() for matching. New internal types: Hash64, Hash63, and PackedPermission; lib.rs exposes the internal module.

Possibly related PRs

Optimize Rule memory layout with packed permissions and inline pattern storage #32: Makes the same code-level changes—adding ahash and tinyvec_string, introducing Hash64/Hash63, PackedPermission, internal modules, and switching Rule to use PackedPermission and TinyString.

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main optimization changes: packed permissions and inline pattern storage for Rule memory layout reduction.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Merge Conflict Detection	✅ Passed	✅ No merge conflicts detected when merging into `main`
Description check	✅ Passed	The pull request description comprehensively explains the refactoring, provides detailed memory impact analysis with real-world measurements, and documents implementation details and validation steps.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch optimize-rule-memory

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/llm-coding-tools-core/src/permissions.rs (2)
57-66: ⚠️ Potential issue | 🔴 Critical

PermissionAction needs #[repr(u8)] for the unsafe transmute in packed_permission.rs.

As flagged in packed_permission.rs, the action() method transmutes a u8 back to PermissionAction. Without #[repr(u8)] and explicit discriminants here, that transmute is undefined behavior.
🐛 Proposed fix
+#[repr(u8)]
 #[derive(Debug, Clone, Copy, PartialEq, Eq, Default, Serialize, Deserialize)]
 #[serde(rename_all = "lowercase")]
 pub enum PermissionAction {
     /// Tool is allowed.
-    Allow,
+    Allow = 0,
     /// Tool is denied.
     #[default]
-    Deny,
+    Deny = 1,
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llm-coding-tools-core/src/permissions.rs` around lines 57 - 66, Add a
defined representation to PermissionAction so the unsafe transmute in
packed_permission.rs is well-defined: add #[repr(u8)] to the PermissionAction
enum and give explicit u8 discriminants for the variants (e.g., map Allow and
Deny to fixed u8 values) so the action() method in packed_permission.rs can
safely transmute a u8 into PermissionAction.
181-200: ⚠️ Potential issue | 🟡 Minor

Hash-only permission matching trades correctness guarantee for performance — document the collision risk.

Permission keys are now compared via 63-bit hash equality instead of exact string comparison. While the collision probability is astronomically low (~n²/2⁶⁴ for n distinct keys), this is a security-relevant code path — a collision could silently map an unrelated tool name to another tool's rules, causing incorrect allow/deny decisions.

The practical risk is negligible for realistic workloads (< 100 tool names), but since this is access control, the trade-off should be explicitly documented (e.g., in the module-level //! doc or on evaluate). Consider also storing and comparing a second, cheaper discriminator (e.g., string length) if zero-collision guarantee is ever needed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llm-coding-tools-core/src/permissions.rs` around lines 181 - 200,
Document the security trade-off of using 63-bit hash equality for permission
keys inside the evaluate method: update the doc comment for the evaluate
function (and/or module-level //! doc) to explain that permission matching uses
Hash63::from_hash64(hash_u64(...)) (permission_hash) and therefore has a
non-zero collision risk, quantify the practical probability for typical
workloads, and recommend mitigations (e.g., also compare a secondary cheap
discriminator like string length or fall back to full string comparison if
needed); mention that callers should be aware this is a performance/correctness
trade-off for permission matching.

🧹 Nitpick comments (6)

src/llm-coding-tools-core/src/internal/hash64.rs (2)
34-39: RandomState is reconstructed on every hash call — consider a static.

ahash::RandomState::with_seed(0xDEAD_CAFE) is created on each invocation of hash_u64_bytes. While the seed derivation isn't extremely expensive, evaluate() calls this on every permission check. A cached static eliminates repeated initialization.
♻️ Proposed fix using `std::sync::LazyLock`
+use std::sync::LazyLock;
+
+static HASHER: LazyLock<ahash::RandomState> =
+    LazyLock::new(|| ahash::RandomState::with_seed(0xDEAD_CAFE));
+
 pub(crate) fn hash_u64_bytes(bytes: &[u8]) -> Hash64 {
-    Hash64(ahash::RandomState::with_seed(0xDEAD_CAFE).hash_one(bytes))
+    Hash64(HASHER.hash_one(bytes))
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llm-coding-tools-core/src/internal/hash64.rs` around lines 34 - 39, The
function hash_u64_bytes currently calls
ahash::RandomState::with_seed(0xDEAD_CAFE) on every invocation; change this to
use a single cached static RandomState (e.g., via std::sync::LazyLock or
once_cell::sync::Lazy) so the RandomState is constructed once and reused; update
hash_u64_bytes to reference that static (keeping the return type Hash64 and
behavior the same) to avoid repeated initialization during frequent calls like
evaluate().
13-24: from_u64 and as_u64 could be const fn for consistency with Hash63.

Hash63 declares these as const fn (see hash63.rs lines 22, 29). The Hash64 counterparts are identical in structure but lack const. Minor consistency nit.
♻️ Proposed fix
     #[inline]
     #[allow(dead_code)] // internal public API
-    pub(crate) fn from_u64(value: u64) -> Self {
+    pub(crate) const fn from_u64(value: u64) -> Self {
         Self(value)
     }

     #[inline]
     #[allow(dead_code)] // internal public API
-    pub(crate) fn as_u64(&self) -> u64 {
+    pub(crate) const fn as_u64(&self) -> u64 {
         self.0
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llm-coding-tools-core/src/internal/hash64.rs` around lines 13 - 24, Make
the two methods on Hash64 const like their Hash63 counterparts: change
pub(crate) fn from_u64(value: u64) -> Self and pub(crate) fn as_u64(&self) ->
u64 to be const fn so they can be evaluated at compile time; update the
signatures of Hash64::from_u64 and Hash64::as_u64 accordingly to match Hash63's
const declarations.
src/llm-coding-tools-core/src/permissions.rs (1)
83-98: Rule::new always allocates two Strings that are immediately discarded after hashing/copying.

Both permission.into() and pattern.into() allocate heap Strings only to lowercase them, hash/copy them, and drop. For the pattern path, the allocation feeds TinyString::from, which for ≤14-byte patterns will copy inline and then the String is wasted.

This is construction-time only (not the hot path), so it's low priority, but a stack-based lowercase approach (or accepting &str and lowercasing into a small buffer) would eliminate these transient allocations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llm-coding-tools-core/src/permissions.rs` around lines 83 - 98, Rule::new
currently allocates two temporary Strings just to lowercase and hash/copy them;
change its signature to accept permission and pattern as impl AsRef<str> (or
&str) and perform ASCII-lowercasing into small stack buffers to avoid heap
allocations: copy the input bytes into fixed-size [u8; N] buffers (choose N >=
14 for pattern and a reasonable size for permission), run make_ascii_lowercase
on those buffers, compute hash_u64 over the lowered permission slice and pass
that to PackedPermission::new, and construct TinyString from the lowered pattern
slice (using TinyString::from_str or equivalent) instead of creating
intermediate String values; update Rule::new to use these stack buffers and
remove the temporary String allocations.
src/llm-coding-tools-core/src/internal/hash63.rs (2)
34-37: from_hash64 could be const fn.

Hash64::as_u64() in hash64.rs is not const, but the relevant snippets show Hash63::from_u64 and Hash63::as_u64 are. If Hash64::as_u64 were made const (it's trivially eligible), this could be const too. Minor consistency nit.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llm-coding-tools-core/src/internal/hash63.rs` around lines 34 - 37, Make
from_hash64 a const fn to match the const-ness of Hash63::from_u64 and
Hash63::as_u64: update Hash64::as_u64 to be a const method (it's trivially
eligible) and then change pub(crate) fn from_hash64 to pub(crate) const fn
from_hash64 so it can be used in const contexts; ensure signatures for
Hash64::as_u64 and Hash63::from_u64/Hash63::as_u64 remain consistent.
20-24: Consider adding a debug_assert! to enforce the bit-63 invariant.

The doc says "caller is responsible for ensuring bit 63 is 0," but a debug_assert! would catch misuse in dev/test builds at zero runtime cost in release.
🛡️ Proposed addition
     pub(crate) const fn from_u64(value: u64) -> Self {
+        debug_assert!(value & (1u64 << 63) == 0, "Hash63: bit 63 must be 0");
         Self(value)
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llm-coding-tools-core/src/internal/hash63.rs` around lines 20 - 24, The
constructor from_u64 is currently a const fn so you can't use debug_assert!
there; change pub(crate) const fn from_u64(value: u64) -> Self to a non-const
pub(crate) fn from_u64(value: u64) -> Self and add a debug_assert!(value & (1u64
<< 63) == 0, "from_u64: bit 63 must be zero") before returning Self(value) so
misuse is caught in dev/test builds without affecting release performance.
src/llm-coding-tools-core/src/internal/packed_permission.rs (1)
17-21: Prefer core::mem over std::mem.

The compile-time assertion and the transmute both use std::mem. As per coding guidelines, prefer core over std where possible.
Proposed fix
-const _: () = assert!(
-    std::mem::size_of::<PermissionAction>() == 1,
+const _: () = assert!(
+    core::mem::size_of::<PermissionAction>() == 1,
     "PermissionAction must be 1 byte for bit-packing"
 );
As per coding guidelines: "Prefer core over std where possible (e.g., core::mem over std::mem)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llm-coding-tools-core/src/internal/packed_permission.rs` around lines 17
- 21, Update usages of std::mem to core::mem in the packed-permission code:
replace std::mem::size_of in the compile-time assertion for PermissionAction and
any std::mem::transmute calls related to the bit-packing logic with
core::mem::size_of and core::mem::transmute respectively, ensuring the assert
and transmute references (e.g., the const _: () =
assert!(std::mem::size_of::<PermissionAction>() == 1, ...) and any transmute
invocations) now import from core so the code follows the guideline to prefer
core over std.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7dd231b and e379743.

⛔ Files ignored due to path filters (1)

src/Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (7)

src/llm-coding-tools-core/Cargo.toml
src/llm-coding-tools-core/src/internal/hash63.rs
src/llm-coding-tools-core/src/internal/hash64.rs
src/llm-coding-tools-core/src/internal/mod.rs
src/llm-coding-tools-core/src/internal/packed_permission.rs
src/llm-coding-tools-core/src/lib.rs
src/llm-coding-tools-core/src/permissions.rs

🧰 Additional context used

📓 Path-based instructions (2)

src/**/Cargo.toml

📄 CodeRabbit inference engine (src/AGENTS.md)

src/**/Cargo.toml: Enable tokio feature flag (default) for async mode with tokio runtime, or blocking for sync/blocking mode. The async and blocking features are mutually exclusive.
Prefer performance-oriented crates: parking_lot over std::sync, memchr for byte search

Files:

src/llm-coding-tools-core/Cargo.toml

src/**/*.rs

📄 CodeRabbit inference engine (src/AGENTS.md)

src/**/*.rs: Preallocate collections when size is known or estimable using String::with_capacity(estimated_len), Vec::with_capacity(count), or BufReader::with_capacity(size, reader)
Prefer &str / &[T] returns over owned types when lifetime allows
Use Cow<'_, str> for conditional ownership (e.g., String::from_utf8_lossy)
Use &'static str for compile-time constant strings
Reuse buffers by calling .clear() and reusing Vec/String instead of reallocating
Use const generics for compile-time branching (e.g., <const LINE_NUMBERS: bool>)
Use #[inline] on small, hot-path functions
Prefer core over std where possible (e.g., core::mem over std::mem)
Use memchr crate for fast byte searching over manual iteration
Keep modules under 500 lines (excluding tests); split if larger
Place use statements inside functions only for #[cfg] conditional compilation
Document public items with /// and add examples in docs where helpful
Use //! for module-level documentation
Use [TypeName] rustdoc links instead of backticks in documentation

Files:

src/llm-coding-tools-core/src/internal/hash64.rs
src/llm-coding-tools-core/src/internal/mod.rs
src/llm-coding-tools-core/src/lib.rs
src/llm-coding-tools-core/src/internal/hash63.rs
src/llm-coding-tools-core/src/internal/packed_permission.rs
src/llm-coding-tools-core/src/permissions.rs

🧠 Learnings (6)