Fix secrets detector model dump counting #65
Conversation
Signed-off-by: lucarlig <luca.carlig@ibm.com>
Force-pushed 5f77275 to 9d81feb (compare)
msureshkumar88 left a comment
Review: REQUEST CHANGES
Two HIGH issues and five MEDIUM issues require resolution before merge. Full findings below.
HIGH Issues — Must Fix
H1 · Missing @pytest.mark.asyncio on two test functions (test_hook_smoke.py)
Two of the three top-level async test functions are missing @pytest.mark.asyncio:
# Missing decorator — will not execute as async tests
async def test_prompt_pre_fetch_rebuilds_frozen_payload_on_redaction():
...
async def test_prompt_pre_fetch_blocks_without_redaction_and_keeps_original_payload():
...
# Third function in same file has it — confirms this is an error, not intentional
@pytest.mark.asyncio
async def test_prompt_pre_fetch_blocks_with_redaction_without_leaking_secret():
...

Without asyncio_mode = "auto" in pytest config, these two async functions are silently collected as synchronous tests and return a coroutine object without executing any assertions. All assertions pass vacuously, producing false green coverage over the redaction and blocking paths — exactly the paths this PR is fixing.
Required action: Add @pytest.mark.asyncio to both functions, or confirm asyncio_mode = "auto" is set globally and add a comment explaining the omission.
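H1's failure mode can be demonstrated in isolation: calling an async function without awaiting it only builds a coroutine object, so the test body and its assertions never run. A minimal standalone sketch (not the project's actual test file):

```python
import asyncio

async def test_without_marker():
    # This body never runs when pytest collects the function as a
    # plain synchronous test: calling it only creates a coroutine.
    assert False, "never executed"

coro = test_without_marker()   # no assertion fires here
print(type(coro).__name__)     # coroutine
coro.close()                   # suppress the "never awaited" warning

# Only awaiting the coroutine actually executes the assertion:
try:
    asyncio.run(test_without_marker())
except AssertionError as exc:
    print("runs only when awaited:", exc)
```

This is exactly why the two undecorated tests pass: pytest calls them, gets a coroutine back, and reports success without ever entering the body.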
H2 · text_scan.rs behavioral change silently alters count semantics (undocumented)
detect_and_redact was completely rewritten. The old implementation ran each regex pattern's replace_all independently, accumulating one finding per pattern match. The new implementation collects all candidates, resolves overlaps by a specificity/length ranking, and emits one finding per non-overlapping span.
What this changes:
| Scenario | Old count | New count |
|---|---|---|
| Single AWS key matching both aws_access_key_id AND generic_api_key_assignment | 2 | 1 |
| Google API key partially overlapping generic_api_key_assignment | 2 | 1 |
| Two distinct secrets in the same string | 2 | 2 (unchanged) |
This directly affects min_findings_to_block thresholds in production. A deployment with min_findings_to_block=2 to tolerate incidental generic matches will see different blocking behavior after this upgrade — where it previously blocked, it now may not (or vice versa). The change is tested and is the right fix, but the PR description must document it so operators can audit their threshold configurations before upgrading.
Required action: Add a section to the PR description describing the text_scan.rs change and its effect on count / min_findings_to_block. Consider a note in DEVELOPING.md about the count semantics change.
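The counting rule described above can be sketched independently of the Rust code: collect all candidate spans, rank by specificity then length, and accept only spans that do not overlap an already-accepted one. The names below (`Candidate`, `resolve_overlaps`, the specificity ranks) are illustrative, not the actual text_scan.rs API, and this sketch keeps the winning candidate's span rather than modeling merged-span redaction:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    start: int
    end: int          # exclusive
    specificity: int  # higher = more specific detector
    detector: str

def resolve_overlaps(candidates):
    """One finding per non-overlapping span, preferring the most
    specific, then the longest, candidate."""
    ranked = sorted(
        candidates,
        key=lambda c: (-c.specificity, -(c.end - c.start), c.start),
    )
    accepted = []
    for cand in ranked:
        # Accept only if disjoint from every already-accepted span.
        if all(cand.end <= a.start or cand.start >= a.end for a in accepted):
            accepted.append(cand)
    return sorted(accepted, key=lambda c: c.start)

# One AWS key matched by both a specific and a generic pattern
# collapses to a single finding under the new semantics:
findings = resolve_overlaps([
    Candidate(10, 30, specificity=2, detector="aws_access_key_id"),
    Candidate(5, 32, specificity=1, detector="generic_api_key_assignment"),
])
print(len(findings), findings[0].detector)  # 1 aws_access_key_id
```

Under the old per-pattern replace_all approach the same input would have counted 2, which is precisely the min_findings_to_block interaction operators need to audit.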
MEDIUM Issues — Should Fix
M1 · py.import("builtins") called inside loop in apply_extra_dict_state (performance)
object_model.rs — py.import("builtins") and .getattr("object") are called on every loop iteration in apply_extra_dict_state. These values are invariant within the loop body. The PyO3 call still crosses the FFI boundary per iteration regardless of CPython's module cache.
// Current — builtins imported on every key
for (key, value) in updates.iter() {
if key.is_exact_instance_of::<PyString>() {
let builtins = py.import("builtins")?; // ← inside loop
let object_type = builtins.getattr("object")?; // ← inside loop
...
}
}

Fix: Hoist both bindings before the loop.
M2 · Slack token regex widening is undocumented (patterns.rs)
-Regex::new(r"\bxox[abpqr]-[0-9A-Za-z\-]{10,48}\b")
+Regex::new(r"\bxox[abpqr]-[0-9A-Za-z\-]{10,80}\b")

The test token added in scanner.rs is 51 chars after xoxb- — which would not have matched the old {10,48} bound. This means either the old regex was silently broken for modern Slack Bot tokens, or the test was introduced alongside the widening to cover a previously-missed format. Neither scenario is documented. The wider upper bound increases false-positive surface.
Required action: Add a comment in patterns.rs citing the Slack token format spec that justifies 80. Add boundary tests asserting tokens at exactly 48 and 49 chars both match. Note in the PR description if the old regex was broken.
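The requested boundary tests are cheap to express. The sketch below uses Python's re with the two bounds copied from the diff above (the crate uses Rust's regex library, but this character-class and bound syntax behaves the same in both engines); the alphanumeric-only token body is an assumption chosen so no shorter backtracked match can succeed:

```python
import re

OLD = re.compile(r"\bxox[abpqr]-[0-9A-Za-z\-]{10,48}\b")
NEW = re.compile(r"\bxox[abpqr]-[0-9A-Za-z\-]{10,80}\b")

def token(n):
    # n alphanumeric chars after the "xoxb-" prefix; no internal
    # hyphens, so a partial match cannot end early at a boundary.
    return "xoxb-" + "A" * n

assert OLD.search(token(48)) is not None  # both bounds accept 48 chars
assert NEW.search(token(48)) is not None
assert OLD.search(token(49)) is None      # old bound rejects 49 chars...
assert NEW.search(token(49)) is not None  # ...new bound accepts it
assert NEW.search(token(51)) is not None  # the 51-char token from scanner.rs
print("boundary behavior confirmed")
```

Porting these four assertions into patterns.rs tests would lock both bounds against accidental regression.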
M3 · same_safe_value allocates HashSet::new() at every call site (performance)
python_scan.rs — every call site constructs a fresh HashSet for cycle detection, including inside the inner for (key, value) loop of serialized_dict_duplicates_rebuild_state:
// Allocates per dict entry:
if !same_safe_value(&serialized_value, &rebuild_value, &mut HashSet::new())? {

For flat structures (the common case), the HashSet is allocated and never written. For models with many fields this adds up.
Required action: Introduce a wrapper fn values_equal(left, right) -> PyResult<bool> that owns the seen set internally, removing &mut HashSet::new() from all call sites.
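The shape M3 asks for can be sketched in Python (the real fix is in Rust; this shows only the ownership pattern, with a simplified structural comparison that does not reproduce same_safe_value's stricter avoidance of user `__eq__`):

```python
def _same_safe_value(left, right, seen):
    """Structural comparison with cycle detection via `seen`.
    Simplified sketch: scalars fall back to ==, which the real
    Rust implementation avoids for arbitrary user objects."""
    pair = (id(left), id(right))
    if pair in seen:
        return True  # this pair is already being compared higher up
    seen.add(pair)
    if isinstance(left, dict) and isinstance(right, dict):
        return left.keys() == right.keys() and all(
            _same_safe_value(left[k], right[k], seen) for k in left
        )
    if isinstance(left, list) and isinstance(right, list):
        return len(left) == len(right) and all(
            _same_safe_value(a, b, seen) for a, b in zip(left, right)
        )
    return type(left) is type(right) and left == right

def values_equal(left, right):
    """Public entry point: owns the seen set, so call sites never
    allocate or thread one through themselves."""
    return _same_safe_value(left, right, set())

a = {"x": [1, 2]}; a["self"] = a   # self-referential structure
b = {"x": [1, 2]}; b["self"] = b
print(values_equal(a, b))  # True
```

The design point is the wrapper: the flat common case pays for one set per top-level comparison instead of one per dict entry, and no call site can forget to pass a fresh set.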
M4 · No logging when duplicate gate suppresses a scan (security observability gap)
python_scan.rs — when serialized_scan_target returns None because the duplicate gate determined the serialized state already duplicates the rebuild state, the scan silently skips that path. In production, if a secret is not detected and an operator investigates, there is no diagnostic signal indicating that the serialized path was suppressed and why.
This is a security observability gap: a misconfigured or edge-case duplicate gate could silently suppress detection of a real secret with no log evidence.
Required action: Add tracing::debug! or log::debug! at each return Ok(None) in serialized_scan_target identifying which gate fired. Example:
tracing::debug!("duplicate gate: serialized dict duplicates rebuild state, skipping serialized scan");
return Ok(None);

M5 · inspect_object_state_without_model_dump over-exposed as pub (object_model.rs)
This function is only used within the crate (called from serialized_scan_target in python_scan.rs). Exposing it as pub unnecessarily widens the API surface.
Required action: Change to pub(crate).
M6 · set_attr_without_hooks called with unchecked string keys (security hardening)
object_model.rs — apply_extra_dict_state calls set_attr_without_hooks with attribute names sourced from the original object's __dict__:
set_attr_without_hooks(&object_type, target, &key.extract::<String>()?, &value)?;

While target is a freshly-created blank instance from prepare_rebuild_target, dunder names (__class__, __reduce__, __reduce_ex__, __init_subclass__) in __dict__ (unusual but possible on deliberately crafted objects) would set special attributes via CPython's base object.__setattr__. In environments where untrusted data constructs scanned objects this is a potential injection vector.
Required action: Add a guard rejecting dunder attribute names before passing to set_attr_without_hooks, and document the security invariant:
let name = key.extract::<String>()?;
if name.starts_with("__") && name.ends_with("__") {
dict.set_item(&key, value)?; // fall back to raw dict insert for dunders
continue;
}
set_attr_without_hooks(&object_type, target, &name, &value)?;

Testing Gaps (MEDIUM)
T1 · No RootModel + min_findings_to_block=2 end-to-end regression test
The duplicate gate in serialized_duplicates_rebuild_root specifically targets Pydantic RootModel (the "root" key check). The existing tests cover BaseModel with min_findings_to_block=2, but there is no end-to-end test confirming that a RootModel with a secret field does not double-count. This is the model type the root-key gate was designed for.
T2 · serialized_result non-string-key guard is untested
The new guard added in serialized_result:
if !dict_has_only_exact_string_keys(redacted_dict) {
return Ok(redacted_state.clone());
}

has no dedicated test. Without one, a future refactor could silently remove it.
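Why the exact-string-key guard matters can be shown in a few lines: a str subclass passes isinstance checks but can override `__eq__`/`__hash__` to hijack dict lookups, so only an exact-type check is trustworthy. The `dict_has_only_exact_string_keys` function below is a Python stand-in for the Rust guard, named after it for readability:

```python
class EvilKey(str):
    """A str subclass whose eq/hash could subvert dict lookups."""
    def __eq__(self, other):
        return True   # claims equality with everything
    def __hash__(self):
        return 0      # collides with any bucket it likes

def dict_has_only_exact_string_keys(d):
    # `type(k) is str` deliberately rejects str subclasses,
    # mirroring the exact-instance check described in the review.
    return all(type(k) is str for k in d)

safe = {"token": "value"}
tricky = {EvilKey("token"): "value"}
print(dict_has_only_exact_string_keys(safe))    # True
print(dict_has_only_exact_string_keys(tricky))  # False
```

A dedicated test asserting both outcomes would pin the guard against the silent-removal risk noted above.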
Low-Priority Observations (informational, not blocking)
- Test boilerplate duplication: all five new Python test files contain identical copies of make_context, make_config, _make_context, _make_config. Move them to a shared conftest.py.
- _make_context / _make_config dead aliases: both are exact wrappers with no behavioral difference from the non-underscore versions. Remove the aliases.
- split_dict_state double-iterates on mixed-key dicts: dict_has_only_exact_string_keys and then the split loop both traverse the dict. Could be fused.
- scan_object_state complexity: 120+ lines with four interleaved mutable state variables (rebuilt, rebuild_state_for_extra, rebuild_state_for_gate, pending_scan_state). The pending_scan_state deferral + cycle-rewrite interaction warrants a brief inline comment on the invariant.
Positive Observations
- serialized_scan_target refactor (bool → Option<SerializedScanTarget>) cleanly eliminates a redundant re-inspection call and expresses skip/scan in the type system.
- same_safe_value correctly avoids calling Python __eq__ on arbitrary user objects. The EqBomb adversarial tests confirm the hazard is real and addressed.
- str_type threading (one import per scan_container call, replacing per-node extract::<String>()) is both more correct and cheaper.
- text_scan.rs single-pass redaction replacing cascaded replace_all is strictly better in correctness and performance.
- dict_has_only_exact_string_keys guard is applied uniformly before any dict lookup, correctly blocking hash-collision and __eq__-bomb attacks from custom key objects.
- Version bump applied consistently across Cargo.toml, plugin-manifest.yaml, and Cargo.lock.
- Core double-count regression is directly tested by test_tool_post_invoke_does_not_double_count_model_dump_fields and the list-field variant in test_plugin_hook_results.py.
- CI: Do not merge until build-test (ubuntu/windows, secrets_detection), coverage, and release-validation / preflight complete.
@msureshkumar88 I pushed. I also revalidated the scanner/object-model findings from the latest review set against current branch code. The mixed-key
Verification run:
Re-review: Most issues remain open
Reviewed against current HEAD.
|
| ID | Status |
|---|---|
| H1 (asyncio comment) | |
| H2 (count semantics docs) | ❌ Not addressed |
| M1 (builtins hoist) | ❌ Not addressed |
| M2 (Slack regex comment + tests) | ❌ Not addressed |
| M3 (HashSet wrapper) | ❌ Not addressed |
| M4 (debug logging) | ❌ Not addressed |
| M5 (pub → pub(crate)) | ❌ Not addressed |
| M6 (dunder guard) | ❌ Not addressed |
| T1 (RootModel e2e test) | ❌ Not addressed |
| T2 (serialized_result guard test) | ❌ Not addressed |
| Low: boilerplate centralization | ✅ Done (230fb65) |
| Low: dead aliases removed | ❌ Not addressed |
@msureshkumar88 I pushed.
Addressed:
Verification:
Review follow-up — all requested changes confirmed except one test gap
Checked every item from the original review against the current branch state.
HIGH — both resolved ✓
MEDIUM — all resolved ✓
Testing gaps — mostly resolved ✓
One gap still open
#[test]
fn scan_container_returns_original_for_clean_scan_state_and_clean_serialized_path() {
// pending_scan_state deferred (scan_state clean, no prior rebuild);
// serialized path also clean — original object must be returned as-is.
Python::initialize();
Python::attach(|py| -> PyResult<()> {
let code = CString::new(r#"
class BadKey:
pass
class Model:
def __init__(self):
self.text = "clean"
self.__dict__[BadKey()] = "side-channel"
def model_dump(self):
return {"text": "also clean"}
"#).unwrap();
let module = PyModule::from_code(py, code.as_c_str(), c"test_module.py", c"test_module")?;
let instance = module.getattr("Model")?.call0()?;
let config = SecretsDetectionConfig {
redact: true,
redaction_text: "[REDACTED]".to_string(),
..Default::default()
};
let (count, redacted, findings) = scan_container(py, &instance, &config)?;
assert_eq!(count, 0);
assert_eq!(findings.len(), 0);
assert!(redacted.is(&instance));
Ok(())
}).unwrap();
}

Everything else looks good — happy to approve once T3 is added.
Addressed T3; added the requested test.
Verification:
msureshkumar88 left a comment
Final review — all items resolved ✓
Verified 6bb4255 against the remaining open item.
T3 — scan_container_returns_original_for_clean_scan_state_and_clean_serialized_path is present and covers exactly the path I flagged: non-string-keyed __dict__ entries alongside a present-but-clean model_dump(), asserting zero findings and redacted.is(&instance) for object identity. The clean-path branch is now locked against silent regression.
Full resolution summary:
| ID | Status |
|---|---|
| H1 (asyncio comment) | ✅ |
| H2 (count semantics docs) | ✅ |
| M1 (builtins hoist) | ✅ |
| M2 (Slack regex comment + boundary tests) | ✅ |
| M3 (HashSet wrapper) | ✅ |
| M4 (debug logging) | ✅ |
| M5 (pub → pub(crate)) | ✅ |
| M6 (dunder guard) | ✅ |
| T1 (RootModel e2e test) | ✅ |
| T2 (serialized_result guard test) | ✅ |
| T3 (clean scan state + clean serialized path) | ✅ |
| Low: boilerplate centralization | ✅ |
| Low: dead aliases removed | ✅ |
Approved. Ready to merge.
🤖 Automated PR #65 Verification Report
Status: ✅ All checks passed
📊 Test Results Summary
🔍 Detailed Verification
1. ✅ Rust Test Suite (52 tests)
All Rust unit tests pass, including:
Key Test:
2. ✅ Python Integration Tests (74 tests)
All Python integration tests pass, including critical regression tests:
Double-Count Prevention Tests:
Model Dump Detection Tests:
3. ✅ Regression Test Coverage
Found 23 test methods explicitly covering
These tests ensure the duplicate detection gate works correctly for:
4. ✅ Code Implementation Verified
🎯 Key Fixes Confirmed
🤖 Automated Verification Report for PR #65
I've created and run an automated verification script to validate the model dump counting fix. All tests passed successfully!
✅ Verification Script
Created: This script comprehensively tests the double-counting bug fix with 6 automated test cases.
Test Results
What Was Verified
Test Coverage
The verification script uses actual secret patterns from
Thanks @lucarlig and @msureshkumar88 for quick resolution. Regards,
Summary
Fixes the secrets detection scanner so equivalent model_dump() output does not double-count secrets already scanned through object rebuild state.
Root cause
Pydantic-style objects can expose the same field both through object state and model_dump(). The scanner counted both paths separately, so a single distinct secret could produce count == 2 and incorrectly trip min_findings_to_block=2.
Changes
Validation
cargo test -p secrets_detection scan_container --lib
cargo test -p secrets_detection duplicate_gate_ignores_non_string_model_dump_keys_without_lookup --lib
make check-all
make test-integration
make test-all
Operator note: secrets finding count semantics
This PR changes secrets_detection text scanning to report one finding per non-overlapping secret span. If multiple enabled patterns match overlapping bytes, the scanner redacts the merged span once and reports the most specific detector type instead of counting every pattern match separately.
Distinct non-overlapping secrets still count separately. Operators using min_findings_to_block values greater than 1 should audit thresholds before upgrading from earlier behavior, where overlapping broad and specific detector matches could increase count.