Release v0.1.0: fix(testing): adversarial test pass — patch all bugs found in layers 1-5 · AdityaBelhekar/AgentShield

v0.1.0
4e42045

v0.1.0: fix(testing): adversarial test pass — patch all bugs found in layers 1-5

AdityaBelhekar tagged this 06 Apr 17:26

Patched source files discovered via adversarial testing across detector, pipeline, and adapter layers.

- agentshield/detection/prompt_injection.py:
  - Added text normalization to counter zero-width and homoglyph evasion.
  - Expanded high-risk signatures (policy bypass, no-rules, and system exfiltration cues).
  - Improved semantic robustness by evaluating the normalized text path.
  - Raised ALERT sensitivity for semantically strong jailbreaks.
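
As a rough illustration of the normalization step, the sketch below folds compatibility forms, strips zero-width characters, and maps a few Cyrillic look-alikes to Latin. The function name `normalize_prompt`, the character set, and the homoglyph table are assumptions made for this example, not the contents of prompt_injection.py.

```python
import unicodedata

# Invisible characters often used to split keywords (illustrative set, not exhaustive).
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

# Tiny hypothetical homoglyph table: lowercase Cyrillic look-alikes -> Latin.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0456": "i",  # Cyrillic Byelorussian-Ukrainian i
})


def normalize_prompt(text: str) -> str:
    """Return a canonical form of `text` for signature and semantic checks."""
    # NFKC folds compatibility forms such as fullwidth letters into ASCII.
    text = unicodedata.normalize("NFKC", text)
    # Drop zero-width characters so "ign<ZWSP>ore" becomes "ignore".
    text = "".join(ch for ch in text if ch not in ZERO_WIDTH)
    # Casefold first so uppercase look-alikes hit the lowercase table.
    return text.casefold().translate(HOMOGLYPHS)
```

Running detectors on both the raw and the normalized text is one way to keep evidence readable while still catching obfuscated input.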

- agentshield/canary/system.py:
  - Added prompt-level canary scanning, with a guard that ignores the self-injected integrity envelope.
  - Enables an immediate block when a leaked token appears in prompt content.
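
The guard described above can be sketched as follows. The envelope syntax (`<integrity canary="..."/>`), the token format, and all function names are hypothetical; the release notes do not say how canary/system.py represents the envelope.

```python
import re
import secrets


def make_canary() -> str:
    """Generate a random canary token (format is an assumption)."""
    return "CANARY-" + secrets.token_hex(8)


def wrap_with_envelope(prompt: str, token: str) -> str:
    """Prepend a hypothetical integrity envelope that carries the canary."""
    return f'<integrity canary="{token}"/>\n{prompt}'


def prompt_leaks_canary(prompt: str, token: str) -> bool:
    """True if the token appears anywhere outside the system's own envelope."""
    envelope = re.compile(r'<integrity canary="' + re.escape(token) + r'"/>')
    # Remove the self-injected envelope first so it does not trigger a false positive.
    remainder = envelope.sub("", prompt)
    return token in remainder
```

A caller would block the request as soon as `prompt_leaks_canary` returns `True`.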

- agentshield/detection/base_detector.py:
  - Clipped cosine distance to the documented [0,1] range for scoring stability.
  - Extended DetectionContext with recent threat memory for correlation windowing.
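
Raw cosine distance (1 minus cosine similarity) spans [0, 2], and floating-point error can push it slightly below zero, so a clip is needed to honor a [0,1] contract. A minimal standard-library sketch, where the function name and the zero-vector handling are assumptions:

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance between two vectors, clipped to [0, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        # Assumption: a zero vector is treated as maximally distant.
        return 1.0
    distance = 1.0 - dot / norm
    # Opposite vectors give 2.0 and rounding can give tiny negatives; clip both.
    return min(1.0, max(0.0, distance))
```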

- agentshield/detection/goal_drift.py:
  - Added session peak drift retention so drift cannot be trivially reset by brief benign prompts.
  - Added warmup suppression for early-session medium drift noise.
  - Cleans up peak state on session clear.
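
One way to combine peak retention with warmup suppression is sketched below. The class name, thresholds, and warmup length are hypothetical, and the actual goal_drift.py logic may differ.

```python
class DriftTracker:
    """Tracks the highest drift seen per session (illustrative thresholds)."""

    def __init__(self, warmup_turns: int = 3, medium: float = 0.4, high: float = 0.7):
        self.warmup_turns = warmup_turns
        self.medium = medium
        self.high = high
        self._peak: dict[str, float] = {}
        self._turns: dict[str, int] = {}

    def observe(self, session_id: str, drift: float) -> str:
        """Record one drift score and return 'none', 'medium', or 'high'."""
        turns = self._turns.get(session_id, 0) + 1
        self._turns[session_id] = turns
        in_warmup = turns <= self.warmup_turns
        # During warmup, sub-high drift is treated as noise and not retained.
        if not (in_warmup and drift < self.high):
            self._peak[session_id] = max(self._peak.get(session_id, 0.0), drift)
        peak = self._peak.get(session_id, 0.0)
        # Decide on the retained peak, so a benign prompt cannot reset the level.
        if peak >= self.high:
            return "high"
        if peak >= self.medium:
            return "medium"
        return "none"

    def clear(self, session_id: str) -> None:
        """Drop all per-session state, including the retained peak."""
        self._peak.pop(session_id, None)
        self._turns.pop(session_id, None)
```

In this sketch a high score is retained even during warmup, on the assumption that only medium-level noise should be suppressed early in a session.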

- agentshield/detection/tool_chain.py:
  - Added ordered subsequence matching so forbidden patterns still match with benign tools in between.
  - Added ordered category-pair checks for read->send and execute->send, regardless of adjacency.
  - Added execution-streak and chain-depth heuristics for escalation chains.
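
Ordered subsequence matching and the non-adjacent category-pair check can be sketched in a few lines. The tool and category names used here are illustrative assumptions, not identifiers from tool_chain.py.

```python
from typing import Iterable, Sequence


def contains_ordered_subsequence(calls: Iterable[str], pattern: Sequence[str]) -> bool:
    """True if `pattern` occurs in `calls` in order, with any tools in between."""
    it = iter(calls)
    # `step in it` advances the iterator past the match, which enforces ordering.
    # An empty pattern matches trivially.
    return all(step in it for step in pattern)


def has_ordered_pair(categories: Iterable[str], first: str, second: str) -> bool:
    """True if some `first` category appears before a later `second` category."""
    seen_first = False
    for category in categories:
        if category == second and seen_first:
            return True
        if category == first:
            seen_first = True
    return False
```

For example, a chain of `read_file`, `search`, `send_email` matches the pattern `read_file`, `send_email` even though a benign `search` call sits between the two.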

- agentshield/detection/memory_poison.py:
  - Added explicit memory poison signatures for SYSTEM OVERRIDE, safety bypass, and exfil phrasing.
  - Added write-velocity burst anomaly detection over a short window.
  - Integrated the velocity signal into confidence and evidence.
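
Write-velocity burst detection is typically a sliding-window count. The sketch below assumes a 5-second window, a limit of three writes, and a fixed confidence boost; these values and all names are hypothetical.

```python
from collections import deque


class WriteVelocityMonitor:
    """Flags a burst when too many memory writes land inside a short window."""

    def __init__(self, window_seconds: float = 5.0, max_writes: int = 3):
        self.window_seconds = window_seconds
        self.max_writes = max_writes
        self._timestamps: deque[float] = deque()

    def record_write(self, now: float) -> bool:
        """Record a write at time `now` (seconds); return True on a burst."""
        self._timestamps.append(now)
        cutoff = now - self.window_seconds
        # Evict writes that have fallen out of the window.
        while self._timestamps and self._timestamps[0] < cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_writes


def combined_confidence(signature_score: float, burst: bool, boost: float = 0.2) -> float:
    """Blend the velocity signal into a detector confidence, capped at 1.0."""
    return min(1.0, signature_score + (boost if burst else 0.0))
```

Passing the timestamp in explicitly keeps the monitor deterministic under test; production code would more likely read a monotonic clock.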

- agentshield/detection/engine.py:
  - Added short-window, cross-event detector correlation memory.
  - Correlation now counts unique detectors, reducing duplicate-detector inflation.
  - Added policy-action caps (LOG/FLAG/ALERT) so non-block policies can downscope correlation decisions.
  - Normalized final threat actions for ALERT/FLAG/LOG_ONLY paths.
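
To illustrate unique-detector correlation with a policy cap, the sketch below counts distinct detector names inside a time window and then clamps the resulting action. The `Action` enum, the escalation thresholds, and the 30-second window are assumptions for this example only.

```python
from enum import IntEnum
from typing import Iterable, Tuple


class Action(IntEnum):
    """Threat actions ordered by severity (ordering is an assumption)."""
    LOG_ONLY = 0
    FLAG = 1
    ALERT = 2
    BLOCK = 3


def correlate(events: Iterable[Tuple[float, str]], now: float,
              window_seconds: float = 30.0) -> Action:
    """Escalate by the number of distinct detectors that fired in the window."""
    # A set collapses repeated hits from one detector into a single vote.
    detectors = {name for ts, name in events if now - ts <= window_seconds}
    if len(detectors) >= 3:
        return Action.BLOCK
    if len(detectors) == 2:
        return Action.ALERT
    if len(detectors) == 1:
        return Action.FLAG
    return Action.LOG_ONLY


def apply_policy_cap(action: Action, cap: Action) -> Action:
    """Downscope an action so it never exceeds what the policy allows."""
    return min(action, cap)
```

With this shape, three hits from the same detector yield only `FLAG`, while a policy capped at `ALERT` turns a correlated `BLOCK` into an `ALERT`.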

- agentshield/events/emitter.py:
  - Treats RedisConnectionError as a retryable publish failure.
  - Ensures JSONL fallback continues even when Redis is unreachable (emit and batch paths).
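
The retry-then-fallback behavior might look roughly like this. To stay self-contained, the sketch defines a stand-in `RedisConnectionError` and takes the publish function as a parameter; the `emit` signature, retry count, and file layout are assumptions, and backoff between attempts is omitted for brevity.

```python
import json
from typing import Callable


class RedisConnectionError(Exception):
    """Stand-in for the Redis client's connection error (assumption)."""


def emit(event: dict, publish: Callable[[str], object],
         fallback_path: str, retries: int = 2) -> bool:
    """Try to publish `event`; on repeated connection failure, append to JSONL.

    Returns True if the event was published, False if it went to the fallback.
    """
    payload = json.dumps(event)
    for _ in range(retries + 1):
        try:
            publish(payload)
            return True
        except RedisConnectionError:
            # Connection errors are retryable; other exceptions propagate.
            continue
    # Redis stayed unreachable: keep the event in a local JSONL file.
    with open(fallback_path, "a", encoding="utf-8") as fh:
        fh.write(payload + "\n")
    return False
```

A batch path could reuse the same function per event, or write all failed events to the JSONL file in one pass.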

- agentshield/adapters/langchain_adapter.py:
  - Added a robust callable assignment fallback (object.__setattr__) for Pydantic-backed LangChain objects.
  - Fixes adapter patching on real LangChain model instances that reject normal setattr.
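
The assignment fallback can be demonstrated without LangChain or Pydantic by using a stand-in class whose `__setattr__` rejects unknown names. `StrictModel`, `assign_callable`, and `patch_invoke` are hypothetical names for this sketch; whether `object.__setattr__` succeeds on a real model depends on the Pydantic version and model configuration.

```python
from typing import Callable


class StrictModel:
    """Stand-in for a Pydantic-backed object that rejects plain setattr."""

    def __setattr__(self, name: str, value: object) -> None:
        raise ValueError(f'"{type(self).__name__}" object has no field "{name}"')

    def invoke(self, prompt: str) -> str:
        return f"model:{prompt}"


def assign_callable(obj: object, name: str, fn: Callable) -> None:
    """Set `obj.name = fn`, bypassing a restrictive __setattr__ if needed."""
    try:
        setattr(obj, name, fn)
    except (AttributeError, TypeError, ValueError):
        # Skip the class-level __setattr__ hook and write to the instance directly.
        object.__setattr__(obj, name, fn)


def patch_invoke(model: object, guard: Callable[[str], None]) -> None:
    """Wrap `model.invoke` so `guard` sees every prompt before the model does."""
    original = model.invoke  # bound method captured before patching

    def wrapped(prompt: str) -> str:
        guard(prompt)
        return original(prompt)

    assign_callable(model, "invoke", wrapped)
```

Because the wrapper is stored on the instance, it shadows the class method for that object only and leaves other instances unpatched.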
