feat: add content sanitization and interruptible shutdown #23
Merged
MCP servers can return non-text content (images, audio, binary resources) and tool parameters can contain embedded base64 blobs. These bloat event payloads and aren't useful for analytics.

This adds a two-layer sanitization step to the event processing pipeline:
- Layer 1: replaces image/audio/blob/unknown content blocks in responses
- Layer 2: recursively scans parameters and structured_content for large base64 strings (>=10KB) and redacts them

Sanitization runs after customer redaction and before event ID generation, using deepcopy to preserve immutability of the original event.

Also improves event queue shutdown behavior:
- Replace `time.sleep` with `shutdown_event.wait` for interruptible retry backoff
- Early return in `_send_event` when shutdown is detected
- Pass `cancel_futures=True` to `executor.shutdown()`
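The three shutdown changes listed above can be sketched roughly as follows. This is a minimal illustration, not the queue in this PR: the `EventQueue` class, its constructor parameters, and the exponential backoff schedule are hypothetical. Only `threading.Event.wait`, the early return, and `cancel_futures=True` come from the description.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable


class EventQueue:
    """Hypothetical queue used only to illustrate interruptible backoff."""

    def __init__(self, send: Callable[[Any], None],
                 max_retries: int = 3, base_delay: float = 1.0) -> None:
        self._send = send
        self._max_retries = max_retries
        self._base_delay = base_delay
        self._shutdown_event = threading.Event()
        self._executor = ThreadPoolExecutor(max_workers=2)

    def _send_event(self, event: Any) -> bool:
        for attempt in range(self._max_retries):
            # Early return: do not start new work once shutdown is signalled.
            if self._shutdown_event.is_set():
                return False
            try:
                self._send(event)
                return True
            except Exception:
                # Unlike time.sleep, Event.wait returns as soon as the event
                # is set; it returns True in that case, so we stop retrying.
                delay = self._base_delay * (2 ** attempt)  # assumed schedule
                if self._shutdown_event.wait(delay):
                    return False
        return False

    def submit(self, event: Any) -> Future:
        return self._executor.submit(self._send_event, event)

    def shutdown(self) -> None:
        self._shutdown_event.set()
        # cancel_futures (Python 3.9+) drops tasks that have not started yet.
        self._executor.shutdown(wait=True, cancel_futures=True)
```

With `time.sleep`, a worker in the middle of a long backoff would hold up `shutdown()` until the sleep finished; waiting on the event lets the same worker exit almost immediately.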
kashishhora
approved these changes
Feb 26, 2026
src/mcpcat/modules/sanitization.py
Outdated
            _sanitize_content_block(block) for block in response["content"]
        ]

    if "structured_content" in response:
Member
I guess there's no guarantee the key won't be camelCase, so I'd probably handle that here too, just in case. I searched and we do check for `result.isError` in some places, so camelCase isn't out of the question.
    event = copy.deepcopy(event)

    if event.response is not None:
        _sanitize_response(event.response)
Member
Same comment as TypeScript here, should wrap in try/except just in case sanitization throws.
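One way the suggested guard could look, as a sketch only. The helper name `sanitize_event_safely` and the `sanitize_in_place` callback are hypothetical, and falling back to the unsanitized event on failure is an assumed policy; dropping the event instead would be an equally plausible choice.

```python
import copy
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


def sanitize_event_safely(event: Any,
                          sanitize_in_place: Callable[[Any], None]) -> Any:
    """Return a sanitized copy of event, or the original if sanitization throws."""
    try:
        # Work on a deep copy so the caller's event is never mutated.
        sanitized = copy.deepcopy(event)
        sanitize_in_place(sanitized)
    except Exception:
        # Assumed fallback: keep the pipeline running with the original event.
        logger.warning("Sanitization failed; using unsanitized event",
                       exc_info=True)
        return event
    return sanitized
```

The broad `except Exception` is deliberate here: the point of the guard is that no sanitizer bug should be able to break event delivery.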
…ontent

The sanitizer previously only checked the `structured_content` key, which would miss base64 data if serialization produced `structuredContent` (camelCase) or any other response field. Now scans all non-content response fields through the base64 scanner.
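A minimal sketch of the two layers with the key-agnostic scan this commit describes. The function names, placeholder strings, regex, and the reading of ">=10KB" as 10 * 1024 characters are all assumptions for illustration, not the code in `sanitization.py`.

```python
import copy
import re
from typing import Any

# Assumption: ">=10KB" is read here as 10 * 1024 characters.
BASE64_MIN_LENGTH = 10 * 1024
# Rough base64 shape check; a real detector may be stricter.
BASE64_RE = re.compile(r"[A-Za-z0-9+/]+={0,2}")
REDACTED = "[binary data removed]"  # hypothetical placeholder text


def sanitize_content_block(block: Any) -> Any:
    """Layer 1: keep text blocks, replace image/audio/blob/unknown blocks."""
    if not isinstance(block, dict):
        return block
    block_type = block.get("type")
    if block_type == "text":
        return block
    # Assumption: a resource block without a blob (text only) is kept.
    resource = block.get("resource")
    if (block_type == "resource" and isinstance(resource, dict)
            and "blob" not in resource):
        return block
    return {"type": "text",
            "text": f"[{block_type or 'unknown'} content removed]"}


def redact_base64(value: Any) -> Any:
    """Layer 2: recursively redact large base64-looking strings."""
    if isinstance(value, str):
        if len(value) >= BASE64_MIN_LENGTH and BASE64_RE.fullmatch(value):
            return REDACTED
        return value
    if isinstance(value, dict):
        return {k: redact_base64(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact_base64(v) for v in value]
    return value


def sanitize_response(response: dict) -> dict:
    """Sanitize a copy: Layer 1 on 'content', Layer 2 on every other field."""
    sanitized = copy.deepcopy(response)
    for key, value in sanitized.items():
        if key == "content" and isinstance(value, list):
            sanitized[key] = [sanitize_content_block(b) for b in value]
        else:
            # Key-agnostic: covers structured_content, structuredContent, etc.
            sanitized[key] = redact_base64(value)
    return sanitized
```

Scanning every non-content field avoids maintaining a list of key spellings, at the cost of walking fields that will rarely contain binary data.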
Summary
- Replace `time.sleep` with interruptible `shutdown_event.wait` for retry backoff, add early return on shutdown detection, and cancel pending futures during executor shutdown

Test plan
- `pytest tests/test_sanitization.py -v` (19 tests covering response content, parameter scanning, boundary conditions, and immutability)
- `pytest tests/test_event_queue.py -v` (26 tests including 2 new shutdown-path tests)
- `pytest tests/ -v` to verify no regressions
- `mypy src/mcpcat/modules/sanitization.py`