MAINT: Refactor SkeletonKeyAttack to single-turn prepended conversati…#1739
Open
ksaravindakashyap wants to merge 1 commit into
Open
MAINT: Refactor SkeletonKeyAttack to single-turn prepended conversati…#1739ksaravindakashyap wants to merge 1 commit into
ksaravindakashyap wants to merge 1 commit into
Conversation
Author
|
@microsoft-github-policy-service agree |
Contributor
There was a problem hiding this comment.
Pull request overview
Refactors SkeletonKeyAttack from a two-call flow to a single target call with a simulated skeleton-key exchange prepended as conversation history.
Changes:
- Adds default simulated acceptance prompt and constructor override.
- Moves skeleton-key setup into
_setup_asyncand relies on parent execution. - Updates unit tests and synced documentation/notebook content.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
pyrit/executor/attack/single_turn/skeleton_key.py |
Implements prepended skeleton-key conversation setup. |
pyrit/datasets/executors/skeleton_key/skeleton_key_acceptance.prompt |
Adds default simulated assistant acceptance text. |
tests/unit/executor/attack/single_turn/test_skeleton_key.py |
Rewrites tests around setup and initialization behavior. |
doc/code/executor/attack/skeleton_key_attack.py |
Updates documentation for the single-turn pattern. |
doc/code/executor/attack/skeleton_key_attack.ipynb |
Syncs notebook text/code and clears stale outputs. |
Comments suppressed due to low confidence (2)
pyrit/executor/attack/single_turn/skeleton_key.py:112
- This setup drops the configured request converters when adding the skeleton-key prompt as prepended history. The parent setup passes
request_converters=self._request_converters, and the previous implementation sent the skeleton-key prompt through_send_prompt_to_objective_target_async, so converters applied to it; after this change they apply only to the final objective turn. Pass the converter configuration through when initializing the prepended conversation, or otherwise preserve the previous converter behavior.
await self._conversation_manager.initialize_context_async(
context=context,
target=self._objective_target,
conversation_id=context.conversation_id,
memory_labels=self._memory_labels,
tests/unit/executor/attack/single_turn/test_skeleton_key.py:247
- The rewritten setup tests patch out
initialize_context_asyncand only assert a few arguments, so they no longer cover that configured request converters are preserved when the skeleton-key prompt is moved into prepended history. Please add a test that exercises_setup_asyncwith anAttackConverterConfigand verifies the converters are passed through/applied to the prepended skeleton-key message.
mock_init = AsyncMock()
with patch.object(attack._conversation_manager, "initialize_context_async", mock_init):
await attack._setup_async(context=basic_context)
mock_init.assert_called_once()
call_kwargs = mock_init.call_args.kwargs
assert call_kwargs["context"] is basic_context
assert call_kwargs["target"] is mock_target
assert call_kwargs["conversation_id"] == basic_context.conversation_id
Comment on lines
+108
to
+112
| await self._conversation_manager.initialize_context_async( | ||
| context=context, | ||
| target=self._objective_target, | ||
| conversation_id=context.conversation_id, | ||
| objective=context.objective, | ||
| atomic_attack_identifier=build_atomic_attack_identifier(attack_identifier=self.get_identifier()), | ||
| last_response=None, | ||
| last_score=None, | ||
| outcome=AttackOutcome.FAILURE, | ||
| outcome_reason="Skeleton key prompt was filtered or failed", | ||
| executed_turns=1, | ||
| labels=context.memory_labels, | ||
| memory_labels=self._memory_labels, |
Comment on lines
+188
to
+201
| with patch.object(attack._conversation_manager, "initialize_context_async", new_callable=AsyncMock): | ||
| await attack._setup_async(context=basic_context) | ||
|
|
||
| # Check that skeleton key prompt was included in message | ||
| message = call_args.kwargs["message"] | ||
| assert isinstance(message, Message) | ||
| assert len(message.message_pieces) == 1 | ||
| assert message.message_pieces[0].original_value == "Test skeleton key" | ||
| assert message.message_pieces[0].original_value_data_type == "text" | ||
| assert basic_context.conversation_id != original_id | ||
|
|
||
| async def test_send_skeleton_key_prompt_filtered_response(self, mock_target, mock_prompt_normalizer, basic_context): | ||
| """Test handling of filtered skeleton key prompt response.""" | ||
| async def test_setup_creates_two_prepended_messages(self, mock_target, basic_context): | ||
| attack = SkeletonKeyAttack( | ||
| objective_target=mock_target, | ||
| prompt_normalizer=mock_prompt_normalizer, | ||
| skeleton_key_prompt="Test skeleton key", | ||
| skeleton_key_prompt="sk prompt", | ||
| skeleton_key_acceptance="acceptance", | ||
| ) | ||
|
|
||
| # Simulate filtered response | ||
| mock_prompt_normalizer.send_prompt_async.return_value = None | ||
|
|
||
| result = await attack._send_skeleton_key_prompt_async(context=basic_context) | ||
|
|
||
| assert result is None | ||
|
|
||
| async def test_send_skeleton_key_prompt_uses_correct_converters( | ||
| self, mock_target, mock_prompt_normalizer, basic_context | ||
| ): | ||
| """Test that skeleton key prompt sending uses correct converter configurations.""" | ||
| from pyrit.prompt_normalizer.prompt_converter_configuration import ( | ||
| PromptConverterConfiguration, | ||
| ) | ||
| with patch.object(attack._conversation_manager, "initialize_context_async", new_callable=AsyncMock): | ||
| await attack._setup_async(context=basic_context) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Title:
MAINT: Refactor SkeletonKeyAttack to single-turn prepended conversation pattern
Body:
Description
Refactors
SkeletonKeyAttackto follow the same pattern asFlipAttack— using_setup_asyncto prepend a simulated skeleton key exchange (user prompt + assistant acceptance) as conversation history, then sending the objective in a single real API call via the parentPromptSendingAttack._perform_async.Previously, the attack made two real API calls: one to send the skeleton key prompt and await the model's response, then a second for the objective. This replaces that custom two-turn flow with the standard prepended conversation mechanism already used across the codebase.
Related: #650
Changes:
_setup_asyncadded to setcontext.prepended_conversationwith[user: skeleton_key, assistant: acceptance]SkeletonKeyAttack._perform_asyncoverride removed — parent handles executionskeleton_key_acceptanceconstructor parameter (optional, mirrorsskeleton_key_prompt)pyrit/datasets/executors/skeleton_key/skeleton_key_acceptance.prompt_send_skeleton_key_prompt_async,_create_skeleton_key_failure_result,_load_skeleton_key_promptexecuted_turnsinAttackResultnow returns1instead of2to reflect the true single-turn nature@romanlutz
Tests and Documentation
tests/unit/executor/attack/single_turn/test_skeleton_key.pyrewritten to cover the new_setup_asyncbehaviour (19 tests, all passing)doc/code/executor/attack/skeleton_key_attack.pyand.ipynbupdated with revised description reflecting the single-turn approach; stale pre-run notebook outputs cleared