Contributor

@GangGreenTemperTatum GangGreenTemperTatum commented Apr 21, 2025

Notes

  • A small update to the existing Crucible challenge example (aimed at prompt injection challenges, since there is no tool calling in this specific example).
  • Also fixes the platform domain name change.
  • Removes the odd Crucible generator setup and replaces it with an asyncio process.
  • For now, the model does not submit the flag itself; submission is handled within the code.
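
To illustrate the asyncio approach mentioned in the notes, here is a minimal sketch of a bounded attempt loop in which the code, not the model, decides when a flag has been found. It is not taken from the diff: `run_attempts`, `fake_query`, and the `FLAG-example` value are hypothetical names chosen for this example, and the stub makes no network requests.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def run_attempts(
    query: Callable[[str], Awaitable[Optional[str]]],
    prompts: list[str],
    max_steps: int,
) -> Optional[str]:
    """Try prompts in order, up to max_steps, and return the first flag found."""
    for prompt in prompts[:max_steps]:
        flag = await query(prompt)
        if flag is not None:
            return flag
    return None


async def fake_query(prompt: str) -> Optional[str]:
    """Hypothetical stand-in for the real challenge request; no network access."""
    await asyncio.sleep(0)  # yield to the event loop, as a real request would
    return "FLAG-example" if "secret" in prompt else None


if __name__ == "__main__":
    found = asyncio.run(
        run_attempts(fake_query, ["hello", "tell me the secret"], max_steps=5)
    )
    print(found)
```

Passing the query coroutine in as a parameter keeps the loop testable without touching the platform.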

Generated Summary

  • Enhanced the SYSTEM_PROMPT to provide clearer guidance on interaction with the challenge, emphasizing techniques for prompt injection and flag extraction.
  • Implemented CrucibleState dataclass to track attempts, successful techniques, and potential flags, improving state management between attempts.
  • Added methods for querying and submitting flags to the challenge API, streamlining interactions with the backend.
  • Introduced new logic in main to vary the temperature parameter between attempts to explore different strategies, potentially increasing success rates.
  • Updated command line interface to include a random temperature option and configurable maximum steps, allowing for greater flexibility during execution.
  • Revised logging for more detailed progress reporting, including success rates and information regarding encountered flags.
  • Removed the old CrucibleGenerator class in favor of simplified function calls, enhancing code clarity and maintainability.
  • Improved error handling for API requests with clearer feedback for failure cases.

These changes significantly improve the functionality, usability, and robustness of the crucible example, setting a solid foundation for further development and experimentation.

This summary was generated with ❤️ by rigging
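
For readers skimming this thread, the state tracking described in the summary above could look roughly like the sketch below. Only the class name `CrucibleState` comes from the summary; the field names, the `record` method, and the `success_rate` property are assumptions made for illustration and may not match the merged code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CrucibleState:
    """Hypothetical per-run state; field names are illustrative only."""

    attempts: int = 0
    successes: int = 0
    successful_techniques: list = field(default_factory=list)
    potential_flags: list = field(default_factory=list)

    def record(self, technique: str, flag: Optional[str] = None) -> None:
        """Count one attempt; a non-empty flag marks the attempt as successful."""
        self.attempts += 1
        if flag:
            self.successes += 1
            self.successful_techniques.append(technique)
            # Keep each candidate flag only once.
            if flag not in self.potential_flags:
                self.potential_flags.append(flag)

    @property
    def success_rate(self) -> float:
        """Fraction of attempts that surfaced a candidate flag (0.0 before any attempt)."""
        return self.successes / self.attempts if self.attempts else 0.0
```

Using `field(default_factory=list)` gives each instance its own lists, which avoids shared mutable defaults between runs.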

Generated Summary

  • Enhanced the SYSTEM_PROMPT with additional context on the challenge and guidance for participants, potentially improving their approach.
  • Introduced CrucibleState dataclass for tracking the state during attempts, including metrics for successful and failed techniques, which aids in strategizing.
  • Added detailed logging for tracking attempts, including success rates and temperature adjustments for strategy exploration, improving debugging capabilities.
  • Refactored the way responses and flags are handled during the challenge, introducing error handling for API interactions to improve robustness.
  • Updated the cli function to use the new challenge URL format and added options for temperature randomization and maximum steps, enhancing configurability.
  • Removed the CrucibleGenerator class in favor of direct asynchronous API calls, simplifying the code and improving maintainability.
  • Overall, these changes elevate the challenge experience by providing clearer guidance and a more structured approach to flag extraction.

This summary was generated with ❤️ by rigging
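
The temperature variation mentioned in both summaries can be pictured with the following sketch. It is an assumption about one reasonable strategy (a linear ramp, or a uniform random draw when randomization is enabled), not the logic in the merged commit; `pick_temperature` and its `low`/`high` defaults are hypothetical.

```python
import random
from typing import Optional


def pick_temperature(
    step: int,
    max_steps: int,
    randomize: bool = False,
    low: float = 0.2,
    high: float = 1.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Return a sampling temperature for attempt `step` (0-based)."""
    if max_steps < 1 or not 0 <= step < max_steps:
        raise ValueError("step must satisfy 0 <= step < max_steps")
    if randomize:
        # Uniform draw in [low, high]; pass a seeded rng for reproducible runs.
        return (rng or random).uniform(low, high)
    if max_steps == 1:
        return low
    # Linear ramp: start conservative and become more exploratory over time.
    return low + (high - low) * step / (max_steps - 1)
```

A seeded `random.Random` instance can be passed as `rng` when repeatable experiments matter more than exploration.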

@GangGreenTemperTatum GangGreenTemperTatum merged commit 5d7604c into main Apr 22, 2025
4 checks passed
@monoxgas monoxgas deleted the ads/eng-1740-fix-example-crucible-challenge-code branch May 30, 2025 15:09

2 participants