LLM07 - Insecure Plugin Design - Mitigation/How to Prevent Enhancements #242

Open
GangGreenTemperTatum opened this issue Nov 6, 2023 · 0 comments
Comments

GangGreenTemperTatum (Collaborator) commented Nov 6, 2023

I believe LLM07 could benefit from including some or all of the following mitigation methods in the vulnerability entry:

  • Human in the loop: a plugin should not be able to invoke another plugin by default, especially for plugins that perform high-stakes operations, and a human should confirm that generated content meets quality and ethical standards (see the sketch after this list).
  • It should be transparent to the user which plugin will be invoked and what data is sent to it, possibly even allowing the user to modify the data before it is sent.
  • A security contract and threat model for plugins should be created, so that we have a secure and open infrastructure in which all parties know what their security responsibilities are.
  • An LLM application must assume plugins cannot be trusted (e.g. via direct or indirect prompt injection), and similarly plugins cannot blindly trust LLM application invocations (example: a confused deputy attack).
  • Regularly perform red teaming and model serialization attacks, with thorough benchmarking and reporting of inputs and outputs.
  • Plugins that handle PII and/or impersonate the user are high stakes.
  • Isolation: an architecture that separates a Kernel LLM from Sandbox LLMs, as discussed previously, could help.
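
As a minimal sketch of the first two points, the snippet below gates high-stakes plugin calls behind explicit user confirmation and shows the user exactly which plugin would be invoked and with what data, allowing the payload to be edited or the call to be rejected before anything is sent. All names here (PluginCall, require_user_approval, dispatch, HIGH_STAKES_PLUGINS) are hypothetical and not part of any existing plugin framework.

```python
# Hypothetical human-in-the-loop gate for plugin invocations.
from dataclasses import dataclass
from typing import Optional

# Assumed set of plugins considered high stakes (e.g. PII or user impersonation).
HIGH_STAKES_PLUGINS = {"email_sender", "payments", "calendar_write"}

@dataclass
class PluginCall:
    plugin_name: str
    payload: dict

def require_user_approval(call: PluginCall) -> Optional[PluginCall]:
    """Show the user the plugin name and outgoing data; allow edit or veto."""
    print(f"The model wants to invoke plugin '{call.plugin_name}' with:")
    print(call.payload)
    answer = input("Approve (y), edit (e), or reject (anything else)? ").strip().lower()
    if answer == "y":
        return call
    if answer == "e":
        edited = input("Replacement payload as comma-separated key=value pairs: ")
        call.payload = dict(pair.split("=", 1) for pair in edited.split(",") if "=" in pair)
        return call
    return None  # Rejected: the plugin is never invoked.

def dispatch(call: PluginCall, invoke_plugin):
    # High-stakes plugins always require explicit human confirmation;
    # plugins are never allowed to invoke other plugins directly.
    if call.plugin_name in HIGH_STAKES_PLUGINS:
        approved = require_user_approval(call)
        if approved is None:
            return {"status": "rejected_by_user"}
        call = approved
    return invoke_plugin(call.plugin_name, call.payload)
```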

These mitigation techniques are primarily focused on combating indirect prompt injection, but they should be treated as a de facto standard. I also think there should be some sort of statement or wording such as "Plugins should never be inherently trusted".
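
To illustrate that statement, here is a minimal plugin-side sketch that derives authorization from the authenticated user session rather than from anything the model put in the request, which limits confused-deputy style escalation. The session store, Permission enum, and delete_document helper are all hypothetical assumptions for the example.

```python
# Hypothetical plugin-side check that does not blindly trust the LLM
# application's invocation (confused deputy mitigation).
from enum import Enum, auto

class Permission(Enum):
    READ = auto()
    DELETE = auto()

# Assumed session store mapping authenticated user IDs to granted permissions.
SESSION_PERMISSIONS = {
    "alice": {Permission.READ},
    "bob": {Permission.READ, Permission.DELETE},
}

def delete_document(doc_id: str) -> None:
    # Placeholder for the plugin's actual side effect.
    print(f"document {doc_id} deleted")

def handle_delete_request(authenticated_user: str, doc_id: str) -> dict:
    """Entry point the LLM application calls on behalf of a user.

    Authorization comes from the authenticated session, not from anything
    the model supplied, so a prompt-injected request cannot escalate
    beyond what the real user is allowed to do.
    """
    granted = SESSION_PERMISSIONS.get(authenticated_user, set())
    if Permission.DELETE not in granted:
        return {"status": "forbidden", "reason": "user lacks delete permission"}
    delete_document(doc_id)
    return {"status": "ok"}
```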

Resources and inspiration: kudos to embracethered.

GangGreenTemperTatum added the enhancement label Nov 6, 2023
GangGreenTemperTatum removed their assignment Feb 11, 2024