Skip to content

Red Teaming LLMs with LLMs A Guide to LLM Security Assessments

Notifications You must be signed in to change notification settings

pjcampbe11/Attacking-LLMs_with-LLMs

Repository files navigation

Attacking-LLMs_with-LLMs

Red Teaming Target LLMs with Assessor LLMs - A Guide to LLM Security Assessments

This draft provides a foundational understanding and actionable guide for conducting security assessments on LLMs through red teaming. It emphasizes a strategic approach to identifying vulnerabilities, ensuring that LLMs can be deployed with confidence in their security and integrity.

This document will cover a few types of security assessments a Red Teamer can perform when attacking LLMs and some best practices Bias and Fairness Assessment | Resistance to Misinformation | Adversarial Attack Simulation + Other examples with included probes Keep in mind: context of the attack – LLMs attacking LLMs

Involves the use of one or more LLMs to probe, attack, and improve the security of another LLM 

Each assessment should contain a clearly defined objective, method, and Outcome.

About

Red Teaming LLMs with LLMs A Guide to LLM Security Assessments

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published