Attacking-LLMs_with-LLMs

Red Teaming Target LLMs with Assessor LLMs - A Guide to LLM Security Assessments

This draft provides a foundational understanding and actionable guide for conducting security assessments on LLMs through red teaming. It emphasizes a strategic approach to identifying vulnerabilities, ensuring that LLMs can be deployed with confidence in their security and integrity.

This document will cover a few types of security assessments a Red Teamer can perform when attacking LLMs and some best practices Bias and Fairness Assessment | Resistance to Misinformation | Adversarial Attack Simulation + Other examples with included probes Keep in mind: context of the attack – LLMs attacking LLMs

Involves the use of one or more LLMs to probe, attack, and improve the security of another LLM

Each assessment should contain a clearly defined objective, method, and Outcome.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
README.md		README.md
Red Teaming Target LLMs with Assessor LLMs - A Guide to LLM Security Assessments.pdf		Red Teaming Target LLMs with Assessor LLMs - A Guide to LLM Security Assessments.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Attacking-LLMs_with-LLMs

About

Releases

Packages

pjcampbe11/Attacking-LLMs_with-LLMs

Folders and files

Latest commit

History

Repository files navigation

Attacking-LLMs_with-LLMs

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Packages