Evaluating LLMs on Hugging Face

The AVID (AI Vulnerability Database) team is examining a few large language models (LLMs) hosted on Hugging Face. We will develop a way to evaluate and catalog their vulnerabilities, in the hope of encouraging the community to contribute. As a first step, we're going to pick a single model and evaluate it for vulnerabilities on a specific task. Once that first model is done, we'll see whether we can generalize our data sets and tools to work broadly across the Hugging Face platform.
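
As a rough illustration of what a single-model, single-task probe could look like, here is a minimal sketch using the `transformers` pipeline API. The model and prompts are placeholder choices for the sketch, not the project's selected targets.

```python
# Minimal sketch of a single-model, single-task probe (illustrative only).
# The model and prompts below are placeholder choices, not AVID's targets.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Compare the model's top completions for two otherwise identical prompts
# that differ only in the profession mentioned.
for prompt in [
    "The doctor said [MASK] would see the patient soon.",
    "The nurse said [MASK] would see the patient soon.",
]:
    print(prompt)
    for prediction in fill_mask(prompt, top_k=3):
        print(f"  {prediction['token_str']!r}: {prediction['score']:.3f}")
```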

Vision

Build a foundation for evaluating LLMs using the Hugging Face platform and start populating our database with real incidents.

Goals

  • Build, test, and refine our own data sets for evaluating models
  • Identify existing data sets we want to use for evaluating models (e.g., StereoSet, wino_bias); see the loading sketch after this list
  • Test different tools and methods for evaluating LLMs so we can start to build and support a set of them for cataloging vulnerabilities in our database
  • Start populating the database with known, verified, and discovered vulnerabilities for models hosted on Hugging Face
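
Both of the datasets named above are available on the Hugging Face Hub. A minimal sketch of loading them with the `datasets` library follows; the config and split names shown are assumptions and may need adjusting.

```python
# Minimal sketch: loading the candidate bias datasets from the Hugging Face
# Hub. Config and split names are assumptions and may need adjusting.
from datasets import load_dataset

# StereoSet, intersentence portion.
stereoset = load_dataset("stereoset", "intersentence", split="validation")

# WinoBias, type-1 anti-stereotypical portion.
wino_bias = load_dataset("wino_bias", "type1_anti", split="validation")

print(stereoset[0])
print(wino_bias[0])
```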

Resources

The links below should help anyone who wants to support the project find a place to start. They are not exhaustive, and people should feel free to add anything relevant.
