🐢 Open-Source Evaluation & Testing for LLMs and ML models
-
Updated
Jun 13, 2024 - Python
🐢 Open-Source Evaluation & Testing for LLMs and ML models
RuLES: a benchmark for evaluating rule-following in language models
Evaluation & testing framework for computer vision models
AiShields is an open-source Artificial Intelligence Data Input and Output Sanitizer
Discover and inventory the SaaS applications used across your organization by intelligently analyzing incoming Gmail emails, providing valuable insights into your SaaS landscape.
ATLAS tactics, techniques, and case studies data
Performing website vulnerability scanning using OpenAI technologie
[IJCAI 2024] Imperio is an LLM-powered backdoor attack. It allows the adversary to issue language-guided instructions to control the victim model's prediction for arbitrary targets.
GeminiHacker is a Python script designed to harness the power of a generative AI model for security research, bug bounty hunting, and vulnerability scanning. This README.md file provides detailed instructions on how to install, configure, and use the script effectively.
The Prompt Injection Testing Tool is a Python script designed to assess the security of your AI system's prompt handling against a predefined list of user prompts commonly used for injection attacks. This tool utilizes the OpenAI GPT-3.5 model to generate responses to system-user prompt pairs and outputs the results to a CSV file for analysis.
Image Prompt Injection is a Python script that demonstrates how to embed a secret prompt within an image using steganography techniques. This hidden prompt can be later extracted by an AI system for analysis, enabling covert communication with AI models through images.
CLI tool that uses the Lakera API to perform security checks in LLM inputs
Official code for paper: Z. Zhang, X. Wang, J. Huang and S. Zhang, "Analysis and Utilization of Hidden Information in Model Inversion Attacks," in IEEE Transactions on Information Forensics and Security, doi: 10.1109/TIFS.2023.3295942
Unofficial pytorch implementation of paper: Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures
Prompt Engineering Tool for AI Models with cli prompt or api usage
Python library for Modzy Machine Learning Operations (MLOps) Platform
Learning to Identify Critical States for Reinforcement Learning from Videos (Accepted to ICCV'23)
The official implementation of the CCS'23 paper, Narcissus clean-label backdoor attack -- only takes THREE images to poison a face recognition dataset in a clean-label way and achieves a 99.89% attack success rate.
The official implementation of USENIX Security'23 paper "Meta-Sift" -- Ten minutes or less to find a 1000-size or larger clean subset on poisoned dataset.
Add a description, image, and links to the ai-security topic page so that developers can more easily learn about it.
To associate your repository with the ai-security topic, visit your repo's landing page and select "manage topics."