jailbreak
Here are 61 public repositories matching this topic...
Async wrapper for the Canister API
Updated Jun 21, 2023 - Python
Xirakado is a (currently WIP) semi-automatic PS Vita jailbreaking tool compatible with OFW 3.60-3.73
Updated Jan 29, 2024 - Python
A Streamlit app for testing Prompt Guard, a classifier model by Meta for detecting prompt attacks.
Updated Oct 1, 2024 - Python
A Python supervisor script that iteratively executes an agent script and modifies it via an LLM. The LLM is given a description of how the supervisor works and of the final goal; at each iteration it receives the current agent's return value, stdout, and stderr, and is asked to reply with the next agent verbatim. Optional jailbreak attempt. TUI. Unsafe.
Updated Oct 18, 2024 - Python
Explore techniques to use small models as jailbreaking judges
Updated Sep 18, 2024 - Python
Awesome Jailbreaking Multimodal Large Language Models (automatically updated every 12 hours)
Updated Nov 1, 2024 - Python
FRACTURED-SORRY-Bench: This repository contains the code and data for the FRACTURED-SORRY-Bench framework, as described in our paper.
Updated Aug 30, 2024 - Python
Fine-tuning of Mistral Nemo 13B on the WildJailbreak dataset to produce a red-teaming model
Updated Sep 18, 2024 - Python
A chatbot AI based on the anime character Mizuhara Chizuru, with multiple features for study servers. It can use a jailbroken Bing AI to answer doubts and provide real-time info.
Updated Feb 28, 2024 - Python
User prompt attack detection system
Updated May 31, 2024 - Python
A REST API for reporting the battery percentage of jailbroken iOS devices
Updated May 23, 2023 - Python
Repo hosting the data and results of my research on LLM prompt injection resistance.
Updated Feb 26, 2024 - Python