ai-safety
Here are 93 public repositories matching this topic...
Updated May 12, 2024 - Python
Aira is a series of chatbots developed as an experimentation playground for value alignment.
Updated Jun 4, 2024 - Jupyter Notebook
NeurIPS workshop: We examine the risk of powerful malignant intelligent actors spreading their influence over networks of agents with varying intelligence and motivations.
Updated Dec 11, 2023 - Python
The Model Library is a project that maps the risks associated with modern machine learning systems.
Updated Apr 4, 2024 - Python
DPLL(T)-based Verification tool for DNNs
Updated Jun 9, 2024 - Python
Awesome PrivEx: Privacy-Preserving Explainable AI (PPXAI)
Updated Apr 23, 2024
A repository for the event on AI safety hosted by the Effective Altruism Society at the University of Cape Town.
Updated Sep 16, 2021
A library designed to shut down an agent exhibiting unexpected behavior, providing a potential "mulligan" to human civilization; IN CASE OF FAILURE, DO NOT JUST REMOVE THIS CONSTRAINT AND START IT BACK UP AGAIN
Updated Oct 30, 2022
A project to ensure that all child processes created by an agent "inherit" the agent's safety controls
Updated Oct 29, 2022
📊 Benchmarking the safety of AI systems
Updated Jul 1, 2023 - Jupyter Notebook
A compilation of AI safety ideas, problems, and solutions.
Updated Mar 12, 2023
📦 Redwood Research's transformer interpretability tools, conveniently packaged in a Docker container for simple and reproducible deployments.
Updated Apr 21, 2024 - Dockerfile
[Findings of EMNLP 2022] Expose Backdoors on the Way: A Feature-Based Efficient Defense against Textual Backdoor Attacks
Updated Feb 26, 2023 - Python
LLM evaluation tool for robustness, consistency, and credibility
Updated Aug 30, 2023 - Python
Code for our paper "Model-less Is the Best Model: Generating Pure Code Implementations to Replace On-Device DL Models", accepted at ISSTA'24
Updated Mar 31, 2024 - Python