trustworthy-ai

Here are 148 public repositories matching this topic...

Trusted-AI / adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

python machine-learning privacy ai attack extraction inference artificial-intelligence evasion red-team poisoning adversarial-machine-learning blue-team adversarial-examples adversarial-attacks trusted-ai trustworthy-ai

Updated Mar 13, 2025
Python

Giskard-AI / giskard

🐢 Open-Source Evaluation & Testing for AI & LLM systems

ai-security mlops fairness-ai responsible-ai ml-validation red-team-tools trustworthy-ai ml-testing llm ai-red-team ai-testing llmops llm-security llm-eval llm-evaluation rag-evaluation agent-evaluation

Updated Mar 10, 2025
Python

zjunlp / EasyEdit

[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.

Updated Mar 10, 2025
Jupyter Notebook

HowieHwong / TrustLLM

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

nlp benchmark natural-language-processing ai toolkit evaluation dataset pypi-package trustworthy-machine-learning trustworthy-ai large-language-models llm

Updated Feb 18, 2025
Python

THUYimingLi / BackdoorBox

The open-source Python toolbox for backdoor attacks and defenses.

backdoor-attacks trustworthy-machine-learning backdoor-learning trustworthy-ai backdoor-defenses

Updated Mar 14, 2025
Python

JohnSnowLabs / langtest

Deliver safe & effective language models

nlp artificial-intelligence benchmarks benchmark-framework model-assessment ai-safety mlops responsible-ai ml-safety trustworthy-ai ethics-in-ai ml-testing large-language-models llm ai-testing llm-test llm-evaluation-toolkit llm-as-evaluator llm-testing

Updated Mar 12, 2025
Python

aiverify-foundation / moonshot

Moonshot - A simple and modular tool to evaluate and red-team any LLM application.

benchmarking evaluation-framework red-teaming trustworthy-ai llm

Updated Mar 8, 2025
Python

yunqing-me / AttackVLM

[NeurIPS-2023] Annual Conference on Neural Information Processing Systems

deep-generative-model adversarial-attack trustworthy-ai foundation-models large-language-models text-to-image-generation generative-ai vision-language-model image-to-text-generation

Updated Dec 22, 2024
Python

tsinghua-fib-lab / ANeurIPS2024_SPV-MIA

[NeurIPS'24] "Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration"

membership-inference-attack trustworthy-ai large-language-models

Updated Mar 13, 2025
Python

liuzuxin / FSRL

🚀 A fast safe reinforcement learning library in PyTorch

library reinforcement-learning robotics decision-making pytorch sac safety-critical trpo ppo cpo safe-rl trustworthy-ai cvpo

Updated Sep 30, 2024
Python

ffhibnese / Model-Inversion-Attack-ToolBox

A comprehensive toolbox for model inversion attacks and defenses that is easy to get started with.

machine-learning privacy toolbox benchmarks model-inversion model-inversion-attacks trustworthy-ai

Updated Feb 13, 2025
Python

aiverify-foundation / aiverify

AI Verify

trustworthy-ai

Updated Mar 14, 2025
Python

yunqing-me / WatermarkDM

Code of the paper: A Recipe for Watermarking Diffusion Models

text-to-image watermark generative-models diffusion-models trustworthy-ai

Updated Nov 13, 2024
Jupyter Notebook

thu-ml / MMTrustEval

A toolbox for benchmarking the trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Datasets and Benchmarks Track)

benchmark privacy toolbox safety multi-modal fairness robustness claude gpt-4 trustworthy-ai truthfulness mllm

Updated Mar 4, 2025
Python

sleeepeer / PoisonedRAG

[USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models

security machine-learning ai rag trustworthy-ai retrieval-augmented-generation

Updated Feb 23, 2025
Python

verivital / nnv

Neural Network Verification Software Tool

neural-network verification reachability formal-methods hybrid-systems formal-verification cyber-physical autonomy cyber-physical-systems reachability-analysis robustness-verification trustworthy-machine-learning neural-network-verification trustworthy-ai safe-ai safe-autonomy neural-network-certification assured-autonomy

Updated Feb 18, 2025
MATLAB

ml-for-high-risk-apps-book / Machine-Learning-for-High-Risk-Applications-Book

Official code repo for the O'Reilly Book - Machine Learning for High-Risk Applications

security machine-learning deep-learning oreilly explainable-ai interpretable-machine-learning oreilly-books responsible-ai trustworthy-ai

Updated May 23, 2023
Jupyter Notebook

IBM / ai-privacy-toolkit

A toolkit for tools and techniques related to the privacy and compliance of AI models.

python machine-learning privacy ai ml artificial-intelligence gdpr anonymization mlops ai-models trustworthy-ai

Updated Jul 3, 2024
Python

TrustGen / TrustEval-toolkit

TrustEval: A modular and extensible toolkit for comprehensive trust evaluation of generative foundation models (GenFMs)

machine-learning deep-learning toolkit evaluation text-to-image vlm trustworthy-ai llm generative-ai

Updated Feb 25, 2025
Python

qitianwu / GraphOOD-GNNSafe

The official implementation for ICLR23 paper "GNNSafe: Energy-based Out-of-Distribution Detection for Graph Neural Networks"

deep-learning pytorch artificial-intelligence outlier-detection label-propagation geometric-deep-learning node-classification graph-neural-networks anamoly-detection pytorch-geometric out-of-distribution-detection large-graph trustworthy-ai distribution-shift out-of-distribution-generalization

Updated Jul 27, 2023
Python
