pardcomper

Ziyu Wang pardcomper

PhD candidate · trying to figure out whether multimodal LLMs are safe to deploy, and how badly they break when they aren't.

mllm-jailbreak-bench Public

Reproducible benchmark for adversarial attacks on multimodal large language models

Python · 28 stars · 4 forks

safegate Public

Lightweight runtime safety guard for multimodal LLM I/O

Python · 17 stars

trust-eval-mm Public

Multi-dimensional trustworthiness evaluation for multimodal LLMs

Python · 15 stars

chainreason Public

Forked from joshawome/chainreason

A benchmark for evaluating LLM reasoning on Ethereum and DeFi tasks

Python

pardcomper Public

Profile README