Pinned

  1. Intro_to_ML_Safety Public

    67 19

  2. trojan-dc-2023 Public

    JavaScript 1

Repositories

Showing 10 of 19 repositories
  • cluster-docs Public
    CSS 0 MIT 2 4 0 Updated Jan 27, 2025
  • hle Public

    Humanity's Last Exam

    Python 283 MIT 14 1 0 Updated Jan 27, 2025
  • cerberus-cluster Public

    HPC cluster code and configurations for running on OCI

    Python 4 UPL-1.0 0 70 0 Updated Jan 13, 2025
  • AISES Public
    CSS 0 1 0 0 Updated Jan 8, 2025
  • safetywashing Public

    Measuring correlations between safety benchmarks and general AI capabilities benchmarks.

    Python 6 MIT 0 0 0 Updated Oct 2, 2024
  • course.mlsafety.org Public
    HTML 3 MIT 0 0 0 Updated Sep 20, 2024
  • forecasting Public

    Forecasting.

    TypeScript 32 11 1 0 Updated Sep 11, 2024
  • HarmBench Public

    HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal

    Jupyter Notebook 478 MIT 68 21 4 Updated Aug 16, 2024
  • tdc2023-starter-kit Public

    This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition.

    Python 83 MIT 28 0 0 Updated May 19, 2024
  • wmdp Public

    WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning method that reduces LLM performance on WMDP while retaining general capabilities.

    Jupyter Notebook 95 MIT 27 8 1 Updated Apr 27, 2024
