PKU-Alignment

Loves Sharing and Open-Source, Making AI Safer.

PKU-Alignment Team

Large language models (LLMs) have immense potential for general intelligence but also carry significant risks. As a research team at Peking University, we focus on alignment techniques for LLMs, such as safety alignment, to make models safer and less toxic.

You are welcome to follow our AI safety projects:

Pinned

  1. omnisafe Public

    JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.

    Python · 924 stars · 124 forks

  2. safety-gymnasium Public

    NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

    Python · 440 stars · 60 forks

  3. safe-rlhf Public

    Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

    Python · 1.4k stars · 119 forks

  4. Safe-Policy-Optimization Public

    NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms

    Python · 352 stars · 49 forks

Repositories

Showing 10 of 20 repositories
  • s1-m Public (forked from PKU-Alignment/align-anything)

    S1-M: Simple Test-time Scaling in Multimodal Reasoning

    Python · 0 stars · Apache-2.0 · 402 forks · 0 issues · 0 pull requests · Updated Mar 25, 2025
  • align-anything Public

    Align Anything: Training All-modality Model with Feedback

    Python · 3,069 stars · Apache-2.0 · 394 forks · 16 issues · 4 pull requests · Updated Mar 23, 2025
  • SafeVLA Public
    Python · 25 stars · 1 fork · 1 issue · 0 pull requests · Updated Mar 18, 2025
  • eval-anything Public
    Python · 5 stars · 5 forks · 0 issues · 1 pull request · Updated Mar 18, 2025
  • omnisafe Public

    JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.

    Python · 924 stars · Apache-2.0 · 124 forks · 14 issues · 3 pull requests · Updated Mar 17, 2025
  • ProAgent Public

    AAAI24(Oral) ProAgent: Building Proactive Cooperative Agents with Large Language Models

    JavaScript · 77 stars · MIT · 9 forks · 0 issues · 0 pull requests · Updated Mar 4, 2025
  • safety-gymnasium Public

    NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

    Python · 440 stars · Apache-2.0 · 60 forks · 4 issues · 1 pull request · Updated Feb 27, 2025
  • Beaver-zh-hk Public
    Python · 0 stars · 0 forks · 0 issues · 0 pull requests · Updated Feb 23, 2025
  • aligner Public

    [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct

    Python · 167 stars · 8 forks · 1 issue · 0 pull requests · Updated Jan 16, 2025
  • .github Public
    0 stars · 0 forks · 0 issues · 0 pull requests · Updated Jan 16, 2025