
Eliciting-Latent-Knowledge-in-Comprehensive-AI-Services-Models

A Conceptual Framework and Preliminary Proposals for AI Alignment and Safety in R&D

This is a research report I authored over the summer as part of the CHERI fellowship program, under the supervision of Patrick Levermore. The project explores the complexities of AI alignment, with a specific focus on reinterpreting the Eliciting Latent Knowledge problem through the lens of the Comprehensive AI Services (CAIS) model. It also examines the model's applicability to ensuring design safety and certification in R&D.

Refer to the blog post for a summary of the paper.
