Welcome to shadow-LLM-Awesome, a curated collection of resources, research papers, articles, and discussions focused on instances where Large Language Models (LLMs) exhibit behaviors that do not align with human values. This repository aims to support researchers, ethicists, and technologists in understanding and addressing the ethical, societal, and technical challenges posed by LLMs.
The primary goals of this repository are to:
- Collect and organize instances where LLMs have deviated from human ethical standards or societal norms.
- Provide a platform for discussions and research that aim to improve the alignment of LLMs with human values.
- Foster awareness and understanding of the potential risks and challenges associated with LLM technologies.
We welcome contributions from the community. If you have relevant resources, research papers, articles, or insights, please consider contributing. Your input is valuable in making this a comprehensive resource.
- Relevance: Ensure that your contributions are directly related to LLMs and their alignment with human values.
- Respectful Discourse: Maintain a respectful and constructive tone in discussions and contributions.
- Accuracy: Strive for accuracy and factual correctness in the information provided.
- Citation: Properly cite sources and give credit where it is due.
- 2023 - Tell, Don't Show: Declarative Facts Influence How LLMs Generalize
- 2023 - Taken out of context: On measuring situational awareness in LLMs
- 2023 - Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models
We encourage open discussions on the topics covered in this repository. Please feel free to start a discussion in the Issues section or contribute to existing threads.
The information in this repository is provided "as is" without warranty of any kind. The contents are not meant to be exhaustive or definitive but rather a starting point for further exploration and discussion.
This repository is open-source and available under the MIT License.
For any queries or suggestions, feel free to contact us by email at jli265@ncsu.edu.
This repository is maintained by contributors who are passionate about understanding and addressing the ethical implications of LLMs. Your contributions and insights are highly valued.