Code for Legend, an annotation framework for the safety margin of preference datasets.
infer.py: generates harmful responses with the annotator LLM.
SMV.py: generates the standard margin vector with the annotator LLM.
annotation.py: annotates the safety margin of preference datasets with the annotator LLM.
Note: Before running the scripts, replace the model and dataset paths in them with your own.
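
One way to catch a forgotten path before launching a long run is a small pre-flight check like the sketch below. It is only an illustration: the `PATHS` dictionary, its keys, and the `missing_paths` helper are hypothetical and do not appear in the repository, and the placeholder values must be filled in by hand.

```python
from pathlib import Path

# Hypothetical placeholders: these names and values are NOT from the repository.
# Replace them with the locations of your own annotator model and dataset.
PATHS = {
    "annotator_model": "/path/to/annotator-llm",
    "preference_dataset": "/path/to/preference-dataset",
}


def missing_paths(paths):
    """Return the names whose configured path does not exist on disk."""
    return [name for name, location in paths.items() if not Path(location).exists()]


if __name__ == "__main__":
    missing = missing_paths(PATHS)
    if missing:
        print("Update these paths before running the scripts: " + ", ".join(missing))
    else:
        print("All configured paths exist.")
```

This check only covers local paths; a model referenced by a hub identifier rather than a directory would need a different test.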