Update WMDP Dataset #186

@justinphan3110cais

Hi,
We improved the quality of the WMDP dataset last week. The updated dataset is available on the Hugging Face page and in our GitHub repo. Here are the details of the latest update:

Update 2024-04-23: the WMDP multiple choice questions were modified due to issues with data formatting and unicode encoding. Some questions in WMDP-Cyber were also removed for being excessively long, which makes evaluation with a fixed batch size challenging. Some questions in WMDP-Bio were also removed for insufficient dual-use potential (h/t folks from Google DeepMind and OpenAI). The modified version is now uploaded on all mirrors; please re-download the dataset. Thanks!

Could you also update the copy of the dataset under datasets/orchestrators/benchmark/question_answer_dataset?
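In case it helps with reviewing the refresh, here is a minimal sketch of how a previously vendored copy could be compared against the re-downloaded questions. It assumes each record is a dict with `question`, `choices`, and `answer` keys; the helper names `normalize` and `diff_questions` are hypothetical and not part of either repo. Question text is Unicode-normalized before matching so that encoding-only fixes are not reported as removals.

```python
import unicodedata


def normalize(text):
    """NFC-normalize and collapse whitespace so encoding-only edits compare equal."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def diff_questions(old, new):
    """Compare two lists of question records, keyed by normalized question text.

    Assumed record shape (not confirmed by the issue):
    {"question": str, "choices": list[str], "answer": int}
    """
    old_by_q = {normalize(r["question"]): r for r in old}
    new_by_q = {normalize(r["question"]): r for r in new}

    # Questions present only in the old copy (e.g. dropped in the update).
    removed = sorted(q for q in old_by_q if q not in new_by_q)
    # Questions present only in the new copy.
    added = sorted(q for q in new_by_q if q not in old_by_q)
    # Questions present in both whose choices or answer index differ.
    changed = sorted(
        q
        for q in old_by_q.keys() & new_by_q.keys()
        if old_by_q[q]["choices"] != new_by_q[q]["choices"]
        or old_by_q[q]["answer"] != new_by_q[q]["answer"]
    )
    return {"removed": removed, "added": added, "changed": changed}
```

Calling `diff_questions(old_records, new_records)` returns three sorted lists, which may be enough to confirm that the vendored files reflect the removals and formatting fixes described in the update note. Questions whose text was substantially reworded will show up as one removal plus one addition, since matching is by question text.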

Labels

enhancement (New feature or request)
help wanted (Extra attention is needed)
