MolecularSafety

Description

MolecularSafety is an environment for evaluating agents on molecular safety classification tasks. Given a molecule's SMILES string and a safety endpoint, the agent predicts whether the molecule is safe or unsafe. The dataset pools three toxicity classification datasets from Therapeutics Data Commons (TDC): AMES mutagenicity, hERG cardiotoxicity, and ClinTox clinical trial toxicity.

Capabilities

Classifying molecular safety across multiple toxicity endpoints
Predicting AMES mutagenicity from molecular structure
Predicting hERG cardiotoxicity (potassium channel blocking)
Predicting clinical trial toxicity

Compute Requirements

MolecularSafety does not require a sandbox. It has minimal compute requirements.

License

CC BY 4.0 (following the TDC dataset licenses).

Tasks

There are two splits: train (1,000 tasks) and test (100 tasks), totaling 1,100 tasks. Tasks are pooled from three TDC safety datasets (~22,000 molecules total) with proportional sampling:

Dataset	Property	Classes	Train	Test
AMES	Mutagenicity	0 = non-mutagenic, 1 = mutagenic	334	34
hERG_Karim	hERG Cardiotoxicity	0 = non-blocker, 1 = blocker	609	56
ClinTox	Clinical Trial Toxicity	0 = non-toxic, 1 = toxic	57	10

Overall class balance is ~48.5% positive (unsafe).
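As a quick consistency check, the per-dataset counts in the table can be summed to confirm the stated split sizes and to see each dataset's share of a split. This is a minimal sketch: the counts are copied from the table, while the names `SPLIT_COUNTS`, `split_total`, and `dataset_share` are illustrative and not part of the environment's code.

```python
# Per-dataset task counts copied from the table above.
# The dictionary and helper names are hypothetical, for illustration only.
SPLIT_COUNTS = {
    "AMES": {"train": 334, "test": 34},
    "hERG_Karim": {"train": 609, "test": 56},
    "ClinTox": {"train": 57, "test": 10},
}


def split_total(split: str) -> int:
    """Total number of tasks in a split, summed over the three datasets."""
    return sum(counts[split] for counts in SPLIT_COUNTS.values())


def dataset_share(dataset: str, split: str) -> float:
    """Fraction of a split's tasks that come from one dataset."""
    return SPLIT_COUNTS[dataset][split] / split_total(split)


for split in ("train", "test"):
    total = split_total(split)
    shares = {name: round(dataset_share(name, split), 3) for name in SPLIT_COUNTS}
    print(split, total, shares)
```

The totals come out to 1,000 train tasks and 100 test tasks, matching the split sizes stated above.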

Reward Structure

This is a sparse, verifiable reward environment with binary scoring. The agent calls submit_prediction once with a classification (0 = safe, 1 = unsafe).

Correct: Reward 1.0.
Incorrect: Reward 0.0.

We do not use LLM graders for this task.
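The scoring rule above amounts to an exact match between the submitted class and the ground-truth label. The following is a minimal sketch of that rule, not the environment's actual grading code; the function name `binary_reward` and the input validation are assumptions.

```python
def binary_reward(prediction: int, label: int) -> float:
    """Return 1.0 for a correct classification and 0.0 otherwise.

    Both arguments use the convention from this README:
    0 = safe, 1 = unsafe. Hypothetical helper, for illustration only.
    """
    if prediction not in (0, 1) or label not in (0, 1):
        # Rejecting out-of-range values is an assumption of this sketch.
        raise ValueError("prediction and label must be 0 (safe) or 1 (unsafe)")
    return 1.0 if prediction == label else 0.0


print(binary_reward(1, 1))  # correct prediction
print(binary_reward(0, 1))  # incorrect prediction
```

Because the reward is a deterministic comparison against a stored label, no grader model is involved in scoring.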

Data

Task data is pooled from three TDC toxicity datasets: AMES (Hansen et al.), hERG_Karim (Karim et al.), and ClinTox (Gayvert et al.). Data files are stored on the OpenReward platform.

Tools

Agents are given a single tool:

submit_prediction: Submit a safety classification (0 = safe/negative, 1 = unsafe/positive). Returns whether the prediction is correct. This tool can only be called once per task.
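The once-per-task constraint can be pictured as a small piece of per-task state. The sketch below is hypothetical: the class `SingleSubmissionTask` and its error handling are assumptions for illustration, and only the tool name `submit_prediction` and the 0/1 convention come from this README.

```python
class SingleSubmissionTask:
    """Hypothetical stand-in for one task; not the environment's real class."""

    def __init__(self, label: int) -> None:
        self._label = label        # ground truth: 0 = safe, 1 = unsafe
        self._submitted = False    # flips to True after the first valid call

    def submit_prediction(self, prediction: int) -> bool:
        """Record the single allowed prediction and report whether it is correct."""
        if self._submitted:
            raise RuntimeError("submit_prediction can only be called once per task")
        if prediction not in (0, 1):
            # Assumption: an invalid value is rejected without using up the call.
            raise ValueError("prediction must be 0 (safe) or 1 (unsafe)")
        self._submitted = True
        return prediction == self._label


task = SingleSubmissionTask(label=1)
print(task.submit_prediction(1))  # first call is accepted
try:
    task.submit_prediction(0)     # second call is rejected
except RuntimeError as err:
    print(err)
```

How the real tool responds to a repeated or malformed call is not specified here; raising an exception is just one plausible design.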

Time Horizon

MolecularSafety is a single-turn environment. The agent receives a molecule's SMILES string and safety endpoint, and submits one classification. Each task requires exactly one tool call.

Environment Difficulty

[Statistics on environment difficulty here]

Other Environment Requirements

There are no further environment requirements; MolecularSafety works out of the box with the OpenReward endpoint.

Safety

Agents in MolecularSafety are asked to classify molecules for toxicity across multiple safety endpoints. The environment does not present direct safety risks, as agents only provide classification predictions with no access to external systems.

However, this is a dual-use domain. Models trained to predict toxicity could be misused to design harmful compounds in other contexts.

Citations

@dataset{GRMolecularSafety,
  author    = {General Reasoning Inc. Team},
  title     = {MolecularSafety},
  year      = {2026},
  publisher = {OpenReward},
  url       = {https://openreward.ai/GeneralReasoning/MolecularSafety}
}

@article{huang2021therapeutics,
  title={Therapeutics Data Commons: Machine learning datasets and tasks for drug discovery and development},
  author={Huang, Kexin and Fu, Tianfan and Gao, Wenhao and Zhao, Yue and Roohani, Yusuf and Leskovec, Jure and Coley, Connor W and Xiao, Cao and Sun, Jimeng and Zitnik, Marinka},
  journal={Proceedings of NeurIPS Datasets and Benchmarks},
  year={2021}
}
