EnvCommons/ValidMol

ValidMol

OpenReward Environment

Description

ValidMol is an environment for evaluating an agent's ability to complete and fix corrupted molecular SMILES strings. Given a partial or corrupted SMILES notation, the agent must produce a valid, chemically stable molecule. Tasks apply four corruption strategies to real PubChem compounds: truncation, character deletion, ring opening, and bond corruption.

Capabilities

  • Completing and repairing corrupted SMILES molecular representations
  • Understanding chemical structure and bonding rules
  • Reasoning about molecular stability and validity
  • Working with cheminformatics concepts (ring closure, bond types)

Compute Requirements

Agents are given a standard environment with no sandbox or file system access.

Tasks

There are two splits in this environment:

  • train: 789 molecular completion tasks
  • test: 142 molecular completion tasks

Tasks are generated from PubChem compounds (CIDs 2500-3500 for train, CIDs 4500-4700 for test) with four corruption types:

| Corruption Type | Frequency | Description |
|---|---|---|
| Truncation | 40% | Removes 20-50% of the SMILES suffix |
| Character deletion | 25% | Removes 1-3 random characters |
| Ring opening | 20% | Removes a closing ring number |
| Bond corruption | 15% | Removes or changes bond symbols |
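
As a rough illustration of the four corruption types, here is a minimal sketch in plain Python. It is not the environment's actual generator: the function names, the handling of bracket atoms, and the choice to ignore two-digit `%nn` ring closures are all assumptions made for this example. Only the frequencies and the 20-50% and 1-3 ranges come from the table above.

```python
import random


def _outside_brackets(smiles):
    """Yield (index, char) for characters that are not inside a [...] atom."""
    depth = 0
    for i, ch in enumerate(smiles):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth = max(0, depth - 1)
        elif depth == 0:
            yield i, ch


def truncate(smiles, rng):
    """Remove 20-50% of the string from the end (at least one character)."""
    cut = max(1, int(len(smiles) * rng.uniform(0.2, 0.5)))
    return smiles[: len(smiles) - cut]


def delete_chars(smiles, rng):
    """Remove 1-3 characters at random positions."""
    k = min(rng.randint(1, 3), len(smiles))
    drop = set(rng.sample(range(len(smiles)), k))
    return "".join(ch for i, ch in enumerate(smiles) if i not in drop)


def open_ring(smiles, rng):
    """Remove one ring-closing digit (the second occurrence of a ring label)."""
    open_labels, closing = set(), []
    skip = 0
    for i, ch in _outside_brackets(smiles):
        if skip:
            skip -= 1
        elif ch == "%":
            skip = 2  # assumption: two-digit %nn closures are left untouched
        elif ch.isdigit():
            if ch in open_labels:
                open_labels.discard(ch)
                closing.append(i)
            else:
                open_labels.add(ch)
    if not closing:
        return smiles  # no ring closure to remove
    i = rng.choice(closing)
    return smiles[:i] + smiles[i + 1:]


def corrupt_bond(smiles, rng):
    """Remove a '=' or '#' bond symbol, or swap it for the other one."""
    bonds = [i for i, ch in _outside_brackets(smiles) if ch in "=#"]
    if not bonds:
        return smiles  # no explicit double or triple bond to corrupt
    i = rng.choice(bonds)
    if rng.random() < 0.5:
        new = ""
    else:
        new = "#" if smiles[i] == "=" else "="
    return smiles[:i] + new + smiles[i + 1:]


# Strategy weights taken from the frequency column of the table.
STRATEGIES = [
    (truncate, 0.40),
    (delete_chars, 0.25),
    (open_ring, 0.20),
    (corrupt_bond, 0.15),
]


def corrupt(smiles, rng):
    """Apply one corruption strategy, sampled by its frequency."""
    funcs, weights = zip(*STRATEGIES)
    return rng.choices(funcs, weights=weights, k=1)[0](smiles, rng)
```

Passing an explicit `random.Random(seed)` keeps a generated task set reproducible. For example, `corrupt("CC(=O)Oc1ccccc1C(=O)O", random.Random(0))` returns one corrupted variant of the aspirin SMILES string.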

Reward Structure

This is a sparse, verifiable reward environment. The agent calls submit_completion with a SMILES string and receives a binary reward based on three-tier RDKit validation:

  1. Syntactic validity: RDKit parsing succeeds
  2. Structural validity: RDKit sanitization passes
  3. Chemical stability: No peroxides, hydrazines, or conjugated alkynes

The reward is binary:

  • 1.0: All three tiers pass
  • 0.0: Any tier fails

No LLM graders are used.
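
The three tiers can be sketched with RDKit roughly as follows. This is an illustrative approximation, not the environment's grader: the SMARTS patterns for peroxides, hydrazines, and conjugated alkynes are assumptions (the README does not give the exact patterns), as are the function names `tier_results` and `binary_reward`.

```python
from typing import Tuple

# Assumed SMARTS patterns for the stability tier; the real grader's
# patterns are not documented here and may differ.
UNSTABLE_SMARTS = {
    "peroxide": "[OX2][OX2]",
    "hydrazine": "[NX3][NX3]",
    "conjugated_alkyne": "C#CC#C",
}


def tier_results(smiles: str) -> Tuple[bool, bool, bool]:
    """Return (syntactic, structural, stable) flags for a SMILES string."""
    from rdkit import Chem  # imported lazily; requires RDKit to be installed

    # Tier 1: parse without sanitization so that only syntax is checked.
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None or mol.GetNumAtoms() == 0:
        return (False, False, False)

    # Tier 2: sanitization checks valences, kekulization, aromaticity, etc.
    try:
        Chem.SanitizeMol(mol)
    except Exception:
        return (True, False, False)

    # Tier 3: reject molecules containing any of the assumed unstable motifs.
    stable = not any(
        mol.HasSubstructMatch(Chem.MolFromSmarts(pattern))
        for pattern in UNSTABLE_SMARTS.values()
    )
    return (True, True, stable)


def binary_reward(tiers: Tuple[bool, bool, bool]) -> float:
    """Map tier flags to the sparse reward: 1.0 only if every tier passes."""
    return 1.0 if all(tiers) else 0.0
```

Under these assumed patterns, `binary_reward(tier_results("CCO"))` would give the full reward for ethanol, while hydrogen peroxide (`OO`) would parse and sanitize but fail the stability tier.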

Data

Task data is generated from PubChem compounds via the PUG-REST API. Quality filters include SMILES length 10-80, heavy atoms 5-50, and a maximum of 5 rings. Data is stored on the OpenReward platform.
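
The quality filters can be expressed as a small threshold check. The sketch below is an assumption about how they might be applied: the class name `QualityFilter` is hypothetical, the bounds are treated as inclusive, and the heavy-atom and ring counts are expected to be computed elsewhere (for instance with RDKit's `Mol.GetNumHeavyAtoms()` and `Mol.GetRingInfo().NumRings()`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityFilter:
    """Thresholds from the Data section; inclusive bounds are an assumption."""

    min_len: int = 10
    max_len: int = 80
    min_heavy: int = 5
    max_heavy: int = 50
    max_rings: int = 5

    def accepts(self, smiles: str, heavy_atoms: int, rings: int) -> bool:
        """Return True if the compound passes all three filters."""
        return (
            self.min_len <= len(smiles) <= self.max_len
            and self.min_heavy <= heavy_atoms <= self.max_heavy
            and rings <= self.max_rings
        )
```

For example, aspirin (`CC(=O)Oc1ccccc1C(=O)O`, 13 heavy atoms, 1 ring) passes these thresholds, whereas ethanol (`CCO`) is rejected for being too short.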

Tools

| Tool | Description |
|---|---|
| submit_completion | Submit a completed/fixed SMILES string for three-tier chemical validation |

Time Horizon

Single-turn. Agents receive a corrupted SMILES string and submit one completion.

Environment Difficulty

[Put environment difficulty statistics here]

Other Environment Requirements

There are no further environment requirements; ValidMol works out of the box with the OpenReward endpoint.

Safety

Agents in ValidMol complete molecular structures from corrupted inputs. There is a dual-use concern in that improved molecular generation capabilities could be applied to both beneficial and harmful purposes.

Citation

@dataset{GRValidMol,
  author    = {General Reasoning Inc. Team},
  title     = {ValidMol},
  year      = {2026},
  publisher = {OpenReward},
  url       = {https://www.openreward.ai/GeneralReasoning/ValidMol}
}
