Skywork-OR1-RL-Data is an environment for evaluating agents on mathematical reasoning and code generation tasks. It is based on the Skywork-OR1 RL training dataset from Skywork AI, consisting of 105,000 math problems and 14,112 code problems curated from diverse open-source datasets including NuminaMath, DeepScaler, and competitive programming collections.
The environment has two variants:
- skyworkmath: Single-turn math problem solving with rule-based answer verification via math_verify
- skyworkcode: Multi-step code generation with sandbox test execution
Together, the variants cover:
- Mathematical reasoning across competition-level problems (olympiads, AMC/AIME, Chinese contests)
- Code generation with stdin/stdout test case verification
- Rule-based verifiable rewards (no LLM grader needed for either variant)
The math variant does not require a sandbox and has minimal compute requirements. The code variant provides agents with a sandbox (0.5 CPU, 1GB RAM) for code development and execution.
License: Apache 2.0 (matching the original dataset license).
| Variant | Split | Tasks |
|---|---|---|
| skyworkmath | train | ~99,750 |
| skyworkmath | test | ~5,250 |
| skyworkcode | train | ~13,400 |
| skyworkcode | test | ~700 |
Math tasks: Each task presents a competition-level math problem. The agent submits an answer via the answer tool, which is verified using the math_verify library against one or more acceptable ground truth answers.
Code tasks: Each task presents a programming problem. The agent uses CLI tools to develop a Python solution that reads from stdin and writes to stdout, then submits it via the submit tool which runs it against hidden test cases.
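A submitted solution is just a Python script that reads stdin and writes stdout. A minimal sketch, for a hypothetical problem ("read N, then N integers; print their sum") of our own invention:

```python
import sys

def solve(raw: str) -> str:
    """Hypothetical problem: first token is N, then N integers; output their sum."""
    tokens = raw.split()
    n = int(tokens[0])
    return str(sum(int(t) for t in tokens[1:1 + n]))

def main() -> None:
    raw = sys.stdin.read()
    if raw.strip():  # guard against empty input
        sys.stdout.write(solve(raw) + "\n")

if __name__ == "__main__":
    main()
```

Keeping the logic in a pure solve() function makes it easy to test locally in the sandbox before calling submit.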
Math variant: Binary reward (0 or 1). The answer is verified using rule-based math_verify comparison against ground truth.
Code variant: Proportional reward (0.0 to 1.0) based on fraction of test cases passed: passed / total.
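The two reward rules above can be sketched in a few lines (helper names are ours, for illustration):

```python
def code_reward(results: list[bool]) -> float:
    """Proportional reward: results[i] is True iff hidden test case i passed."""
    if not results:
        return 0.0
    return sum(results) / len(results)

def math_reward(correct: bool) -> float:
    """Binary reward for the math variant: 1 if verified correct, else 0."""
    return 1.0 if correct else 0.0
```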
We do not use LLM graders for this environment.
Problems are sourced from the Skywork/Skywork-OR1-RL-Data HuggingFace dataset. Math problems originate from NuminaMath-1.5, DeepScaler, and STILL collections. Code problems include competitive programming challenges. The dataset includes model-aware difficulty scores from DeepSeek-R1-Distill variants (1.5B, 7B, 32B).
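Model-aware difficulty scores make curriculum-style filtering straightforward. A minimal sketch, assuming a hypothetical per-task "difficulty" mapping from model name to measured pass rate (the real dataset's field names may differ):

```python
def filter_by_difficulty(tasks: list[dict], model: str,
                         lo: float = 0.1, hi: float = 0.9) -> list[dict]:
    """Keep tasks the given model solves sometimes but not always.

    Assumes each task record carries a hypothetical 'difficulty' dict
    mapping a model name to its pass rate in [0, 1].
    """
    return [t for t in tasks if lo <= t["difficulty"].get(model, 0.0) <= hi]
```

Dropping problems that are trivially solved or never solved keeps the training signal informative, which is the stated purpose of the model-aware scores.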
Math variant (1 tool):
- answer: Submit a final answer for rule-based verification

Code variant (10 tools):
- CLI tools: bash, glob, grep, ls, read, write, edit, multi_edit, todo_write
- submit: Submit a solution file for execution against hidden test cases
The math variant is single-turn (one tool call per task). The code variant is multi-step, allowing iterative development and testing before final submission.
The code variant requires an OpenReward API key for sandbox provisioning. The math variant has no additional requirements.
Agents only solve mathematical problems or run code inside a sandbox; no real-world systems are affected. Code execution is sandboxed with resource limits (0.5 CPU, 1 GB RAM) and per-test-case timeouts (10 seconds).
@article{he2025skywork,
title={Skywork Open Reasoner 1 Technical Report},
author={Jujie He and Jiacai Liu and Chris Yuhao Liu and Rui Yan and Chaojie Wang and Peng Cheng and Xiaoyu Zhang and Fuxiang Zhang and Jiacheng Xu and Wei Shen and Siyuan Li and Liang Zeng and Tianwen Wei and Cheng Cheng and Bo An and Yang Liu and Yahui Zhou},
journal={arXiv preprint arXiv:2505.22312},
year={2025}
}