NaturePanelForge

Forge Nature-level scientific panels into executable plotting code.
Retrieve open-access papers, split full figures into reviewed panels, classify them with Qwen, and reproduce statistical panels with Codex agent loops.

Figure: NaturePanelForge input-output schematic. A target scientific panel and optional metadata are converted into runnable Python plotting code.

NaturePanelForge is a code-first workflow for turning scientific figure images and open-access Nature-family papers into panel-level, executable plotting-code reconstruction tasks. The repository does not depend on the hosted gallery demo.

For the project introduction, methods, current dataset counts, Qwen score distributions, and Codex refine complexity distribution, see Introduction and Methods.

Quick Start Demo

The quick start has two simple paths:

Mode 1. Direct Code Mode

Run the checked-in plotting code directly. This is the fastest path: no Codex loop and no skill installation.

Figure: Direct Code Mode target (left) and reproduction (right) of a GO enrichment bubble plot panel.

One command:

bash scripts/run_quick_start_demo.sh

It reruns the checked-in script and regenerates its PNG and PDF outputs:

docs/demo/quick_start_bubble_plot/reproduce_panel.py
docs/demo/quick_start_bubble_plot/reproduce_panel.png
docs/demo/quick_start_bubble_plot/reproduce_panel.pdf

Mode 2. Install Codex Skill Mode

Install the bundled Codex skill, then paste the full System Prompt inside Codex. Codex uses codex-panel-reproduce to read the target image, write editable Python/matplotlib code, render the reproduction image and PDF, and save review files.

Figure: Codex Skill Mode target (left) and reproduction (right) of a GO enrichment bubble plot panel.

Install Skill Prompt:

Install the NaturePanelForge Codex skill for me: https://github.com/littlepeachs/NaturePanelForge

Codex will read the repository, install the bundled codex-panel-reproduce skill, and verify that it is available locally.

English System Prompt:

Please use the installed codex-panel-reproduce skill to reproduce this scientific paper panel as editable Python/matplotlib code.

Target image: docs/demo/quick_start_skill_bubble_plot/target.png
Optional PDF: docs/demo/quick_start_skill_bubble_plot/target.pdf
Output root: UserRuns/my_skill_test
Panel id: my_skill_test
Chart type: bubble_plot
Caption: A faceted GO enrichment bubble plot with Human and Mouse columns, biological process labels on the left, x axis as -log10(p.value), bubble size encoding log10(count), and colors encoding biological groups.

Requirements:
1. Use the codex-panel-reproduce skill.
2. Do not use Qwen scoring; this is a local user-supplied image.
3. Do not manually modify images; generate executable plotting code only.
4. Run the NaturePanelForge single-panel reproduction workflow.
5. Generate reproduce_panel.py, reproduce_panel.png, and reproduce_panel.pdf.
6. Generate review notes, review summary, and run log.
7. Finally report the output directory, review_passed, contract_passed, final PNG size, and rerender command.

Chinese System Prompt:

A Chinese-language version of the System Prompt is also available. It carries the same fields as the English version (target image, optional PDF, output root, panel id, chart type, and caption) and the same seven requirements.

The target image is the input, and the reproduction image is the expected editable-code output. For the full prompts saved as files, see the English prompt and the Chinese (中文) prompt.

Offline rerender for the checked-in Skill Mode result:

DEMO_CASE=quick_start_skill_bubble_plot bash scripts/run_quick_start_demo.sh

To use your own data in the same visual style, edit the data arrays and labels in one of the demo reproduce_panel.py scripts, then rerun it. If your goal is specifically “take a reference plot style and redraw my new data in that style,” FigMirror is the more product-like path; NaturePanelForge focuses on building scientific panel-to-code benchmark examples from real papers.
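
As a rough illustration of what "edit the data arrays and labels" can look like, the sketch below draws a small two-facet bubble plot with matplotlib. It is not the checked-in reproduce_panel.py: the term names, values, colors, bubble-size scaling, and output file stem are all made-up placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical data: replace these arrays and labels with your own values.
terms = ["term A", "term B", "term C"]
facets = {
    "Human": {"neg_log10_p": [4.2, 3.1, 2.5], "log10_count": [1.8, 1.2, 0.9]},
    "Mouse": {"neg_log10_p": [3.6, 2.8, 2.2], "log10_count": [1.5, 1.1, 1.0]},
}
group_colors = ["#4C72B0", "#DD8452", "#55A868"]  # one placeholder color per term


def draw_bubble_panel(out_stem="my_bubble_panel"):
    """Draw one column per facet and save the figure as PNG and PDF."""
    fig, axes = plt.subplots(1, len(facets), figsize=(6, 2.5), sharey=True)
    y = list(range(len(terms)))
    for ax, (name, data) in zip(axes, facets.items()):
        sizes = [60 * v for v in data["log10_count"]]  # arbitrary area scaling
        ax.scatter(data["neg_log10_p"], y, s=sizes, c=group_colors)
        ax.set_title(name)
        ax.set_xlabel("-log10(p.value)")
        ax.margins(x=0.2, y=0.3)
    axes[0].set_yticks(y)
    axes[0].set_yticklabels(terms)
    axes[0].invert_yaxis()  # first term at the top
    fig.tight_layout()
    outputs = [f"{out_stem}.png", f"{out_stem}.pdf"]
    for path in outputs:
        fig.savefig(path, dpi=300)
    plt.close(fig)
    return outputs


if __name__ == "__main__":
    print(draw_bubble_panel())
```

Keeping the data in plain lists at the top of the file means a new dataset only touches those lines, while the drawing function stays unchanged.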

Agent Workflow

NaturePanelForge uses three executable agent stages. Each stage writes machine-checkable artifacts and can be resumed.

Figure: NaturePanelForge complete workflow.

Panel Split converts a compound full figure into complete panel crops. A split agent writes executable crop/spec logic, and a review agent checks panel letters, axis labels, ticks, legends, colorbars, titles, annotations, and edge visibility.

Code Reproduce turns a target panel into reproduce_panel.py, reproduce_panel.png, and reproduce_panel.pdf. A code-writing agent renders the plot, and a review agent compares the output against target.png; fixable issues trigger code edits and rerendering.

Final Refine starts from first-pass reproductions that already passed review. A polish agent edits the existing code, while an audit agent checks Arial typography, label/tick/legend overlap, scientific symbols, edge clipping, compactness, complexity score, and caption-based description.

Figure: Agent loops for high-fidelity panel-to-code reproduction.

This loop is the core mechanism for high-quality panel-to-code reproduction: each stage separates execution from review, records artifacts on disk, and iterates until the panel split, executable reproduction, or final refine result passes the corresponding audit.
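
The execute-then-review pattern described above can be sketched in a few lines of Python. This is a generic illustration, not the repository's implementation: Review, run_stage, and the toy callbacks are hypothetical names, and the default of four rounds is only borrowed from the --review-rounds 4 value used in the Usage commands.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Review:
    """Hypothetical review verdict: pass/fail plus a list of fixable issues."""
    passed: bool
    issues: List[str] = field(default_factory=list)


def run_stage(execute: Callable[[List[str]], str],
              review: Callable[[str], Review],
              max_rounds: int = 4):
    """Alternate execute and review until the review passes or rounds run out."""
    issues: List[str] = []
    history = []
    for round_index in range(1, max_rounds + 1):
        artifact = execute(issues)   # e.g. write code and render the panel
        verdict = review(artifact)   # e.g. compare the render with the target
        history.append((round_index, artifact, verdict))
        if verdict.passed:
            return artifact, history
        issues = verdict.issues      # feed fixable issues into the next round
    return None, history             # no round passed the review


if __name__ == "__main__":
    def toy_execute(issues):
        return "draft+legend" if issues else "draft"

    def toy_review(artifact):
        if "legend" in artifact:
            return Review(True)
        return Review(False, ["missing legend"])

    artifact, history = run_stage(toy_execute, toy_review)
    print(artifact, len(history))
```

Returning the full history alongside the final artifact mirrors the idea of recording every round on disk, so a failed stage can be inspected or resumed.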

Codex Reproduction Examples

Target panels from real paper figures are shown beside Codex-rendered outputs. Each reproduction is generated from executable plotting code, not manual image editing.

Figure: Target panels beside Codex reproductions.

Usage

Use forge.py for the four public workflows. Each mode can be started with one command.

  1. Single cropped panel image -> plotting code
python3 forge.py single-panel-image --image /path/to/target_panel.png --panel-id demo_panel --chart-type bar --caption "A grouped bar chart with error bars and a legend." --out-root UserRuns/panel_demo --model gpt-5.4 --reasoning-effort medium --review-rounds 4 --skip-existing
  2. Single full figure image -> reviewed panel crops
python3 forge.py single-full-image --image /path/to/full_figure.png --paper-id demo_paper --caption "A complete multi-panel scientific figure." --out-root UserRuns/full_demo --model gpt-5.4 --reasoning-effort medium --review-rounds 4 --skip-existing
  3. Single paper -> paper metadata and full figures
python3 forge.py single-paper --doi 10.1038/s41467-025-12345-6 --subject biology --topic AI_biology --figures-per-paper 5 --download-only
  4. Batched papers -> full paper-to-panel-to-code workflow
python3 forge.py batched-paper --subject materials --topic AI_materials --target-papers 20 --batch-size 20 --figures-per-paper 5 --years 2024,2025,2026 --codex-model gpt-5.4 --codex-jobs 8
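
When the single-panel command is launched from a script, it may be easier to assemble it as an argument list than as one long string. The helper below is a hypothetical convenience wrapper, not part of the repository; it only uses the flags shown in the first command above and does not run anything by itself.

```python
import shlex


def single_panel_command(image, panel_id, chart_type, caption, out_root,
                         model="gpt-5.4", reasoning_effort="medium",
                         review_rounds=4, skip_existing=True):
    """Build the single-panel-image command as a list of arguments."""
    cmd = [
        "python3", "forge.py", "single-panel-image",
        "--image", str(image),
        "--panel-id", panel_id,
        "--chart-type", chart_type,
        "--caption", caption,
        "--out-root", str(out_root),
        "--model", model,
        "--reasoning-effort", reasoning_effort,
        "--review-rounds", str(review_rounds),
    ]
    if skip_existing:
        cmd.append("--skip-existing")
    return cmd


if __name__ == "__main__":
    cmd = single_panel_command(
        "/path/to/target_panel.png", "demo_panel", "bar",
        "A grouped bar chart with error bars and a legend.",
        "UserRuns/panel_demo",
    )
    # Print a copy-pasteable shell line; quoting is handled by shlex.
    print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) from the repository root would start the run; a list avoids quoting problems with captions that contain spaces or punctuation.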

The repository root intentionally keeps only one Python entry point, forge.py. Internal pipeline modules live under nature_panel_forge/, while stage wrappers live under scripts/.

single-panel-image is the direct user-facing image-to-code path. A live run returns:

target.png
metadata.json
qwen_score.json
reproduce_panel.py
reproduce_panel.png
reproduce_panel.pdf
user_reproduce_summary.json
result.json

user_reproduce_summary.json and result.json include the generated code text, output paths, review status, missing-output checks, and whether the live contract passed. Dry runs intentionally set live_contract_checked=false and contract_passed=false.
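
A quick post-run check can confirm that the listed files exist and report the contract flags. The sketch below is an illustration only: inspect_run is a hypothetical helper, and it assumes that live_contract_checked and contract_passed are top-level keys in result.json, which the real file layout may not match.

```python
import json
from pathlib import Path

# File names taken from the output list above.
EXPECTED_OUTPUTS = [
    "target.png", "metadata.json", "qwen_score.json",
    "reproduce_panel.py", "reproduce_panel.png", "reproduce_panel.pdf",
    "user_reproduce_summary.json", "result.json",
]


def inspect_run(run_dir):
    """Report missing output files and the contract flags of one run."""
    run_dir = Path(run_dir)
    missing = [name for name in EXPECTED_OUTPUTS
               if not (run_dir / name).is_file()]
    result = {}
    result_path = run_dir / "result.json"
    if result_path.is_file():
        result = json.loads(result_path.read_text(encoding="utf-8"))
    return {
        "missing": missing,
        # Assumed top-level keys; absent keys are treated as False.
        "live_contract_checked": bool(result.get("live_contract_checked", False)),
        "contract_passed": bool(result.get("contract_passed", False)),
    }


if __name__ == "__main__":
    print(inspect_run("UserRuns/panel_demo"))
```

For a dry run this would be expected to report both flags as false, in line with the behavior described above.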

Setup

cd NaturePanelForge
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp configs/demo.env.example .env
source .env

Required external pieces:

  • network access for paper and figure download
  • Codex CLI for panel splitting, reproduction, and refine
  • local Qwen vision model for panel classification/scoring

Check Codex:

codex --version

Set Qwen:

export QWEN_MODEL_PATH=/path/to/Qwen3.6-27B
export CUDA_VISIBLE_DEVICES=0

By default, Qwen is loaded locally with transformers (--qwen-backend transformers). --qwen-backend openai is an optional compatibility hook for users who deliberately run an OpenAI-compatible vision endpoint.
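
Before starting a long run, it can help to verify the external pieces listed above. The following preflight sketch is a hypothetical helper, not a script shipped with the repository; it assumes that QWEN_MODEL_PATH points to a local directory, which is typical for transformers checkpoints but not confirmed here.

```python
import os
import shutil


def preflight(env=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    # Look for the Codex CLI on the PATH of the given environment.
    if shutil.which("codex", path=env.get("PATH")) is None:
        problems.append("codex CLI not found on PATH")
    model_path = env.get("QWEN_MODEL_PATH", "")
    if not model_path:
        problems.append("QWEN_MODEL_PATH is not set")
    elif not os.path.isdir(model_path):
        # Assumption: the local model is stored as a directory.
        problems.append(f"QWEN_MODEL_PATH is not a directory: {model_path}")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("preflight:", problem)
```

Network access is not checked here, since a meaningful test depends on which paper sources a run will contact.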

Install The Codex Skill

The easiest way is to let Codex install it from this GitHub repository. Already inside Codex? Paste this prompt:

Install Skill Prompt:

Install the NaturePanelForge Codex skill for me: https://github.com/littlepeachs/NaturePanelForge

Codex should clone or open the repository, install skills/codex-panel-reproduce/SKILL.md into the local Codex skills directory, and verify that this file exists:

${CODEX_HOME:-$HOME/.codex}/skills/codex-panel-reproduce/SKILL.md
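
The same check can be scripted. The sketch below resolves the path with the shell's ${VAR:-default} semantics, where the fallback applies when CODEX_HOME is unset or empty; skill_file is a hypothetical helper name.

```python
import os
from pathlib import Path


def skill_file(env=None):
    """Resolve the expected SKILL.md location for the installed skill."""
    env = os.environ if env is None else env
    home = env.get("HOME") or str(Path.home())
    # ${CODEX_HOME:-$HOME/.codex}: fall back when unset or empty.
    codex_home = env.get("CODEX_HOME") or str(Path(home) / ".codex")
    return Path(codex_home) / "skills" / "codex-panel-reproduce" / "SKILL.md"


if __name__ == "__main__":
    path = skill_file()
    status = "found" if path.is_file() else "missing"
    print(f"{path}: {status}")
```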

After installation, use the full System Prompt from Quick Start Mode 2.

Local Codex can then follow the single-panel reproduction/refine workflow without re-learning the prompt structure from scratch.

Repository Layout

forge.py                            # single public Python CLI entry point
nature_panel_forge/                 # internal paper, figure, export, refine, and image-to-code modules
agent_loop/                         # Codex panel splitting, manifest building, Qwen scoring
examples/                           # Codex reproduction and final-refine batch drivers
scripts/                            # portable stage wrappers and skill installer
gallery/                            # static gallery shell and catalog builder
skills/                             # local Codex skill packages
configs/                            # environment templates
docs/                               # architecture, methods, and visual assets
prompts/                            # agent prompt blueprints

Generated run directories look like:

PipelineRuns/<subject>/<topic>/run_YYYYMMDD_HHMMSS_batch001/
  Papers/
  FullFigures/full_figures.csv
  Panels_codex_full/
  PanelReviews_codex_full/
  QwenPanelScore/
  Final_Schematic/
  Reproduce_Statistical/
  Reproduce_Statistical_Reviews/
  Reproduce_Statistical_Refined/
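
For scripts that walk many runs, the directory name and its parents carry enough information to index a run. The sketch below is a hypothetical helper based only on the layout shown above; it assumes the batch suffix is numeric and that a run may contain only some of the stage directories.

```python
import re
from datetime import datetime
from pathlib import Path

# Pattern derived from run_YYYYMMDD_HHMMSS_batch001.
RUN_NAME = re.compile(r"^run_(\d{8})_(\d{6})_batch(\d+)$")

STAGE_DIRS = [
    "Papers", "FullFigures", "Panels_codex_full", "PanelReviews_codex_full",
    "QwenPanelScore", "Final_Schematic", "Reproduce_Statistical",
    "Reproduce_Statistical_Reviews", "Reproduce_Statistical_Refined",
]


def describe_run(run_dir):
    """Parse subject, topic, start time, and batch index from a run path."""
    run_dir = Path(run_dir)
    match = RUN_NAME.match(run_dir.name)
    if match is None:
        raise ValueError(f"not a run directory name: {run_dir.name}")
    started = datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M%S")
    return {
        "subject": run_dir.parent.parent.name,
        "topic": run_dir.parent.name,
        "started": started,
        "batch": int(match.group(3)),
        # Stage directories that exist on disk so far.
        "present": [d for d in STAGE_DIRS if (run_dir / d).is_dir()],
    }


if __name__ == "__main__":
    for run in sorted(Path("PipelineRuns").glob("*/*/run_*")):
        print(describe_run(run))
```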

Citation

If you use this workflow in a paper or benchmark, cite the project repository and describe the exact paper sources, model versions, scoring thresholds, and Codex review-round settings used in your run.
