This repository contains the public artifacts for:
Learning Beyond Gradients
Published article:
Artifact repository:
The article is bilingual. The rendered HTML defaults to English and includes a switcher to the Chinese version.
- learning-beyond-gradient.en.md: English article source.
- learning-beyond-gradient.md: Chinese article source.
- learning-beyond-gradient.html: rendered bilingual HTML.
- render_learning_beyond_gradient.py: local renderer.
The deployed article is learning-beyond-gradient.html.
From the repository root:
python3 -m http.server 8000
Then open:
http://127.0.0.1:8000/learning-beyond-gradient.html
Opening the HTML file directly also works in most browsers, but using http.server is closer to how the page is served.
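To check from a script that the page is reachable over HTTP, the same standard-library server can be started from Python. This is a minimal sketch, not part of the repository: `fetch_served_page` is a hypothetical helper name, and it binds to an OS-assigned port rather than 8000 so it does not collide with a server that is already running.

```python
import functools
import http.server
import threading
import urllib.request
from pathlib import Path


def fetch_served_page(root: Path, page: str) -> tuple[int, bytes]:
    """Serve `root` on an ephemeral localhost port and fetch `page` once.

    Hypothetical helper for illustration; returns (HTTP status, body bytes).
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(root)
    )
    # Port 0 asks the operating system for any free port.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        # An empty ProxyHandler bypasses any proxy configured in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(f"http://127.0.0.1:{port}/{page}", timeout=5) as resp:
            return resp.status, resp.read()
    finally:
        server.shutdown()
        server.server_close()
        thread.join()
```

From the repository root, `fetch_served_page(Path("."), "learning-beyond-gradient.html")` should return status 200 if the rendered file is present.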
Install the rendering dependency:
python3 -m pip install -r requirements.txt
Then run:
python3 render_learning_beyond_gradient.py
The renderer reads the English and Chinese Markdown files and rewrites learning-beyond-gradient.html in place.
The site is deployed by .github/workflows/deploy-pages.yml on every push to main.
The workflow does not publish the whole repository as the website root. It builds a small _site directory containing:
- index.html, copied from learning-beyond-gradient.html.
- .nojekyll.
- Local files referenced by the article through src or href, such as figures, videos, scripts, CSVs, and prompt files.
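The assembly described above can be approximated locally. The following is a minimal sketch under stated assumptions, not the logic in deploy-pages.yml: `LocalRefCollector` and `build_site` are hypothetical names, and the rule for what counts as a "local" reference (no URL scheme, no host, file exists inside the repository) is a guess at the intent.

```python
import shutil
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit


class LocalRefCollector(HTMLParser):
    """Collect src/href values that look like repository-relative paths."""

    def __init__(self) -> None:
        super().__init__()
        self.refs: set[str] = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name not in ("src", "href") or not value:
                continue
            parts = urlsplit(value)
            # Skip absolute URLs (https:, mailto:, data:, //host) and bare #fragments.
            if parts.scheme or parts.netloc or not parts.path:
                continue
            self.refs.add(unquote(parts.path))


def build_site(repo_root: Path, article: str, out_dir: Path) -> list[str]:
    """Assemble a small site directory; returns the referenced files copied."""
    collector = LocalRefCollector()
    collector.feed((repo_root / article).read_text(encoding="utf-8"))

    out_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(repo_root / article, out_dir / "index.html")
    (out_dir / ".nojekyll").touch()

    root = repo_root.resolve()
    copied = []
    for ref in sorted(collector.refs):
        source = (repo_root / ref).resolve()
        # Copy only existing regular files that stay inside the repository.
        if not source.is_file() or root not in source.parents:
            continue
        relative = source.relative_to(root)
        target = out_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)
        copied.append(relative.as_posix())
    return copied
```

For example, `build_site(Path("."), "learning-beyond-gradient.html", Path("_site"))` would produce a directory that can be served with http.server to preview roughly what gets published.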
The repository includes the files needed to inspect and reproduce the article's representative results:
- atari/pong/: Pong policy script.
- atari/breakout/: Breakout policy, trial summaries, sample-efficiency figure, and checkpoint videos.
- atari/montezuma/: Montezuma exploratory policies, state/archive search scripts, summaries, probe images, and replay artifacts.
- atari/atari57/: Atari57 aggregate/per-game figures, CSV summaries, and the batch prompt template used for unattended Codex CLI runs.
- mujoco/ant/: Ant policy, minimal extracted Ant policy, trial summaries, MuJoCo XML, sample-efficiency figure, and final-policy video.
- mujoco/halfcheetah/: HalfCheetah policy script, iteration log, and sample-efficiency figure.
- vizdoom/: D1/D3 VizDoom heuristic scripts plus 35fps 10-seed render videos.
The article appendix contains reproduction commands for several representative results. Those commands assume they are run from the repository root after cloning this repo.
- HL-ImageNet explores Heuristic Learning in symbolic visual classification, using a non-neural code pipeline to test how far code-based heuristic updates can go on a constrained ImageNet-style perception task. The project is intended as a perception-domain boundary case for HL: train-only optimization over symbolic code can still become a memorizer, so the central problem becomes reusable visual representation and generalization rather than fitting alone.
The experiments were written against EnvPool 1.1.1. The article commands assume the relevant Python environment already has EnvPool and the Atari/MuJoCo runtime dependencies installed.
For Ant, ant_envpool.xml stays next to heuristic_ant.py under mujoco/ant/. The reproduction command references it as:
--mujoco-xml-path mujoco/ant/ant_envpool.xml
@misc{weng2026learning_beyond_gradients,
title = {Learning Beyond Gradients},
author = {Weng, Jiayi},
year = {2026},
month = may,
howpublished = {\url{https://trinkle23897.github.io/learning-beyond-gradients/}},
note = {Blog post}
}