Reference implementation and reproducibility package for the ICML 2026 paper "CauchyNet: Compact and Data-Efficient Learning Using Holomorphic Activation Functions" by Hong-Kun Zhang, Xin Li, Sikun Yang, and Zhihong Xia.
CauchyNet is a single-hidden-layer complex-valued network whose activation is the multivariate Cauchy kernel

$$\mathscr{X}(\mathbf{z}) = \prod_{i=1}^{N} z_i^{-1}.$$

It is designed for data-scarce regimes and targets with sharp, rational-like spikes, where standard real-valued networks become width-hungry.
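As a minimal illustration of the kernel above (not the code in shared.py), the product of reciprocals can be evaluated with Python's built-in complex numbers. The function name `cauchy_kernel` and the way `eps` guards the denominator are assumptions made for this sketch.

```python
def cauchy_kernel(z, eps=0.0):
    """Evaluate prod_i 1 / z_i for a sequence of complex numbers.

    Each reciprocal is computed as conj(z_i) / (|z_i|^2 + eps); with
    eps = 0 this is exactly 1 / z_i. The eps guard is an assumption of
    this sketch, not necessarily how shared.py regularizes the denominator.
    """
    out = complex(1.0, 0.0)
    for zi in z:
        zi = complex(zi)
        squared_modulus = zi.real ** 2 + zi.imag ** 2
        out *= zi.conjugate() / (squared_modulus + eps)
    return out


# 1 / (1 + 1j) * 1 / 2 = 0.25 - 0.25j
value = cauchy_kernel([1 + 1j, 2 + 0j])
print(value)

# The magnitude grows sharply as an input approaches the pole at 0.
print(abs(cauchy_kernel([0.01])))
```

The second call shows why the kernel suits spiky targets: a single input near zero already produces a magnitude of about 100.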
Requires Python 3.9+ and PyTorch 2.0+.
pip install -r requirements.txt

A CUDA-enabled GPU is optional; all experiments here fit comfortably on CPU (the largest run is a few minutes).
Use this code_release/ directory as the repository root. The surrounding
camera-ready manuscript workspace contains legacy notebooks, paper build
artifacts, and large intermediate files that are not part of the clean code
release.
Before pushing, the lightweight integrity check is:
cd experiments
python verify_table1.py
python run_all.py 11

The full python run_all.py command retrains the released experiments and may
download sklearn tabular datasets for the UCI benchmark. Use
verify_table1.py when you only need to validate the shipped result files
against the paper.
code_release/
├── README.md
├── LICENSE
├── requirements.txt
└── experiments/
    ├── shared.py # CauchyNet, FNN, SIREN, Transformer, N-BEATS, training utils
    ├── run_all.py # Entry point: runs all experiments
    ├── verify_table1.py # Cross-check JSON results vs. Table 1 claims
    ├── exp1_epsilon_sensitivity.py # Eps sweep (Supp Exp 1)
    ├── exp2_computational_overhead.py # Wall-clock comparison (Supp Exp 2)
    ├── exp3_piecewise_affine.py # Step-ramp failure mode (Supp Exp 3)
    ├── exp4_parameter_matched.py # Param-match: 20x vs FNN (Table 1, row 1)
    ├── exp5_convergence_rate.py # Convergence rate (Supp Exp 5)
    ├── exp6_multilayer_cauchy.py # Multi-layer Skip variant (Table 1)
    ├── exp7_uci_tabular.py # UCI tabular (Table 1, row 4)
    ├── exp7_hybrid.py # FNN→Cauchy hybrid (Table 1, row 4)
    ├── exp7_hybrid2.py # Hybrid ablation
    ├── exp7_sweep.py # Hybrid hyperparameter sweep
    ├── exp8_rational_nn.py # RationalNN comparison (Table 1, row 5)
    ├── exp9_delta_sweep.py # δ-sweep: 102x vs FNN (Table 1, row 2)
    ├── exp10_sample_efficiency.py # Sample efficiency
    ├── exp11_fixed_circle.py # Fixed-pole / contour-shape ablation
    ├── exp11_fixed_pole_ridge.py # Mixed 1D fixed-pole ridge result
    ├── best_config_gap_filling.py # Final fixed-pole gap-filling config (Table 1, row 7)
    ├── sweep_epochs.py # Epoch sweep utility
    ├── sweep_params.py # Parameter sweep utility
    ├── sweep_small_n.py # Small-n sweep utility
    └── results/ # JSON + LaTeX tables of completed runs
Most numbers in Table 1 of the paper come from these scripts:
| Paper Table-1 row | Script |
|---|---|
| Param-match (20× vs FNN, n=20, near-singular) | exp4_parameter_matched.py |
| δ-sweep (up to 102× vs FNN) | exp9_delta_sweep.py |
| Multi-layer (Skip) (1.7× vs 1-layer CN) | exp6_multilayer_cauchy.py |
| UCI tabular (wins 3/3, 4× fewer params) | exp7_uci_tabular.py, exp7_hybrid.py |
| RationalNN (wins 3/3, 1.2–2.0×) | exp8_rational_nn.py |
| Supp. mixed 1D fixed-pole ridge (14.8× at n=20) | exp11_fixed_pole_ridge.py |
| Final fixed-pole gap-filling config | best_config_gap_filling.py |
| Gap-filling (MAE 0.0202, 4.3× vs SIREN) | best_config_gap_filling.py, plot_main_figure5.py |
Run everything sequentially:
cd experiments
python run_all.py # all experiments
python run_all.py 1 4 9 # only experiments 1, 4, 9

To cross-check the shipped JSON against the paper's Table 1 (no retraining, runs in <1 second):
python verify_table1.py
python verify_table1.py --tolerance 0.10 # tighten to 10%

The script prints PASS/FAIL per row and exits non-zero if any check fails.
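The kind of check described here can be sketched as a relative-tolerance comparison. This is only an illustration of the idea, assuming a relative (not absolute) tolerance; the helper names `within_tolerance` and `check_rows` and the example rows are hypothetical and are not taken from verify_table1.py.

```python
def within_tolerance(measured, claimed, tolerance):
    """Return True if measured is within a relative tolerance of claimed."""
    return abs(measured - claimed) <= tolerance * abs(claimed)


def check_rows(rows, tolerance):
    """Print PASS/FAIL for each (label, measured, claimed) row.

    Returns 0 if every row passes and 1 otherwise, mirroring a process
    exit code.
    """
    failures = 0
    for label, measured, claimed in rows:
        ok = within_tolerance(measured, claimed, tolerance)
        status = "PASS" if ok else "FAIL"
        print(f"{status}  {label}: measured={measured}, claimed={claimed}")
        if not ok:
            failures += 1
    return 1 if failures else 0


# Placeholder numbers for illustration only; they are not paper results.
example_rows = [
    ("row A", 0.95, 1.0),  # 5% off: passes at a 10% tolerance
    ("row B", 1.30, 1.0),  # 30% off: fails at a 10% tolerance
]
exit_code = check_rows(example_rows, tolerance=0.10)
```

A real command-line script would typically pass the returned code to `sys.exit`, so that a failing row makes the process exit non-zero.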
Each script writes JSON and (where applicable) LaTeX tables and PDF/PNG plots
into experiments/results/. The shipped results/ directory contains the
JSON/LaTeX outputs from our runs; figures are regenerated on each invocation.
import torch
import shared
# Synthetic regression with a sharp peak
x = torch.linspace(-1, 1, 200).unsqueeze(1)
y = (4.0 / ((x - 0.5) ** 2 + 0.01) + torch.sin(3 * x)).squeeze()
# 80/20 split
n_train = 160
x_train, x_test = x[:n_train], x[n_train:]
y_train, y_test = y[:n_train], y[n_train:]
model = shared.CauchyNet1D(hidden_size=64)
metrics = shared.train_and_eval(
model, x_train, y_train, x_test, y_test,
epochs=500, lr=1e-2, imag_penalty=0.1,
)
print(metrics)

See shared.py for the full model definition (≈40 lines) and training loop.
Most non-gap-filling scripts use the default training settings in shared.py,
which match the supplement of the paper (Table 2):
| Parameter | Value |
|---|---|
| Hidden size h | 64 |
| Batch size | 32 |
| Learning rate | 1e-2 |
| Weight decay | 1e-4 |
| Epochs | 200 |
| Imaginary penalty λ | 0.1 |
| Optimizer | Adam |
| Epsilon (denominator regularizer) | 1e-8 |
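To make the role of the imaginary penalty λ concrete, here is one plausible form of a penalized objective for a complex-valued output fitted to a real target. The exact form (mean squared error on the real part plus λ times the mean squared imaginary part) and the name `penalized_loss` are assumptions of this sketch; shared.py defines the loss actually used.

```python
def penalized_loss(outputs, targets, imag_penalty=0.1):
    """MSE on Re(output) plus imag_penalty * mean(Im(output)^2).

    ASSUMPTION: this form is illustrative only; it is not copied from
    shared.py.
    """
    n = len(outputs)
    mse = sum((o.real - t) ** 2 for o, t in zip(outputs, targets)) / n
    imag_term = sum(o.imag ** 2 for o in outputs) / n
    return mse + imag_penalty * imag_term


# Toy complex outputs and real targets (placeholder values).
outputs = [1.0 + 0.5j, 2.0 - 0.5j]
targets = [1.0, 3.0]

loss_default = penalized_loss(outputs, targets)               # λ = 0.1
loss_no_penalty = penalized_loss(outputs, targets, 0.0)       # λ = 0
print(loss_default, loss_no_penalty)
```

Under this form, setting the penalty to zero leaves only the fit to the real part, so the imaginary part of the output is unconstrained.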
The gap-filling result in Table 1 row 7 of the paper uses the final
configuration identified by a 5-stage sweep:
ellipse (r_re, r_im) = (2.5, 0.4), fixed poles, λ_imag = 0,
lr = 5e-2, h = 64, n_train = 200, and epochs = 3000.
The shipped result file reports CauchyNet mean MAE 0.0202 versus SIREN
0.0872, FNN 0.2032, and N-BEATS 0.1940. This is encoded in
best_config_gap_filling.py and plotted by plot_main_figure5.py.
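The improvement factors implied by these MAE values can be recomputed directly from the numbers quoted above; the variable names are illustrative.

```python
# Mean MAE values quoted in this README for the gap-filling experiment.
cauchynet_mae = 0.0202
baseline_mae = {"SIREN": 0.0872, "FNN": 0.2032, "N-BEATS": 0.1940}

# Ratio of each baseline's MAE to CauchyNet's MAE.
ratios = {name: mae / cauchynet_mae for name, mae in baseline_mae.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the CauchyNet MAE")
# SIREN: 4.3x the CauchyNet MAE
# FNN: 10.1x the CauchyNet MAE
# N-BEATS: 9.6x the CauchyNet MAE
```

The SIREN ratio matches the 4.3× figure listed for the gap-filling row.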
The supplementary fixed-pole mixed-target and tuned-UCI appendix tables are
backed by exp11_fixed_pole_ridge.py,
results/exp11_fixed_pole_ridge_results.json, and
results/exp7_uci_tuned_results.json. The older
results/exp11_n_sweep_focused.json file is retained as an Adam-trained
contour-sensitivity ablation, not as the reported mixed-target result.
- All scripts seed torch, numpy, and random at the top.
- Each cell in Table 1 averages 10 seeds (configurable via --seeds where exposed); we ship the per-seed numbers in results/*.json.
- Reported results were obtained on an Apple-Silicon laptop (CPU, no GPU). GPU runs reproduce the same numbers within ±1% MSE.
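A sketch of how per-seed numbers can be aggregated and how two runs can be compared against a ±1% relative criterion. The per-seed values below are hypothetical placeholders, not entries from results/*.json, and the helper names are assumptions of this example.

```python
import statistics


def summarize(per_seed_values):
    """Return (mean, sample standard deviation) over seeds."""
    return statistics.mean(per_seed_values), statistics.stdev(per_seed_values)


def relative_gap(value, reference):
    """Relative difference of value with respect to reference."""
    return abs(value - reference) / abs(reference)


# Hypothetical per-seed MSE values for 10 seeds (placeholders only).
cpu_mse = [0.010, 0.012, 0.011, 0.009, 0.010,
           0.011, 0.012, 0.010, 0.009, 0.011]
# Hypothetical second run that differs by 0.5% on every seed.
gpu_mse = [v * 1.005 for v in cpu_mse]

cpu_mean, cpu_std = summarize(cpu_mse)
gpu_mean, _ = summarize(gpu_mse)

print(f"CPU mean MSE: {cpu_mean:.4f} (std {cpu_std:.4f})")
print("within 1%:", relative_gap(gpu_mean, cpu_mean) <= 0.01)
```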
@inproceedings{zhang2026cauchynet,
title = {{CauchyNet}: Compact and Data-Efficient Learning Using Holomorphic Activation Functions},
author = {Zhang, Hong-Kun and Li, Xin and Yang, Sikun and Xia, Zhihong},
booktitle = {Proceedings of the 43rd International Conference on Machine Learning (ICML)},
year = {2026}
}

Code is released under the MIT License (see LICENSE).
Questions or issues: open a GitHub issue, or email
xiazh@gbu.edu.cn (corresponding author).