sikunyang/CauchyNet

CauchyNet: Compact and Data-Efficient Learning Using Holomorphic Activation Functions

Reference implementation and reproducibility package for the ICML 2026 paper "CauchyNet: Compact and Data-Efficient Learning Using Holomorphic Activation Functions" by Hong-Kun Zhang, Xin Li, Sikun Yang, and Zhihong Xia.

CauchyNet is a single-hidden-layer complex-valued network whose activation is the multivariate Cauchy kernel

$$\mathscr{X}(\mathbf{z}) = \prod_{i=1}^{N} z_i^{-1}.$$

It is designed for data-scarce regimes and targets with sharp, rational-like spikes, where standard real-valued networks become width-hungry.
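As a rough illustration of the kernel itself (not the code in shared.py), the product of reciprocals can be evaluated with plain Python complex numbers. The function name `cauchy_kernel` and the way `eps` enters the denominator are assumptions made for this sketch; the released model may regularize differently.

```python
import math


def cauchy_kernel(z, eps=0.0):
    """Product of reciprocals of the complex coordinates z_1, ..., z_N.

    Each factor is written as conj(z_i) / (|z_i|^2 + eps). With eps == 0 this
    is exactly 1 / z_i; with eps > 0 it stays finite at z_i == 0.
    The eps placement is an assumption of this sketch.
    """
    return math.prod(zi.conjugate() / (abs(zi) ** 2 + eps) for zi in z)


# Two-dimensional example: 1 / ((1 + 1j) * 2j)
value = cauchy_kernel([1 + 1j, 2j])
print(value)

# Near a zero coordinate the magnitude grows sharply, which is what makes
# the kernel suited to spike-like targets.
print(abs(cauchy_kernel([0.01 + 0j])))
```

The kernel is unbounded near any coordinate hyperplane z_i = 0, so a small number of units can represent a narrow peak.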

Installation

Requires Python 3.9+ and PyTorch 2.0+.

pip install -r requirements.txt

A CUDA-enabled GPU is optional; all experiments here fit comfortably on CPU (the largest run takes a few minutes).

GitHub upload notes

Use this code_release/ directory as the repository root. The surrounding camera-ready manuscript workspace contains legacy notebooks, paper build artifacts, and large intermediate files that are not part of the clean code release.

Before pushing, run the lightweight integrity check:

cd experiments
python verify_table1.py
python run_all.py 11

The full python run_all.py command retrains the released experiments and may download sklearn tabular datasets for the UCI benchmark. Use verify_table1.py when you only need to validate the shipped result files against the paper.

Repository layout

code_release/
├── README.md
├── LICENSE
├── requirements.txt
└── experiments/
    ├── shared.py                       # CauchyNet, FNN, SIREN, Transformer, N-BEATS, training utils
    ├── run_all.py                      # Entry point — runs all experiments
    ├── verify_table1.py                # Cross-check JSON results vs. Table 1 claims
    ├── exp1_epsilon_sensitivity.py     # Eps sweep (Supp Exp 1)
    ├── exp2_computational_overhead.py  # Wall-clock comparison (Supp Exp 2)
    ├── exp3_piecewise_affine.py        # Step-ramp failure mode (Supp Exp 3)
    ├── exp4_parameter_matched.py       # Param-match: 20x vs FNN (Table 1, row 1)
    ├── exp5_convergence_rate.py        # Convergence rate (Supp Exp 5)
    ├── exp6_multilayer_cauchy.py       # Multi-layer Skip variant (Table 1, row 3)
    ├── exp7_uci_tabular.py             # UCI tabular (Table 1, row 4)
    ├── exp7_hybrid.py                  # FNN→Cauchy hybrid (Table 1, row 4)
    ├── exp7_hybrid2.py                 # Hybrid ablation
    ├── exp7_sweep.py                   # Hybrid hyperparameter sweep
    ├── exp8_rational_nn.py             # RationalNN comparison (Table 1, row 5)
    ├── exp9_delta_sweep.py             # δ-sweep: 102x vs FNN (Table 1, row 2)
    ├── exp10_sample_efficiency.py      # Sample efficiency
    ├── exp11_fixed_circle.py           # Fixed-pole / contour-shape ablation
    ├── exp11_fixed_pole_ridge.py       # Mixed 1D fixed-pole ridge result
    ├── best_config_gap_filling.py      # Final fixed-pole gap-filling config (Table 1, row 7)
    ├── sweep_epochs.py                 # Epoch sweep utility
    ├── sweep_params.py                 # Parameter sweep utility
    ├── sweep_small_n.py                # Small-n sweep utility
    └── results/                        # JSON + LaTeX tables of completed runs

Reproducing the headline numbers

Most numbers in Table 1 of the paper come from these scripts:

| Paper Table-1 row | Script |
| --- | --- |
| Param-match (20× vs FNN, n=20, near-singular) | exp4_parameter_matched.py |
| δ-sweep (up to 102× vs FNN) | exp9_delta_sweep.py |
| Multi-layer (Skip) (1.7× vs 1-layer CN) | exp6_multilayer_cauchy.py |
| UCI tabular (wins 3/3, 4× fewer params) | exp7_uci_tabular.py, exp7_hybrid.py |
| RationalNN (wins 3/3, 1.2–2.0×) | exp8_rational_nn.py |
| Supp. mixed 1D fixed-pole ridge (14.8× at n=20) | exp11_fixed_pole_ridge.py |
| Final fixed-pole gap-filling config | best_config_gap_filling.py |
| Gap-filling (MAE 0.0202, 4.3× vs SIREN) | best_config_gap_filling.py, plot_main_figure5.py |

Run everything sequentially:

cd experiments
python run_all.py             # all experiments
python run_all.py 1 4 9       # only experiments 1, 4, 9

To cross-check the shipped JSON against the paper's Table 1 (no retraining, runs in <1 second):

python verify_table1.py
python verify_table1.py --tolerance 0.10   # tighten to 10%

The script prints PASS/FAIL per row and exits non-zero if any check fails.
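The PASS/FAIL behaviour described above can be sketched as a relative-tolerance comparison. This is a hypothetical outline, not the contents of verify_table1.py: the function names, the row format, and the use of relative error are all assumptions. The only numbers used are the gap-filling MAEs quoted in this README.

```python
def check_row(name, reported, claimed, tolerance):
    """Print PASS/FAIL for one row and return whether it passed."""
    rel_err = abs(reported - claimed) / abs(claimed)
    ok = rel_err <= tolerance
    status = "PASS" if ok else "FAIL"
    print(f"{status}  {name}: reported={reported:.4g} "
          f"claimed={claimed:.4g} rel_err={rel_err:.2%}")
    return ok


def verify(rows, tolerance=0.10):
    """Check every row (no short-circuit) and return a process exit code."""
    results = [check_row(name, reported, claimed, tolerance)
               for name, reported, claimed in rows]
    return 0 if all(results) else 1


# SIREN MAE / CauchyNet MAE from this README, against the claimed 4.3x.
rows = [("Gap-filling vs SIREN", 0.0872 / 0.0202, 4.3)]
exit_code = verify(rows)
# A real script would finish with sys.exit(exit_code).
```

Evaluating every row before deciding the exit code keeps the full report visible even when an early row fails.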

Each script writes JSON and (where applicable) LaTeX tables and PDF/PNG plots into experiments/results/. The shipped results/ directory contains the JSON/LaTeX outputs from our runs; figures are regenerated on each invocation.

Minimal usage

import torch
import shared  # run from experiments/ so that shared.py is importable

# Synthetic regression with a sharp peak
x = torch.linspace(-1, 1, 200).unsqueeze(1)
y = (4.0 / ((x - 0.5) ** 2 + 0.01) + torch.sin(3 * x)).squeeze()

# 80/20 contiguous split: the test set is the right-hand tail (x > 0.6)
n_train = 160
x_train, x_test = x[:n_train], x[n_train:]
y_train, y_test = y[:n_train], y[n_train:]

model = shared.CauchyNet1D(hidden_size=64)
metrics = shared.train_and_eval(
    model, x_train, y_train, x_test, y_test,
    epochs=500, lr=1e-2, imag_penalty=0.1,
)
print(metrics)

See shared.py for the full model definition (≈40 lines) and training loop.

Hyperparameters

Most non-gap-filling scripts use the default training settings in shared.py, which match Table 2 of the paper's supplement:

| Parameter | Value |
| --- | --- |
| Hidden size h | 64 |
| Batch size | 32 |
| Learning rate | 1e-2 |
| Weight decay | 1e-4 |
| Epochs | 200 |
| Imaginary penalty λ | 0.1 |
| Optimizer | Adam |
| Epsilon (denominator regularizer) | 1e-8 |
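To make the role of the imaginary penalty concrete, the sketch below wires the listed defaults into an Adam optimizer and a penalized loss. The loss form (MSE on the real part plus λ times the mean squared imaginary part) is an assumption of this sketch, and `penalized_mse`, `DEFAULTS`, and the stand-in linear module are illustrative names rather than objects from shared.py.

```python
import torch

# Values copied from the defaults listed above.
DEFAULTS = dict(hidden_size=64, batch_size=32, lr=1e-2, weight_decay=1e-4,
                epochs=200, imag_penalty=0.1, eps=1e-8)


def penalized_mse(pred, target, imag_penalty):
    """MSE on Re(pred) plus imag_penalty * mean(Im(pred)^2) (assumed form)."""
    fit = torch.mean((pred.real - target) ** 2)
    leak = torch.mean(pred.imag ** 2)
    return fit + imag_penalty * leak


# Stand-in real-valued module, used only so the optimizer call is runnable.
stand_in = torch.nn.Linear(1, 1)
optimizer = torch.optim.Adam(stand_in.parameters(),
                             lr=DEFAULTS["lr"],
                             weight_decay=DEFAULTS["weight_decay"])

# Complex predictions whose real parts match the targets exactly, so the
# loss consists of the imaginary penalty alone.
pred = torch.tensor([1.0 + 0.5j, 2.0 - 0.5j])
target = torch.tensor([1.0, 2.0])
loss = penalized_mse(pred, target, DEFAULTS["imag_penalty"])
print(loss.item())
```

Under this reading, λ trades off fit against how strongly a complex-valued output is pushed toward the real axis for a real-valued target.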

The gap-filling result in Table 1 row 7 of the paper uses the final configuration identified by a 5-stage sweep: ellipse (r_re, r_im) = (2.5, 0.4), fixed poles, λ_imag = 0, lr = 5e-2, h = 64, n_train = 200, and epochs = 3000. The shipped result file reports CauchyNet mean MAE 0.0202 versus SIREN 0.0872, FNN 0.2032, and N-BEATS 0.1940. This is encoded in best_config_gap_filling.py and plotted by plot_main_figure5.py.
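One possible reading of the ellipse setting is that the h fixed poles sit on an axis-aligned ellipse in the complex plane with semi-axes r_re and r_im. The sketch below follows that reading. Equal angular spacing, the feature form 1 / (p_k - x), and the names `ellipse_poles` and `cauchy_features` are assumptions; best_config_gap_filling.py is the authoritative source.

```python
import math


def ellipse_poles(h: int, r_re: float, r_im: float) -> list[complex]:
    """Place h poles at equally spaced angles on an ellipse (assumed layout)."""
    angles = [2 * math.pi * k / h for k in range(h)]
    return [complex(r_re * math.cos(t), r_im * math.sin(t)) for t in angles]


def cauchy_features(x: float, poles: list[complex]) -> list[complex]:
    """One feature 1 / (p_k - x) per pole.

    With the poles held fixed, only the weights of a linear readout on these
    features would remain to be trained.
    """
    return [1.0 / (p - x) for p in poles]


poles = ellipse_poles(h=64, r_re=2.5, r_im=0.4)
features = cauchy_features(0.5, poles)
print(len(poles), max(abs(f) for f in features))
```

With r_re = 2.5 the contour encloses a real input interval such as [-1, 1], so no denominator vanishes for inputs in that range.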

The supplementary fixed-pole mixed-target and tuned-UCI appendix tables are backed by exp11_fixed_pole_ridge.py, results/exp11_fixed_pole_ridge_results.json, and results/exp7_uci_tuned_results.json. The older results/exp11_n_sweep_focused.json file is retained as an Adam-trained contour-sensitivity ablation, not as the reported mixed-target result.

Reproducibility notes

  • All scripts seed torch, numpy, and random at the top.
  • Each cell in Table 1 averages 10 seeds (configurable via --seeds where exposed); we ship the per-seed numbers in results/*.json.
  • Reported results were obtained on an Apple-Silicon laptop (CPU, no GPU). GPU runs reproduce the same numbers within ±1% MSE.
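The seeding and multi-seed averaging described above might look like the following. `seed_everything` and `run_once` are hypothetical helpers written for this sketch; `run_once` returns a placeholder number rather than a real metric, and the numpy and torch imports are optional only so the snippet runs without the full requirements installed.

```python
import random
import statistics


def seed_everything(seed: int) -> None:
    """Seed random, and numpy / torch when they are installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass


def run_once(seed: int) -> float:
    """Placeholder for one training run; NOT a real experiment result."""
    seed_everything(seed)
    return random.gauss(1.0, 0.1)


# Ten seeds per table cell, reported as mean and standard deviation.
scores = [run_once(seed) for seed in range(10)]
mean, std = statistics.mean(scores), statistics.stdev(scores)
print(f"{mean:.4f} +/- {std:.4f} over {len(scores)} seeds")
```

Seeding inside each run, rather than once per process, keeps every per-seed number reproducible on its own.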

Citation

@inproceedings{zhang2026cauchynet,
  title  = {{CauchyNet}: Compact and Data-Efficient Learning Using Holomorphic Activation Functions},
  author = {Zhang, Hong-Kun and Li, Xin and Yang, Sikun and Xia, Zhihong},
  booktitle = {Proceedings of the 43rd International Conference on Machine Learning (ICML)},
  year   = {2026}
}

License

Code is released under the MIT License (see LICENSE).

Contact

Questions or issues: open a GitHub issue, or email xiazh@gbu.edu.cn (corresponding author).
