PHYGEN

phygen packages a collection of reference PDE solvers and exports their trajectories to HDF5 in a consistent time-first layout, (T, *spatial_dims, C): T time steps, then the spatial dimensions, then C channels. Configurations define which equation to solve, how to sample parameters, and how to organise the resulting data into train/validation/test splits.
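As a small illustration of the layout described above, the sketch below splits a time-first shape into its parts. It assumes that T counts time steps and C counts channels; the helper name describe_layout and the example shapes are made up for this illustration and are not part of phygen.

```python
from typing import Dict, Tuple


def describe_layout(shape: Tuple[int, ...]) -> Dict[str, object]:
    """Split a time-first shape (T, *spatial_dims, C) into named parts.

    Hypothetical helper for illustration only; not a phygen function.
    """
    if len(shape) < 3:
        raise ValueError("expected at least (T, one spatial dim, C)")
    # Star-unpacking keeps however many spatial axes sit between T and C.
    time_steps, *spatial_dims, channels = shape
    return {
        "time_steps": time_steps,
        "spatial_dims": tuple(spatial_dims),
        "channels": channels,
    }


# A 2D field with two channels and a 1D field with one channel (made-up sizes).
layout_2d = describe_layout((100, 64, 64, 2))
layout_1d = describe_layout((50, 256, 1))
print(layout_2d)
print(layout_1d)
```

The same unpacking works for any number of spatial axes, which is the point of writing the layout as *spatial_dims.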

Features

  • Config-driven solvers — pick from the bundled adapters (advection, Gray-Scott, turbulence, wave2d) or add your own via the Hydra config system.
  • Rich hyper-parameters — control spatial resolution (spatial_size, downsample), temporal extent (time_steps, time_horizon, dt_high, dt_low), and solver batch sizes directly from the equation profile.
  • Flexible parameter sampling — declare sweeps with linspace, range, choices, values, or random uniform draws (with optional seeds) to explore a grid of physical regimes.
  • Multiple dataset splits — generate train/val/test (or arbitrary) splits in one run, each with its own number of trajectories, base seed, and metadata. Selected splits can feed the automatic statistics accumulator.
  • Self-contained outputs — every run creates data/<split>/*.hdf5 files plus a matching stats.yaml, ready for loading with phyload or custom pipelines.

Installation

Clone the repository and install in editable mode:

git clone https://github.com/itsakk/PHYGEN.git
cd PHYGEN
pip install -e .

The only runtime dependencies are numpy, torch, torchdiffeq, h5py, PyYAML, tqdm, and hydra-core (see pyproject.toml).

Configuring a run

All user settings live in configs/ and are managed by Hydra.

  1. Root config (configs/main.yaml)

    • Chooses the default equation profile via the defaults list.
    • Sets global options such as the compute device, the target output_root, and the runtime split to generate (runtime.split).
  2. Equation profiles (configs/equation/*.yaml)

    • Describe solver-specific options inside generator.options (spatial resolution, time-stepping parameters, batch size, etc.).
    • Define the output block (output.root, output.dataset_name, output.compression, output.dtype).
    • Provide per-split settings, including num_trajectories, base_seed, and the parameter sweep declaration.
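To make the dotted names above concrete, here is a minimal sketch that flattens a nested settings dictionary into Hydra-style key=value overrides. The dictionary only mirrors key names mentioned in this README; the values, the exact nesting, and whether these paths resolve as-is after Hydra composes the config are all assumptions, so check the files in configs/ before relying on it.

```python
from typing import Any, Dict, Iterator, Tuple


def flatten(cfg: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted.key, value) pairs from a nested dictionary."""
    for key, value in cfg.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, dotted)
        else:
            yield dotted, value


# Hypothetical profile: key names come from the README, values are invented.
profile = {
    "generator": {"options": {"spatial_size": 64, "time_steps": 100}},
    "output": {"dataset_name": "demo", "compression": "gzip", "dtype": "float32"},
}

overrides = [f"{key}={value}" for key, value in flatten(profile)]
for item in overrides:
    print(item)
```

Each printed item has the same key=value form as the command-line overrides Hydra accepts.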

Parameter grids

Each split can specify a parameter_grid, which is expanded by phygen.config.expand_parameter_grid. Supported axis specifications:

  • linspace: [start, stop, num]
  • range: [start, stop, step]
  • choices: [v1, v2, ...]
  • values: [v1, v2, ...]
  • uniform: {low: a, high: b, num: n, seed: optional} (or the list form [low, high, num, seed])

Axes are combined in a Cartesian product, yielding one HDF5 file per parameter combination (with num_trajectories samples each). You can mix parameter_grid with an explicit parameters list for bespoke cases.
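The Cartesian-product expansion can be sketched in plain Python as follows. This is not the code behind phygen.config.expand_parameter_grid; it is a stand-alone approximation that assumes each axis is written as {kind: spec}, that range is half-open with a positive step, and that uniform draws come from Python's random module. The parameter names nu and speed are invented.

```python
import itertools
import random
from typing import Any, Dict, List


def expand_axis(spec: Dict[str, Any]) -> List[Any]:
    """Turn one axis specification, e.g. {"linspace": [0, 1, 3]}, into values."""
    [(kind, arg)] = spec.items()  # exactly one kind per axis in this sketch
    if kind == "linspace":
        start, stop, num = arg
        if num == 1:
            return [float(start)]
        step = (stop - start) / (num - 1)
        return [start + i * step for i in range(num)]
    if kind == "range":
        start, stop, step = arg
        if step <= 0:
            raise ValueError("this sketch only handles a positive step")
        values, i = [], 0
        while start + i * step < stop:  # assumption: stop is excluded
            values.append(start + i * step)
            i += 1
        return values
    if kind in ("choices", "values"):
        return list(arg)
    if kind == "uniform":
        if isinstance(arg, dict):
            low, high, num, seed = arg["low"], arg["high"], arg["num"], arg.get("seed")
        else:  # list form [low, high, num, seed], with the seed optional
            low, high, num, *rest = arg
            seed = rest[0] if rest else None
        rng = random.Random(seed)
        return [rng.uniform(low, high) for _ in range(num)]
    raise ValueError(f"unknown axis kind: {kind}")


def expand_grid(grid: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Combine all axes in a Cartesian product, one dict per combination."""
    names = list(grid)
    axes = [expand_axis(grid[name]) for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*axes)]


grid = {"nu": {"linspace": [0.0, 1.0, 3]}, "speed": {"choices": [1, 2]}}
combos = expand_grid(grid)
print(len(combos), "parameter combinations")
```

With three values on one axis and two on the other, the product has six combinations, which under the rule above would mean six HDF5 files for that split.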

Running phygen

With the configuration in place, launch generation using either the module entry point or the helper script:

# Full control via Hydra overrides (runs in-place)
python -m phygen.main equation=turbulence runtime.split=train output_root=/tmp/phygen

# Convenience wrapper (writes to ./outputs by default)
./scripts/run_phygen.sh equation=turbulence runtime.split=train

Omit runtime.split to generate every split defined in the config. Outputs are written under <output_root>/<dataset_name>/data/{train,valid,test} with the matching stats.yaml at the dataset root.
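For scripted launches, the command line and output location described above can be assembled programmatically. The sketch below only builds the argument list and the expected split directory; it does not run phygen. The helper names build_command and split_dir and the dataset name demo are hypothetical.

```python
import shlex
import sys
from pathlib import Path
from typing import List, Optional


def build_command(
    equation: str,
    split: Optional[str] = None,
    output_root: Optional[str] = None,
) -> List[str]:
    """Build the argv for `python -m phygen.main` with Hydra overrides."""
    cmd = [sys.executable, "-m", "phygen.main", f"equation={equation}"]
    if split is not None:  # leaving the split out generates every split
        cmd.append(f"runtime.split={split}")
    if output_root is not None:
        cmd.append(f"output_root={output_root}")
    return cmd


def split_dir(output_root: str, dataset_name: str, split: str) -> Path:
    """Directory where a split's HDF5 files are expected to appear."""
    return Path(output_root) / dataset_name / "data" / split


cmd = build_command("turbulence", split="train", output_root="/tmp/phygen")
print(shlex.join(cmd))
print(split_dir("/tmp/phygen", "demo", "train"))

# To actually launch the run (requires phygen to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell-quoting problems when an override value contains spaces or special characters.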

Extending phygen

  • New solvers live under src/phygen/solvers/; subclass BaseSolverAdapter and register the adapter in src/phygen/registry.py.
  • Reuse ParameterSet and SplitConfig utility classes when wiring custom configurations.
  • Use scripts/run_phygen.sh as the base for environment-specific launchers (cluster job scripts, cloud runners, etc.).
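The subclass-and-register pattern mentioned above can be illustrated with a self-contained toy. Nothing here reproduces phygen's actual BaseSolverAdapter interface or the contents of src/phygen/registry.py: the base class, the solve signature, the register decorator, and the decay example are all stand-ins invented for this sketch.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Type


class SketchAdapter(ABC):
    """Stand-in base class; NOT phygen's BaseSolverAdapter."""

    @abstractmethod
    def solve(self, params: Dict[str, float], time_steps: int) -> List[List[float]]:
        """Return a time-first trajectory, here shaped (T, C)."""


# Hypothetical registry mapping a config name to an adapter class.
SKETCH_REGISTRY: Dict[str, Type[SketchAdapter]] = {}


def register(name: str) -> Callable[[Type[SketchAdapter]], Type[SketchAdapter]]:
    """Class decorator that records an adapter under the given name."""

    def decorator(cls: Type[SketchAdapter]) -> Type[SketchAdapter]:
        if name in SKETCH_REGISTRY:
            raise KeyError(f"adapter {name!r} is already registered")
        SKETCH_REGISTRY[name] = cls
        return cls

    return decorator


@register("decay")
class DecayAdapter(SketchAdapter):
    """Toy solver: explicit Euler steps for du/dt = -rate * u at one point."""

    def solve(self, params: Dict[str, float], time_steps: int) -> List[List[float]]:
        u, dt = 1.0, 0.1
        trajectory = []
        for _ in range(time_steps):
            trajectory.append([u])  # one channel per time step
            u = u - dt * params["rate"] * u
        return trajectory


adapter = SKETCH_REGISTRY["decay"]()
traj = adapter.solve({"rate": 1.0}, time_steps=3)
print(traj)
```

Looking adapters up by name is what lets a config string select a solver; the real registration mechanism in phygen may differ.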

Happy generating!
