BayesOpt scheme, AGNI grey gas, improved test coverage#675
Conversation
…r than pickle. Sets n_workers for initial sampling from config. Removed notebook. Added docstrings. Updated tests. Accesses paths safely
…nfig. Option for disabling SPIDER log. Move timestep parameters into config
…ephyrus/agni in BO config£
|
@timlichtenberg this branch has developed substantially enough that it should be merged (ideally) next week. I will continue development iteratively after that. However, I need to tidy it up a bit before it's ready for human-review. I will let you know when this is, but hopefully before the PROTEUS meeting tomorrow (4 May). |
|
@timlichtenberg I believe that this is now ready for its first human-review. |
There was a problem hiding this comment.
Thanks for bundling this up, Harrison. The BO refactor and AGNI grey-gas plumbing read cleanly. Two issues I'd like to resolve before approving.
1. prevent_warming semantics change in F_int handling
src/proteus/interior/spider.py removes the old floor:
# REMOVED from ReadSPIDER:
if config.atmos_clim.prevent_warming:
output['F_int'] = max(1.0e-8, output['F_int'])src/proteus/interior/wrapper.py:402 replaces it with a ratchet that also covers T_surf and F_int:
if config.atmos_clim.prevent_warming and (interior_o.ic == 2):
hf_row['Phi_global'] = min(hf_row['Phi_global'], Phi_global_prev)
hf_row['T_magma'] = min(hf_row['T_magma'], T_magma_prev)
hf_row['T_surf'] = min(hf_row['T_surf'], T_surf_prev)
hf_row['F_int'] = min(hf_row['F_int'], hf_all.iloc[-1]['F_int'])These are not equivalent on F_int.
- Old: F_int floored at
+1e-8underprevent_warming(no negatives, no monotonicity requirement). - New: F_int monotonically non-increasing across iterations, including into negatives. T_surf and Phi also monotonically non-increasing.
F_int can now go negative under prevent_warming. A single noisy iteration where the interior is briefly cooler than the atmosphere produces F_int < 0, after which the ratchet pins F_int at that negative value indefinitely (since min(later, negative) = negative). The old floor explicitly prevented this. The PR description says the BL parser was "corrected to respect the prevent_warming flag" — but the new ratchet semantics are stronger than what the old SPIDER floor enforced. Was the floor's removal intentional, or an incidental consequence of moving the logic from ReadSPIDER into run_interior?
If the intent is to keep both behaviours, the simplest fix is to combine them, e.g.:
hf_row['F_int'] = max(1.0e-8, min(hf_row['F_int'], hf_all.iloc[-1]['F_int']))There is also no unit test exercising the new T_surf or F_int clamp on either interior module (tests/interior/test_timestep.py covers the timestep machinery only). At least one regression test pinning expected behaviour at the clamp boundary would help future reviewers.
2. Element-mass-ratio derivation returns 0.0 for both numerator-zero and denominator-zero
src/proteus/outgas/wrapper.py:94:
if min(hf_row[f'{e1}_kg_atm'], hf_row[f'{e2}_kg_atm']) < 1e-30:
hf_row[key] = 0.0
else:
hf_row[key] = hf_row[f'{e2}_kg_atm'] / hf_row[f'{e1}_kg_atm']Numerator-zero → 0.0 is correct (no e2, ratio is zero). Denominator-zero → 0.0 is wrong: the ratio is undefined / +inf, and a silent zero will mislead any downstream plot or BO objective that ingests these columns (the BO objective treats _atm ratios as observables). Suggest np.nan for the denominator-zero branch and 0.0 for the numerator-zero branch; the ratio symmetry is already handled by GetHelpfileKeys(). Also consider promoting the 1e-30 floor to a named constant alongside the other mass thresholds.
|
Thanks for this @timlichtenberg. Regarding point 1... I have corrected the Regarding point 2... The behaviour you've highlighted is what I had intended. It isn't a mistake, because returning X/Y element ratios=0 is probably safer than returning NaN values. Dividing by zero isn't a defined operation, and can reasonably be evaluated to =0 in this case. It works appropriately for the BO scheme. |
Bring Harrison's BayesOpt rewrite of the inference module onto this branch as the first sub-feature of #675. What this does: - Reformats the BO scheme to use logging instead of print - Switches data persistence from pickle to pandas CSV (data.csv, logs.csv, Ts.csv, init.csv replace the .pkl files) - Adds tests/inference/test_*.py with ~30 new unit tests covering BO, async BO, gen_D_init, objective, plot, utils, inference entry - Adds tools/extract_inference_truths.py for pulling truth values out of baseline PROTEUS runs (used by BO retrieval) - Removes src/proteus/inference/toy_example.ipynb - Updates docs/How-to/inference.md to match the new logging field and the .pkl->.csv output names I kept this branch's schema (planet.*, interior_struct.*) in the test files and dummy.infer.toml; Harrison's PR was opened before the schema rename and would have reverted those paths. I added variable_is_logarithmic to utils/coupler.py because inference/objective.py depends on it. The rest of coupler.py changes from #675 land in a later sub-feature. The 4 smoke tests in test_inference.py run real PROTEUS subprocesses (n_workers=2, n_steps=5) so they exceed standard pytest timeouts and are intended for nightly. Unit tier passes.
#675) Three small plot fixes that all show up when AGNI is run with the grey-gas spectral file (nband=1), but the changes are valid for any single-band or near-degenerate input. What this does: - cpl_emission.plot_emission: handle nband=1 by reading the single band directly instead of indexing ds['bandmin'][1]. Also clip flux values to >=1e-20 before log plotting and tighten the x-limit fallback when the wavelength range collapses. - cpl_visual.plot_visual: same nband=1 branch for the SW/LW/stellar arrays, and clip arrays before plotting. - cpl_orbit.plot_orbit: when the planetary eccentricity is constant, the y-range collapsed and matplotlib raised; widen ymax to at least ymin + 0.01. - utils.plot.sample_output: downgrade the "tar archive exists" message from error to warning (it's recoverable). - utils.visual.interp_spec: when the spectrum has only one wavelength, return a constant array instead of calling PchipInterpolator which requires >=2 points. I added tests/plot/test_cpl_emission.py, expanded tests/plot/test_cpl_visual.py, and added tests/utils/test_visual.py to cover the new branches.
Add unordered element-to-element mass ratios in the atmosphere to the helpfile schema, derived from the per-element atmospheric inventories computed by run_outgassing. I picked up only the element-ratio additions from #675's outgas changes here. The rest of the PR's outgas diff (Calliope solver tolerance migration, schema reverts of delivery/planet.elements, removal of the atmodeller backend) lives in later sub-features or conflicts with this branch's schema and is intentionally not taken. What this does: - GetHelpfileKeys: append "{e2}/{e1}_atm" for each unordered pair of elements in element_list (36 new columns). - outgas/wrapper.run_outgassing: fill those columns with hf_row["{e2}_kg_atm"] / hf_row["{e1}_kg_atm"], guarded by a 1e-30 kg floor on either side to avoid divide-by-zero noise. The keys are gated by `if key not in hf_row: continue` so the loop is a no-op on rows where the helpfile schema hasn't been initialised yet.
Add the grey-gas option to AGNI as an alternative to the SOCRATES spectral-file pathway. When `atmos_clim.agni.spectral_file = "greygas"`, AGNI bypasses the spectral-file copy/modification step and uses two constant opacities (one longwave, one shortwave) instead. What this does: - AtmosClim.agni: three new config fields. `spectral_file` (None by default; set to 'greygas' or a file path), `grey_opacity_lw` and `grey_opacity_sw` (m2/kg, forwarded to AGNI as kappa_grey_lw/sw). - valid_agni: handle the new `spectral_file` field. When None, fall back to the existing `atmos_clim.spectral_group` + `spectral_bands` requirement; when set, accept 'greygas' or check the path exists. - New `valid_rayleigh` validator on `atmos_clim.rayleigh` that rejects the combination with AGNI grey gas (no Rayleigh band structure to scatter) and the combination with the dummy module (subsumes the old warn_if_dummy check). - agni.py init_agni_atmos: dispatch on `spectral_file` and pass kappa_grey_lw/sw to AGNI.atmosphere.setup_b. The greygas path short-circuits the runtime spectral-file copy. I kept the rest of #675's atmos_clim refactor out of this commit: - `p_top`, `p_obs`, `spectral_group`, `spectral_bands`, `num_levels` stay on the parent AtmosClim class on this branch (shared between AGNI and JANUS). The PR moves them under each module. - The PR removes AGNI failure-mode parsing, NaN-flux/T_surf state validation, and the surf_state=mixed_layer check. All three stay. Note for grey-gas runtime use: AGNI must be at >=1.9.0 source. The local AGNI submodule HEAD has the kappa_grey_* kwargs even though its Project.toml still says 1.8.9; AGNI_MIN_VERSION stays at 1.8.0 here to avoid breaking the version check. Tests: new tests/config/test_atmos_clim_validators.py with 7 cases covering valid_rayleigh and the new spectral_file branches of valid_agni; new test in tests/atmos_clim/test_agni.py that drives init_agni_atmos through the greygas path with a fully-faked AGNI binding. Updated tests/config/test_config.py to add spectral_file to the AGNI test fixtures.
…675) Add the new dt knobs that #675 introduces, in our existing flat TimeStepParams layout (not in the nested DtAdaptive/DtProportional subclasses Harrison's PR creates, since that would require migrating every input/*.toml in this repo). What this does: - dt.scale_incr (default 1.6): scale factor on a successful adaptive step, must be >1. - dt.scale_decr (default 0.8): scale factor on a rejected adaptive step, in (0, 1). - dt.window (default 3): number of previous steps the adaptive controller looks at for comparison. - dt.maximum_rel (default 1.0): cap on dt as a fraction of current simulation time. The per-step cap becomes min(dt.maximum, dt.maximum_rel * Time); default 1.0 disables the fractional cap. Schema-only addition. The actual adaptive controller in interior_energetics/timestep.py still reads atol/rtol/propconst from the flat namespace and ignores these new fields; the timestep code change that consumes them lands with the #676 overshoot fix in the next commit. I kept the rest of #675's _params.py and _interior.py reshuffle out of this commit: - DtAdaptive / DtProportional nested classes (schema rearrangement that would break every input/*.toml using flat dt.atol/dt.rtol). - Default-value tweaks to starspec/starinst/minimum/initial/p_stop. - Removal of dt_write_rel, mushy_maximum, mushy_upper, hysteresis_iters, hysteresis_sfinc, max_growth_factor, StopSolid.freeze_volatiles. - SPIDER log_output and CALLIOPE nguess/nsolve config fields, which belong with their runner-side changes in 7g.
…rt of #675) When stop.time is the active termination criterion, dtswitch can carry the simulation past stop.time.maximum on the last step. Add a per-step cap on the candidate dt so Time + dt <= stop.time.maximum. What this does: - Cap dtmaximum at max(1, stop.time.maximum - Time) when stop.time.enabled is True. The 1 yr floor keeps a tiny end-of-run remainder from collapsing dt to zero (and then to dt.minimum on the next line). - Add dt.maximum_rel * Time as an additive allowance to dtmaximum, matching #675's behaviour for proportional/maximum methods. Default dt.maximum_rel=1.0 (no change in practice; the existing dt.maximum cap still binds for runs that don't set it). - Wire the four schema fields added in the previous commit (dt.scale_incr, dt.scale_decr, dt.window) into the adaptive controller. Defaults match the previous hardcoded constants (SFINC=1.6, SFDEC=0.8, LBAVG=3), so existing configs are unaffected but the knobs are now tunable. I kept the stiffness-aware extensions (mushy-regime cap, hysteresis counter, max_growth_factor, retry step_sf in static/initial branches) intact; #675 strips those. The PR locates timestep.py under src/proteus/interior/timestep.py. On this branch the file lives at interior_energetics/timestep.py (Phase 6 module rename), so the changes land there. Tests follow the same path. Tests: 4 new cases in tests/interior_energetics/test_timestep.py that verify the overshoot cap, the 1 yr floor, the disabled-stop.time path, and that dt.maximum_rel widens (not narrows) the cap.
… of #675) This is the physics-critical part of #675 on the interior boundary. It does three independent things that #675 ships together. Wrapper step limiters in run_interior: - The prevent_warming clamp (gated by planet.prevent_warming and interior_o.ic == 2) now also clips T_surf to the previous step and floors F_int at min(F_int, prev) with a 1e-8 lower limit. Previously only Phi_global and T_magma were clamped, leaving T_surf and a negative F_int (heat-pump artefact) to leak through. - A symmetric runaway-T warning fires for T_surf, capping any > dT_delta jump at T_surf_prev + dT_delta. - The Boundary backend uses interior_energetics.boundary.Tsurf_event_change as its dT_delta budget (instead of the SPIDER/Aragog tmagma_atol+tmagma_rtol*T formula), since the boundary-layer model does not expose those knobs. - The F_int >= 1e-8 floor moves from interior_energetics/spider.py to wrapper.run_interior so all backends share one limiter path. New schema: - Spider.log_output (default True): route SPIDER subprocess stdout/stderr to <output>/spider_recent.log; set False to discard via sp.DEVNULL. Wired in interior_energetics/spider._try_spider. - Spider.tolerance_struct (default 100 kg) and Aragog.tolerance_struct (default 100 kg): absolute mass tolerance for the secant solver in determine_interior_radius. The xtol used to be a hardcoded 1e3 kg; the new field tightens it to 100 kg per backend. - Calliope.nguess (1000) and Calliope.nsolve (3000): CALLIOPE equilibrium-solver iteration counts that used to be hardcoded inside outgas/calliope.py. Defaults match the legacy hardcoded values; existing configs see no behavioural change. determine_interior_radius solver tuning: - xtol = interior_energetics.<module>.tolerance_struct (was 1e3). - maxiter 10 -> 20. - Initial bracket widens to x1 = R_int * 1.5 (was 1.02), which lets the secant escape stiff plateaus on highly-volatile-rich planets where the previous 2% bracket flat-lined the residual. I kept the rest of #675's interior diff out of this commit: - The retry decay change (step_sf *= 0.1 -> 0.2): our branch already uses 0.3 from independent work. - The aragog.py rtol-field rename: our runner already reads config.interior_energetics.rtol, so the PR's tolerance -> tolerance_rel rename is moot on this branch. - The config.atmos_clim.prevent_warming reference: our branch reads config.planet.prevent_warming (Phase 5). Tests: 12 new tests in tests/interior_energetics/test_wrapper.py covering run_interior limiters (ic == 2 vs not, prevent_warming on vs off, boundary-module dT_delta path, T_surf runaway warning), Spider/Aragog tolerance_struct + log_output schema + wiring, and Calliope nguess/nsolve schema + validator. The 1188 existing unit tests + 7 smoke tests still pass.
…e (part of #675) _get_sufficient runs at startup to make sure every external data file the simulation needs is on disk. Before 7d, AGNI always resolved its spectral file via group + bands and queued a matching download. After 7d, AGNI can also take a custom file path or 'greygas' directly, so the group/bands download is no longer needed in those cases. What this does: - When `atmos_clim.module == 'agni'` AND `atmos_clim.agni.spectral_file` is set (any truthy value), skip the get_spfile_name_and_bands lookup and the second download_spectral_file call. The high-res Honeyside post-processing file is still always downloaded for AGNI/JANUS so end-of-run plotting tools find it. - JANUS is unchanged: it does not have a spectral_file knob and the conditional explicitly checks `module == 'agni'` before it can bypass the lookup. This is the data-side complement to the 7d AGNI grey-gas dispatch. The runner-side change (init_agni_atmos consulting spectral_file) and the config-schema change (Agni.spectral_file / grey_opacity_lw / sw) landed in 7d; the helpfile schema for element ratios landed in 7c; variable_is_logarithmic landed in 7a. The remaining items from the original 7h plan (coupler retry-ladder, time-cap) are either already covered by 7f (time.maximum overshoot fix) or do not exist on this branch (the retry ladder is module-internal). Tests: 3 new tests in tests/utils/test_data.py covering the AGNI spectral_file=set bypass, the AGNI spectral_file=None fallback path, and an anti-happy-path check that JANUS configs never enter the bypass branch.
main has one commit ahead of the merge-base: d5d80c2 "BayesOpt scheme, AGNI grey gas, improved test coverage (#675)". On this branch the same work is present as eight separate commits, ported onto the refactored interior_struct + interior_energetics layout: 86796ea Update inference module to BayesOpt rewrite (part of #675) 3cd599b Fix plot crashes when nband=1 or input has only one wavelength (part of #675) 09dd7e1 Record element mass ratios in atmosphere (part of #675) 688c721 Add AGNI grey-gas RT scheme support (closes #674, part of #675) 29a355e Add four new adaptive time-stepping fields to TimeStepParams (part of #675) b07f152 Prevent time-stepping overshoot of stop.time.maximum (closes #676, part of #675) ba7f96f Extend prevent_warming clamp + add SPIDER/CALLIOPE config knobs (part of #675) cb27022 Skip group/bands spectral-file download when AGNI uses a user-set file (part of #675) Plus the follow-up review-finding commits: e3fe7bd Fix two inference-module review findings 051ec69 Block aerosols on grey-gas RT and dummy atmos_clim at config-load time The structural collision (main patches src/proteus/interior/*; this branch renamed the directory to src/proteus/interior_energetics and ported every edit there) means a content merge would either duplicate our decomposed commits or re-introduce the old layout. -s ours records the integration without taking any content from main; the eight commits above already carry it in the correct place. File-by-file audit confirms every change in d5d80c2 is present: - The four src/proteus/interior/* edits live in src/proteus/interior _energetics/{aragog,spider,timestep,wrapper}.py. - The two tests/interior/test_*.py files live in tests/interior _energetics/test_{timestep,wrapper}.py. - src/proteus/inference/toy_example.ipynb is removed on both branches. - input/ensembles/{example.grid.toml,example.toml,example.infer.toml} and input/planets/l9859{b,c}.toml were consolidated into the config_version 3.0 schema by 4c802e0 ("Consolidate input/ to 5 files"); the supersession is intentional. - All other src/ and tests/ files match path-for-path; the BO, AGNI grey-gas, time-stepping, and outgas-wrapper changes are in place at the same paths. Net effect of this merge commit: git rev-list origin/main..HEAD drops to zero, GitHub stops showing the PR as CONFLICTING, and no file content changes (git diff HEAD^1 HEAD is empty).
BayesOpt changes
tools/for extracting truth values from baseline PROTEUS runs, for BO retrievalTimestepping changes
prevent_warmingconfig flag. This improves model stability and performance substantially, by damping oscillations.AGNI grey gas
nband=1.Moved hard-coded config parameters in PROTEUS
Validation of changes
A nice bonus from this PR is that the total PROTEUS coverage is increased by 9 percentage points, bringing it to 78.50%. We are almost at the 80% target.
Checklist
Relevant people
@timlichtenberg