Separates environment identity (env_id) from run configuration (run_id) to
allow inference environments to be reused across configuration changes. This
prevents unnecessary rebuilding of venv and squashfs images when only the
inference config YAML or steps are modified.
Changes:
src/evalml/config.py:
- Add RunConfig.ENV_FIELDS ClassVar documenting fields that determine the
inference environment (checkpoint, extra_requirements, disable_local_eccodes_definitions)
- Add RunConfig.HASH_EXCLUDE ClassVar for fields never included in hashing
(label, inference_resources)
- Export module-level constants RUN_ENV_FIELDS and RUN_HASH_EXCLUDE
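The identity contract described for config.py could look roughly like the sketch below. It uses a plain dataclass as a stand-in (the real RunConfig may be built on a different base class), and every field other than the five named in the list above is a hypothetical placeholder.

```python
from dataclasses import dataclass, field, fields
from typing import ClassVar, Optional, Tuple


@dataclass
class RunConfig:
    # Fields that determine the inference environment (names from the list above).
    ENV_FIELDS: ClassVar[Tuple[str, ...]] = (
        "checkpoint",
        "extra_requirements",
        "disable_local_eccodes_definitions",
    )
    # Fields that never take part in hashing.
    HASH_EXCLUDE: ClassVar[Tuple[str, ...]] = ("label", "inference_resources")

    checkpoint: str
    config: str = "config.yaml"  # hypothetical run-specific field
    steps: str = "0/120/6"  # hypothetical run-specific field
    extra_requirements: list = field(default_factory=list)
    disable_local_eccodes_definitions: bool = False
    label: Optional[str] = None
    inference_resources: Optional[dict] = None


# Module-level aliases, as in the change list.
RUN_ENV_FIELDS = RunConfig.ENV_FIELDS
RUN_HASH_EXCLUDE = RunConfig.HASH_EXCLUDE

# ClassVar attributes are not dataclass fields, so they never leak into the data.
field_names = {f.name for f in fields(RunConfig)}
```

A sanity check along these lines (every listed name is a real field, and the two sets do not overlap) is one way to keep the contract from drifting as fields are added.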
workflow/rules/common.smk:
- Add ENV_HASH_FIELDS and RUN_HASH_EXCLUDE constants
- Split hashing logic into two functions:
  - env_entry_hash(): hashes only environment-determining fields
  - run_specific_hash(): hashes run-specific fields (config YAML, steps)
- Refactor register_run() to compute and store both env_id and run_id in
each run config entry. Format: run_id = {env_id}/{config_hash}
- Add collect_all_envs() function and ENV_CONFIGS global dict
- Update master_hash() to hash both env and run components separately
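A minimal sketch of the two-level hashing, not the actual common.smk code. The function and constant names come from the list above; the signatures, the _short_hash helper, the digest length, and the example entry are assumptions made for illustration.

```python
import hashlib
import json

# Field names come from the change list; everything else is illustrative.
ENV_HASH_FIELDS = ("checkpoint", "extra_requirements", "disable_local_eccodes_definitions")
RUN_HASH_EXCLUDE = ("label", "inference_resources")


def _short_hash(payload: dict, length: int = 8) -> str:
    """Stable short digest of a JSON-serialisable dict (hypothetical helper)."""
    blob = json.dumps(payload, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:length]


def env_entry_hash(entry: dict) -> str:
    """Hash only the fields that determine the inference environment."""
    return _short_hash({key: entry.get(key) for key in ENV_HASH_FIELDS})


def run_specific_hash(entry: dict) -> str:
    """Hash everything that is neither environment-level nor excluded."""
    skipped = set(ENV_HASH_FIELDS) | set(RUN_HASH_EXCLUDE)
    return _short_hash({k: v for k, v in entry.items() if k not in skipped})


def register_run(entry: dict) -> dict:
    """Attach env_id and run_id, with run_id = {env_id}/{config_hash}."""
    env_id = env_entry_hash(entry)
    run_id = f"{env_id}/{run_specific_hash(entry)}"
    return {**entry, "env_id": env_id, "run_id": run_id}


base = {
    "checkpoint": "ckpt-a",  # hypothetical values throughout
    "extra_requirements": [],
    "config": "forecaster.yaml",
    "steps": "0/120/6",
    "label": "baseline",
}
original = register_run(base)
edited = register_run({**base, "steps": "0/240/6"})
relabelled = register_run({**base, "label": "renamed"})
```

In this sketch, changing steps leaves env_id untouched and only changes the second component of run_id, while changing label changes neither, which is the reuse behaviour the commit is after.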
workflow/rules/inference.smk:
- Rules using {env_id} wildcard (outputs in data/envs/{env_id}/):
  - prepare_checkpoint
  - extract_checkpoint_requirements
  - create_inference_venv
  - make_squashfs_image
- Rules using {run_id} wildcard with nested config directories:
  - prepare_inference_forecaster
  - prepare_inference_interpolator
  - execute_inference (references env via lookup)
  - create_inference_sandbox
Directory structure change:
- Environment artifacts: data/envs/{env_id}/
- Run-specific outputs: data/runs/{env_id}/{config_hash}/{init_time}/
Benefits:
- Reuses environments across config changes (no squashfs rebuild)
- Reduces disk I/O on shared filesystems
- Documents identity contract via ClassVars
- Nested directory structure clearly separates concerns
Tests:
- Add test_run_identity.py with 5 tests validating identity separation
- All existing tests pass
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
This is looking really good! Thanks! One question I have: do I understand correctly that now, although we distinguish between model envs and actual model runs, we stack both together under |
It used to be exactly this, but I changed it in c689c99. I like that, by organizing hierarchically, we see immediately which models use which envs. What advantage do you see in separating them?
Mmmh, maybe I don't fully understand it. Can you paste an example here of what it would look like?
All the configs, with the exception of |
Louis-Frey left a comment
Looks good from my side, good to go!
Fix for a regression after #122 when running the showcase workflow:

```
InputFunctionException in rule make_forecast_animation in file "/users/ned/src/evalml/workflow/rules/plot.smk", line 125:
Error:
  KeyError: '88a3'
Wildcards:
  showcase=20260401_forecasters-ich1_75e9/forecaster-233b-098c
  run_id=88a3
  init_time=202406010000
  param=T_2M
  region=globe
Traceback:
  File "/users/ned/src/evalml/workflow/rules/plot.smk", line 130, in <lambda>
  File "/users/ned/src/evalml/workflow/rules/plot.smk", line 118, in get_leadtimes (rule make_forecast_animation, line 207, /users/ned/src/evalml/workflow/rules/plot.smk)
```

Because run_id now contains "/" (format: "{env_id}/{r_hash}"), Snakemake wildcards would greedily absorb part of run_id into {showcase} when matching paths of the form results/{showcase}/{run_id}/... The fix is to constrain showcase to a single path component.
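The greedy match can be reproduced outside Snakemake with a small regex sketch. It assumes the default wildcard regex `.+` and emulates a constraint by swapping in `[^/]+` for showcase; the pattern_to_regex helper and the trailing `{init_time}/animation.gif` part of the path are hypothetical, chosen only to make the example self-contained.

```python
import re


def pattern_to_regex(pattern, constraints=None):
    """Turn a '{name}' path pattern into a regex with one named group per wildcard."""
    constraints = constraints or {}
    parts = []
    pos = 0
    for match in re.finditer(r"\{(\w+)\}", pattern):
        parts.append(re.escape(pattern[pos:match.start()]))
        name = match.group(1)
        # Unconstrained wildcards fall back to '.+', which also matches '/'.
        parts.append(f"(?P<{name}>{constraints.get(name, '.+')})")
        pos = match.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))


pattern = "results/{showcase}/{run_id}/{init_time}/animation.gif"
path = (
    "results/20260401_forecasters-ich1_75e9/"
    "forecaster-233b-098c/88a3/202406010000/animation.gif"
)

# Unconstrained: {showcase} swallows the first half of run_id.
loose = pattern_to_regex(pattern).fullmatch(path).groupdict()

# Constrained: {showcase} is limited to a single path component.
fixed = pattern_to_regex(pattern, {"showcase": r"[^/]+"}).fullmatch(path).groupdict()

print(loose["showcase"], loose["run_id"])
print(fixed["showcase"], fixed["run_id"])
```

In the unconstrained case run_id comes out as "88a3", the key from the KeyError above; with the constraint it comes out as "forecaster-233b-098c/88a3".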
Summary
Separates environment identity from run configuration to allow inference environments to be reused across configuration changes, eliminating unnecessary rebuilds of venv and squashfs images. Closes #111
Changes
- Add ENV_FIELDS and HASH_EXCLUDE ClassVars to RunConfig documenting the identity contract
- Split hashing: env_entry_hash() for environment-level changes, run_specific_hash() for configuration changes
- Refactor register_run() to compute both env_id and run_id with nested directory structure: data/runs/{env_id}/{config_hash}/
- Use the {env_id} wildcard for environment artifacts (in data/runs/{env_id}/) and {run_id} for run outputs
- Add ENV_CONFIGS global dict and collect_all_envs() function
Benefits
Testing