Bringing Unsupervised Models Back in Line With Main#647
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #647 +/- ##
==========================================
- Coverage 63.90% 63.86% -0.05%
==========================================
Files 55 55
Lines 5550 5552 +2
==========================================
- Hits 3547 3546 -1
- Misses 2003 2006 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR aligns the ImageDCAE and HyraxAutoencoderV2 models with recent HyraxQL changes and config naming conventions. The changes standardize configuration access patterns to match the conventions used in other models like HyraxAutoencoder and SimCLR.
Changes:
- Converted ImageDCAE config access from
.get()with defaults to nested dictionary structure under[model.ImageDCAE] - Added corresponding
[model.ImageDCAE]configuration section in default config file - Updated HyraxAutoencoderV2's
to_tensormethod to extract data from nested structure - Removed unused
Tensorimport and improved type hints in HyraxAutoencoderV2
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/hyrax/models/image_dcae.py | Updated config access to use model-specific nested section [model.ImageDCAE], removing fallback defaults |
| src/hyrax/models/hyrax_autoencoderv2.py | Modified to_tensor to handle nested data dict structure, removed unused import, improved type hints |
| src/hyrax/hyrax_default_config.toml | Added [model.ImageDCAE] configuration section with default values |
| data_dict : dict | ||
| The dictionary returned from our data source | ||
| """ | ||
| data_dict = data_dict["data"] |
There was a problem hiding this comment.
The direct access data_dict = data_dict["data"] will raise a KeyError if the "data" key is missing, which is less user-friendly than the RuntimeError raised by other models. This is inconsistent with the pattern used in HyraxAutoencoder (which uses data_dict.get("data", {}) for safe access) and HyraxCNN (which checks for the key and raises a descriptive RuntimeError).
Consider using one of these safer patterns:
- Use
.get()for safe access:data = data_dict.get("data", {})(like HyraxAutoencoder) - Check for key existence and raise a descriptive error:
if "data" not in data_dict: raise RuntimeError("Unable to find 'data' key in data_dict")(like HyraxCNN)
There was a problem hiding this comment.
@copilot Implement safer pattern 2 you suggested
| self.latent_dim = config["model"]["ImageDCAE"]["latent_dim"] | ||
| self.base_channel_size = config["model"]["ImageDCAE"]["base_channel_size"] |
There was a problem hiding this comment.
Changing from .get() with default values to direct dictionary access removes the fallback behavior. While this aligns with the convention used in other models (HyraxAutoencoder, HyraxAutoencoderV2, SimCLR), it's a breaking change for users who have existing configurations that don't include the [model.ImageDCAE] section.
Consider documenting this breaking change in the PR description or release notes, or adding a migration note for users upgrading from previous versions.
There was a problem hiding this comment.
Will ignore this to have consistency across all the models.
Click here to view all benchmarks. |
gitosaurus
left a comment
There was a problem hiding this comment.
Looks OK to me, but I don't know much!
|
@aritraghsh09 I've opened a new pull request, #650, to work on those changes. Once the pull request is ready, I'll request review from you. |
Co-authored-by: aritraghsh09 <6251857+aritraghsh09@users.noreply.github.com>
Implement safer pattern for data key access in HyraxAutoencoderV2
A collection of different changes to get our unsupervised models in line with