Standardize prepare_inputs docstrings across built-in models#877
Standardize prepare_inputs docstrings across built-in models#877
Conversation
Agent-Logs-Url: https://github.com/lincc-frameworks/hyrax/sessions/ef16e0c4-3a0a-49d5-a72a-54644e0fe857 Co-authored-by: drewoldag <47493171+drewoldag@users.noreply.github.com>
|
@copilot since these models also act as examples for users, please also include a line in the docstring mentioning that the static method is meant to be modified when the user wants to experiment with different inputs. The method is the interface between the data and the model. It exists to change the shape of the collated batch of data into the shape that the model expects. |
Agent-Logs-Url: https://github.com/lincc-frameworks/hyrax/sessions/e9e67629-dc20-4c8d-a5fc-616d91c5aede Co-authored-by: drewoldag <47493171+drewoldag@users.noreply.github.com>
Done in b95bc5f. Each
|
drewoldag
left a comment
There was a problem hiding this comment.
The updates to the docstrings look good to me. No other models implement their own prepare_inputs.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #877 +/- ##
=======================================
Coverage 66.54% 66.54%
=======================================
Files 62 62
Lines 6513 6513
=======================================
Hits 4334 4334
Misses 2179 2179 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
Standardizes the documentation for built-in prepare_inputs methods to consistently describe how models extract NumPy arrays from the collated batch dict, and clarifies that Hyrax handles numpy→tensor conversion/device placement.
Changes:
- Replaced inconsistent/inaccurate
prepare_inputsdocstrings across built-in models with a uniform NumPy-style structure. - Added a missing docstring for
default_prepare_inputsin the model registry. - Corrected prior misleading wording implying tensor outputs where the implementations return NumPy arrays.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
src/hyrax/models/model_registry.py |
Adds a standardized docstring for default_prepare_inputs. |
src/hyrax/models/image_dcae.py |
Updates prepare_inputs docstring to correctly describe extracting a NumPy image array. |
src/hyrax/models/hyrax_cnn.py |
Replaces informal/odd phrasing with standardized prepare_inputs docstring. |
src/hyrax/models/hyrax_autoencoderv2.py |
Updates prepare_inputs docstring (but leaves a return type hint mismatch). |
src/hyrax/models/hyrax_autoencoder.py |
Updates prepare_inputs docstring (but leaves a return type hint mismatch). |
| @@ -204,12 +204,25 @@ def infer_batch(self, batch): | |||
|
|
|||
| @staticmethod | |||
| def prepare_inputs(data_dict) -> tuple: | |||
There was a problem hiding this comment.
prepare_inputs is annotated as returning tuple, but the implementation returns a single image array. This conflicts with the updated docstring (Returns: image : numpy.ndarray) and can mislead type checkers and callers. Update the return annotation to an NDArray type (or remove it) to match the actual return value.
| def prepare_inputs(data_dict) -> tuple: | |
| def prepare_inputs(data_dict): |
| @@ -269,12 +269,25 @@ def infer_batch(self, batch): | |||
|
|
|||
| @staticmethod | |||
| def prepare_inputs(data_dict) -> tuple: | |||
There was a problem hiding this comment.
prepare_inputs is annotated as returning tuple, but it returns a single image array. This is inconsistent with the new docstring (Returns: image : numpy.ndarray) and reduces the usefulness of type hints. Align the return annotation with the actual return type (e.g., an NDArray) or drop the annotation.
| def prepare_inputs(data_dict) -> tuple: | |
| def prepare_inputs(data_dict): |
There was a problem hiding this comment.
In future we may want a way to check that a benchmark works in github CI without comitting to main; however, I think the trade-off here (posting to slack) is reasonable given that we don't add benchmarks that often.
Ugh typed in wrong box. Reviewing this now
Change Description
Built-in
prepare_inputsmethods had inconsistent, inaccurate docstrings — some said "Does NOT convert to PyTorch tensors" (odd phrasing post-rename), others said "Convert structured data to tensor format" (wrong: these return NumPy arrays), and one had no docstring at all.Solution Description
All five
prepare_inputsimplementations now share a uniform NumPy-style docstring structure with:prepare_inputsis the interface between the data pipeline and the model, and that users should override it on their model class to reshape or select fields from the collated batch to match the inputs their model expects.ParametersandReturnssections following the NumPy docstring convention used elsewhere in the codebase.Files updated:
hyrax_cnn.py— replaced informal "Does NOT convert to PyTorch Tensors..." with uniform docstring.image_dcae.py— fixed factually incorrect "Convert structured data to tensor format." (returns ndarray, not tensor).hyrax_autoencoderv2.py/hyrax_autoencoder.py— replaced "converts structured data to the input tensor we need to run" with accurate docstring.model_registry.py(default_prepare_inputs) — added missing docstring.Code Quality