Skip to content

Add HDF5 output function#49

Merged
lispandfound merged 11 commits intomainfrom
srf_to_hdf5
Jul 22, 2025
Merged

Add HDF5 output function#49
lispandfound merged 11 commits intomainfrom
srf_to_hdf5

Conversation

@lispandfound
Copy link
Copy Markdown
Contributor

@lispandfound lispandfound commented Jul 14, 2025

Adds methods to read/write an SRF to HDF5. Is intended to replace my C code srf2hdf5 because the SRF parsing is actually faster in the python version thanks to the highly optimised Rust code.

Because the SRF parser completely parses the SRF you get, in addition to what the C code used to parse, all the headers, and the full slip array. And of course the new code is tested where the old code was not.

@lispandfound lispandfound requested a review from Copilot July 14, 2025 04:37
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @lispandfound, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the SrfFile class by introducing robust HDF5 input/output capabilities, which are intended to replace existing C code for improved performance. It also refactors the traditional SRF file I/O operations into class methods, leading to a more cohesive and maintainable codebase.

Highlights

  • HDF5 Integration: Added write_hdf5 and from_hdf5 methods to the SrfFile class, enabling efficient storage and retrieval of SRF data in HDF5 format. This leverages xarray for structured data handling and applies zlib compression to relevant sparse array components.
  • Code Refactoring: Encapsulated the existing SRF file read/write logic by moving it into SrfFile class methods (from_file and write_srf). This improves the object-oriented design and maintainability by centralizing file I/O within the SrfFile class.
  • Data Conversion Utility: Introduced a to_xarray method within SrfFile to facilitate seamless conversion of SRF data (header, points, and sparse slip array) into an xarray.Dataset. This is a crucial intermediate step for the HDF5 serialization process.
  • Documentation & Type Hinting: Updated docstrings for SrfFile attributes (e.g., correcting elot to elat and simplifying slip component descriptions) and added Self type hinting for improved code clarity and robustness.
  • Test Coverage: A new comprehensive test (test_hdf5_read_write) was added to validate the integrity of SRF data after a roundtrip through HDF5 serialization and deserialization, ensuring that all key attributes (version, header, points, and the sparse slipt1_array) are preserved.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds HDF5 serialization support for SRFFile by introducing xarray-based export/import and refactors existing SRF read/write entrypoints.

  • Introduce to_xarray, write_hdf5, and from_hdf5 methods for HDF5 I/O.
  • Add from_file classmethod and update top‐level read_srf/write_srf to delegate.
  • Add test_hdf5_read_write to validate HDF5 round-trip integrity.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
tests/test_srf.py Added HDF5 round-trip test test_hdf5_read_write
source_modelling/srf.py Added to_xarray, write_hdf5, from_hdf5, and refactored read_srf/write_srf
Comments suppressed due to low confidence (1)

source_modelling/srf.py:317

  • [nitpick] For symmetry with write_hdf5, consider renaming from_hdf5 to read_hdf5 (or alias it) so the API follows a consistent read_…/write_… pattern.
    def from_hdf5(cls, hdf5_ffp: Path) -> Self:

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces HDF5 support for SRF files. I've provided feedback on the new HDF5 implementation to enhance correctness and maintainability, specifically around hardcoded column names and SRF versioning.

Copy link
Copy Markdown
Contributor

@joelridden joelridden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should test the to_xarray SRF output function as well? Also not sure if the write_srf is tested either, a full loop for these would be nice

@lispandfound
Copy link
Copy Markdown
Contributor Author

@joelridden both are tested, to_xarray is indirectly tested with the write_hdf5 method. Similarly, the write_srf method is tested by writing and reading back the same file to check the contents don't change. Can add simple unit tests to check both, however.

@joelridden
Copy link
Copy Markdown
Contributor

@joelridden both are tested, to_xarray is indirectly tested with the write_hdf5 method. Similarly, the write_srf method is tested by writing and reading back the same file to check the contents don't change. Can add simple unit tests to check both, however.

Ah I see, that's all good. Indirectly works fine

@lispandfound lispandfound merged commit b43e1ff into main Jul 22, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants