Conversation

@JoelLucaAdams
Collaborator

@JoelLucaAdams JoelLucaAdams commented Dec 28, 2025

  • Add support for xarray.DataTree. Thankfully this is quite a simple addition: it takes an existing dataset, converts the variable names back to the original ones in the raw SDF files, and then builds a DataTree using the from_dict method.
  • When the user generates an xarray.DataTree, an additional attribute called flat_structure_name is added that points to the name used to reference the variable when opening as a regular xarray.Dataset.
  • Add tests and documentation.
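
To make the renaming step concrete, here is a minimal pure-Python sketch of how flat variable names could be grouped back into a hierarchy using slash-separated original names. The variable names and the `group_by_path` helper are hypothetical and are not taken from this PR; the actual change hands the renamed dataset to `xr.DataTree.from_dict` rather than building the groups by hand.

```python
# Hypothetical mapping from flat Dataset names to slash-separated original names.
# These names are illustrative only and are not copied from the PR.
flat_to_original = {
    "Electric_Field_Ex": "Electric Field/Ex",
    "Electric_Field_Ey": "Electric Field/Ey",
    "Derived_Number_Density_Electron": "Derived/Number Density/Electron",
}


def group_by_path(renames):
    """Group variables by parent path, recording the flat name on each leaf."""
    groups = {}
    for flat_name, original in renames.items():
        # Split "Electric Field/Ex" into the parent "Electric Field" and the leaf "Ex".
        parent, _, leaf = original.rpartition("/")
        # A name without any "/" ends up in the root group "/".
        groups.setdefault("/" + parent, {})[leaf] = {
            "flat_structure_name": flat_name
        }
    return groups


groups = group_by_path(flat_to_original)
for path, variables in groups.items():
    print(path, sorted(variables))
```

Each leaf keeps a `flat_structure_name` entry, mirroring the attribute described in the list above.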

Resolves #23

The motivation behind this addition is so that we are able to interface with a package like xr-tui. This makes interfacing with and playing around with datasets on HPCs significantly easier.

Requires samueljackson92/xr-tui#4 to work with open_mfdatatree.

@JoelLucaAdams
Collaborator Author

JoelLucaAdams commented Dec 31, 2025

Well, it's unfortunate to see that this doesn't work with Python 3.10. The last version of xarray that supports Python 3.10 is 2025.6.1, and the issues with failing xarray.DataTrees are only resolved in 2025.9.1.

I think we should drop support for 3.10 for several reasons:

  • Its EOL is approximately October 2026 (PEP 619)
  • xarray has not supported it since June 2025
  • xr-tui doesn't support 3.10
  • We already support 3.11, 3.12, 3.13 and 3.14.

@LiamPattinson
Collaborator

When it comes to supporting Python versions, there are two places I always reference:

Python 3.10 itself will still be supported until the end of this year, but SPEC 0 recommends keeping to the cutting-edge. If you want to keep pace with the likes of NumPy and Xarray, the recommendation is actually to drop 3.10 and 3.11 at this stage.

With the rise of tools like uv, it's less important than it used to be to support older Python versions, but it is likely to frustrate a lot of users who are just using the version that ships with their OS or runs on their cluster.

It's your call if you want to drop 3.10. I'm probably overly conservative about this because I recall being stuck on 3.8 for ages in the days before uv, but dropping 3.10 and maintaining 3.11 might strike a nice middle ground here.

@JoelLucaAdams
Collaborator Author

Thanks for the advice, Liam! To stay in line with SPEC 0 while still allowing for people who may have slightly older versions of Python installed, I shall follow your advice and drop 3.10 but maintain 3.11.

@LiamPattinson
Collaborator

LiamPattinson commented Jan 15, 2026

I've rebased on main; apologies that GitHub is now giving me credit for all of your hard work! The readthedocs build seems to be failing because my commit was immediately followed by an auto-commit that cancelled the previous workflow.

I tried to resolve a ruff format issue without realising that the repo contains a conflict between black and ruff. I would personally recommend switching to ruff to reduce the number of dev dependencies, and replacing the auto-committing black action with one that simply raises an error for unformatted code. In my experience this tends to cause fewer surprises for contributors.

@LiamPattinson
Collaborator

Rebased again to remove the little dance with the conflicting formatting tools.

@JoelLucaAdams
Collaborator Author

Thanks for fixing the conflict!

Regarding the ruff vs black formatter, I agree we should just fully switch to ruff. Could you either fix this in a separate branch or raise a PR about it? I'll get round to it when I have more time.

@LiamPattinson
Collaborator

LiamPattinson commented Jan 15, 2026

> Thanks for fixing the conflict!
>
> Regarding the ruff vs black formatter I agree we should just fully switch to ruff. Could you either fix this in a separate branch or raise a PR about it and I'll get round to it when I have more time

No worries! I'll raise an issue/PR for the workflow change later.

I've read through the code and all looks good to me, but I haven't yet had a chance to play around with the DataTree type and confirm it behaves sensibly. I'll complete my review sometime this afternoon.

Collaborator

@LiamPattinson LiamPattinson left a comment

All looks good! I've tested it myself and the DataTrees are a lovely addition to the library. I can think of a few previous projects I've worked on where they would have been really helpful.


ds = ds.rename_vars(final_renames)

return xr.DataTree.from_dict(ds)
Collaborator

I must admit, I'm quite impressed that this works. I would have expected much more manual intervention to be necessary!

Collaborator Author

I was just as shocked that it worked when I first tried it, such an elegant system!

@JoelLucaAdams JoelLucaAdams merged commit 10272f8 into main Jan 16, 2026
15 checks passed
@JoelLucaAdams JoelLucaAdams deleted the datatree_support branch January 16, 2026 09:42
@JoelLucaAdams JoelLucaAdams mentioned this pull request Jan 26, 2026
Development

Successfully merging this pull request may close these issues.

Hierarchical data support

3 participants