Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise exception in to_dataset if resulting variable is also the name of a coordinate #8433

Merged
merged 5 commits into from Nov 14, 2023

Conversation

mgunyho
Copy link
Contributor

@mgunyho mgunyho commented Nov 9, 2023

Let me know if you think the error message is unclear or too verbose or too fancy or something.

@max-sixty max-sixty added the plan to merge Final call for comments label Nov 9, 2023
@max-sixty
Copy link
Collaborator

This looks great, thanks @mgunyho .

I'll merge in the next day or so unless anyone has feedback.

Test failures are unrelated I think, will check they fal on main too

xarray/core/dataarray.py Outdated Show resolved Hide resolved
@max-sixty
Copy link
Collaborator

I think we need to adjust the error message tests for the tests to pass

@mgunyho
Copy link
Contributor Author

mgunyho commented Nov 13, 2023

Ah of course, I missed that! Fixed now.

@max-sixty
Copy link
Collaborator

Thanks @mgunyho !

Unfortunately we also have a break in the docs — would you care to take a look? https://readthedocs.org/projects/xray/builds/22540466/

(This also likely means the docs were bad, so it's good news your code found it!)

@mgunyho
Copy link
Contributor Author

mgunyho commented Nov 13, 2023

Huh, at a quick glance this seems like a problem with the error raising logic/my code rather than with the docs. I'll look into it tomorrow.

@mgunyho
Copy link
Contributor Author

mgunyho commented Nov 14, 2023

Yeah, the issue was indeed with my code, it didn't show up in the tests because test_to_dataset_split was using just a 1D array, and in that case the problem didn't show up. I've updated that test now and fixed the logic. (I also rebased onto main)

@mgunyho
Copy link
Contributor Author

mgunyho commented Nov 14, 2023

I can't tell if the tests fail because of my code or something else. I changed the logic from {k: v for k, v in self._coords.items() if k != dim} to {c: self._coords[c] for c in coord_names}, where coord_names is a set, so the order might differ. I thought that my version would be easier to understand. So either the tests are assuming a specific order for the coords, or the failures are due to something unrelated.

@max-sixty
Copy link
Collaborator

I can't tell if the tests fail because of my code or something else. I changed the logic from {k: v for k, v in self._coords.items() if k != dim} to {c: for c in self._coords[c] for c in coord_names}, where coord_names is a set, so the order might differ. I thought that my version would be easier to understand. So either the tests are assuming a specific order for the coords, or the failures are due to something unrelated.

To respond quickly though not fully — does the previous approach work? It may well require a specific order. If the former works well then that's fine...

Thanks for adding more tests, really good we caught the issue prior

@mgunyho
Copy link
Contributor Author

mgunyho commented Nov 14, 2023

I was also confused because I couldn't reproduce the error locally. But I changed it back to the original self._coords.items() version.

Comment on lines +597 to +599
variables = variables_from_split | {
k: v for k, v in self._coords.items() if k != dim
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this too weird, should I try to think of another way to express this?

@mgunyho
Copy link
Contributor Author

mgunyho commented Nov 14, 2023

Hm, at least the failures in https://github.com/pydata/xarray/actions/runs/6869006192/job/18680869980?pr=8433 and https://github.com/pydata/xarray/actions/runs/6860572749/job/18654637893?pr=8433 seem to be the same for both implementations, so maybe it's not my code but something else.

@mgunyho
Copy link
Contributor Author

mgunyho commented Nov 14, 2023

Yep, here's the same failure from another PR: https://github.com/pydata/xarray/actions/runs/6865796013/job/18670620776?pr=8450

@max-sixty
Copy link
Collaborator

Super, thanks @mgunyho !

(Sorry for the test failures, hopefully those will resolve shortly...)

@max-sixty max-sixty merged commit 59378cc into pydata:main Nov 14, 2023
16 of 27 checks passed
dcherian added a commit to rabernat/xarray that referenced this pull request Nov 29, 2023
* main:
  [skip-ci] dev whats-new (pydata#8467)
  2023.11.0 Whats-new (pydata#8461)
  migrate the other CI to python 3.11 (pydata#8416)
  preserve vlen string dtypes, allow vlen string fill_values (pydata#7869)
  Pin mypy < 1.7 (pydata#8458)
  Fix typos found by codespell (pydata#8457)
  [skip-ci] Small updates to IO docs. (pydata#8452)
  Deprecate certain cftime frequency strings following pandas (pydata#8415)
  Added driver parameter for h5netcdf (pydata#8360)
  Raise exception in to_dataset if resulting variable is also the name of a coordinate (pydata#8433)
  Automatic region detection and transpose for `to_zarr()` (pydata#8434)
  remove `cdms2` (pydata#8441)
  Remove PseudoNetCDF (pydata#8446)
  Pin pint to >=0.22 (pydata#8445)
  Remove keep_attrs from resample signature (pydata#8444)
  Rename `to_array` to `to_dataarray` (pydata#8438)
  Add missing DataArray.dt.total_seconds() method (pydata#8435)
  Declare Dataset, DataArray, Variable, GroupBy unhashable (pydata#8392)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
plan to merge Final call for comments
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DataArray.to_dataset(dim) silently drops variable if it is already a dim
3 participants