feat: ✨ Allow writing to Zarr via xarray_tensorstore._TensorStoreAdapter#10
copybara-service[bot] merged 17 commits into google:main
This commit adds a __setitem__ method to xarray_tensorstore._TensorStoreAdapter, allowing users to write data to a Zarr opened via this adapter. Should close Issue google#5
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Done. Ready for review @shoyer
Hi Jeff, thanks for looking into this! Have you tested that this actually works for writing files in Xarray? I'd love to see an example of how that works. Also, I think you may need to revert edits in the test file -- it looks like this is just an older version of the test suite, without any new additions for this functionality?
Attempted to match original formatting.
Re-add import
Yup. I'm currently using this version of xarray-tensorstore under the hood for reading/writing large imaging datasets. It's a bit involved for an example, I know, but here is where it is executed in my codebase. In this case, That said, I haven't tried it for other use cases, such as the type of data you have in the minimal usage example in the README.
Done. 👍🏼 Thank you!
OK, thanks for sharing! For context, this is a little different from how other storage systems work in Xarray, which do not support modifying an existing Zarr array opened in an Xarray object. Typically, we separate reading and writing into two separate methods, e.g.,
There are a few detailed reasons why this is the case for Xarray, but as far as I can tell, none of them apply to Xarray-TensorStore. So I think it would be fine to add this functionality here. That said, because this differs from the default in Xarray, I would suggest making this new behavior (mutability) something that requires an explicit opt-in. The suggestion would be to add a new
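To illustrate the opt-in idea discussed above, here is a minimal sketch of an array wrapper that rejects writes unless mutability was requested explicitly. Everything in it is an assumption made for illustration: the class name `OptInWritableAdapter` and the `write` flag are hypothetical, the wrapper is backed by a NumPy array rather than TensorStore, and it is not the implementation in this pull request.

```python
import numpy as np


class OptInWritableAdapter:
    """Hypothetical wrapper: read-only unless created with write=True."""

    def __init__(self, array, write=False):
        self._array = array
        self._write = write  # hypothetical opt-in flag, not a real library option

    def __getitem__(self, key):
        # Reads are always allowed.
        return self._array[key]

    def __setitem__(self, key, value):
        # Writes are rejected unless the caller opted in at construction time.
        if not self._write:
            raise ValueError("array is read-only; create it with write=True to modify it")
        self._array[key] = value


read_only = OptInWritableAdapter(np.zeros((2, 3)))
writable = OptInWritableAdapter(np.zeros((2, 3)), write=True)
writable[0, :] = 1.0  # allowed because write=True was passed
```

With this shape, existing read-only callers keep the default behavior, and mutation only happens when the caller asks for it.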
I think I found an issue with vector indexing in Xarray as a whole, compared to NumPy indexing. Do you happen to know if this is expected? Here's a minimal example:
import numpy as np
import xarray
key = (np.array([0, 1]), np.array([0, 1]), slice(None))
source_data = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
source = xarray.DataArray(
source_data,
dims=("x", "y", "z"),
name="baz",
)
xarray.testing.assert_equal(source_data[key], source[key])
Yes, this is intentional. Xarray's vectorized rules match keys up with array dimensions:
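A short sketch of the difference, reusing the array from the example above. It assumes only public NumPy and xarray behavior: unlabeled 1-D integer arrays are applied by xarray independently per dimension (outer indexing), while `DataArray` indexers that share a dimension name are paired up pointwise. The dimension name `points` is an arbitrary choice for this illustration.

```python
import numpy as np
import xarray

source_data = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
source = xarray.DataArray(source_data, dims=("x", "y", "z"), name="baz")
idx = np.array([0, 1])

# NumPy pairs the two index arrays: it selects (x=0, y=0) and (x=1, y=1).
numpy_result = source_data[idx, idx, :]  # shape (2, 3)

# xarray treats each unlabeled 1-D array as indexing its own dimension,
# so every combination of x and y is selected.
outer_result = source[idx, idx, :]  # shape (2, 2, 3)

# To pair indices up as NumPy does, give both indexers a shared dimension.
x_points = xarray.DataArray(idx, dims="points")
y_points = xarray.DataArray(idx, dims="points")
pointwise_result = source.isel(x=x_points, y=y_points)  # shape (2, 3)
```

This is why the comparison in the example fails: `source[key]` returns the full outer selection, not the two paired rows that NumPy returns.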
This reverts commit c651d35.
Redo change without auto-Black formatting
Sorry for the messy formatting commits...
Also fix indentation.
Sorry. Forgot to update the test. Should be good now.
thanks!
Thank you!
This is now available as part of the 0.1.5 release.