Attrs kwarg #1299
Codecov Report
@@ Coverage Diff @@
## main #1299 +/- ##
===========================================
- Coverage 100.00% 99.99% -0.01%
===========================================
Files 35 35
Lines 14130 14158 +28
===========================================
+ Hits 14130 14157 +27
- Misses 0 1 +1
Thanks so much for this! I think this optimization should really help speed up the creation of hierarchies with metadata. (I did some diagnostics of this on s3 here from the perspective of xarray.)
I have one question that is quite relevant to latency on object storage. Basically, I am trying to understand whether this implementation truly reduces the number of I/O operations, or whether it simply provides a more convenient way to specify metadata at array creation time.
@@ -633,6 +663,9 @@ def init_group(
    _init_group_metadata(store=store, overwrite=overwrite, path=path,
                         chunk_store=chunk_store)

    # initialize attrs
    _init_group_attrs(store, path, attrs)
Same question
This change is motivated by convenience. I haven't explicitly checked whether it actually saves any I/O budget (but I doubt it).
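To make the I/O question concrete, here is a toy sketch rather than zarr's actual code path. It assumes the Zarr v2 layout, where group metadata lives under `.zgroup` and attributes under a separate `.zattrs` key. `CountingStore`, `write_group_meta`, `write_attrs`, and `init_group_with_attrs` are hypothetical names invented for illustration. Under those assumptions, passing attributes at initialization issues the same number of store writes as setting them afterwards; any extra reads the real attribute-setting path might perform are not modelled here.

```python
import json
from collections.abc import MutableMapping


class CountingStore(MutableMapping):
    """Hypothetical in-memory store that counts write operations."""

    def __init__(self):
        self._data = {}
        self.writes = 0

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self.writes += 1
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


def write_group_meta(store):
    # Zarr v2 keeps group metadata in its own key.
    store[".zgroup"] = json.dumps({"zarr_format": 2}).encode()


def write_attrs(store, attrs):
    # Attributes are a separate JSON document in Zarr v2.
    store[".zattrs"] = json.dumps(attrs).encode()


def init_group_with_attrs(store, attrs):
    # Toy model of "attrs as a keyword argument": still two keys to write.
    write_group_meta(store)
    write_attrs(store, attrs)


# Two-step workflow: initialize, then set attributes.
two_step = CountingStore()
write_group_meta(two_step)
write_attrs(two_step, {"units": "m"})

# One-call workflow: attributes supplied at initialization.
one_call = CountingStore()
init_group_with_attrs(one_call, {"units": "m"})
```

In this model both stores end up with the same keys and the same write count, which is consistent with the doubt expressed above: the keyword argument is a convenience unless the two documents are merged into one key.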
I think this would be a breaking API change, so probably best to avoid that.
Can you elaborate more on the downside here? I'm not sure I fully understand the problem.
There are a lot of places in the code that end up invoking Alternatively, we make a breaking change to For this PR I'm fine with
I favor this option. It allows us to solve the problem without changing the public API in a way that could break user code. I don't consider
Tests would require a lot of redundant changes to
@d-v-b - should we close this in favor of
closing in favor of
This PR adds `attrs` as a keyword argument to `init_array` and `init_group`. In both cases, the value assigned to `attrs` is inserted into the attributes of the group / array upon initialization.

Resolves #538
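As a rough illustration of the behavior described above, the following sketch mimics a group initializer that accepts `attrs`. It is not the PR's implementation: `init_group_sketch` is a hypothetical stand-in, the store is a plain `dict`, and the `.zgroup` / `.zattrs` key names assume the Zarr v2 layout.

```python
import json


def init_group_sketch(store, path="", attrs=None):
    """Hypothetical stand-in for init_group with an attrs keyword."""
    prefix = f"{path}/" if path else ""
    # Write the group metadata document first.
    store[prefix + ".zgroup"] = json.dumps({"zarr_format": 2}).encode()
    # Insert the supplied attributes at initialization time, if any.
    if attrs:
        store[prefix + ".zattrs"] = json.dumps(dict(attrs)).encode()


store = {}
init_group_sketch(store, path="foo", attrs={"units": "m"})
stored_attrs = json.loads(store["foo/.zattrs"])
```

Here `stored_attrs` holds the dictionary passed as `attrs`, so the group carries its metadata from the moment it is created instead of requiring a second call.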