ZipStore arguments: 'w' vs 'x' #1057

Open
aliaksei-chareshneu opened this issue Jun 26, 2022 · 3 comments
Labels
documentation Improvements to the documentation

Comments


aliaksei-chareshneu commented Jun 26, 2022

Dear all,

Could you tell me what is the exact difference between 'w' and 'x' mode for ZipStore creation?
Also, what does "truncate" mean here?

mode : string, optional

One of ‘r’ to read an existing file, ‘w’ to truncate and write a new file, ‘a’ to append to an existing file, or ‘x’ to exclusively create and write a new file.

Two related questions:

  • How would dimension_separator affect reading/writing arrays?
  • Does the store need to be closed after reading data from it if it was opened in 'r' (read) mode?

https://zarr.readthedocs.io/en/stable/api/storage.html#zarr.storage.ZipStore

Best regards,
Aliaksei

  • Value of zarr.__version__: 2.10.3
  • Value of numcodecs.__version__: 0.9.1
  • Version of Python interpreter: 3.9.0
  • Operating system (Linux/Windows/Mac): Windows 11
  • How Zarr was installed (e.g., "using pip into virtual environment", or "using conda"): using pip into virtual environment
@clbarnes
Contributor

Could you tell me what is the exact difference between 'w' and 'x' mode for ZipStore creation?

These correspond to the mode arguments of Python's built-in open(). 'w' creates the file if it doesn't exist, or truncates it if it does, i.e. deletes all existing data, reducing it to 0 bytes, before writing. 'x' also creates a new file, but fails with FileExistsError if the file already exists.
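A minimal sketch of the difference, using the standard library's zipfile module rather than ZipStore itself. It assumes ZipStore hands its mode through to zipfile.ZipFile (true of the zarr 2.x source as far as I know, but check your version); the function name and file names are made up for illustration.

```python
import os
import tempfile
import zipfile


def demo_modes():
    """Record what 'w', 'x' and 'a' do to an existing archive."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.zip")

        # First 'w': the file does not exist yet, so it is created.
        with zipfile.ZipFile(path, mode="w") as zf:
            zf.writestr("a.txt", "first")

        # Second 'w': the existing archive is truncated, so a.txt is gone.
        with zipfile.ZipFile(path, mode="w") as zf:
            zf.writestr("b.txt", "second")
        with zipfile.ZipFile(path, mode="r") as zf:
            results["after_second_w"] = zf.namelist()

        # 'x': refuses to touch a file that already exists.
        try:
            zipfile.ZipFile(path, mode="x")
            results["x_on_existing"] = "opened"
        except FileExistsError:
            results["x_on_existing"] = "FileExistsError"

        # 'a': keeps the existing members and adds new ones.
        with zipfile.ZipFile(path, mode="a") as zf:
            zf.writestr("c.txt", "third")
        with zipfile.ZipFile(path, mode="r") as zf:
            results["after_a"] = zf.namelist()
    return results
```

So 'x' is the safer choice when overwriting an existing archive would be a mistake, and 'w' is the one to use when you deliberately want to start from scratch.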

How would dimension_separator affect reading/writing arrays?

It won't really, if you only intend to access the zip using zarr. It only changes how chunk keys are named inside the archive (for example 0.0 versus 0/0). I'd suggest leaving it as the default, unless you intend to unzip it onto a file system with opinions on how many files should be in a directory.
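To make the naming difference concrete, here is a small illustrative sketch in plain Python. It does not call zarr; chunk_key and top_level_entries are hypothetical helpers that only mimic how chunk indices are joined into keys with "." or "/".

```python
from itertools import product


def chunk_key(chunk_index, dimension_separator="."):
    """Join a chunk's grid index into a key, e.g. (1, 2) -> '1.2' or '1/2'."""
    return dimension_separator.join(str(i) for i in chunk_index)


def top_level_entries(chunks_per_dim, dimension_separator="."):
    """Names that would sit directly in the array's directory once unzipped."""
    indices = product(*(range(n) for n in chunks_per_dim))
    keys = (chunk_key(idx, dimension_separator) for idx in indices)
    # With '/', everything after the first component lives in a subdirectory.
    return sorted({key.split("/")[0] for key in keys})
```

For an array split into 2 x 3 chunks, the "." separator gives six files side by side in one directory, while "/" gives two subdirectories with three files each. That is the only case where the choice tends to matter.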

Does the store need to be closed after reading data from it if opened in 'r' (reading mode)?

Yes, just like regular files: use it with a context manager (with statement), it'll make your life easier.

Author

aliaksei-chareshneu commented Jun 30, 2022

@clbarnes, thank you very much.

Regarding 'r' mode: what could happen if the store is not closed after reading? I am not introducing any modifications to it.

@clbarnes
Contributor

Unlike most stores, the ZipStore obeys normal Python file-opening semantics. Just like a Python file (or zipfile), the file is automatically closed when the object is garbage collected, but you're not in control of when that happens (it can depend on how your script/package is structured), so for certain usage patterns it can lead to an unpredictable number of files being open at once.

Also like Python files/zipfiles, there's a .close() method to explicitly close it (which is what the with statement does implicitly as soon as you leave the block) - that seems to be what the examples use. But in general calling .close() directly is discouraged, because if an exception happens before the explicit close, you may never reach it, and you leave the file cleanup to the garbage collector. One alternative pattern would be

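import zarr

# Open outside the try block: if opening fails there is nothing to close,
# and `store` would not be defined in the finally clause.
store = zarr.ZipStore("some/path.zip", mode="r")  # mode="r" assumed, for the read-only case
try:
    ...  # whatever else you want to do
finally:
    store.close()  # runs even if the body raises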

but that's less ergonomic than the with statement.
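For comparison, a sketch of the with-statement behaviour, again using zipfile as a stand-in for ZipStore (on the assumption that ZipStore wraps a zipfile.ZipFile and closes it on exit). The function name and file names are illustrative only. It shows that the archive is closed even when the body raises.

```python
import os
import tempfile
import zipfile


def closed_after_error():
    """Return True if the archive was closed despite an error in the with body."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.zip")
        with zipfile.ZipFile(path, mode="w") as zf:
            zf.writestr("a.txt", "hello")

        handle = None
        try:
            with zipfile.ZipFile(path, mode="r") as zf:
                handle = zf  # keep a reference so we can inspect it afterwards
                raise RuntimeError("simulated failure while reading")
        except RuntimeError:
            pass

        # Reading from a closed ZipFile raises ValueError.
        try:
            handle.read("a.txt")
        except ValueError:
            return True
        return False
```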

tl;dr in real life it's unlikely that anything disastrous would happen if you left the file closure up to the garbage collector, but it is generally good practice to handle these kinds of resources using the context manager. For the sake of explicitness and "one-- and preferably only one --obvious way to do it", I would highly recommend using the context manager wherever possible.

If you have a large amount of code which would need to be indented here, I would recommend either factoring that inner code into a function which takes an open ZipStore (then it would be generic over any other store too!), or possibly a wrapper class which itself is a context manager which closes the inner ZipStore on __exit__.
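Both suggestions might look roughly like this. Everything here is hypothetical naming: ClosingStore and list_keys are illustrative, and FakeStore is a dict-based stand-in so the sketch runs without zarr. With a real ZipStore you would pass that in instead.

```python
class ClosingStore:
    """Hypothetical wrapper: a context manager that closes the inner store on exit."""

    def __init__(self, store):
        self.store = store

    def __enter__(self):
        return self.store

    def __exit__(self, exc_type, exc, tb):
        self.store.close()
        return False  # do not swallow exceptions from the with body


def list_keys(store):
    """Work that only needs an open, mapping-like store (any store type will do)."""
    return sorted(store)


class FakeStore(dict):
    """Stand-in for a real store: a mapping with a close() method."""

    closed = False

    def close(self):
        self.closed = True
```

Usage would be along the lines of `with ClosingStore(FakeStore({"b": 1, "a": 2})) as store: keys = list_keys(store)`. The standard library's contextlib.closing does much the same job as the wrapper, if you would rather not write your own class.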

@joshmoore joshmoore added the documentation Improvements to the documentation label Dec 2, 2022