[v3] revisit runtime config #1772

Closed
jhamman opened this issue Apr 5, 2024 · 6 comments
Labels: design discussion; V3 (Related to compatibility with V3 spec)
Milestone: 3.0.0.alpha

Comments

@jhamman
Member

jhamman commented Apr 5, 2024

This issue tracks an evaluation of the v3 runtime config.

Context

The v3 branch runtime config currently looks like this:

@dataclass(frozen=True)
class RuntimeConfiguration:
    order: Literal["C", "F"] = "C"
    concurrency: Optional[int] = None
    asyncio_loop: Optional[AbstractEventLoop] = None

This is then attached to the Array/Group classes:

@dataclass(frozen=True)
class AsyncArray:
    metadata: ArrayMetadata
    store_path: StorePath
    runtime_configuration: RuntimeConfiguration

A few things are missing here:

  1. User experience
    • as a user, I may want to set config settings and forget about them (e.g. order, concurrency)
  2. Portability
    • I don't know for sure, but I really doubt that putting the asyncio loop on the Array class is going to work when it comes to serialization
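
The portability concern can be checked with a small experiment. This is a minimal sketch, not code from the v3 branch: it copies the RuntimeConfiguration dataclass shown above, and the helper name can_pickle is made up for illustration. It tries to pickle a config instance that holds a live event loop.

```python
import asyncio
import pickle
from asyncio import AbstractEventLoop
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass(frozen=True)
class RuntimeConfiguration:
    # Copy of the dataclass quoted in this issue.
    order: Literal["C", "F"] = "C"
    concurrency: Optional[int] = None
    asyncio_loop: Optional[AbstractEventLoop] = None


def can_pickle(obj: object) -> bool:
    """Hypothetical helper: True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


loop = asyncio.new_event_loop()
try:
    # Attach a live loop, as the current design allows, and try to serialize.
    config_with_loop = RuntimeConfiguration(asyncio_loop=loop)
    loop_is_picklable = can_pickle(config_with_loop)
finally:
    loop.close()

print("config holding an event loop is picklable:", loop_is_picklable)
```

An event loop wraps OS-level resources (selectors, sockets), so pickling fails. Any object that carries the loop as a field inherits that limitation, which matters for anything that ships arrays between processes.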

Improvements

So I'm looking for ideas on how to manage this better. Two options:

  1. Xarray-style set_options: https://docs.xarray.dev/en/stable/generated/xarray.set_options.html
    • Pros: allows for validation and is typed
    • Cons: a bit bespoke, doesn't support environment variables or a config file option
  2. Dask-style config (via donfig): https://donfig.readthedocs.io/en/latest/
    • Pros: very flexible framework, support for environment variables and config files, nested namespaces, etc.
    • Cons: extra dependency (though we could vendor it), no typing or validation
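
To make option 1 concrete, here is a rough sketch of an Xarray-style set_options written with the standard library only. It is not Xarray's or Zarr's implementation. The option names, the validators, and the OPTIONS dictionary are assumptions for illustration.

```python
from typing import Any, Callable, Dict

# Hypothetical global options and validators, for illustration only.
OPTIONS: Dict[str, Any] = {"order": "C", "concurrency": None}

_VALIDATORS: Dict[str, Callable[[Any], bool]] = {
    "order": lambda v: v in ("C", "F"),
    "concurrency": lambda v: v is None or (isinstance(v, int) and v > 0),
}


class set_options:
    """Set options globally, or temporarily when used as a context manager."""

    def __init__(self, **kwargs: Any) -> None:
        # Validate every option before changing anything.
        for name, value in kwargs.items():
            if name not in OPTIONS:
                raise ValueError(f"unknown option {name!r}")
            if not _VALIDATORS[name](value):
                raise ValueError(f"invalid value {value!r} for option {name!r}")
        # Remember the previous values so __exit__ can restore them.
        self._old: Dict[str, Any] = {name: OPTIONS[name] for name in kwargs}
        OPTIONS.update(kwargs)

    def __enter__(self) -> "set_options":
        return self

    def __exit__(self, *exc_info: object) -> None:
        OPTIONS.update(self._old)


# "Set and forget": a plain call changes the option for the rest of the session.
set_options(concurrency=8)

# Scoped override: the previous value comes back when the block exits.
with set_options(order="F"):
    print(OPTIONS["order"])
print(OPTIONS["order"])
```

This shows the validation listed under Pros, and also the limitation listed under Cons: nothing here reads environment variables or a config file.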

What do we expect to go in the runtime config?

  • Order
  • Concurrency
  • Logging settings
  • What else?
jhamman added the V3 (Related to compatibility with V3 spec) label Apr 5, 2024
jhamman added this to the 3.0.0.alpha milestone Apr 5, 2024
jhamman self-assigned this Apr 5, 2024
@jhamman
Member Author

jhamman commented Apr 19, 2024

I spoke with @maxrjones today about this. Our thought for now was to try using donfig and see how it goes. We can continue to evaluate the dependency vs vendoring and typing/validation as needed.
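
For readers unfamiliar with the Dask-style approach, the sketch below illustrates its two main ideas: nested namespaces and environment-variable overrides. It uses only the standard library and is a stand-in, not donfig's API. The function names, the ZARR prefix, the default keys, and the double-underscore nesting convention (modelled on Dask's) are all assumptions here.

```python
import ast
import os
from typing import Any, Dict, Mapping

# Hypothetical defaults; the key names are illustrative only.
DEFAULTS: Dict[str, Any] = {"array": {"order": "C"}, "async": {"concurrency": None}}


def collect_env(prefix: str, environ: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Turn PREFIX_A__B=value into {"a": {"b": value}}."""
    result: Dict[str, Any] = {}
    for name, raw in environ.items():
        if not name.startswith(prefix + "_"):
            continue
        keys = name[len(prefix) + 1:].lower().split("__")
        try:
            value: Any = ast.literal_eval(raw)  # "16" -> 16, "None" -> None
        except (ValueError, SyntaxError):
            value = raw  # keep plain strings such as "F" unchanged
        node = result
        for key in keys[:-1]:
            if not isinstance(node.get(key), dict):
                node[key] = {}
            node = node[key]
        node[keys[-1]] = value
    return result


def merge(base: Mapping[str, Any], override: Mapping[str, Any]) -> Dict[str, Any]:
    """Recursively merge override into a copy of base."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def get(config: Mapping[str, Any], dotted_key: str) -> Any:
    """Look up a nested value with a dotted key such as "array.order"."""
    node: Any = config
    for part in dotted_key.split("."):
        node = node[part]
    return node


# A fake environment so the example does not depend on the real one.
env = {"ZARR_ARRAY__ORDER": "F", "ZARR_ASYNC__CONCURRENCY": "16"}
config = merge(DEFAULTS, collect_env("ZARR", env))
print(get(config, "array.order"), get(config, "async.concurrency"))
```

A library like donfig provides this behaviour, plus config files, without the project maintaining it. The sketch also shows the typing/validation gap mentioned above: nothing stops an invalid value from arriving through the environment.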

cc @djhoese

jhamman removed their assignment May 6, 2024
@normanrz
Contributor

normanrz commented May 8, 2024

Additional config options:

  • Specify alternate implementations for CodecPipeline (e.g. for a rust-based codec pipeline)
  • Specify alternate implementations for codecs (e.g. for GPU-based batch-aware codecs)
  • Batch size in the HybridCodecPipeline
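
One common way to make an implementation swappable through config is to store a dotted import path and resolve it at runtime. The sketch below assumes that approach. The config key and the resolve helper are hypothetical, and a standard-library class stands in for a real codec pipeline.

```python
import importlib
from typing import Any, Dict

# Hypothetical config entry: a dotted path naming the class to use.
# collections.OrderedDict is only a stand-in for a pipeline implementation.
CONFIG: Dict[str, str] = {"codec_pipeline": "collections.OrderedDict"}


def resolve(dotted_path: str) -> Any:
    """Import and return the object named by "package.module.Name"."""
    module_name, _, attr = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.attr', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


pipeline_cls = resolve(CONFIG["codec_pipeline"])
print(pipeline_cls)
```

With this pattern, a third-party package (for example a Rust-backed pipeline) only needs to be importable; pointing the config entry at its class would select it without changes to the core library.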

@maxrjones
Member

Thanks @normanrz! Joe mentioned you asked about this today. I'm working on getting a minimal PR opened now and should have it submitted within the next couple of hours.

@jhamman
Member Author

jhamman commented May 10, 2024

@normanrz - #1855 is now in the v3 branch. That should clear the way to add additional config options as needed.

jhamman closed this as completed May 10, 2024
@jhamman
Member Author

jhamman commented May 10, 2024

Thanks @maxrjones for getting this moving!

@normanrz
Contributor

I think this is a great way of dealing with configurations. Thanks!
