GDP-3: Duck Storages #28
12209d8
to
e0c2c5c
7fd7a84
to
67290e9
I addressed some of your comments, @mcgibbon. Regarding the
…s, halo parameter and halo/domain attributes
Co-authored-by: Linus Groner <linus.groner@cscs.ch> Co-authored-by: Enrique G. Paredes <enriqueg@cscs.ch>
Co-authored-by: Jeremy McGibbon <jeremy.mcgibbon@gmail.com>
b6aebd2
to
230e458
…evious implementation with `dims` as alternative.
Looks good! My comments are mostly stylistic, or asking for clarification on some options. The code ideas are all solid.
docs/GDPs/gdp-0003-duck-storage.rst
Finally, without internally keeping information about the semantic meaning of dimensions, e.g. the
best layout and proper broadcasting for the resulting storage can not be determined. Further,
implementations would depend on the availability of a library implementing the operations for a
given device. We have already observed performance problems when using cupy on AMD hardware. For
future hardware, these libraries might be entirely unavailable. Therefore, we will not commit to
supporting such operations.
I agree this appears to be necessary. This in particular should be communicated to users.
docs/GDPs/gdp-0003-duck-storage.rst
The implementation of this GDP breaks the integration with external libraries which require a NumPy
`ndarray` subclass. Further, we propose some API changes like renaming or repurposing of keyword
Could you add a short description here of how to migrate such code? For example, indicate which attribute is guaranteed to be an `ndarray` subclass that provides access to the storage memory.
docs/GDPs/gdp-0003-duck-storage.rst
Detailed Description
Suggested change, from:
Detailed Description
to:
Detailed Description
docs/GDPs/gdp-0003-duck-storage.rst
meaning is the same as in the NumPy :code:`__array_interface__` and the
:code:`__cuda_array_interface__`:
A hyperlink to numpy/cupy documentation would be helpful here.
docs/GDPs/gdp-0003-duck-storage.rst
+ :code:`"data": Tuple[int, bool]`
+ :code:`"strides": Tuple[int, ...]`

In Addition, the following optional keys can be contained:
Suggested change, from:
In Addition, the following optional keys can be contained:
to:
In addition, the following optional keys can be contained:
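For readers unfamiliar with the protocol the excerpt refers to, the following is a minimal sketch of an object that exposes NumPy's `__array_interface__` with the `"data"` and `"strides"` keys quoted above. The class name `DuckBuffer` is hypothetical and is not part of the GDP; the sketch only illustrates the key layout and does not reflect the proposed storage implementation.

```python
import array
import sys


class DuckBuffer:
    """Hypothetical 1-D float64 container exposing the NumPy array interface."""

    def __init__(self, values):
        # array.array owns a contiguous C buffer of doubles.
        self._buf = array.array("d", values)

    @property
    def __array_interface__(self):
        address, length = self._buf.buffer_info()
        byte_order = "<" if sys.byteorder == "little" else ">"
        return {
            "shape": (length,),
            "typestr": f"{byte_order}f{self._buf.itemsize}",  # e.g. "<f8"
            "data": (address, False),  # (pointer as int, read-only flag)
            "strides": (self._buf.itemsize,),  # bytes to step per element
            "version": 3,
        }
```

If NumPy is installed, `numpy.asarray(DuckBuffer([1.0, 2.0]))` should produce an array that shares memory with the buffer rather than copying it.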
docs/GDPs/gdp-0003-duck-storage.rst
the respective dimension, which can be used to denote asymmetric halos. It defaults to no halo,
i.e. :code:`(0, 0, 0)`. (See also Section :ref:`domain_and_halo`)

:code:`layout: Optional[Sequence[int]]`
nit: `layout` is a slightly overloaded term; we're already using it to indicate the number of ranks along edges of a tile face. Could this be named `stride_order` or something similar?
When using `as_storage`, is this derived by default from the array passed as input? If so, can you describe that here? Also, what is the interplay between this and the `dims` argument, if only one is given?
docs/GDPs/gdp-0003-duck-storage.rst
The rationale behind this is that in this way, storages allocated with :code:`defaults` set to a
backend will always get optimal performance, while :code:`defaults` set to :code:`"F"` or
This sounds like it disagrees somewhat with the last paragraph of the introduction.
docs/GDPs/gdp-0003-duck-storage.rst
:code:`managed: Optional[str]`
:code:`None`, :code:`"gt4py"` or :code:`"cuda"`. It only has effect if :code:`device="gpu"` and
it specifies whether the synchronization between the host and device buffers is not done
(:code:`None`), GT4Py (:code:`"gt4py"`) or CUDA (:code:`"cuda"`). It defaults to :code:`"gt4py"`
What's the difference in behavior between `None` and `cuda`? It could be worth having a short description of what GT4Py and CUDA management are.
docs/GDPs/gdp-0003-duck-storage.rst
The values of parameters which are not explicitly defined by the user will be inferred from the
first alternative source where the parameter is defined in the following search order:

1. The provided :code:`defaults` parameter set.
2. The provided :code:`data` or :code:`device_data` parameters.
3. A fallback default value specified above. The only case where this is not available is
:code:`shape`, in which case an exception is raised.
This part about inferring parameters should probably be written at the start of this section. Which parameters can be derived from `data`/`device_data`, and how are they derived?
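The search order quoted above can be sketched as a small lookup function. This is only an illustration: the function name `resolve_parameter`, the `inferred_from_data` mapping (standing in for whatever can be read off `data` or `device_data`), and the exception type are assumptions. The two fallback values are the ones stated in the quoted text for `halo` and `managed`.

```python
from typing import Any, Mapping, Optional

# Sentinel so that an explicit None (a valid value for `managed`) is respected.
_UNSET = object()

# Fallbacks quoted in the proposal text; other parameters are omitted here.
FALLBACKS = {"halo": (0, 0, 0), "managed": "gt4py"}


def resolve_parameter(
    name: str,
    explicit: Any = _UNSET,
    defaults: Optional[Mapping[str, Any]] = None,
    inferred_from_data: Optional[Mapping[str, Any]] = None,
) -> Any:
    """Return the first available value for `name` in the documented order."""
    if explicit is not _UNSET:
        return explicit
    # 1. defaults set, 2. values derived from data/device_data, 3. fallback.
    for source in (defaults or {}, inferred_from_data or {}, FALLBACKS):
        if name in source:
            return source[name]
    # No fallback exists for e.g. `shape`; the exception type is an assumption.
    raise TypeError(f"parameter {name!r} could not be inferred")
```

Membership tests are used instead of truthiness checks so that values such as `None` or `(0, 0, 0)` in an earlier source are not skipped.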
docs/GDPs/gdp-0003-duck-storage.rst
Internally holds a reference to a `CuPy <https://cupy.chainer.org/>`_ `ndarray`. This storage
does not have a CPU buffer.

:code:`SoftwareManagedGPUStorage`
nit: The previous name of `ExplicitlyManaged` was good for this object, I think. "Software" reads like an external binary running on the host system outside of Python. It contrasts well with `CUDAManagedGPUStorage`.
Accessing `.storage` on a quantity which was initialized with a numpy or cupy array and a gt4py_backend value currently causes `view` to become out of sync with the underlying data of the quantity, since the data is re-allocated in creating the storage but the view is unchanged and retains a reference to the old data. This PR updates the routine which re-allocates the data to also update the view. This is a work-around to deal with not being able to initialize gt4py storages from existing numpy or cupy arrays. We should be able to remove the work-around after GDP-3 (GridTools/gt4py#28) is merged.
docs/GDPs/gdp-0003-duck-storage.rst
+ :code:`"acquire": Optional[Callable[[], Any]]` Is called on all objects that are passed to a
stencil, before running computations. It can be used to trigger a copy to the respective device.
I just realized I had mistakenly thought that `acquire` was an argument passed to the stencil, not one passed to the storage. I would suggest this rewording to make it clearer:
Suggested change, from:
+ :code:`"acquire": Optional[Callable[[], Any]]` Is called on all objects that are passed to a
stencil, before running computations. It can be used to trigger a copy to the respective device.
to:
+ :code: The `"acquire": Optional[Callable[[], Any]]` method of each object passed to a
stencil is called before running computations. It can be used to trigger a copy to the respective device.
Knowing that, does the `acquire` function need any information about the device the stencil is about to be run on?
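As a rough illustration of the behavior described in the excerpt, the sketch below calls an optional `acquire` hook on every object passed to a stencil before running the computation. The attribute name `gt_interface`, the wrapper `run_stencil`, and the `RecordingField` class are all hypothetical; the excerpt only states that `acquire` is an optional zero-argument callable.

```python
from typing import Any, Callable, List


def run_stencil(stencil: Callable[..., Any], *fields: Any) -> Any:
    """Call each field's optional `acquire` hook, then run the stencil."""
    for field in fields:
        # `gt_interface` is a hypothetical attribute holding the interface dict.
        interface = getattr(field, "gt_interface", {})
        acquire = interface.get("acquire")
        if acquire is not None:
            acquire()  # e.g. trigger a host-to-device copy
    return stencil(*fields)


class RecordingField:
    """Hypothetical field whose hook only records that it was called."""

    def __init__(self, name: str, log: List[str]) -> None:
        self.name = name
        self.gt_interface = {"acquire": lambda: log.append(f"acquire {name}")}
```

In this sketch the hook takes no arguments, which matches the quoted `Callable[[], Any]` signature and is why the question about device information arises: a zero-argument hook has to know its target device in advance.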
@gronerl or @egparedes, can you summarize what led to the decision to close it? Then we can merge it as
Since this GDP and its reference implementation have not gained any traction for roughly a year, and after more discussions offline, we propose to decline this GDP.

The original motivation for this GDP was that moving away from the design where storages are subclasses of

Resolving this in any way leads to a storage that is either severely limited in functionality, which prompts users to escape to different frameworks anyway (as is the case with the old storages), or whose performance may be severely impacted depending on the exact use of the storages, which is not transparent to the user. In other cases the behavior may be unintuitive (broadcasting, slicing). Overall it seems impossible to find a solution that is a good trade-off for a substantial share of all use cases.

Instead, we currently propose to pursue a "No Storage" approach in the future. The idea is that gt4py provides utilities that allocate buffers with desirable properties for stencils, but the framework does not provide any methods on the buffers; i.e. it is up to the user to fill values in the allocated memory, implement ufuncs if desired, etc. However, a generic interface like the

Please raise any objections to declining this GDP by Nov 10, 2021.
@mcgibbon Thoughts?
It feels like less is more when it comes to storages, and this seems to fit the bill. It sounds exciting to be able to provide `gt_data_interface` on our
I'm pasting some remarks by @jdahm on Slack to preserve them:
Agree with the conclusions.
…onstruct_inside Remove code duplication and decrease complexity of PAST Lowering