Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: refactor zero-filling and expose dtype API slot for it #23591

Merged
merged 7 commits into from Apr 20, 2023

Conversation

ngoldbaum
Copy link
Member

This refactors zero-filling to be controlled by dtype-specific logic. It also moves all zero-filling logic inside PyArray_NewFromDescr_int and removes the _zerofill helper, which now has no consumers.

I experimented with making the API slot only provide the initial zero value and let numpy fill the buffer with that value by doing e.g. repeated memcpy calls, but this ended up not being sufficient, since in principle each zero value might contain embedded references that must be unique, so we need control over filling the entire array buffer. It also makes it slightly simpler to handle object and void dtypes, since I can just move the existing logic from _zerofill to the new object_fill_zero_value function I added.

I didn't see the need to add a test for this to the scaled float dtype, since the legacy dtypes should use all the code paths I modified, but am happy to add such a test if that's desired.

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this, I think it has the potential to clean up a quite a bit more mess. Unfortunately, I really would like to try to stick to that design goal of not using array objects (especially for something that is very fundamental).

*
* @returns -1 or 0 indicating error and arr being successfully filled.
*/
typedef int (fill_initial_function)(PyArrayObject *arr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made sure we never use array objects anywhere in the new DType API. That is to allow swapping out the array object or using NumPy dtypes for non-array objects in principle, and was one of the early wishes not just from me, but also for example Travis.

There are two possible paths:

  • Do a strided buffer fill (could be a set up like the traverse/clearing function or in principle a ufunc core). Those add the get_travers_loop() indirection since that is needed for structured dtypes at least.
  • We fill a single element buffer then use that to fill the rest. (I used that approach for the ufunc identity.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, fair enough! I think it might be possible to write this using a traversal loop. I will poke around next week.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks. I don't have any opinion right now about whats better. The first is neat, but the second seems simpler and close to how identities currently work...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where in numpy is the ufunc identity buffer setup? I’ll take a look at that first since I’m not familiar with it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its the get_reduction_initial_function *get_reduction_initial; and defined here:

/**
* Query an ArrayMethod for the initial value for use in reduction.
*
* @param context The arraymethod context, mainly to access the descriptors.
* @param reduction_is_empty Whether the reduction is empty. When it is, the
* value returned may differ. In this case it is a "default" value that
* may differ from the "identity" value normally used. For example:
* - `0.0` is the default for `sum([])`. But `-0.0` is the correct
* identity otherwise as it preserves the sign for `sum([-0.0])`.
* - We use no identity for object, but return the default of `0` and `1`
* for the empty `sum([], dtype=object)` and `prod([], dtype=object)`.
* This allows `np.sum(np.array(["a", "b"], dtype=object))` to work.
* - `-inf` or `INT_MIN` for `max` is an identity, but at least `INT_MIN`
* not a good *default* when there are no items.
* @param initial Pointer to initial data to be filled (if possible)
*
* @returns -1, 0, or 1 indicating error, no initial value, and initial being
* successfully filled. Errors must not be given where 0 is correct, NumPy
* may call this even when not strictly necessary.
*/
typedef int (get_reduction_initial_function)(
PyArrayMethod_Context *context, npy_bool reduction_is_empty,
char *initial);

The actual implementation is more annoying, since its set up to fall back to the one on the ufunc:

get_initial_from_ufunc(

(which is all a bit annoying and I wouldn't mind making it nicer, but it helps support complicated reduce initials.)


But of course what we would need here is much simpler, since it only needs the descr.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with not having things like this on the dtype. Having a ufunc would seem to make a lot of sense to me. Could be an actual zeros(out=array, or be more like the strange private _ones_like that already exists (no idea why! but it is covered by astropy units...). For zero filling it would then be zeros_like(array, out=array). But another logical concept is that for dtypes that do not have strange non-identical zeros, one effectively just uses whatever np.copyto does.

(Sorry, I have not really looked at this PR, just really read the comments here... Ignore if this is all irrelevant.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had thought about a ufunc, but overall still lean against it.

My thoughts are roughly:

  • Creating a zero filled array seems very core and doing ufunc dispatching is a bit of work for something simple. It feels a bit much if I think of an exposing API like PyArray_FillBuffer(dtype, ptr, num, stride, NPY_ZEROVAL) (that is partially a lie, because we made sure ufuncs are now set up in a way to make exactly this fairly easy!).
  • Its just seems like more hassle for the DType author to implement for something pretty basic (but of course users will implement many ufuncs anyway).

Another path would be an ArrayMethod (i.e. ufunc loop) that lives on the DType (just like casts are ufuncs loops/ArrayMethods that don't live on a ufunc object).

For clearing (dealloc+NULLint of array values) I had for a simplified traversal loop getter. I don't like splitting from ArrayMethod, but it seemed like a lot of set up for something very basic. I do translate this to zero().

Now ufuncs seem like they should have some advantages (maybe for subclasses?), but I can't really put my finger on anything concrete right now except a certain beauty of aiming that as much as possible should a ufunc.


N.B.: I suspect this will continue to include "zero", "one", and also "empty" (we do need this for object() although I am not sure we will need it generally).
In principle one could continue down the rabbit hole with the typical ufunc.identity values, and NA filling might be a thing at some point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it is no more than thinking that there is a certain beauty and simplicity if a DType interacts with arrays only via ufuncs. Of course, regardless the DType has to have a way to tell what "zero" is (though there is an obvious default).

numpy/core/src/multiarray/ctors.c Outdated Show resolved Hide resolved
numpy/core/src/multiarray/dtypemeta.c Outdated Show resolved Hide resolved
numpy/core/src/multiarray/dtypemeta.c Outdated Show resolved Hide resolved
@ngoldbaum
Copy link
Member Author

I re-implemented the API hook to use a traversal loop. Instead of using PyArray_FillObjectArray I now manually insert pointers to the zero PyObject* for object arrays and use the _fillobject helper used by PyArray_FillObjectArray for void arrays. In principle I could have avoided _fillobject but it seemed better to avoid duplicating the somewhat complicated recursive logic it uses.

I considered following the initial reduction API slot but decided to use a traversal loop instead both to ease implementation for me (the existing approach is basically a traversal loop, so it was straightforward to translate) and to allow more flexibility for the dtype author. I suspect that this is a niche case so in my opinion it's OK to require a little extra boilerplate if that buys dtype authors more flexibility.

@seberg
Copy link
Member

seberg commented Apr 18, 2023

Thanks, I will have a look tomorrow.

In principle I could have avoided _fillobject but it seemed better to avoid duplicating the somewhat complicated recursive logic it uses.

We should definitely change it to use a pattern like the clear traversal loop (and hopefully remove the other place at some point). But, we can follow up on that too.

(EDIT: To explain this, we should not figure out the structure again on every elment, its very slow. And after we have it, this should become the one true way to initialize zero, and hopefully _fillobject will eventually be gone, rather than this way around.)

@ngoldbaum
Copy link
Member Author

we should not figure out the structure again on every elment, its very slow

Good point! I know next to nothing about structured dtypes so got scared of trying to heavily refactor _fillobject. I agree that it’s doing a lot of work that could be amortized over the whole array.

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, happy to go this route.

One more thing to note: Your object filling loop does not DECREF anything that may be inside the array before.

We can do this, it just means:

  • We need to make a note of the choice that the loop cannot be called on memory that holds references. (You cannot use this to zerofill a used/fully initialized array, you would have to dealloc/clear it first!)
  • In theory we could choose to avoid the initial memory zeroing due to references always. But if we want to keep that open we would need to ask the zero-fill function to guarantee an initialized state on failure (in other words: calling clear on the memory must be OK). Or find another safe scheme.
    (Not sure it matters in practice, for large arrays, we will touch all memory twice here, but the memory churn could be improved by us via chunking, and most zeroed arrays are likely used more than twice anyway.)

As I said, I can help with the void loop in a follow-up, or we can work together on it here (it may not be super hard by looking at the clear loop. We might even be able to practically re-use it).

numpy/core/src/multiarray/ctors.c Outdated Show resolved Hide resolved
numpy/core/src/multiarray/ctors.c Outdated Show resolved Hide resolved
numpy/core/src/multiarray/ctors.c Outdated Show resolved Hide resolved
NPY_traverse_info fill_zero_info;
/* float errors do not matter and we do not release GIL */
NPY_ARRAYMETHOD_FLAGS zero_flags = PyArrayMethod_MINIMAL_FLAGS;
NPY_traverse_info_init(&fill_zero_info);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to pair this with the free/clear function, unfortunately (this will be important for structured dtypes).

One path may be to not support/check for fill_zero_info.func == NULL here, but only use get_fill_zero_loop == NULL to make the decision about calloc. That would move this whole logic down to after the calloc choice was made. (structured dtypes won't be able to signal calloc is good enough, but 🤷)

Alternatively, we just need to add the cleanup either in fail: of before all relevant goto fails.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh oops, I missed the existence of NPY_traverse_info_xfree looking at the traversal code, thank you very much for catching this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up refactoring so the cleanup goes in the fail block. This does lead to allocating and initializing the traverse info struct in the fast path, but it's a tiny struct so I don't think that's a problem. Let me know if you'd like further massaging on this.

numpy/core/src/multiarray/dtypemeta.c Outdated Show resolved Hide resolved
numpy/core/src/multiarray/dtypemeta.c Outdated Show resolved Hide resolved
numpy/core/src/multiarray/dtypemeta.c Outdated Show resolved Hide resolved
@ngoldbaum
Copy link
Member Author

We need to make a note of the choice that the loop cannot be called on memory that holds references.

Added a comment in the loop itself and added a warning in the documentation for the API hook in dtypemeta.h.

I can help with the void loop in a follow-up

I'd prefer this, if that's alright. This is already 3 or 4 levels of yak shaving from where I started working last week :)

@ngoldbaum
Copy link
Member Author

Also I should highlight 4403c01 which made some changes to the setup for the traversal info struct. For zeroing we don't need the descriptor at all (the array creation routine already has it), so it's OK to just leave it as NULL.

@seberg
Copy link
Member

seberg commented Apr 20, 2023

OK, thanks. Lets put this in and I will fix the void loop in a follow-up.

(If anyone prefers another API, e.g. to ask for a single value only. Happy to entertain that also.)

@seberg seberg merged commit e20f110 into numpy:main Apr 20, 2023
59 checks passed
seberg added a commit to seberg/numpy that referenced this pull request Apr 20, 2023
This reorganizes the generic traversal functions for structured
dtypes a bit (before it wasn't quite generic).

Then uses that for zerofilling.  We could get the two a bit closer
by also supporting `func==NULL` explicitly for clearing.
(I have not includes this here.)

The old `FillObjectArray` is still in use now, this is not ideal
and the new approach should be duplicated to add an "emptyfill"
(same semantics, normally just memset to 0, but for objects we
place an explicit `None` object).

This is a necessary follow-up to numpygh-23591.
seberg added a commit to seberg/numpy that referenced this pull request Apr 20, 2023
This reorganizes the generic traversal functions for structured
dtypes a bit (before it wasn't quite generic).

Then uses that for zerofilling.  We could get the two a bit closer
by also supporting `func==NULL` explicitly for clearing.
(I have not includes this here.)

The old `FillObjectArray` is still in use now, this is not ideal
and the new approach should be duplicated to add an "emptyfill"
(same semantics, normally just memset to 0, but for objects we
place an explicit `None` object).

This is a necessary follow-up to numpygh-23591.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants