
Make it possible to disable broadcasting library-wise #23126

Open
clime opened this issue Jan 29, 2023 · 9 comments
Labels
57 - Close? Issues which may be closable unless discussion continued

Comments

clime commented Jan 29, 2023

Proposed new feature or change:

I would like a global NumPy flag to exist that would disable broadcasting.

I don't love the implicit nature of broadcasting, which can bring unexpected results. E.g.:

>>> np.array([[0.3, 0.7], [0.2, 0.8]]) * np.array([0.1, 0.2])
array([[0.03, 0.14],
       [0.02, 0.16]])

In that example, I would like an error to be raised because what I really needed was:

>>> np.array([[0.3, 0.7], [0.2, 0.8]]) * np.array([0.1, 0.2])[:, np.newaxis]
array([[0.03, 0.07],
       [0.04, 0.16]])

I know it is my responsibility to make the program correct, but implicit features like broadcasting make that more difficult for me. It doesn't feel safe to me.

I would welcome an explicit operator instead, allowing one to specify on which axis to broadcast.

E.g.

np.array([[0.3, 0.7], [0.2, 0.8]]) * np.array([0.1, 0.2])[np.newaxis|np.broadcast, :] # adds new dimension that is also broadcastable

np.array([[0.3, 0.7], [0.2, 0.8]]) * np.array([[0.1, 0.2]])[np.broadcast, :] # if the dimension already exists

This probably can't work exactly like that for whatever reason, but I just wanted to convey the basic idea.
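
For what it's worth, the replication itself can already be spelled out today with np.broadcast_to; this is a sketch of the explicit form, not part of the proposal:

>>> a = np.array([[0.3, 0.7], [0.2, 0.8]])
>>> b = np.broadcast_to(np.array([0.1, 0.2])[:, np.newaxis], a.shape)
>>> a * b
array([[0.03, 0.07],
       [0.04, 0.16]])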

rkern (Member) commented Jan 30, 2023

I don't believe that either of these are feasible, or will receive the necessary championing by anyone else on the development team to turn them into things that are feasible.

As for the first idea of the switch, we currently have one short set of broadcasting rules. Having two sets of rules controlled by a switch that may be far from the code that you are reading introduces significant cognitive load to the reader. So even if a switch were technically feasible, we would probably decide against implementing it as a design choice.

As for the latter, no, there's no real way to do that at the operator/ufunc level.

If you don't like our broadcasting rules and want to write your code in a way as if they don't exist, you can certainly do that right now. Reshape/repeat everything explicitly, just as if you had flipped your proposed switch. Sprinkle in asserts to help keep you honest. There are efforts in various places to enhance developer tooling for debugging and type-checking array shape logic. I would highly recommend redirecting all energy towards those efforts.
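
A minimal sketch of that style, reusing the shapes from the example above (names are illustrative):

import numpy as np

a = np.array([[0.3, 0.7], [0.2, 0.8]])
b = np.array([0.1, 0.2])[:, np.newaxis]  # make the intended axis explicit

# State the shapes you expect before operating; the assert fails loudly
# if an upstream change would otherwise broadcast silently.
assert a.shape == (2, 2) and b.shape == (2, 1), (a.shape, b.shape)
result = a * b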

clime (Author) commented Jan 30, 2023

> I don't believe that either of these are feasible, or will receive the necessary championing by anyone else on the development team to turn them into things that are feasible.
>
> As for the first idea of the switch, we currently have one short set of broadcasting rules. Having two sets of rules controlled by a switch that may be far from the code that you are reading introduces significant cognitive load to the reader. So even if a switch were technically feasible, we would probably decide against implementing it as a design choice.

I don't think any cognitive load is added: broadcasting would then need to be explicit, so you would see exactly what is happening in all places.

> As for the latter, no, there's no real way to do that at the operator/ufunc level.

Can you give counter-examples to my suggestions with np.broadcast?

> If you don't like our broadcasting rules and want to write your code in a way as if they don't exist, you can certainly do that right now. Reshape/repeat everything explicitly, just as if you had flipped your proposed switch. Sprinkle in asserts to help keep you honest. There are efforts in various places to enhance developer tooling for debugging and type-checking array shape logic. I would highly recommend redirecting all energy towards those efforts.

I was thinking about it, but placing an assert before basically every numpy operation is quite a lot of work and makes the code cluttered.

mattip (Member) commented Jan 30, 2023

We are very unlikely to change this in NumPy; it is part of the basic functionality of the library and has been adopted by the entire array processing community as part of the array API standard. One way you could implement this for yourself would be to use the __array_function__ protocol to override the normal broadcasting behaviour with your own rules. Beyond that there is not much we can help with.
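
A hedged sketch of that do-it-yourself route, using the related __array_ufunc__ protocol on an ndarray subclass (the names StrictArray and strict are illustrative, not an official API):

import numpy as np

class StrictArray(np.ndarray):
    """ndarray subclass that rejects ufunc operands with mismatching shapes."""

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        shapes = {np.shape(x) for x in inputs if np.ndim(x) > 0}
        if len(shapes) > 1:
            raise ValueError(f"implicit broadcasting disabled: shapes {shapes}")
        # Drop back to plain ndarrays so NumPy performs the computation.
        plain = tuple(np.asarray(x) for x in inputs)
        return getattr(ufunc, method)(*plain, **kwargs)

def strict(a):
    return np.asarray(a).view(StrictArray)

strict([[0.3, 0.7], [0.2, 0.8]]) * np.array([[0.1, 0.2], [0.1, 0.2]])  # ok: shapes match
strict([[0.3, 0.7], [0.2, 0.8]]) * np.array([0.1, 0.2])  # raises ValueError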

rgommers added the "57 - Close? Issues which may be closable unless discussion continued" label Jan 30, 2023
clime (Author) commented Jan 30, 2023

I could find quite a few open issues (e.g. against tensorflow: tensorflow/tensorflow#1406, and dynet: clab/dynet#1016) asking for such a flag, and also some people complaining about bugs that implicit broadcasting has caused.

Btw. there is an option in Octave to enable a warning about automatic broadcasting, warning("on", "Octave:broadcast");, and something similar in Matlab. This is a minimal thing that could be done.

Adding a warning that can be enabled for this would be a good enough solution for me.
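
As a hedged sketch of what such an opt-in warning could look like on the user side today (again via an ndarray subclass, not an existing NumPy flag):

import warnings
import numpy as np

class WarnOnBroadcast(np.ndarray):
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        shapes = {np.shape(x) for x in inputs if np.ndim(x) > 0}
        if len(shapes) > 1:
            # Roughly mirrors Octave's warning("on", "Octave:broadcast").
            warnings.warn(f"implicit broadcasting of shapes {shapes}",
                          UserWarning, stacklevel=2)
        plain = tuple(np.asarray(x) for x in inputs)
        return getattr(ufunc, method)(*plain, **kwargs)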

seberg (Member) commented Jan 30, 2023

I don't want to be too negative, because I do not disagree that this may have a little merit. However, I can only agree that it is a larger project and that getting core-dev buy-in at this point is unlikely. (Which is typical even for projects we are excited about; we have very limited core-dev resources.) So it would require someone spending a significant effort on it with an uncertain outcome.

One note: you cannot do "library specific" behavior, so you would have to change it for everyone, which would break a lot of code for, at this point, unclear gains. Such efforts normally require writing a NEP.

Adding a warning might be possible, but to me it also seems pretty hard to justify for something that I doubt more than a handful of users would use in practice. You mention Octave/Matlab: those come from a world without any broadcasting, so it makes sense for them to have such warnings. But do people actually use e.g. the Octave warning unless they are worried about portability, or about freshly ported code that previously ran in an environment without any broadcasting?

eric-wieser (Member) commented Jan 30, 2023

I would argue the feature you're requesting is not "disabling broadcasting", but rather "disabling insertion of leading 1 dimensions during broadcasting". Obviously you realize this, based on your example, but your phrasing is likely to bias people against the suggestion before they finish reading it.
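
To illustrate the distinction in plain NumPy:

import numpy as np

a = np.ones((2, 2))
a * np.ones(2)       # rank promotion: (2,) is first treated as (1, 2)
a * np.ones((2, 1))  # broadcasting without rank promotion: (2, 1) -> (2, 2)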

eric-wieser (Member) commented

Am I right in thinking that the underlying problem this is trying to solve is accidental broadcasting of the wrong dimension? If so, you might be interested in xarray, which IIRC uses names for the dimensions instead of numbers, making incorrect broadcasting much less likely.
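
A small sketch of that style, assuming xarray is installed (the dimension names are illustrative):

import numpy as np
import xarray as xr

weights = xr.DataArray(np.array([[0.3, 0.7], [0.2, 0.8]]),
                       dims=("sample", "feature"))
scale = xr.DataArray(np.array([0.1, 0.2]), dims="sample")

# Alignment is by dimension *name*, so scale multiplies along "sample"
# regardless of axis order; an accidental broadcast along "feature"
# cannot happen silently.
result = weights * scale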

clime (Author) commented Jan 30, 2023

> I would argue the feature you're requesting is not "disabling broadcasting", but rather "disabling insertion of leading 1 dimensions during broadcasting". Obviously you realize this, based on your example, but your phrasing is likely to bias people against the suggestion before they finish reading it.

For sure, inserting a new dimension during broadcasting makes the feature less safe, and e.g. tensorflow XLA forbids it (https://www.tensorflow.org/xla), if I understood correctly.

It makes a difference, but it's just a partial fix/solution. I could rely on my code more (without lengthy checking) if I knew that no broadcasting takes place without my knowledge.

im-Kitsch commented

It would be really helpful; otherwise, people must check the dimensions very carefully. And JAX already supports this feature:

https://jax.readthedocs.io/en/latest/rank_promotion_warning.html
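
For reference, the opt-in looks roughly like this (per the linked docs):

import jax
import jax.numpy as jnp

jax.config.update("jax_numpy_rank_promotion", "warn")  # or "raise"

a = jnp.array([[0.3, 0.7], [0.2, 0.8]])
a * jnp.array([0.1, 0.2])           # warns: implicit promotion of (2,) to (2, 2)
a * jnp.array([0.1, 0.2])[:, None]  # silent: ranks already match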
