Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement co-broadcasting in operator overloading #898

Merged
merged 16 commits into from
Mar 12, 2021

Conversation

SparrowLii
Copy link
Contributor

@SparrowLii SparrowLii commented Jan 20, 2021

This PR did the following to implement co_broadcasting in operator overloading

  1. Implement DimMax trait, which uses the same broadcast mechanism as Numpy, for all Dimension so as to get the returned array type in operator overloading.
  2. Use map_collect() and map_collect_owned() in operator overloading to avoid redundant array traversal.

where
A: Clone + Zero + FromPrimitive + Add<Output = A> + Div<Output = A>,
D: RemoveAxis,
D::Smaller: BroadcastShape<Ix0>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can make a specific comment here - but it is more important to talk about the whole picture in this pull request.

Specifically, this new bound

  1. Looks like mean_axis is suddenly exposing unnecessary implementation details - we don't want to make any externally visible changes to this method's signature - what is required to keep the old signature?

  2. This is, to me, a sign that users will have to use this kind of bound in their dimension-generic array code. I think that makes ndarray harder to use, even if it's more "powerful". Objectively that looks like a blocking concern for this feature. I mean, if the answer to (1) is "don't use arithmetic ops", then I have reservations.

Copy link
Member

@jturner314 jturner314 Jan 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point and something I hadn't considered before. Unfortunately, I don't think there's a way to avoid the extra bound without improvements to the language (either specialization or enum-variants-as-types), since the compiler needs to know that BroadcastShape is implemented for all possible combinations of dimension types to remove the bound.

Copy link
Member

@bluss bluss Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's see how far we get with the blanket implementation for all Dimension with output Self and then maybe it can be added impls that show that Ix0 can broadcast to any other dimension.

Already now we need to think about "min const generics" which hit stable soon. What can we preserve when transferring over to const generics for const dimensions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of a good way :( If we add a statement like this:

    impl<D: Dimension> BroadcastShape<Ix0> for D {
        type BroadcastOutput = D;
    }

Then it will conflict with the statement on broadcast.rs:62( Although they don’t actually conflict ).

@bluss
Copy link
Member

bluss commented Jan 20, 2021

Co-broadcasting is cool and interesting! I can see that you are attacking our fixme's in code one by one and that's gratifying, but it doesn't always tell the whole story of what might be the most important issues to solve. (*)

Edit: On further thought the code example below might just work out of the box - that would resolve most of this :)

I think the type system is still limited in what can be expressed, and we seem to have a typenum-like problem here (not really the fault of typenum, just the limits of the type system). Expressing one generic arithmetic operation is ok.. but what about two? Will it be usable for the user? Imagine this:

type Mat<D> = Array<f64, D>;
fn foo<D>(x: Mat<D>, y: Mat<D>, z: Mat<D>) -> Mat<D> {
   x + &y * &z
}

How would we write the actual return type of the function? Maybe in some cases this change makes it harder for the user of ndarray, or can we find a great solution?

It could be that co-broadcasting should be a feature that ndarray implements, without using it by default in the arithmetic ops.

(*) Specifically see the issue about #699 - there are lots of things we want to spend the - implementation, complexity, etc - budget on in arithmetic ops.

@SparrowLii
Copy link
Contributor Author

SparrowLii commented Jan 21, 2021

How would we write the actual return type of the function? Maybe in some cases this change makes it harder for the user of ndarray, or can we find a great solution?

If the dimension types of the parameters are all D, the result can also be written directly as D. But if the dimensions are not the same, it will be really troublesome. That's right, if the cost of using operators is that additional restrictions must be added, it maybe better to risk panic. I'm sorry I can't think of a better way for the time being. Thanks for pointing out.

@bluss bluss changed the title Fix co_broadcast in operator overloading Implement co-broadcasting in operator overloading Jan 21, 2021
@bluss
Copy link
Member

bluss commented Jan 21, 2021

Changed the title from "fix" since it's a new feature, not a fix. There is more feedback and thoughts to be given, when I have time. 🙂

@bluss
Copy link
Member

bluss commented Jan 23, 2021

Here's a trick that actually works to resolve the mean situation. Just to show you a fun trick to use. Ignore the broadcast_shape method change for now - that one can be fixed later (method bodies don't matter for type checking(!)).

What we do here is that we put the BroadcastShape<Self> and BroadcastShape<Ix0> requirements into requirements of the Dimension itself. That way we rearrange the situation so that instead every type that implements Dimension has to prove - for itself - that it implements these two broadcasts. It seems to sidestep the overlapping problem.

See this commit on a temporary branch for the diff to this PR: 54c3a85

@jturner314
Copy link
Member

jturner314 commented Jan 23, 2021

What we do here is that we put the BroadcastShape<Self> and BroadcastShape<Ix0> requirements into requirements of the Dimension itself.

That's a really nice solution. Is it possible to add BroadcastShape<Self::Smaller, BroadcastOutput=Self> and BroadcastShape<Self::Larger, BroadcastOutput=Self::Larger> bounds too? If so, I would think all four bounds would avoid a need for users to add BroadcastShape bounds to their functions in most cases.

Edit: I wonder if we could also add BroadcastShape<Self::Smaller::Smaller, BroadcastOutput=Self> and BroadcastShape<Self::Larger::Larger, BroadcastOutput=Self::Larger::Larger> (and more, similar bounds of increasing depth)? Plus, we could add BroadcastShape<IxDyn, BroadcastOutput=IxDyn>.

@SparrowLii
Copy link
Contributor Author

SparrowLii commented Jan 23, 2021

What we do here is that we put the BroadcastShape and BroadcastShape requirements into requirements of the Dimension itself. That way we rearrange the situation so that instead every type that implements Dimension has to prove - for itself - that it implements these two broadcasts. It seems to sidestep the overlapping problem.

That is amazing! It's so surprising it works! However, we still can’t add statements like + BroadcastShape<Self::Smaller, BroadcastOutput=Self> because it will cause cycles in analysis passes. But the exciting thing is that I tried this way in the definition of Dimension:

/// Next smaller dimension (if applicable)
type Smaller: Dimension + BroadcastShape<Self, BroadcastOutput=Self>;
/// Next larger dimension
type Larger: Dimension + RemoveAxis + BroadcastShape<Self, BroadcastOutput=Self::Larger>;

And it did run successfully in my nightly version (cargo 1.49.0-nightly (d5556aeb8 2020-11-04) ), I think we can achieve this amazing work as soon as the 1.49 version is stable.

[Edit]I updated my rust version and found that the stable version 1.49 has been released, and it does works. But the CI of ndarray still seems to be version 1.42. On the other hand, such changes may affect users of previous rust versions, just like me lol

@bluss
Copy link
Member

bluss commented Jan 23, 2021

Note that + BroadcastShape<Self::Smaller, BroadcastOutput=Self> does work it just has to be written like + BroadcastShape<<Self as Dimension>::Smaller, BroadcastOutput=Self> to help the compiler.

@jturner314
Copy link
Member

We could also add this bound to RemoveAxis: BroadcastShape<<<Self as Dimension>::Smaller as Dimension>::Larger, BroadcastOutput = Self>.

src/impl_ops.rs Outdated Show resolved Hide resolved
src/impl_ops.rs Outdated Show resolved Hide resolved
src/impl_ops.rs Outdated Show resolved Hide resolved
Copy link
Member

@bluss bluss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry I haven't at this time provided a high-level plan

src/impl_ops.rs Outdated Show resolved Hide resolved
src/impl_ops.rs Show resolved Hide resolved
src/impl_ops.rs Outdated Show resolved Hide resolved
tests/array.rs Outdated Show resolved Hide resolved
tests/array.rs Outdated Show resolved Hide resolved
@SparrowLii
Copy link
Contributor Author

I used uninitialized() to create an uninitialized array, and added the zip_mut_from_pair method to impl_methods.rs to get the result. Now no matter whether we need co_broadcast or not, we only need to traverse self and rhs once each. And ArcArray will not be turned into Array. But there is still a problem: both uninitialized() and maybe_uninit() methods require elements to implement the copy trait (I don’t know the reason behind this). So it is still unavoidable to change the signature of the method in impl_numeric.rs.

@bluss
Copy link
Member

bluss commented Jan 25, 2021

uninitialized is hard to use correctly. I haven't removed it yet because I think I have managed to put it on sound ground - it is technically possible to use it correctly. See issue #804 and issue #876 (we have just completed removing all internal usage of this method).

Unfortunately ArrayBase::uninitialized:

  1. Has some usage by actual users - it's a basically popular and easy to try to use (github code search confirms)
  2. This usage is from casual inspection almost always technically incorrect

Please use maybe_uninit that I mentioned. 🙂 If you do, then I don't have to go through and point out where your usage of uninitialized breaks the soundness rules 🙂


pub trait BroadcastShape<Other: Dimension> {
/// The resulting dimension type after broadcasting.
type BroadcastOutput: Dimension;
Copy link
Member

@bluss bluss Jan 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A trait with just one associated type - an "output type" (See RawDataSubst and FnOnce as inspiration) should almost always prefer the associated type name "Output" as a matter of convention. Think of it as a type-level function's return value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks much for teaching. I changed BroadCastOutput to Output

S1: Data<Elem = B>,
S2: Data<Elem = C>,
F: Fn(&B, &C) -> A,
{
Copy link
Member

@bluss bluss Jan 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use Zip in this method (is it slower?)? Zip::map_assign_into should do it (assign into self).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we can use Zip. This function does not need to be added.

src/impl_ops.rs Outdated Show resolved Hide resolved
@SparrowLii
Copy link
Contributor Author

SparrowLii commented Jan 26, 2021

Please use maybe_uninit that I mentioned. 🙂 If you do, then I don't have to go through and point out where your usage of uninitialized breaks the soundness rules 🙂

I tried to use maybe_uninit(), but I can’t think of a simple way to generate an ArrayBase with the data type MaybeUninit<A>. So I added a trait named MaybeUninitSubst to get the mapping from ArrayBase with element A to MaybeUninit<A>. In addition used Zip to get the results.

@bluss
Copy link
Member

bluss commented Jan 26, 2021

Good fighting. The type related problems with the maybe_uninit constructor are not so fun, we're really feeling that we miss the "GAT" feature here (still experimental).

@bluss
Copy link
Member

bluss commented Jan 29, 2021

I'm working on fixing the maybe uninit functionality so that it can do what we need.

@SparrowLii
Copy link
Contributor Author

I'm working on fixing the maybe uninit functionality so that it can do what we need.

Cool! It will be very exciting if it and co_broadcast both land.

@SparrowLii
Copy link
Contributor Author

@bluss Thank you so much for the amazing work! I have rebased the pr and used map_collect_owned according to your guide to avoid unsafe code.

src/dimension/broadcast.rs Outdated Show resolved Hide resolved
src/dimension/broadcast.rs Outdated Show resolved Hide resolved
Copy link
Member

@bluss bluss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's starting to getting in shape, we've managed to reduce the arithmetic ops trait requirements to a minimum (optimal would be to have all dimensions implement BroadcastShape<E> but it seems impossible and this is acceptable for now).

Arithmetic ops and broadcasting docs (on ArrayBase) look like they need an update

src/dimension/broadcast.rs Outdated Show resolved Hide resolved
src/dimension/broadcast.rs Outdated Show resolved Hide resolved
/// The resulting dimension type after broadcasting.
type Output: Dimension;

/// Determines the shape after broadcasting the dimensions together.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shapes, not dimensions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Determines the shape after broadcasting the dimensions together.
/// Determines the shape after broadcasting the shapes together.

src/dimension/broadcast.rs Show resolved Hide resolved
src/impl_ops.rs Outdated Show resolved Hide resolved
src/impl_ops.rs Outdated Show resolved Hide resolved
tests/array.rs Show resolved Hide resolved
tests/array.rs Outdated Show resolved Hide resolved
src/impl_ops.rs Outdated
Comment on lines 178 to 180
let shape = self.dim.broadcast_shape(&rhs.dim).unwrap();
let lhs = self.broadcast(shape.clone()).unwrap();
let rhs = rhs.broadcast(shape).unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this to perform well we need to ensure that no-ops pass through efficiently. Maybe it can be a task for another time. The existing .zip_mut_with solution it only broadcasts if necessary, so it can be a lot more efficient. Let's fix that after everything else - not because it's less important but because it makes the commit history easier to understand.

Copy link
Contributor Author

@SparrowLii SparrowLii Feb 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right. But the method to convert the dim of ArrayBase seems to have only into_dimensionality and broadcast. However, the former cannot be used for reference types. I think we can add the judgment whether the shape is the same in broadcast.

// Uses the [NumPy broadcasting rules]
// (https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html#general-broadcasting-rules).
//
// Zero dimension element is not in the original rules of broadcasting.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which original rules? ArrayBase::broadcast seems concsistent with these rules already. Also, we need to ensure here that we do exactly the same thing as in ArrayBase::broadcast.

Copy link
Contributor Author

@SparrowLii SparrowLii Feb 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not find a description of the zero value in the Numpy broadcast rules cited above. However, if there is no difference between a zero value and a value greater than 1, I think this explanation can be omitted.
ArrayBase::broadcast only detects the shape mismatch, and does not make any changes to dim.

*x = x.clone() $operator y.clone();
});
self
if self.ndim() == rhs.ndim() && self.shape() == rhs.shape() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't reviewed the PR in detail, but I just wanted to make a small comment here. There's another case that's worth considering: when it's possible to broadcast the RHS to match the existing shape of the LHS. In this case, it's possible to reuse the existing LHS allocation (like in the current release of ndarray and the if branch here), instead of collecting into a new array (like the else branch here).

This observation also applies analogously to the other ops impl with DataOwned.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is room for future improvement.

@bluss
Copy link
Member

bluss commented Feb 18, 2021

I'll just make a list of "known issues" that we can fix after the PR review and merge

  • Rename BroadcastShape to DimMax - we also move the broadcast method out of the trait (it can just be a free function)
  • It is a wish that the D: Dimension + BroadcastShape<E> bound in arith ops becomes just D: Dimension, i.e. the DimMax with associated type becomes implied by D: Dimension. At this point it is not certain how this can be done.
  • Preserve the allocation of the input array in arithmetic ops in more cases (when only the left or right side needs broadcast)
  • Improve performance of "no-broadcasting-needed" scenario in &array + &array operation
  • Potentially provide public function or method similar to broadcast_with - that does the broadcasting of two arrays together.

src/impl_methods.rs Outdated Show resolved Hide resolved
@SparrowLii
Copy link
Contributor Author

SparrowLii commented Feb 20, 2021

@bluss It has been modified accordingly. I very much look forward to our ability to solve the above problems in the future. Especially removing the restriction like BroadcastShape<E>, it may represent a new language feature IMO, which will be so cool and exciting.

@bluss
Copy link
Member

bluss commented Mar 12, 2021

I've rebased this to fix the conflict (sorry about that). Could we update the PR description for the current state of the PR before merge?

For consistency with other dimension traits (to come);
with <A as DimMax<B>>, Output is the maximum of A and B.
While calling co_broadcast directly is less convenient, for now they are
two different functions.
@SparrowLii
Copy link
Contributor Author

I've rebased this to fix the conflict (sorry about that). Could we update the PR description for the current state of the PR before merge?

OK, it has been modified accordingly

@bluss
Copy link
Member

bluss commented Mar 12, 2021

Thanks :)

@bluss bluss merged commit b5687f8 into rust-ndarray:master Mar 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants