Wording change: Tighten up output shape calculation algorithms #523

inexorabletash · 2024-01-20T00:08:30Z

For gemm() - make mutated lists explicit clones using Infra terminology, and simplify the wording for reversing the lists.

For matmul() - make mutated lists explicit clones using Infra terminology, use the append/prepend definitions from Infra, change a variable update from "let" to "set", and drop the use of "array".

For #395

inexorabletash · 2024-01-20T00:09:59Z

For matmul(), the current text is:

If shapeA[0] is not equal to shapeB[sizeB - 2], then throw an "OperationError" DOMException.
If shapeA[sizeA - 1] is not equal to shapeB[0], then throw an "OperationError" DOMException.

Should the former be - 1 as well? If not, what happens if sizeB is < 2 ?
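To make the sizeB < 2 question concrete, here is a minimal Python sketch of reading index sizeB - 2 from a shape list. The helper name `second_to_last` is made up for illustration; it is not spec or Chromium text. For a 1-D shape the index comes out as -1, which does not exist in a zero-based list. Python itself would silently wrap a negative index to the last item, so the sketch checks explicitly.

```python
def second_to_last(shape):
    """Return shape[size - 2], raising if that index does not exist."""
    index = len(shape) - 2
    if index < 0:
        # Python would read shape[-1] as "the last item"; spec-style list
        # indexing has no such wraparound, so treat it as an error here.
        raise IndexError(f"index {index} is out of range for shape {shape}")
    return shape[index]


print(second_to_last([2, 3, 4]))  # 3

try:
    second_to_last([5])  # sizeB == 1, so sizeB - 2 == -1
except IndexError as error:
    print("error:", error)
```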

zolkis

Looks good. Thanks.

huningxin · 2024-01-24T09:32:23Z

index.bs

    1. [=map/For each=] |index| in [=the range=] 0 to |size|, exclusive:
-        1. Set |shape|[|index|] to the maximum of |shapeA|[|index|] and |shapeB|[|index|].
+        1. [=list/Append=] the maximum of |shapeA|[|index|] and |shapeB|[|index|] to |shape|.


Nit: [=list/append=] for consistency?

Would index be out-of-bounds if the size of a particular shape is less than size?

And I think the previous algorithm itself has issue. If the size of a shape is less than maximum size, this shape should be prepended 1 until its size is equal to maximum size before entering this loop.

Re: [=list/append=] - It's written with a capital A so that the start of the sentence is capitalized. Bikeshed is smart enough to case-insensitive match the mixed-case version, but not smart enough to capitalize the start of the sentence.

Re: previous algorithm - agreed, this looks "sus".
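The fix suggested above (prepend 1s to the shorter shape until both sizes match, then take the per-index maximum) can be sketched roughly as follows. This is an illustrative Python sketch, not spec text: the function name `padded_max_shape` is invented here, and like the step under review it does not check that the dimensions are actually compatible.

```python
def padded_max_shape(shape_a, shape_b):
    """Left-pad the shorter shape with 1s, then take the per-index maximum."""
    size = max(len(shape_a), len(shape_b))
    # Work on clones so the caller's lists are never mutated.
    padded_a = [1] * (size - len(shape_a)) + list(shape_a)
    padded_b = [1] * (size - len(shape_b)) + list(shape_b)
    shape = []
    for index in range(size):
        # Both lists now have exactly `size` items, so the index is in bounds.
        shape.append(max(padded_a[index], padded_b[index]))
    return shape


print(padded_max_shape([3], [2, 1, 3]))  # [2, 1, 3]
print(padded_max_shape([4, 1], [3]))     # [4, 3]
```

Without the padding step, indexing the shorter list with the same `index` would run past its end, which is the out-of-bounds concern raised above.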

With the caveat that I'm still new to tensors, my intuition at least according to the prose ("If either a or b is N-dimensional where N > 2, it is treated as a stack of matrices with dimensions corresponding to the last two indices...") is that we need the end result to be:

sizeA == sizeB (achieved by prepending - how does appending come in???)

sizeA > 1 (via appending/prepending; alternately we reject per Simplify matmul op #470)

sizeB > 1 (via appending/prepending; alternately we reject per Simplify matmul op #470)

shapeA[sizeA - 1] = shapeB[sizeB - 2] (i.e. columns in first matrix = rows in second matrix)

... and the algorithm as written doesn't seem to either try to achieve this or enforce it.

Maybe someone that knows could weigh in with a more precise description or pseudocode? I can translate to spec language!

Chromium implementation does this:

    if sizeA < 2 || sizeB < 2, throw  -- per Simplify matmul op #470
    colsA = shapeA[sizeA - 1]
    rowsA = shapeA[sizeA - 2]
    colsB = shapeB[sizeB - 1]
    rowsB = shapeB[sizeB - 2]
    if colsA != rowsB, throw  -- the 2D matrices must match
    slicedA = shapeA[0 : sizeA - 2]
    slicedB = shapeB[0 : sizeB - 2]
    if sizeA > 2 && sizeB > 2
        b = broadcastShapes(slicedA, slicedB, bidirectional = true) or throw
        outputShape = b with [rowsA, colsB] appended
    else if sizeA == 2 && sizeB == 2
        outputShape = [rowsA, colsB]
    else (sizeA == 2 or sizeB == 2)
        outputShape = ((sizeA > sizeB) ? slicedA : slicedB) with [rowsA, colsB] appended

... where broadcastShapes() is a whole 'nother can of worms.
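A rough Python transcription of the steps listed above, for anyone who wants to poke at them. This is a sketch, not the Chromium code: `matmul_output_shape` and `broadcast_shapes` are names invented here, and the broadcast helper assumes NumPy-style bidirectional broadcasting of the leading (batch) dimensions.

```python
def broadcast_shapes(shape_a, shape_b):
    """Bidirectionally broadcast two shapes (NumPy-style); raise on mismatch."""
    size = max(len(shape_a), len(shape_b))
    a = [1] * (size - len(shape_a)) + list(shape_a)
    b = [1] * (size - len(shape_b)) + list(shape_b)
    out = []
    for dim_a, dim_b in zip(a, b):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
        # A dimension of 1 stretches to match the other operand.
        out.append(dim_b if dim_a == 1 else dim_a)
    return out


def matmul_output_shape(shape_a, shape_b):
    """Compute the matmul output shape following the steps listed above."""
    size_a, size_b = len(shape_a), len(shape_b)
    if size_a < 2 or size_b < 2:
        raise ValueError("both inputs need at least 2 dimensions")
    rows_a, cols_a = shape_a[-2], shape_a[-1]
    rows_b, cols_b = shape_b[-2], shape_b[-1]
    if cols_a != rows_b:
        raise ValueError("columns of a must equal rows of b")
    sliced_a = list(shape_a[:-2])  # leading (batch) dimensions of a
    sliced_b = list(shape_b[:-2])  # leading (batch) dimensions of b
    if size_a > 2 and size_b > 2:
        batch = broadcast_shapes(sliced_a, sliced_b)
    elif size_a == 2 and size_b == 2:
        batch = []
    else:
        # Exactly one input is 2-D; keep the other input's batch dimensions.
        batch = sliced_a if size_a > size_b else sliced_b
    return batch + [rows_a, cols_b]


print(matmul_output_shape([2, 3], [3, 4]))              # [2, 4]
print(matmul_output_shape([5, 2, 3], [3, 4]))           # [5, 2, 4]
print(matmul_output_shape([1, 6, 2, 3], [5, 1, 3, 4]))  # [5, 6, 2, 4]
```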

I notice https://webmachinelearning.github.io/webnn/#mlgraphbuilder-broadcast-shapes says it is implementation-defined - is that true!?!?!?

I put #534 up to define broadcasting shapes, so for this PR we could wait for that to land, then rebase this on top of it and I can translate the above into spec text. I'm feeling close to actually understanding what's going on!

> I notice https://webmachinelearning.github.io/webnn/#mlgraphbuilder-broadcast-shapes says it is implementation-defined - is that true!?!?!?

Broadcast shouldn't be implementation-defined; thanks for putting up a PR to fix it.

Latest update includes a commit that rewrites the matmul output shape calc to match Chromium's impl. PTAL?

@huningxin I see you're good with the rest of the CR, and Joshua's last iteration addressed this comment too. So I merged it, but please let me know when you're back from vacation if there's any other little feedback.

* New content: Add definition for shape broadcasting

This change introduces a new section for Algorithms, following APIs, to collect algorithms referenced throughout the specification.

A section for Broadcasting is introduced, which defines broadcasting shapes and gives an explicit algorithm matching WebNN implementations of NumPy's General Broadcasting Rules. Definitions for "broadcastable" and "unidirectionally broadcastable" are introduced.

The previous definition of "broadcast-shapes" is removed in favor of these new algorithms.

Use broadcasting definition in expand(), rather than bespoke steps.

For #324, #378, #462, and potentially #523.

Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>

* Fix prelu parameter order

Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>
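As a rough illustration of the two notions this commit message names, here is a Python sketch that assumes they follow NumPy's General Broadcasting Rules, as the message states. The function names are invented for this example, and returning `None` on failure is a convention of the sketch, not of the specification.

```python
def unidirectionally_broadcast(shape_from, shape_to):
    """Broadcast shape_from to shape_to only; return None on failure."""
    if len(shape_from) > len(shape_to):
        return None
    padded = [1] * (len(shape_to) - len(shape_from)) + list(shape_from)
    for dim_from, dim_to in zip(padded, shape_to):
        # Only the source may stretch, and only from a dimension of 1.
        if dim_from != dim_to and dim_from != 1:
            return None
    return list(shape_to)


def bidirectionally_broadcast(shape_a, shape_b):
    """Broadcast both shapes toward a common shape; return None on failure."""
    size = max(len(shape_a), len(shape_b))
    a = [1] * (size - len(shape_a)) + list(shape_a)
    b = [1] * (size - len(shape_b)) + list(shape_b)
    out = []
    for dim_a, dim_b in zip(a, b):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return None
        out.append(dim_b if dim_a == 1 else dim_a)
    return out


print(unidirectionally_broadcast([3, 1], [2, 3, 4]))  # [2, 3, 4]
print(unidirectionally_broadcast([4, 1], [1, 5]))     # None
print(bidirectionally_broadcast([4, 1], [1, 5]))      # [4, 5]
```

The second and third calls show the difference: [4, 1] cannot be stretched to [1, 5] on its own, but the two shapes do have a common broadcast shape when both sides are allowed to stretch.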

inexorabletash · 2024-02-13T18:23:31Z

Rebased and force-pushed - sorry about that. There are two commits - the first is unchanged from before, the second is a rewrite of the matmul shape calculations to match the Chromium impl.

Please take a look?

fdwr

We can reduce some redundant GEMM steps, but otherwise LGTM sir.

fdwr

👍 Thanks J.

inexorabletash mentioned this pull request Jan 20, 2024

Bug fix: Correct indentation of an iteration substep #522

Merged

zolkis reviewed Jan 22, 2024

inexorabletash force-pushed the wording-calc-output-shapes branch 2 times, most recently from 49e76e6 to 9596953 on January 23, 2024 18:12

huningxin reviewed Jan 24, 2024

inexorabletash force-pushed the wording-calc-output-shapes branch from 9596953 to 93bd980 on January 24, 2024 17:30

inexorabletash mentioned this pull request Jan 26, 2024

New content: Add definition for shape broadcasting #534

Merged

inexorabletash force-pushed the wording-calc-output-shapes branch from 93bd980 to c5314ea on January 27, 2024 19:01

inexorabletash marked this pull request as draft February 2, 2024 17:51

inexorabletash force-pushed the wording-calc-output-shapes branch from c5314ea to 5af908b on February 6, 2024 18:14

inexorabletash force-pushed the wording-calc-output-shapes branch from 5af908b to 4e6cd98 on February 7, 2024 17:43

inexorabletash added 2 commits February 13, 2024 09:56

Rewrite to match Chromium impl

30acaf4

inexorabletash force-pushed the wording-calc-output-shapes branch from 4e6cd98 to 30acaf4 on February 13, 2024 18:19

inexorabletash marked this pull request as ready for review February 13, 2024 18:21

Merge branch 'main' of github.com:webmachinelearning/webnn into wording-calc-output-shapes

db81f9b

inexorabletash requested a review from fdwr February 13, 2024 23:21

fdwr reviewed Feb 14, 2024

fdwr approved these changes Feb 14, 2024

Update index.bs

9deaeb5

Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>

fdwr approved these changes Feb 14, 2024

fdwr merged commit a2f7e0a into webmachinelearning:main Feb 14, 2024
2 checks passed

github-actions bot added a commit that referenced this pull request Feb 14, 2024

Wording change: Tighten up output shape calculation algorithms (#523)

7d324d0

SHA: a2f7e0a Reason: push, by fdwr Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

inexorabletash deleted the wording-calc-output-shapes branch February 14, 2024 18:41
