
Revise graph resource validation #622

Conversation

@inexorabletash (Contributor) commented Mar 26, 2024

compute() calls validate graph resources with the passed input/output maps and the graph's input/output descriptors. The intent is to ensure that all of a graph's descriptors are matched to a passed resource. However, the steps were only validating that a passed resource had a corresponding descriptor - it wouldn't flag if a resource was missing!

The Chromium prototype implementation made the test reflexive - all resources must have matching descriptors and vice versa. After deliberation we settled on a more helpful behavior: all inputs must be provided, but extra inputs are ignored. A requested output must be present, but dropping outputs is allowed.

Fixes #602
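The agreed behavior can be sketched in plain JavaScript (a hypothetical illustration, not spec text; `validateInputs`, `validateOutputs`, and the plain-object maps are stand-ins for the spec's ordered maps and algorithms):

```javascript
// Hypothetical sketch of the agreed validation behavior.
// Inputs: every descriptor must be matched by a resource; extras are ignored.
// Outputs: every requested resource must match a descriptor; unrequested
// outputs are simply dropped.
function validateInputs(inputs, inputDescriptors) {
  for (const name of Object.keys(inputDescriptors)) {
    if (!(name in inputs)) return false; // a required input is missing
  }
  return true; // extra entries in |inputs| are ignored
}

function validateOutputs(outputs, outputDescriptors) {
  for (const name of Object.keys(outputs)) {
    if (!(name in outputDescriptors)) return false; // unknown output requested
  }
  return true; // descriptors with no requested resource are skipped
}
```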



compute() calls validate graph resources with the passed input/output
maps and the graph's input/output descriptors. The intent is to ensure
that all of a graph's descriptors are matched to a passed resource.
However, the steps were only validating that a passed resource had a
corresponding descriptor - it wouldn't flag if a resource was missing!

The Chromium prototype implementation makes the test reflexive - all
descriptors must have a corresponding resource and all resources must
have a corresponding descriptor. Incorporate that into the spec in the
same way - if the number of keys is the same, and each resource has a
descriptor, then _ipso facto_* each descriptor has a resource.

Fixes webmachinelearning#602

(* = a Latin phrase meaning "the writer is pretentious")
@inexorabletash (Contributor Author):

@huningxin and @fdwr - can you please review? Thanks!

index.bs Outdated
@@ -840,6 +840,7 @@ When the {{MLContext/[[contextType]]}} is set to [=context type/default=] with t
<summary>
To <dfn>validate graph resources</dfn>, given {{MLNamedArrayBufferViews}} |resources| and [=ordered map=] |descriptors|, run the following steps:
</summary>
1. If |resources|'s [=map/size=] does not equal |descriptors|'s [=map/size=], then return false.
Collaborator:

Isn't this implied by the loop below? (but a quick test is fine)

@inexorabletash (Contributor Author) replied Mar 26, 2024:

If resources's keys are [ "a", "b" ] but descriptors's keys are [ "a", "b", "c" ] the loop below won't fail:

  1. For each name → resource of resources:

So this would iterate over [ "a", "b" ]

  1. If descriptors[name] does not exist, return false.

descriptors["a"] and descriptors["b"] exist, so this does not fail

  1. If validating buffer with descriptor given resource and descriptors[name] returns false, then return false.

Presumably this also doesn't fail

  1. Return true.

So we return true, even though "c" is not provided.
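The hole can be demonstrated with a small stand-alone sketch of the old steps (hypothetical function name; the "validate buffer with descriptor" compatibility check is elided):

```javascript
// Sketch of the ORIGINAL (buggy) "validate graph resources" steps:
// they only check that each passed resource has a descriptor, so a
// descriptor with no matching resource is never flagged.
function validateGraphResourcesOld(resources, descriptors) {
  for (const name of Object.keys(resources)) {
    if (!(name in descriptors)) return false;
    // ("validate buffer with descriptor" step elided)
  }
  return true;
}

// descriptors include "c", but no resource for "c" was passed:
const result = validateGraphResourcesOld(
  { a: new Float32Array(1), b: new Float32Array(1) },
  { a: {}, b: {}, c: {} });
// result is true, even though "c" was never provided
```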

@fdwr (Collaborator) commented Mar 27, 2024

Before reviewing, I'm just thinking aloud about the various cases, given a graph that expects 2 inputs and generates 2 outputs...

  • await context.compute(graph, {input1: inbuf1, input2: inbuf2}, {output1: outbuf1, output2: outbuf2});
    ✅ Exactly matching. All required inputs satisfied. All generated outputs bound.
  • await context.compute(graph, {input1: inbuf1}, {output1: outbuf1, output2: outbuf2});
    ❌ Missing a required input.
  • await context.compute(graph, {input1: inbuf1, input2: inbuf2, input3: inbuf3}, {output1: outbuf1, output2: outbuf2});
    🤷‍♂️✔ Extra input specified. I foresee useful cases for passing a potentially ignorable input so an app can provide inputs more generically regardless of the model flavor. Say one diffusion model doesn't use a mask input, whereas another inpainting model does use the mask; having compute() ignore that parameter for the non-inpainting model is cleaner for the caller than inspecting which inputs are actually present and branching between them, especially when there are many permutations of potential inputs.
  • await context.compute(graph, {input1: inbuf1, input2: inbuf2}, {output1: outbuf1});
    🤷‍♂️✔ An output generated by the graph was ignored and discarded. This seems fine to me too. The caller doesn't always care about all outputs, and needing to create a dummy variable just to discard it seems pointless.
  • await context.compute(graph, {input1: inbuf1, input2: inbuf2}, {output1: outbuf1, output2: outbuf2, output3: outbuf3});
    🤷‍♂️❌ If you require an output and the graph doesn't provide it, that's an error. Though, by the same logic above for optionally unused inputs, there can be a set of closely related model flavors where some provide an output and others do not, and the output in that case would be missing from the returned results.outputs. So maybe this is 🤷‍♂️✔ too, where the caller has to check the dictionary. I suppose there isn't an IDL dictionary syntax to express {requiredOutput1: outbuf1, optionalOutput2: outbuf2}. 🤔...

Anyway, we definitely both agree (per #602) on the case where it "Should fail, because input2 is missing - but it doesn't" ❌.

@huningxin (Contributor):

  • await context.compute(graph, {input1: inbuf1, input2: inbuf2}, {output1: outbuf1});
    🤷‍♂️✔ An output generated by the graph was ignored and discarded.

Computing a subset of outputs is supported. There was an example using the sync compute API: https://www.w3.org/TR/2024/CRD-webnn-20240214/#example-4a21156c. Unfortunately, my PR #548 that removed the sync APIs deleted this example as well. We may want to bring it back by adapting it to the async compute API.

@inexorabletash (Contributor Author) commented Mar 28, 2024

there isn't an IDL dictionary syntax to express {requiredOutput1: outbuf1, optionalOutput2: outbuf2}.

Correct.

Based on the ✔/🤷‍♂️/❌ analysis above, it sounds like we might want:

  • Inputs:
    • every input descriptor must have matching input resource: can't run inference with missing inputs!
    • an input resource without a matching input descriptor is okay: some graphs don't use it
  • Outputs:
    • every output resource must have a matching output descriptor: asking for an output that's not provided by the inference is a bug!
    • an output descriptor without a matching output resource is okay: if it's not needed, avoid extra work

Given this we'd want either one algorithm that takes a flag (is input or output?) or two algorithms. The latter is probably easier. And since they're only used in one place this could be in-line in "execute graph", something like this:

  1. For each name → descriptor of graph.[[inputDescriptors]]:
     1. If inputs[name] does not exist, then return a new promise rejected with a TypeError.
     2. If validating buffer with descriptor given inputs[name] and descriptor returns false, then return a new promise rejected with a TypeError.
  2. For each name → resource of outputs:
     1. If graph.[[outputDescriptors]][name] does not exist, then return a new promise rejected with a TypeError.
     2. If validating buffer with descriptor given resource and graph.[[outputDescriptors]][name] returns false, then return a new promise rejected with a TypeError.

... and then a non-normative note saying something like:

NOTE: Invocations of compute() will fail if any of the graph's inputs are not provided as input resources, or if any requested output resources do not match the graph's outputs.
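A runnable sketch of those two loops (hypothetical helper, not spec text: it returns the TypeError that "execute graph" would use to reject its promise, and elides the "validate buffer with descriptor" step):

```javascript
// Hypothetical sketch of the proposed in-line validation in "execute graph".
// Returns null on success, or the TypeError the real algorithm would wrap
// in a rejected promise.
function validateComputeArgs(graph, inputs, outputs) {
  // Every input descriptor must have a matching resource; extras are ignored.
  for (const name of Object.keys(graph.inputDescriptors)) {
    if (!(name in inputs))
      return new TypeError(`missing input: ${name}`);
    // ("validate buffer with descriptor" step elided)
  }
  // Every requested output must have a matching descriptor.
  for (const name of Object.keys(outputs)) {
    if (!(name in graph.outputDescriptors))
      return new TypeError(`unknown output: ${name}`);
  }
  return null;
}
```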

@huningxin (Contributor):

The latter is probably easier. And since they're only used in one place this could be in-line in "execute graph", something like this:

SGTM!

* inputs must be provided, but extra inputs are ignored
* requested outputs must exist, but not all outputs must be requested
@inexorabletash (Contributor Author):

1e7e2b9 captures the algorithm discussed above. We should probably wait for the WG telecon to decide if this is what we want.

@inexorabletash inexorabletash changed the title Bug fix: Make "validate graph resources" test reflexive Revise graph resource validation Apr 4, 2024
@huningxin (Contributor) left a comment:

LGTM!

@inexorabletash (Contributor Author):

I updated this PR's name/description to match the changes. I think we're probably good to merge now?

@huningxin (Contributor):

@fdwr , do you have any further comments?

@fdwr (Collaborator) commented Apr 16, 2024

... or two algorithms. The latter is probably easier. And since they're only used in one place this could be in-line in "execute graph", something like this ...
I updated this PR's name/description to match the changes. I think we're probably good to merge now?

Yeah, I like it. Thanks for noticing and fixing.

@fdwr (Collaborator) left a comment:

👍

@fdwr (Collaborator) commented Apr 16, 2024

Minor merge conflict from the transferred inputs/outputs CR, so I just resolved it.

@fdwr fdwr merged commit fd8a8d0 into webmachinelearning:main Apr 16, 2024
2 checks passed
github-actions bot added a commit that referenced this pull request Apr 16, 2024
@inexorabletash inexorabletash deleted the bugfix-validate-resources-reflexive branch April 16, 2024 15:56
huningxin added a commit to huningxin/webnn that referenced this pull request Jun 2, 2024
fdwr pushed a commit that referenced this pull request Jun 5, 2024
* Fix unidirectionally broadcast the shapes steps

Fix #622

* Update the call sites of "unidirectionally broadcasting the shapes"

* Reverse the order of `shapeFrom` and `shapeTo`
Successfully merging this pull request may close these issues.

Is "validate graph resources" backwards?
4 participants