Is there a dependency on LET? #6

lars-t-hansen · 2022-03-08T07:30:42Z

ISTR that there was an assumption that a restriction of feature detection to the code section would still allow the conditional use of types that may not be present in all engines - eg, v128 would be usable for local variables within suitably conditionalized code for engines that support SIMD, though not in function signatures or on globals. Currently all locals are declared at the function head, not within any block. Yet feature_block is decoded as a block according to the overview, not allowing locals to be declared within it. Is there a hidden dependency on LET in here? LET is on the chopping block over in the function-references proposal.

tlively · 2022-03-08T20:09:58Z

Ah, this is a good point. I wrote elsewhere that we could deal with proposals that introduce new types by making sure the types never appear in the type section, but you're right that we would have to avoid locals of the new type as well, and that may not be feasible in general.

let would solve this problem if it existed, but I would rather we resolve those discussions and kill it entirely.

The actual solution I have in mind is that spec-compliant engines should still parse and validate new SIMD types, even if they choose not to implement any of the corresponding instructions. This isn't entirely satisfying, but I'm afraid that any other solution would unduly increase the complexity of this proposal.

rossberg · 2022-03-09T06:41:17Z

It's not clear to me that this assumption would be workable for anything interesting but SIMD, as other potentially relevant extensions involve less simple type system extensions (just consider GC), or depend on other non-instruction constructs (threads). And if it's gonna be limited to basically just one use case, then isn't the complexity and global cost of this feature hard to justify already?

lars-t-hansen · 2022-03-09T11:32:50Z

I'd like to understand specifically what it means to "parse and validate the SIMD types even without any of the corresponding instructions". Consider a couple of cases:

  (func $f (result v128)
    (call $f))

  (func $g (param v128) (result i32)
    (i32.const 86))

  (func $h
    (local v128))

  (func $j (param v128))

  (func $k
    (call $j (call $f)))

A SIMD-unaware compiler does not know how to do register allocation or value manipulation for v128 values but even so it needs to accept these functions, I think, and generate something - these functions could be the results of resolving feature_blocks. So what does the compiler do - accept all parameters and trap in each body because the signature or locals or intermediate values are of a recognized-but-unsupported type?

What is the meaning of $f "validating"? We can't really check that the call in its body is legal, so do we accept all function bodies when an illegal type is used in the signature or among the locals? What about $k? Here neither signature nor locals indicate the use of v128, but there's an intermediate v128 value. Does validation of the function succeed when the validator sees that, as if it were an unreachable? What happens if this value appears inside the arm of an if?

tlively · 2022-03-09T23:51:03Z

I believe it would work for an engine that doesn't support SIMD instructions to interpret v128 as a trivial empty type, similar to unit in e.g. Rust or OCaml. So functions can have v128 in their parameter or result or local types and validation works the same as in an engine that fully supports SIMD, but then codegen just ignores all v128 parameters, results, and locals. A local.get or a local.set to a v128 local both become no-ops and a branch or return of a v128 has identical codegen to a branch or return without a value. The calling convention for a function with v128 parameters is identical to the calling convention for the same function with its v128 parameters removed.
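A minimal sketch of the lowering described above, assuming value types are modeled as plain strings and that `v128` is the only unsupported type. The names `Signature` and `lower_for_codegen` are hypothetical and do not come from any engine.

```python
from typing import NamedTuple, Tuple

# Assumption: an engine without SIMD treats v128 as a unit-like type.
UNSUPPORTED = frozenset({"v128"})


class Signature(NamedTuple):
    params: Tuple[str, ...]
    results: Tuple[str, ...]


def lower_for_codegen(sig: Signature, unsupported=UNSUPPORTED) -> Signature:
    """Drop unit-like types so they occupy no registers or stack slots.

    Validation would still use the original signature; only the calling
    convention is derived from the lowered one.
    """
    def keep(types):
        return tuple(t for t in types if t not in unsupported)

    return Signature(keep(sig.params), keep(sig.results))


# $g from the earlier examples: (param v128) (result i32)
g = Signature(("v128",), ("i32",))
assert lower_for_codegen(g) == Signature((), ("i32",))
```

Under this model the validator and the code generator see two different signatures for the same function, so any real implementation would have to keep those two views apart.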

tlively · 2022-03-10T00:02:50Z

> It's not clear to me that this assumption would be workable for anything interesting but SIMD, as other potentially relevant extensions involve less simple type system extensions (just consider GC), or depend on other non-instruction constructs (threads).

Yes, I agree, and that is an intentional trade off made to keep this proposal so simple.

> And if it's gonna be limited to basically just one use case, then isn't the complexity and global cost of this feature hard to justify already?

I wouldn't say so. Besides introducing optional features to the spec, this proposal is not very complicated. Feature detection is an extremely frequent feature request from SIMD users (and mostly only from SIMD users, who already do this kind of feature detection for their non-Wasm builds).

On the other hand, if we arrive at a future where we are done adding SIMD-like proposals and every Wasm engine supports every SIMD proposal, then this feature detection would no longer be useful. The reason feature detection is a pain point for our users so far is that they are targeting engines that have not gotten around to implementing SIMD (or are pre-SIMD versions of engines that have since implemented it), not because they are targeting engines that have chosen not to implement SIMD. The far future utility of this proposal does depend on the existence of engines that choose not to implement some subset of SIMD proposals.

tlively · 2022-03-10T18:37:01Z

Another possibility would be to extend this proposal with conditional types. That could look something like this:

conditional_type feature-vec vec(u8) alternative_type

Where the vec(u8) is the type to decode if the engine supports the features in the feature-vec and otherwise the type is decoded as alternative_type.

Rather than depending on engines to be updated to recognize and validate unsupported types but ignore them during codegen, producers would provide known fallback types for locals.

lars-t-hansen · 2022-03-11T07:21:40Z

> Another possibility would be to extend this proposal with conditional types. That could look something like this:

Thanks! I spent some time yesterday trying to figure out the implications of your previous approach with known-but-unsupported types and I was not happy with the outcome; there seemed to be a fair amount of hidden complexity incurred by verifying and compiling concurrently in the presence of that approach. Conditional types would remove much of that complexity.

Conditional types also seem to gel well with our solution for nondefaultable types since, IIRC, locals of these types will always be initialized with an assignment (even though WebAssembly/function-references#44 remains unresolved) and the assignment can be conditionalized.

Going further down this path, it feels somewhat probable that we're going to want to allow (constant) expressions to be conditionalized too, for global initializers. IIRC these currently do not allow blocks and the extended-const proposal does not change that.

tlively · 2022-03-14T20:34:32Z

> I spent some time yesterday trying to figure out the implications of your previous approach with known-but-unsupported types and I was not happy with the outcome; there seemed to be a fair amount of hidden complexity incurred by verifying and compiling concurrently in the presence of that approach. Conditional types would remove much of that complexity.

I'd be curious to hear more about the problems with that approach. I can see that introducing some internal notion of unit to represent unsupported types during codegen could bring a good deal of complexity, but another option would be to treat unsupported types as aliases for e.g. i32. Would that still run into the problems you found?

> Conditional types also seem to gel well with our solution for nondefaultable types since, IIRC, locals of these types will always be initialized with an assignment (even though WebAssembly/function-references#44 remains unresolved) and the assignment can be conditionalized.

The initialization of the conditionally typed local would depend on the specified alternative_type (assuming the features were not supported). So if the alternative_type is not defaultable, it will be subject to whatever rules we end up with for non-defaultable types, but if the alternative_type is e.g. i32, then it will be initialized with 0 just like any other i32 local.

Also note that the alternative_type could in principle be another conditional type, so there could be a chain of types to try depending on the supported features. The entire chain would be resolved to a single known type during decoding.
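As a rough illustration of resolving such a chain at decode time, here is a sketch that models a conditional type as a small record rather than as bytes. The class `ConditionalType`, the function `resolve`, the feature names, and the `v256` type are all hypothetical; the proposal text only gives the shape `conditional_type feature-vec vec(u8) alternative_type`.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class ConditionalType:
    features: frozenset        # features the preferred type requires
    preferred: str             # stands in for the vec(u8) payload
    alternative: "ValType"     # a plain type or another ConditionalType


ValType = Union[str, ConditionalType]


def resolve(t: ValType, supported: frozenset) -> str:
    """Walk the chain until a supported choice or a plain fallback is found."""
    while isinstance(t, ConditionalType):
        if t.features <= supported:
            return t.preferred
        t = t.alternative
    return t


# Hypothetical chain: prefer v256, then v128, then fall back to i32.
chain = ConditionalType(
    frozenset({"wide-simd"}), "v256",
    ConditionalType(frozenset({"simd128"}), "v128", "i32"),
)
assert resolve(chain, frozenset({"simd128"})) == "v128"
assert resolve(chain, frozenset()) == "i32"
```

Because the result is always a single plain type, later stages such as validation and codegen would never see a conditional type.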

> Going further down this path, it feels somewhat probable that we're going to want to allow (constant) expressions to be conditionalized too, for global initializers. IIRC these currently do not allow blocks and the extended-const proposal does not change that.

This sounds plausible, but I'm having trouble coming up with a specific use case. Do you have one in mind? I hope that if this is necessary, then it would be sufficient to allow blocks in initializers with no impact on anything else.

rossberg · 2022-03-15T10:39:58Z

To be honest, this is exactly the kind of slippery slope rabbit hole I fear. It doesn't stop with codes for value types, I bet we will quickly discover that we'll need similar constructs for type definitions in the type section, and for stuff in other sections, or whole section types for that matter. I can't claim that I feel comfortable with the prospect.

tlively · 2022-03-15T18:09:52Z

I think it's reasonable to be wary of the slippery slope here, but I've tried to mitigate that by scoping the goals of the proposal specifically to the needs of SIMD programmers. I can't imagine SIMD users operating on anything other than linear memory or maybe GC i8 arrays at some point in the future. They certainly won't need any new kinds of type definitions or sections; those would imply broader changes to the compilation scheme that would make feature detection impractical anyway.

lars-t-hansen · 2022-03-16T07:44:21Z

> > I spent some time yesterday trying to figure out the implications of your previous approach with known-but-unsupported types and I was not happy with the outcome; there seemed to be a fair amount of hidden complexity incurred by verifying and compiling concurrently in the presence of that approach. Conditional types would remove much of that complexity.

> I'd be curious to hear more about the problems with that approach. I can see that introducing some internal notion of unit to represent unsupported types during codegen could bring a good deal of complexity, but another option would be to treat unsupported types as aliases for e.g. i32. Would that still run into the problems you found?

If I understand your earlier messages correctly, the v128 type would be "known but unsupported" in the sense that if it appears in a signature on a function, it could not be replaced by unit (or anything else) in that signature for the purposes of validation but would have to participate in typechecking. Since validation happens concurrently with compilation and code generation this means that there are now two views on each type, and every part of the pipeline must be sure not to be confused about which view they have. This adds complexity that I would just as soon not have. I agree that substituting eg i32 for it during codegen makes codegen "just work", provided that we've already gotten the views on the type right.

lars-t-hansen · 2022-03-16T08:10:27Z

During yesterday's meeting, @tlively asserted that most functions that need feature detection for SIMD do not take or return v128 values but take pointers to memory. The current discussion we're finding ourselves in with conditional_type and its slippery slope is on the other hand due to the combination of the feature_block => block transition coupled with no let and no block-local variables. Looking back to the GCC/LLVM examples posted on the overview, the prominent idioms are all function-level. Should we instead refocus feature detection on the function level? Could there be a notion of a multiversion function that is a single function with a single signature (obviously containing no unsupported types) whose bodies and local variables are conditionally defined?

In the binary format such a function could be accommodated by using the number of local entries as a flag; FFFFFFFFh could signal a conditionally defined function with multiple bodies following (or in general, that flag bits follow). These bodies could then be selected using the feature flags. Each body would have its own section of locals. As before, the body would be selected at decode time.
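To make the decoding idea concrete, here is a sketch under a made-up layout: after the FFFFFFFFh flag comes a body count, and each body is prefixed by a required-feature bit mask and a byte size. That layout, the bit assignment for SIMD128, and the names `read_u32` and `select_body` are assumptions for illustration only; only the use of the local-entry count as a flag comes from the text above.

```python
MULTI_BODY = 0xFFFFFFFF  # local-entry count used as a flag, per the text above


def read_u32(buf: bytes, pos: int):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def select_body(func: bytes, supported_mask: int) -> bytes:
    """Pick the first body whose required features are all supported.

    `func` holds the bytes that follow the function's size field. The
    per-body (mask, size) prefix is hypothetical; the size lets the
    decoder skip bodies that mention types it does not understand.
    """
    count, pos = read_u32(func, 0)
    if count != MULTI_BODY:
        return func  # ordinary function: locals followed by code
    n_bodies, pos = read_u32(func, pos)
    for _ in range(n_bodies):
        required, pos = read_u32(func, pos)
        size, pos = read_u32(func, pos)
        body = func[pos:pos + size]
        pos += size
        if required & ~supported_mask == 0:
            return body
    raise ValueError("no body matches the supported features")


SIMD128 = 1 << 0                               # hypothetical bit assignment
simd_body = bytes([0x01, 0x01, 0x7B, 0x0B])    # 1 local entry: 1 x v128; end
scalar_body = bytes([0x01, 0x04, 0x7D, 0x0B])  # 1 local entry: 4 x f32; end
func = (bytes([0xFF, 0xFF, 0xFF, 0xFF, 0x0F, 0x02])
        + bytes([0x01, 0x04]) + simd_body       # requires SIMD128
        + bytes([0x00, 0x04]) + scalar_body)    # else_body: requires nothing

assert select_body(func, SIMD128) == simd_body
assert select_body(func, 0) == scalar_body
```

A body with an empty mask plays the role of else_body, so a well-formed multi-body function would always end with one.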

feature.supported and feature_block might still be supported for function-local uses (they would be useful within conditionally selected bodies that could themselves have multiple cases for various sets of supported instructions); we would not need conditional_type. We should expect to see eg

func $f (param i32 i32) (result i32)
  conditional_body SIMD128
    local v128
    local.set 2 (v128.load (local.get 0))
    ...
    if feature.supported RELAXED128
      feature_block RELAXED128
        ...
      end
    else
      ...
    end
  else_body
    local f32 f32 f32 f32
    local.set 2 (f32.load (local.get 0))
    local.set 3 (f32.load offset=4 (local.get 0))
    local.set 4 (f32.load offset=8 (local.get 0))
    local.set 5 (f32.load offset=12 (local.get 0))
    ...
  end
end

(Body selection by a single bit vector may not be quite powerful enough; minimally providing for multiple bit strings to signify logical OR would help. There would be conditional_body, elif_body, else_body, at least at the text level.)
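One way to read the logical-OR suggestion, sketched with hypothetical bit assignments and a hypothetical `FEATURE_X`: each arm carries a list of bit masks and applies when any one of them is fully supported.

```python
SIMD128 = 1 << 0      # hypothetical bit assignments
RELAXED128 = 1 << 1
FEATURE_X = 1 << 2    # placeholder for some unrelated feature


def arm_matches(alternatives, supported: int) -> bool:
    """An arm applies if ANY of its masks is a subset of the supported bits."""
    return any(mask & ~supported == 0 for mask in alternatives)


def choose_arm(arms, supported: int):
    """Return the name of the first matching arm, in source order."""
    for name, alternatives in arms:
        if arm_matches(alternatives, supported):
            return name
    return None


arms = [
    ("conditional_body", [SIMD128 | RELAXED128]),  # needs both features
    ("elif_body", [SIMD128, FEATURE_X]),           # needs either one
    ("else_body", [0]),                            # always applies
]
assert choose_arm(arms, SIMD128) == "elif_body"
assert choose_arm(arms, 0) == "else_body"
```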

(There is a connection here to call tags and its func_switch idea, as well, though of course there it's about run-time selection of the body.)

[Edited mainly to clarify continued utility of feature.supported and feature_block]

tlively · 2022-03-16T17:37:27Z

Function-level selection would be sufficient for the gcc/clang function multiversioning use case I'm interested in. My reason for proposing a block-level primitive instead is that I anticipated that the group would favor an approach that allowed for simple inlining, but I don't feel strongly about that myself. It would certainly be nice to avoid the need for feature_type.

titzer · 2022-03-16T17:53:38Z

Definitely function-level substitution feels more MVP-like than the more complex block-level substitution.

lars-t-hansen · 2022-03-17T06:55:33Z

> My reason for proposing a block-level primitive instead is that I anticipated that the group would favor an approach that allowed for simple inlining

But not inlining that a typical C/C++ compiler would do, possibly? As it is, with function-level attributes for gcc/clang, an "SSE4.2" function can be inlined into another "SSE4.2" function just fine, but inlining it into a non-attributed function would require a run-time test before the inlined code, probably reducing the benefits of the inlining.
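The inlining condition described here can be modeled as a subset check. This is a simplified sketch that assumes the compiler only requires the callee's target features to be a subset of the caller's; the function name is hypothetical and real compilers may apply further rules.

```python
def can_inline_unguarded(caller_features: frozenset,
                         callee_features: frozenset) -> bool:
    """True if the callee can be inlined without a run-time feature test."""
    return callee_features <= caller_features


sse42 = frozenset({"sse4.2"})
assert can_inline_unguarded(sse42, sse42)            # attributed into attributed
assert not can_inline_unguarded(frozenset(), sse42)  # would need a guard
```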

rossberg · 2022-03-17T07:59:10Z

Function-level granularity sounds very plausible to me, given that it matches what Clang is doing IIUC. But if function granularity is what we are targeting, then wouldn't conditional sections (or rather, alternative sections with strict sizes as mentioned during the meeting) be the generic and more scalable version of that?

Especially since I remain unconvinced that conditionals will always be confined to the code section. In particular, differences in function signatures already affect the type section, or at least the function section. Not to mention the case where SIMD values vs alternative representations need to be stored in GC types, which I don't think is particularly hypothetical. I can also imagine the desire to define different data segments, e.g., to adapt to representations of constants.

tlively · 2022-03-18T01:01:38Z

> > My reason for proposing a block-level primitive instead is that I anticipated that the group would favor an approach that allowed for simple inlining

> But not inlining that a typical C/C++ compiler would do, possibly? As it is, with function-level attributes for gcc/clang, an "SSE4.2" function can be inlined into another "SSE4.2" function just fine, but inlining it into a non-attributed function would require a run-time test before the inlined code, probably reducing the benefits of the inlining.

That's right, LLVM inlining would behave as you describe. In principle Binaryen would be able to inline functions containing feature_block, but such functions would almost always contain loops, which cause Binaryen to decide not to inline them, so in practice I don't think it would make a difference.

> Especially since I remain unconvinced that conditionals will always be confined to the code section... I can also imagine the desire to define different data segments, e.g., to adapt to representations of constants.

I would really like to try to ground this discussion in the developer experience we concretely want to support because our users have been specifically asking for it for the past year. If there are other concrete use cases, we should consider them as well. But it is very difficult to make progress when we try to design for an unbounded space of hypothetical future use cases instead.

> Function-level granularity sounds very plausible to me, given that it matches what Clang is doing IIUC. But if function granularity is what we are targeting, then wouldn't conditional sections (or rather, alternative sections with strict sizes as mentioned during the meeting) be the generic and more scalable version of that?

Switch-style, fixed-length conditional sections with real feature detection rather than pseudo-features would work well, I think.

lars-t-hansen mentioned this issue Mar 11, 2022

Should const expressions declare their length in the binary format? WebAssembly/design#1443

Open

rossberg mentioned this issue Mar 17, 2022

Alternative sections #10

Open
