Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Text format revisions #884

Closed
tlively opened this Issue Oct 2, 2018 · 23 comments

Comments

Projects
None yet
7 participants
@tlively
Copy link

tlively commented Oct 2, 2018

At the Oct. 2 CG meeting, there was unanimous consent to consider making changes to the text format to improve the consistency of instruction names and remove special characters. Specifically, interest has been expressed in removing slashes from all type conversion (wrap, trunc, extend, convert, demote, promote, and reinterpret) instructions and unifying the formats of memory and global instructions.

To minimize ecosystem disruption, we will plan to make all changes we agree on at once and update as many tools as possible at roughly the same time. We would like to resolve this issue quickly to avoid holding up the standardization process.

What final changes do we want to make?

@tlively

This comment has been minimized.

Copy link
Author

tlively commented Oct 2, 2018

I propose replacing slashes with underscores. For example, i32.wrap/i64 would become i32.wrap_i64.

Related discussion for nontrapping float to int conversions here: WebAssembly/nontrapping-float-to-int-conversions#4.

@rossberg

This comment has been minimized.

Copy link
Member

rossberg commented Oct 2, 2018

I'm fine with underscore for conversions. If we do that, I would propose switching around the _u/_s suffixes, so that they consistently go last in all mnemonics. That is,

  • i32.trunc_f32_u, i32.trunc_f32_s and so on

This makes it easier to consistently recognise e.g. all i32_xxx_u operating specifically on unsigned i32's.

The other changes I would propose, for consistency with memory and upcoming table instructions (see bulk operations proposal):

  • {get,set}_global -> global.{get,set}
  • {get,set,tee}_local -> local.{get,set,tee}

(For the record, one could also consider call -> func.call, but I won't be proposing that.)

@binji

This comment has been minimized.

Copy link
Member

binji commented Oct 2, 2018

OK, so then the full list would be like this, right?

;; table declaration
anyfunc -> funcref

;; core instructions
get_local -> local.get
set_local -> local.set
tee_local -> local.tee
get_global -> global.get
set_global -> global.set
i32.wrap/i64 -> i32.wrap_i64
i32.trunc_s/f32 -> i32.trunc_f32_s
i32.trunc_u/f32 -> i32.trunc_f32_u
i32.trunc_s/f64 -> i32.trunc_f64_s
i32.trunc_u/f64 -> i32.trunc_f64_u
i64.extend_s/i32 -> i64.extend_i32_s
i64.extend_u/i32 -> i64.extend_i32_u
i64.trunc_s/f32 -> i64.trunc_f32_s
i64.trunc_u/f32 -> i64.trunc_f32_u
i64.trunc_s/f64 -> i64.trunc_f64_s
i64.trunc_u/f64 -> i64.trunc_f64_u
f32.convert_s/i32 -> f32.convert_i32_s
f32.convert_u/i32 -> f32.convert_i32_u
f32.convert_s/i64 -> f32.convert_i64_s
f32.convert_u/i64 -> f32.convert_i64_u
f32.demote/f64 -> f32.demote_f64
f64.convert_s/i32 -> f64.convert_i32_s
f64.convert_u/i32 -> f64.convert_i32_u
f64.convert_s/i64 -> f64.convert_i64_s
f64.convert_u/i64 -> f64.convert_i64_u
f64.promote/f32 -> f64.promote_f32
i32.reinterpret/f32 -> i32.reinterpret_f32
i64.reinterpret/f64 -> i64.reinterpret_f64
f32.reinterpret/i32 -> f32.reinterpret_i32
f64.reinterpret/i64 -> f64.reinterpret_i64

;; saturating-float-to-int instructions
i32.trunc_s:sat/f32 -> i32.trunc_sat_f32_s
i32.trunc_u:sat/f32 -> i32.trunc_sat_f32_u
i32.trunc_s:sat/f64 -> i32.trunc_sat_f64_s
i32.trunc_u:sat/f64 -> i32.trunc_sat_f64_u
i64.trunc_s:sat/f32 -> i64.trunc_sat_f32_s
i64.trunc_u:sat/f32 -> i64.trunc_sat_f32_u
i64.trunc_s:sat/f64 -> i64.trunc_sat_f64_s
i64.trunc_u:sat/f64 -> i64.trunc_sat_f64_u

;; simd instructions
f32x4.convert_s/i32x4 -> f32x4.convert_i32x4_s
f32x4.convert_u/i32x4 -> f32x4.convert_i32x4_u
f64x2.convert_s/i64x2 -> f64x2.convert_i64x2_s
f64x2.convert_u/i64x2 -> f64x2.convert_i64x2_u
i32x4.trunc_s/f32x4:sat -> i32x4.trunc_sat_f32x4_s
i32x4.trunc_u/f32x4:sat -> i32x4.trunc_sat_f32x4_u
i64x2.trunc_s/f64x2:sat -> i64x2.trunc_sat_f64x2_s
i64x2.trunc_u/f64x2:sat -> i64x2.trunc_sat_f64x2_u

;; atomic instructions
i32.atomic.rmw8_u.add -> i32.atomic.rmw8.add_u
i32.atomic.rmw16_u.add -> i32.atomic.rmw16.add_u
i64.atomic.rmw8_u.add -> i64.atomic.rmw8.add_u
i64.atomic.rmw16_u.add -> i64.atomic.rmw16.add_u
i64.atomic.rmw32_u.add -> i64.atomic.rmw32.add_u
i32.atomic.rmw8_u.sub -> i32.atomic.rmw8.sub_u
i32.atomic.rmw16_u.sub -> i32.atomic.rmw16.sub_u
i64.atomic.rmw8_u.sub -> i64.atomic.rmw8.sub_u
i64.atomic.rmw16_u.sub -> i64.atomic.rmw16.sub_u
i64.atomic.rmw32_u.sub -> i64.atomic.rmw32.sub_u
i32.atomic.rmw8_u.and -> i32.atomic.rmw8.and_u
i32.atomic.rmw16_u.and -> i32.atomic.rmw16.and_u
i64.atomic.rmw8_u.and -> i64.atomic.rmw8.and_u
i64.atomic.rmw16_u.and -> i64.atomic.rmw16.and_u
i64.atomic.rmw32_u.and -> i64.atomic.rmw32.and_u
i32.atomic.rmw8_u.or -> i32.atomic.rmw8.or_u
i32.atomic.rmw16_u.or -> i32.atomic.rmw16.or_u
i64.atomic.rmw8_u.or -> i64.atomic.rmw8.or_u
i64.atomic.rmw16_u.or -> i64.atomic.rmw16.or_u
i64.atomic.rmw32_u.or -> i64.atomic.rmw32.or_u
i32.atomic.rmw8_u.xor -> i32.atomic.rmw8.xor_u
i32.atomic.rmw16_u.xor -> i32.atomic.rmw16.xor_u
i64.atomic.rmw8_u.xor -> i64.atomic.rmw8.xor_u
i64.atomic.rmw16_u.xor -> i64.atomic.rmw16.xor_u
i64.atomic.rmw32_u.xor -> i64.atomic.rmw32.xor_u
i32.atomic.rmw8_u.xchg -> i32.atomic.rmw8.xchg_u
i32.atomic.rmw16_u.xchg -> i32.atomic.rmw16.xchg_u
i64.atomic.rmw8_u.xchg -> i64.atomic.rmw8.xchg_u
i64.atomic.rmw16_u.xchg -> i64.atomic.rmw16.xchg_u
i64.atomic.rmw32_u.xchg -> i64.atomic.rmw32.xchg_u
i32.atomic.rmw8_u.cmpxchg -> i32.atomic.rmw8.cmpxchg_u
i32.atomic.rmw16_u.cmpxchg -> i32.atomic.rmw16.cmpxchg_u
i64.atomic.rmw8_u.cmpxchg -> i64.atomic.rmw8.cmpxchg_u
i64.atomic.rmw16_u.cmpxchg -> i64.atomic.rmw16.cmpxchg_u
i64.atomic.rmw32_u.cmpxchg -> i64.atomic.rmw32.cmpxchg_u

Note: Edited to reflect @rossberg's comment below.

@rossberg

This comment has been minimized.

Copy link
Member

rossberg commented Oct 2, 2018

It would be more consistent to have the _u/_s go after the _sat, for the same reasons mentioned above.

In fact, I suggest establishing the following consistent naming scheme for numeric instruction mnemonics:

  • t.xxx for all numeric instructions agnostic to signedness
  • t.xxx_u/t.xxx_s for all numeric instructions sensitive to signedness

where xxx describes the specific operation (e.g., add). (That already holds for everything but conversions.)

Conversions are then instantiations of this scheme with xxx = yyy_t, yielding:

  • t.yyy_t for conversions agnostic to signedness
  • t.yyy_t_u/t.yyy_t_s for conversions sensitive to sign

where yyy describes the kind of conversion (e.g., trunc or trunc_sat).

@binji

This comment has been minimized.

Copy link
Member

binji commented Oct 3, 2018

OK, I've updated the comment above to match this.

@lars-t-hansen

This comment has been minimized.

Copy link

lars-t-hansen commented Oct 3, 2018

Since we're just chatting anyway: It irks me that f64.lt (say) does not fit the general pattern; it should really be i32.lt_f64.

@lars-t-hansen

This comment has been minimized.

Copy link

lars-t-hansen commented Oct 3, 2018

Oh, and atomics:

i32.atomic.load8_u fits the proposed pattern.
i32.atomic.rmw8_u.add more or less does not.

@rossberg

This comment has been minimized.

Copy link
Member

rossberg commented Oct 3, 2018

@lars-t-hansen, I suppose you cannot interpret the type before the dot strictly as the result type of an operator. It's the type that the operator "is defined at" in some looser sense.

@lars-t-hansen

This comment has been minimized.

Copy link

lars-t-hansen commented Oct 3, 2018

I'm not sure why I wouldn't want to interpret the type strictly as the result type of the operator, though, since that pattern is used very broadly, just not in some cases where it's suddenly about the arguments and the result type is implied (in all cases I can think of the implied result type is i32). Of course there's a succinctness to the instructions that fall under that exception that would be lost if we applied the general pattern, but I don't know why that is an argument either.

(I'm not just trolling, I actually think the current syntax for comparisons is confusing and counterintuitive. Even lt_f64 -- without any type-designating prefix -- would have been better than the current f64.lt.)

@binji

This comment has been minimized.

Copy link
Member

binji commented Oct 3, 2018

I dunno, i32.lt_f64 looks super weird to me. If we think of the type before the . as a result type, many instructions don't make sense, such as i32.store or even memory.size. Personally I don't think it's that valuable to encode the types in the instruction name anyway, we just need a way to differentiate them.

@tlively

This comment has been minimized.

Copy link
Author

tlively commented Oct 4, 2018

I think of the type before the dot as signifying what type the generic instruction is specialized on. Since all lts return i32, the type before the dot tells me what type the operands are instead, since those still need to be disambiguated.

@alexcrichton

This comment has been minimized.

Copy link

alexcrichton commented Oct 4, 2018

FWIW some possible breakage I think that may want to be considered when doing I believe is:

  • Libraries which want to be idiomatic with the current wasm spec will need breaking changes to update their type names to reflect these name changes. For example a parser library in Rust, parity-wasm, would need a breaking change to rename Instruction::GetLocal to Instruction::LocalGet.
  • We've got tests in Rust which verify that the correct instruction is generated for a particular function (namely for intrinsics like SIMD and such). These instruction names come from disassembly the wasm binary itself with wasm2wat and the instructions to assert for are embedded in the source code. If wasm2wat changes its output, these tests will start to break.
  • A final one is one we've talked about in the Rust project historically is the "blog post" problem. This is where (for example) there's blog posts, tutorials, reference materials, etc on the internet today which may reference wasm instructions by the old names. If the standard changes a number of instructions (especially common ones like get_local) then all this documentation becomes out of date and needs and update.

To be clear I'm not opposed to changing the instructions to be more consistent, but I wanted to be sure to point out some perhaps more subtle locations that "breakage" can arise in the non-traditional sense. They're all pretty minor and we're certainly more than willing to do the updates on the Rust side of things, but points that may want to be considered!

Also @binji your comment with an exhaustive list of renames is super helpful!

@binji

This comment has been minimized.

Copy link
Member

binji commented Oct 4, 2018

Right, we've already experienced some of the same issues when changing grow_memory to memory.grow and current_memory -> memory.size, and these instructions are much more common.

@lars-t-hansen is right about atomics too, these are odd now:

i32.atomic.rmw.add 
i64.atomic.rmw.add 
i32.atomic.rmw8_u.add
i32.atomic.rmw16_u.add
i64.atomic.rmw8_u.add
i64.atomic.rmw16_u.add
i64.atomic.rmw32_u.add
...
@rossberg

This comment has been minimized.

Copy link
Member

rossberg commented Oct 9, 2018

@binji, good point. I propose moving the _u/_s to the end for those as well (which arguably fits existing instructions).

@binji

This comment has been minimized.

Copy link
Member

binji commented Oct 9, 2018

OK, updated my comment above to include these as well.

@rossberg

This comment has been minimized.

Copy link
Member

rossberg commented Oct 15, 2018

Oh, that reminds me of one more thing that I'd like to suggest. In the light of the reference type and GC proposals it would be more consistent if we renamed anyfunc to funcref, so that it lines up with the likes of anyref, eqref, optref, and possibly other subtypes of ref.

@binji

This comment has been minimized.

Copy link
Member

binji commented Oct 16, 2018

OK, added that to the list.

I think as a good first step, we should file bugs in the various trackers where the name change is relevant and link to this issue.

@aardappel

This comment has been minimized.

Copy link

aardappel commented Oct 18, 2018

Apologies for the bike-shedding, but I don't see a whole lot of consistency in the use of . and _.

So far the most frequent use of . has been to separate the type operated upon (input types, not return type) before the . from the rest of the instruction. Which to me means operations like get_local which are not type specific shouldn't have a . in them. And I'm not sure why there are multiple . in atomic ops but not in others.

I'd suggest that either . has a very intuitive meaning like above, or maybe we should do away with . as well, and make _ the only separator.

In fact, while I am ranting, why does i32.add have a type and get_local does not? Both participate in completely static dataflow determined by the stack and locals, and both will have completely static types that are verified when bytecode is loaded. If get_local loads the wrong type, that is an error, but that is not obvious from the instruction. If i32.add receives a wrong operand that is also an error, but here it is more obvious. If instead the instruction was add it wouldn't really change anything, beyond knowing which of the 2 operands is wrong.

@tlively

This comment has been minimized.

Copy link
Author

tlively commented Dec 15, 2018

@binji it looks like we have generally settled on the list above. Is there any action we need to take to officially finalize it?

@binji

This comment has been minimized.

Copy link
Member

binji commented Dec 15, 2018

I don't think so, we agreed to the change in the CG meeting, and the change to the spec has landed. I think at this point we should update the tools to use the new names.

tlively added a commit to tlively/binaryen that referenced this issue Jan 7, 2019

Massive renaming
Automated renaming according to
WebAssembly/spec#884 (comment).

tlively added a commit to tlively/binaryen that referenced this issue Jan 7, 2019

Massive renaming
Automated renaming according to
WebAssembly/spec#884 (comment).

tlively added a commit to WebAssembly/binaryen that referenced this issue Jan 7, 2019

Massive renaming (#1855)
Automated renaming according to
WebAssembly/spec#884 (comment).

llvm-git-migration pushed a commit to llvm-git-prototype/llvm that referenced this issue Jan 8, 2019

[WebAssembly] Massive instruction renaming
Summary:
An automated renaming of all the instructions listed at
WebAssembly/spec#884 (comment)
as well as some similarly-named identifiers.

Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, eraman, sunfish, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D56338

llvm-svn: 350609

chapuni pushed a commit to llvm-project/llvm-project-20170507 that referenced this issue Jan 8, 2019

[WebAssembly] Massive instruction renaming
Summary:
An automated renaming of all the instructions listed at
WebAssembly/spec#884 (comment)
as well as some similarly-named identifiers.

Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, eraman, sunfish, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D56338

chapuni pushed a commit to llvm-project/llvm-project-submodule that referenced this issue Jan 8, 2019

[WebAssembly] Massive instruction renaming
Summary:
An automated renaming of all the instructions listed at
WebAssembly/spec#884 (comment)
as well as some similarly-named identifiers.

Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, eraman, sunfish, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D56338

earl pushed a commit to earl/llvm-mirror that referenced this issue Jan 8, 2019

[WebAssembly] Massive instruction renaming
Summary:
An automated renaming of all the instructions listed at
WebAssembly/spec#884 (comment)
as well as some similarly-named identifiers.

Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, eraman, sunfish, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D56338

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350609 91177308-0d34-0410-b5e6-96231b3b80d8

chicoxyzzy added a commit to chicoxyzzy/spec that referenced this issue Jan 10, 2019

bgins added a commit to bgins/tiger-wat that referenced this issue Jan 19, 2019

dylanmckay pushed a commit to dylanmckay/llvm that referenced this issue Jan 20, 2019

tlively
[WebAssembly] Massive instruction renaming
Summary:
An automated renaming of all the instructions listed at
WebAssembly/spec#884 (comment)
as well as some similarly-named identifiers.

Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, eraman, sunfish, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D56338

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350609 91177308-0d34-0410-b5e6-96231b3b80d8
@binji

This comment has been minimized.

Copy link
Member

binji commented Jan 24, 2019

These name changes have landed in the spec repo, in wabt, in LLVM, and binaryen, and likely other places too. Let's close this issue.

@binji binji closed this Jan 24, 2019

@hjorthjort

This comment has been minimized.

Copy link

hjorthjort commented Jan 30, 2019

I found what seems to be a trailing error.

In section Structure/Instructions/Numeric Instructions (2.4.1), in the bullet list:

Conversions: consume a value of one type and produce a result of another (the source type of the conversion is the one after the “/”).

I see the same thing in both web versions, and on page 11 in the PDF (generated Jan 27, 2019).

rossberg added a commit that referenced this issue Jan 30, 2019

@rossberg

This comment has been minimized.

Copy link
Member

rossberg commented Jan 30, 2019

@hjorthjort, good catch, see #957.

rossberg added a commit that referenced this issue Jan 30, 2019

binji added a commit to WebAssembly/threads that referenced this issue Feb 13, 2019

binji added a commit to WebAssembly/threads that referenced this issue Feb 13, 2019

binji added a commit to WebAssembly/threads that referenced this issue Feb 13, 2019

binji added a commit to WebAssembly/threads that referenced this issue Feb 13, 2019

aheejin added a commit to aheejin/threads that referenced this issue Feb 18, 2019

Atomic instruction renaming
This renames instructions according to WebAssembly/spec#884.

aheejin added a commit to aheejin/threads that referenced this issue Feb 18, 2019

Atomic instruction renaming
This renames instructions according to
WebAssembly/spec#884 (comment).

binji added a commit to WebAssembly/threads that referenced this issue Feb 18, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.