Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use brackets for array types #854

Closed
litherum opened this issue Jun 10, 2020 · 27 comments
Closed

Use brackets for array types #854

litherum opened this issue Jun 10, 2020 · 27 comments
Labels
wgsl WebGPU Shading Language Issues
Projects
Milestone

Comments

@litherum
Copy link
Contributor

Right now, WGSL has array<vec4<f32>, 5>. This would be much easier read if it was vec4<f32>[5].

(Bonus points for fvec4[5] if we do #736)

@litherum litherum added the wgsl WebGPU Shading Language Issues label Jun 10, 2020
@dj2 dj2 added this to For Next Meeting in WGSL Jun 10, 2020
@Kangz
Copy link
Contributor

Kangz commented Jun 10, 2020

I have some worries that appending modifiers at the end of type will lead to a similar syntax hell that C declarations have. IMHO the array syntax should surround the element type. Similarly to Swift's Array<T> (it doesn't have static sized arrays), or Rust's [u32; 100].

Alternatively Zig reverses the order, which helps with parsing too: [100]u32.

@RobinMorisset
Copy link
Contributor

I don't think we should be worried about C-like syntax hell, because we don't have prefixes in type names (pointers are not * in WGSL).

I don't have a strong preference among the various proposals.

@dj2
Copy link
Member

dj2 commented Jun 16, 2020

Switching to [] is inconsistent with everything else which has the type first, which to me, makes it less readable.

@dneto0
Copy link
Contributor

dneto0 commented Jun 23, 2020

MSL has established practice with the template notation.

Declare an array of textures as either:
array<typename T, size_t N>
...

@litherum
Copy link
Contributor Author

Yes, Metal has both forms (with some semantic differences between them). However, the bracket form is much more common in MSL. Also, (AFAIK) both HLSL and GLSL only have the bracket form.

@grorg
Copy link
Contributor

grorg commented Jul 28, 2020

Discussed at the 2020-07-28 meeting.

@kdashg
Copy link
Contributor

kdashg commented Jul 28, 2020

A middle ground: T[N] desugars to array<T,N>
There seems to be cautious consensus for this desugaring, and forbidding attributes in the T[N] case.

@dneto0
Copy link
Contributor

dneto0 commented Jul 31, 2020

For visibility:

In the last meeting I expressed concern about ambiguity of having a type declaration structure where attributes are to the left, a base type is in the middle and array brackets are on the right:

    optional-attributes    element-type   [ array-size-count ]

In that case it's ambiguous whether the attributes apply to the element type or to the array type we're trying to express.

Putting the array-brackets on the left disambiguates this:

   optional-array-attributes [ array-size-count ] optional-element-attributes element-type

@litherum rejected this as unfamiliar and hence not meeting their familiarity goal.

@kainino0x
Copy link
Contributor

Arguably since we're not trying to have C's confusing syntax anyway (more like Java I guess), you could go toward Rust and use the [T] (unsized) / [T; 4] (sized) style. It's a little unfamiliar but perfectly clear.

@grorg grorg moved this from For Next Meeting to Under Discussion in WGSL Aug 3, 2020
@dneto0
Copy link
Contributor

dneto0 commented Aug 4, 2020

Arguably since we're not trying to have C's confusing syntax anyway (more like Java I guess), you could go toward Rust and use the [T] (unsized) / [T; 4] (sized) style. It's a little unfamiliar but perfectly clear.

So an unsized array of size-two-array of two-floats would be:

  [[f32;2]]

That looks an awful lot like an attribute, and so it would extend the lookahead you'd need for parsing, both for humans and machines.

@kvark
Copy link
Contributor

kvark commented Aug 4, 2020

It sounds like keeping array as the only way is most sensible for now? All the shortcut proposals appear to have issues.

@kdashg
Copy link
Contributor

kdashg commented May 7, 2021

Should we punt this into the post-mvp future?

@jimblandy
Copy link
Contributor

jimblandy commented Aug 17, 2021

The discussion here seems to assume that T[N] is syntax for the type 'an array of N elements of type T'. But that implies that a multidimensional array type like T[N][M] should be parenthesized as (T[N])[M]: an array of M elements, each of which is an array of N elements of type T. In this syntax, sizes of multidimensional arrays appear from inner to outer as one reads from left to right, because the inner type is written before the size of the array containing it.

This is the transpose of the way C and GLSL interpret T[N][M]: they treat that as an array of N elements, each of which is an array of M elements of type T. The proposal here is also backwards relative to the way arrays are indexed in expressions, where outer indices appear first.

For this proposal to work, the syntax would need to be something like:

unsubscripted-type subscript ...

where:

unsubscripted-type subscript... final-subscript

gets parenthesized as:

( unsubscripted-type final-subscript ) subscript...

In other words, the grammar must grab all the subscripts at once, and apply them to the inner type from right to left.

Note that Rust gets this sort of thing 'wrong': let x: [[i32; 10]; 4] = ...; gives you an array of 4 arrays of length 10. When you write x[i][j], i can range from 0 to 4, and j can range from 0 to 10. So the sizes apply to the indices in reverse order. The way the array syntax surrounds the element type arguably helps make this explicit, but it's still not good.

This is why Go has you write [N]T for 'array of N elements each of type T': [M][N]T parenthesizes naturally as [M]([N]T), so the sizes appear from major to minor, as expected.

(I think it was a great choice for WGSL to avoid C-style type syntax, and not hide the identifier being declared in the midst of syntax specifying its type. But in doing so, we've traded int x[5] for x: int[5], and swapping the positions of the type and the identifier has consequences - the confusion here being one of them.)

@kdashg
Copy link
Contributor

kdashg commented Aug 17, 2021

WGSL meeting minutes 2021-08-17
  • JG: Do we want to talk about this before MVP?
  • MM: Important to talk about, because it’s hard to feature-detect, and also people will definitely assume it works everywhere if it works anywhere.
  • AB: Talking about arrays internally, one thing we’re considering is pulling stride into the type. (removing it as an attribute)
  • MM: Is this a functional change?
  • AB: Benefit is that it’s less ambiguous when types are compatible, or when attribs are required.
  • MM: I see three directions:
      1. [[stride=12]] i32[5]
      1. i32[(stride=12), 5] as long as the () parts are optional (or something like that)
      1. i32[5], and if you need stride, use angle-brackets
  • MM: Should we postpone discussion until Google has its discussions?
  • AB: Not ready to agree to a final syntax today, but not opposed to it either.
  • DM: I don’t think C’s syntax is a good one to follow, and I think other languages end up with because of their C heritage. I agree with DN’s points about why this syntax can be complicated/confusing.
  • MM: Yeah, so like i32[4][5] is confusing because it’s not clear what it is? (yeah)
  • AB: I feel like we’re there already with our accessors, so if you have to access them you’re already decoding it mentally somehow.
  • JG: That would be a (i32[4])[5]
  • AB: But that’s backwards from how the accessor works
  • DM: We have conflicting guidelines: It would be nice to have the access and declaration order be the same, but also we should match the C style(?)
  • JB: Comparisons to C are worrisome, because we’ve already decided to swap the order of the ident and the type for declarations. I think if we’re trying to be recognizable, it might be too late to cleave towards what C does.
  • [...]
  • https://godbolt.org/z/MKKW4nMT7
  • void (*myFunctionPointer)(int[3][4])
  • (We should come back to this in a week after more thought)

@litherum
Copy link
Contributor Author

litherum commented Aug 17, 2021

The primary desire here is to increase familiarity - every popular human-writable shading language uses these square brackets. Solutions like [i32; 4] or [4]i32 do not satisfy the familiarity desire.

There are a few natural situations about how the stride attribute could be applied with this new syntax:

  1. Leave it is an attribute: [[stride=8]] i32[5]. If you want to put stride on the inner type, use parentheses: ([[stride=8]] i32[5])[7]
  2. Find somewhere else to stick it: i32[stride=8, 5] or i32(stride=8)[5] or something like that. Because of the familiarity goals, this only makes sense if it's optional, so authors could write i32[5] to get the default stride.
  3. Include both syntaxes (i32[5] and array<i32, 5>) in the language, and only allow stride to be specified on the array<> syntax. The square bracket syntax always gets the default stride.
  4. (My personal favorite) get rid of the stride concept entirely, and tell authors (and other-language-to-WGSL-compilers) to add additional members / wider types to their arrays to insert their own padding. Neither C++ nor MSL (nor HLSL, I think?) have the concept of an explicit stride. Why are we even supporting this, anyway?

@litherum
Copy link
Contributor Author

litherum commented Aug 17, 2021

In this proposal, i32[5][7] would mean "5 repetitions of 7 ints," as it does in C.

var my2DArray: i32[5][7];
var outerLength = arrayLength(my2DArray); // outerLength is 5
var componentLength = arrayLength(my2DArray[0]); // componentLength is 7

This matches C:

void myFunction(int32_t x[5][7]) {
    printf("%zu\n", sizeof(x[0]));
}
...
void (*myFunctionPointer)(int32_t[5][7]) = &myFunction; // WGSL's "i32[5][7]" is meant to match this "int32_t[5][7]" in C
int32_t myVariable[5][7];
myFunctionPointer(myVariable);

prints 28.

I'm assuming the existing shading languages follow C.

@kainino0x
Copy link
Contributor

Drive-by 2 cents: please do not make this look like C. In C, int x[5][7] means "x[i][j] is of type int", not "x is of type int[5][7]". And this is consistently confusing. (Yes, you can spell the type as int[5][7] in certain contexts, but that is not the common spelling, and it still associates the weird way.)

In WGSL it would be the common (only) spelling, and it would clearly be self-inconsistent: i32 -> i32[5] is an array of 5 i32s -> i32[5][7] is an array of 5 i32[7]s??

Please do the Rust thing, the Go thing, or some other solution that is compatible with the fact that WGSL completely separates type from name (like Rust and Go) - the C syntax is deeply entrenched in the fact that C does not.

@kainino0x
Copy link
Contributor

Also please take into account that Rust and Go didn't invent array syntaxes for the fun of it - they did it because the C syntax is deeply incompatible with the type syntaxes of those languages.

@kvark
Copy link
Contributor

kvark commented Aug 18, 2021

Adding to @kainino0x and @jimblandy

I think it's a great property of a language if any type that is implicitly used is spelled out. Like, I can just take the span of the source file to show where this type is defined:

let foo: array<array<i32, 3>, 4>; // the sub-type is "array<i32, 3>"

If we allow C syntax for array declarations, this property is gone:

let foo: i32[4][3]; // the sub-type is "i32[3]", which is not spelled out anywhere

Another point I'd like to make is that WGSL's array semantics are closer to std::array<T, N> than to T[N]. It's passed by value into functions, and is bounds checked.

@kainino0x
Copy link
Contributor

kainino0x commented Aug 18, 2021

Just a note, I was trying to think of this yesterday but couldn't think of it.
Relatively few languages use C's "outfix" type syntax. One example of a language with strictly separated types is Java (which uses prefix types, unlike WGSL/Rust/Go's postfix types). In Java, there are no statically sized array types, but the array constructor does take size arguments:

int[][] x = new int[2][5];

This is an array of 5 int[2]s, i.e. the opposite of C, and much more intuitive - yet the fact that it is the opposite of C(/GLSL/HLSL?) makes it a less appealing choice here.

@dneto0
Copy link
Contributor

dneto0 commented Aug 24, 2021

I think it's a great property of a language if any type that is implicitly used is spelled out. Like, I can just take the span of the source file to show where this type is defined....

Yes. Same thing for 'expressions'. From the spec:

An expression is a segment of source text parsed as one of the WGSL grammar rules whose name ends with "_expression". An expression E can contain subexpressions which are expressions properly contained in the outer expression E.

This simple mapping to source text has big benefits for understanding, and probably for tooling as well.

@dneto0
Copy link
Contributor

dneto0 commented Aug 24, 2021

One more nuance: We now have type-inference for let and var, so authors don't have to write down types nearly as much as in the original context of this issue.

@kdashg
Copy link
Contributor

kdashg commented Aug 31, 2021

WGSL meeting minutes 2021-08-24
  • (We should come back to this [this week] after more thought)
  • RM: I have a proposal for this one. Seems like the main issue is with multidimensional arrays, and whether this should have C or java semantics. Maybe we should just forbid this for not? E.g. require using array<> syntax if you want multidimensional arrays, but retain syntax sugar for single-dimensional arrays.
  • MM: It’s subtle too, since we’re not forbidding multidim arrays, but rather it forbids writing foo[A][B].
  • RM: And it would just be forbidden in the grammar, and we could have a nice error message if people tried to do this.
  • DM: Could this argument be applied to e.g. pointers, where we might be pressured to use c-style pointer syntax?
  • MM: I think the cost-benefit trade-off is different, so it wouldn’t be compelling there.
  • DM: Well, e.g. Mozilla uses a lot of array<> style containers instead, so the c-style familiarity argument is weaker in practice
  • MM: MSL has int[5] and Array<int, 5>. Int[5] has reference semantics, and Array<int, 5> has value semantics
  • GR: In HLSL, conceptually int[5] is copied in and copied out, but optimizations make it more efficient than this would seem to suggest
  • DM: I feel like we don’t even have a majority among our backend languages.
  • MM: Most shader source code that I’m aware of is written in HLSL or GLSL. That’s what I’m leaning toward.
  • JG: I like the sugaring proposal, especially with the restriction on multidim arrays. There are a bunch of languages people use int[5] in, so it’d a decent desire to allow this sugaring.
  • [...]
  • DM: Where does this leave us for stride on the type vs on the var?
  • MM: I think the sugar is just for the common cases, and you can just spell it out if you need e.g. stride.
  • DM: If HLSL is by value, I’m fine with this being that way in WGSL
  • DN: I think HLSL is like GLSL, where it’s copy-in-copy-out, with all its edge cases. I think I’m somewhat concerned about making choices based on how HLSL behaves. I think cribbing from how other languages behavor “usually” is risky, but I’m not disagreeing with DM’s conclusion.
  • DN: I think we’ll need to discuss this more internally.
  • DN: Stepping way way back, I think there’s a split between being like JS or TypeScript, and that this is sort of related to that?
  • MM: This proposal doesn’t change the reference/value semantics of arrays. This proposal is just changing how to spell the type. No semantics change. I believe we have also forbidden the kind of aliasing that causes your hesitation.
  • Come back to this next week.

@kdashg kdashg moved this from Needs Discussion to Needs Action in WGSL Aug 31, 2021
@kdashg
Copy link
Contributor

kdashg commented Aug 31, 2021

WGSL meeting minutes 2021-08-31
  • RM: Last week suggested a solution at grammar level to forbid multiple dimensional brackets to keep things simple here.
  • JG: I like [] as sugar, just another way to spell array<>, and that this limit nicely constrains the complexity here. Most users should be happy with it.
  • BC: Is there any precedent for writing the same thing in two ways. No precedent yet.
  • MM: also talking about things like vec4<f32> having a short alias name.
  • AB: There are type aliases the user can write.
  • BC: This is unsettling. For a young language, multiple ways to declare the same thing.
  • DM: Closest thing we have is the ‘for’ syntax, another way to write a loop.
  • MM: Related If they are [ ] available, can they be used for a runtime-sized array. Think we should make that work too.
  • AB: That’s how GLSL does it.
  • JG: Would you write array<foo,2>[3]
  • MM: So can you mix the notations. Parallel question.
    • array<i32[3],3>
    • Goal is familiarity, and that’s not familiar in combination. So no opinion offered here.
  • JG: As long as no square
  • DN: Amplifying what BC said, where we’re adding a separate way to say the same thing in order to add familiarity for a slice/lineage of languages. Taking only some parts of C’s array but not all, so that weakens the familiarity argument. Other new languages (Zig, Go) put the element-counting before the element type.
  • DN: Non-orthogonality is concerning. Too early in evolution of the language.
  • KN: WGSL isn’t close enough to C / Java to justify taking this syntax.
  • BC: No accommodation for stride. Will have stride forces you to use the generic syntax. So will have code for both. Then newcomers will have questions.
  • MM: Why did we not think we could add a stride here?
  • AB: We didn’t come up with a way to add it, and it’s hard for multi-dim
  • AB: Also Google is discussing internally maybe making this not an attribute anymore, such as inside the template list
  • MM: I was afraid the approach would be to say “use a bigger type to stride things”
  • DN: GLSL has e.g. std140 to implicitly enforce padding
  • MM: This isn’t a dealbreaker for anyone, sounds like. What would move the needle for anyone? The thing that would make me change my mind would be that squarebracket syntax would end up being so uncommon that no one uses them/knows them. That would make it feel like a bad idea to me.
  • BC: I want one way to declare an array.
  • KN: +1, if we replaced the current syntax entirely, that would be fine too. Not wanting to do that right now, but could be happy either way.
  • MM: Is it worth coming up a wholesale replacement? Would that be a waste?
  • DN: I like it now, so “no progress” is forward progress.
  • AB: If you have a full proposal, maybe having that would be useful, and we could progress it outside the meeting.
  • BC: We also have a merge conflict with the new stride compatiblity stuff, so we should sort that out sooner rather than later. Not waiting on a meeting, waiting on DN.
  • JG: This sugaring proposal might be a great thing for one implementation to prototype and then solicit concrete feedback from devs, and see if they like it, or not, or need it.
  • DM: This would make an exception in parsing. This would be the only place where, if that place expects a type, you have to continue lookahead to see if there’s an array-ing suffix to modify the type you just parsed.
  • MM: Don’t think that’s a strong argument.
  • JB: Lots of places where lookahead by one token has not been a problem.
  • DM: Yeah I guess we have one other place where we do this, so it only makes the compiler slightly more complicated.
  • Resolved: Needs action (prototyping and more feedback)

@Kangz Kangz added this to the V1.0 milestone Feb 2, 2022
@kdashg
Copy link
Contributor

kdashg commented Apr 26, 2022

I would like to move this to post-v1 unless we get unblocked on getting feedback from prototyping here.

@kdashg kdashg modified the milestones: V1.0, Polish post-V1 May 11, 2022
@kdashg
Copy link
Contributor

kdashg commented May 11, 2022

WGSL meeting minutes 2022-05-10
  • KG: Instead of array<f32,4> would be array<f32>[4]
  • AB: Push past v1 at least.
  • KG: No desire to keep in v1
  • MM: Probability can carry forward is non-existant.
  • KG: So, post-v1? Something we might want to do?
  • MM: Yes.

@kdashg kdashg modified the milestones: Polish post-V1, post-V1 May 11, 2022
ben-clayton added a commit to ben-clayton/gpuweb that referenced this issue Sep 6, 2022
`vec4()` is now a valid constructor.
See: gpuweb#2305
@litherum
Copy link
Contributor Author

This ship has sailed. Closing.

WGSL automation moved this from Needs Action to Done May 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wgsl WebGPU Shading Language Issues
Projects
WGSL
Done
Development

No branches or pull requests

10 participants