Simd enablement #2945

Closed
wants to merge 18 commits into from
shawnl
Contributor

@shawnl shawnl commented Jul 25, 2019

Not in scope:

  • new memory model for comptime
  • stack allocate vectors, so that addr-of vectors (&a[3]) can work, like they are supported in gcc-9

Some stuff is blocked on #447.

While this mostly follows #903, some decisions had to be made. The biggest is to support both bitwise and boolean operations on bools, because a() and b() differs from a() & b(): if a() returns false, the first will not run b() but the second will. This is explained in the commit.
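
The difference between the two forms can be sketched in C++, where `&&` plays the role of Zig's `and`. This is only an analogy, not code from the PR; the counters and function names are hypothetical and exist to record which operands were evaluated.

```cpp
#include <cassert>

// Hypothetical counters, for illustration only: they record how often
// each operand function was evaluated.
int a_calls = 0;
int b_calls = 0;

bool a() { ++a_calls; return false; }
bool b() { ++b_calls; return true; }

// Short-circuit form (like `a() and b()`): b() is skipped when a() is false.
bool logical_and() {
    return a() && b();
}

// Bitwise form (like `a() & b()`): both operands are always evaluated.
bool bitwise_and() {
    return (static_cast<unsigned>(a()) & static_cast<unsigned>(b())) != 0u;
}
```

Calling `logical_and()` increments only `a_calls`, while `bitwise_and()` increments both counters, even though the two return the same value.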


TODO

  • needs more tests of zero-length vectors
  • block exporting incompatible ABIs: 256-bit wide on pre-avx, and 512-bit wide on pre-avx512

@shawnl shawnl force-pushed the simd5 branch 3 times, most recently from c32bb99 to 4db2a1c Compare July 26, 2019 03:30
@shawnl
Contributor Author

shawnl commented Jul 26, 2019

Some discussion of horizontal intrinsics in LLVM is here: #903 (comment), which affects #2698

std/math/fma.zig (outdated, resolved)
@data-man
Contributor

var v: u32 = 5;
var x = @splat(4, v);

Can be changed to

var v: u32 = 5;
var x = @Vector(4, v);

?

doc/langref.html.in (outdated, resolved)
src/all_types.hpp (outdated, resolved)
src/codegen.cpp (outdated, resolved)
src/ir.cpp Outdated
@@ -12734,6 +12925,30 @@ static IrInstruction *ir_analyze_cast(IrAnalyze *ira, IrInstruction *source_inst
return ir_analyze_widen_or_shorten(ira, source_instr, value, wanted_type);
}

// widening of vectors
// These are separate (while identical) as I am still not sure if this should not implicitely cast,
// but only explicitely cast. (i.e. with @cast, or @as, or @Vector(4, i32)(foo)),
Collaborator

Does this need deciding on before merge?

Also, typo: "explicitly"

src/ir_print.cpp (outdated, resolved)
test/stage1/behavior/shuffle.zig (outdated, resolved)
@shawnl
Contributor Author

shawnl commented Jul 28, 2019

@data-man I don't like that because the capital letter V means type.

@shawnl shawnl force-pushed the simd5 branch 4 times, most recently from 7b458ce to 520d2a3 Compare August 2, 2019 18:06
@shawnl shawnl force-pushed the simd5 branch 7 times, most recently from 5283ce3 to e5292c0 Compare August 6, 2019 18:50
@shawnl shawnl marked this pull request as ready for review August 6, 2019 22:00
doc/langref.html.in (outdated, resolved)
@andrewrk andrewrk mentioned this pull request Nov 2, 2019
@andrewrk
Member

andrewrk commented Nov 2, 2019

The first item would be Array Accesses on Vectors. If you open a PR which only does this, I think we can get it merged swiftly. It seems everything else in this PR depends on that feature.

See #3575

This PR implements array access of vectors to mean element access, but it's planned for that to be how to dereference a vector of pointers.

I definitely am going to need this PR to be split into distinct smaller mergeable pieces.

@shawnl
Contributor Author

shawnl commented Nov 2, 2019 via email

@shawnl
Contributor Author

shawnl commented Nov 5, 2019

The lowering to @shuffle, @gather, and @scatter of loads and stores got dropped while re-basing this.

If the integer fits in the significand (including the sign bit, because of an edge
case resulting from the difference between twos-complement and ones-complement),
then the cast to float is lossless.
This will allow a lot more integer code to just magically work with vectors of integers,
et cetera.
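
A rough C++ sketch of the lossless-cast condition, assuming IEEE single precision (24 significand bits). The function names are hypothetical and this is not the check the PR implements; it only shows that integers of magnitude up to 2^24 survive a round trip through float, while 2^24 + 1 does not.

```cpp
#include <cstdint>
#include <limits>

// True when converting x to float preserves its value exactly.
// The comparison is done in double, which represents every int32 exactly.
bool round_trips_through_float(std::int32_t x) {
    const float f = static_cast<float>(x);
    return static_cast<double>(f) == static_cast<double>(x);
}

// Sufficient (not necessary) condition: |x| <= 2^digits, where digits is
// the significand width of float (24 for IEEE single precision).
bool fits_in_float_significand(std::int32_t x) {
    const std::int64_t limit =
        std::int64_t{1} << std::numeric_limits<float>::digits;
    const std::int64_t wide = x;
    return -limit <= wide && wide <= limit;
}
```

The second function is deliberately conservative: some larger values, such as even numbers just above 2^24, still round-trip, but they are not covered by the simple width check.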
https://llvm.org/docs/LangRef.html#zext-to-instruction

The ‘zext’ instruction takes a value to cast, and a type to cast it to.
Both types must be of integer types, or vectors of the same number of integers.
**The bit size of the value must be smaller than the bit size of the destination type, ty2.**

The codegen was invalid.
v2: fixup dest_type when value is nullptr
No comptime support yet; that will be enough work that it needs to be its own
patch series.

v2: ir_analyze_masked_vector for use by vector indexing
Why ^, &, |, and ~ instead of !=, and, or, !?

Consider:

a() and b()

If a() returns @Vector(2, bool)([_]bool{false, false}), is b() run?
How about if a() returns @Vector(2, bool)([_]bool{false, true})?

Making this defined would be slow, confusing, involve hidden control flow,
and require putting the any() all() none() branching into the language.

Even if a() and b() return "bool" these are different:

a() and b()
a() & b()
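
To make the vector case concrete, here is a C++ sketch that uses `std::array<bool, 2>` as a hypothetical stand-in for a 2-lane vector of bool. The names `lane_and`, `any_lane`, and `all_lanes` are assumptions for illustration, not part of the PR. The lane-wise operation has no control flow, and any branching decision has to go through an explicit reduction chosen by the caller.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a 2-lane vector of bool.
using Bool2 = std::array<bool, 2>;

// Lane-wise AND: both operands are fully evaluated before this is called,
// so there is nothing to short-circuit.
Bool2 lane_and(const Bool2& x, const Bool2& y) {
    Bool2 out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = x[i] && y[i];
    }
    return out;
}

// Explicit reductions: the caller states which meaning of "true" is wanted.
bool any_lane(const Bool2& v) { return v[0] || v[1]; }
bool all_lanes(const Bool2& v) { return v[0] && v[1]; }
```

With a value like {false, true}, `any_lane` and `all_lanes` disagree, which is why a short-circuiting `and` on vectors would need the language to pick one of them implicitly.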

-----

I would like to throw a good error when a vector of bools is passed, but
the current architecture prevents that.
…unc), with safety checks.

Finishing this depends on ziglang#1757. I'd rather not re-work ir_gen_node_raw for explicit casts
(signed to unsigned, and safe narrowing casts) when that is upcoming.

v2: @truncate can now take a scalar type
v2: do not emit libmvec when it does not exist on the platform
v3: do what is written above
Can't test larger vectors (256-bit and 512-bit) because of confusion
between passing on the stack and passing in registers (which don't always exist).

See ziglang#1481 (comment)

ARM uses sret for vector arguments, and those are runtime. I don't understand
what that assert was for.
@andrewrk
Member

andrewrk commented Nov 27, 2019

This PR won't be merged, but I'm definitely interested in the smaller PRs that bring these commits in 1 at a time. I'll leave it up to @shawnl (or anyone else!) to track this fork and do the work of slowly upstreaming it.

@andrewrk andrewrk closed this Nov 27, 2019
@daurnimator daurnimator added the stage1 The process of building from source via WebAssembly and the C backend. label Dec 19, 2019
4 participants