[C] Handle BinOp for SIMDArray #2836

Thirumalai-Shaktivel · 2023-11-10T12:23:04Z

Towards #2293

certik · 2023-11-10T18:51:04Z

Is this ready for review?

certik

I think it's fine. I don't know if this is ready for review.

Thirumalai-Shaktivel · 2023-11-11T06:24:44Z

Yup, this is ready!
The remaining issue in matmul_01 is x = a + b * c(1:4)

Thirumalai-Shaktivel · 2023-11-11T11:53:27Z

From Zulip:
For a = c(1:2), we actually want to also use the vector extensions in C, not a loop.

I'm marking this as a draft for now, while I implement SIMD assignment using vector extensions.

Thirumalai-Shaktivel · 2023-11-14T04:57:55Z

I did some research and came across the following:
Scalar initialization or broadcast

[...]
!LF$ attributes simd :: A
real :: A(4)
A = 23.
! or
A = i ! value: `23.`
[...]

We can do:

// C code
[...]
A = (float __attribute__ (( vector_size(sizeof(float) * 4) ))){23., 23., 23., 23.};
// or
A = (float __attribute__ (( vector_size(sizeof(float) * 4) ))){i, i, i, i};
// or
A = i - (float __attribute__ (( vector_size(sizeof(float) * 4) ))){ };
// See for more details: https://stackoverflow.com/a/43801280/15913193
[...]

I thought of doing the third one.
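
To make the broadcast options above concrete, here is a minimal sketch, assuming GCC (other compilers may not accept a scalar operand in a vector expression). The names v4f, broadcast_literal, broadcast_sub_zero and v4f_store are hypothetical and are not taken from the LFortran backend.

```c
#include <string.h>

/* Four floats as one vector, using the vector_size extension. */
typedef float v4f __attribute__((vector_size(sizeof(float) * 4)));

/* Broadcast by writing the scalar into every lane of a compound literal. */
static v4f broadcast_literal(float x) {
    return (v4f){x, x, x, x};
}

/* Broadcast by mixing the scalar with a zero vector: the scalar operand
   is converted to the vector type. Subtracting zero (rather than adding)
   keeps the sign of -0.0 intact. */
static v4f broadcast_sub_zero(float x) {
    return x - (v4f){0};
}

/* Copy the lanes out with memcpy, so no lane is indexed directly. */
static void v4f_store(float out[4], v4f v) {
    memcpy(out, &v, sizeof(v));
}
```

Both helpers should yield the same four lanes for ordinary values; the second form has the advantage that the generated text does not grow with the number of lanes.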

Array initialisation:

[...]
!LF$ attributes simd :: A
real :: A(4), C(8)
C = 3.
A = C(:4)
[...]

We can do

// C code
[...]
float a __attribute__ (( vector_size(sizeof(float) * 4) ));
struct r32 c_value;
struct r32* c = &c_value;
float c_data[8];
c->data = c_data;
c->n_dims = 1;
c->dims[0].lower_bound = 1;
c->dims[0].length = 8;
memcpy(&a, c->data, sizeof(float) * 4);
[...]

@certik do you know any other builtin functions that we can use here?

Thirumalai-Shaktivel · 2023-11-14T05:09:16Z

Also, should we predefine the types as the following in lfortran_intrinsics.h?

typedef float   v8float  __attribute__ ((vector_size (32)));   /* float[8],  AVX  */
typedef double  v4double  __attribute__ ((vector_size (32)));  /* double[4], AVX  */
typedef float   v4float  __attribute__ ((vector_size (16)));   /* float[4],  SSE  */

See: https://www.linuxquestions.org/questions/programming-9/how-do-you-use-sse-2-3-in-c-c-code-884780-print/

certik · 2023-11-14T06:17:16Z

do you know any other builtin functions that we can use here?

Do you mean in our C++ code? Not sure.

I would not predefine v4float, since the lengths will differ on each architecture.
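
One way to avoid fixed typedefs such as v4float is to spell the vector type from the element type and lane count at the point of use, which is what the inline cast in the generated code already does. A minimal sketch, assuming GCC or Clang; the macro SIMD_VEC and the typedef names are hypothetical and not part of lfortran_intrinsics.h.

```c
/* Hypothetical helper: build a vector type from an element type T and a
   lane count N instead of predefining one typedef per instruction set. */
#define SIMD_VEC(T, N) T __attribute__((vector_size(sizeof(T) * (N))))

/* The lane count is chosen per declaration, so a shared header does not
   have to commit to SSE or AVX widths. */
typedef SIMD_VEC(float, 4)  vec_f32x4;
typedef SIMD_VEC(float, 8)  vec_f32x8;
typedef SIMD_VEC(double, 4) vec_f64x4;
```

The compiler then decides how a given width maps onto the target's registers, which is the concern raised about per-architecture lengths.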

czgdp1807 · 2023-11-14T06:55:25Z

tests/reference/c-pragma2-a14de52.stdout

+    for (__1_t=0; __1_t<=7; __1_t++) {
+        a[__1_t] = (float)(1);
+    }


Is this the intention? I am confused, the older version feels correct. This loop is just normal C code, correct?

Same here. I think we should ideally avoid accessing individual vector elements and leave it up to the processor to handle vector operations.

Yeah, I thought the same. I'm working on it and will push the changes soon.
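
The difference between the two forms discussed in this thread can be sketched as follows, assuming GCC or Clang vector extensions. The type v8f and both function names are hypothetical; the loop mirrors the shape of the reference output quoted above, not the actual backend code.

```c
/* Eight floats as one vector. */
typedef float v8f __attribute__((vector_size(sizeof(float) * 8)));

/* Element-wise form: each lane is written through an index, like the
   generated loop questioned in the review. */
static void fill_with_loop(v8f *a, float x) {
    for (int i = 0; i <= 7; i++) {
        (*a)[i] = x;
    }
}

/* Whole-vector form: a single assignment with no per-lane indexing,
   which leaves the choice of instructions to the compiler. */
static void fill_whole(v8f *a, float x) {
    *a = (v8f){x, x, x, x, x, x, x, x};
}
```

Both functions take a pointer rather than returning the vector, which sidesteps ABI warnings for 32-byte vectors on targets built without AVX.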

Thirumalai-Shaktivel · 2023-11-14T14:38:53Z

Current design:
Fortran example

!LF$ attributes simd :: A
real :: A(4), C(8)
real :: i = 12.
A = i

C Backend

a = (float __attribute__ (( vector_size(sizeof(float) * 4) ))) {i, i, i, i};

Fortran example

A = 1.2

C Backend

a = (float __attribute__ (( vector_size(sizeof(float) * 4) ))) {  1.19999999999999996e+00,   1.19999999999999996e+00,   1.19999999999999996e+00,   1.19999999999999996e+00};

Fortran example

C = 42
A = C

C Backend

memcpy(&a, c->data, sizeof(float) * 4);

Fortran example

A = C(2:)

C Backend

memcpy(&a, c->data + (2 - c->dims[0].lower_bound), sizeof(float) * 4);
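
The two memcpy forms above can be tried in isolation with the sketch below. It assumes GCC or Clang, and the descriptor layout is a hypothetical stand-in built only from the fields that appear in the snippets (data, n_dims, dims[].lower_bound, dims[].length); the real struct r32 emitted by the backend may differ. The helper names are also made up for illustration.

```c
#include <stdint.h>
#include <string.h>

typedef float v4f __attribute__((vector_size(sizeof(float) * 4)));

/* Hypothetical stand-in for the generated array descriptor. */
struct dimension_descriptor {
    int32_t lower_bound, length;
};
struct r32 {
    float *data;
    struct dimension_descriptor dims[32];
    int32_t n_dims;
};

/* A = C : copy the first four elements of C into the vector. */
static v4f load_whole(const struct r32 *c) {
    v4f a;
    memcpy(&a, c->data, sizeof(float) * 4);
    return a;
}

/* A = C(start:) : skip (start - lower_bound) elements, then copy four. */
static v4f load_section(const struct r32 *c, int32_t start) {
    v4f a;
    memcpy(&a, c->data + (start - c->dims[0].lower_bound), sizeof(float) * 4);
    return a;
}
```

Neither helper checks that four elements remain after the offset; in this sketch that is left to the caller.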

Thirumalai-Shaktivel · 2023-11-14T14:45:30Z

Ready for review!

certik

I think that this is fine. However, I think only the simd version is being tested, not the non-simd version, correct? I think we need to update our cmake tester to test both versions.

Shaikh-Ubaid · 2023-11-14T15:30:18Z

src/libasr/codegen/asr_to_c.cpp

        }
+        array_const_str += "}";
+        src = "(" + cast + ") " + array_const_str;


Why do we cast here? This seems hard-coded, whereas the previous version, where we insert the cast while visiting visit_ArrayPhysicalCast(), is more general, I think.

I think ArrayBroadcast should only be used for SIMDArray, otherwise throw error.
And for SIMDArray we always need a cast, so I moved it here.

I don't see a case where this operation has to be performed in visit_ArrayPhysicalCast. If one comes up, we will use it there as well.

I think ArrayBroadcast should only be used for SIMDArray

I think it can be used for regular arrays as well. At the moment, we only support array broadcast for SIMD arrays at the C backend level. I would keep the implementation generalised as before instead of hardcoding.

And for SIMDArray we always need a cast, so I moved it here.

I think we can't say that a cast would always be needed. Consider if we are broadcasting an SIMD array. Then I think a cast is not needed here.

I think the backend should just be completely "dumb" and just follow what the ASR says. If the ASR has a cast node, then the backend adds the cast operation. If there is no cast node in the ASR, then there should be no cast generated by the backend.

I think it can be used for regular arrays as well.

Nope, the ArrayBroadcast design is that we handle all the ArrayBroadcast in the array_op itself except for the SIMDArray case, i.e., we shouldn't be visiting ArrayBroadcast in the backends except for SIMDArray

Regarding the Cast, I had some problem with adding the cast in visit_ArrayPhysicalCast so I moved them here, I will look into it and report back.

Consider if we are broadcasting an SIMD array. Then I think a cast is not needed here.

I think SIMDArray would have an assignment here:

(= (Var 4 a) (Var 4 b) () )

we shouldn't be visiting ArrayBroadcast in the backends except for SIMDArray

Nope, the ArrayBroadcast design is that we handle all the ArrayBroadcast in the array_op itself except for the SIMDArray case, i.e., we shouldn't be visiting ArrayBroadcast in the backends except for SIMDArray

Yes, I know the array_op pass handles the ArrayBroadcast. But think about it from a general perspective. In general, I think ArrayBroadcast is meant for all types of arrays. If you plan to support only SIMD arrays, now or in the future, I would add an LCompilersAssert(is_simd_array()) (or anything similar) to indicate that only SIMD arrays are supported and to help us catch unexpected bugs in ArrayBroadcast (for example, when some other type of array gets passed to ArrayBroadcast).

I think SIMDArray would have an assignment here:

Consider a with length 128 and b with length 256 and we have b = a. I think here we need to broadcast a so that it matches the length of b.

I think we wouldn't be able to add a check LCompilersAssert(is_simd_array()) in ArrayBroadcast, as ArrayBroadcast type will always be a FixedArraySize. Maybe it might be possible in visit_Assignment or visit_ArrayPhysicalCast.

Consider a with length 128 and b with length 256 and we have b = a. I think here we need to broadcast a so that it matches the length of b.

We shouldn't allow assignment between arrays of different sizes, right?
GFortran throws an error for it:

$ gfortran examples/expr2.f90 && ./a.out
examples/expr2.f90:6:0:
    6 | y = x
      |
Error: Different shape for array assignment at (1) on dimension 1 (256 and 128)

Ok, fine for now. We can also add the assert later as the design evolves.

Shaikh-Ubaid · 2023-11-14T15:31:57Z

src/libasr/codegen/asr_to_c.cpp

+            } else if (ASRUtils::is_simd_array(x.m_v)) {
+                index += src;


I would ideally not implement array item for SIMD arrays unless absolutely necessary. I think we should avoid using it as much as possible, ideally never.

Yes, I used this for the print statement, just for debugging. If this is not required, I can remove it.

Should we introduce an option, like --print-simd?

Should we introduce an option, like --print-simd?

I do not have any opinion on this. I would just do my best to avoid as many if (is_simd_array()) ... else branches as possible, so that the code stays clean.

What should "--print-simd" do?

My question was: should we print a SIMDArray?
The print_arr pass converts the SIMDArray print to a do loop to print the values. Should we allow it?

print *, a

--show-fortran

do __1_k = lbound(a, 1), ubound(a, 1)
    print *, a(__1_k)
end do

The changes in cf65b41 and 53de75b were introduced for the above do loop.

src/libasr/pass/print_arr.cpp

Shaikh-Ubaid · 2023-11-14T15:38:22Z

src/libasr/codegen/asr_to_c.cpp

-            src = "((" + result_type + ")" + var_name + "->dims[" + idx + "-1].lower_bound)";
+            if (ASRUtils::is_simd_array(x.m_v)) {
+                src = "0";
+            } else {


I think lower_bound for an SIMD array is not meaningful. I am unsure, but I think this case should never be triggered for SIMD arrays.

Same, used for printing the output

Again, no opinion on this. Most likely it would not get triggered, so it seems like dead code to me at the moment.

Okay, I thought of removing them for now.

I would keep this for further implementation of SIMDArray, as it would be helpful for debugging.
I will create an issue to remove it.

Please post a link of the issue here.

Shaikh-Ubaid · 2023-11-14T15:39:40Z

Thanks for the contributions, Thirumalai. I shared some comments above. Overall, it looks good to me.

Shaikh-Ubaid · 2023-11-14T15:43:44Z

Looking at the changes in the backend, it seems (as we expected) that several if (simd_array()) ... else branches are needed. This seems to point in the direction of a separate type for Vector Array (https://github.com/lcompilers/lpython/wiki/Design-of-Vector-Arrays-in-ASR).

I think as we dive more, the design might get clearer.

certik · 2023-11-15T21:14:13Z

What needs to be done to finish this?

Shaikh-Ubaid · 2023-11-16T05:48:24Z

integration_tests/simd_02.f90

+    res = A + B
+    C(:4) = res
+    res = A * B
+    C(5:) = res


Could you add a print *, C here (before the if assert) so that it is helpful for debugging later?

I think it would be useless for the CI, so let's not do it. The developer can always add a print *, C and test it, but not git stage it.

I think it would be useless for the CI, so let's not do it. The developer can always add a print *, C and test it, but not git stage it.

I think the tests are not just for the CI but also to help developers debug an issue (when one arises). I think there is no disadvantage in printing values to the console, and there are advantages:

It helps developers debug the test (without making changes to it).

Most important: it ensures that the test actually runs and produces values, which developers can see on the console. Previously, we had some tests with asserts (and no prints) whose asserts never ran, because the function that would test them was removed by the unused_functions pass. The test would pass (because no checking was being done), giving the developer the false impression that it works correctly. A print would have been very helpful in avoiding that situation: the developer would have noticed when adding the test that no value was printed on the console (because even the print would be removed along with the function).

I didn't know it would print an output on the console on failure. In that case, I think we can add a print statement.
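
For reference, the pattern in the simd_02 excerpt above (two vector BinOps, each stored into one half of C, with a print before any check) could look roughly like this in C. This is a sketch assuming GCC or Clang vector extensions; the function name simd_binop and its pointer-based interface are hypothetical and do not reflect the code the backend emits.

```c
#include <stdio.h>
#include <string.h>

typedef float v4f __attribute__((vector_size(sizeof(float) * 4)));

/* Mirrors: res = A + B; C(:4) = res; res = A * B; C(5:) = res */
static void simd_binop(const float a_in[4], const float b_in[4], float c[8]) {
    v4f a, b, res;
    memcpy(&a, a_in, sizeof(a));
    memcpy(&b, b_in, sizeof(b));

    res = a + b;                       /* one vector add */
    memcpy(c, &res, sizeof(res));      /* C(:4) = res */
    res = a * b;                       /* one vector multiply */
    memcpy(c + 4, &res, sizeof(res));  /* C(5:) = res */

    /* Print before any check, so a run that silently skips this code
       shows up as missing console output. */
    for (int i = 0; i < 8; i++) {
        printf("%g ", c[i]);
    }
    printf("\n");
}
```

A caller would pass two four-element inputs and an eight-element output, then compare c against the expected sums and products.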

Shaikh-Ubaid · 2023-11-16T05:54:41Z

src/libasr/codegen/asr_to_c_cpp.h

+                    } else {
+                        value += "->data";
+                    }
+                } else if (ASR::is_a<ASR::Var_t>(*x.m_target)) {


Should this instead be the following?

Suggested change

-                } else if (ASR::is_a<ASR::Var_t>(*x.m_target)) {
+                } else if (ASR::is_a<ASR::Var_t>(*x.m_value)) {

Also, based on the if-else conditions, I think the above case is currently unused (that is, dead code)?

Done.

Could you specify what part was updated? I can see no change in the above if statement. Did you push your changes?

Now, it seems updated. Thanks.

Shaikh-Ubaid · 2023-11-16T06:02:49Z

I think reference tests need to be updated.

Shaikh-Ubaid · 2023-11-16T07:51:06Z

@Thirumalai-Shaktivel Could you also check if the above changes work with LPython by submitting a PR?

Thirumalai-Shaktivel · 2023-11-16T07:57:23Z

@Shaikh-Ubaid, you can do a final review now.

Yup, I was planning to do the same, after the review.

Shaikh-Ubaid

It seems good to me. Thanks for this.

If these changes work with LPython, I think it is good to merge.

Thirumalai-Shaktivel · 2023-11-16T08:45:01Z

LPython: lcompilers/lpython#2425

Thirumalai-Shaktivel · 2023-11-16T08:46:03Z

Thanks for the review!

Shaikh-Ubaid · 2023-11-16T09:55:14Z

It seems this PR is merged, but the related PR is still open (I think it might be waiting for approval). Please merge a PR and its related PRs together or at similar times (ideally when there is approval on all of them). This will help keep libasr intact and make it easier to contribute libasr changes to the two projects. Thanks for the contributions. I appreciate it.

Thirumalai-Shaktivel added 5 commits November 10, 2023 17:35

Bug: [C] Fix a missing indent

998a74e

Bug: Fix the accessing of m_values

8e32f3e

Test: Add a test and register in CMakeLists

fe368f8

[C] Handle ArrayBound for SIMDArray

ae0de2d

[C] Handle ArrayItem for the SIMDArray

3c7466c

Thirumalai-Shaktivel force-pushed the simd_02 branch from 392e616 to eef5b18 Compare November 10, 2023 12:59

certik approved these changes Nov 10, 2023

Thirumalai-Shaktivel added 2 commits November 13, 2023 19:57

[ASR Pass] Use the utils function to check for SIMDArray

000d608

Refactor: [C] Handle ArrayBroadcast for variable initialization

cb792c7

czgdp1807 reviewed Nov 14, 2023

Thirumalai-Shaktivel force-pushed the simd_02 branch from eef5b18 to ea2227f Compare November 14, 2023 14:20

Thirumalai-Shaktivel force-pushed the simd_02 branch from 2dfcd8c to 6b592a7 Compare November 14, 2023 14:44

certik requested review from Shaikh-Ubaid and czgdp1807 November 14, 2023 15:03

certik approved these changes Nov 14, 2023

Shaikh-Ubaid reviewed Nov 14, 2023

Shaikh-Ubaid reviewed Nov 14, 2023

Thirumalai-Shaktivel force-pushed the simd_02 branch from 6b592a7 to 9d28f26 Compare November 14, 2023 17:45

[C] Use memcpy for initializing SIMDArray's

d674960

Thirumalai-Shaktivel force-pushed the simd_02 branch from 9d28f26 to 0ec0d61 Compare November 16, 2023 05:26

Shaikh-Ubaid reviewed Nov 16, 2023

Update tests

a4098ef

Thirumalai-Shaktivel force-pushed the simd_02 branch from 0ec0d61 to 68d80db Compare November 16, 2023 06:16

Thirumalai-Shaktivel mentioned this pull request Nov 16, 2023

[C] Remove SIMDArray specific handling in ArrayItem and ArrayBound #2856

Open

Thirumalai-Shaktivel force-pushed the simd_02 branch from 68d80db to 920dbb1 Compare November 16, 2023 07:53

Shaikh-Ubaid approved these changes Nov 16, 2023

Thirumalai-Shaktivel force-pushed the simd_02 branch from 920dbb1 to a4098ef Compare November 16, 2023 08:45

Thirumalai-Shaktivel marked this pull request as ready for review November 16, 2023 09:07

Thirumalai-Shaktivel enabled auto-merge November 16, 2023 09:08

Thirumalai-Shaktivel merged commit 92e744a into lfortran:main Nov 16, 2023
20 checks passed

Shaikh-Ubaid mentioned this pull request Nov 16, 2023

[C] Simd changes from LFortran lcompilers/lpython#2425

Merged

Thirumalai-Shaktivel deleted the simd_02 branch November 17, 2023 01:35

This was referenced Nov 17, 2023

[ASR Pass] Handle ArraySection and SIMDArray BinOp #2865

Merged

SIMD backend #2293

Open
