
Observations

Flang doesn't seem to vectorize operations that look like they would benefit from it. Both clang and flang process certain operations in blocks of four, but only clang appears to emit vector instructions for them (see fnc_add_scalar.f90 vs fnc_add_scalar.c). I'm not sure whether there's a compiler flag that enables vectorization; if not, this might be a good opportunity for optimization. EDIT: this is a known issue and is being worked on. It's related to this task.
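
For reference, the kind of loop in question can be sketched in C as below. This is a hypothetical stand-in, not the actual contents of fnc_add_scalar.c; the function name `add_scalar` and its signature are assumptions made for illustration. With clang, compiling such a file at `-O2` with `-Rpass=loop-vectorize` should print a remark for each loop the vectorizer transformed, which is one way to confirm what the generated code suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel (not the repository's code): add one scalar to every
   element of src and store the result in dst. `restrict` promises the
   compiler that dst and src do not overlap, which removes a common obstacle
   to auto-vectorization. */
void add_scalar(float *restrict dst, const float *restrict src,
                float scalar, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i) {
        dst[i] = src[i] + scalar;
    }
}
```

A loop that is only unrolled by four shows four scalar adds per iteration in the IR, while a vectorized one operates on a vector type such as `<4 x float>`.
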
I found that resize underperforms in flang binaries compared to gfortran binaries. It doesn't seem to be an inlining problem as I guessed earlier, but rather an allocation problem. In resize_test, resh allocates 3 arrays on the stack and calls malloc, whereas resh_manual and the C function transpose the array in place. The same can be seen for transpose. This could be a good opportunity for optimization in cases where the sizes of both the original and output arrays are known at compile time. EDIT: this is part of the ongoing array reduce copy effort.
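
The difference between the two strategies can be illustrated with a small C sketch. Both functions below are hypothetical and are not the repository's resh, resh_manual, or C benchmark code; they assume a square, row-major matrix of doubles purely to keep the example short. The first version mirrors the temporary-plus-copy pattern, the second works directly on the original storage.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Temporary-based version: build the transposed matrix in a heap buffer,
   then copy it back over the input. Returns 0 on success and -1 if the
   allocation fails. */
int transpose_with_temp(double *a, size_t n)
{
    assert(a != NULL);
    if (n == 0) {
        return 0;                       /* nothing to do, skip malloc(0) */
    }
    double *tmp = malloc(n * n * sizeof *tmp);
    if (tmp == NULL) {
        return -1;
    }
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            tmp[j * n + i] = a[i * n + j];
        }
    }
    memcpy(a, tmp, n * n * sizeof *tmp);
    free(tmp);
    return 0;
}

/* In-place version: swap each element above the diagonal with its mirror
   below the diagonal. No extra array is allocated. */
void transpose_in_place(double *a, size_t n)
{
    assert(a != NULL);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = i + 1; j < n; ++j) {
            double t = a[i * n + j];
            a[i * n + j] = a[j * n + i];
            a[j * n + i] = t;
        }
    }
}
```

Both functions produce the same result; the first pays for an allocation and two full passes over the data, which is the kind of overhead described above.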

Make sure to manually specify where the compiled runtime libraries are so that the linker can find them.
Flang's frontend doesn't make any premature stride optimizations when looping through arrays. If you see a lot of multiplying by 1 and adding 0 in the unoptimized LLVM IR, it's because the frontend doesn't special-case iteration with a step of 1.
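
As a rough picture of where those instructions come from, the C sketch below contrasts a fully general index computation with the unit-stride special case. It illustrates the pattern only and is not flang's actual lowering; the function names and the `lower`/`step` parameters are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* General form: iteration k touches element k * step + lower. When called
   with step = 1 and lower = 0 the arithmetic becomes k * 1 + 0, which is
   the multiply-by-1 and add-0 pattern visible in unoptimized IR. */
long sum_strided(const long *a, long lower, long step, long count)
{
    assert(a != NULL);
    long total = 0;
    for (long k = 0; k < count; ++k) {
        long idx = k * step + lower;
        total += a[idx];
    }
    return total;
}

/* Special case a frontend could emit for a step of 1 starting at 0:
   the index is just the loop counter. */
long sum_unit(const long *a, long count)
{
    assert(a != NULL);
    long total = 0;
    for (long k = 0; k < count; ++k) {
        total += a[k];
    }
    return total;
}
```

Leaving the general form in place keeps the frontend simple, and later optimization passes are expected to fold the trivial multiplications and additions away.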

add_arrays_test
add_scalar_test
arr_conc
div_scalar_test
fnc_add_arrays
fnc_add_scalar
intr_fnc_test
max_fnc_test
mrmvrs
mult_scalar_test
plain_shape_test
resize_test
rho
transpose_test
.gitattributes
.gitignore
clean-all.sh
readme.md
run-all.sh
run-bench.sh
run-c-f-bench.sh
run-test.sh