benchmark_name | cpp_code | accera_code |
---|---|---|
Accera_Vectorized2 | src/softmax/accera_vectorized_2.cpp | src/softmax/vectorized_2.py |
Note
The following shows the implementation of the {{benchmark_name}}.
The full source code listing of the Accera code generator can be found in {{accera_code}} :fas fa-code: and the benchmark runner is found in {{cpp_code}} :fas fa-code: .
As in the naive implementation, we first need to import the required packages:
We then define the input size:
Target-dependent characteristics can be queried by creating a package target.
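One characteristic that matters for vectorization is how many elements fit in a single vector register. The sketch below is plain Python, not Accera code, and the 32-byte register width is a hypothetical value (it matches a 256-bit AVX2 register). A real program would read the width from the package target rather than hard-code it.

```python
import struct

# Hypothetical value: 32 bytes per vector register (256-bit AVX2).
# In practice this would come from the package target, not a constant.
VECTOR_BYTES = 32

# Size of a standard single-precision float in bytes.
FLOAT32_BYTES = struct.calcsize("<f")


def lanes_per_vector(vector_bytes: int, element_bytes: int) -> int:
    """Return how many elements of the given size fit in one vector register."""
    if element_bytes <= 0 or vector_bytes % element_bytes != 0:
        raise ValueError("vector width must be a positive multiple of the element size")
    return vector_bytes // element_bytes


lanes = lanes_per_vector(VECTOR_BYTES, FLOAT32_BYTES)
```

The resulting lane count is a natural choice for the split size of a loop that is about to be vectorized.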
We define our package which will be used throughout our program.
As done previously, the input and output arrays are defined along with the auxiliary temporaries.
The `max` operation schedule can be defined and vectorized.
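To illustrate what vectorizing a `max` reduction means, here is a minimal plain-Python sketch (not Accera code). It assumes the common arrangement of one running maximum per vector lane, followed by a horizontal reduction and a scalar remainder loop. The function name `lanewise_max` and the `lanes` parameter are hypothetical.

```python
import math


def lanewise_max(values, lanes):
    """Compute max(values) in the order a vectorized loop would."""
    if not values:
        raise ValueError("values must be non-empty")

    # One running maximum per vector lane.
    acc = [-math.inf] * lanes

    # Largest prefix whose length is a multiple of the lane count.
    full = len(values) - len(values) % lanes

    # Main loop: each step consumes `lanes` contiguous elements,
    # mimicking a single element-wise vector max instruction.
    for start in range(0, full, lanes):
        for lane in range(lanes):
            acc[lane] = max(acc[lane], values[start + lane])

    # Horizontal reduction: collapse the lane maxima into one scalar.
    result = max(acc)

    # Scalar remainder loop for elements that did not fill a vector.
    for v in values[full:]:
        result = max(result, v)
    return result
```

Because `max` is associative and commutative, this reordering gives the same result as a sequential scan.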
We define the `exp` nest.
The `accum` operation schedule is defined, but we do not perform vectorization on it.
[!ATTENTION] The `accum` nest can be vectorized as shown in the vectorized reduction case study.
We finally define the normalization nest.
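For reference, the four nests correspond to the four passes of a softmax. The sketch below is plain Python rather than Accera code, and it assumes the usual numerically stable form in which the maximum is subtracted before exponentiation. The function name `softmax_reference` is hypothetical.

```python
import math


def softmax_reference(x):
    """Four-pass softmax mirroring the max, exp, accum, and normalization steps."""
    if not x:
        raise ValueError("input must be non-empty")

    # Pass 1 (max): find the largest input value.
    max_val = max(x)

    # Pass 2 (exp): exponentiate the shifted inputs.
    # Shifting by the maximum is an assumption here; it avoids overflow.
    exps = [math.exp(v - max_val) for v in x]

    # Pass 3 (accum): sum the exponentials.
    denom = sum(exps)

    # Pass 4 (normalization): divide each exponential by the sum.
    return [e / denom for e in exps]
```

A reference like this is useful for checking the output of the generated package against known-good values.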
We invoke the above functions to add the nests to the package.
Finally, we export the package.
The package can then be used within our C code base. To do so, we first need to import the HAT package created:
We then declare our inputs and outputs:
We can then use the exported function within our C code: