I am quite new to the polyhedral model and may still be unfamiliar with some related concepts, so please point out any mistakes I make.
I would like to know whether there are any methods for parallelization or loop tiling that automatically respect data dependencies. To be more specific, consider the following one-dimensional stencil computation:
    for (t = 1; t < T; t += 1)
        for (i = 1; i < N - 1; i += 1)
            A[t][i] = 0.25 * (A[t - 1][i + 1] - 2.0 * A[t - 1][i] + A[t - 1][i - 1]);
Since computing A[t][i] reads A[t - 1][i + 1], the statement instance (t, i) has to be executed after the statement instance (t - 1, i + 1). The two-level loop therefore cannot simply be tiled; otherwise the data dependency would be violated.
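As a rough illustration of the argument above, the three reads of the stencil can be written as dependence distance vectors in (t, i) space and checked against the usual sufficient condition for rectangular tiling: every component of every distance must be non-negative. This is only a minimal sketch in C; the `distance` type, the `rect_tiling_is_legal` function, and the `stencil_deps` table are hypothetical names introduced here and are not part of Tiramisu or ISL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Dependence distance in the (t, i) iteration space:
   reading iteration minus writing iteration. */
typedef struct { int dt; int di; } distance;

/* Sufficient condition for rectangular tiling of a 2-D loop nest:
   no dependence distance has a negative component. */
bool rect_tiling_is_legal(const distance *deps, size_t n) {
    assert(deps != NULL || n == 0);
    for (size_t k = 0; k < n; ++k)
        if (deps[k].dt < 0 || deps[k].di < 0)
            return false;
    return true;
}

/* A[t][i] reads A[t-1][i+1], A[t-1][i] and A[t-1][i-1]. */
const distance stencil_deps[3] = { {1, -1}, {1, 0}, {1, 1} };
```

For `stencil_deps` the check fails because of the (1, -1) distance. After the skew i' = i + t the distances become (1, 0), (1, 1), and (1, 2), and the same check passes.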
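However, the Computation::tile function in Tiramisu does not seem to make any effort to respect the data dependency. After uncommenting the two lines related to loop tiling, the output Halide IR changes to one where the data dependency is violated, and the test also fails.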
I have heard that the Pluto algorithm can be used in such a scenario. It automatically skews the iteration domain so that the data dependency no longer prevents tiling:
    for (t = 1; t < T; t += 1)
        for (i = 1 + t; i < N - 1 + t; i += 1)
            A[t][i - t] = 0.25 * (A[t - 1][i - t + 1] - 2.0 * A[t - 1][i - t] + A[t - 1][i - t - 1]);
and the loop can be safely tiled. It is also possible to skew the iteration domain and perform loop tiling manually in Tiramisu, and doing so passes the test. But it requires me to observe the loop pattern and work out the transformation myself. Moreover, if I want to parallelize the loop, it requires not only skewing but also synchronization and communication between parallel computation units. This seems complicated to me, but it can theoretically be automated through the Pluto algorithm. This is why I would like to know: are there any methods in Tiramisu for parallelization or loop tiling that automatically respect data dependencies?
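To make the skew-then-tile idea above concrete, here is a small sketch in plain C (not Tiramisu code) that runs the stencil three ways: untiled, rectangularly tiled in (t, i), and rectangularly tiled after the skew j = i + t. The sizes `T`, `N`, the tile size `B`, the initial values, and all function names are assumptions chosen for illustration only.

```c
#include <assert.h>

enum { T = 8, N = 16, B = 4 };  /* illustrative sizes and tile size */

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* Row 0 and the two boundary columns hold input data; the rest starts at 0. */
void init(double A[T][N]) {
    for (int t = 0; t < T; ++t)
        for (int i = 0; i < N; ++i)
            A[t][i] = (t == 0 || i == 0 || i == N - 1)
                          ? (double)((i * 37) % 11) : 0.0;
}

/* One statement instance (t, i) of the stencil. */
void update(double A[T][N], int t, int i) {
    assert(t >= 1 && t < T && i >= 1 && i < N - 1);
    A[t][i] = 0.25 * (A[t - 1][i + 1] - 2.0 * A[t - 1][i] + A[t - 1][i - 1]);
}

void run_reference(double A[T][N]) {
    for (int t = 1; t < T; ++t)
        for (int i = 1; i < N - 1; ++i)
            update(A, t, i);
}

/* Rectangular tiles in (t, i): reads A[t-1][i+1] before the next tile wrote it. */
void run_naive_tiled(double A[T][N]) {
    for (int tt = 1; tt < T; tt += B)
        for (int ii = 1; ii < N - 1; ii += B)
            for (int t = tt; t < min_int(tt + B, T); ++t)
                for (int i = ii; i < min_int(ii + B, N - 1); ++i)
                    update(A, t, i);
}

/* Rectangular tiles in (t, j) with j = i + t, so j lies in [1 + t, N - 1 + t). */
void run_skewed_tiled(double A[T][N]) {
    for (int tt = 1; tt < T; tt += B)
        for (int jj = 2; jj < N + T - 2; jj += B)
            for (int t = tt; t < min_int(tt + B, T); ++t) {
                int lo = max_int(jj, 1 + t);
                int hi = min_int(jj + B, N - 1 + t);
                for (int j = lo; j < hi; ++j)
                    update(A, t, j - t);
            }
}

/* Largest absolute element-wise difference between two result arrays. */
double max_abs_diff(double X[T][N], double Y[T][N]) {
    double m = 0.0;
    for (int t = 0; t < T; ++t)
        for (int i = 0; i < N; ++i) {
            double d = X[t][i] - Y[t][i];
            if (d < 0) d = -d;
            if (d > m) m = d;
        }
    return m;
}
```

Initializing three arrays with `init`, running one variant on each, and comparing with `max_abs_diff` should show the skewed tiling agreeing with the reference while the naive tiling does not. In the skewed space every read comes from a tile that is the same or earlier in both the t and j directions, which is why the tiles can be executed in lexicographic order.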
Yes, you would need Tiramisu to apply skewing automatically for this type of stencil to be parallelized. The public Tiramisu does not check the correctness of transformations, so if you apply tiling to code that should not be tiled, you'll get wrong results. We have a private branch that checks for correctness, but it never made its way to the master branch.
We have the skew() command, which you can use to apply skewing manually.
Currently Tiramisu does not do skewing automatically. Tiramisu does, however, use the ISL library, which implements the Pluto algorithm. You can very easily call the Pluto algorithm in ISL and enable Tiramisu to automatically skew this code and parallelize it. We are also working on a new algorithm for automatic code parallelization that will solve this problem, but it will not be ready soon.
So in short: Tiramisu does not currently apply skewing automatically. You can very easily add support for skewing if you want, given that the Pluto algorithm is already implemented in ISL; we are just not calling it.
Please let me know if you have any other questions.
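To give a feel for what a Pluto-style scheduler looks for in this example, the following is a toy sketch in C. It is not ISL's API and not the real algorithm, which formulates the problem as an integer linear program. It simply enumerates small non-negative coefficients (ct, ci) of a schedule row ct*t + ci*i and keeps the first two linearly independent rows h with h . d >= 0 for every dependence distance d. The names `distance`, `hyperplane`, `hyperplane_is_legal`, and `find_tiling_band`, as well as the brute-force search order, are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { int dt; int di; } distance;    /* dependence distance */
typedef struct { int ct; int ci; } hyperplane;  /* schedule row ct*t + ci*i */

/* A row is legal for tiling if no dependence points backwards along it. */
bool hyperplane_is_legal(hyperplane h, const distance *deps, size_t n) {
    for (size_t k = 0; k < n; ++k)
        if (h.ct * deps[k].dt + h.ci * deps[k].di < 0)
            return false;
    return true;
}

/* Enumerate coefficients in [0, max_coeff] by increasing ct + ci and keep
   the first two legal, linearly independent rows. Returns how many were
   found (0, 1 or 2) and stores them in out. */
int find_tiling_band(const distance *deps, size_t n, int max_coeff,
                     hyperplane out[2]) {
    assert(max_coeff >= 1);
    int found = 0;
    for (int cost = 1; cost <= 2 * max_coeff && found < 2; ++cost) {
        for (int ct = 0; ct <= cost && found < 2; ++ct) {
            int ci = cost - ct;
            if (ct > max_coeff || ci > max_coeff)
                continue;
            hyperplane h = { ct, ci };
            if (!hyperplane_is_legal(h, deps, n))
                continue;
            /* The second row must not be a multiple of the first. */
            if (found == 1 && out[0].ct * h.ci - out[0].ci * h.ct == 0)
                continue;
            out[found++] = h;
        }
    }
    return found;
}
```

For the stencil's distances (1, -1), (1, 0), (1, 1), this search yields the rows (1, 0) and (1, 1), that is, the new loop dimensions t and t + i, which matches the manual skew discussed in the question.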
Thank you for helping. I understand the situation now. The problem is that I'm not really familiar with the APIs in ISL, especially the ones related to isl_schedule. I read the ISL documentation, and it seems that the Pluto algorithm implementation is closely tied to isl_schedule. Would you be kind enough to provide an example of using ISL to automatically skew and tile the loop in my example? Thanks a lot!