transformations: Implement stencil inlining. #2615

PapyChacal · 2024-05-21T15:51:17Z

Apologies for the monster PR; I might be able to split it in two 🤔

codecov · 2024-05-21T15:58:18Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.62%. Comparing base (97727df) to head (13e93ad).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2615      +/-   ##
==========================================
+ Coverage   89.61%   89.62%   +0.01%     
==========================================
  Files         360      361       +1     
  Lines       46198    46334     +136     
  Branches     6985     7026      +41     
==========================================
+ Hits        41399    41528     +129     
- Misses       3724     3727       +3     
- Partials     1075     1079       +4


tests/filecheck/transforms/stencil-inlining.mlir

tobiasgrosser · 2024-05-22T09:48:28Z

How nice. I am curious about the performance numbers.

xdsl/tools/command_line_tool.py

PapyChacal · 2024-05-22T12:36:50Z

> How nice. I am curious about the performance numbers.

Me too!

There is some polishing to do; it does not seem to work exactly as expected on big kernels.
Because of my KISS implementation, this pass as-is also explodes combinatorially on big kernels.
So I'll first integrate with #2623 to cut off those explosions, and then pinpoint exactly where I still missed something!
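
To make the blow-up concrete, here is a toy counting model. It is not the pass from this PR, and the function name is hypothetical. It assumes that naive inlining clones a producer's body once per offset at which the consumer reads it, so the number of clones multiplies along a chain of stencils.

```python
# Toy model only, not the xDSL pass: count how many copies of the first
# producer's body exist after naively inlining a whole chain of stencils.

def inlined_body_copies(accesses_per_stage: list[int]) -> int:
    """Each consumer reads its producer at `n` offsets; naive inlining
    clones the producer body once per access, so the copies multiply."""
    copies = 1
    for n in accesses_per_stage:
        copies *= n
    return copies


# Four consumer stages, each reading its producer at 5 points
# (a 2D Laplacian-like pattern): 5 * 5 * 5 * 5 clones of the first body.
print(inlined_body_copies([5, 5, 5, 5]))  # 625
```

Under this assumption the growth is exponential in the chain depth, which is why some cut-off or deduplication step is needed before running on big kernels.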

georgebisbas

Is this now ready to review?

PapyChacal · 2024-05-28T11:08:03Z

@georgebisbas

> Is this now ready to review?

Just finished my checks, it is now!

@tobiasgrosser

> How nice. I am curious about the performance numbers.

Here they are:

| kernel | OEC | xDSL | xDSL w/ inlining | Relative error w/ inlining |
|---|---|---|---|---|
| laplace | 332 | 313 | 326 | 3.19949e-11 |
| fastwavesuv | 109 | 409 | 122 | 6.31987e-08 |
| fvtp2d_flux | 76 | 525 | 72 | 4.16758e-16 |
| fvtp2d_qi | 107 | 549 | 80 | 9.55451e-13 |
| fvtp2d_qj | 137 | 987 | 131 | 0 |
| hadvuv | 91 | 585 | 90 | 3.41916e-08 |
| hadvuv5th | 109 | 570 | 109 | 4.04618e-08 |
| hdiffsa | 74 | 294 | 83 | 4.44881e-12 |
| nh_p_grad | 94 | 323 | 104 | 1.57696e-13 |
| p_grad_c | 57 | 158 | 67 | 1.34918e-14 |
| uvbke | 28 | 48 | 28 | 3.77537e-16 |

tobiasgrosser · 2024-05-28T11:10:12Z

Nice. The performance looks good. Why do we get a relative error here? Should inlining not perform exactly the same computation?

georgebisbas · 2024-05-28T11:52:59Z

Are these seconds?

PapyChacal · 2024-05-28T12:08:03Z

> Nice. The performance looks good. Why do we get a relative error here? Should inlining not perform exactly the same computation?

The relative error is xDSL inlined vs OEC inlined.

I'm not sure what exactly causes the slight differences yet, but could look into it; it might be quite the rabbit hole, given that different versions of CUDA, MLIR and clang are at play.

Regarding inlining doing the same computations: I'm not confident about OEC (or about me not missing something) on this side yet. In my tests, OEC outright crashes on some examples if I don't use inlining. On some other examples that run without issue, OEC's inlining appears to change the results relatively significantly; I could look into that too, most likely as a higher priority than xDSL vs old MLIR.

That's why, for now, I reported the relative error between the two frameworks with inlining enabled in both. That is the configuration where OEC behaves without surprises, and it at least demonstrates that things are consistent there.

I don't mind waiting until all those grey areas are clarified if anybody prefers.
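
For readers wondering what such a figure measures: the thread does not say which metric was used, so the following is only a minimal sketch assuming an L2-norm relative error between two flattened output buffers. The function name and the choice of norm are assumptions.

```python
import math


def relative_error(result: list[float], reference: list[float]) -> float:
    """Assumed metric: ||result - reference||_2 / ||reference||_2."""
    if len(result) != len(reference):
        raise ValueError("buffers must have the same length")
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(result, reference)))
    norm = math.sqrt(sum(b * b for b in reference))
    if diff == 0.0:
        # Identical buffers give exactly 0, even for an all-zero reference.
        return 0.0
    # A nonzero difference against an all-zero reference has no finite ratio.
    return diff / norm if norm > 0.0 else math.inf


# Hypothetical buffers: one element differs by 1 against a reference of norm 2.
print(relative_error([1.0, 0.0], [2.0, 0.0]))  # 0.5
```

With a metric of this kind, a value of exactly 0 means the two buffers match bit for bit, while values around 1e-8 indicate small discrepancies relative to the size of the reference.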

PapyChacal · 2024-05-28T12:09:34Z

> Are these seconds?

Milliseconds! That was just the first thing I got to work on that side; I can fine-tune now if anyone wants to see different measurements.

Those are 512 iterations over 64x64x64 domains with a halo size of 4 in all directions (i.e. 72x72x72 buffers, computation over the central 64x64x64).

NB: 512 iterations without buffer swapping, just repeating the same output buffer update from the same inputs. I'm actually not sure how this influences performance measurements on GPU 🤔 But FWIW, both frameworks are measured the same way here.
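
The setup described above can be sketched on the CPU as follows. This is an illustration under stated assumptions, not the actual benchmark harness: the helper names are hypothetical, the kernel is a 1D stand-in rather than one of the measured 3D kernels, and wall-clock timing on the CPU ignores the synchronization a GPU measurement would need.

```python
import time
from typing import Callable

DOMAIN, HALO, ITERATIONS = 64, 4, 512  # values quoted in the comment above


def buffer_extent(domain: int, halo: int) -> int:
    """Allocated size along one axis: the domain plus a halo on each side."""
    return domain + 2 * halo


def interior(domain: int, halo: int) -> range:
    """Indices of the computed region inside the padded buffer."""
    return range(halo, halo + domain)


def time_repeated_ms(
    kernel: Callable[[list[float], list[float]], None],
    src: list[float],
    dst: list[float],
    iterations: int,
) -> float:
    """Run the same src -> dst update repeatedly, with no buffer swapping."""
    start = time.perf_counter()
    for _ in range(iterations):
        kernel(src, dst)
    return (time.perf_counter() - start) * 1e3


def laplace_1d(src: list[float], dst: list[float]) -> None:
    """Hypothetical 1D stand-in kernel; it writes only the interior of dst."""
    for i in interior(DOMAIN, HALO):
        dst[i] = src[i - 1] - 2.0 * src[i] + src[i + 1]


n = buffer_extent(DOMAIN, HALO)  # 64 + 2 * 4 = 72 per axis
src = [float(i * i) for i in range(n)]
dst = [0.0] * n
elapsed = time_repeated_ms(laplace_1d, src, dst, ITERATIONS)
print(n, len(interior(DOMAIN, HALO)), f"{elapsed:.2f} ms")
```

Because the input never changes, every iteration recomputes the same output, so the loop measures repeated kernel execution rather than a time-stepping simulation.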

georgebisbas

I am happy with the reported relative error, so I approve

PapyChacal added the transformations Changes or adds a transformation label May 21, 2024

PapyChacal self-assigned this May 21, 2024

PapyChacal marked this pull request as ready for review May 21, 2024 16:24

superlopuh reviewed May 21, 2024

tests/filecheck/transforms/stencil-inlining.mlir (outdated)

AntonLydike changed the title ~~transformations: implement stencil inlining.~~ transformations: Implement stencil inlining. May 22, 2024

georgebisbas approved these changes May 22, 2024

xdsl/tools/command_line_tool.py

PapyChacal marked this pull request as draft May 22, 2024 12:42

PapyChacal force-pushed the emilien/stencil-inlining branch from fa327fe to 7453f78 Compare May 22, 2024 12:45

PapyChacal and others added 7 commits May 27, 2024 11:23

Canonicalization also gets rid of unused operands even if not coming …

7881cde

…from duplicated ones.

Stencil inlining.

a82c78f

Docstrings.

6ba4795

More docstrings.

edb4b8d

more comments.

610dc24

Update tests/filecheck/transforms/stencil-inlining.mlir

b4907aa

Co-authored-by: Sasha Lopoukhine <superlopuh@gmail.com>

Use canonicalization patterns in inlining.

96dc769

PapyChacal force-pushed the emilien/stencil-inlining branch from 3f78657 to c98e78c Compare May 27, 2024 12:56

Add missing check in inlining.

6570843

PapyChacal force-pushed the emilien/stencil-inlining branch from c98e78c to 6570843 Compare May 27, 2024 12:58

Tweaks.

7593ab0

georgebisbas reviewed May 27, 2024

Merge branch 'main' into emilien/stencil-inlining

55d71cc

PapyChacal marked this pull request as ready for review May 28, 2024 11:08

PapyChacal requested review from georgebisbas and tobiasgrosser May 28, 2024 11:08

georgebisbas approved these changes May 28, 2024

Merge remote-tracking branch 'origin/main' into emilien/stencil-inlining

13e93ad
