Start adding Python bindings for RTL dialect. #767

Merged: 12 commits merged into main from python-rtl-bindings on Mar 24, 2021
Conversation

mikeurbach
Contributor

Sharing work in progress here, but I can't actually get this to work. In the test case, I can build an RTL constant op with the line:

rtl.ConstantOp(i32, a50)

But when I go to print out the operations, I get this error:

dialect has no registered attribute printing hook
UNREACHABLE executed at /home/mikeurbach/alloy/external/llvm-project/mlir/include/mlir/IR/Dialect.h:89!
Aborted (core dumped)
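For reference, here's roughly the whole test (a sketch; import paths, the std.ConstantOp line, and constructor details only approximate what's in this branch):

```python
# Rough sketch of the failing test; exact module/import names may differ from
# this branch and the upstream MLIR Python bindings.
import circt
from mlir.ir import (Context, InsertionPoint, IntegerAttr, IntegerType,
                     Location, Module)
from mlir.dialects import std
from circt.dialects import rtl

with Context() as ctx, Location.unknown():
  # Register the CIRCT dialects (rtl, etc.) with this context.
  circt.register_dialects(ctx)

  i32 = IntegerType.get_signless(32)
  a50 = IntegerAttr.get(i32, 50)

  module = Module.create()
  with InsertionPoint(module.body):
    std.ConstantOp(i32, a50)  # Prints fine when the rtl op below is removed.
    rtl.ConstantOp(i32, a50)  # Builds fine...

  # ...but printing aborts with "dialect has no registered attribute printing hook".
  print(module)
```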

Some notes:

  • I also tried going through ctx.dialects.rtl.ConstantOp to build the op, but that leads to the same error
  • If I remove the line to build an rtl.ConstantOp, the test passes and prints out the std.ConstantOp no problem
  • If I remove the call to circt.register_dialects, and instead say ctx.allow_unregistered_dialect = True, it will correctly print out the rtl.ConstantOp in generic form
  • I was really confused by the error and couldn't figure out how to get a backtrace, so I hacked up Dialect.h to print the dialect namespace in that assertion... and it printed the empty string, which I think indicates printAttribute was invoked on the Standard dialect?
  • I've also tried registering the dialects on the context with a call like this: https://github.com/llvm/mlir-npcomp/blob/main/lib/CAPI/Registration.cpp#L16, or like this: https://github.com/llvm/llvm-project/blob/main/mlir/lib/CAPI/Registration/Registration.cpp#L15, directly inside the register_dialects method in CIRCTModule. This leads to the same error, and the registration calls in the current revision seem more in line with layering Python bindings on the CAPI, so that's what I've been testing.

I'm pretty stumped about what's going on. Since the error literally mentions dialects and registration, I'm pattern-matching this as a dialect registration issue. But I'm able to build ops no problem, and it only seems to be an issue when I try to print them. Not sure if that is any clue...

I've been trying to see what the Standard dialect does that our RTL dialect doesn't. I think I've got all the bindings set up the same, so I'm wondering if the handling in the MLIR Python bindings is subtly doing something differently for in-tree dialects versus ones we define. It seems like the MLIR Python bindings always just load the context with all the in-tree dialects on construction: https://github.com/llvm/llvm-project/blob/main/mlir/lib/Bindings/Python/IRModules.cpp#L460

@GeorgeLyon not sure if you've seen anything like this, but does it ring a bell?
@stellaraccident I believe this should be a minimal Python dialect bindings setup. If you have any free cycles, do you mind giving a once over to sanity check this?

@stellaraccident
Contributor

stellaraccident commented Mar 14, 2021

I suspect this may be the dynamic linking "ODR" bug rearing its head again. The symptom is a downstream effect of the TypeID for a dialect not matching, depending on which module is doing the query. The visible effect is often falling off the end of an unreachable block. If that is the case, it won't make any sense from a looking-at-the-code perspective.

Are you building with BUILD_SHARED_LIBS or with the DYLIB options (i.e. libMLIR)? For the former, I've been looking for a solid repro and suspect relative rpath canonicalization issues causing the dynamic linker to think that two instances of the same object are different. The latter is fairly broken in a few subtle ways.

You may want to try a variant of the hack in npcomp: https://github.com/llvm/mlir-npcomp/blob/4fd9b4afb526692ec4943633019eaa864315240b/python/npcomp/__init__.py#L1
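That hack essentially turns on RTLD_GLOBAL before the native extension modules get imported, so the dynamic loader resolves symbols (and thus TypeIDs) globally. A rough sketch of the same idea, placed at the very top of the Python package's __init__.py (the real npcomp code may differ in details):

```python
# Sketch of the RTLD_GLOBAL workaround (POSIX-only): widen dlopen flags before
# any native extension is loaded so all of them see the same dialect objects.
import sys

if hasattr(sys, "setdlopenflags"):
  import os
  sys.setdlopenflags(sys.getdlopenflags() | os.RTLD_GLOBAL)
```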

There are some relatively long-term fixes I'm working on upstream to help the dynamic linking situation. If enabling global ld loading fixes it, that is a good sign this is the same vague-linkage issue I'm referring to. I can also pull your branch and work on it (more evidence may help me further isolate the issue). I've not built circt before: is it pretty self-explanatory?

@mikeurbach
Contributor Author

I've been using the DYLIB options, but I can try the BUILD_SHARED_LIBS approach and see what happens.
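Roughly, the difference is just the linking flags passed to the LLVM/MLIR CMake configure (illustrative; the rest of the options follow our normal build scripts):

```sh
# DYLIB configuration (what I had been using): link against the aggregate
# libLLVM/libMLIR, e.g. -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_LINK_LLVM_DYLIB=ON
# plus the MLIR equivalent.
# BUILD_SHARED_LIBS configuration (what I'll try next): per-component shared libs.
cmake -G Ninja ../llvm \
  -DLLVM_ENABLE_PROJECTS=mlir \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_BUILD_TYPE=Debug
```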

I missed that hack in npcomp, I will try that out as well.

If you want to build circt, we try to keep this up to date: https://github.com/llvm/circt/blob/main/docs/GettingStarted.md, and I hope it's unsurprising. I'm currently building mlir with circt and npcomp both as external projects, but I've also used the documented approach fairly recently.

@stellaraccident
Contributor

stellaraccident commented Mar 14, 2021 via email

@mikeurbach
Contributor Author

I think that makes sense. Thanks for the insight.

I just built this with BUILD_SHARED_LIBS instead, and that works. Hope this is a good data point... I'm still catching up on these issues.

@stellaraccident
Contributor

> I think that makes sense. Thanks for the insight.
>
> I just built this with BUILD_SHARED_LIBS instead, and that works. Hope this is a good data point... I'm still catching up on these issues.

Thanks, yes, that is a good data point. The rework to shared linking upstream basically builds the aggregate dylibs using the same mechanism as BUILD_SHARED_LIBS, which maintains the dependency graph appropriately (libLLVM/libMLIR does not).

I've been meaning to also switch npcomp to this mode. Good to know it works.

This adds a -shared-libs option to llhd-sim that is analogous to the
same option for mlir-cpu-runner. If specified, those libraries are
passed to the ExecutionEngine to dynamically load and link.
@teqdruid
Contributor

I had to install numpy to get LLVM to configure with CMake after your build-llvm.sh change. That's fine, but it should be noted somewhere. After that, I can compile and the test runs! (Why is this an integration test? It is a targeted test of basic functionality.) I'm compiling both CIRCT and LLVM with BUILD_SHARED_LIBS.

Really exciting!

@mikeurbach
Contributor Author

Thanks for taking a look.

The build-llvm change should go away before merging this, or at least that flag shouldn't always be set in there and numpy should only be required if you want to build python bindings. I'll make sure to document that dependency regardless.

The integration test is really a placeholder. This is the simplest test to check that the RTL dialect is registered. Eventually I'd like to be creating and wiring together rtl.modules and whatnot, and then calling ExportVerilog. In general, we could move these to be regular tests, but I thought we were just going to run them nightly for now.

Just creating an rtl.module is pretty clumsy at the moment, and I have a separate patch to add a builder with a small Python extension. I was thinking one thing at a time (at least until the integration test actually runs), and then I'll follow up with a better test as more support comes in.

@mikeurbach
Contributor Author

FYI, sorry for the apparent lack of progress here. This does work! But it currently doesn't pass the integration tests: https://github.com/llvm/circt/runs/2167346388?check_suite_focus=true. Something about where the MLIR Python sources get installed versus the CIRCT Python sources is off. I am hoping to get this wrapped up today or tomorrow.

mikeurbach force-pushed the python-rtl-bindings branch 7 times, most recently from b4f30b5 to 223e69b, on March 23, 2021 04:02
mikeurbach force-pushed the python-rtl-bindings branch 10 times, most recently from ebfab74 to 8e98e7a, on March 23, 2021 06:50
@mikeurbach
Contributor Author

mikeurbach commented Mar 23, 2021

I can run the Python integration tests locally, and run them with the Docker-based local integration testing setup.

However, in GitHub Actions they continue to fail. It looks like the MLIR Python bindings can't be found.

They fail when trying to import an MLIR Python module: https://github.com/llvm/circt/runs/2172901557?check_suite_focus=true#step:11:124

I made sure MLIR was building with Python bindings enabled, and busted the cache key in this branch so that the Python modules and other new files would be installed. It sure looks like the Cache LLVM action installed them: https://github.com/llvm/circt/runs/2167346388?check_suite_focus=true#step:6:8393

But the test continues to fail with that error. I added ls -alR llvm/install to the GitHub workflow to try and see what was going on, and sure enough, there is no python directory in there: https://github.com/llvm/circt/runs/2172901557?check_suite_focus=true#step:8:26

What gives? It seems like our Cache LLVM step is caching the whole llvm directory, so I would expect the Python bits to be installed. This branch uses a different cache key for nightly integration tests versus the regular tests, so I would think there is no conflict there. @teqdruid @youngar I think you guys have the most experience with this workflow, do you have any idea why the python directory is missing from llvm/install after the cache is restored?

EDIT: I found that in these workflow runs, LLVM is being restored from a different cache key than what saved it: https://github.com/llvm/circt/runs/2177073441?check_suite_focus=true#step:6:11

Sorry for the noise... I am updating that.

@teqdruid
Contributor

It's always the dumb stuff that gets ya. My experience is that when you hit this sort of thing ("this should work, damnit!"), it's best to take a break and then come back to it fresh... Tougher for me to do than it sounds.

I think we cache too much. We should either cache the build or install directory. I've had success building locally against the build/ directory but the install directory would be smaller and should have everything we need. The Windows build cache only caches build/ and install/ (since restore on Windows has trouble overwriting files). I've been too busy/lazy to make this change for the Linux builds.

@mikeurbach
Contributor Author

> it's best to take a break and then come back to it fresh

Yep, a cup of coffee and fresh eyes was all it took.

> I think we cache too much. We should either cache the build or install directory.

I think it will work against the install directory. I am going to open a separate PR for some general GitHub workflow improvements. In my wee-hours testing, I figured out that these CMake hints aren't actually doing anything:

-DMLIR_DIR=../llvm/install/lib/cmake/mlir/ \
-DLLVM_DIR=../llvm/install/lib/cmake/llvm/ \

The ../ is not needed. I believe that is the root cause of why llvm-lit wasn't being found, and why we had to stick with Ubuntu 18.04 for now in this workflow. When I go update that, I can also change it to just cache the install directory.
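For reference, the corrected hints should presumably be the same paths minus the leading ../ (assuming they are resolved relative to where the workflow invokes CMake):

```sh
-DMLIR_DIR=llvm/install/lib/cmake/mlir/ \
-DLLVM_DIR=llvm/install/lib/cmake/llvm/ \
```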

@teqdruid
Contributor

Thanks for dealing with the build flows!

@mikeurbach
Contributor Author

Moving the build changes unrelated to Python bindings over here: #810.

@mikeurbach
Contributor Author

@teqdruid I think this is ready to go. Nightly tests pass on this branch, and for the legs of the configuration matrix where BUILD_SHARED_LIBS is ON, we run the Python tests (this is another reason they are integration tests--I didn't want to impose build restrictions on our main PR check). Do you mind taking another look when you get a chance?

Regarding the NumPy dependency and the change in build-llvm.sh: I documented this (by referencing the upstream docs) and made the build-llvm.sh change generic. People should be able to keep running it as before to the same effect, and now you can also pass in any extra CMake definitions (like -DCIRCT_BINDINGS_PYTHON_ENABLED=ON). Curious what you think about this; I took inspiration from the mlir-npcomp script here. I'm no bash expert, but this seemed decent. I tried it out locally, but didn't think it was worth re-building the nightly workflow to test it.
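For example, something like this should now work (the script path is from memory, and the exact argument handling is whatever the updated script accepts):

```sh
# Same effect as before when run with no extra arguments; any extra -D flags
# are forwarded to the CMake configure.
./utils/build-llvm.sh -DCIRCT_BINDINGS_PYTHON_ENABLED=ON
```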

Since the "integration" test is so bare, here are some plans for the next steps, which I'd submit as separate patches:

  1. Add a Python extension to customize RTL module building, and add that to the test.
  2. Add bindings for the SV dialect. This should be nice and clean after some of the boilerplate in this PR, and could serve as an example of how to add bindings for other dialects (as we add more C APIs for them).
  3. Add bindings for ExportVerilog, including tests.

That's about as far as we can get in the "add Python bindings to the C API" department, then we can talk about pass managers and actually driving passes through Python.

@teqdruid (Contributor) left a comment

Looks great! Thanks for pushing this through!

@@ -0,0 +1,15 @@
//===-- RTLOps.td - Entry point for RTLOps bindings --------*- tablegen -*-===//
teqdruid (Contributor)

Is this file going to grow for the RTL dialect? If not, we will probably want to stuff all the dialects in here. No need to change now, just food for thought.

mikeurbach (Contributor Author)

It may grow a bit, but you are right, this is probably about it. However, unless something changes upstream, we will probably want to have several small files like this. This is the PYTHON_BINDING_TD_FILE mentioned here: https://mlir.llvm.org/docs/Bindings/Python/#integration-with-ods. I would expect one small one per dialect, similar to the .td files upstream: https://github.com/llvm/llvm-project/tree/main/mlir/lib/Bindings/Python

mikeurbach merged commit 9b2331e into main on Mar 24, 2021
mikeurbach deleted the python-rtl-bindings branch on March 24, 2021 02:13