Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement parsing EraVM assembler #448

Merged
merged 53 commits into from
Apr 25, 2024
Merged

Implement parsing EraVM assembler #448

merged 53 commits into from
Apr 25, 2024

Conversation

atrosinenko
Copy link
Collaborator

This PR contains implementation of assembler parser for EraVM target as well as several refactorings made to backend's internal representation of EraVM instructions.

Related issues: #411, #412, #414, #416, #417, #418, #419, #420, #421, #422, #423, #435.

Copy link

github-actions bot commented Mar 28, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@akiramenai
Copy link
Collaborator

akiramenai commented Mar 29, 2024

It looks like there are plenty of functional and non-functional changes combined in this PR, so it's hard to review. Can we divide the functionality to meaningful commits and review them one by one? I mean divide to less than 114 commits; form a commit structure as we actually plan to land these changes.
Also please clang format and pass the linter.

@asl
Copy link
Collaborator

asl commented Mar 29, 2024

It looks like there are plenty of functional and non-functional changes combined in this PR, so it's hard to review. Can we divide the functionality to meaningful commits and review them one by one? I mean divide to less than 114 commits; form a commit structure as we actually plan to land these changes.

The commits itself are quite well-divided and represent fine-grained incremental set of changes, I would say there are almost no non-functional changes. However, we had to restructure the instructions definitions entirely as in the original form they were unsuitable for instruction encoding and had some inherent issues. Certainly, some commits could be combined, for example:

  • Support parsing name.s for commutative instructions (such as add.s)
  • Support parsing "name.s reg, reg, reg" for non-commutative instructions
  • Parse context. instructions
  • ...

Let me know if you'd want us to do this.

@akiramenai
Copy link
Collaborator

akiramenai commented Mar 29, 2024

@asl it sounds like a reasonable change. I think we should go for it.
Could we also make sure no diff clang-format, clang-tidy produce no warnings, and check-llvm pass for all commits?

@asl
Copy link
Collaborator

asl commented Mar 29, 2024

Could we also make sure no diff clang-format, clang-tidy produce no warnings, and check-llvm pass for all commits?

We are working on silencing linter warnings.

Copy link

github-actions bot commented Mar 29, 2024

Lit Tests Results

     3 files  ± 0       4 suites  ±0   6m 45s ⏱️ +9s
49 145 tests +24  17 537 ✅ +24  31 608 💤 ±0  0 ❌ ±0 
69 628 runs  +24  17 972 ✅ +24  51 656 💤 ±0  0 ❌ ±0 

Results for commit f0d5f46. ± Comparison against base commit 8677f09.

This pull request removes 1 and adds 25 tests. Note that renamed tests count towards both.
LLVM-Unit.Target/EraVM/_/EraVMTests/GetPseudoMapOpcode ‑ TheTest
LLVM.MC/EraVM/asm-parser ‑ arith-commutative-2outs.s
LLVM.MC/EraVM/asm-parser ‑ arith-commutative.s
LLVM.MC/EraVM/asm-parser ‑ arith-instructions-errors.s
LLVM.MC/EraVM/asm-parser ‑ arith-modifiers.s
LLVM.MC/EraVM/asm-parser ‑ arith-non-commutative-2outs.s
LLVM.MC/EraVM/asm-parser ‑ arith-non-commutative.s
LLVM.MC/EraVM/asm-parser ‑ arith-operands-errors.s
LLVM.MC/EraVM/asm-parser ‑ arith-operands.s
LLVM.MC/EraVM/asm-parser ‑ data-errors.s
LLVM.MC/EraVM/asm-parser ‑ data.s
…

♻️ This comment has been updated with latest results.

@asl
Copy link
Collaborator

asl commented Mar 31, 2024

@hedgar2017 @akiramenai @sayon

Looks like the integration tests fail because they use standalone assembler that cannot handle the assembler syntax described in the spec. How would you want to address this?

@asl asl force-pushed the eravm-mc/asm-parser branch 2 times, most recently from 207abbe to 6e0f045 Compare March 31, 2024 10:16
@akiramenai
Copy link
Collaborator

akiramenai commented Mar 31, 2024

@asl
It looks like we miss a register as the first argument of ret.to_label if to satisfy our assembler. It should be easy to fix.
But in general, if I remember it right, we agreed to stick with the current syntax and made the changes requested by @hedgar2017, @sayon in separate patches. Which looks like a sane approach. So, with that in mind, passing the integration tests shouldn't be that hard, and the sooner we integrate, the better. We become confident in our asm printer this way.

@asl
Copy link
Collaborator

asl commented Mar 31, 2024

@akiramenai Yes, the renaming is a part of https://github.com/matter-labs/compiler-llvm/issues/441

However, it looks like there are 3 different sources of information that do no correspond to each other: backend, assembler and the ISA spec. There were some obsolete instructions, some obsolete syntax, some undocumented syntax, etc. So, in such cases we had to resort to the existing spec as an ultimate source of information. Some changes (e.g. removal of obsolete stuff) is certainly a part of initial code drop as it does not make any sense to make them supported only to remove them immediately afterwards.

In this particular case, it seems that standalone assembler does not support revert instruction alias as specified in ISA spec. We will add remained aliases to please the assembler, certainly :)

@akiramenai
Copy link
Collaborator

@asl
I understand that it's our fault. We haven't tested the spec against the assembler. So thank you for finding the discrepancy. We can raise an asm issue there and fix it. The fix could likely be landed on Monday. If you can do an alias, that could unblock you faster.
I still think it worth to integrate earlier. When we pass system contract compilation, the CI won't stop after the first failure.

@akiramenai
Copy link
Collaborator

@asl I asked @sayon, [cc @shamatar]
But I believe the spec is wrong here. It doesn't mention the argument for ret, revert but to the best of my memory we pass the pointer to the returning slice, which makes sense if it's a far return or revert. But as we don't distinguish near and far in case of returns and reverts, we always pass an argument.
N.B. @sayon is OOO today.

@hedgar2017
Copy link
Collaborator

hedgar2017 commented Apr 1, 2024

@akiramenai @asl @sayon I'd kindly ask you guys to remove all aliases as soon as possible (e.g. the revert mentioned above). If it's just for the initial phase to reach conformance, that's fine.
It would be great to start over without legacy.

@asl
Copy link
Collaborator

asl commented Apr 1, 2024

@akiramenai @asl @sayon I'd kindly ask you guys to remove all aliases as soon as possible (e.g. the revert mentioned above). If it's just for the initial phase to reach conformance, that's fine.

Absolutely. However, here we are having chicken-and-egg problem: existing assembler supports only some aliases, but not others (e.g. ret.revert but not revert) or always require register for ret.ok regardless what is in the spec, even for default one, also it seems that it supports only some subset of specified assembler syntax (as it comes with stack operands, for example). So we'd need to add everything and then remove legacy :)

However, we structured the changes in such way so it will be more or less straightforward.

@atrosinenko
Copy link
Collaborator Author

I pushed a few commits that (temporary?) rollback potentially breaking changes to syntax. Let's see if tests pass.

@sayon
Copy link

sayon commented Apr 1, 2024

I have pushed more updates to the the spec available online; now the returns are properly categorized:

  • far return/revert and near return/revert are merged and became ret reg and rev reg. Register argument is ignored if we return from near calls.
  • returns/reverts/panics to label are marked "near" because far returns ignore labels
  • pnc (panic) does not accept arguments because panic ignores register argument

This type specifies the abstract syntax of the programmer-facing assembler instructions:
https://matter-labs.github.io/eravm-spec/spec.html#asm_instruction

Copy link

github-actions bot commented Apr 1, 2024

Benchmark results:

╔═╡ Size (-%) ╞════════════════╡ All M3B3 ╞═╗
║ Mean                                0.000 ║
║ Best                                0.976 ║
║ Worst                              -0.738 ║
║ Total                               0.004 ║
╠═╡ Cycles (-%) ╞══════════════╡ All M3B3 ╞═╣
║ Mean                                0.000 ║
║ Best                                3.803 ║
║ Worst                              -1.587 ║
║ Total                               0.001 ║
╠═╡ Ergs (-%) ╞════════════════╡ All M3B3 ╞═╣
║ Mean                                0.000 ║
║ Best                                3.034 ║
║ Worst                              -1.210 ║
║ Total                               0.000 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞════════════════╡ All MzB3 ╞═╗
║ Mean                                0.000 ║
║ Best                                1.942 ║
║ Worst                              -1.980 ║
║ Total                               0.003 ║
╠═╡ Cycles (-%) ╞══════════════╡ All MzB3 ╞═╣
║ Mean                                0.000 ║
║ Best                               12.687 ║
║ Worst                             -12.249 ║
║ Total                              -0.000 ║
╠═╡ Ergs (-%) ╞════════════════╡ All MzB3 ╞═╣
║ Mean                                0.000 ║
║ Best                                7.480 ║
║ Worst                              -5.696 ║
║ Total                              -0.000 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞════════╡ Precompiles M3B3 ╞═╗
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Cycles (-%) ╞══════╡ Precompiles M3B3 ╞═╣
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Ergs (-%) ╞════════╡ Precompiles M3B3 ╞═╣
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞════════╡ Precompiles MzB3 ╞═╗
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Cycles (-%) ╞══════╡ Precompiles MzB3 ╞═╣
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Ergs (-%) ╞════════╡ Precompiles MzB3 ╞═╣
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞══════════╡ Real life M3B3 ╞═╗
║ Mean                                0.039 ║
║ Best                                0.354 ║
║ Worst                              -0.081 ║
║ Total                               0.061 ║
╠═╡ Cycles (-%) ╞════════╡ Real life M3B3 ╞═╣
║ Mean                                0.001 ║
║ Best                                0.320 ║
║ Worst                              -0.489 ║
║ Total                              -0.004 ║
╠═╡ Ergs (-%) ╞══════════╡ Real life M3B3 ╞═╣
║ Mean                               -0.000 ║
║ Best                                0.001 ║
║ Worst                              -0.097 ║
║ Total                               0.000 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞══════════╡ Real life MzB3 ╞═╗
║ Mean                                0.040 ║
║ Best                                0.494 ║
║ Worst                              -0.247 ║
║ Total                               0.057 ║
╠═╡ Cycles (-%) ╞════════╡ Real life MzB3 ╞═╣
║ Mean                               -0.094 ║
║ Best                                0.164 ║
║ Worst                              -3.704 ║
║ Total                              -0.024 ║
╠═╡ Ergs (-%) ╞══════════╡ Real life MzB3 ╞═╣
║ Mean                               -0.030 ║
║ Best                                0.001 ║
║ Worst                              -1.620 ║
║ Total                              -0.000 ║
╚═══════════════════════════════════════════╝

@asl
Copy link
Collaborator

asl commented Apr 1, 2024

@akiramenai What does "exec format error" error mean?

@hedgar2017
Copy link
Collaborator

@akiramenai @asl @sayon I'd kindly ask you guys to remove all aliases as soon as possible (e.g. the revert mentioned above). If it's just for the initial phase to reach conformance, that's fine.

Absolutely. However, here we are having chicken-and-egg problem: existing assembler supports only some aliases, but not others (e.g. ret.revert but not revert) or always require register for ret.ok regardless what is in the spec, even for default one, also it seems that it supports only some subset of specified assembler syntax (as it comes with stack operands, for example). So we'd need to add everything and then remove legacy :)

However, we structured the changes in such way so it will be more or less straightforward.

Understood!
@akiramenai do you think we need to compare both assemblers for binary equality, or would a simple migration be sufficient? Assembly is low-level enough for our testset to walk all the paths, so I don't need to implement another comparison mode on compiler-tester, especially as a one-time job.

@akiramenai
Copy link
Collaborator

@asl, I have never seen that before. But from the fact that only Viper programs are affected I infer that it's likely to be a CI issue. Could you rebase the work on ToT?
@hedgar2017 could you also have a look at the integration tests failures. You should know it better.

@asl
Copy link
Collaborator

asl commented Apr 2, 2024

@akiramenai do you think we need to compare both assemblers for binary equality, or would a simple migration be sufficient?

Here is an idea: we do have a set of assembler files (under test/MC/EraVM) to test the functionality. We can just run the existing assembler andllvm-mc and compare the outputs? Is there a quick way to dump the encoding from the existing assembler?

@akiramenai
Copy link
Collaborator

@hedgar2017 we can migrate I think. Binary comparison is faster though, could have less dependence, and could identify issues that aren't touched by input data.

@hedgar2017
Copy link
Collaborator

@asl, I have never seen that before. But from the fact that only Viper programs are affected I infer that it's likely to be a CI issue. Could you rebase the work on ToT? @hedgar2017 could you also have a look at the integration tests failures. You should know it better.

I can only see LIT failures on CI. Could you point me directly?

@hedgar2017
Copy link
Collaborator

@hedgar2017 we can migrate I think. Binary comparison is faster though, could have less dependence, and could identify issues that aren't touched by input data.

Binary comparison isn't faster than simply running the CI after migration. I'll see if I can do some hacking like dump all builds from compiler-tester.

@asl
Copy link
Collaborator

asl commented Apr 25, 2024

@akiramenai Looks like there is unresolved conversation that was attached to a commit that does not exist anymore. As a result, the PR cannot be merged :) For the record, the comment was resolved.

@akiramenai akiramenai merged commit 45867d2 into main Apr 25, 2024
31 checks passed
@akiramenai akiramenai deleted the eravm-mc/asm-parser branch April 25, 2024 05:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants