Due to the requirements of the conventional eBPF use-cases, the ISA has some specialized features and instructions (e.g., input buffers and associated instructions) and the sandboxed execution model is highly constrained. Much of this is related to the fact that eBPF programs often run in kernel mode (and access kernel-specific data structures), where a verifier attempts to ensure non-malicious behavior and to avoid kernel crashes.
Solana requirements overlap with, but are not identical to, eBPF requirements. Solana programs run in userspace and are governed by an instruction metering mechanism, which obviates many of the restrictions imposed by the original eBPF execution model. Similarly, the original ISA is minimal by design, but useful instructions can be (and already have been) added to increase execution efficiency in the context of Solana programs executing on modern hardware. To continue to meet the specific needs of the Solana blockchain, it is anticipated that the SBF ISA and execution model will continue to evolve significantly and diverge from the eBPF ISA.
While SBF divergence thus far has been relatively minor, each such change requires coordination with, and permission of, the upstream LLVM BPF backend maintainers. Moreover, not all such changes have been or will be useful or acceptable to the BPF community going forward. A new backend untethers us and gives us more flexibility to make aggressive Solana-specific changes and to make more efficient use of the underlying modern hardware. Finally, the existing BPF backend is in some ways non-idiomatic and is based on the legacy SelectionDAGISel framework.
Proposed Solution
This is a sketch and not a detailed design as such. As the implementation proceeds, items may evolve or be added. There is not necessarily a strict ordering and each item description may be somewhat abstract and imply a fair amount of underlying work. Items marked [WIP] are in progress (some partially working today, in preliminary form).
Implement a new baseline SBF LLVM backend (initially a clone of the BPF backend) in lib/Target/SBF (along with corresponding changes in ADT, Support, MC, Object, BinaryFormat for the new SBF architecture and EM_SBF ELF object). We will leverage the existing sbf-solana-solana target triple and related machinery to target the new back-end. Dmitri has previously made that machinery work across the BPF->SBF transition for the complicated web of dependencies (clang, Rust, Solana scripts, etc.). The new back-end should thus be a drop-in replacement.
Update clang to instantiate the new back-end for the sbf triple, and update related unit tests.
Update related utilities (e.g., llvm-readobj) to recognize the new binaries, and update related unit tests.
Update the solana_rbpf VM to recognize new binaries (i.e., e_machine). There is little to do here and relocation numbers will match one-to-one.
Update lld to recognize new binaries.
Update the Rust compiler to instantiate the SBF back-end for sbf target.
Update lldb debugger to recognize new binaries.
Request a new e_machine value (EM_SBF) from the ELF gABI committee. Update: we have received an official e_machine value (263) from the committee. We are now registered in the official gABI specification. The committee will update ELF headers in the GNU binutils project on our behalf (the de-facto/official repository of ELF headers). We have added it locally to LLVM in this task (to appear once upstreamed).
Remove unnecessary BPF intrinsics (some have already been disabled for Solana, such as llvm.bpf.load.*). This list is TBD, but most are likely not needed.
Remove any unnecessary BPF instructions (some have already been disabled for Solana, such as ldind* and ldabs*).
Remove unnecessary LLVM passes and code. I estimate that roughly 30% of the BPF backend code is unnecessary for SBF. For example, all BTF related code and mentions (SBF uses standard DWARF) can be removed. Likewise, kernel-specific code such as the BPFAbstractMemberAccess pass that is used to rewrite references to kernel data structures will be removed. Others TBD.
Update the entire backend to use the modern GlobalISel framework. This will be done incrementally, but the eventual goal is complete removal of any legacy SelectionDAGISel code.
Rework the textual assembly syntax to match the more conventional rbpf-style syntax (https://github.com/solana-labs/rbpf/blob/main/src/disassembler.rs) as opposed to the C-style pseudocode syntax currently used. Update any related AsmParser and Disassembler lit tests.
Add substantially more lit unit test coverage for all involved components, especially the back-end. The existing BPF test coverage is somewhat sparse. This will be ongoing work, a sizable portion of which will be done naturally during the GlobalISel and syntax work.
Generally modernize, simplify, and clean up the implementation and ensure adherence to all LLVM coding standards and idioms. Use the latest C++11/14/17 features where they simplify the code and/or enhance reliability/readability/maintainability.
Support previously deployed eBPF binaries (tentative). The main use-case here is dumping/disassembly of existing binaries (e.g., llvm-objdump, llvm-readelf). The envisioned approach is to write a simple translator module that makes minor rewrites from the older binaries to the new format (e.g., EM_BPF -> EM_SBF) so the new SBF tools just work. Alternatively, the target-lookup mechanism can detect older binaries and instantiate the old BPF backend TM instead of the SBF backend TM. The latter is less desirable and more intrusive. Any newly deployed code would use the new backend and tools. Of course, if we indefinitely maintain the existing Solana-modified BPF backend, those tools will continue to work with existing binaries. Ultimately we want to rely only on the new backend. Mandating redeployment is yet another option (which may already be required for ABIv2).
Initially maintain the backend downstream while we mature the implementation, the ISA definition, the ISA documentation, and ELF psABI documentation.
As soon as practical and sensible, begin the process of upstreaming the backend to the LLVM project.
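As a rough illustration of the e_machine items above (recognizing new binaries, and the tentative EM_BPF -> EM_SBF translator), here is a minimal Python sketch. It assumes only the standard ELF header layout (e_machine is a 16-bit field at byte offset 18, with byte order given by e_ident[EI_DATA]), the long-standing EM_BPF value of 247, and the EM_SBF value of 263 reported above. The function names are hypothetical and are not taken from the LLVM, lld, or solana_rbpf sources; a real translator may need more rewrites than this one field.

```python
import struct

# EM_BPF is the existing eBPF machine value; EM_SBF is the value this issue
# reports receiving from the gABI committee.
EM_BPF = 247
EM_SBF = 263

ELF_MAGIC = b"\x7fELF"
E_MACHINE_OFFSET = 18  # after the 16-byte e_ident and the 2-byte e_type
EI_DATA = 5            # index in e_ident of the byte-order flag
ELFDATA2LSB = 1        # little-endian marker


def _half_format(image):
    """Return the struct format for a 16-bit ELF field in this image."""
    if len(image) < E_MACHINE_OFFSET + 2 or image[:4] != ELF_MAGIC:
        raise ValueError("not an ELF image")
    return "<H" if image[EI_DATA] == ELFDATA2LSB else ">H"


def read_e_machine(image):
    """Read the e_machine field from raw ELF bytes."""
    return struct.unpack_from(_half_format(image), image, E_MACHINE_OFFSET)[0]


def retag_as_sbf(image):
    """Hypothetical translator step: copy the image, rewriting EM_BPF to EM_SBF.

    Images that are not tagged EM_BPF are returned unchanged.
    """
    if read_e_machine(image) != EM_BPF:
        return image
    patched = bytearray(image)
    struct.pack_into(_half_format(image), patched, E_MACHINE_OFFSET, EM_SBF)
    return bytes(patched)
```

A tool could call read_e_machine first to decide which backend to use, or apply retag_as_sbf to an in-memory copy before handing an older binary to the SBF tools.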
The obvious alternative approach to the above (which we considered) is to write a new backend from scratch, and do it upstream from day one. But given that our blockchain is already live and the ISA has not diverged tremendously, it makes sense to start with the existing backend and change incrementally. Moreover, we are still learning where the bottlenecks are and what ISA changes will be needed. We can carry out the exploration incrementally instead of attempting to determine a priori the final (or next) form of the ISA.
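To make the assembly-syntax item in the task list above more concrete, the sketch below rewrites a few instructions from the C-style pseudocode syntax into the mnemonic style used by rbpf. It is an illustration only: it handles just three 64-bit register-to-register forms, the function name is hypothetical, and the final SBF syntax may differ in its details.

```python
import re

# Only 64-bit register-to-register forms are covered here; every other line
# is returned unchanged.
_MNEMONICS = {"=": "mov64", "+=": "add64", "-=": "sub64"}
_REG_REG = re.compile(r"^\s*(r\d+)\s*(\+=|-=|=)\s*(r\d+)\s*$")


def to_rbpf_style(line):
    """Rewrite e.g. 'r1 += r2' as 'add64 r1, r2' (hypothetical helper)."""
    match = _REG_REG.match(line)
    if match is None:
        return line
    dst, op, src = match.groups()
    return f"{_MNEMONICS[op]} {dst}, {src}"
```

For example, to_rbpf_style("r1 = r2") yields "mov64 r1, r2", while a line such as "exit" passes through untouched.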
Hello @StEvUgnIn, thanks for your interest. The short answer to your question is that the new SBF back-end is up and running and in production today. It is more-or-less in the state shown by the checked task boxes in the description above. That is, the main functional items are done and the back-end "works".
The tasks remaining (lower priority, but still desirable) are all the rework (move to GlobalISel) or clean-up items. Those tasks are basically on hold because we're also building a Move Language (to LLVM) compiler here at Solana-- and that has taken priority over some of the lower-priority back-end work. Since the Move Language compiler is still not yet functional, management has deemed that work the priority right now.
Given that the back-end has been up and running in production for quite some time now, and the less essential (but still desirable) enhancement items are back-burner/low-priority at the moment (i.e., not strictly necessary to be functional), I'm closing this issue for the time being. If and when any of the unfinished items are attended to, this may be reopened.
Problem
Solana needs a new SBF (Solana Bytecode Format) LLVM backend.
Currently the SBF ISA (https://bpf.wtf/sol-0x00-intro/) is an extension of the existing eBPF ISA (https://www.kernel.org/doc/html/latest/bpf/index.html, https://github.com/iovisor/bpf-docs/blob/master/eBPF.md). The primary use-cases of eBPF traditionally involve kernel-mode network packet analysis and similar tasks. More recently, the Solana blockchain adopted eBPF as its VM ISA.