[C++][Gandiva] Migration JIT engine from MCJIT to LLJIT #37848

Closed
niyue opened this issue Sep 24, 2023 · 2 comments · Fixed by #39098

Comments

Contributor

niyue commented Sep 24, 2023

Description

Gandiva currently employs MCJIT as its internal JIT engine. However, LLVM has introduced a newer JIT API known as ORC v2/LLJIT [1], which presents several advantages over MCJIT:

  • Active Maintenance: ORC v2 is under active development and maintenance by LLVM developers. In contrast, MCJIT is not receiving active updates and, based on indications from LLVM developers, is slated for eventual deprecation and removal.
  • Modularity and Organization: ORC v2 has a more organized and modular structure, so users can combine JIT components as needed.
  • Thread-Local Variable Support: ORC v2 natively supports thread-local variables.
  • Enhanced Resource Management: Compared to MCJIT, ORC v2 manages resources at a finer granularity, which helps control memory usage and code compilation.

In my project, I've experimented with this migration and got it to work in a prototype. However, transitioning Gandiva to this new API is a substantial undertaking. I'm keen to gauge the community's interest in migrating to this new JIT engine API and would greatly appreciate any feedback or insights. Thank you.

Proposal

  • There won't be any user-facing Gandiva API change: the Projector and Filter APIs remain the same.
  • There will be some API changes to the LLVMGenerator and Engine classes:
    • The constructors of both LLVMGenerator and Engine are expected to take an additional, optional GandivaObjectCache reference, because LLJIT requires the object cache to be set up when the LLJIT instance is initialized.
    • The Engine class implementation will change substantially, since it currently interfaces with MCJIT directly; the MCJIT-related APIs will be replaced with their LLJIT counterparts.
  • There may be a minor change to the Configuration class: a new configuration option called needs_ir_dumping is expected, because LLJIT doesn't allow retrieving the IR from a module at an arbitrary time. Gandiva currently has an API called DumpIR that allows dumping IR at any time, so this new option indicates that IR dumping is needed and lets us store the IR up front for later dumping.
  • Performance is expected to be roughly the same (according to feedback I got from LLVM developers on the LLVM Discord).
  • The LLJIT API has been available since LLVM 7.0 [1][2][3], so in theory we could support LLVM >= 7.0 after the migration. However, I am not sure whether all the APIs used for the migration support LLVM >= 7.0 across all platforms, so the migration may end up requiring a higher LLVM version (testing is needed).
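
To make the needs_ir_dumping point concrete, here is a minimal sketch of the "store the IR up front" idea. It uses only the C++ standard library and does not touch LLVM or Gandiva: `SketchConfiguration`, `SketchModule`, and `SketchEngine` are hypothetical stand-ins invented for illustration, and the real Engine class may be organized quite differently.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for Gandiva's Configuration; only the flag matters here.
struct SketchConfiguration {
  bool needs_ir_dumping = false;
};

// Hypothetical stand-in for an LLVM module: it simply owns some IR text.
struct SketchModule {
  std::string ir;
};

// Hypothetical engine that gives up its module when compilation is finalized,
// so the IR can no longer be read from the module afterwards.
class SketchEngine {
 public:
  SketchEngine(SketchConfiguration config, std::unique_ptr<SketchModule> module)
      : config_(config), module_(std::move(module)) {}

  void FinalizeModule() {
    assert(module_ != nullptr);  // finalizing twice is a usage error
    // Copy the IR before ownership of the module moves to the JIT.
    if (config_.needs_ir_dumping) {
      saved_ir_ = module_->ir;
    }
    HandOverToJit(std::move(module_));
  }

  // Returns the IR saved at finalization, or an empty string if the option was off.
  const std::string& DumpIR() const { return saved_ir_; }

 private:
  static void HandOverToJit(std::unique_ptr<SketchModule> module) {
    // A real engine would add the module to the JIT here; the sketch drops it.
    module.reset();
  }

  SketchConfiguration config_;
  std::unique_ptr<SketchModule> module_;
  std::string saved_ir_;
};
```

With the flag set, `DumpIR()` still returns the text after `FinalizeModule()` has handed the module away; with the flag unset, nothing is copied and `DumpIR()` returns an empty string, which is the trade-off the option is meant to expose.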

References

[1] https://llvm.org/docs/ORCv2.html
[2] https://github.com/llvm/llvm-project/commits/c4e764ea24eb02b6ec34038061cee8ff94c0f34c/llvm/include/llvm/ExecutionEngine/Orc/LLJIT.h?after=c4e764ea24eb02b6ec34038061cee8ff94c0f34c+34
[3] LLVM release dates, https://releases.llvm.org

Component(s)

C++ - Gandiva

@kou kou changed the title [C++] [Gandiva] Migration JIT engine from MCJIT to LLJIT [C++][Gandiva] Migration JIT engine from MCJIT to LLJIT Sep 25, 2023
Member

kou commented Sep 25, 2023

Thanks for opening this issue.
Could you also send an e-mail to dev@arrow.apache.org to get more attention?
https://arrow.apache.org/community/#mailing-lists

Contributor Author

niyue commented Dec 5, 2023

Here is the discussion thread:

https://lists.apache.org/thread/fphzvtr1jrc069z7kv78oopgr4zrjfgl

UPDATE:
So far (Dec. 6, 11:30, 24 hours after the discussion thread), I haven't received any feedback, but I'll go ahead and give it a try to see how things turn out.

@kou kou closed this as completed in #39098 Jan 4, 2024
kou pushed a commit that referenced this issue Jan 4, 2024
…/LLJIT (#39098)

### Rationale for this change
Gandiva currently employs MCJIT as its internal JIT engine. However, LLVM has offered a newer JIT API known as ORC v2/LLJIT since LLVM 7.0, and it has several advantages over MCJIT. In particular, MCJIT is not actively maintained and is slated for eventual deprecation and removal.

### What changes are included in this PR?
* This PR replaces the MCJIT JIT engine with the ORC v2 engine, using the `LLJIT` API.
* This PR adds a new JIT linker option `JITLink` (https://llvm.org/docs/JITLink.html), which can be used together with `LLJIT`, for LLVM 14+ on Linux/macOS platform. It is turned off by default but could be turned on with environment variable `GANDIVA_USE_JIT_LINK`
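
As an illustration of an opt-in switch like `GANDIVA_USE_JIT_LINK`, the following is a minimal sketch of reading an environment variable to pick a linker. It is not Gandiva's actual code: `SketchLinker`, `ChooseLinker`, and `ChooseLinkerFromEnvironment` are hypothetical names, and the rule for which values count as "on" is an assumption of this sketch.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical enum: the default linker versus the opt-in JITLink.
enum class SketchLinker { kDefault, kJitLink };

// Pure helper that decides from the raw variable value (nullptr when unset).
// Assumption of this sketch: any non-empty value other than "0" turns JITLink on.
inline SketchLinker ChooseLinker(const char* env_value) {
  if (env_value == nullptr) return SketchLinker::kDefault;
  const std::string value(env_value);
  if (value.empty() || value == "0") return SketchLinker::kDefault;
  return SketchLinker::kJitLink;
}

// Reads the variable named in the PR description from the process environment.
inline SketchLinker ChooseLinkerFromEnvironment() {
  return ChooseLinker(std::getenv("GANDIVA_USE_JIT_LINK"));
}
```

Keeping the decision in a pure function separate from the `std::getenv` call makes the off-by-default behavior easy to unit test without mutating the process environment.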

### Are these changes tested?
Yes, they are covered by existing unit tests

### Are there any user-facing changes?
* `Configuration` class has a new option called `dump_ir`. If users would like to call `DumpIR` API of `Projector` and `Filter`, they have to set the `dump_ir` option first.
* Closes: #37848

Authored-by: Yue Ni <niyue.com@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
@kou kou added this to the 15.0.0 milestone Jan 4, 2024
clayburn pushed a commit to clayburn/arrow that referenced this issue Jan 23, 2024
dgreiss pushed a commit to dgreiss/arrow that referenced this issue Feb 19, 2024
zanmato1984 pushed a commit to zanmato1984/arrow that referenced this issue Feb 28, 2024
lriggs added a commit to lriggs/arrow that referenced this issue Mar 20, 2024
ryantse added a commit to dremio/arrow that referenced this issue Mar 30, 2024
lriggs added a commit to lriggs/arrow that referenced this issue Apr 5, 2024
lriggs added a commit to lriggs/arrow that referenced this issue May 10, 2024
stevelorddremio pushed a commit to stevelorddremio/arrow that referenced this issue Jun 14, 2024