Introduce a tier-1 JIT compiler based on aarch64 architecture #304

Merged 1 commit on Dec 25, 2023

@qwe661234 qwe661234 commented Dec 22, 2023

We follow the x64 template and API to implement the A64 tier-1 JIT compiler.

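  • Performance

| Metric    | rv32emu-T1C | qemu   |
|-----------+-------------+--------|
| aes       |       0.034 |  0.045 |
| puzzle    |      0.0115 | 0.0169 |
| pi        |       0.035 |  0.032 |
| dhrystone |       1.914 |  2.005 |
| Nqueens   |        3.87 |  2.898 |
| qsort-O2  |       7.819 | 11.614 |
| miniz-O2  |       7.604 |  3.803 |
| primes-O2 |      10.551 |  5.986 |
| sha512-O2 |       6.497 |  2.853 |
| stream    |       52.25 | 45.776 |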

As demonstrated in the memory usage analysis below, the tier-1 JIT compiler utilizes less memory than QEMU across all benchmarks.

  • Memory usage

| Metric    | rv32emu-T1C | qemu      |
|-----------+-------------+-----------|
| aes       |     183,212 | 1,265,962 |
| puzzle    |     145,239 |   891,357 |
| pi        |     144,739 |   872,525 |
| dhrystone |     146,282 |   853,256 |
| Nqueens   |     146,696 |   854,174 |
| qsort-O2  |     146,907 |   856,721 |
| miniz-O2  |     157,475 |   999,897 |
| primes-O2 |     142,356 |   851,661 |
| sha512-O2 |     145,369 |   901,136 |
| stream    |     157,975 |   955,809 |
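
The "less memory across all benchmarks" claim can be checked directly from the figures in the memory usage table. The following is a minimal sketch of that arithmetic; the struct and function names are made up for illustration and are not part of rv32emu, and the unit of the figures is not stated in this PR, so only the ratio is computed.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the memory usage table (unit not stated in the PR). */
struct mem_row {
    const char *name;
    long rv32emu; /* rv32emu-T1C column */
    long qemu;    /* qemu column */
};

/* Figures copied from the memory usage table. */
static const struct mem_row mem_rows[] = {
    {"aes", 183212, 1265962},      {"puzzle", 145239, 891357},
    {"pi", 144739, 872525},        {"dhrystone", 146282, 853256},
    {"Nqueens", 146696, 854174},   {"qsort-O2", 146907, 856721},
    {"miniz-O2", 157475, 999897},  {"primes-O2", 142356, 851661},
    {"sha512-O2", 145369, 901136}, {"stream", 157975, 955809},
};

#define MEM_ROW_COUNT (sizeof(mem_rows) / sizeof(mem_rows[0]))

/* Returns 1 if the rv32emu-T1C figure is lower than qemu's in every row. */
static int rv32emu_always_lower(void)
{
    for (size_t i = 0; i < MEM_ROW_COUNT; i++) {
        if (mem_rows[i].rv32emu >= mem_rows[i].qemu)
            return 0;
    }
    return 1;
}

/* Smallest qemu/rv32emu ratio over all rows. */
static double min_ratio(void)
{
    double best = (double) mem_rows[0].qemu / (double) mem_rows[0].rv32emu;
    for (size_t i = 1; i < MEM_ROW_COUNT; i++) {
        double r = (double) mem_rows[i].qemu / (double) mem_rows[i].rv32emu;
        if (r < best)
            best = r;
    }
    return best;
}

/* Largest qemu/rv32emu ratio over all rows. */
static double max_ratio(void)
{
    double best = (double) mem_rows[0].qemu / (double) mem_rows[0].rv32emu;
    for (size_t i = 1; i < MEM_ROW_COUNT; i++) {
        double r = (double) mem_rows[i].qemu / (double) mem_rows[i].rv32emu;
        if (r > best)
            best = r;
    }
    return best;
}
```

With these figures, `rv32emu_always_lower()` holds for every row, and the qemu figure is roughly 5.8 to 6.9 times the rv32emu-T1C figure.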

Related: #238
Close: #296

jserv commented Dec 23, 2023

Consider the following changes for Apple Silicon:

diff --git a/Makefile b/Makefile
index f8fe8e7..3bc5cc3 100644
--- a/Makefile
+++ b/Makefile
@@ -123,7 +123,7 @@ $(call set-feature, JIT)
 ifeq ($(call has, JIT), 1)
 OBJS_EXT += jit.o 
 ifneq ($(processor), x86_64)
-ifneq ($(processor), aarch64)
+ifneq ($(processor), arm64)
 $(error JIT mode only supports for x64 and arm64 target currently.)
 endif
 endif
diff --git a/src/jit.c b/src/jit.c
index 403f874..89d0fdb 100644
--- a/src/jit.c
+++ b/src/jit.c
@@ -36,6 +36,10 @@
 #include "state.h"
 #include "utils.h"
 
+#if defined(__APPLE__) && defined(__aarch64__)
+#include <libkern/OSCacheControl.h>
+#include <pthread.h>
+#endif
 
 #define JIT_CLS_MASK 0x07
 #define JIT_ALU_OP_MASK 0xf0
@@ -286,9 +290,10 @@ static inline void offset_map_insert(struct jit_state *state, int32_t target_pc)
     map_entry->offset = state->offset;
 }
 
-
+#if !defined(__APPLE__)
 #define sys_icache_invalidate(addr, size) \
     __builtin___clear_cache((char *) (addr), (char *) (addr) + (size));
+#endif
 
 static inline void emit_bytes(struct jit_state *state, void *data, uint32_t len)
 {
@@ -1505,13 +1510,17 @@ struct jit_state *init_state(size_t size)
     struct jit_state *state = malloc(sizeof(struct jit_state));
     state->offset = 0;
     state->size = size;
+#if defined(__APPLE__) && defined(__aarch64__)
+    assert(pthread_jit_write_protect_supported_np());
+    pthread_jit_write_protect_np(0);
+#endif
+#if defined(__APPLE__) && defined(__aarch64__)
+    state->buf = mmap(NULL, size, PROT_EXEC | PROT_WRITE | PROT_READ,
+                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_JIT, -1, 0);
+#else
     state->buf = mmap(0, size, PROT_READ | PROT_WRITE | PROT_EXEC,
-                      MAP_PRIVATE | MAP_ANONYMOUS
-#if defined(__APPLE__)
-                          | MAP_JIT
+                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 #endif
-                      ,
-                      -1, 0);
     assert(state->buf != MAP_FAILED);
     prepare_translate(state);
     state->offset_map = calloc(MAX_INSNS, sizeof(struct offset_map));

build/hello.elf is known to work.
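
For context, the calls in the diff above (`MAP_JIT`, `pthread_jit_write_protect_np`, `sys_icache_invalidate`) fit the general pattern for JIT memory on macOS with Apple Silicon: map the buffer with `MAP_JIT`, make it writable for the current thread, emit code, switch it back to executable, then invalidate the instruction cache. The sketch below illustrates that pattern in isolation. It is not rv32emu's code: `run_tiny_jit` and `tiny_code` are hypothetical names, the emitted function simply returns 42, and only x86-64 and AArch64 hosts are assumed. The diff shows only the calls made in `init_state`; where rv32emu toggles protection back is not shown in this conversation.

```c
#define _DEFAULT_SOURCE 1 /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#if defined(__APPLE__) && defined(__aarch64__)
#include <libkern/OSCacheControl.h>
#include <pthread.h>
#endif

/* Host machine code for "return 42" (illustrative only). */
#if defined(__x86_64__)
static const uint8_t tiny_code[] = {0xb8, 0x2a, 0x00, 0x00, 0x00, /* mov eax, 42 */
                                    0xc3};                        /* ret */
#define TINY_JIT_SUPPORTED 1
#elif defined(__aarch64__)
static const uint32_t tiny_code[] = {0x52800540,  /* mov w0, #42 */
                                     0xd65f03c0}; /* ret */
#define TINY_JIT_SUPPORTED 1
#else
#define TINY_JIT_SUPPORTED 0
#endif

/* Hypothetical helper: emit the tiny function into a fresh JIT buffer,
 * call it once, and return its result, or -1 on failure. */
static int run_tiny_jit(void)
{
#if TINY_JIT_SUPPORTED
    const size_t size = 4096;
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
#if defined(__APPLE__) && defined(__aarch64__)
    flags |= MAP_JIT; /* required for writable+executable pages */
#endif
    void *buf =
        mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_EXEC, flags, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

#if defined(__APPLE__) && defined(__aarch64__)
    pthread_jit_write_protect_np(0); /* writable for this thread */
#endif
    memcpy(buf, tiny_code, sizeof(tiny_code));
#if defined(__APPLE__) && defined(__aarch64__)
    pthread_jit_write_protect_np(1); /* executable again */
    sys_icache_invalidate(buf, sizeof(tiny_code));
#else
    __builtin___clear_cache((char *) buf, (char *) buf + sizeof(tiny_code));
#endif

    int (*fn)(void) = (int (*)(void)) buf;
    int result = fn();
    munmap(buf, size);
    return result;
#else
    return -1; /* unsupported host in this sketch */
#endif
}
```

Calling `run_tiny_jit()` on a supported host should return 42. On hosts other than Apple Silicon the sketch falls back to `__builtin___clear_cache`, mirroring the `sys_icache_invalidate` macro in the diff.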

@qwe661234 qwe661234 force-pushed the arm64_t1_jit branch 2 times, most recently from 648283d to 79965eb Compare December 23, 2023 07:52

qwe661234 commented Dec 23, 2023

> Consider the following changes for Apple Silicon:

It passes all tests now, and we gain about a 4x performance improvement on macOS with Apple Silicon.

@jserv jserv left a comment

Update the CI pipeline for the JIT/Arm64-enabled build.

@qwe661234 qwe661234 force-pushed the arm64_t1_jit branch 3 times, most recently from 3c9074a to 1c30346 Compare December 25, 2023 05:26
@jserv jserv merged commit 48ed780 into sysprog21:master Dec 25, 2023
6 checks passed
Successfully merging this pull request may close these issues.

jit: code generation tool should be aware of comments