More, more competitors (lightweightedness is questionable of course) #2
Hi, it's great that this project is trying to create a compact JIT.
My only suggestion is: please avoid global state. Having global state makes it impossible to use the library easily.
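To illustrate the suggestion, here is a minimal sketch of a context-based API. The names (`jit_context_t`, `jit_init`, etc.) are hypothetical, not the actual MIR API: the point is that all library state lives in an object the caller creates and threads through every call, so two JIT instances can coexist in one process.

```c
#include <stdlib.h>

/* Hypothetical sketch (NOT the real MIR API): all library state lives
   in a context object instead of globals.  The caller creates it,
   passes it to every call, and destroys it when done. */
typedef struct jit_context {
  size_t code_size; /* bytes of machine code emitted so far */
  int opt_level;    /* per-instance optimization level */
} jit_context_t;

jit_context_t *jit_init (int opt_level) {
  jit_context_t *ctx = calloc (1, sizeof (*ctx));
  if (ctx != NULL) ctx->opt_level = opt_level;
  return ctx;
}

void jit_emit (jit_context_t *ctx, size_t nbytes) { ctx->code_size += nbytes; }

void jit_finish (jit_context_t *ctx) { free (ctx); }
```

With this shape, a host application can run one JIT instance per thread without any of them observing the others' state.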
Thank you for pointing this out. I have known the nanojit project for a long time; I even did some benchmarking of it. It does not fit my goals (a lightweight JIT compiler for CRuby, which recently got a GCC/LLVM-based JIT). As I wrote, I need at least 70% of the performance of code generated by GCC with -O2. I've just repeated the sieve benchmark with dmr_c's nanojit backend. The generated code is about 3.5 times slower than code generated with GCC -O2. It is even 15% slower than the code generated by GCC with -O0. That is also why I did not include the GNU lightning project. Unlike my project, nanojit is a solid project supporting a few architectures, and you are probably right that I should add it to the other JIT candidates in README.md. I am going to do this when I have spare time. As for the C compiler: dmr_c is based on Sparse, while my C compiler is not based on any existing project like tcc, 8cc, 9cc, etc. It is written completely from scratch. Big work is still needed to finish it, but I can say that its code will be at least 5 times smaller than Sparse's. Btw, currently I am working on an LLVM IR to MIR translator. I guess the initial version will be published in September or October.
Thank you for your kind words.
Yes, thank you. I will keep it in mind. The current project is not suitable for anything right now. I am focused on trying it first as a JIT for MRuby without parallel compilation. For CRuby I will need to make it suitable for multi-threaded code, because the CRuby JIT engine requires compiling code in parallel with bytecode interpretation and other compilations. The MIR project will be developed in parallel with its usage as a JIT for MRuby/CRuby. And it will only be ready when the MRuby and/or CRuby JITs are ready and proven to reach a specific level of performance.
NanoJIT was designed to be a trace compiler, and that is how it is used in Flash. It is fine for a sequence of non-branching code, but if there are branches then the register allocation cannot cope with it. I'd be interested in the program you used for testing; I can try that out myself.
Do the benefits outweigh the cost of maintaining your own? 8cc is very small too. Regards
It can be hard to remove global state later on as the code becomes larger.
I too started the various JIT projects because I wanted a JIT backend for Lua. Here's my experience, for what it's worth:
I think you may want to see whether the LLVM or GCC backends for Ruby are any good. As far as I know, they are not. In Lua I can get a 20x improvement with my backend, provided type annotations are used. If you can get a 2x improvement in Ruby you will be lucky, I think! Regards
Btw I see that you have a lot of experience writing compilers (GCC), so perhaps you can crack this! I certainly hope so, because a nice compact JIT written in C that can generate optimized code would be fantastic. In my opinion though, a new small high-performance scripting language competing with LuaJIT would be better than trying to speed up Ruby!
There is already LIBGCCJIT, written by my colleague David Malcolm. Unfortunately, it is hard to implement inlining with it because inlining is done too early. As inlining is the most important optimization for a JIT, libgccjit did not fit my goals. I used another approach, based on C-code generation and precompiled headers, to implement MJIT in CRuby. This approach has practically the same compilation speed as LIBGCCJIT and permits implementing inlining, at least on the path Ruby code -> Ruby code.
Ruby is actively used in OpenShift/OpenStack, which is a strategic area for Red Hat. So that defines my language choice (although I'd like to design a JIT compiler which could be used for other languages too). But MRuby could satisfy your criteria. Mike Pall worked roughly 10 years on LuaJIT, did an amazing job, and achieved quite a lot. To compete with LuaJIT, I would probably need 10 years too.
Thanks for sharing your experience.
Yes, implementing specialization/deoptimization is not trivial work, but it is much less work than writing a good JIT compiler.
The OMR JIT is still too complex for my goals; besides, IBM has already implemented CRuby with the OMR JIT. The results are not so good. I suspect they did not implement specialization or compilation in parallel with Ruby execution. There is a comparison of different CRuby JIT implementations in https://developers.redhat.com/blog/2018/03/22/ruby-3x3-performance-goal/
Most Ruby programs are IO-bound, and a JIT cannot help there. But a JIT could extend Ruby usage into the CPU-bound program area. For some programs, the current CRuby JIT (MJIT) can improve code close to 3 times. The problem is that MJIT makes the most widely used Ruby application, Ruby on Rails, slower until a lot of methods are compiled. That is a reason to implement tiered compilation, and one reason for the MIR project. If you are interested, more details can be found at https://www.slideshare.net/VladimirMakarov13/the-lightweightjitcompilerprojectforc-ruby-141836482
I used dmr_c with nanojit for this function (the classic sieve benchmark with `#define SieveSize 8190`, the whole computation repeated in `for (iter = 0; iter < 100000; iter++)`). Most time is spent executing the function's code. On my computer nanojit uses 7.18 CPU sec, code generated by GCC -O2 uses 2.30s, and code generated by GCC -O0 takes 6.26s.
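For reference, here is a reconstruction of the classic BYTE-magazine sieve benchmark the timings above refer to. The exact source compiled with dmr_c was not posted in the thread, so the loop details below are assumptions based on the two fragments quoted (`SieveSize 8190`, 100000 iterations); the iteration count is made a parameter so the function is cheap to test.

```c
#define SieveSize 8190

/* Reconstruction of the classic BYTE sieve benchmark (assumed, not the
   exact source used with dmr_c/nanojit).  flags[i] represents the odd
   number 2*i + 3; the function returns the number of primes found. */
int sieve (int iterations) {
  static char flags[SieveSize + 1];
  int i, k, prime, count = 0, iter;

  for (iter = 0; iter < iterations; iter++) {
    count = 0;
    for (i = 0; i <= SieveSize; i++) flags[i] = 1;
    for (i = 0; i <= SieveSize; i++)
      if (flags[i]) {
        prime = i + i + 3; /* map flag index back to the odd number */
        for (k = i + prime; k <= SieveSize; k += prime) flags[k] = 0;
        count++;
      }
  }
  return count;
}
```

Calling `sieve (100000)` reproduces the benchmarked workload; the canonical result for size 8190 is 1899 primes per iteration.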
I used LIBGCCJIT in my Lua project; I found its compilation to be very slow.
I had a look at the article. Firstly I am guessing that you haven't implemented all the functionality of Ruby? For example, can it 100% interoperate with existing Ruby libraries and code? In my experience dynamic languages tend to have features that are very JIT unfriendly. For example, in Lua, a C library can manipulate the Lua stack. A JIT has to deal with such situations. I am not familiar with Ruby but I have read that it is very hard to optimize Ruby.
Well, not if you created a new language specifically designed to be JITed efficiently. Lua has many constructs that are bad for JITing. But a language could be designed that avoids such features and therefore allows efficient JITing. I am skeptical about the Ruby efforts. I guess until you have a 100% compatible Ruby implementation that achieves a 2 or 3x improvement, it is impossible to say anything. And that could take years to implement too. Using a slow compiler like GCC or LLVM is simply not an option in the Lua world because of LuaJIT's speed. That is why I find your project very interesting: its compact size and hopefully speed of compilation would be ideal for Lua. But then we don't know how well the generated code will behave. In Lua, the stack is a heap-allocated structure, and the optimizer needs to be able to figure out when values on the stack are temporary and do not need to be stored/accessed from the heap. LLVM and GCC can do this, but at a huge cost.
LLVM and GCC can be used mostly as tier 2 JIT compilers.
No, I did not implement all the functionality, but I was pretty close. The approach and the code were adopted by the Ruby community. Takashi Kokubun adapted the code for the original CRuby VM insns and implemented the full functionality. Now the GCC/LLVM-based JIT is part of CRuby.
It is the same for Ruby. Ruby is very dynamic, practically everything can change during execution.
As I wrote, the JIT is now part of the last 2 CRuby releases. The current JIT does not use register insns as I proposed and does not implement speculative code or inlining. Still, it achieves 2 times faster code on the most widely used Ruby benchmark, optcarrot.
The same problem exists in CRuby. I did this optimization on code generated from VM insns. Most Ruby local variables were translated into C function local vars, which were then translated into hardware registers by GCC/LLVM. If a speculation turned out wrong, code saving the C local variables into (heap/stack) memory was executed, and execution of the Ruby code continued in the interpreter. For Ruby global/object/class variables, more sophisticated (escape) analysis is needed.
I did some tests with your benchmark. Timings:
Test programs

I will add the results using the LLVM JIT when I get some time. Note on Ravi perf: I think the performance is degraded by the inner for loop, which has a variable increment (prime). Currently my backend optimizes the case when the increment is a known positive integer, but here it falls back to a generic for loop. I suspect that if I optimized this case, the resulting JIT code would perform close to dmr_c with the omrjit backend. So, I am interested in the Ruby results, with and without JIT.
I also did some measurements on my machine. Fortunately, the sieve can be compiled by c2mir. Here are the results:
So c2mir + MIR-generator achieves 65% of GCC -O2. According to your measurements, that is close to OMRJIT, which achieves 70%. But I should say I am doing stupid generation right now, because I am focused on just making c2mir work. As for the Ruby JIT, I have no time now to build and benchmark my old code. But you can find sieve data at https://github.com/vnmakarov/ruby/tree/rtl_mjit_branch#microbenchmark-results Basically, sieve is sped up 2 times by the Ruby JIT I worked on.
Hi, I had a look at those benchmarks. They are all relative figures, aren't they? So they don't give me a feel for how Ruby performs compared to the above. But not important ... I will check this out myself.
Yes, they are relative figures. I am sure Ruby, even with a JIT, will be much slower than Ravi, because it is a more dynamic language where everything can be changed during execution. Just a simple example: all integer arithmetic requires an overflow check. If there is an overflow, the value becomes a multi-precision value. You can redefine any operation, for example change integer + into integer - :). + can be defined for any values. There are also different representations of arrays and objects, etc. So there are a lot of checks even if you generate speculative code. Ruby has no type annotations. That might change in the future, as one goal of Ruby 3 is to have some type system.
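The overflow point can be made concrete with a small sketch of the guard a Ruby-like JIT must emit around every integer `+`. This is illustrative, not CRuby's actual code: the fast fixnum path runs, and on overflow the code must bail out (a real VM would deoptimize to the interpreter or promote the value to a multi-precision integer). `__builtin_add_overflow` is a GCC/Clang builtin.

```c
#include <stdint.h>

/* Illustrative only (not CRuby's implementation): every integer add in
   a dynamic language carries a guard.  On overflow we set *deopt and
   return 0; a real VM would switch to bignum arithmetic instead. */
static int64_t ruby_like_add (int64_t a, int64_t b, int *deopt) {
  int64_t r;
  if (__builtin_add_overflow (a, b, &r)) {
    *deopt = 1; /* slow path: leave JITed code, promote to bignum */
    return 0;
  }
  *deopt = 0; /* fast path: plain machine addition */
  return r;
}
```

Even in the common case where no overflow happens, the branch is present in the generated code, which is one reason fully dynamic integer arithmetic cannot match statically typed code.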
The integer stuff sounds horrible. I watched your talk about it. I guess they were trying to avoid creating an object like Python does. |
@vnmakarov What do you think about V8 TurboFan's CodeAssembler?
I never worked on or benchmarked TurboFan, and I have limited knowledge of it (mostly from articles and presentations). But here is my opinion of the project. TurboFan has different goals, and it is in a different category than MIR. It is not a lightweight JIT compiler project; it requires more resources. TurboFan tries to squeeze out as much performance as possible and has a longer optimization pipeline. It is a very mature project developed by very smart people, with a lot of experience behind it. The MIR goal is to be as simple as possible and still generate decent code. I guess TurboFan can generate 30-40% faster code than MIR. MIR will be simpler to port than TurboFan (although TurboFan already has the major target ports). TurboFan with CodeAssembler is closer to the Oracle Graal project than to MIR. They use the same IR (sea of nodes) for optimizations and have an interface for adding new languages. MIR is a more flexible and streamable IR. I hope to use it as an interface between different language processors in the future. To simplify the project, MIR is designed to simultaneously be an interface IR, an IR for optimization, and an IR for interpretation. There are other, less important design solutions to simplify the MIR project. Actually, LLVM IR could be used for the same purposes, but it is too complicated and a bit unstable (Chris Lattner's team at Google works on an extension which could make it even more complicated). LLVM IR is very bad for interpretation, as it has SSA phi nodes, and this makes its interpretation about 100 times slower than generated code (MIR interpretation is only 6-10 times slower than MIR-generated code). TurboFan and LLVM are written in C++. I don't like this; C++ usage can be very easily abused. For example, the SLOC counts for GCC and Clang/LLVM are pretty close, but the LLVM binary code for one target is about 3 times bigger. GCC was originally written in C, although it was moved to C++ a few years ago. Still, its code is mostly C.
@vnmakarov QBE seems closest to your goals; did you test it against MIR? |
Yes, I played with it. It is very interesting code written by a talented guy, Quentin Carbonneaux, as I understand while he was doing a PhD at Yale. QBE has a good set of optimizations, some of which are absent in the MIR-generator (like alias analysis and a simple loop analysis for better RA spill heuristics). It can be considered a mini-LLVM: it has a simplified version of LLVM IR. Why I decided not to take it and work on it:
When I played with QBE on sieve, I got the impression that its generated code has practically the same quality as the MIR-generator's (maybe even better), but its compilation speed was about 5 times slower (I used valgrind --tool=lackey). Although for QBE that included parsing the IR representation and outputting assembler code. With the assembler-to-binary transformation included, QBE compilation was about 30 times slower (yes, the assembler `as` is the bottleneck). I believe it would be easier for me to write what I need than to adapt QBE to my purposes.
That's not true; phi nodes are optional in LLVM. My C front end generates alloca only. LLVM IR is very well designed IMO: it is strongly typed with extensive type checks, which makes it hard to write wrong code. It looks like you haven't used LLVM ;-)
Have you seen https://github.com/michaelforney/cproc? It is a C11 front end to QBE.
He seems to be maintaining and enhancing it over a few years ... so not sure that one can assume this. |
I sure like that you made that choice, because MIR is something I have been looking for, for 4 years now. I also hope this is not going to fizzle out as so many projects do. I can't wait to try it out.
I've been using this for some time, in a different way than you. I am writing an LLVM IR to MIR translator. I can use the reg2mem pass to remove phi-nodes; in that case I have to implement a kind of LLVM mem2reg myself to generate code with registers. Or I can remove phi-nodes during the translation to MIR. Both approaches are inconvenient, so LLVM IR is not so good for this kind of work. You use alloca generation to avoid dealing with phi-nodes, and this is convenient for you because LLVM takes care of generating efficient code after that. So for your task, LLVM IR is good. At some point in the compiler pipeline, you need to get out of SSA. But LLVM IR cannot represent non-SSA code; LLVM IR with alloca and without phi-nodes is also SSA. This form of LLVM IR is verbose (a lot of loads and stores, which complicates the code and makes it big and less readable). So LLVM had to use a machine IR (another MIR) when SSA code could not adequately represent code at later points in the compiler pipeline. That is why I wrote that an SSA IR as an interface language is not a good idea. But besides SSA, I don't like LLVM IR, specifically the syntax of its textual representation. People have different tastes in languages. That is why I wrote that "personally" I don't like it.
What is the reason for the LLVM IR to MIR interface? It seems a large piece of work, and is it not better to complete the C front-end and MIR first? To be honest, I can't see why anyone would want to generate LLVM IR and then use MIR... Regards |
There are several reasons for this work:
I can't see why anyone would generate C++ code, and then use LLVM as well as MIR.
However then the experiment is not going to be valid, as you will be relying on LLVM doing all the optimization.
As above, makes the whole experiment pointless IMO.
I don't know about Ruby but in my case I want to get rid of LLVM. It is a 20MB beast attached to my 200k language. So if I can't replace LLVM with MIR it would be pointless. Maybe Ruby is different. Personally I still don't see a good reason for this work ... anyway I do wish you success with it! |
There is probably a misunderstanding here. I am not going to generate C or C++ code and then generate MIR from it. The C or C++ code I mentioned is already written by humans, for example the C code of the standard Ruby methods. Here is a slide illustrating how I am planning to implement a CRuby JIT with MIR: https://www.slideshare.net/VladimirMakarov13/the-lightweightjitcompilerprojectforc-ruby-141836482/31 Generating C or C++ code during JIT work and translating it into MIR would be a huge waste of compilation time and memory. The whole advantage of the MIR-generator would disappear.
In my example it is only for that one method.
That is what I am going to do too. LLVM (or my C compiler) to MIR would be used only while building CRuby, not while CRuby is running.
From README:
I know @dibyendumajumdar maintains (or whatever he does to it, dibyendumajumdar/nanojit#15) https://github.com/dibyendumajumdar/nanojit because he likes that it's lightweight. He's also dissatisfied with the bloatedness of Eclipse OMR up to a level of forking it: https://github.com/dibyendumajumdar/nj .
Oh, he also has a C compiler for those JITs: https://github.com/dibyendumajumdar/dmr_c ;-)