
C Runtime rationale #69

Closed
csuriano23 opened this issue Feb 7, 2022 · 13 comments

Comments

@csuriano23

csuriano23 commented Feb 7, 2022

Hi there,
this is a really interesting project. I was wondering why the target runtime is obtained transpiling to a C template, instead of directly using rust itself (eg. using Rayon); another option could be targeting something like LLVM.

Thank you

@nataneb32

nataneb32 commented Feb 7, 2022

I think it is done this way because C is an easy language for managing memory. There is a near-future problem that needs some attention: the project is growing fast, and we may need a better abstraction for creating new target runtimes.

@nothingnesses

nothingnesses commented Feb 7, 2022

I don't personally understand the rationale for the C runtime, but a few possible reasons I can think of are:

  • C compilers are less strict, so it's easier to generate programs that compile, which allowed the compiler prototype to be built quickly.
  • Targeting C as opposed to Rust allows HVM programs to run on more architectures, as C has wider support (although this also applies to LLVM).
  • Better familiarity with C than with Rust/LLVM, especially with respect to concurrency/parallelism (since that seems to be the major difference between the two runtimes).

That said, I don't think these adequately justify the C runtime's existence; the benefits are outweighed by the extra maintenance burden of having two runtimes and by C's weaker type system, which is conducive to more error-prone code.

As such, I do hope that parallelism/concurrency can be integrated into the Rust runtime so that it can eventually fully replace the C one. I'd like to work on it myself, but I'm still trying to better understand how the runtimes work, so I don't yet have a good idea of how hard the problem actually is.

From what I currently understand, though, using Rayon would not be straightforward: the Rust runtime makes very little use of iterators, relying instead on loops that mutate many variables. Those loops would first need to be converted to iterator chains and the mutation reduced/eliminated before Rayon could be used, or perhaps some other library/primitive (Crossbeam? mutexes?) would be needed instead. A rough illustration of the required shape change is sketched below.
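To make the point concrete, here is a minimal sketch (hypothetical code, not taken from the HVM runtime) of the kind of restructuring that would be needed: Rayon parallelises iterator pipelines, so a loop that mutates a shared accumulator has to be rewritten as pure per-element work plus an associative reduction.

```rust
// Hypothetical illustration (not HVM code): the same reduction written as a
// mutating loop versus a Rayon parallel iterator. Rayon only parallelises the
// second shape, which is why the current loop-heavy runtime would need
// restructuring first.
use rayon::prelude::*;

fn count_active_sequential(nodes: &[u64]) -> u64 {
    let mut count = 0;
    for &node in nodes {
        if node & 1 == 1 {
            count += 1; // mutation of a shared accumulator: not directly parallelisable
        }
    }
    count
}

fn count_active_parallel(nodes: &[u64]) -> u64 {
    nodes
        .par_iter()                      // Rayon's parallel iterator over the slice
        .filter(|&&node| node & 1 == 1)  // pure per-element work
        .count() as u64                  // associative reduction instead of mutation
}

fn main() {
    let nodes: Vec<u64> = (0..1_000_000).collect();
    assert_eq!(count_active_sequential(&nodes), count_active_parallel(&nodes));
    println!("both agree");
}
```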

@ghost

ghost commented Feb 8, 2022

A C runtime could eventually incorporate features from C++, such as std::future.

@VictorTaelin
Member

VictorTaelin commented Feb 8, 2022

LLVM would probably be a better option; I just don't know it well enough, so C was used for the sake of this prototype.

We could drop both and keep Rust alone, but note that interpreted performance is considerably inferior, so for that to make sense we would need to use something like WASM, Cranelift or LLVM to JIT-compile user-defined functions. In the long term that is probably the best choice.
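For illustration only, here is a minimal Rust sketch (not HVM's actual term representation or evaluator) of why pre-compiling user-defined functions beats re-interpreting them: the interpreted path re-dispatches on every node at every call, while the compiled path pays for dispatch once. A real JIT via WASM, Cranelift or LLVM would go further and emit native code, but the shape of the win is the same.

```rust
// Minimal sketch: interpreting an expression tree on every call versus
// "compiling" it once into a Rust closure. A real JIT would emit machine code
// instead of a closure, but the dispatch-elimination idea is the same.

enum Expr {
    Num(u64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

// Interpreted path: every evaluation re-walks the tree and re-dispatches on tags.
fn eval(expr: &Expr) -> u64 {
    match expr {
        Expr::Num(n) => *n,
        Expr::Add(a, b) => eval(a) + eval(b),
        Expr::Mul(a, b) => eval(a) * eval(b),
    }
}

// "Compiled" path: dispatch happens once, up front; later calls run straight code.
fn compile(expr: &Expr) -> Box<dyn Fn() -> u64> {
    match expr {
        Expr::Num(n) => {
            let n = *n;
            Box::new(move || n)
        }
        Expr::Add(a, b) => {
            let (a, b) = (compile(a), compile(b));
            Box::new(move || a() + b())
        }
        Expr::Mul(a, b) => {
            let (a, b) = (compile(a), compile(b));
            Box::new(move || a() * b())
        }
    }
}

fn main() {
    let e = Expr::Add(
        Box::new(Expr::Num(2)),
        Box::new(Expr::Mul(Box::new(Expr::Num(3)), Box::new(Expr::Num(4)))),
    );
    let compiled = compile(&e);
    assert_eq!(eval(&e), compiled()); // both give 14
}
```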

Rayon makes no sense at all; it isn't related to the kind of parallelism that HVM uses.

@dumblob

dumblob commented Feb 8, 2022

Btw, LLVM IR, JIT, etc. change extremely often (see c3d/xl#39), so think twice before going down that route. C itself is vastly more portable and essentially 100% stable (and will stay that way). This could be a good argument for generating C rather than LLVM IR.

OTOH, why not generate Rust code and compile that? Sure, the Rust compiler is not ubiquitous and it would be much more difficult to generate such code, but it's the second-cleanest solution (after the current C solution) I can think of right now.

@nothingnesses

nothingnesses commented Feb 8, 2022

I agree regarding LLVM: it's probably too fast-moving for a small team to reliably maintain support for. Likewise with Cranelift; afaik its API is still unstable. I know Rust and C code can be compiled to WASM with wasm-bindgen and emscripten, respectively, but I'm not sure if that's what Victor meant, or whether he meant we should target WASM/Binaryen IR directly instead.

@ghost

ghost commented Feb 8, 2022

Upcoming techniques for automatic code generation, such as AlphaCode, will allow brute-forcing more efficient implementations at the lowest level available, which you must then reverse-engineer to understand how they do what they do. For this reason, it may make sense to target assembly.

@kungfooman

How about targeting scripting languages? Either JS on V8/ChakraCore or Lua on LuaJIT... the JIT engines generate assembly for multiple platforms on the fly, and usually you just have to write idiomatic/high-level code.

Common structures like hashmaps etc. are also efficiently implemented.

Threads could be implemented just as they are in WebAssembly/emscripten:

Emscripten has support for multithreading using SharedArrayBuffer in browsers. That API allows sharing memory between the main thread and web workers as well as atomic operations for synchronization, which enables Emscripten to implement support for the Pthreads (POSIX threads) API. This support is considered stable in Emscripten.

https://emscripten.org/docs/porting/pthreads.html
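For readers more familiar with native code than the Web stack, the primitives the emscripten pthreads port builds on (memory shared between threads plus atomic operations for synchronization) correspond roughly to the following Rust sketch; the analogy is mine, not part of the emscripten documentation.

```rust
// Rough analogue (illustrative only): shared memory plus atomics, the same
// building blocks SharedArrayBuffer and Atomics provide in the browser.
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let shared = Arc::new(AtomicU64::new(0)); // shared memory, like a SharedArrayBuffer cell
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                for _ in 0..1_000 {
                    shared.fetch_add(1, Ordering::Relaxed); // atomic op, like Atomics.add
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(shared.load(Ordering::Relaxed), 4_000);
}
```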

@ghost

ghost commented Feb 9, 2022

The benefit of relying less (or not at all) on existing toolchains and JITs is that you can focus on your specific use case; it's a no-free-lunch kind of thing. For example, -O3 and PGO don't help in the general case, only in more specific ones. That said, it isn't clear to me how many LOC would ultimately be involved. If speed is the goal, I don't think the PyPy value proposition (that it could beat CPython because writing Python in Python allows for more optimization, thanks to increased legibility) ultimately worked out.

@VictorTaelin
Member

VictorTaelin commented Feb 9, 2022

> OTOH, why not generate Rust code and compile that? Sure, the Rust compiler is not ubiquitous and it would be much more difficult to generate such code,

That is the main reason.

> How about targeting scripting languages? Either JS on V8/ChakraCore or Lua on LuaJIT... the JIT engines generate assembly for multiple platforms on the fly, and usually you just have to write idiomatic/high-level code.

I think JS should be targeted via WASM, though. Targeting JS directly isn't hard, but the HVM->C compiler relies heavily on the assumption that clang will aggressively inline function calls. JS does basically no inlining, so either we'd need to inline ourselves or it would be terribly slow: we could reach perhaps 30m rewrites per second in JS if we inlined, vs maybe 3m if we didn't. Another complication is that JS lacks 64-bit ints, so we'd either lose some capacity and use 52-bit pointers, or compile to BigInt and take another ~5x performance hit.
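To illustrate what "52-bit pointers" would mean in practice, here is a hypothetical sketch (the 4-bit tag / 48-bit address split is invented for the example and is not HVM's actual memory layout): the packed word stays below 2^53, so it survives a round-trip through a JavaScript double.

```rust
// Hypothetical 52-bit tagged word that fits under JavaScript's
// Number.MAX_SAFE_INTEGER (2^53 - 1). Layout is illustrative only.

const TAG_BITS: u64 = 4;
const ADR_BITS: u64 = 48;
const ADR_MASK: u64 = (1 << ADR_BITS) - 1;

fn pack(tag: u64, adr: u64) -> u64 {
    debug_assert!(tag < (1 << TAG_BITS) && adr <= ADR_MASK);
    (tag << ADR_BITS) | adr // 52 bits total, exactly representable as an f64
}

fn tag_of(word: u64) -> u64 {
    word >> ADR_BITS
}

fn adr_of(word: u64) -> u64 {
    word & ADR_MASK
}

fn main() {
    let word = pack(0b1010, 0x0000_DEAD_BEEF);
    assert!(word < (1 << 53)); // safe to hand to a JS Number
    assert_eq!(tag_of(word), 0b1010);
    assert_eq!(adr_of(word), 0x0000_DEAD_BEEF);
}
```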

@ghost

ghost commented Feb 10, 2022

Using emscripten was extremely easy! But yeah, I couldn't get it to work correctly due to the memory model.


@zicklag

zicklag commented Jun 6, 2022

We could maybe target an intermediate representation such as the yair project's. yair probably isn't any more mature than this project is, but it uses an LLVM or Cranelift backend to generate the assembly.

Again, maybe not the best option from a maturity standpoint, but it's something to think about.

@VictorTaelin
Member

HVM compiles to Rust now!
