
C Runtime rationale #69

Closed
csuriano23 opened this issue Feb 7, 2022 · 13 comments

Comments

@csuriano23

csuriano23 commented Feb 7, 2022

Hi there,
this is a really interesting project. I was wondering why the target runtime is obtained transpiling to a C template, instead of directly using rust itself (eg. using Rayon); another option could be targeting something like LLVM.

Thank you

@nataneb32

nataneb32 commented Feb 7, 2022

I think it is done this way because C is an easy language for managing memory. There is a near-future problem that needs some attention: the project is growing fast, and we may need a better abstraction for creating new target runtimes.

@nothingnesses

nothingnesses commented Feb 7, 2022

I don't personally understand the rationale for the C runtime, but a few possible reasons I can think of are:

  • C compilers are less strict, so it's easier to generate programs that compile, which allowed the compiler prototype to be built quickly.
  • Targeting C as opposed to Rust allows HVM programs to run on more architectures, as C has wider support (although this also applies to LLVM).
  • Better familiarity with C than with Rust/LLVM, especially with respect to concurrency/parallelism (since that seems to be the major difference between the two runtimes).

That said, I don't think these adequately justify the C runtime's existence; the benefits are outweighed by the extra maintenance burden of having two runtimes and by C's weaker type system, which is conducive to more error-prone code.

As such, I do hope that parallelism/concurrency can be integrated into the Rust runtime so that it can eventually fully replace the C one. I'd like to work on it myself, but I'm still trying to better understand how the runtimes work, so I don't yet have a good idea of how hard the problem actually is.

From what I currently understand, though, using Rayon would not be straightforward: the Rust runtime makes very little use of iterators, relying instead on loops that mutate many variables. Those loops would first need to be converted to iterator chains and the mutation reduced/eliminated before Rayon could be used, or perhaps some other library/primitive (Crossbeam? mutexes?) would be needed instead. A rough illustration of the required shape change is sketched below.
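To make the point concrete, here is a minimal sketch (hypothetical code, not taken from the HVM runtime) of the kind of restructuring that would be needed: Rayon parallelises iterator pipelines, so a loop that mutates a shared accumulator has to be rewritten as pure per-element work plus an associative reduction.

```rust
// Hypothetical illustration (not HVM code): the same reduction written as a
// mutating loop versus a Rayon parallel iterator. Rayon only parallelises the
// second shape, which is why the current loop-heavy runtime would need
// restructuring first.
use rayon::prelude::*;

fn count_active_sequential(nodes: &[u64]) -> u64 {
    let mut count = 0;
    for &node in nodes {
        if node & 1 == 1 {
            count += 1; // mutation of a shared accumulator: not directly parallelisable
        }
    }
    count
}

fn count_active_parallel(nodes: &[u64]) -> u64 {
    nodes
        .par_iter()                      // Rayon's parallel iterator over the slice
        .filter(|&&node| node & 1 == 1)  // pure per-element work
        .count() as u64                  // associative reduction instead of mutation
}

fn main() {
    let nodes: Vec<u64> = (0..1_000_000).collect();
    assert_eq!(count_active_sequential(&nodes), count_active_parallel(&nodes));
    println!("both agree");
}
```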

@ghost

ghost commented Feb 8, 2022

A C runtime could eventually incorporate features from C++, such as std::future.

@VictorTaelin
Member

VictorTaelin commented Feb 8, 2022

LLVM would probably be a better option; I just don't know it well enough, so C was used for the sake of this prototype.

We could drop both and keep Rust alone, but note that interpreted performance is considerably inferior, so for that to make sense we would need to use something like WASM, Cranelift or LLVM to JIT-compile user-defined functions. In the long term that is probably the best choice.
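For illustration only, here is a minimal Rust sketch (not HVM's actual term representation or evaluator) of why pre-compiling user-defined functions beats re-interpreting them: the interpreted path re-dispatches on every node at every call, while the compiled path pays for dispatch once. A real JIT via WASM, Cranelift or LLVM would go further and emit native code, but the shape of the win is the same.

```rust
// Minimal sketch: interpreting an expression tree on every call versus
// "compiling" it once into a Rust closure. A real JIT would emit machine code
// instead of a closure, but the dispatch-elimination idea is the same.

enum Expr {
    Num(u64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

// Interpreted path: every evaluation re-walks the tree and re-dispatches on tags.
fn eval(expr: &Expr) -> u64 {
    match expr {
        Expr::Num(n) => *n,
        Expr::Add(a, b) => eval(a) + eval(b),
        Expr::Mul(a, b) => eval(a) * eval(b),
    }
}

// "Compiled" path: dispatch happens once, up front; later calls run straight code.
fn compile(expr: &Expr) -> Box<dyn Fn() -> u64> {
    match expr {
        Expr::Num(n) => {
            let n = *n;
            Box::new(move || n)
        }
        Expr::Add(a, b) => {
            let (a, b) = (compile(a), compile(b));
            Box::new(move || a() + b())
        }
        Expr::Mul(a, b) => {
            let (a, b) = (compile(a), compile(b));
            Box::new(move || a() * b())
        }
    }
}

fn main() {
    let e = Expr::Add(
        Box::new(Expr::Num(2)),
        Box::new(Expr::Mul(Box::new(Expr::Num(3)), Box::new(Expr::Num(4)))),
    );
    let compiled = compile(&e);
    assert_eq!(eval(&e), compiled()); // both give 14
}
```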

Rayon makes no sense at all; it isn't related to the kind of parallelism that HVM uses.

@dumblob

dumblob commented Feb 8, 2022

Btw, LLVM IR, JIT, etc. change extremely often (see c3d/xl#39), so think twice before going down that route. C itself is vastly more portable and essentially 100% stable (and will stay that way). This could be a good argument for generating C rather than LLVM IR.

OTOH, why not generate Rust code and compile that? Sure, the Rust compiler is not ubiquitous and it would be much more difficult to generate such code, but it's the second-cleanest solution (after the current C solution) I can think of right now.

@nothingnesses

nothingnesses commented Feb 8, 2022

I agree regarding LLVM: it's probably too fast-moving for a small team to reliably maintain support for. Likewise with Cranelift; afaik its API is still unstable. I know Rust and C code can be compiled to WASM with wasm-bindgen and emscripten, respectively, but I'm not sure if that's what Victor meant, or whether he meant we should target WASM/Binaryen IR directly instead.

@ghost

ghost commented Feb 8, 2022

Upcoming techniques for automatic code generation, such as AlphaCode, will allow brute-forcing more efficient implementations at the lowest level available, which you must then reverse-engineer to understand how they do what they do. For this reason, it may make sense to target assembly.

@kungfooman

How about targeting scripting languages? Either JS on V8/ChakraCore or Lua on LuaJIT... the JIT engines generate assembly for multiple platforms on the fly, and usually you just have to write idiomatic/high-level code.

Common structures like hashmaps etc. are also efficiently implemented.

Threads could be implemented just as they are in WebAssembly/emscripten:

Emscripten has support for multithreading using SharedArrayBuffer in browsers. That API allows sharing memory between the main thread and web workers as well as atomic operations for synchronization, which enables Emscripten to implement support for the Pthreads (POSIX threads) API. This support is considered stable in Emscripten.

https://emscripten.org/docs/porting/pthreads.html
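For readers more familiar with native code than the Web stack, the primitives the emscripten pthreads port builds on (memory shared between threads plus atomic operations for synchronization) correspond roughly to the following Rust sketch; the analogy is mine, not part of the emscripten documentation.

```rust
// Rough analogue (illustrative only): shared memory plus atomics, the same
// building blocks SharedArrayBuffer and Atomics provide in the browser.
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let shared = Arc::new(AtomicU64::new(0)); // shared memory, like a SharedArrayBuffer cell
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                for _ in 0..1_000 {
                    shared.fetch_add(1, Ordering::Relaxed); // atomic op, like Atomics.add
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(shared.load(Ordering::Relaxed), 4_000);
}
```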

@ghost

ghost commented Feb 9, 2022

The benefit of relying less (or not at all) on existing toolchains and JITs is that you can focus on your specific use case; it's a no-free-lunch kind of thing. For example, -O3 and PGO don't help in the general case, only in more specific ones. That said, it isn't clear to me how many LOC would ultimately be involved. If speed is the goal, I don't think the PyPy value proposition (that it could beat CPython because writing Python in Python allows for more optimization, thanks to increased legibility) ultimately worked out.

@VictorTaelin
Member

VictorTaelin commented Feb 9, 2022

> OTOH, why not generate Rust code and compile that? Sure, the Rust compiler is not ubiquitous and it would be much more difficult to generate such code,

That is the main reason.

> How about targeting scripting languages? Either JS on V8/ChakraCore or Lua on LuaJIT... the JIT engines generate assembly for multiple platforms on the fly, and usually you just have to write idiomatic/high-level code.

I think JS should be targeted via WASM, though. Targeting JS directly isn't hard, but the HVM->C compiler relies heavily on the assumption that clang will aggressively inline function calls. JS does basically no inlining, so either we'd need to inline ourselves or it would be terribly slow: we could reach perhaps 30m rewrites per second in JS if we inlined, vs maybe 3m if we didn't. Another complication is that JS lacks 64-bit ints, so we'd either lose some capacity and use 52-bit pointers, or compile to BigInt and take another ~5x performance hit.
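To illustrate what "52-bit pointers" would mean in practice, here is a hypothetical sketch (the 4-bit tag / 48-bit address split is invented for the example and is not HVM's actual memory layout): the packed word stays below 2^53, so it survives a round-trip through a JavaScript double.

```rust
// Hypothetical 52-bit tagged word that fits under JavaScript's
// Number.MAX_SAFE_INTEGER (2^53 - 1). Layout is illustrative only.

const TAG_BITS: u64 = 4;
const ADR_BITS: u64 = 48;
const ADR_MASK: u64 = (1 << ADR_BITS) - 1;

fn pack(tag: u64, adr: u64) -> u64 {
    debug_assert!(tag < (1 << TAG_BITS) && adr <= ADR_MASK);
    (tag << ADR_BITS) | adr // 52 bits total, exactly representable as an f64
}

fn tag_of(word: u64) -> u64 {
    word >> ADR_BITS
}

fn adr_of(word: u64) -> u64 {
    word & ADR_MASK
}

fn main() {
    let word = pack(0b1010, 0x0000_DEAD_BEEF);
    assert!(word < (1 << 53)); // safe to hand to a JS Number
    assert_eq!(tag_of(word), 0b1010);
    assert_eq!(adr_of(word), 0x0000_DEAD_BEEF);
}
```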

@ghost

ghost commented Feb 10, 2022

Using emscripten was extremely easy! But yeah, I couldn't get it to work correctly due to the memory model.


@zicklag

zicklag commented Jun 6, 2022

We could maybe target an intermediate representation such as the yair project's. yair probably isn't any more mature than this project is, but it uses an LLVM or Cranelift backend to generate the assembly.

Again, maybe not the best option from a maturity standpoint, but it's something to think about.

@VictorTaelin
Member

HVM compiles to Rust now!
