Implement compiled mode for Perlang #406

perlun · 2023-08-12T18:06:29Z

In #396, I described the recent events leading up to me trying out what LLVM can do for us, in terms of making it possible to run Perlang programs completely independent of the .NET platform.

After that comment was written, and after some discussions with an old friend of mine (@diwic - thanks a lot to you! 🙏), I started hacking on this as a little experiment: how hard would it be to write a compiler for Perlang which emits C++ code, compiles this code, and then runs the end result? This is obviously not the "final solution" in any way, and it is admittedly a bit clumsy. Still, if it was good enough for Bjarne Stroustrup, it ought to be good enough for me as well. (Naturally, Stroustrup's preprocessor and later Cfront compiler didn't emit C++, but you get the picture.)

I'm setting the milestone for this to 0.4.0, but naturally, given the sheer size of this task, the compiler will in no way be complete in 0.4.0. But it'll probably work to the point where I feel comfortable about pushing it out to the public.

Rough steps

Implement a compiler which translates the syntax tree for all/most valid Perlang programs into valid C++ code, and compiles and runs the result: (compiler) Add first steps towards experimental compiler #409
Implement a C/C++-based stdlib to support the above: (stdlib) Add C++-based stdlib project #407.
- Make it possible to write unit and/or integration tests for this. We probably have to write these in C or C++ for now. cmocka is a useful unit test library for C that I have used elsewhere.
- Add support for BigInt: Add support for BigInt in compiled mode #415
- Distribute the (compiled) stdlib along with snapshot builds. This involves a bit of complexity, since native C++ code has historically only been able to compile on the same platform as the CI job is running on. We'll need to investigate if clang makes this easier for us.
  - In line with the next point, I think it's fine if we are Linux and amd64-only at this point (in compiled mode). In other words, we'll provide a Linux amd64 binary of the stdlib for now and emit an error message on other platforms stating that experimental compilation is not yet supported.
- We will keep things simple in the 0.4.0 milestone and only support compiled mode on Linux. This makes the above easier. Going forward, we'll need to start building releases separately on each platform (i.e. build macOS on a macOS CI runner, build Linux binaries on Linux and so forth). I'll create a separate issue for this at some point and add a link to it here.
- Implemented as of (ci) Include stdlib in Linux-based .tar.gz snapshots #445, with the above limitation (Linux-only).
Make sure PerlangCompiler uses the stdlib artifacts (.so/.a files and .h/.hpp header files), when being executed from a snapshot build.
- The only thing that will prevent this from happening is if $PERLANG_ROOT is set. $PERLANG_ROOT is still used when running Perlang from source, so let's leave this as-is for now.
Once this is stable enough, consider dropping interpreted mode (to avoid having to always make "two implementations" for all new functionality going into the library). Challenge: this will make it hard/impossible to support the REPL though, so ideally we would keep this until we can reimplement the REPL on top of LLVM instead.
- I am currently (2023-11-03) leaning towards dropping (parts of) the REPL soon, perhaps in the 0.5.0 or 0.6.0 release. This will make things simpler and free us from having to keep it working all the time, since it won't be working in compiled mode anyway (for quite a long time, realistically speaking). Once the Perlang compiler is mature enough to be able to interface with LLVM to generate machine code for an arbitrary Perlang expression tree, we can reimplement the REPL on top of this.
  
  Suggested approach: make some "glue tooling" for interfacing between Perlang and C++ (and perhaps between Perlang and C# in the intermediate stage), so that we can expose the Perlang AST types to a little C++ helper library. The helper library will then consume the LLVM headers and emit machine code for the Perlang AST.
  - REPL and -e option dropped in (consoleapp) Remove REPL code #446 and (consoleapp) Remove -e <script> option #447.
Figure out how to answer hard questions, like how to cast an ASCIIString to String (https://github.com/perlang-org/perlang/pull/451/files#r1548516040)
- Fixed (or worked around) by (stdlib) Wrap ASCIIString in std::shared_ptr<T> #453, which should be "good enough" for now. As the compiler matures (and we can eventually move away from relying too much on C++), we can rework this to use more stack-based ASCIIString instances where possible, to reduce the number of heap allocations.
Implement some of the obvious missing string-related operations
- Concatenation between ASCIIString and int: (many) Support string+integer concatenation in compiled mode #472, (many) Support more types in string+int and int+string concatenation #473
- Concatenation between ASCIIString and ASCIIString: (many) Support string concatenation in compiled mode #470
Implement some mechanism for multi-file projects (like a "build system" of some form, like MSBuild or cargo)
- TODO: Definitely deserves an issue of its own. A quick-and-dirty approach could be to support a perlang . or perlang <some-directory> approach, i.e. compile all files in a given directory; this seems to be similar to how https://vlang.io/ does it. The easy way here would be to just emit a single C++ file; if we do it like this, I think we can postpone the "build system" question for (perhaps much) later.
Implement a way to call Perlang code from C#, by compiling the Perlang code to one or more .so (subsequently .dll on Windows) files.
- Has been started, TODO: add reference to PR when there is one.
Implement a way to do "reverse P/Invoke", i.e. expose Perlang code as native functions for calling them from managed C# code.
- This approach should work, i.e. relying on callbacks which can be converted into function pointers on the Perlang/C++ side: https://stackoverflow.com/questions/7970128/passing-a-c-sharp-callback-function-through-interop-pinvoke
Once the compiler is in place and we have the required mechanics for creating native libraries with Perlang, start planning on gradually rewriting the Perlang compiler in Perlang. The "easiest" way is probably to start rewriting some isolated part of it, and call into the Perlang (native) code from C#.
- The bootstrapping can be done using a "stable" version of the "compile-via-C++" compiler.
- Once we have that bootstrapped, we can then subsequently move to depend on the first "stable" version which can compile to native code without any dependency on C++; our only dependency will be on the LLVM libraries at this point. (Challenge: consuming LLVM from non-C++ languages can be impractical. We might need to write some C++-based glue code in the Perlang compiler to make this happen, as described in one of the previous points.)
- Should also have an issue of its own: Rewrite the Perlang compiler in Perlang #454.
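As an illustration of the `std::shared_ptr<T>` wrapping and the string concatenation steps listed above, here is a minimal sketch. Everything in the `sketch` namespace is hypothetical: the class layout, the `from` factory and the `concat`/`concat_int` helpers are assumptions made for this example and do not mirror the actual stdlib. The one thing it relies on is standard C++ behavior: a `std::shared_ptr<Derived>` converts implicitly to a `std::shared_ptr<Base>`, which is what makes the ASCIIString-to-String cast straightforward.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical base type; the real stdlib classes may look different.
class String {
public:
    virtual ~String() = default;
    virtual const std::string& bytes() const = 0;
};

class ASCIIString : public String {
public:
    explicit ASCIIString(std::string s) : data_(std::move(s)) {}
    const std::string& bytes() const override { return data_; }

    // Heap-allocates the string and returns it wrapped in a shared_ptr.
    static std::shared_ptr<ASCIIString> from(std::string s) {
        return std::make_shared<ASCIIString>(std::move(s));
    }

private:
    std::string data_;
};

// string + string: accepts any String subtype through the base pointer.
std::shared_ptr<String> concat(const std::shared_ptr<String>& lhs,
                               const std::shared_ptr<String>& rhs) {
    return ASCIIString::from(lhs->bytes() + rhs->bytes());
}

// string + int: the integer is formatted in base 10 before appending.
std::shared_ptr<String> concat_int(const std::shared_ptr<String>& lhs,
                                   long long rhs) {
    return ASCIIString::from(lhs->bytes() + std::to_string(rhs));
}

}  // namespace sketch
```

Passing a `std::shared_ptr<sketch::ASCIIString>` where a `std::shared_ptr<sketch::String>` is expected needs no explicit cast. The cost is one heap allocation per string, which is the trade-off the note about stack-based instances refers to.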


perlun · 2023-10-01T11:12:45Z

> This involves a bit of complexity, since native C++ code has historically only been able to compile on the same platform as the CI job is running on. We'll need to investigate if clang makes this easier for us.

It does, since Clang is natively a cross-compiler. But that unfortunately doesn't magically solve all related problems:

> But, as is true to any cross-compiler, and given the complexity of different architectures, OS’s and options, it’s not always easy finding the headers, libraries or binutils to generate target specific code. So you’ll need special options to help Clang understand what target you’re compiling to, where your tools are, etc.
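A rough sketch of what those "special options" could look like when the compiler driver is invoked programmatically. The `CrossTarget` struct and `clang_arguments` function are hypothetical names for this example; `--target=` and `--sysroot=` are existing Clang options, but the triple and sysroot path in the usage note are made-up values, and a real setup would likely need more flags (linker selection, include paths and so on).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of a cross-compilation target.
struct CrossTarget {
    std::string triple;   // e.g. "aarch64-linux-gnu"
    std::string sysroot;  // root of the target's headers/libraries; may be empty
};

// Builds the argument list for compiling one source file to an object file.
std::vector<std::string> clang_arguments(const CrossTarget& target,
                                         const std::string& source,
                                         const std::string& output) {
    std::vector<std::string> args;
    args.push_back("clang++");
    args.push_back("--target=" + target.triple);
    if (!target.sysroot.empty()) {
        // Tells Clang where to find the target's headers and libraries.
        args.push_back("--sysroot=" + target.sysroot);
    }
    args.push_back("-c");
    args.push_back(source);
    args.push_back("-o");
    args.push_back(output);
    return args;
}
```

For a target with triple `aarch64-linux-gnu` and sysroot `/opt/sysroots/aarch64`, the resulting list starts with `clang++ --target=aarch64-linux-gnu --sysroot=/opt/sysroots/aarch64`, followed by the `-c`/`-o` arguments.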

perlun · 2024-02-24T20:29:00Z

> It does, since Clang is natively a cross-compiler. But that unfortunately doesn't magically solve all related problems:
>
> > But, as is true to any cross-compiler, and given the complexity of different architectures, OS’s and options, it’s not always easy finding the headers, libraries or binutils to generate target specific code. So you’ll need special options to help Clang understand what target you’re compiling to, where your tools are, etc.

An interesting approach to this is the way Golang is handling this. A single CI job can generate artifacts for a number of platforms and architectures, with very little extra work for the project itself. I saw this myself in action recently: https://gitlab.com/fleeting-plugin-hetzner/fleeting-plugin-hetzner/-/blob/main/.gitlab/ci/build.gitlab-ci.yml?ref_type=heads. The end result can be seen in this pipeline: https://gitlab.com/fleeting-plugin-hetzner/fleeting-plugin-hetzner/-/pipelines/1188295744

Now, this doesn't help us immediately since we don't intend to use Go for this. 😂 But it's still interesting to see, and we should aim for something similar in Perlang: cross compilation should be easy. It's fine if it requires an automatic in-the-background network download of standard libraries though; I presume (without having looked at the details) that this is how the Go toolchain does it.

This provides some of the groundwork for this, mentioned in #406:

> Distribute the (compiled) stdlib along with snapshot builds

The changes to the `Makefile` mean that running `make install` will now install the `stdlib` into the expected location. The next step is to get the `stdlib` bundled with releases and release snapshots as well.

perlun · 2024-03-30T15:15:02Z

The required groundwork for including experimental compilation in release/snapshot binaries has now been done. 🎉 Moving this to the 0.5.0 milestone now, and intending to publish a 0.4.0 release very soon.

As discussed in #406, the REPL will go away for some time, until we can (at some point) reimplement it on top of LLVM. At that point, the REPL will be _dynamically emitting native code_, i.e. still not require a JIT interpreter of any form. Based on some experiments I've done with LLVM, this should be doable. The `-e "<code-to-be-executed>"` option will also be removed for now, but I'll do that in a separate commit.

This has been an oversight while working on the experimental Perlang compiler (#406). The bug was discovered when implementing the changes in #463; when we started running one of those tests, no error was emitted even though the code was redefining a top-level function. It turned out that the compiler would silently overwrite a function if you defined it twice.
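The kind of check described here can be sketched as follows. This is an illustrative C++ example, not the actual compiler code; the `FunctionTable` class and its error message are hypothetical. The point is only that a definition is rejected when its name is already registered, instead of silently overwriting the earlier one.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_set>

// Hypothetical registry of top-level function names.
class FunctionTable {
public:
    // Registers a function name; throws if the name is already defined.
    void define(const std::string& name) {
        // insert() returns {iterator, bool}; the bool is false on duplicates.
        bool inserted = names_.insert(name).second;
        if (!inserted) {
            throw std::runtime_error("Function '" + name + "' is already defined");
        }
    }

    bool is_defined(const std::string& name) const {
        return names_.count(name) > 0;
    }

private:
    std::unordered_set<std::string> names_;
};
```

Calling `define("foo")` twice on the same table throws on the second call, so the first definition is kept and the redefinition is reported.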

This is the biggest change for a while. Because #406 is moving along nicely, we are now ready to:

* Flip the switch, i.e. _make compiled mode the default_ for Perlang.
* Remove the PerlangInterpreter class in its entirety. This may be reimplemented in one form or another, once we have the LLVM-emitting backend in place, but not as a tree-walking interpreter.

This probably means we'll drop Windows (and perhaps macOS) support for a while. Please don't despair; this is not intended to be permanent. While we depend on a specific Clang version for compiling Perlang code, it simply gets easier to not have to support too many platforms. Once we have started emitting C++ code from Perlang, in an idempotent way (being able to disable all timestamping etc in the file header), we could see how hard it would be to get this Perlang-to-C++-transpiled code compiling on macOS and Windows too.


perlun · 2024-05-08T20:56:18Z

This is the main feature being worked on in the current 0.5.0 milestone, but it won't be finished when we carve out the 0.5.0 release. Moving to 0.6.0.

perlun added the enhancement and compiled mode labels Aug 12, 2023

perlun added this to the 0.4.0 milestone Aug 12, 2023

perlun mentioned this issue Aug 12, 2023

Implement a reflection model for compiled mode #405

Open

perlun pinned this issue Aug 13, 2023

perlun mentioned this issue Sep 1, 2023

(stdlib) Add C++-based stdlib project #407

Merged

perlun mentioned this issue Oct 16, 2023

Support FreeBSD #272

Open

This was referenced Nov 3, 2023

(compiler) Add first steps towards experimental compiler #409

Merged

Support all basic integer and floating-point primitive types #70

Open

Add a char type #364

Open

perlun mentioned this issue Nov 17, 2023

Consider renaming print to println #417

Open

perlun mentioned this issue Feb 27, 2024

Bump StyleCop.Analyzers from 1.2.0-beta.321 to 1.2.0-beta.556 #431

Closed

perlun mentioned this issue Mar 30, 2024

(ci) Include stdlib in Linux-based .tar.gz snapshots #445

Merged

perlun modified the milestones: 0.4.0, 0.5.0 Mar 30, 2024

perlun mentioned this issue Mar 30, 2024

(consoleapp) Remember command history in REPL #182

Open

perlun mentioned this issue Mar 31, 2024

(consoleapp) Remove REPL code #446

Merged

This was referenced Apr 5, 2024

Prevent uninitialized variables #452

Open

Rewrite the Perlang compiler in Perlang #454

Open

perlun mentioned this issue Apr 23, 2024

(compiler) Detect method being redefined #464

Merged

perlun mentioned this issue Apr 24, 2024

(compiler) Make compiled mode be the default and drop interpreted mode #465

Merged

perlun mentioned this issue May 8, 2024

Support array/dictionary indexing #270

Open

perlun modified the milestones: 0.5.0, 0.6.0 May 8, 2024
