Future of preprocessing? #5732

Closed
kripken opened this issue Oct 31, 2017 · 10 comments

@kripken
Member

kripken commented Oct 31, 2017

We use preprocessing on our JS files. The current implementation is a simple one (with many limitations), written in JS. It would be good to be able to run it from Python too, so we could use it in more places. Right now, for example, we duplicate a few files since the SINGLE_FILE PR landed, because sharing them would need preprocessing from Python.

See some debate in #5494 , and also relevant:

One proposal was to use the clang preprocessor for everything. The benefit is that it's a solid, standard preprocessor we already have. On the other hand, it would mean we require clang: if a language like Rust wanted to ship just Rust + emscripten, it would need LLVM but not clang.

Alternatively, we could write and maintain a small preprocessor for our purposes, sort of like we do now, but more full-featured and easily usable from both JS and Python.

@dschuff
Member

dschuff commented Nov 3, 2017

Using the C preprocessor doesn't have to mean Clang, it could be any C compiler. I don't think that "any C preprocessor" is that much of a burden. For example if you wanted to ship a really-minimal rust-only SDK you could include tcc or pcpp instead of clang. Otherwise you're in this uncanny valley where you have what looks like CPP but isn't. And you probably have extra bugs because you decided to roll your own. Having said that of course, CPP is still probably more featureful than we actually need; we mostly just use #ifdef and we don't use function-like macros, right? So maybe we just add proper conditional expressions and call it a day? (that's the bit that I really miss). But if you let the scope creep, then you're back into the same tradeoff.
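To make the "just #if with proper conditional expressions" option concrete, here is a minimal sketch of what such a line-based pass might look like. It is not the existing Emscripten preprocessor; the function name `preprocess`, the `settings` object, and the rule that unknown names evaluate to 0 are all assumptions made for illustration.

```javascript
// Hypothetical sketch: a line-based preprocessor that understands only
// #if / #else / #endif, with boolean expressions over named settings.
// Other directives (such as #ifdef) are not recognized and pass through as text.
function preprocess(source, settings) {
  // Evaluate an expression such as "FETCH_DEBUG || GL_DEBUG".
  // Assumption: names missing from `settings` evaluate to 0.
  function evaluate(expr) {
    // Only identifiers, numbers, whitespace and a few operators are allowed.
    if (!/^[\w\s|&!()<>=+\-]+$/.test(expr)) {
      throw new Error('unsupported #if expression: ' + expr);
    }
    const js = expr.replace(/\b[A-Za-z_]\w*\b/g, (name) =>
      JSON.stringify(
        Object.prototype.hasOwnProperty.call(settings, name) ? settings[name] : 0));
    return Boolean(new Function('return (' + js + ');')());
  }

  const out = [];
  // Each entry records whether the enclosing region was being emitted
  // (parentActive) and whether the #if branch itself was taken (taken).
  const stack = [];
  let active = true;
  for (const line of source.split('\n')) {
    const m = /^\s*#\s*(if|else|endif)\b(.*)$/.exec(line);
    if (!m) {
      if (active) out.push(line);
      continue;
    }
    const directive = m[1];
    if (directive === 'if') {
      // Expressions inside skipped regions are not evaluated.
      const taken = active && evaluate(m[2].trim());
      stack.push({ parentActive: active, taken });
      active = taken;
    } else if (directive === 'else') {
      if (stack.length === 0) throw new Error('#else without #if');
      const top = stack[stack.length - 1];
      active = top.parentActive && !top.taken;
    } else {
      if (stack.length === 0) throw new Error('#endif without #if');
      active = stack.pop().parentActive;
    }
  }
  if (stack.length !== 0) throw new Error('unterminated #if');
  return out.join('\n');
}
```

Because directives are recognized only at the start of a line, the rest of the file never has to be tokenized, which is what keeps this variant small. The same logic would be short to port to Python.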

@juj
Collaborator

juj commented Nov 6, 2017

I'd really like to go the route of full C preprocessing, since that would let me use #defines, #includes and C macro functions in .js files, which would help with more efficient dead code elimination. The uncanny valley aspect has bitten a lot of developers, so reusing LLVM/Clang for the full thing would help a lot, and we would not even need to test the implementation, since it is already a well-tested preprocessor.

@kripken
Member Author

kripken commented Nov 6, 2017

@juj What would be a use case for supporting C macros? (First I hear of that, I thought the context here was just ifdefing.)

@juj
Collaborator

juj commented Nov 10, 2017

One example is that I find myself doing something like

#if FETCH_DEBUG
  console.error('tracingLogPrintWithCallstack');
#endif

#if GL_DEBUG
  console.error('tracingLogPrintWithCallstack');
#endif

#if ASMFS_DEBUG
  console.error('tracingLogPrintWithCallstack');
#endif

etc.

and it would be nice to replace those with a #define FETCH_DEBUG(str) in a JavaScript file.

Another example is when I wanted to do pthreads proxying, and wanted to have a prologue in all proxied functions: 1dbac7a#diff-94dc100be52b26c684387d19d991a887R52.

More recently such a scenario occurs when I want to do GL context proxying where a GL context might either be owned by the current thread, or by some other thread, and certain functions receive a prologue where they check if (callingThreadHostsTheGLContextOrSomeOtherThread) to know where to route the call.

@kripken
Member Author

kripken commented Nov 10, 2017

Maybe I missed something, but wouldn't the first example be best as

#if FETCH_DEBUG || GL_DEBUG || ASMFS_DEBUG
  console.error('tracingLogPrintWithCallstack');
#endif

Or are you saying that pattern itself would be very common and you want to replace it all with

FETCH_DEBUG(tracingLogPrintWithCallstack)

?

That does make sense. But on the other hand, we do have the {{{ code }}} capability already, so we could do

// this would be written once somewhere
{{{
function fetchDebug(str) {
  if (FETCH_DEBUG || GL_DEBUG || ASMFS_DEBUG) {
    return "console.error('" + str + "');";
  } else {
    return '';
  }
}
}}}
[..]
// then this can be anywhere after it
{{{ fetchDebug('tracingLogPrintWithCallstack') }}}
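As a rough illustration of how a `{{{ ... }}}` substitution pass can work, here is a minimal sketch. It is not Emscripten's actual implementation: `expandInline` and `makeScope` are hypothetical names, and unlike the snippet above, where the helper is defined inside a `{{{ }}}` block, this sketch assumes helpers such as `fetchDebug` are supplied by the caller and that each `{{{ }}}` holds a single expression.

```javascript
// Hypothetical sketch of a `{{{ expr }}}` substitution pass.
// Assumption: each `{{{ ... }}}` contains one JS expression, evaluated with
// the properties of `scope` visible as local names.
function expandInline(source, scope) {
  const names = Object.keys(scope);
  const values = names.map((name) => scope[name]);
  return source.replace(/\{\{\{([\s\S]*?)\}\}\}/g, (match, expr) => {
    const fn = new Function(...names, 'return (' + expr + ');');
    const result = fn(...values);
    // A null or undefined result expands to nothing.
    return result === undefined || result === null ? '' : String(result);
  });
}

// Hypothetical scope builder: the debug flags plus the helper from the comment.
function makeScope(flags) {
  return {
    ...flags,
    fetchDebug: (str) =>
      flags.FETCH_DEBUG || flags.GL_DEBUG || flags.ASMFS_DEBUG
        ? "console.error('" + str + "');"
        : '',
  };
}

const expanded = expandInline(
  "{{{ fetchDebug('tracingLogPrintWithCallstack') }}}",
  makeScope({ GL_DEBUG: 1 }));
console.log(expanded);
```

With every flag unset, the helper returns an empty string, so the call site disappears from the output entirely, which is the dead code elimination effect being discussed.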

@juj
Collaborator

juj commented Nov 10, 2017

> Or are you saying that pattern itself would be very common and you want to replace it all with
>
> FETCH_DEBUG(tracingLogPrintWithCallstack)

Yeah, this case - would be nice to be able to do one-liners like this.

> But on the other hand, we do have the {{{ code }}} capability already, so we could do ...

My understanding is that this would be a compiler-side thing, so one would have to create ad hoc rules in the compiler .js files, whereas macros could be used in place in the library code itself. Developers writing their own libraries could also define their own macros as efficient DCE mechanisms.

@kripken
Member Author

kripken commented Nov 13, 2017

A library could still do that. But it is a few more lines than a C-style macro, that's true.

C-style macros are more invasive, though: the preprocessor needs to look for them in all the code, not just on lines starting with #.

But, they are more familiar of course.
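The "more invasive" point can be seen in a small sketch of function-like macro expansion. This is a simplified illustration, not a C preprocessor: `expandMacros` and the `DEBUG_LOG` macro in the usage example are hypothetical, macros must fit on one line, and arguments may not contain parentheses or commas. It also expands matches inside string literals and comments, which a real tokenizing preprocessor would not do.

```javascript
// Hypothetical sketch: expand single-line, function-like macros.
// Assumptions: "#define NAME(params) body" on one line; call arguments contain
// no parentheses or commas; expansions are not rescanned for further macros.
function expandMacros(source) {
  const macros = new Map(); // macro name -> { params, body }
  const out = [];
  for (const line of source.split('\n')) {
    const def = /^\s*#\s*define\s+([A-Za-z_]\w*)\(([^)]*)\)\s*(.*)$/.exec(line);
    if (def) {
      const params = def[2].split(',').map((p) => p.trim()).filter(Boolean);
      macros.set(def[1], { params, body: def[3] });
      continue;
    }
    // Unlike #if handling, every ordinary line has to be scanned for uses.
    out.push(line.replace(/\b([A-Za-z_]\w*)\(([^()]*)\)/g, (call, name, argText) => {
      const macro = macros.get(name);
      if (!macro) return call; // an ordinary function call, left untouched
      const args = argText.split(',').map((a) => a.trim());
      // Substitute each parameter name in the body with its argument text.
      return macro.body.replace(/\b[A-Za-z_]\w*\b/g, (word) => {
        const i = macro.params.indexOf(word);
        if (i === -1) return word;
        return args[i] !== undefined ? args[i] : '';
      });
    }));
  }
  return out.join('\n');
}

const macroInput = [
  '#define DEBUG_LOG(msg) console.error(msg);',
  'function f() {',
  "  DEBUG_LOG('tracingLogPrintWithCallstack')",
  '}',
].join('\n');
console.log(expandMacros(macroInput));
```

Even this toy version has to inspect every call-like token on every line, and it cannot tell code from strings, which is where a hand-rolled implementation starts to diverge from real CPP behavior.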

@saschanaz
Collaborator

Personally I hope a future preprocessor can manage JS-compatible syntax so that I won't see these red lines anymore.

image

@curiousdannii
Contributor

curiousdannii commented Nov 26, 2017

So to put another idea out there, emscripten could take a page from the literate programming book.

Goals:

The advantage of a literate programming style system for us is that subsequent definitions of each kind of block are concatenated/appended together.

I quite like the syntax of cdosborn/lit. For example, imagine a setup where we have separate files for wasm and asm.js. Each file can then define and extend blocks.

Define some functions to use later.
    << definitions >>=
    var wasm_filename = {{{ WASM_FILENAME }}}
    << wasm fetch >>
    << wasm setup >>

Add a couple of entries to the startup promise chain:
    << startup promise chain >>=
    .then( wasm_fetch )
    .then( wasm_startup )

Add some standard lib functions:
    << standard lib >>=
    function malloc() { ... }

    function emscripten_wget() { ... }

    << base standard lib >>

So if you're not familiar with literate programming, one of the core ideas is that source code structure doesn't have to equal program structure. You can define blocks of code and then include them later, or before. In the lit syntax << ... >>= defines a block, concatenating it if there are already definitions, and << ... >> includes a block. Using literate programming also allows you to include documentation along with the code (often all the docs are generated from the code. We wouldn't have to do that.) If we used a syntax like this, the main file could be markdown. I'm not sure which editors are smart enough to do syntax formatting for code blocks, but some would be.

The code can be broken out into files for each aspect of the code, and conditionally included only if they're needed. So separate files for wasm and asm.js, for emterpreter, for GL and AL, threads, modularise, etc. The main file doesn't need to know what code eventually gets included, it just says << definitions >> at the appropriate place and everything else gets included. A code gen hook system essentially.

Just to be clear, the main advantage I'm thinking this has is the block concatenating/appending. I looked at CPP and a whole bunch of JS templating systems, and couldn't see any that had it. But maybe there are other options. Maybe it's even possible with CPP.
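The block concatenation idea can be sketched in a few lines. This is not the cdosborn/lit tool: the parsing rules are simplified assumptions (indented lines are code, unindented lines are prose, blank lines are dropped, relative indentation is not preserved), and `collectBlocks`, `tangle`, the `main` block and `gl_init` are hypothetical names used only for the example.

```javascript
// Hypothetical sketch of lit-style "tangling" with block concatenation.
// Assumptions: an indented `<< name >>=` line starts or extends a block,
// later indented lines are appended to it, and an unindented non-blank line
// (prose) ends it. Blank lines and relative indentation are discarded.
function collectBlocks(documents) {
  const blocks = new Map(); // block name -> array of body lines
  for (const text of documents) {
    let current = null;
    for (const line of text.split('\n')) {
      const def = /^\s+<<\s*(.+?)\s*>>=\s*$/.exec(line);
      if (def) {
        // A repeated definition appends to the existing block.
        if (!blocks.has(def[1])) blocks.set(def[1], []);
        current = blocks.get(def[1]);
      } else if (/^\s+\S/.test(line)) {
        if (current) current.push(line.trim());
      } else if (line.trim() !== '') {
        current = null; // prose ends the current block
      }
    }
  }
  return blocks;
}

// Expand `<< name >>` references recursively, starting from one block.
// A reference to a block nobody defined expands to nothing.
function tangle(blocks, name, seen = []) {
  if (seen.includes(name)) throw new Error('cyclic block reference: ' + name);
  const out = [];
  for (const line of blocks.get(name) || []) {
    const ref = /^<<\s*(.+?)\s*>>$/.exec(line);
    if (ref) out.push(...tangle(blocks, ref[1], seen.concat(name)));
    else out.push(line);
  }
  return out;
}

const mainDoc = [
  'The main file only names the hook.',
  '    << main >>=',
  '    start()',
  '    << startup promise chain >>',
  '    done()',
].join('\n');
const wasmDoc = [
  'Add a couple of entries to the startup promise chain:',
  '    << startup promise chain >>=',
  '    .then( wasm_fetch )',
  '    .then( wasm_startup )',
].join('\n');
const glDoc = [
  'A second file extends the same block:',
  '    << startup promise chain >>=',
  '    .then( gl_init )',
].join('\n');

console.log(tangle(collectBlocks([mainDoc, wasmDoc, glDoc]), 'main').join('\n'));
```

Leaving `glDoc` out of the input list drops its entry from the chain without touching the main file, which is the code generation hook behavior described above.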

@stale

stale bot commented Sep 19, 2019

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 7 days. Feel free to re-open at any time if this issue is still relevant.

@stale stale bot added the wontfix label Sep 19, 2019
@stale stale bot closed this as completed Sep 26, 2019