DOM access #5

Open
glyph opened this Issue Jan 5, 2012 · 22 comments

5 participants

@glyph

See discussion on kripken/emscripten#158

@max99x
replit member

Shouldn't be that hard, but not a priority for us, because the performance of any such web app wouldn't be practical. For executing Python code from JS, see entry_point.js. For executing JS code from Python, write a C module that wraps emscripten_run_script and does marshalling. Patches welcome.

@glyph

I wanted to give this a shot, but I blew almost my whole weekend just trying to get empythoned to build :). After hacking on the build script for several hours, moving the build to a 32-bit machine (builds on a 64-bit OS would just fail), I eventually got a functional 30M python.js, which is 10x bigger than the one I originally found (via http://syntensity.com/static/python.html), not to mention much slower, and 6x bigger than the one in python.opt.js. Did that one come from empythoned? Is there a HOWTO somewhere explaining how to get this to actually build? Is there a way to add that C module without having to re-build everything?

@max99x
replit member

Let me address your points in order.

I blew almost my whole weekend just trying to get empythoned to build :).

Ouch! You should've bugged us earlier.

Moving the build to a 32-bit machine (builds on a 64-bit OS would just fail)

Empythoned is hardcoded to be 32-bit, since JS has poor support for 64-bit numbers, and the version of Emscripten that Empythoned uses doesn't support 64-bit compilation anyway. I always compiled Empythoned on my 64-bit Arch Linux machine via a multilib GCC, so that should work.

I eventually got a functional 30M python.js, which is 10x bigger than the one I originally found (via http://syntensity.com/static/python.html), not to mention much slower, and 6x bigger than the one in python.opt.js. Did that one come from empythoned?

The syntensity.com version does not come from Empythoned. That's kripken's first build that inspired this project. It is faster to start up but crashes on Unicode and does not support non-built-in modules. The python.opt.js included in the dist directory is generated by Empythoned and is more or less what is used on repl.it. The size and speed difference is something I'd expect if you compiled with OPTIMIZED set to 0 in the build script. The speed is mainly affected by whether eliminator.coffee is run on the result, and the size by whether minify.py is run. The latter is just a wrapper around the Closure Compiler with some tricks to inline code across modules. Make sure you build your own Closure from trunk and update the JAR path.

Is there a way to add that C module without having to re-build everything?

On a non-optimized build, you can build a C module alone and use it just by adding an entry to the virtual file system (a mapping from virtual to physical paths at the bottom of python.js) and importing it like any other module. On an optimized build, you'll need to build the whole thing from scratch, since we do cross-module inlining and name mangling. A good start would be to use the example C module in cpython/Modules/xxmodule.c as a template. Make sure you add it to the modules list in the build file.

@sunetos

Hey, sorry to revive an old issue but I started working on this same topic. I have empythoned running in a page off of script type="text/python" blocks, and it's really nifty. Unfortunately without DOM access it's not very useful yet.

I think since this issue was first filed, empythoned switched to always using web workers when possible, making DOM access impossible. I tried switching back to not using web workers, but it fails for various reasons in every browser (it seems non-worker modes haven't been maintained).

Do you have any recommendations for best approaches? Should I make some kind of dom proxy object that works in a worker, or should I try to restore support for running synchronously (not in a worker)?

@max99x
replit member

@bkase had a working DOM layer for Empythoned at one point, by running Empythoned in the main browser thread. I'm not sure if he ever released it publicly, but I expect he may be able to help you. In general, the issue you're going to run into if you use a Worker is that you can't do synchronous data exchange between the DOM thread and the worker.

@sunetos

Thanks! It looks like @bkase had a "bryj" project, which is now giving a 404. Do you know a good way to try to get in touch with him?

@sunetos

Actually, maybe I should bypass this issue entirely. What I actually want is to translate the python to js (I don't care if it's completely unreadable byte-code-interpreting js), then run that js block.

I would then have a build script that pre-processes all the python to minified/gzipped js for release builds, and you just use the python sources in development builds. This way you can have the niceness of python while developing, and reasonably small/performant js for production. Is this approach possible with empythoned? Is there any way to capture the interpreted python as a js block?

If not, it appears the thing that broke empythoned outside of workers is the annoying browser change that blocks synchronous XHR when not in a worker. I have ideas for working around that by pre-processing all the python modules server-side to be base64 encoded, then using data URIs synchronously client-side to process them. I would rather not go down that road though if the py->js conversion is possible.

@max99x
replit member

If you're looking for performance, Empythoned is not a great choice. It's perfectly fine for a REPL or short code samples that you want to work 100% identical to CPython, which is how it was intended to be used, but you'll get awful performance out in the real world. Skulpt and Pyjamas may be a better path. I usually just code in Coffeescript when I'm craving Python's elegance in the browser. Anyway, there's currently no way to compile Python code to JS using Empythoned without including the whole slow interpreter.

The breakage of Empythoned in the DOM thread only affects getting non-ASCII data. If you encode the modules to base64 on the server, you can still get and decode them on the client from the DOM thread synchronously.
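
A minimal sketch of the server-side encoding step, assuming the modules are shipped as base64 data URIs as suggested earlier in the thread. The function names `module_to_data_uri` and `data_uri_to_module` are hypothetical and not part of Empythoned; the decoder is included only to check the round trip.

```python
import base64


def module_to_data_uri(source: bytes, mime: str = "text/plain") -> str:
    """Encode raw module bytes as a base64 data URI (ASCII-only output)."""
    payload = base64.b64encode(source).decode("ascii")
    return f"data:{mime};base64,{payload}"


def data_uri_to_module(uri: str) -> bytes:
    """Invert module_to_data_uri; used here only to verify the round trip."""
    header, _, payload = uri.partition(",")
    if not header.endswith(";base64"):
        raise ValueError("expected a base64 data URI")
    return base64.b64decode(payload)


# Non-ASCII source survives because the encoded payload is pure ASCII.
source = "greeting = 'héllo wörld'\n".encode("utf-8")
uri = module_to_data_uri(source)
```

The point of the encoding is that every byte the client has to fetch is ASCII, which sidesteps the non-ASCII restriction described above; the cost is roughly a one-third increase in payload size.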

@sunetos

Thanks for the responses! I'm ok with slow performance in 'debug' (repl) mode, but I was hoping I could get decent performance out of a 'compiled' version. It sounds like that would be quite hard.

I don't think I can use skulpt or pyjamas; I was looking for 100% compatibility with cpython like empythoned. I am after the same goal as node.js, code sharing between frontend and backend, but from the opposite direction: bringing the backend language to the browser instead of bringing the browser language to the backend.

I will keep investigating. I might work on a NACL plugin that embeds cpython as an alternate route.

@max99x
replit member

NACL should be doable, but that only gives you Chrome. :(

@bkase

Hey, sorry I've taken this long to respond, I've been pretty busy this week. I'm finishing up my internship this week so next week I'll clean up what I have for Empydom and open-source it so that you can check it out @sunetos . Performance was not what I had hoped for, but I didn't spend much time optimizing either.

Here's what I have:
A window Python object is initialized before any Python code is executed on a page. The window object can be used as you would the JavaScript window object. Specifically:

  • You can add properties to window, you can read values from properties within window, and you can call functions from properties within window.
  • You can pass Python strings, numbers, lists, dictionaries, and functions as parameters to any function in window and they will be interpreted as JavaScript strings, numbers, arrays, objects, and functions respectively.
  • JavaScript errors are percolated up to the Python stack trace and the output function is called for each character of the JavaScript state
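
The type mapping in the list above can be written down as a small lookup. This is only an illustration of the stated correspondence, not Empydom's code: the function name `js_kind` is hypothetical, and rejecting `bool` and every other unlisted type is an assumption of this sketch.

```python
def js_kind(value):
    """Name the JavaScript kind a Python value maps to, per the list above."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; the list above does not
        # mention it, so this sketch treats it as unsupported (assumption).
        raise TypeError("bool is not covered by this sketch")
    if isinstance(value, str):
        return "string"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    if callable(value):
        return "function"
    raise TypeError(f"no mapping for {type(value).__name__}")


# One sample value per row of the mapping.
kinds = [js_kind(v) for v in ("hi", 3, 2.5, [1, 2], {"a": 1}, len)]
```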

I won't go into much implementation detail here in this thread, but if you have any questions once you see the code and my documentation, just let me know. If you are in a rush, I can let you view the code now, but it's really not in a state that I can release it publicly until I fix it up a bit.
Worth noting is that I could not get Empydom to run in a worker thread due to the lack of an API to spin until a message is received in a worker, so it's all on the UI thread.

Speaking of the native CPython solution, I've actually explored that as well. I didn't manage to get CPython to build as an NaCl plugin, but I did get it to build with Firebreath -- which is a wrapper over NPAPI -- I have tested this on both Chrome and Firefox and CPython works at native speeds. I also have basic two-way communication between the CPython FireBreath plugin and JavaScript from the browser working as well.
The problem with just using native CPython is the lack of a sandbox that you get for free when the interpreter is compiled to JavaScript. With the basic plugin, a website could wipe your hard drive or install malicious software very easily (with some basic Python code). If you do go this way, you will need to spend time to ensure that arbitrary unsafe Python code cannot be run from the web. It also requires viewers of the websites to download a native plugin, which is normally frowned upon as far as I can tell. I can share that code with you too if you want, but that is extremely messy right now, and I am not planning on releasing that at the moment.

NaCl may solve some of those problems that you would have with NPAPI, but as @max99x said, you only get Chrome if you go that route.

@sunetos

Hey, thanks for the details. I'm definitely interested in your code, and I'm not in a rush. This is just a side project.

How hard was the firebreath route? This is my new plan:

  • Build a browser extension that embeds cpython (probably with firebreath).
  • Make the extension intercept script type="text/python" and run pyjaco to produce readable js instead of python.
  • Cache the produced scripts in localStorage using the sha1 as the cache key.
  • Release the server-side build script for pre-processing your html for production builds, so end-users get the final js output and don't need any plugins.
  • (later) Try to wrap the cpython runtime in a nacl sandbox (not using the chrome pepper api version of nacl) for security. If that fails, add a FlashBlock style UI for whitelisting domains.
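
The interception and caching steps in the plan above can be sketched with the standard library alone. Everything here is an assumption made for illustration: the class and function names are hypothetical, a plain dict stands in for localStorage, and `translate` is a caller-supplied placeholder for the Python-to-JS step (no pyjaco API is used or implied).

```python
import hashlib
from html.parser import HTMLParser


class PythonScriptExtractor(HTMLParser):
    """Collect the bodies of <script type="text/python"> blocks."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_python = False

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "text/python":
            self._in_python = True
            self.blocks.append("")

    def handle_data(self, data):
        # Script bodies may arrive in several chunks, so accumulate them.
        if self._in_python:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_python = False


def cache_key(python_source: str) -> str:
    """SHA-1 hex digest of the source, used as the cache key."""
    return hashlib.sha1(python_source.encode("utf-8")).hexdigest()


def translate_cached(python_source, translate, cache):
    """Run translate() only when this exact source has not been seen."""
    key = cache_key(python_source)
    if key not in cache:
        cache[key] = translate(python_source)
    return cache[key]
```

Keying on the hash of the source means an edited script block misses the cache automatically, while unchanged blocks skip the translation step entirely.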

pyjaco, despite having a broken/outdated website, has a lot of potential and seems to be the most promising python->js translator.

@sunetos

Actually I may have more luck with pypy's sandboxing than running cpython under nacl:
http://doc.pypy.org/en/latest/sandbox.html

@bkase

For me, the main difficulty with Firebreath was just my lack of experience with the CMake build-system. In order to get the thing to compile, I ended up writing a shell script that would modify the contents of the Makefiles and other config files that CMake outputs to hardcode in the CPython related information -- so really not a great solution.
Firebreath is really easy to use though, the project comes with scripts to generate projects and the hello world example worked for me as soon as I installed all the correct dependencies.

Your plan sounds good, but the Javascript generated from pyjaco will definitely not have 100% compatibility with CPython. I thought the compatibility was one of your constraints.

I briefly tried to get pypy working as well, but the build always failed for me when I tried to compile with the sandbox enabled, even before I could attempt to get it working with Firebreath. The pypy sandbox might be a good solution, if you can get this to work.

@sunetos

You are right about the compatibility. I spent some time in #pypy and they quickly convinced me that pyjaco is nowhere near close enough. I would have to re-implement nearly every data structure in the standard library first. I'm pretty close to giving up on this project at this point since I don't have time to write and test a huge pile of data structures right now.

I ran into issues with pypy sandbox as well, and I reported in #pypy and they fixed it almost immediately (the fix was in HEAD in like 30 minutes). It now runs well on my local machine. Getting it working in a browser plugin would require rewriting the pypy sandbox wrapper (parent process) scripts in c++, which seems like just a few days of work.

@bkase

Wow, nice, I guess I should have checked #pypy. Keep me updated with your progress.

I'm starting to clean up my Empydom project today, so I'll open source that soon.

@bkase

@sunetos @max99x I open-sourced bkase/empydom. Feel free to check it out!

@sunetos

Thanks! I'll check it out soon.

@max99x
replit member

Neat! @amasad may be interested too.

@amasad
replit member

That's pretty cool. Any example websites built with that?

@bkase

You can check srv/script-tag-example.html for a canvas example, but I actually haven't made a good DOM-manipulation example yet.

@amasad
replit member

Looks great! I'll be sure to play with it real soon. There is potential for truly awesome applications of this. Education and cool demos come to mind.
