This looks like a super interesting PR. I have recently been thinking about how to create a VFS for sql.js that would use Asyncify, Blobs, and IndexedDB to persist data reliably and store data larger than what fits into memory, without having to jump through the hoops of the existing synchronous file system implementations. This looks like it could be a good solution to that problem, with a much larger impact. I see there has not been much activity on this PR in the last few weeks, so if you're looking for help and willing to lay out what you think needs to be done to get this merged, I'd love to help. |
|
Thanks @jgautier! How does the API look to you? It would be good to know if this is in the right direction before moving forward with it. Also it would be nice if there was some more serious example using it, but not sure we need to block on that. |
|
I think the API is looking fine. One change could be to use promises instead of a wakeUp callback, but I think that just depends on whether emscripten is moving towards promises rather than callbacks (I noticed async ccall is now using Promises). It seems pretty low level, so I don't know if people would be able to create new file systems without a fair amount of knowledge of emscripten, but I am guessing most people would be using one of these async file systems rather than implementing one anyway. I started work on a nodejs async file system (very much a work in progress) https://github.com/jgautier/emscripten/blob/asyncfs/src/library_nodefs_async.js . It seemed like a pretty good place to start. Is this something that you would imagine in emscripten core? Once I get this one working I am planning on creating one for the web using indexeddb. Also, I noticed the following syscalls were not implemented. Was this for a reason, or had you just not gotten around to it yet? warning: undefined symbol: __syscall15 |
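The promise-versus-callback question above need not be either/or. As a minimal sketch, assuming each hook receives its `wakeUp` callback as the last argument (an assumption, not something the PR specifies), a small adapter can wrap a Promise-returning function so that it satisfies a callback-style interface. The name `withWakeUp` and the `-1` failure value are hypothetical.

```javascript
// Hypothetical adapter: wraps a Promise-returning hook so that it can be
// called in wakeUp-callback style. Assumes wakeUp is the last argument.
function withWakeUp(promiseHook) {
  return (...args) => {
    const wakeUp = args.pop();
    promiseHook(...args).then(
      (result) => wakeUp(result), // resume compiled code with the result
      () => wakeUp(-1)            // placeholder error value, not a real errno
    );
  };
}

// Usage: an async function becomes a callback-style hook.
const pathLength = withWakeUp(async (path) => path.length);
pathLength('/tmp/data.bin', (result) => {
  console.log('resumed with', result);
});
```

With an adapter like this, implementations could be written in either style while the core interface stays callback-based.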
|
Promises are an interesting idea. Would we lose any backwards compatibility if we change the API that way? I guess we can always use a Promise polyfill for older VMs. If that sounds good I can rework this PR towards that. A node async filesystem sounds great! Both useful and could be a good reference example. Definitely makes sense to have in upstream. Those syscalls are just things I didn't get around to - but not sure if something in them may be tricky or not. In general I was hoping we need very few syscalls for this - open/close/read/write/seek basically. Do you expect many more to be used? Are those syscalls you expect to use? |
|
I don't think using promises would be a problem compatibility-wise. caniuse has promises at 94%, and the oldest supported version of node, v8.0, supports them. And we could always include a polyfill. For the node async file system I am currently building, I don't think promises help in any way, because all of the file system APIs still use callbacks, so I think it's fine to stick with callbacks for now. It can always be changed later, or we could support both callbacks and promises. I ran into a bit of a blocker in implementing the llseek function that I was hoping you could help me sort out. llseek repositions the read/write offset of a file, and currently there is no node.js file system API that does this. Maybe we could call the The other file systems, when doing llseek, get passed a stream object that has a position that can be set, as seen here: https://github.com/emscripten-core/emscripten/blob/incoming/src/library_fs.js#L1114 and https://github.com/emscripten-core/emscripten/blob/incoming/src/library_nodefs.js#L265 but the AsyncFSImpl does not have a stream parameter https://github.com/emscripten-core/emscripten/pull/9151/files#diff-6fbef50c0bc777ac7ab2342da03ba23fR98. I don't think I am familiar enough with emscripten internals to clearly see how the AsyncFSImpl can or should integrate with the rest of the emscripten file systems. Regarding the missing syscalls, the only reason I mentioned them was that when I was compiling things on this branch, emcc warned that they were not defined. I haven't looked into it yet, but it seems like something that should be addressed before this is merged. |
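One possible way around the llseek blocker, sketched under assumptions: Node's `fs.read` accepts an explicit `position` argument, so an async file system can track the offset itself in JS and never ask Node to seek. The class `PositionedStream`, its method signatures, and the errno values below are hypothetical and not part of the PR.

```javascript
const fs = require('fs');

// Standard whence values from <stdio.h>.
const SEEK_SET = 0, SEEK_CUR = 1, SEEK_END = 2;
// Illustrative errno values only; a real implementation would use the
// values defined by the target libc.
const EIO = 5, EINVAL = 22;

// Hypothetical per-file state: the file system tracks the offset in JS.
class PositionedStream {
  constructor(fd) {
    this.fd = fd;
    this.position = 0;
  }

  // Hypothetical llseek hook: calls wakeUp(newPosition) or wakeUp(-errno).
  llseek(offset, whence, wakeUp) {
    const finish = (base) => {
      const next = base + offset;
      if (next < 0) return wakeUp(-EINVAL);
      this.position = next;
      wakeUp(next);
    };
    if (whence === SEEK_SET) return finish(0);
    if (whence === SEEK_CUR) return finish(this.position);
    if (whence === SEEK_END) {
      // Only SEEK_END needs to ask the host for the file size.
      return fs.fstat(this.fd, (err, stats) =>
        err ? wakeUp(-EIO) : finish(stats.size));
    }
    wakeUp(-EINVAL);
  }

  // Hypothetical read hook: reads at the tracked offset, then advances it.
  read(buffer, length, wakeUp) {
    fs.read(this.fd, buffer, 0, length, this.position, (err, bytesRead) => {
      if (err) return wakeUp(-EIO);
      this.position += bytesRead;
      wakeUp(bytesRead);
    });
  }
}
```

In this sketch SEEK_SET and SEEK_CUR resolve without touching the host file system at all, which is one argument for leaving seek to each implementation.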
|
For seek in node.js, it looks like I'm not sure if we can write such a general layer that would help everything. Leaving it to each implementation may be the most general thing. |
I think this PR could be nicely updated on top of #11429, which added |
|
I'm interested in this in order to implement a filesystem that uses browser File System Access API. I haven't tried it yet, I'll give it a look! I think a file system access API FS could be mounted at specific paths for text/image editing, databases or game saves. |
|
I'd like to use something like this in the future; however, I'd be concerned that it may be too low level. I'd hope that a layer could also be provided that would bring the API up to the adaptability level of the current FS design, if not even higher. |
|
Thanks for the feedback @thomasballinger @curiousdannii ! Yes, whether this is too low-level is a good question. May be worth experimenting with it more before we settle on an API. |
|
This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant. |
|
Is this something that is still being worked on, or something that will fall under the work done on WasmFS? Being able to use Asyncify to implement an async FS would be super cool. My goal is to render a loading status / progress bar with CSS/JS on top of an app that was built for synchronous reading from a hard disk but now freezes the main loop and all rendering when fetching data from the network via a synchronous XHR call, without any way to tell the user what’s happening. |
|
@chkuendig This should fall under WasmFS, when that is stable. Here is the most relevant work so far: Closing as this PR is no longer needed. |
We've allowed customizing the JS for the filesystem (and we have MEMFS, IDBFS, etc.), and all of those are written synchronously, the way code normally runs. Asyncify can make compiled code pause and resume, but the system JS code for our default filesystem isn't written to be callback-based. So using Asyncify to wait on the result of a syscall (that internally does some async operations) has not been possible.
This PR provides an option for that, ASYNCFS, which replaces the normal syscall interface with one that integrates with Asyncify. That lets the JS side do any async operation you want before returning to the C code which appears to wait synchronously for the result.
To use this, you need to implement JS hooks for the various syscalls from scratch (this can't build on the existing filesystem code because that is synchronous). Each hook gets a callback for resuming execution, so this is fully async and compatible with Asyncify. The test provides an example.
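To make the hook idea concrete, here is a minimal sketch of what a set of callback-based hooks could look like. It is not the interface from this PR: the object name `ExampleAsyncFS`, the hook signatures, and the `-1` error value are all hypothetical, and `setTimeout` stands in for a real async backend such as IndexedDB or the network.

```javascript
// Hypothetical in-memory backend. Every hook finishes by calling wakeUp,
// which in a real build would resume the paused compiled code.
const files = new Map([
  ['/greeting.txt', new Uint8Array([104, 101, 108, 108, 111])], // "hello"
]);
const openFiles = new Map(); // fd -> { data, position }
let nextFd = 3;

// Stand-in for a real async operation.
const later = (fn) => setTimeout(fn, 0);

const ExampleAsyncFS = {
  open(path, wakeUp) {
    later(() => {
      const data = files.get(path);
      if (!data) return wakeUp(-1); // placeholder error value
      const fd = nextFd++;
      openFiles.set(fd, { data, position: 0 });
      wakeUp(fd);
    });
  },
  read(fd, buffer, length, wakeUp) {
    later(() => {
      const file = openFiles.get(fd);
      if (!file) return wakeUp(-1);
      const count = Math.min(length, buffer.length);
      const chunk = file.data.subarray(file.position, file.position + count);
      buffer.set(chunk);
      file.position += chunk.length;
      wakeUp(chunk.length); // number of bytes read, 0 at end of file
    });
  },
  close(fd, wakeUp) {
    later(() => wakeUp(openFiles.delete(fd) ? 0 : -1));
  },
};

// Usage: each step continues only once the previous wakeUp has fired.
const buffer = new Uint8Array(16);
ExampleAsyncFS.open('/greeting.txt', (fd) => {
  ExampleAsyncFS.read(fd, buffer, buffer.length, (bytesRead) => {
    console.log('read', bytesRead, 'bytes');
    ExampleAsyncFS.close(fd, () => {});
  });
});
```

From the C side, each of these calls would look like an ordinary blocking syscall; only the JS side sees the callbacks.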
Also: