Skip to content

Conversation

@satoshinm
Copy link
Contributor

@satoshinm
Copy link
Contributor Author

This is functional, but I have a question about the implementation strategy. Currently the dropped files are written to emscripten's in-memory filesystem at the root level. This works but would be it be preferable to FS.mkdir a new directory to save the dropped files, if so at what path? (Arbitrary, configurable? /tmp/drops?)

Also, considered lazily-loading the files (so the application gets the glfw drop callback immediately, before the files are read), but createLazyFile says "Firefox and Chrome have recently disabled synchronous binary XHRs, which means this cannot work for JavaScript in regular HTML pages", and there is a WORKERFS in emscripten which looks promising (pass it a File), but it too cannot run outside of web workers due to the synchronous operations. This could be further optimized but for now I read the entire contents of the file and write to the emscripten filesystem on file drop, before calling the callback.

satoshinm added a commit to satoshinm/NetCraft that referenced this pull request May 7, 2017
emscripten-core/emscripten#5206 Implement glfw file drop API
satoshinm/emscripten@26262a8

git diff 1.37.9 src/library_glfw.js > ~/games/wasm/NetCraft/src/emscripten-1.37.9+netcraftfixes.patch

event.preventDefault();

var filenames = allocate(new Array(event.dataTransfer.files.length*4), 'i8*', ALLOC_NORMAL);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why files.length*4?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, this is writing a string array, *4 is for a pointer size, nm.

var filename = allocate(intArrayFromString(file.name), 'i8', ALLOC_NORMAL);
filenamesArray.push(filename);
setValue(filenames + i*4, filename, 'i8*');
})(event.dataTransfer.files[i]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of an anonymous function looks odds here? Is there a specific reason for that? Would it be possible to just do var file = event.dataTransfer.files[i]; in the beginning of the loop body, that would not generate extra functions on the fly and would read a bit simpler?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IIFE is to pass the correct file in each iteration of the loop to the reader.onload closure, without it var file would lexically capture only the last file.

The usual way to solve this is use .forEach(function() { ... }) but unfortunately event.dataTransfer.files is not an Array but a FileList so it doesn't have .forEach, leading to this awkward construction.

I'll see if I can refactor it to avoid the IIFE...

reader.onload = function(e) {
var data = e.target.result;
var path = file.name; // TODO: to a new directory?
FS.writeFile(path, new Uint8Array(data), { encoding: 'binary' });
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behavior with dropping files on the web will differ quite a bit from that of dropping files in a native GLFW application, in that in a native app, the file already exists on the filesystem and no new files are created, but in here, we create a new file to the filesystem. This file is not cleaned up anywhere, so it is up to the receiver of the file to unlink() all the files it receives in the handler (especially if it is not interested in any of them). Is this desired behavior? If so, it would be good to document this behavioral difference, since that can be a big source of memory leaks since GLFW applications will need to know about this new unlink() responsibility for all undesired files.

Alternatively, if it would be better that this handler would FS.unlink() all the files it adds, then that's probably also good to document that GLFW applications know not to depend on the files being present outside the drop handler. Which behavior do you think makes more sense?

Copy link
Contributor Author

@satoshinm satoshinm May 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a good question, I'm not sure. Both approaches have their merits.

If emscripten automatically deletes the files after the callback, then possible memory leaks would be avoided. However, this would require the application finish using the app after the glfwSetDropCallback callback returns, plausible for simpler apps (I think mine will cope with it just fine), but it is not a limitation documented by the glfw API that the file cannot be used after the callback returns. Native apps reasonably would expect the file still to be usable, perhaps performing some multi-threaded processing afterwards, though I don't know how common this is.

Requiring the app to unlink the file itself would introduce the need for #ifdef __EMSCRIPTEN__ unlink(file) to fix possible memory leaks, but on the other hand would allow apps which depend on the file existing after the call to function. This seems safer and more compatible, leaning towards this second option.

update: added the unlink in example

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. If applications are expected to delete the files themselves, then they can keep them around if needed later, but if applications do want to need them later and GLFW automatically deletes it, then apps would need to make a copy or use some other API to ask not to delete, so I think I like placing the responsibility of deleting to the app as well.

var reader = new FileReader();
reader.onload = function(e) {
var data = e.target.result;
var path = file.name; // TODO: to a new directory?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of filenames do you see here? I presume file.name is the name of the file without any paths? So e.g. dragging a file c:\path\to\file.txt would show file.txt as file.name here? In that case, using that as a path directly would mean this creates a new file in the root, e.g. /file.txt. Is that desired, especially if the cleanup responsibility of the files is on the receiver? What if such a file (directory?) already exists on the filesystem, it looks like this would silently overwrite it? Would it make more sense to have a directory specific to GLFW file drops, e.g. /.glfw_dropped_files/ + file.name.replace(/\//g,'_') or something like that?

Copy link
Contributor Author

@satoshinm satoshinm May 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my testing (latest Chrome, Firefox, and Safari all on macOS), file.name is the base name (e.g. file.txt). Changed to save in /.glfw_dropped_files/ and replace slashes as suggested.

Overwriting existing files of the same name is another open question. Should the library take care to avoid overwriting a previously-dropped file with an identical name? Maybe, maybe not. Could check if the file exists, if so, change the filename/path... or choose a random filename/directory, but we want to preserve the filename as much as possible since it may contain vital information. Overwriting files with the same name seems not too unreasonable to me, a common use case could be repeatedly dragging and dropping the same file, after editing it on the user's system, so overwriting would be useful/expected (assuming the ported app did not add the #ifdef __EMSCRIPTEN__ ... unlink() call yet, after handling the dropped file). I'd lean towards simplicity here... allow to overwrite?


Thinking about this more, if a randomly-named temporary directory is created for each drop to avoid filename collisions, then the application would have to know to delete not only the file itself, but also the containing directory. I'd argue the files should just be written into /.glfw_dropped_files/ with the given name (sanitized), if an app wants to support multiple files of the same name, they could easily rename the dropped files or process them immediately in the callback, and delete afterwards.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good reasoning, and that probably makes most sense.

reader.onload = function(e) {
var data = e.target.result;
var path = file.name; // TODO: to a new directory?
FS.writeFile(path, new Uint8Array(data), { encoding: 'binary' });
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a user drops multiple files here, of which one is large enough to cause an out of memory situation here so that new Uint8Array() will throw out of memory, then all the files that have been dropped so far will currently leak. It would be good to have some kind of error handling that would perhaps ignore all files that failed to be read, and call the drop handler with the ones that did succeed, and make sure that even with errors, no files or filename strings will be leaked?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that new Uint8Array creates a view on the existing ArrayBuffer already passed in so I wouldn't expect it to fail here. Testing with emcc test_glfw_dropfile.c -o test_glfw_dropfile.html -s USE_GLFW=3 -s TOTAL_MEMORY=16MB, I was able to drop a 100 MB file in the test program and read it successfully. The failure may occur in C if the program tries to read it all into memory, etc. I'll have to see what I can do to trigger an error; if onload isn't called for all files it may never call the callback and leak the filenames. The loadend event looks promising (called after error/abort/load).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to use loadend, and check readyState, the callback should now be called even if a file fails to be written. It will be called with all the filenames dropped, the app will have to check if the file doesn't exist and handle this error condition accordingly. However I wasn't able to trigger a read failure. I tried dropping several large files (100+ MB) and they all were read successfully, albeit slowly. I could even drop 1 GB file no error.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that new Uint8Array creates a view on the existing ArrayBuffer

Err, that is of course correct. Thanks for testing this, agreed.


event.preventDefault();

var filenames = allocate(new Array(event.dataTransfer.files.length*4), 'i8*', ALLOC_NORMAL);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading https://developer.mozilla.org/en-US/docs/Web/API/HTML_Drag_and_Drop_API, I think it's possible to drag and drop all kinds of elements over a DOM element, e.g. text, images (other DOM elements) and files. In that case, a dataTransfer field might not be present? Will this throw an exception in that case on accessing event.dataTransfer.files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dataTransfer is will still be present, but files.length may be zero (tested dragging and dropping the emscripten logo image from the shell html onto the canvas). Couldn't hurt to add a defensive check, added one to return early instead of trying to allocate 0 bytes.

for (var i = 0; i < count; ++i) {
(function(file) {
var reader = new FileReader();
reader.onload = function(e) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to handle reader.onerror as well, and e.g. ignore that file? That could occur possibly on e.g. an OOM situation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to use loadend, which encompasses load/error/abort, and checked the readyState.

onDragover: function(event) {
if (!GLFW.active.dropFunc) return;

event.preventDefault();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason to having event.preventDefault(); here in onDragover but not in onDragend? That's fine if so, but just curious to know what the reason for having it is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out listening for the dragend event is unnecessary altogether; I removed it. preventDefault() in dragover prevents the browser from redirecting to the file on drop.

},

onDragend: function(event) {
if (!GLFW.active.dropFunc) return;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible that GLFW.active might be null if there is no active window, so these checks would be good to read as if (!GLFW.active || !GLFW.active.dropFunc) return;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added if (!GLFW.active || !GLFW.active.dropFunc) return; checks

while ((c = fgetc(fp) != -1)) {
++size;
}
printf("read %ld bytes from %s\n", size, paths[i]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great test! This does not actually do anything with the contents of the file though, so perhaps it would be good to print out e.g. 10 first and 10 last characters of the file, if one drops a text file, so it'll allow verifying that the actual bytes were transmitted correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added printing first 100 bytes if filename contains .txt, think that's good enough for testing. A cooler demo may be to have a 3d scene and load dropped files as textures or object models..but not for this pr :)

}
printf("read %ld bytes from %s\n", size, paths[i]);

fclose(fp);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

People often use these tests as examples of how the added APIs are supposed to work, so here might be a good location to have a comment about the file being left around, and unlink() if not needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added unlink and explanatory comment.

@satoshinm
Copy link
Contributor Author

Updated for all review feedback, I think that's everything, couple notes on current support:

  • Now writes to /.glfw_dropped_files/, replacing / with _, to avoid writing files in root
  • Files dropped with the exact same filename will be overwritten if the app didn't delete them, if the app wants to handle this scenario (very edge case) it can rename or move the files or process and delete immediately
  • It is up to the app developer/porter to unlink() files after finishing processing e.g. in the drop callback (shown in example) to avoid memory leaks

@satoshinm
Copy link
Contributor Author

@juj Any other changes needed or can this be merged?

@juj
Copy link
Collaborator

juj commented Jun 30, 2017

Thanks, checked out the test and looks great now! Sorry for the delay in returning to the review.

@juj juj added the GLFW label Jun 30, 2017
@juj juj merged commit 7807ee4 into emscripten-core:incoming Jun 30, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants