Implement FileReaderSync #1830
👍 on adding this api. TBH we shouldn't have landed the file api without it in the first place. Are you interested in implementing the api or would you like to leave it for someone else? |
Oh hey Ethan, want too kinda, have the bandwidth... probably not. |
OK! So here is the solution I came up with:

```js
/*! Read blob sync in NodeJS. MIT License. Jimmy Wärting <https://jimmy.warting.se/opensource> */
// Worker Thread (blob-worker.js)
const { workerData } = require('worker_threads')
const { signal, port, blob } = workerData

blob.arrayBuffer().then(ab => {
  // Post the result back to the main thread before unlocking `signal`
  port.postMessage(ab, [ab])
  port.close()
  // Change the value of signal[0] to 1
  Atomics.store(signal, 0, 1)
  // This will unlock the main thread when we notify it
  Atomics.notify(signal, 0)
})
```

```js
/*! Read blob sync in NodeJS. MIT License. Jimmy Wärting <https://jimmy.warting.se/opensource> */
// Module (FileReaderSync.js)
const {
  Worker,
  receiveMessageOnPort,
  MessageChannel,
} = require('worker_threads')

/**
 * @param {Blob} blob
 * @returns {ArrayBuffer}
 */
function read (blob) {
  const subChannel = new MessageChannel()
  const signal = new Int32Array(new SharedArrayBuffer(4))
  signal[0] = 0

  const worker = new Worker('./blob-worker.js', {
    transferList: [subChannel.port1],
    workerData: {
      signal,
      port: subChannel.port1,
      blob,
    },
  })

  // Sleep until the other thread sets signal[0] to 1
  Atomics.wait(signal, 0, 0)
  // Close the worker thread
  worker.terminate()

  return receiveMessageOnPort(subChannel.port2)?.message
}

class FileReaderSync {
  readAsArrayBuffer (blob) {
    this.result = read(blob)
  }

  readAsDataURL (blob) {
    const ab = read(blob)
    this.result = `data:${blob.type};base64,${Buffer.from(ab).toString('base64')}`
  }

  readAsText (blob) {
    const ab = read(blob)
    this.result = new TextDecoder().decode(ab)
  }

  // Should not be used; use readAsArrayBuffer instead
  // readAsBinaryString (blob) { ... }
}

exports.FileReaderSync = FileReaderSync
```

```js
// Main Thread (your code)
const { FileReaderSync } = require('./FileReaderSync.js')

const blob = new Blob(['abc'], { type: 'text/plain' })
const frs = new FileReaderSync()
frs.readAsText(blob)
console.log(frs.result)
``` |
IMO a sync file reader should not be implemented in Node.js at all. |
I think FileReaderSync is a huge footgun - similar to other synchronous apis (like XMLHttpRequest with async=false). Unlike XMLHttpRequest, synchronously reading a blob is only available in workers:
(from its webidl definition) As the person who implemented FileReader initially, I don't think it was a good fit for undici. I think WebSocket, fetch, and even a FormData parser (💨) make sense to have - they revolve around different pieces of HTTP/1.1. FileReader... doesn't, although the logic at the time was fair: undici already implements File (which is part of the same FileAPI spec), why not implement FileReader also? Now, considering that node has its own File implementation - also implemented by me - it would be a much better fit in core. Plus, FileReaderSync likely needs some C++ to be limited to workers (and even read the blob synchronously). |
The base FileReader is a perfectly fine API IMO. Would you be willing to expand on what you mean by "bad practices"? |
For the browser (in the main thread), you could do something like what I posted in this old gist:

```js
const blob = new Blob(['123'])

const xhr = new XMLHttpRequest()
const url = URL.createObjectURL(blob)
// required if you need to read binary data:
xhr.overrideMimeType('text/plain; charset=x-user-defined')
xhr.open('GET', url, false)
xhr.send()

const uint8 = Uint8Array.from(xhr.response, c => c.charCodeAt(0))
const text = new TextDecoder().decode(uint8)
const arrayBuffer = uint8.buffer
const json = JSON.parse(text)
``` |
Probably the same reason as you wrote.

Also, as I mentioned earlier: if we later get support for blobs backed by the fs, then reading those files synchronously will block the main thread for a long time. |
FileReaderSync is fundamentally different from FileReader. You can tell there was consideration while adding it (unlike sync XMLHttpRequest) to not be a footgun - only being exposed in worker threads. FileReader is a perfectly fine, asynchronous API. Plus, every browser and Deno implements it. |
I think that WPT should change their blob/file test cases to use the new blob reading methods, and flesh out the FileReader tests as something separate. |
I don't see any disdain from the Deno maintainers about adding it; actually, they all seemed to support adding it. The reasons you listed for them adding it don't seem negative - actually the opposite: it increases cross-environment compatibility. I don't see the downsides of this, regardless of whether it's legacy.
I don't think that this makes adding it any less valuable. I could see it being added in core while being flagged as legacy, similar to |
Service workers do not have it... |
denoland/deno#5249 (comment) aws-sdk was their compelling reason. also see w3c/ServiceWorker#735 |
Cross-compatibility seems like a strong use case for implementing it as well. Edit:
It's not a service worker because it allows synchronous, blocking work: "any type of synchronous requests must not be initiated inside of a service worker". However it's not really close to anything the browser implements, so you could call it whatever and you'd probably be partially correct. Anyways, that seems like a more compelling reason not to implement FileReaderSync. Even if we did implement it, there's no chance that it would ever get past the tsc, lol. |
OK... so if we did by any means implement a sync file reader (which I think we should not anyways), then maybe it could just be available inside of worker_threads? |
I think one of us (or maybe both of us) is getting confused. I don't think FileReaderSync should be implemented. I think FileReader (the async one) should be, but in nodejs core, not undici. |
Yea, maybe. I have mostly been talking about the sync file reader, but also both. Things we agree on are:
But IMO I don't think there is any reason to keep using the async FileReader either when we already have something better. |
I don't think so - it wouldn't even solve the original feature request. After all, you have to open a worker thread and then wait for the 'data' event, meaning you're... still reading it asynchronously. This might not be an issue if you're in a worker thread already, but it's added complexity for a niche use case. The real solution is to use your version as posted above, which will block the main thread.
Agreed on this completely. Undici should have never implemented any of the FileAPI (File & FileReader), and rather contributed upstream instead. Although, with webidl, implementing everything in undici is far easier and more correct than in node core. |
Oops, typo: I meant *should not*. |
I'll try this and report back; the rest of this conversation was... interesting. |
In my case this polyfill seems to be hanging on the Edit: NVM |
Here's an alternative way that does the bare minimum:

```js
// index.js
const { Worker, receiveMessageOnPort, MessageChannel } = require('worker_threads')
const { join } = require('path')

const blob = new Blob([Buffer.alloc(2 ** 31 - 1).fill('abcdefg')])
const shared = new SharedArrayBuffer(4)
const { port1: localPort, port2: workerPort } = new MessageChannel()
const path = join(__dirname, 'worker.mjs')
const w = new Worker(path, {
  workerData: { shared, blob, port: workerPort },
  transferList: [workerPort]
})

const int32 = new Int32Array(shared)
Atomics.wait(int32, 0, 0)
const { message } = receiveMessageOnPort(localPort)
console.log(message)
```

```js
// worker.mjs
import { workerData } from 'worker_threads'

const { shared, blob, port } = workerData
port.postMessage({
  body: await blob.arrayBuffer()
})

const int32 = new Int32Array(shared)
// Store 1 before notifying, so the main thread cannot sleep forever
// if the notification fires before it reaches Atomics.wait
Atomics.store(int32, 0, 1)
Atomics.notify(int32, 0)
``` |
Looking at your code, I just noticed that one file is CJS and the other is ESM. Then I can't help but think that there should be an option for `new Worker(url, { type: 'module' })`. |
Got it working for my case, thanks for the polyfill @jimmywarting. I will leave this open depending on what the maintainers decide; feel free to close it or change it over to a discussion, my issue is resolved. |
You're welcome! |
Just want to share something with you that I have just created. Let me introduce: to-sync. Now if you want to read the blob in a sync manner, it is as simple as doing:

```js
import { createWorker } from 'to-sync'

const syncify = createWorker()
const read = syncify(async blob => {
  const ab = await blob.arrayBuffer()
  return new Uint8Array(ab)
})

console.log(read(new Blob(['abc']))) // Uint8Array
```

Or if you like it tiny:

```js
import { createWorker } from 'to-sync'

const readSync = createWorker()(blob => blob.arrayBuffer().then(ab => new Uint8Array(ab)))

console.log(readSync(new Blob(['abc']))) // Uint8Array
```

Makes all this abstraction way easier :) |
**This would solve...**
Currently I am doing something that involves keeping everything out of Promise land, but it requires reading a Blob.

**The implementation should look like...**
It should look much like FileReader, but with sync behavior.
https://developer.mozilla.org/en-US/docs/Web/API/FileReaderSync

**I have also considered...**
Currently attempting to make a workaround with the FileReader available in Undici as of now.

**Additional context**
ex.