
fix(slice): missing throw when invalid positions #1

Closed
aboqasem wants to merge 1 commit

Conversation

aboqasem

No description provided.

@Aschen
Owner

Aschen commented May 12, 2023

Hey thanks for your contribution!
I'm on holiday but I'll have a look at the beginning of next week

@aboqasem
Author

Hey! No worries, have a great holiday! Just a small thing I noticed while browsing GitHub 😆.

@jimmywarting

jimmywarting commented May 22, 2023

Actually, no... it should not throw an error when using out-of-range positions like negative values.

This works in browsers:

await new Blob(['abc']).slice(-1).text() // c

I would honestly just stick to using fs.openAsBlob(path, { type }) or an already fully-featured, battle-tested Blob library that has been doing this for a really long time: it's called fetch-blob, and it's more compatible and even more spec-compliant than Node.js itself. For instance, stream() should return a byte stream that supports BYOB readers, and Node.js has issues with blob.type being cast to all lowercase. I don't blame Node.js for this; it's actually an ongoing issue in the spec whether the MIME type key/value should stay intact as given. And I don't see that you have any support for this either.

I have corrected Node.js in many areas of their Blob implementation.

fetch-blob is bundled into node-fetch, and many people are already using it today if you just search for some easy keywords on GitHub.

I also don't see that LazyBlob has any support for type, other than that you have to set it yourself afterwards:

const lazyBlob = await LazyBlob.create("package.json")
lazyBlob.type = 'text/should-not-work' // this should be read-only...
lazyBlob.slice(0, 2, 'text/plain') // Not supported

What's more? It doesn't have any safeguard that checks whether or not the file has changed before starting a read stream.
And there is also one more thing: LazyBlob doesn't work with the built-in FormData;
my fetch-blob does.

import { fileFromSync, File, Blob } from 'fetch-blob/from.js'
import { LazyBlob } from "lazyblob"

const file = fileFromSync('./package.json', { type: 'application/json' })
file.slice(-10, -5, 'text/plain') // this is acceptable
const fd = new FormData()
fd.set('fileUpload', file)
await fd.get('fileUpload').text() === await file.text() // this would not have been true if you had used lazyblob

You can also do more things with fetch-blob, like concatenating multiple blob look-a-like items into one blob:

import { fileFromSync, blobFrom, File, Blob } from 'fetch-blob/from.js'
import { openAsBlob } from 'node:fs'
import { LazyBlob } from 'lazyblob'

new Blob([
  new globalThis.Blob(['prefix NodeJS blob']),
  await openAsBlob('./movie.mp4', { type: 'video/mp4' }),
  fileFromSync('./package.json', {type: 'text/plain'}),
  await blobFrom('./package.json').slice(-10),
  new File(['fetch-blob'], 'dummy.txt'),
  await LazyBlob.create('./package.json'),
  'str'
]).stream().getReader({mode: 'byob'}).read(...)

You can even correct LazyBlob's mistake by wrapping it in our fetch-blob implementation, hiding it from plain view: fetch-blob normalizes the start/end arguments before handing them to the wrapped lazyBlob.slice() method.

var { LazyBlob } = await import('lazyblob')
var { File } = await import('fetch-blob/from.js')
var lazyBlob = await LazyBlob.create('./package.json')
var file = new File([lazyBlob], '') // hide and wrap lazyBlob into a fetch-blob
await file.slice(-10).text() // slice now works flawlessly and behaves correctly
// and it now also works with the built-in FormData (which it did not in the example above)
var fd = new FormData()
fd.set('x', file)
await fd.get('x').text()

@Aschen
Owner

Aschen commented May 24, 2023

@jimmywarting I see you have done a lot of work on Node.js's internal Blob, and thanks for that!

I'm wondering if the Blob implementation in Node or in browsers addresses the issue of loading entire files into RAM? This was the primary concern of this (very small) library.

@jimmywarting

jimmywarting commented May 24, 2023

Browsers handle this perfectly.

  • if you select something from <input type=file>, it will just be a pointer to the disk; same thing with FileSystemHandle.prototype.getFile()
    • it will not be read into memory
    • changing the content of the file, or even just its last modified date, will make it so that you can't read the content anymore
  • when you request a blob with await response.blob() and it turns out to be larger than 10 MB, it will be stored in a temporary location on disk and offloaded from memory

Chrome's Blob Storage System Design

Node.js's newly shipped await fs.openAsBlob(path, { type }) also just creates an empty blob handle that points to where the data is located on the disk; nothing is read into memory. It's very cheap to create, as it only stats the file for the last modified date and the file size. The only problem with them right now is that they aren't transferable over postMessage.

openAsBlob is essentially a "lazy" blob.
And if you were to do:

const lazyBlob = await fs.openAsBlob(path)
new Blob([lazyBlob, lazyBlob])
  .slice(0, lazyBlob.size + (lazyBlob.size / 2))

then it will contain two references pointing to where this lazy blob is located and re-read it twice (actually 1.5 times, since the slice changes the size/end position).
Node.js does more or less the same thing as I do with the slice method.

  /**
   * The Blob interface's slice() method creates and returns a
   * new Blob object which contains data from a subset of the
   * blob on which it's called.
   *
   * @param {number} [start]
   * @param {number} [end]
   * @param {string} [type]
   */
  slice (start = 0, end = this.size, type = '') {
    const { size } = this

    let relativeStart = start < 0 ? Math.max(size + start, 0) : Math.min(start, size)
    let relativeEnd = end < 0 ? Math.max(size + end, 0) : Math.min(end, size)

    const span = Math.max(relativeEnd - relativeStart, 0)
    const parts = this.#parts
    const blobParts = []
    let added = 0

    for (const part of parts) {
      // don't add the overflow to new blobParts
      if (added >= span) {
        break
      }

      const size = ArrayBuffer.isView(part) ? part.byteLength : part.size
      if (relativeStart && size <= relativeStart) {
        // Skip the beginning and change the relative
        // start & end position as we skip the unwanted parts
        relativeStart -= size
        relativeEnd -= size
      } else {
        let chunk
        if (ArrayBuffer.isView(part)) {
          chunk = part.subarray(relativeStart, Math.min(size, relativeEnd))
          added += chunk.byteLength
        } else {
          chunk = part.slice(relativeStart, Math.min(size, relativeEnd))
          added += chunk.size
        }
        relativeEnd -= size
        blobParts.push(chunk)
        relativeStart = 0 // All next sequential parts should start at 0
      }
    }

    const blob = new Blob([], { type: `${type}` })
    blob.#size = span
    blob.#parts = blobParts

    return blob
  }

And my BlobDataItem is essentially what you have already created and call a "lazyBlob": something that reads from the fs. Mine is just a bit different; I did kind of follow Chrome's design. It does have multiple types of blob parts that are lazily read, like the one you could e.g. get from response.blob(), given that the response includes a content-length and isn't compressed.

They are all just hidden from plain sight; internally they do support more blob parts.
Node.js and Deno actually do it the same way: they create another blob look-a-like part and accept it as a possible part of the constructor new Blob([ blobLike ])

import { createReadStream } from 'node:fs'
import { stat } from 'node:fs/promises'

/**
 * This is a blob look-a-like backed by a file on the disk,
 * with minimum requirements. It's wrapped inside a Blob as a blobPart,
 * so you have no direct access to it.
 * This is done so that the outer Blob can provide arrayBuffer() and text(),
 * as well as creating a readable byte stream, from something as simple as
 * an async iterator.
 *
 * @private
 */
class BlobDataItem {
  #path
  #start

  constructor (options) {
    this.#path = options.path
    this.#start = options.start
    this.size = options.size
    this.lastModified = options.lastModified
    this.originalSize = options.originalSize === undefined
      ? options.size
      : options.originalSize
  }

  /**
   * Slicing arguments are first validated and clamped
   * into range by Blob.prototype.slice
   */
  slice (start, end) {
    return new BlobDataItem({
      path: this.#path,
      lastModified: this.lastModified,
      originalSize: this.originalSize,
      size: end - start,
      start: this.#start + start
    })
  }

  async * stream () {
    const { mtimeMs, size } = await stat(this.#path)

    if (mtimeMs > this.lastModified || this.originalSize !== size) {
      throw new DOMException('The requested file could not be read, typically due to permission problems that have occurred after a reference to a file was acquired.', 'NotReadableError')
    }

    yield * createReadStream(this.#path, {
      start: this.#start,
      end: this.#start + this.size - 1
    })
  }

  get [Symbol.toStringTag] () {
    return 'Blob'
  }
}
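The stale-file guard in stream() above can be isolated into a tiny sketch. The helper names (snapshotFile, isStale) are mine, not part of any library; the idea is the same: snapshot mtime/size when the blob is created, then refuse to read if either has changed since.

```javascript
import { stat, writeFile, appendFile } from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// Hypothetical helpers illustrating BlobDataItem's guard:
// record mtime/size at creation time...
async function snapshotFile (path) {
  const { mtimeMs, size } = await stat(path)
  return { path, mtimeMs, size }
}

// ...and before streaming, detect whether the file changed underneath us.
async function isStale (snapshot) {
  const { mtimeMs, size } = await stat(snapshot.path)
  return mtimeMs > snapshot.mtimeMs || size !== snapshot.size
}

const path = join(tmpdir(), 'guard-demo.txt')
await writeFile(path, 'original content')

const snap = await snapshotFile(path)
console.log(await isStale(snap))   // false: nothing changed yet

await appendFile(path, ' plus more')
console.log(await isStale(snap))   // true: size (and mtime) changed
```

A real implementation would throw a NotReadableError DOMException here, as BlobDataItem.stream() does above.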

@Aschen
Owner

Aschen commented May 24, 2023

Ok got it !

I didn't know about fs.openAsBlob, but I guess that's because it's only available in Node.js 20.

Actually we also developed a WebBlob which represents a file available through HTTP, wdyt?

(Also it seems that our custom Blob cannot be used as fetch body, maybe for the reason you cited earlier)
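For readers curious what such an HTTP-backed blob can look like, here is a minimal sketch. The class name and shape are mine, not the actual WebBlob API; it assumes the server honors HTTP Range requests. Only the slice bookkeeping happens eagerly, while bytes are fetched lazily in stream():

```javascript
// Hypothetical HTTP-backed, lazily-read blob part (not the real WebBlob).
// Slicing only adjusts offsets; a Range request is issued when streamed.
class HttpBlobPart {
  #url; #start; #size

  constructor (url, start, size) {
    this.#url = url
    this.#start = start
    this.#size = size
  }

  get size () { return this.#size }

  // start/end are assumed already normalized (as Blob.prototype.slice does)
  slice (start, end) {
    return new HttpBlobPart(this.#url, this.#start + start, end - start)
  }

  async * stream () {
    const res = await fetch(this.#url, {
      headers: { range: `bytes=${this.#start}-${this.#start + this.#size - 1}` }
    })
    if (res.status !== 206) throw new Error('server ignored the Range header')
    yield * res.body
  }
}
```

Wrapped as a blob part inside a real Blob (the trick fetch-blob uses above), this would give text()/arrayBuffer() for free while never holding more than one chunk in memory.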

@jimmywarting

I didn't know about fs.openAsBlob, but I guess that's because it's only available in Node.js 20.

Yeah, it's relatively new; it came just a few weeks ago.

Actually we also developed a WebBlob which represents a file available through HTTP, wdyt?

Man... I tried to do something like this ages ago, see transcend-io/conflux#31 (comment).
Not as complete as your WebBlob, but anyhow...

(Also it seems that our custom Blob cannot be used as fetch body, maybe for the reason you cited earlier)

I guess the reason could be that it's not recognized as a Blob or a File.
What node-fetch does to confirm that something is a blob is to check that it also has a Symbol.toStringTag, among other things:

// NAME is node-fetch's internal alias for Symbol.toStringTag
const NAME = Symbol.toStringTag;

export const isBlob = object => {
	return (
		object &&
		typeof object === 'object' &&
		typeof object.arrayBuffer === 'function' &&
		typeof object.type === 'string' &&
		typeof object.stream === 'function' &&
		typeof object.constructor === 'function' &&
		/^(Blob|File)$/.test(object[NAME])
	);
};

Others (fetch-blob & undici) just do this smaller check to accept them as blobs or blob parts:

    return (
      typeof object?.constructor === 'function' &&
      typeof object.stream === 'function' &&
      /^(Blob|File)$/.test(object[Symbol.toStringTag])
    )

https://github.com/nodejs/undici/blob/c5f3bdda4214b172e578b6adb933b1b4a7858195/lib/core/util.js#LL21C1-L30C2
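These structural checks mean any object with the right shape is accepted, which is exactly why a custom Blob class can work with Node's fetch. A minimal illustration (isBlobLike here is my paraphrase of the linked undici check, not its exact code):

```javascript
// A plain object passes a structural "blob-like" check if it quacks right.
const isBlobLike = (object) =>
  typeof object?.constructor === 'function' &&
  typeof object.stream === 'function' &&
  /^(Blob|File)$/.test(object[Symbol.toStringTag])

const fake = {
  stream () { return new ReadableStream() },
  get [Symbol.toStringTag] () { return 'Blob' }
}

isBlobLike(fake)         // true: the shape is enough, no Blob inheritance needed
isBlobLike({ size: 0 })  // false: no stream() and no Symbol.toStringTag
```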

@aboqasem
Author

Thanks for the thorough explanation and for your work on Node.js, @jimmywarting!

From what I considered only a typo fix, I've learned a lot of things 😄!

Thanks @Aschen and @jimmywarting!

@aboqasem aboqasem closed this May 28, 2023
@Aschen
Owner

Aschen commented May 31, 2023

@jimmywarting Unfortunately, the WebBlob is not very useful if you cannot use it with the browser's native fetch API.

I tried the isBlobLike function from undici and it returns true, which is no surprise since the WebBlob works perfectly with Node.js's fetch implementation.

I need to look into the Chromium code to understand why it is not working, despite the fact that it extends the original Blob class (reproducible example here).

Anyway, thanks for all the insights, I really appreciate it 😄

And thanks @aboqasem for opening this PR and starting this conversation, beauty of open source I guess :-)

@jimmywarting

Cool project/repo!
I'm going to post something there.
