Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request to check if FileHandle/DirectoryHandle exists before executing read/write operations #80

Open
poirierlouis opened this issue Dec 21, 2022 · 7 comments

Comments

@poirierlouis
Copy link

After a small talk on Matrix chat (@annevk), I'd like to present a problem and use-case regarding handles when they no longer exists on user's device.

Currently, read/write access operations on FileSystemFileHandle and FileSystemDirectoryHandle are the only way to test if a file or directory entry exists on user's device. I've been explained there is no exists() like-method on these interfaces due to the TOCTTOU principle. Considering it, here is a problem where I believe current state of API prevent any chance at green code practice:

A. Use-case on file:

  1. Get handle of a file (it can be from IndexedDB storage, a dialog picker...)
  2. Prepare data (run long computation / algorithm, e.g. compress data, serialize data...)
  3. Write data in file

Access to underlying file will happen at step (3), and when file does not exist, it rejects with a NotFoundError.
In such case, there is no way to save time and device's resources. It will run, depending on the use-case, a long computation / algorithm to prepare data. Only once data is ready, it will try to write in file, which may or may not exist.

B. Use-case on directory:

  1. Get handle of a directory
  2. Prepare structure of a project
    • list directories
    • list files
    • prepare data of files
  3. Create prepared directories in (1)
  4. Create prepared files in (1)

In the same situation, step (2) could be prevented if directory entry is already known to not exist.

Problem:

There is no way nor "chance" given at the end-user to prevent execution of step (2). This seems to be a waste regarding the device power/resources, and regarding the time of the user of the application.

Let me know if anything is not clear.
What do you think?

Aside: is TOCTTOU like a general no-go rule when writing a specification? Or can it be argued on depending on use-case, the technicality of such, I think, edge-cases? While not recommended, for example Node.js expose an exists() like-method.

@dslee414
Copy link
Collaborator

I can see the argument that exists() like method may be useful, even though it won't guarantee the existence of file/directory at the time of update, as you mentioned. For example, exists() check may be done periodically while preparing the data, especially if it takes a while / is computation-heavy.

This might be relevant to getMetadata() support requested in #12. Could null metadata mean that the file/directory does not exist? or would it make more sense to have exists() method separately?

FWIW, currently FileSystemFileHandle.getFile() could tell us about the existence of the underlying file. The spec mentions "the returned File object will likely be no longer readable." This is a bit strange to me, as it does not throw NotFoundError nor returns null, so you'd need to examine the content (or perhaps metadata) of the returned file object. For directory handle, I'm not sure if there is currently any way to check its existence.

@a-sully
Copy link
Collaborator

a-sully commented Jan 10, 2023

For directory handle, I'm not sure if there is currently any way to check its existence.

This is unspecified (see #11 - this is one of the primary use cases that needs to be specified) but methods which create a child (createWritable(), getFileHandle(), and getDirectoryHandle()) will reject with a NotFoundError if the parent does not exist.

Consider dirHandle at a/b/. If b/ is deleted somehow (this is related to #39), dirHandle.getFileHandle("c.txt", { create: true }) will fail because the browser should not recursively create the path to /a/b/c.txt

Of course attempting to create a new file isn't a good way to check if a directory exists. Tying this to a getMetadata() method seems reasonable to me, since presumably both are just a stat call

@poirierlouis
Copy link
Author

This might be relevant to getMetadata() support requested in #12. Could null metadata mean that the file/directory does not exist? or would it make more sense to have exists() method separately?

I'm guessing getMetadata() shall be expected to return a Dictionary with file's information (last modified date, file's size, ...).
To my knowledge, all methods of any interfaces of this API never resolves with null. It always rejects with an error when a failure arise. It seems coherent to me (if not already needed) to follow this "rule" throughout this API.

createWritable() rejects with a NotFoundError as already mentionned, it does not return null when file is not found. One could argue that all methods alike to createWritable() must resolves with a null value when file is not found. Which would introduce extra complexity to the possible outcomes of a method.

With this in mind, I believe an exists() method would me more suited, do only one "job", resolves with true / false. In such case, it would be clear to the end-user what to expect from this method. Moreover, when exists() rejects, end user knows it is not related to "is the file existing or not?" but means something else "e.g. OutOfMemoryError".

@a-sully
Copy link
Collaborator

a-sully commented Jan 10, 2023

To my knowledge, all methods of any interfaces of this API never resolves with null

Correct, it probably makes sense for getMetadata() to reject with a NotFoundError rather than returning null. I'm mostly indifferent as to whether that's good enough or if we should expose an explicit exists() method.

@annevk any thoughts from a web ergonomics perspective?

Related to this issue, if we were to assert that a handle is no longer usable after it's been removed (see #39 (comment)) then an exists() method doesn't serve any purpose other than to tell you that your handle is now useless. I personally don't think this is a good idea, but if we decide to go down that route then arguably this should become something like isValid()

@annevk
Copy link
Member

annevk commented Jan 11, 2023

For use case A: why is there the assumption that the file would go invalid between 1 and 2 and not 2 and 3? Why not hold a lock?

Same questions for use case B.

Getting a handle is pretty quick and it seems unlikely-if-not-impossible it would invalidate itself by the time you start preparing data if that happens in the same task as suggested. Now I suppose you could be concerned that it invalidates while preparing the data, but that really suggests you need to use a lock of some kind.

@poirierlouis
Copy link
Author

Related to this issue, if we were to assert that a handle is no longer usable after it's been removed (see #39 (comment)) then an exists() method doesn't serve any purpose other than to tell you that your handle is now useless. I personally don't think this is a good idea, but if we decide to go down that route then arguably this should become something like isValid()

@a-sully: From what I understand of issue #39 and its implication, isValid() seems to be more relevant in such case.

For use case A: why is there the assumption that the file would go invalid between 1 and 2 and not 2 and 3?

@annevk: No reason actually, assuming wrong I admit.
In deed, a lock would prevent TOCTTOU and solve such use-cases.

If tacking a lock could be a solution, then could createWritable() be a candidate? With something like this (use case A):

const handle; // from picker / IndexedDB

try {
  const stream = handle.createWritable({exclusiveLock: true}); // tack an exclusive lock or throw NotFoundError.
  const rawData = compute(data); // computation-heavy.

  await stream.write(rawData);
  await stream.close(); // release lock.
} catch (error) {
  console.log("File not found");
}

This is only an example. Currently and according to non-normative text: it tacks a shared lock, writes content through a temporary file, when close() is called, file is replaced with temporary file.

If it were to tack an exclusive lock, writing content could directly happen in file without the use of a temporary file?
Could it be a solution to issue #67 and the use of inPlace option?
Warning block however, point out concerns regarding malware checks.

@a-sully
Copy link
Collaborator

a-sully commented Jun 6, 2023

Related to this issue, if we were to assert that a handle is no longer usable after it's been removed (see #39 (comment)) then an exists() method doesn't serve any purpose other than to tell you that your handle is now useless. I personally don't think this is a good idea, but if we decide to go down that route then arguably this should become something like isValid()

@a-sully: From what I understand of issue #39 and its implication, isValid() seems to be more relevant in such case.

#39 was addressed by #96, which clarified that a FileSystemHandle maps to a file path - and therefore if the underlying file is removed, the FileSystemHandle itself is not changed. So we'd want an exists() method rather than isValid().

Also related is #126. As noted on that issue, it would be nice if a create() method could play nicely with an exists() method, such that the "create if not exists" use case is less vulnerable to TOCTOU errors.

If tacking a lock could be a solution, then could createWritable() be a candidate? With something like this (use case A):

If it were to tack an exclusive lock, writing content could directly happen in file without the use of a temporary file? Could it be a solution to issue #67 and the use of inPlace option? Warning block however, point out concerns regarding malware checks.

https://github.com/whatwg/fs/blob/main/proposals/MultipleReadersWriters.md proposes allowing createWritable() to take an exclusive lock, such that only one FileSystemWritableFileStream may exist to the file at a time. According to that proposal, writes would still be issued to a swap file, though

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

4 participants