Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dart:io readAsBytes will crash if the file size is greater than INT_MAX on flutter windows embedder #50129

Open
jonahwilliams opened this issue Oct 4, 2022 · 16 comments
Assignees
Labels
area-vm Use area-vm for VM related issues, including code coverage, FFI, and the AOT and JIT backends. library-io P2 A bug or feature request we're likely to work on triaged Issue has been triaged by sub team

Comments

@jonahwilliams
Copy link
Contributor

Repro:

void main() async {
  await File('path/to/huge_file').readAsBytes();
}

The file I am using is approximately 3.3 GB. INT_MAX bytes in GB is approximately 2.147483647 GB.

I'm not able to get a backtrace, but the popup that shows the assert shows the path minkernel\crts\ucrt\src\appcrt\lowio\read.cpp

@jonahwilliams
Copy link
Contributor Author

The error message only appeared in an unopt flutter windows build. In an opt build it crashes or is killed immediately.

@jonahwilliams
Copy link
Contributor Author

Also: yes, we should probably not ever try to open a file this big all at once.

@dnfield
Copy link
Contributor

dnfield commented Oct 4, 2022

When it's this big it should probably get mmapped... whatever the equivalent of that API is on Windows, I can't remember anymore.

@jonahwilliams
Copy link
Contributor Author

jonahwilliams commented Oct 4, 2022

I think we can't mmap them today because dart:io APIs return mutable buffers. If we could change these APIs to return immutable buffers (#50069) then we would not only use less memory, but lots of silly patterns like the FileImage would be faster too

Edit: can't mmap

@zanderso
Copy link
Member

zanderso commented Oct 4, 2022

Reads are happening with https://github.com/dart-lang/sdk/blob/main/runtime/platform/utils_win.cc#L83.

Docs here https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/read?view=msvc-170

say:

If buffer is NULL, or if buffer_size > INT_MAX, the invalid parameter handler is invoked. If execution is allowed to continue, the function returns -1 and errno is set to EINVAL.

The Flutter Windows embedder may need to do something like:

https://github.com/dart-lang/sdk/blob/main/runtime/bin/platform_win.cc#L37

if it isn't already, so that _read will return an error instead of crashing.

@zanderso
Copy link
Member

zanderso commented Oct 4, 2022

cc @a-siva @brianquinlan

@jonahwilliams
Copy link
Contributor Author

On the Dart CLI I get:

FileSystemException: read failed, path = 'C:\Users\Jonah\Downloads\supervhs-win.zip' (OS Error: The operation completed successfully.
, errno = 0)

Which ... is still wrong, but definitely better than exiting.

@jonahwilliams jonahwilliams changed the title dart:io readAsBytes will crash if the file size is greater than INT_MAX on windows dart:io readAsBytes will crash if the file size is greater than INT_MAX on flutter windows embedder Oct 4, 2022
@zanderso
Copy link
Member

zanderso commented Oct 4, 2022

The additional work between the error and creating the OSError object is probably clobbering errno, e.g. here: https://github.com/dart-lang/sdk/blob/main/runtime/bin/file.cc#L1302

@a-siva
Copy link
Contributor

a-siva commented Oct 4, 2022

Thanks for pointing out the exact line, will fix it.

@a-siva
Copy link
Contributor

a-siva commented Oct 4, 2022

I think we can't mmap them today because dart:io APIs return mutable buffers. If we could change these APIs to return immutable buffers (#50069) then we would not only use less memory, but lots of silly patterns like the FileImage would be faster too

Edit: can't mmap

See a breaking change proposal to add a mmap API to random access files in dart:io #49810 would this be useful for Flutter or is your case pretty much limited to the engine mmaping a file and creating these unmodifiable typed data buffers ?

@jonahwilliams
Copy link
Contributor Author

That would be useful for the Flutter tool, which needs to load file bytes to compute hashes. this is generally quite slow for binary artifacts/dependencies.

This would also be useful for Image.file or similar approaches where we have no intention to modify the file bytes. That said, I think there are a lot of File implementers, which might make it hard breaking change to land.

@jonahwilliams
Copy link
Contributor Author

I see that Platform::Initialize is what ultimately sets up the right handlers so that we can catch this error. However, I only see this method invoked from runtime/bin/main.cc and not via any of the embedder setup methods

@zanderso
Copy link
Member

zanderso commented Oct 5, 2022

Ahh, sorry. I thought that was called from BootstrapDartIo, which the Engine does call, but it looks like it isn't:

https://github.com/dart-lang/sdk/blob/main/runtime/bin/dart_io_api_impl.cc#L22

I think the Engine may want Engine embedders to do this explicitly themselves, though. The setup that the Dart CLI uses might not be exactly the right thing for Flutter's Windows embedder, for example.

@jonahwilliams
Copy link
Contributor Author

I'm increasingly weary that disabling this assertion isn't the right fix, especially not for an applicaiton that may include any number of other native libraries. maybe dart:io should check the file size on windows and either fall back to a different loading strategy and/or throw an error without relying on intercepting the error handler.

@dnfield
Copy link
Contributor

dnfield commented Oct 5, 2022

It is super weird for a core IO library to have an undocumented hard limit on how large of a file it can open. I think the right thing to do here would be to just mmap the file even if you end up needing to copy the bytebuffer if the user tries to write to it.

@mkustermann mkustermann added area-vm Use area-vm for VM related issues, including code coverage, FFI, and the AOT and JIT backends. library-io labels Oct 6, 2022
@cbracken
Copy link
Member

cbracken commented Oct 6, 2022

The Flutter Windows embedder may need to do something like:
https://github.com/dart-lang/sdk/blob/main/runtime/bin/platform_win.cc#L37

I'll get a patch out for this in the Windows embedder today to at least avoid aborting on error. Thanks for spotting @zanderso.

@a-siva a-siva added area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. P2 A bug or feature request we're likely to work on library-io-triaged and removed area-vm Use area-vm for VM related issues, including code coverage, FFI, and the AOT and JIT backends. labels Nov 4, 2022
@a-siva a-siva self-assigned this Nov 4, 2022
@a-siva a-siva added triaged Issue has been triaged by sub team and removed library-io-triaged labels Dec 20, 2022
@lrhn lrhn added area-vm Use area-vm for VM related issues, including code coverage, FFI, and the AOT and JIT backends. and removed area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. labels Apr 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, FFI, and the AOT and JIT backends. library-io P2 A bug or feature request we're likely to work on triaged Issue has been triaged by sub team
Projects
None yet
Development

No branches or pull requests

7 participants