OutOfMemoryError downloading file from URL with trailing slash #2839

Open
memo33 opened this issue Sep 11, 2023 · 0 comments

memo33 commented Sep 11, 2023

Coursier runs out of memory when downloading a large file (750 MB) from a URL that ends with a slash (such as example.org/foo/).

The problem comes from Downloader#doTouchCheckFile, whose implementation assumes that the downloaded content is a directory listing and reads the entire file into memory as a String.

if (updateLinks && file.getName == ".directory") {
  val linkFile = FileCache.auxiliaryFile(file, "links")
  val succeeded =
    try {
      val content =
        WebPage.listElements(url, new String(Files.readAllBytes(file.toPath), UTF_8))
          .mkString("\n")

This could potentially be fixed by opening a file input stream and passing it to WebPage#listElements instead of a String. On the JVM, HTML parsing is done with Jsoup, which supports input streams, and I assume it would throw a parse error early on when given binary data. What would be needed for the JS backend, though?
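
As a stopgap that does not need streaming support in WebPage#listElements, a guard in front of the read might already avoid the large allocation. The following is only a sketch and not Coursier code: ListingGuard, readListingIfPlausible, MaxListingBytes and SampleBytes are hypothetical names, and both the 8 MiB cap and the NUL-byte heuristic are assumptions. It is a different technique from streaming: it simply refuses to treat large or binary-looking files as directory listings. It uses java.nio, so it only applies to the JVM side.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path}

// Hypothetical helper, not part of Coursier.
object ListingGuard {
  // Assumed upper bound for a plausible directory listing (8 MiB).
  val MaxListingBytes: Long = 8L * 1024 * 1024
  // Number of leading bytes inspected for NUL bytes.
  val SampleBytes: Int = 8192

  /** Returns the file content as a String only if the file is at most
    * MaxListingBytes long and its first SampleBytes contain no NUL byte.
    * Returns None otherwise, so the caller can skip link extraction.
    */
  def readListingIfPlausible(path: Path): Option[String] =
    if (Files.size(path) > MaxListingBytes) None
    else {
      // Safe to read fully: the size was checked against the cap above.
      val bytes = Files.readAllBytes(path)
      val looksBinary = bytes.take(SampleBytes).contains(0.toByte)
      if (looksBinary) None else Some(new String(bytes, UTF_8))
    }
}
```

The caller would then pass the content on to WebPage.listElements only when the result is defined, and otherwise treat the ".directory" file as having no links.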
