Use buffer instead of cloned stream #279
Merged
The current implementation of `CreateMediaArticle` halts on files larger than a certain size. It halts because `await file.clone().buffer()` never resolves. The root cause is that when we call `await file.clone().buffer()`, it reads the cloned stream and puts its data into an internal buffer so that the original stream can still consume data afterwards. The read operation halts when that internal buffer is full.

Related issue:
Analysis
In the current implementation we tried to use streams so that we don't need to load the entire file into memory or onto the file system. However, the `image-hash` package requires a buffer as its input. Even if we provide a file path, it reads the entire file into memory (then sends it to the JPEG decoder) anyway. Therefore, putting the entire file in memory is inevitable.

Proposed fix
This PR downloads the entire file from `mediaUrl` into a buffer and then passes it to `image-hash`. We set a maximum size for the downloaded file (5MB for now) so that the API server will not run out of memory when someone sends us a very large file. If a file bigger than the limit is provided, `CreateMediaArticle` will fail.
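A minimal sketch of this approach, using only Node's stdlib — the function name `streamToBuffer`, the error message, and the constant name are illustrative, not the PR's actual identifiers:

```javascript
'use strict';

// Read a stream into a single Buffer, aborting once a size cap is
// exceeded so that a huge upload cannot exhaust the server's memory.
// The 5MB cap mirrors the limit described in the PR; the rest of the
// code is a hypothetical sketch, not the PR's actual implementation.
const MAX_FILE_SIZE = 5 * 1024 * 1024; // 5MB

function streamToBuffer(stream, maxSize = MAX_FILE_SIZE) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    let total = 0;
    stream.on('data', (chunk) => {
      total += chunk.length;
      if (total > maxSize) {
        stream.destroy(); // stop consuming; the download is abandoned
        reject(new Error(`File exceeds the ${maxSize}-byte limit`));
        return;
      }
      chunks.push(chunk);
    });
    stream.on('end', () => resolve(Buffer.concat(chunks)));
    stream.on('error', reject);
  });
}
```

With a helper like this, `CreateMediaArticle` can await the whole buffer and hand it to `image-hash` directly; an oversized file rejects the promise instead of silently stalling.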