Use buffer instead of cloned stream #279
Merged
The current implementation of `CreateMediaArticle` halts on files larger than a certain size. It halts because `await file.clone().buffer()` never resolves. The root cause is that when we call `await file.clone().buffer()`, it reads the cloned stream and puts its data into an internal buffer so that the original stream can still consume data afterwards. The read operation halts when that internal buffer is full.

Related issue:
Analysis
In the current implementation we tried to use streams so that we don't need to load the entire file into memory or onto the file system. However, the `image-hash` package requires a buffer as its input. Even if we provide a file path, it reads the entire file into memory (then sends it to the JPEG decoder) anyway. Therefore, putting the entire file in memory is inevitable.

Proposed fix
This PR downloads the entire file from `mediaUrl` into a buffer and then passes it to `image-hash`. We set a maximum size for the downloaded file (5MB for now) so that the API server will not run out of memory when someone sends us a very large file. If a file bigger than the limit is provided, `CreateMediaArticle` will fail.
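A minimal sketch of this approach, using only Node's stdlib — the function name `streamToBuffer`, the error message, and the constant name are illustrative, not the PR's actual identifiers:

```javascript
'use strict';

// Read a stream into a single Buffer, aborting once a size cap is
// exceeded so that a huge upload cannot exhaust the server's memory.
// The 5MB cap mirrors the limit described in the PR; the rest of the
// code is a hypothetical sketch, not the PR's actual implementation.
const MAX_FILE_SIZE = 5 * 1024 * 1024; // 5MB

function streamToBuffer(stream, maxSize = MAX_FILE_SIZE) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    let total = 0;
    stream.on('data', (chunk) => {
      total += chunk.length;
      if (total > maxSize) {
        stream.destroy(); // stop consuming; the download is abandoned
        reject(new Error(`File exceeds the ${maxSize}-byte limit`));
        return;
      }
      chunks.push(chunk);
    });
    stream.on('end', () => resolve(Buffer.concat(chunks)));
    stream.on('error', reject);
  });
}
```

With a helper like this, `CreateMediaArticle` can await the whole buffer and hand it to `image-hash` directly; an oversized file rejects the promise instead of silently stalling.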