Replace ByteArrayInputStream and ByteArrayOutputStream usage with something more efficient #2049

punktilious · 2021-03-10T18:59:51Z

Issues with ByteArrayOutputStream:

Accessing the contents via toByteArray() copies the underlying byte[], which can increase memory pressure in high-throughput scenarios where large buffers are involved;
Many methods are synchronized.
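
For reference, the copy behaviour can be checked with a few lines of Java. This is only an illustrative sketch; the class name ToByteArrayCopyDemo is made up for this example and is not part of the project.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical demo class, not project code.
final class ToByteArrayCopyDemo {
    // Returns true when each toByteArray() call hands back its own copy.
    static boolean returnsDistinctCopies() {
        ByteArrayOutputStream out = new ByteArrayOutputStream(16);
        out.write(42);
        byte[] first = out.toByteArray();   // allocates and copies
        byte[] second = out.toByteArray();  // allocates and copies again
        first[0] = 0;                       // mutating one copy...
        return first != second && second[0] == 42; // ...leaves the other untouched
    }
}
```

With payloads of several megabytes, each such call allocates a second array of the same size.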

Issues with ByteArrayInputStream:

Many methods are synchronized.

Implement a new class that provides InputStream and OutputStream accessors over a single byte[]. The array can be resized if needed, and array copies are avoided entirely if an ideal size is given to begin with.

Replace ByteArrayOutputStream and ByteArrayInputStream where they are used to compress/decompress resource payloads.
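
A rough sketch of what such a class could look like follows. It is not the implementation merged for this issue: the class name ResizableByteBuffer, its method names, and the doubling growth policy are all assumptions made for illustration. It is deliberately unsynchronized, so it is only suitable for single-threaded use.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

// Hypothetical sketch: one byte[] shared by an OutputStream and InputStream view.
// Not thread-safe by design (no synchronized methods).
final class ResizableByteBuffer {
    private byte[] buf;
    private int count; // number of valid bytes written so far

    ResizableByteBuffer(int initialCapacity) {
        buf = new byte[initialCapacity];
    }

    int size() { return count; }

    int capacity() { return buf.length; }

    // Grows the array only when needed; an ideal initial size means no copy at all.
    // Integer overflow for very large buffers is not handled in this sketch.
    private void ensureCapacity(int min) {
        if (min > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(min, buf.length * 2));
        }
    }

    OutputStream outputStream() {
        return new OutputStream() {
            @Override
            public void write(int b) {
                ensureCapacity(count + 1);
                buf[count++] = (byte) b;
            }

            @Override
            public void write(byte[] src, int off, int len) {
                ensureCapacity(count + len);
                // arraycopy rejects invalid offsets and lengths
                System.arraycopy(src, off, buf, count, len);
                count += len;
            }
        };
    }

    // Reads the bytes written so far directly from the shared array, without copying it.
    InputStream inputStream() {
        final int limit = count; // snapshot of the valid length at creation time
        return new InputStream() {
            private int pos;

            @Override
            public int read() {
                return pos < limit ? (buf[pos++] & 0xff) : -1;
            }

            @Override
            public int read(byte[] dst, int off, int len) {
                if (len == 0) return 0;
                if (pos >= limit) return -1;
                int n = Math.min(len, limit - pos);
                System.arraycopy(buf, pos, dst, off, n);
                pos += n;
                return n;
            }

            @Override
            public int available() {
                return limit - pos;
            }
        };
    }
}
```

For the compress/decompress case, the idea would be to wrap outputStream() in a GZIPOutputStream (or similar) when writing a payload, then hand inputStream() to the consumer without an intermediate toByteArray() copy.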

lmsurpre · 2021-03-23T15:34:32Z

I ran an export after getting this change deployed and the performance looks good to me:

INFO bulkexportfastjob[164] processed 138753 resources in 21.34 seconds (rate=6501.5 resources/second)
INFO bulkexportfastjob[164]                          Patient      32618
INFO bulkexportfastjob[164]                        Condition     360376
INFO bulkexportfastjob[164]                      Observation    6359819
INFO bulkexportfastjob[164] -------------------------------- ----------
INFO bulkexportfastjob[164]                            TOTAL    6752813

Prior to this change, I'd seen export rates over 10,000 resources/second, but back then we were also leaking memory (due to #2040).

Let's call this one 'good' and we can revisit in the future as part of a broader performance analysis.

prb112 added the bulk-data label Mar 11, 2021

punktilious self-assigned this Mar 16, 2021

punktilious added this to the Sprint 2021-04 milestone Mar 16, 2021

lmsurpre closed this as completed Mar 23, 2021
