Uploading texture data is showing up frequently in profiles on Adreno devices,
especially when zooming on a text-heavy page. Specifically, the time is spent in
glMapBufferRange and glBufferSubData, most likely while the driver internally
allocates the buffer before transferring data into it.
Currently, we create a new PBO, by calling glBufferData(), for each individual
upload region. This change instead calculates the required size for all upload
regions to a texture, then creates a single PBO of that size. The entire buffer
is then mapped only once, and each individual upload chunk is written to it.
This can require the driver to allocate a large buffer, sometimes multiple
megabytes in size. However, the driver handles this case much better than
allocating tens or even hundreds of smaller buffers.
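As a rough illustration of the batching described above, here is a minimal
sketch that uses a plain Vec<u8> as a stand-in for the single mapped PBO. The
names UploadChunk, reserved_size and pack_chunks are hypothetical and are not
taken from the patch; no GL calls are made.

```rust
/// One upload region (hypothetical type; real code would also carry the
/// destination rectangle in the texture).
pub struct UploadChunk<'a> {
    /// CPU-side pixel bytes for this region.
    pub data: &'a [u8],
    /// Bytes reserved for this region in the shared buffer. May exceed
    /// `data.len()` when padding is needed for alignment.
    pub reserved_size: usize,
}

/// Packs every chunk into one buffer and returns the buffer together with
/// the offset at which each chunk starts.
pub fn pack_chunks(chunks: &[UploadChunk<'_>]) -> (Vec<u8>, Vec<usize>) {
    // First pass: total size, so that only one allocation is needed.
    let total: usize = chunks.iter().map(|c| c.reserved_size).sum();

    // Stand-in for a single glBufferData() followed by mapping the whole
    // buffer once.
    let mut buffer = vec![0u8; total];
    let mut offsets = Vec::with_capacity(chunks.len());

    // Second pass: write each chunk at its own offset in the shared buffer.
    let mut cursor = 0;
    for chunk in chunks {
        assert!(chunk.data.len() <= chunk.reserved_size);
        buffer[cursor..cursor + chunk.data.len()].copy_from_slice(chunk.data);
        offsets.push(cursor);
        cursor += chunk.reserved_size;
    }
    (buffer, offsets)
}
```

In a real device each recorded offset would then be passed as the PBO offset of
the corresponding texture upload call.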
An upload chunk may require more space in a PBO than the original CPU-side
buffer, so that the data is aligned correctly for performance or correctness
reasons. The caller of Device.upload_texture() is therefore responsible for
calling a new function, Device.required_upload_size(), to calculate the
required size beforehand.
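To show why the PBO size can exceed the CPU-side size, here is a minimal sketch
of a size calculation that pads each row to an aligned stride. This is not the
actual Device.required_upload_size() implementation, whose signature is not
given here; the function names, parameters and the idea of a single
stride_alignment value are assumptions for illustration.

```rust
/// Rounds `value` up to the next multiple of `alignment`.
/// `alignment` must be non-zero.
pub fn round_up(value: usize, alignment: usize) -> usize {
    assert!(alignment > 0);
    (value + alignment - 1) / alignment * alignment
}

/// Hypothetical size calculation: bytes to reserve in the PBO for a
/// `width` x `height` region when every row must start at a multiple of
/// `stride_alignment` bytes.
pub fn required_upload_size_sketch(
    width: usize,
    height: usize,
    bytes_per_pixel: usize,
    stride_alignment: usize,
) -> usize {
    // Tightly packed row size, as it appears in the CPU-side buffer.
    let src_stride = width * bytes_per_pixel;
    // Padded row size, as it would be laid out in the PBO.
    let dst_stride = round_up(src_stride, stride_alignment);
    dst_stride * height
}
```

For example, a 10x3 region at 4 bytes per pixel occupies 120 bytes on the CPU
side, but with an assumed 64-byte stride alignment it needs 192 bytes in the
PBO.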
On AMD Macs, there is a bug where PBO uploads from a non-zero offset can
fail. See bug 1603783. This patch therefore preserves the current behaviour on
AMD Macs, allocating a new PBO for each upload and thereby ensuring the offset
is always zero.
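The difference between the two behaviours can be sketched as a placement plan
that decides which buffer each chunk goes into and at which offset. PboStrategy,
Placement and plan_placements are hypothetical names used only for this
illustration; how the patch actually detects and represents the AMD Mac case is
not shown here.

```rust
/// How upload chunks are assigned to PBOs (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PboStrategy {
    /// All chunks share one PBO at increasing offsets.
    Batched,
    /// Every chunk gets its own PBO, so the offset is always zero.
    PerUpload,
}

/// Where a chunk lives: which buffer, and the byte offset within it.
#[derive(Debug, PartialEq, Eq)]
pub struct Placement {
    pub buffer_index: usize,
    pub offset: usize,
}

/// Computes a placement for each chunk, given the chunk sizes in bytes.
pub fn plan_placements(sizes: &[usize], strategy: PboStrategy) -> Vec<Placement> {
    let mut offset = 0;
    sizes
        .iter()
        .enumerate()
        .map(|(index, &size)| match strategy {
            PboStrategy::Batched => {
                let placement = Placement { buffer_index: 0, offset };
                offset += size;
                placement
            }
            // Workaround path: a fresh buffer per chunk keeps the offset at zero.
            PboStrategy::PerUpload => Placement { buffer_index: index, offset: 0 },
        })
        .collect()
}
```

With chunk sizes of 256, 128 and 64 bytes, the batched plan uses one buffer at
offsets 0, 256 and 384, while the per-upload plan uses three buffers, each at
offset 0.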
Differential Revision: https://phabricator.services.mozilla.com/D56382
[wrupdater] From https://hg.mozilla.org/mozilla-central/rev/44161695885886b689baae2ad8b357f81f3eef23