Asynchronous image scaling #3090
Comments
Will this need a special setup to configure such a long-running Python thread? If so, I would say we need to document it somewhere.
@erral It depends on the WSGI server. With waitress it works out of the box. It may be that even running Plone as such requires this. I don't know for sure.
The part of this PLIP concerning generating configured scales in advance has been split into a pull request, which I hope could be merged as an opt-in feature for plone.formwidget.namedfile without a PLIP. (My assumption is that only turning it on by default would require a PLIP, if that ever happens.) plone/plone.formwidget.namedfile#43
PLIP (Plone Improvement Proposal)
Responsible Persons
Proposer: Asko Soukka
Seconder: Timo Stollenwerk, Victor Fernandez de Alba
Abstract
We propose an option to replace the current behavior of creating image scales synchronously on demand with a new behavior of building them asynchronously, both for faster performance through multiprocessing support and for faster responses due to non-blocking scaling requests.
Motivation
HTML5 features such as Picture and srcset allow optimizing responsive design in at least three different dimensions: image size (mobile, tablet, desktop, ...), image pixel density (1x, 2x, 3x, ...) and image format (PNG, WEBP, JPEG2000, ...). This increases the demand for different scales and other versions of a single image. Also, users no longer all need the same version of an image. Whereas before all necessary scales were created on demand as soon as the editor viewed the saved document, more and more scales are now created only long after the original edit, resulting in slow performance when viewing the content.
Another motivation for asynchronous scaling is an acute performance issue with Plone REST API based editing of image content, the so-called "headless" use case. To return cacheable image scales for all available versions, Plone REST API needs to call the Plone image scaling API to reserve URLs for them, effectively creating all configured scales immediately on the first read of the content. While subsequent calls are fast, this slow first read makes using Plone REST API for images inconvenient and discourages the use of Plone scales and adding support for modern image formats.
Assumptions
We want to add support for modern image format alternatives (WEBP, JPEG2000, ...) to Plone with srcset.
We want to provide responsive image scale alternatives in Plone with the Picture tag.
We believe it is, at least initially, easier to support modern image formats and more scales by reusing the existing scaling framework than by integrating an external image scaling service.
This proposal only covers images stored with plone.namedfile blob image fields.
Proposal & Implementation
We propose enhancing the current on-demand image scaling with a, possibly optional, asynchronous scaling implementation:
When a scale is requested, it is not created immediately. Instead, a scale storage item with the usual configuration, but empty data, is put into the existing scaling storage. This creates a URL by which the browser can request the scale later and which the REST API can return without immediately creating the scale.
At the end of a transaction with new scale storage items, tasks for creating those scales are put into the image scaling queue. In the task descriptions, the related scale storage and the original image are referred to by their OIDs to allow fast retrieval from ZODB by the processor.
The image scaling queue processor is a new thread started once for each Plone instance (or WSGI server process). The thread reserves its own dedicated ZODB connection from the configured connection pool, but with a minimal object cache (only 100 objects) to keep the memory footprint small.
The image scaling queue processor scales images using Python's built-in concurrent.futures.ProcessPoolExecutor, which allows using all available CPU cores for image scaling in parallel. To prevent conflicts, the scaling queue processor writes scales into ZODB sequentially in their completion order, each with its own commit.
If a scale has not been generated yet when requested, the plone.namedfile scale traverser redirects to the display-file URL of the original-size image.
If a scale has not been generated yet when requested, but the scale storage placeholder is more than 10 minutes old, a new scaling task is queued immediately.
On Volto, because Volto proxies all images from Plone, if a scale has not been generated yet when requested, the request is retried a few times to wait for the scale to appear before falling back to the original version. This effectively makes scaling both asynchronous (non-blocking) and still immediate in Volto use cases.
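The queue processing steps above could be sketched roughly as follows. This is only an illustration and not the code from the POC branches: the `(oid, width, height)` task tuple, the `scale_image` placeholder and the `store` callback (standing in for "write the scale into ZODB and commit") are all hypothetical. Scaling runs in parallel on the given executor, while results are stored one at a time, in completion order, by the calling thread.

```python
import queue
from concurrent.futures import as_completed


def scale_image(task):
    """Placeholder for the CPU-bound scaling step.

    `task` is a hypothetical (oid, width, height) tuple. A real worker would
    load the original image by OID and resize it.
    """
    oid, width, height = task
    return oid, f"{width}x{height}".encode()


def drain(task_queue):
    """Collect all tasks currently waiting in the queue without blocking."""
    tasks = []
    while True:
        try:
            tasks.append(task_queue.get_nowait())
        except queue.Empty:
            return tasks


def process_pending(task_queue, executor, store):
    """Scale pending tasks in parallel and store the results sequentially.

    `store(oid, data)` is a hypothetical callback standing in for
    "write the scale into ZODB and commit".
    """
    futures = [executor.submit(scale_image, task) for task in drain(task_queue)]
    stored = 0
    for future in as_completed(futures):  # completion order, not submit order
        oid, data = future.result()
        store(oid, data)  # called only from this thread, one scale at a time
        stored += 1
    return stored
```

In the proposed design the executor would be a `concurrent.futures.ProcessPoolExecutor` owned by the dedicated processor thread. Accepting any executor keeps the sketch easy to exercise, for example with a `ThreadPoolExecutor` in tests.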
Deliverables
This PLIP will eventually provide three pull requests evolved from the following POC branches into their respective packages:
plone.namedfile https://github.com/plone/plone.namedfile/tree/datakurre-image-scaling-queue
volto https://github.com/plone/volto/tree/datakurre-image-scaling-queue
Risks
There may be bugs.
Not all use cases may have been covered yet, resulting in empty scales (always redirecting to the original).
The ZODB undo log gets bloated with image scales.
May not work with all WSGI deployments, because it requires a long-running Python thread next to the Zope worker threads.
Disabling asynchronous image scaling may leave scales without a scale value in the scaling storage. Plone will fall back to delivering the original image version when these scales are requested. Eventually the scaling storage will clean up those scales and replace them with synchronously generated ones.
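The fallback for placeholders without a scale value, combined with the redirect and 10-minute re-queue rules from the proposal, might look like the following minimal sketch. The function name `resolve_scale` and its return values are hypothetical and are not part of plone.namedfile. Only the 10-minute threshold comes from the proposal.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

REQUEUE_AFTER = timedelta(minutes=10)  # threshold taken from the proposal


def resolve_scale(
    data: Optional[bytes], created: datetime, now: datetime
) -> Tuple[str, bool]:
    """Return (action, requeue) for a requested scale (hypothetical helper).

    action is "serve" when scale data exists, otherwise "redirect" (to the
    original image). requeue is True when an empty placeholder is older than
    the threshold, meaning a new scaling task should be queued.
    """
    if data:
        return "serve", False
    return "redirect", (now - created) > REQUEUE_AFTER
```

With this shape, a placeholder left behind with empty data never blocks a request: the caller always gets either the scale or a redirect to the original.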
Participants