Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WebP Support for PW Core Engines #141

Open
wants to merge 32 commits into
base: dev
from

Conversation

Projects
None yet
6 participants
@horst-n
Copy link
Contributor

commented Apr 25, 2019

@ryancramerdesign
Hi Ryan,
here is the webP support for PW core image engines.
Please refer to this forums post for some explanation:
https://processwire.com/talk/topic/14236-webp-support/page/2/?tab=comments#comment-184583

If you like, we can discuss it here or you can contact me via PM or email (info@nogajski.de).

horst-n added some commits Apr 24, 2019

add support for webp
added option "webpAdd" and "webpQuality" that allows to create a webp file as sidecar file when creating a JPEG, GIF or PNG file.
added properties for webp support
with the property $image->hasWebp we can detect, for example in a template file, if an image variation has a dependant webp file, so we can render conditional markup with srcset or picture elements.
And urlWebp and srcWebp for returning the URL.
make more robust filename creation
and remove debug code
@TomS-

This comment has been minimized.

Copy link

commented Apr 25, 2019

Would urlWebp work as well as srcWebp as src is an alias of url?

@horst-n

This comment has been minimized.

Copy link
Contributor Author

commented Apr 25, 2019

Yes, see:

* @property-read string $urlWebp The url property of an optional WebP-dependency file (since 3.0.132).
* @property-read string $srcWebp Convenient alias for the 'urlWebp' property (since 3.0.132).

@horst-n horst-n referenced this pull request Apr 25, 2019

Open

Webp support #1

*
*/
public function setWebpQuality($n) {
$n = (int) $n;

This comment has been minimized.

Copy link
@matjazpotocnik

matjazpotocnik Apr 26, 2019

Could this be simplified? $this->webpQuality = min(max((int) $n, 1), 100);

This comment has been minimized.

Copy link
@teppokoivula

teppokoivula Apr 27, 2019

Contributor

That would make the code less readable, IMHO. One-liners are fun and save lines of code (if that's a priority), but they can also take a while to sink in – the benefit of simple if-else-structure is that it's instantly understandable :)

... though that being said, I would recommend one change to this method as well:

-		if($n > 100) $n = 100;
+		else if($n > 100) $n = 100;

Micro-optimization, but if first rule is true then the second doesn't need to be processed, and this would also make the intention more obvious (again in my opinion).

This comment has been minimized.

Copy link
@teppokoivula

teppokoivula Apr 27, 2019

Contributor

Actually here's another minor suggestion:

-		$this->webpQuality = (int) $n;
+		$this->webpQuality = $n;

(int) is pointless there, since we've already converted the value to an integer a few lines above. Again very small thing, but this extraneous type conversion can make one wonder if there's actually something strange going on here.

horst-n added some commits Apr 26, 2019

added 2 aliases
beginning with webp, as we also use two new options beginning with webp. Maybe this is better to avoid confusion.
webpAdd, webpQuality, webpUrl, webpSrc
@horst-n

This comment has been minimized.

Copy link
Contributor Author

commented Apr 26, 2019

@ryancramerdesign
Hi Ryan, I have updated one thing in the IMagick module, but primarily want to point you to a .htaccess-only solution I have played with today, that does not need any changes on existing markup to switch a complete site to webp support. Here you go:
https://processwire.com/talk/topic/14236-webp-support/page/2/?tab=comments#comment-184669

@horst-n

This comment has been minimized.

Copy link
Contributor Author

commented Apr 27, 2019

Here is example and sample output of the new manually invoked getDebugInfo method:
https://biriba.de/pw_pop3/pw_pageimage_getdebuginfo/

*
* @param int $n
*
* @return $this

This comment has been minimized.

Copy link
@teppokoivula

teppokoivula Apr 27, 2019

Contributor

Since we're in "code review mode" here, minor note about this: return statement should always state a valid return type, and $this is not one of those. @return self should work, although @return ImageSizerEngine would be recommended (works best with IDEs).

Since specific class name can be slightly confusing if and when this class is extended, I might still go with self :)

*
* @param bool $value
*
* @return $this

This comment has been minimized.

Copy link
@teppokoivula

teppokoivula Apr 27, 2019

Contributor

Earlier return type comment applies here as well 👌

* Get verbose DebugInfo, optionally with individual options array, @horst
* (without invoking the magic debug)
*
* @return mixed: array | string

This comment has been minimized.

Copy link
@teppokoivula

teppokoivula Apr 27, 2019

Contributor

One more note about return format: technically this means that the return value type is mixed:, which is not a valid type. : should be removed or separated from mixed with a space, though if there are only two possible return types I'd recommend @return array|string instead.

Since PHPDoc is intended to be parsed by IDEs and other software (programmatically, that is) it's a good idea to follow the spec to the point :)

This comment has been minimized.

Copy link
@teppokoivula

teppokoivula Apr 27, 2019

Contributor

In addition note that parameters ($options and $returnAsString) are currently not mentioned here.

$content = str_replace(array('<pre>', '</pre>'), '', $content);
if(isset($_SERVER['HTTP_HOST'])) {
// build output for HTML
$return = "<pre style=\"margin:10px 10px 10px; padding:10px 10px 10px 10px; background-color:#F2F2F2; color:#000; border:1px solid #333; font-family:'Hack', 'Source Code Pro', 'Lucida Console', 'Courier', monospace; font-size:12px; line-height:15px; overflow:auto;\">{$content}</pre>";

This comment has been minimized.

Copy link
@teppokoivula

teppokoivula Apr 27, 2019

Contributor

Minor note on this one: 10px 10px 10px and 10px 10px 10px 10px are both identical to simple 10px :)

@@ -24,6 +24,7 @@ class ImageSizerEngineIMagick extends ImageSizerEngine {
*
*/
protected $im = null;
protected $imWebp = null;

This comment has been minimized.

Copy link
@teppokoivula

teppokoivula Apr 27, 2019

Contributor

This should probably be commented using a separate PHPDoc block?

Additionally the comment block above isn't particularly helpful as-is, since it doesn't explain what $im or $imWebp are actually for.

$osInfo = $oSizer->getImageInfo(true);
$finalOptions = $oSizer->getOptions();
$info['suffix'] = $finalOptions['suffix'];

This comment has been minimized.

Copy link
@teppokoivula

teppokoivula Apr 27, 2019

Contributor

This may be a bit opinionated, but I believe that one array definition utilising array_merge() (since we already have some initial values) could make the code more readable:

$info = array_merge($info, array(
    'suffix' => $finalOptions['suffix'],
    'extension' => $osInfo['extension'],
    'imageType' => $osInfo['info']['imageType'],
    // etc.
);
@teppokoivula

This comment has been minimized.

Copy link
Contributor

commented Apr 27, 2019

Hey @horst-n. I added some notes to the code, mainly PHPDoc related and some stylistic points. Hope you don't mind :)

@horst-n

This comment has been minimized.

Copy link
Contributor Author

commented Apr 27, 2019

Hi @teppokoivula , very well appreciated! :)
I will work through this, and some parts in the older code, like setQuality what I simply copied to setWebpQuality without any reworking. :)
Only thing I don't understand right now is your "opinionated" one array_merge. What makes you "opinionated" to it? Or what do you think are the advantages?

@horst-n

This comment has been minimized.

Copy link
Contributor Author

commented Apr 27, 2019

@teppokoivula maybe you have time to review / comment my .htaccess only solution too?
https://processwire.com/talk/topic/14236-webp-support/page/3/?tab=comments#comment-184722

horst-n added some commits Apr 27, 2019

reworked the getDebugInfo method,
trying to make the code more readable.
But the main goal for me is, that the output is best readable! :)
@horst-n

This comment has been minimized.

Copy link
Contributor Author

commented Apr 29, 2019

Now the output of getDebugInfo in a CLI environment looks like this:

array(6)
{
  ["FILES"]                           array(10)
  {
    ["original"]                      string(36) "pw-zoom-crop-testimage-landscape.jpg"
    ["basename"]                      string(47) "pw-zoom-crop-testimage-landscape.450x450-gd.jpg"
    ["url"]                           string(71) "/site/assets/files/1022/pw-zoom-crop-testimage-landscape.450x450-gd.jpg"
    ["filename"]                      string(98) "G:/VHOSTS/pw_core/_htdocs_1/site/assets/files/1022/pw-zoom-crop-testimage-landscape.450x450-gd.jpg"
    ["created"]                       string(19) "2018-08-14 11:45:21"
    ["modified"]                      string(19) "2018-08-14 11:45:21"
    ["filemtime"]                     string(19) "2019-04-29 17:12:24"
    ["suffix"]                        string(2) "gd"
    ["extension"]                     string(3) "jpg"
    ["filesize"]                      int(20541)
  }
  ["VARIATIONS"]                      array(3)
  {
    [0]                               string(52) "pw-zoom-crop-testimage-landscape.150x150-noweppy.jpg"
    [1]                               string(49) "pw-zoom-crop-testimage-landscape.150x150-webp.jpg"
    [2]                               string(52) "pw-zoom-crop-testimage-landscape.450x450-noweppy.jpg"
  }
  ["IMAGEINFO"]                       array(14)
  {
    ["imageType"]                     int(2)
    ["mime"]                          string(10) "image/jpeg"
    ["width"]                         int(450)
    ["height"]                        int(450)
    ["focus"]                         string(26) "top=50%,left=59.5%,zoom=0%"
    ["description"]                   string(0) ""
    ["tags"]                          string(0) ""
    ["orientation"]                   int(0)
    ["rotate"]                        int(0)
    ["flip"]                          int(0)
    ["channels"]                      int(3)
    ["bits"]                          int(8)
    ["appmarker"]                     array(2)
    {
      ["APP0"]                        string(4) "JFIF"
      ["APP13"]                       string(13) "Photoshop 3.0"
    }
    ["iptcRaw"]                       NULL
  }
  ["WEBP COPY"]                       array(5)
  {
    ["hasWebp"]                       bool(true)
    ["webpUrl"]                       string(72) "/site/assets/files/1022/pw-zoom-crop-testimage-landscape.450x450-gd.webp"
    ["webpQuality"]                   int(90)
    ["filesize"]                      int(4784)
    ["savings in percent"]            int(77)
  }
  ["ENGINE(S)"]                       array(3)
  {
    ["neededEngineSupport"]           string(3) "JPG"
    ["installedEngines"]              array(3)
    {
      ["ImageSizerEngineGD"]          string(10) "priority 0"
      ["ImageSizerEngineIMagick"]     string(10) "priority 1"
      ["ImageSizerEngineAnimatedGif"] string(10) "priority 9"
    }
    ["selectedEngine"]                string(18) "ImageSizerEngineGD"
  }
  ["OPTIONS HIERARCHY"]               array(3)
  {
    ["imageSizerOptions"]             array(11)
    {
      ["upscaling"]                   bool(false)
      ["cropping"]                    bool(true)
      ["autoRotation"]                bool(true)
      ["interlace"]                   bool(false)
      ["sharpening"]                  string(4) "soft"
      ["quality"]                     int(90)
      ["hidpiQuality"]                int(60)
      ["defaultGamma"]                int(-1)
      ["webpAdd"]                     bool(false)
      ["webpQuality"]                 int(90)
      ["siteConfig3party"]            string(4) "fake"
    }
    ["individualOptions"]             array(7)
    {
      ["forceEngine"]                 string(18) "ImageSizerEngineGD"
      ["forceNew"]                    bool(false)
      ["webpAdd"]                     bool(true)
      ["webpQuality"]                 int(90)
      ["quality"]                     int(90)
      ["suffix"]                      string(2) "gd"
      ["thirdPartyOption"]            string(4) "fake"
    }
    ["finalOptions"]                  array(17)
    {
      ["hidpiQuality"]                int(60)
      ["siteConfig3party"]            string(4) "fake"
      ["forceNew"]                    bool(false)
      ["suffix"]                      string(2) "gd"
      ["thirdPartyOption"]            string(4) "fake"
      ["quality"]                     int(90)
      ["webpQuality"]                 int(90)
      ["webpAdd"]                     bool(true)
      ["cropping"]                    bool(true)
      ["upscaling"]                   bool(false)
      ["interlace"]                   bool(false)
      ["autoRotation"]                bool(true)
      ["sharpening"]                  string(4) "soft"
      ["defaultGamma"]                int(-1)
      ["cropExtra"]                   NULL
      ["scale"]                       float(1)
      ["useUSM"]                      bool(false)
    }
  }
}

horst-n added some commits Apr 29, 2019

force new creation for condition
where the regular variation exists and should not be recreated, but where a webp copy, that do not exist, is requested too.
fixed a bug with internal checking
if a webp copy exists or not. Using $this->hasWebp() within the resize method doesn't work. Changed to test for file_exists of the temporary webp copy.
@horst-n

This comment has been minimized.

Copy link
Contributor Author

commented May 2, 2019

Hi @ryancramerdesign ,
with my commit from today, it seems I have covered all cases correct now. If you have any questions you know the channels to contact me. :)

horst-n added some commits May 3, 2019

added object as output format to the getDebugInfo
I figured out that this is useful to have for (automated) testing and debugging too. It is much easier to get a value for conditional coding via an object:
```
$dbgInfo = $image->getDebugInfo($options, 'object');
if($dbgInfo->engines->selectedEngine == ....
```
changed the return of webpUrl()
to return the calculated URL, even if there is no webp copy available!
@ryancramerdesign

This comment has been minimized.

Copy link
Contributor

commented May 22, 2019

@horst-n Thanks for all of your great work with this! I'm now working through the PR and so far all looks good. I do have some minor questions, mostly related to learning about this format and update.

First, I just wanted to make sure I understood correctly how this works. I'm not so familiar with webp, so these are probably dumb questions but I'll ask anyway. As far as I can tell, webp is treated here as an add-on to the existing file formats (jpeg, png, gif) rather than a new file format itself. And so one can't have a Pageimage that that is represented by a webp file, for instance? Instead, there is an extra webp URL (or path) for any GIF/JPG/PNG, available as an option for when/if the browser supports it. So this is a similar strategy like we use for HiDPI, in some respects. Does it sound like I understand it correctly so far?

2-3 years down the road when all browsers support webp, will this strategy still be the right way to go, or would you recommend something different at that point?

Also, down the road, what if new formats similar-to, but distinct-from webp come along, and we want to support those? I'm just trying to think of how the strategy best scales towards supporting new formats.

Using the .htaccess route, is it even necessary to have URL/filename methods in the PageImage class? Like the Pageimage::webpUrl() — I'm struggling a little with this because it's so format specific, which is fine for today, but I'm not sure about longer term scalability. I really like that your .htaccess solution abstracts this all away so nicely. As I look at the updates in Pageimage, I also wonder about moving webp-specific, hidpi-specific and svg-specific code into separate classes that hook into Pageimage, so that Pageimage doesn't need to know about all these things. I like what you've got so don't change anything right now — this is just future thinking.

I see reference to a webpOnly option in a few parts, but it's not clear to me where I would set that or how/when I would use that? I don't see that the code additions are setting it anywhere, but they are checking for it. I'm confused on that point just because as I understand so far, the webp support is an extra on top of the existing formats, rather than additional format on its own. So what is the webpOnly option for? :)

In your .htaccess updates for webp, it makes sense to me except for one line:

RewriteCond %{DOCUMENT_ROOT}/$1$2$3/$4.webp -f

I'm not clear about what the 4 $variables are referring to, because I'm used to seeing these referring to some prior capturing parenthesis. But in this case, I'm not clear where the $1, $2, $3 and $4 were captured? It may be that I'm just not familiar with this .htaccess feature yet.

@horst-n

This comment has been minimized.

Copy link
Contributor Author

commented May 23, 2019

Hi Ryan,
today I'm a bit in a hurry, but want to answer some of the questions.


First, I just wanted to make sure I understood correctly how this works. I'm not so familiar with webp, so these are probably dumb questions but I'll ask anyway. As far as I can tell, webp is treated here as an add-on to the existing file formats (jpeg, png, gif) rather than a new file format itself. And so one can't have a Pageimage that that is represented by a webp file, for instance? Instead, there is an extra webp URL (or path) for any GIF/JPG/PNG, available as an option for when/if the browser supports it. So this is a similar strategy like we use for HiDPI, in some respects. Does it sound like I understand it correctly so far?

Yes, this is right. More information about the different role was discussed in forum posts:
https://processwire.com/talk/topic/14236-webp-support/?do=findComment&comment=185136
and
https://processwire.com/talk/topic/14236-webp-support/?do=findComment&comment=185140


2-3 years down the road when all browsers support webp, will this strategy still be the right way to go, or would you recommend something different at that point?

As I do not see any advantages from webp over jpeg and png AS MASTERIMAGES, but the same or more difficulties of wrong handling (like with jpeg), I would not recommend to change that with our current pageimage & sizers. But ... (see next question)


Also, down the road, what if new formats similar-to, but distinct-from webp come along, and we want to support those? I'm just trying to think of how the strategy best scales towards supporting new formats.

If this comes true, we need to rewrite a bigger part of the current logic / implementation of pageimage and the sizers. Then we have to support creation of one, two, or more fileformat outputs in one call to the size function, what currently isn't possible. Also to define a single outputformat, different from the source isn't possible atm. There are more points I came across as I started trying to support a "webOnly" flag, but due to my hurry, I will list those later.


Using the .htaccess route, is it even necessary to have URL/filename methods in the PageImage class? Like the Pageimage::webpUrl() — I'm struggling a little with this because it's so format specific, which is fine for today, but I'm not sure about longer term scalability. I really like that your .htaccess solution abstracts this all away so nicely. As I look at the updates in Pageimage, I also wonder about moving webp-specific, hidpi-specific and svg-specific code into separate classes that hook into Pageimage, so that Pageimage doesn't need to know about all these things. I like what you've got so don't change anything right now — this is just future thinking.

I personally also would take the .htaccess route, but there are many users that already ponted out that they want to have a fine control of where and how webp is supported or not. They want to use picture and srcset elements to create markup with webp fallbacks. Besides that, giving the users the choice over different solutions, it maybe that there are websites out, that cannot be changed to webp support by the .htaccess solution, for what ever reason.


I see reference to a webpOnly option in a few parts, but it's not clear to me where I would set that or how/when I would use that? I don't see that the code additions are setting it anywhere, but they are checking for it. I'm confused on that point just because as I understand so far, the webp support is an extra on top of the existing formats, rather than additional format on its own. So what is the webpOnly option for? :)

(Also see above.) - It was a try to support output of webp only, that isn't functional atm. There must be done greater rewrites in pageimage and the sizers, that I'm currently have no time for. And I think that it would be better if we can do it in collaboration, and it should be done with other (future) input and output formats in mind. So the rewrite will be even bigger. :)


In your .htaccess updates for webp, it makes sense to me except for one line:

RewriteCond %{DOCUMENT_ROOT}/$1$2$3/$4.webp -f

I'm not clear about what the 4 $variables are referring to, because I'm used to seeing these referring to some prior capturing parenthesis. But in this case, I'm not clear where the $1, $2, $3 and $4 were captured? It may be that I'm just not familiar with this .htaccess feature yet.

A) There is a newer strategy already, that uses the webpUrl image only, and detects if it is available. If not, it checks for a jpeg and then for a png and redirects to it.
If you primarily want to send webp, all webp supporting browsers will display it direct. Only for not supporting browsers a redirect gets issued.

    # Redirect regular PW WEBP images to JPEG or PNG where the Browser doesn't support WEBP
    # or where the WEBP file is missing:

    ## JPEG
    RewriteCond %{HTTP_ACCEPT} !image/webp   [OR]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{DOCUMENT_ROOT}/$1$2$3/$4.jpg -f
    RewriteRule ^(.*?)(site/assets/files/)([0-9]+)/(.*)\.webp(.*)$  /$1$2$3/$4.jpg [R=307,L]

    ## PNG
    RewriteCond %{HTTP_ACCEPT} !image/webp   [OR]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{DOCUMENT_ROOT}/$1$2$3/$4.png -f
    RewriteRule ^(.*?)(site/assets/files/)([0-9]+)/(.*)\.webp(.*)$  /$1$2$3/$4.png [R=307,L]

B) The 4 $variables are holding the parts from the RewriteRule. These are:
(.?) = are we in a subdirectory ? $1
(site/assets/files/) = is it an image under PW control ? $2
([0-9]+) = page id $3
(.
).webp = image filename without dot and extension $4
This seems to be common apache .htaccess feature, that it parses the RewriteRule first, and fills the $variables in the above RewriteCond with the found values.

https://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewritecond

RewriteRule backreferences: These are backreferences of the form $N (0 <= N <= 9). $1 to $9 provide access to the grouped parts (in parentheses) of the pattern, from the RewriteRule which is subject to the current set of RewriteCond conditions. $0 provides access to the whole string matched by that pattern.
@ryancramerdesign

This comment has been minimized.

Copy link
Contributor

commented May 23, 2019

Thanks @horst-n all makes sense now.

I'd forgotten that capturing parenthesis from the RewriteRule can be used in a prior RewriteCond, but now with coffee in hand, it's very clear how it works.

With this updated .htaccess, it looks like it would only use webp for cases where you'd called the $pageimage->webpUrl()? Whereas your previous iteration would have replaced jpg/png requests that could be fulfilled by a webp file, with the actual webp file (making it a more global drop-in webp upgrade). I like this approach. But it seems like the right strategy would depend on the context?

For people looking to upgrade to webp images without updating their site code, wouldn't the previous .htaccess strategy have been a better way to go? I did read in the thread about how people mentioned saving an image off of a website, and the drawback of having a jpg extension file with webp content in it, but I really don't think that matters for most? For me at least, I don't design websites for that purpose (people scraping the images off of it). Many might even consider this behavior a benefit as it reduces the chances of someone repurposing images they scraped from your site.

Basically, I don't think the issue described above (scraping) is one that should affect the approach. I also think it's better to avoid redirects when possible, because they double-up those requests and might potentially have SEO implications. But all in all, it seems like the best .htaccess strategy just depends on the case. Would you agree, or do you think the new version is definitely preferable?

@LostKobrakai

This comment has been minimized.

Copy link
Contributor

commented May 23, 2019

For me at least, I don't design websites for that purpose (people scraping the images off of it). Many might even consider this behavior a benefit as it reduces the chances of someone repurposing images they scraped from your site.

I would not be that quick. From the user saving your images' point of view it's just bad UX, true, but there are also things like google images, which dynamically generates thumbnails based on the images found on sites. HTTP based thumbnailing solutions (+ CDN) generally become more and more popular (https://github.com/thephpleague/glide and lot of others / other languages) and there might be tools out there using those e.g. for thumbnails of received webmentions (newer improved alternative to pingbacks) or similar. To a degree those probably don't trust the extension, but also consult the image content to determine the type, but the enduser is certainly not the only receiver of a file with an incorrect extension.

What I would support is using images without extension and serve the file with correct content-type by browser support.

@TomS-

This comment has been minimized.

Copy link

commented May 23, 2019

@LostKobrakai I must admit I'm a very big fan of glide and image as a service. However, I should imagine to implement this into ProcessWire would be a very big task. But it would be a great way of modernising ProcessWire in the regards of image processing. It also solved problems previously talked about on when to choose ImageMagick and how to handle that.

@LostKobrakai

This comment has been minimized.

Copy link
Contributor

commented May 23, 2019

I'm not suggesting we should use glide in ProcessWire, but rather that tools like glide might also be clients of the data we serve, not just a user sitting in front of a webbrowser.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.