Amp faster image (#480)
Add support for the FasterImage dimension-fetching library (PHP 5.3+), which relies on curl multi to fetch image dimensions in parallel. FastImage stays in place for backward compatibility with older PHP versions and for contexts where curl_multi may not be usable.

props @gititon 
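
For readers unfamiliar with curl multi, the parallel fetching this commit relies on follows roughly the pattern below. This is a minimal sketch, not the plugin's code: `hypothetical_fetch_all()` and its result keys are illustrative names, and it downloads whole response bodies rather than aborting early once the dimensions are known.

```php
<?php
/**
 * Hypothetical helper: fetch several URLs concurrently with curl multi.
 *
 * @param array $urls    URLs to request.
 * @param int   $timeout Per-request timeout in seconds.
 * @return array Map of URL => array with either 'body' or 'failure_reason'.
 */
function hypothetical_fetch_all(array $urls, $timeout = 4) {
    if (empty($urls)) {
        return array();
    }

    $multi   = curl_multi_init();
    $handles = array();

    // One easy handle per URL, all attached to the same multi handle.
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($multi, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none is still active.
    do {
        $status = curl_multi_exec($multi, $active);
        if ($active && curl_multi_select($multi) === -1) {
            usleep(250);
        }
    } while ($active && CURLM_OK === $status);

    // Collect results and detach every handle.
    $results = array();
    foreach ($handles as $url => $ch) {
        $error = curl_error($ch);
        if ($error) {
            $results[$url] = array('failure_reason' => $error);
        } else {
            $results[$url] = array('body' => curl_multi_getcontent($ch));
        }
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $results;
}
```

Keeping the handles in an array keyed by URL makes it straightforward to match each result to its request and to detach every handle afterwards.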

* AMP: "Unpackage" FasterImage and its dependencies for use in amp plugin.

Installed FasterImage (and its dependencies) via Composer. Flattened directory structure. Renamed classes to reflect the GitHub repo and hash they came from. Renamed files in accordance with WordPress standards. Replaced namespace declarations/use statements with includes. Updated files to invoke/refer to new class and interface names.
See https://github.com/willwashburn/FasterImage and https://github.com/willwashburn/stream for original code

* AMP: Integrated FasterImage, removed extraction from meta data

Refactored Image Sanitizer and Dimension Extractor to use FasterImage to get dimensions for all images that need them concurrently.
As a result, it can now fetch dimensions for 100 images in 2 seconds whereas it previously took about 40 seconds. It takes less than a
second to get the dimensions for 10 remote images, and .002 seconds to get their dimensions when they are stored in transients (without memcached).
Removed the extract_from_attachment_metadata method for getting dimensions because it had to run multiple database queries
(triggered by attachment_url_to_post_id() and wp_get_attachment_metadata) for every image every time an amp page was viewed. With the new low "cost"
of extracting dimensions from the images themselves, it did not seem to warrant retaining the additional complexity.
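
The cache-then-batch flow described above might look roughly like this. It is a minimal sketch under stated assumptions: a plain array stands in for WordPress transients, and `hypothetical_get_dimensions()` and the `$fetch_batch` callable are illustrative names, not the plugin's API.

```php
<?php
/**
 * Hypothetical helper: return dimensions for every URL, using cached
 * values where available and fetching the rest in one batch.
 *
 * @param array    $urls        Image URLs that need dimensions.
 * @param array    $cache       Stand-in for a persistent cache, passed by reference.
 * @param callable $fetch_batch Receives a list of URLs, returns URL => dimensions.
 * @return array Map of URL => dimensions.
 */
function hypothetical_get_dimensions(array $urls, array &$cache, $fetch_batch) {
    $dimensions = array();
    $to_fetch   = array();

    // Serve what we can from the cache and queue everything else.
    foreach ($urls as $url) {
        if (isset($cache[$url])) {
            $dimensions[$url] = $cache[$url];
        } else {
            $to_fetch[] = $url;
        }
    }

    // A single batch call covers all uncached images at once.
    if (!empty($to_fetch)) {
        foreach (call_user_func($fetch_batch, $to_fetch) as $url => $size) {
            $cache[$url]      = $size;
            $dimensions[$url] = $size;
        }
    }

    return $dimensions;
}
```

Only the uncached URLs reach the fetcher, so a page whose images are all cached triggers no remote requests at all.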

* Changing [] to array for PHP 5.2 compliance

* Changing [] to array for PHP 5.2 compliance

* AMP: Removing closure from Faster Image library to make it php 5.2 compatible

* AMP: Ensure image extraction tests are using Photon image urls

* AMP: Ensure image extraction tests are using Photon image urls

* AMP: Restoring version of Faster Image that uses closure even though it does not work in PHP 5.2 since my first attempt to remove the closure did not work

* AMP: Using FastImage instead of FasterImage if PHP version < 5.3 (due to closure)

* AMP: Adding FastImage library back

* AMP: Accounting for the fact that PHP 5.2 has no array_column

* AMP: Changing image extraction tests to use http instead of https for Travis tests

* AMP: Normalized behavior for FasterImage and FastImage so tests can pass

Certain tests that tested behavior involving the FasterImage and FastImage libraries behaved differently depending on the PHP version that was running them, resulting in the same test never being able to pass under two different PHP versions (5.2 vs 5.3+). I normalized the libraries' behavior by suppressing the warning thrown by FastImage when it couldn't open an image and having it return false instead. This keeps everything working and allows the tests to pass.
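
The normalization described above can be illustrated with a small sketch. It assumes the warning comes from an `fopen()` call on the image URL; `hypothetical_open_image()` is an illustrative name and not a function in FastImage or the plugin.

```php
<?php
/**
 * Hypothetical helper: open an image for reading and report failure as a
 * plain false, without emitting a PHP warning.
 *
 * @param string $uri Path or URL of the image.
 * @return resource|false Stream handle, or false if it could not be opened.
 */
function hypothetical_open_image($uri) {
    // The @ operator silences the warning fopen() raises on failure;
    // fopen() itself already returns false in that case.
    return @fopen($uri, 'rb');
}
```

With the warning suppressed, callers on every PHP version see the same failure signal and can check for `false` instead of relying on error output.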

* AMP-WP: Change test URLs back to HTTPS

I had changed test URLs from HTTPS to HTTP while investigating why tests were passing locally but failing during Travis CI and forgot to change them back.

* AMP-WP: Bring modified files back into conformance with WordPress coding/formatting standards

I'd thought PHPStorm was taking care of compliance with WordPress coding/formatting standards after it prompted me to do so and I said yes. Fool me once...

* AMP-WP: Reduced FasterImage timeout from 10 to 4 and using real User Agent

* AMP-WP: Removing potentially confusing comment about adjust_and_replace() being called twice

* AMP-WP: Provide a fallback user agent when extracting remote image dimensions in case one isn't present/available

* AMP-WP: Removed unnecessary nesting from AMP_Image_Dimension_Extractor::determine_which_images_to_fetch()

* AMP-WP: Removed blank line from AMP_Image_Sanitizer

* AMP-WP: Adding period to inline comment to conform to  documentation standards

* AMP-WP: Renamed AMP_Image_Dimension_Extractor::modify_return_value_for_tests_that_disable_extraction_filter() to the more general normalize_value_returned_by_extract() to account for any scenario that may bypass the filter that modifies the array

* AMP-WP: Use FastImage for image dimension extraction if curl/multi curl not present

* AMP-WP: Experimenting with changing test image urls from Photon back to placehold.it to rule that out as the reason tests are failing

* AMP-WP: Dummy commit (whitespace) to trigger rerun of failed Travis CI suite

* AMP-WP: Changing image extractor test urls back to use Photon

* AMP-WP: Adding debugging info to help determine why tests are failing on 5.2

* AMP-WP: Changing debugging info to help determine why tests are failing on 5.2

* AMP-WP: Changing debugging info to help determine why tests are failing on 5.2

* AMP-WP: Changing debugging info to help determine why tests are failing on 5.2

* AMP-WP: Changing image dimension extraction test urls to http from https because fopen is not configured to open https on Travis CI server. Removing debugging output.

* AMP-WP: Changed name of dimension extraction filter

* AMP-WP: Changed image dimension extraction filter callbacks to take multiple arguments so value can cascade through filters. Defaulted extraction to failed to account for filter disablement.

* AMP-WP: Checking for empty image src attribute as well as missing image src attribute

* AMP-WP: Changed image dimension extraction filter callback back to single parameter for simplicity

* AMP-WP: Updated wpcom-helper image extraction callbacks to use batch filter and updated callbacks to process urls in batch

* AMP-WP: Updating faster image library to use amp-wp v<version number> as user agent

* AMP-WP: Updating faster image library to use amp-wp v<version number> as user agent

* AMP-WP: Added a section to readme.md about amp image dimension extraction and how to create and update custom filter callbacks

* AMP-WP: Make image dimension extraction user agent contain site url and overridable via filter

* WP-AMP: Extract image dimensions when width or height attributes are present but have no value

* AMP-WP: removing array_fill_keys from image dimension extraction tests and updating  instead

* WP-AMP: Excluding images with bad urls from image extraction

* AMP-WP: Making final adjustments to image nodes as soon as it is determined that they don't need dimensions extracted to eliminate unnecessary loop

* AMP-WP: Minor naming, optimization, and organization tweaks

* Renamed adjust_and_replace_nodes() to adjust_and_replace_nodes_in_array_map() to suggest argument is more than an array of DOMNodes
* Removed hasAttribute() call on the DOM node since getAttribute() returns an empty string when the attribute isn't present
* Moved call to get_default_user_agent into FasterImage method, set output of get_default_user_agent() as filter hook input.
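
The fallback rules spread across the commits above (FastImage on PHP older than 5.3, and FastImage when curl multi is unavailable) can be summarized in a short sketch. `hypothetical_pick_extractor()` and its return values are illustrative only and do not appear in the plugin.

```php
<?php
/**
 * Hypothetical helper: decide which dimension-fetching library to use.
 *
 * @param string $php_version    Version string, e.g. PHP_VERSION.
 * @param bool   $has_curl_multi Whether the curl multi functions exist.
 * @return string 'FasterImage' or 'FastImage'.
 */
function hypothetical_pick_extractor($php_version, $has_curl_multi) {
    // FasterImage uses a closure (PHP 5.3+) and needs curl multi.
    if (version_compare($php_version, '5.3', '>=') && $has_curl_multi) {
        return 'FasterImage';
    }

    // Everything else falls back to the older, sequential library.
    return 'FastImage';
}
```

At runtime the arguments would typically be `PHP_VERSION` and `function_exists('curl_multi_exec')`.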
gititon authored and mjangda committed Nov 28, 2016
1 parent 2587b8a commit 3e6f1ac
Showing 14 changed files with 1,484 additions and 419 deletions.
131 changes: 131 additions & 0 deletions includes/lib/class-faster-image-b52f1a8-exif-parser.php
@@ -0,0 +1,131 @@
<?php

/**
 * Class ExifParser
 *
 * @package FasterImage
 */
class Faster_Image_B52f1a8_Exif_Parser
{
    /**
     * @var int
     */
    protected $width;
    /**
     * @var int
     */
    protected $height;

    /**
     * @var string unpack() format code for a 16-bit unsigned value
     */
    protected $short;

    /**
     * @var string unpack() format code for a 32-bit unsigned value
     */
    protected $long;

    /**
     * @var Stream_17b32f3_Streamable_Interface
     */
    protected $stream;

    /**
     * @var int
     */
    protected $orientation;

    /**
     * ExifParser constructor.
     *
     * @param Stream_17b32f3_Streamable_Interface $stream
     */
    public function __construct(Stream_17b32f3_Streamable_Interface $stream)
    {
        $this->stream = $stream;
        $this->parseExifIfd();
    }

    /**
     * @return int
     */
    public function getHeight()
    {
        return $this->height;
    }

    /**
     * @return int
     */
    public function getWidth()
    {
        return $this->width;
    }

    /**
     * @return bool
     */
    public function isRotated()
    {
        return (! empty($this->orientation) && $this->orientation >= 5);
    }

    /**
     * @return bool
     * @throws Faster_Image_B52f1a8_Invalid_Image_Exception
     */
    protected function parseExifIfd()
    {
        $byte_order = $this->stream->read(2);

        switch ( $byte_order ) {
            case 'II':
                $this->short = 'v';
                $this->long = 'V';
                break;
            case 'MM':
                $this->short = 'n';
                $this->long = 'N';
                break;
            default:
                throw new Faster_Image_B52f1a8_Invalid_Image_Exception;
                break;
        }

        $this->stream->read(2);

        $offset = current(unpack($this->long, $this->stream->read(4)));

        $this->stream->read($offset - 8);

        $tag_count = current(unpack($this->short, $this->stream->read(2)));

        for ( $i = $tag_count; $i > 0; $i-- ) {

            $type = current(unpack($this->short, $this->stream->read(2)));
            $this->stream->read(6);
            $data = current(unpack($this->short, $this->stream->read(2)));

            switch ( $type ) {
                case 0x0100:
                    $this->width = $data;
                    break;
                case 0x0101:
                    $this->height = $data;
                    break;
                case 0x0112:
                    $this->orientation = $data;
                    break;
            }

            if ( isset($this->width) && isset($this->height) && isset($this->orientation) ) {
                return true;
            }

            $this->stream->read(2);
        }

        return false;
    }
}
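
The byte-order handling in `parseExifIfd()` maps the TIFF markers `II` (little-endian) and `MM` (big-endian) to `unpack()` format codes. The sketch below isolates that idea for 16-bit values; `hypothetical_read_uint16()` is an illustrative name and is not part of FasterImage.

```php
<?php
/**
 * Hypothetical helper: decode a 16-bit unsigned integer from two bytes,
 * honoring the TIFF byte-order marker.
 *
 * @param string $bytes      Exactly two raw bytes.
 * @param string $byte_order 'II' for little-endian, 'MM' for big-endian.
 * @return int
 */
function hypothetical_read_uint16($bytes, $byte_order) {
    switch ($byte_order) {
        case 'II':
            $format = 'v'; // unsigned short, little-endian
            break;
        case 'MM':
            $format = 'n'; // unsigned short, big-endian
            break;
        default:
            throw new InvalidArgumentException('Unknown byte order marker');
    }

    // unpack() returns a 1-indexed array; current() picks its only element.
    $values = unpack($format, $bytes);

    return current($values);
}
```

The same two bytes `"\x01\x00"` decode to 1 under `II` and to 256 under `MM`, which is why the marker has to be read before any tag values are interpreted.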
192 changes: 192 additions & 0 deletions includes/lib/class-faster-image-b52f1a8-faster-image.php
@@ -0,0 +1,192 @@
<?php

require_once( AMP__DIR__ . '/includes/lib/class-faster-image-b52f1a8-invalid-image-exception.php' );
require_once( AMP__DIR__ . '/includes/lib/class-faster-image-b52f1a8-image-parser.php' );
require_once( AMP__DIR__ . '/includes/lib/class-stream-17b32f3-stream.php' );
require_once( AMP__DIR__ . '/includes/lib/class-stream-17b32f3-stream-buffer-too-small-exception.php' );

/**
 * FasterImage - Because sometimes you just want the size, and you want them in
 * parallel!
 *
 * Based on the PHP stream implementation by Tom Moor (http://tommoor.com)
 * which was based on the original Ruby Implementation by Steven Sykes
 * (https://github.com/sdsykes/fastimage)
 *
 * MIT Licensed
 *
 * @version 0.01
 */
class Faster_Image_B52f1a8_Faster_Image
{
    /**
     * The default timeout
     *
     * @var int
     */
    protected $timeout = 4;

    public function __construct( $user_agent )
    {
        $this->user_agent = $user_agent;
    }

    /**
     * Get the size of each of the urls in a list
     *
     * @param array $urls
     *
     * @return array
     * @throws \Exception
     */
    public function batch(array $urls)
    {

        $multi = curl_multi_init();
        $results = array();

        // Create the curl handles and add them to the multi_request
        foreach ( array_values($urls) as $count => $uri ) {

            $results[$uri] = array();

            $$count = $this->handle($uri, $results[$uri]);

            $code = curl_multi_add_handle($multi, $$count);

            if ( $code != CURLM_OK ) {
                throw new \Exception("Curl handle for $uri could not be added");
            }
        }

        // Perform the requests
        do {
            while ( ($mrc = curl_multi_exec($multi, $active)) == CURLM_CALL_MULTI_PERFORM ) ;
            if ( $mrc != CURLM_OK && $mrc != CURLM_CALL_MULTI_PERFORM ) {
                throw new \Exception("Curl error code: $mrc");
            }

            if ( $active && curl_multi_select($multi) === -1 ) {
                // Perform a usleep if a select returns -1.
                // See: https://bugs.php.net/bug.php?id=61141
                usleep(250);
            }
        } while ( $active );

        // Figure out why individual requests may have failed
        foreach ( array_values($urls) as $count => $uri ) {
            $error = curl_error($$count);

            if ( $error ) {
                $results[$uri]['failure_reason'] = $error;
            }
        }

        return $results;
    }

    /**
     * @param int $seconds
     */
    public function setTimeout($seconds)
    {
        $this->timeout = $seconds;
    }

    /**
     * Create the handle for the curl request
     *
     * @param string $url
     * @param array  $result
     *
     * @return resource
     */
    protected function handle($url, & $result)
    {
        $stream = new Stream_17b32f3_Stream();
        $parser = new Faster_Image_B52f1a8_Image_Parser($stream);
        $result['rounds'] = 0;
        $result['bytes'] = 0;
        $result['size'] = 'failed';

        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_BUFFERSIZE, 256);
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $this->timeout);
        curl_setopt($ch, CURLOPT_TIMEOUT, $this->timeout);
        curl_setopt($ch, CURLOPT_USERAGENT, $this->user_agent);
        curl_setopt($ch, CURLOPT_HTTPHEADER, array(
                "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5",
                "Cache-Control: max-age=0",
                "Connection: keep-alive",
                "Keep-Alive: 300",
                "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7",
                "Accept-Language: en-us,en;q=0.5",
                "Pragma: ", // browsers keep this blank.
            )
        );
        curl_setopt($ch, CURLOPT_ENCODING, "");

        curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $str) use (& $result, & $parser, & $stream, $url) {

            $result['rounds']++;
            $result['bytes'] += strlen($str);

            $stream->write($str);

            try {
                // store the type in the result array by looking at the bits
                $result['type'] = $parser->parseType();

                /*
                 * We try here to parse the buffer of characters we already have
                 * for the size.
                 */
                $result['size'] = $parser->parseSize() ?: 'failed';
            }
            catch (Stream_17b32f3_Stream_Buffer_Too_Small_Exception $e) {
                /*
                 * If this exception is thrown, we don't have enough of the stream buffered
                 * so in order to tell curl to keep streaming we need to return the number
                 * of bytes we have already handled
                 *
                 * We set the 'size' to 'failed' in the case that we've done
                 * the entire image and we couldn't figure it out. Otherwise
                 * it'll get overwritten with the next round.
                 */
                $result['size'] = 'failed';

                return strlen($str);
            }
            catch (Faster_Image_B52f1a8_Invalid_Image_Exception $e) {

                /*
                 * This means we've determined that we're lost and don't know
                 * how to parse this image.
                 *
                 * We set the size to invalid and move on
                 */
                $result['size'] = 'invalid';

                return -1;
            }


            /*
             * We return -1 to abort the transfer when we have enough buffered
             * to find the size
             */
            //
            // hey curl! this is an error. But really we just are stopping cause
            // we already have what we want
            return -1;
        });

        return $ch;
    }
}
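
The write callback above has three outcomes per chunk: keep streaming when too little is buffered, abort with `'invalid'` when the data cannot be an image, and abort once the size is known. The sketch below simulates that control flow without curl, using the fixed layout of a PNG header as a stand-in for the real parser. `hypothetical_png_size()` and `hypothetical_feed_chunks()` are illustrative names, not part of FasterImage.

```php
<?php
/**
 * Hypothetical helper: try to read PNG dimensions from the bytes buffered so far.
 *
 * @param string $buffer Bytes received so far.
 * @return array|null|false array(width, height), null if more bytes are
 *                          needed, or false if this is not a PNG.
 */
function hypothetical_png_size($buffer) {
    $signature = "\x89PNG\r\n\x1a\n";

    if (strlen($buffer) >= 8 && substr($buffer, 0, 8) !== $signature) {
        return false;
    }
    if (strlen($buffer) < 24) {
        return null;
    }

    // Width and height are big-endian 32-bit integers at offsets 16 and 20.
    $dims = unpack('Nwidth/Nheight', substr($buffer, 16, 8));

    return array($dims['width'], $dims['height']);
}

/**
 * Hypothetical helper: feed chunks one at a time, as curl would, and stop
 * as soon as the outcome is known.
 *
 * @param array $chunks Successive pieces of the response body.
 * @return array Result with 'rounds', 'bytes' and 'size' keys.
 */
function hypothetical_feed_chunks(array $chunks) {
    $result = array('rounds' => 0, 'bytes' => 0, 'size' => 'failed');
    $buffer = '';

    foreach ($chunks as $chunk) {
        $result['rounds']++;
        $result['bytes'] += strlen($chunk);
        $buffer .= $chunk;

        $size = hypothetical_png_size($buffer);
        if (null === $size) {
            continue; // not enough data yet, keep streaming
        }

        // Either the size is known or the data is invalid: stop reading.
        $result['size'] = (false === $size) ? 'invalid' : $size;
        break;
    }

    return $result;
}
```

Stopping at the first decisive chunk is what keeps the transfer small: for most images only the first few hundred bytes ever need to be downloaded.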
