Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Image resizing loses EXIF data #1055

Open
Toutouwai opened this issue Dec 19, 2019 · 10 comments
Open

Image resizing loses EXIF data #1055

Toutouwai opened this issue Dec 19, 2019 · 10 comments

Comments

@Toutouwai
Copy link

Toutouwai commented Dec 19, 2019

Short description of the issue

This issue is basically a re-opening of this closed issue: #348

The client-side resizing option in InputfieldImage shouldn't result in the loss of EXIF data that would otherwise be retained if the client-side resizing option were not used.

The PW client-side resizer is based on HTML5-ImageUploader. I propose that this be replaced with an alternative solution, perhaps MinifyJpegAsync.

Reasons being:

Besides MinifyJpegAsync, I know that the client-side resizing employed by DropzoneJS correctly retains EXIF data.

Setup/Environment

  • ProcessWire version: 3.0.147
@Toutouwai
Copy link
Author

Toutouwai commented Dec 19, 2019

Actually, it looks like the server-side resizing has the same problem: EXIF data is not retained if the image is resized. I tested with both the ImageMagick engine and the default GD engine, with the resizing occurring as a result of max width and max height settings in InputfieldImage.

@horst, any thoughts on this? I know you've said that you don't believe in client-side resizing but are the server-side resizing engines supposed to be retaining EXIF data?

@Toutouwai Toutouwai changed the title Client-side image resizing loses EXIF data Image resizing loses EXIF data Jan 21, 2020
@ryancramerdesign
Copy link
Member

@Toutouwai I'd like to setup a test case and was just wondering how you are testing to see that the EXIF data is lost? Do you have a particular jpeg file used for testing, and using exif_read_data() on uploaded versions of it? I know both ImageSizer and ImageInspector work with EXIF data, but don't know much about it beyond that, so also curious on @horst thoughts since he knows this subject much better than me.

I do know that EXIF data is where malicious stuff can sometimes be stored in JPEGs, so from that standpoint I kind of like the idea of it getting stripped out. Though that's a narrow perspective, based on that I don't really know what EXIF data is actually useful for when it comes to image variations (rotation data?). Keeping EXIF data would mean adding extra bytes to resulting image variations, which I'm not sure has value. It does seem like it's potentially useful to keep in the original copy though if one wants to examine camera info, etc.

@matjazpotocnik
Copy link
Collaborator

Though that's a narrow perspective

I think there are cases where EXIF data is unwanted or even forbidden and cases where EXIF data must remain. It would be nice if there would be an option in the image settings (details or/and input tab) to retain EXIF data or not.

@Toutouwai
Copy link
Author

Toutouwai commented Jan 21, 2020

@ryancramerdesign, any JPG direct from a camera/phone should contain EXIF data, but here is an image that can be used for testing purposes: https://1drv.ms/u/s!Ap5w3MQkqrRrghcpY9Q9Ef9Qz7qN

As for what EXIF data is useful for, there is a wealth of information there:

2020-01-22_103550

My current project is for a science-related website where the DateTimeOriginal (which can be different than the file created/modified times if the image has been edited) and GPS location data is particularly useful.

To clarify, my concern isn't about EXIF data not being retained on variations (which are generally for front-end display) but about EXIF data not being retained on uploaded originals when max width and max height settings are in use. I understand that the image sizer that produces variations is probably used for resizing uploads to max width/height also but I think different settings should be used here. Now that I think about it, it would be good it were possible to set a different quality setting for server-resized uploads too.

I hear what you're saying about potentially malicious stuff but it seems to me that there's no general policy in PW to strip EXIF data from uploads. The EXIF data is retained so long as max width/height settings don't result in a resize on upload.

You might be wondering why I don't disable the max width/height settings on my image fields. The reason is that digital SLRs are now capable of producing 50 megapixel images, where 40MB images are not unusual. This is vastly larger than is needed for most website uses. Besides the storage implications (my current project involves 70,000 images) this makes for a tedious upload experience for the editor, places big demands on memory and CPU when these massive images get resized, and creates unacceptable delays for the first page view on the front-end. I'd rather avoid all this and have images resized to some more reasonable dimensions on the client-side before upload.

If the client-side resizer could be changed to retain EXIF data then personally speaking this would resolve it because I always have client-side resizing enabled. But other users may prefer server-side resizing so it would be good to have the option to retain EXIF data in both cases. I haven't tested with GD but I know that ImageMagick is capable of retaining EXIF data after a resize so it must be getting discarded somewhere in the PW ImageMagick resizing engine.

A demo:

2020-01-22_105640

@Toutouwai
Copy link
Author

For client-side resizing I have a solution using Piexifjs that allows for minimal changes to the existing resizer.

In InputfieldImage.module:

//...
$config->scripts->add($thisURL . "piexif.$jsExt");
//...

In PWImageResizer.js:

if(this.currentFile.type === 'image/jpeg') {
    var exifObj = piexif.load(img.src);
    var exifStr = piexif.dump(exifObj);
    imageData = piexif.insert(exifStr, imageData);
}

@horst-n
Copy link

horst-n commented Feb 26, 2020

Serverside GD-lib does not support any meta markers. Thats why we historically don't want(ed) to touch the originally uploaded image in any case.
Imagick can do this, but currently we don't care with variations. Only the original images contain this.

We support IPTC data with original AND variation images, as this is natively supported in PHP since 4.0 or earlier.

How can PWImageResizer.js retain EXIF data when using the API or Hooks?

@Toutouwai
Copy link
Author

Imagick can do this, but currently we don't care with variations. Only the original images contain this.

The thing is, "images that have gone through the image sizer" and "variations" are not quite the same thing. If an image field has max width/height settings and server-side resizing specified then then the uploaded image goes through the image sizer but the result is not a variation - it's the original from which other variations will be generated. That's why I think we need extra configuration options for quality and metadata retention specifically for resizing on upload. Maybe it is only practical to offer the metadata retention for Imagick because for GD another library such as pel would be needed. I think it's reasonable to offer a kind of progressive enhancement with Imagick because if you care about images you are likely to be using Imagick over GD.

How can PWImageResizer.js retain EXIF data when using the API or Hooks?

This file only comes into play for client-side resizing of images uploaded via InputfieldImage. It doesn't relate to the API or hooks because there is no client in those cases.

@Toutouwai
Copy link
Author

Toutouwai commented Apr 19, 2020

The code suggested here should have a try/catch for each piexif function in case of malformed EXIF data or other bad data.

if(this.currentFile.type === 'image/jpeg') {
    try {
        var exifObj = piexif.load(img.src);
        try {
            var exifStr = piexif.dump(exifObj);
            try {
                imageData = piexif.insert(exifStr, imageData);
            } catch(error) {
                console.error(error);
            }
        } catch(error) {
            console.error(error);
        }
    } catch(error) {
        console.error(error);
    }
}

@Toutouwai
Copy link
Author

Toutouwai commented May 20, 2020

I've had a look at what's needed to retain EXIF when the "original" uploaded image is resized due to max width/height settings defined for the field. This will only work for ImageMagick because I don't think it's possible for GD to retain metadata. So it would be something where if a user cares about retaining EXIF they would need to use the ImageMagick resizing engine.

In ImageSizerEngine.php:

protected $keepMetadata = false;

public function setKeepMetadata($value = false) {
    $this->keepMetadata = (bool) $value;
    return $this;
}

In ImageSizerEngineIMagick::processResize()...

Don't strip the metadata if $this->keepMetadata is true:

if(!$this->keepMetadata) {
    // remove not wanted / needed Metadata = this is the same behave as processed via GD-lib
    foreach(array_keys($this->imageMetadata) as $k) {
        #if('icc'==$k) continue;     // we keep embedded icc profiles
        #if('iptc' == $k) continue;  // we keep embedded iptc data
        #if('exif'==$k && $this->data['keepEXIF']) continue; // we have to keep exif data too
        #if('xmp'==$k && $this->data['keepXMP']) continue; // we have to keep xmp data too
        $this->im->profileImage("$k", null); // remove this one
    }
}

Don't do any rotation according to EXIF orientation if EXIF data is to be retained in the resized image:
Otherwise the orientation will be wrong for variations derived from this original. Browsers that support image orientation will correctly rotate the original image. For more browser support the PW admin CSS should include img { image-orientation:from-image; }.

if($this->keepMetadata) {
    $needRotation = false;
} else if($this->autoRotation !== true) {
    $needRotation = false;
} else if($this->checkOrientation($orientations) && (!empty($orientations[0]) || !empty($orientations[1]))) {
    $needRotation = true;
} else {
    $needRotation = false;
}

In Pageimage::size()...

Add keepMetadata option:

$defaultOptions = array(
    'upscaling' => true,
    'cropping' => true,
    'interlace' => false, 
    'sharpening' => 'soft',
    'quality' => 90,
    'hidpiQuality' => 40, 
    'webpQuality' => 90,
    'webpAdd' => false,
    'webpName' => '', // use this for the webp file basename rather than mirroring from the jpg/png
    'webpOnly' => false, // only keep the webp version (requires webpAdd option)
    'suffix' => array(), // can be array of suffixes or string of 1 suffix
    'forceNew' => false,  // force it to create new image even if already exists
    'hidpi' => false, 
    'cleanFilename' => false, // clean filename of historial resize information
    'rotate' => 0,
    'flip' => '', 
    'nameWidth' => null, // override width to use for filename, int when populated
    'nameHeight' => null,  // override height to use for filename, int when populated
    'focus' => true, // allow single dimension resizes to use focus area?
    'zoom' => null, // zoom override, used only if focus is applicable, int when populated
    'allowOriginal' => false, // Return original image if already at requested dimensions? (must be only specified option)
    'keepMetadata' => false,
);

Set engine option according to keepMetadata size() option:

if($options['keepMetadata']) {
    $engine->setKeepMetadata($options['keepMetadata']);
}

In InputfieldImage::fileAdded() use keepMetadata option if image needs to be resized according to max width/height settings:

$pagefile2 = $pagefile->size($maxWidth, $maxHeight, array('cropping' => false, 'keepMetadata' => true));

@Toutouwai
Copy link
Author

Toutouwai commented May 21, 2020

It would be good if there was an additional config option in Image fields: "Server-side resize quality percent for JPEGs", similar to the equivalent clientQuality setting. Then this setting could be used when an image is resized on upload to max width/height.

$pagefile2 = $pagefile->size($maxWidth, $maxHeight, array('cropping' => false, 'keepMetadata' => true, 'quality' => $serverQuality));

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants