Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Close input early for read errors #1370

Closed
kleisauke opened this issue Jul 18, 2019 · 32 comments
Closed

Close input early for read errors #1370

kleisauke opened this issue Jul 18, 2019 · 32 comments

Comments

@kleisauke
Copy link
Member

It might be necessary to close the input file prematurely for corrupt images / read errors (similar to kleisauke/net-vips#12 / #1066).

Consider this pyvips example:

import logging
import os
import gc

import pyvips

logging.basicConfig(level=logging.DEBUG)
pyvips.cache_set_max(0)

# https://t0.nl/corrupted.tif
file_name = 'corrupted.tif'

try:
    # im = pyvips.Image.new_from_file(file_name, access=pyvips.Access.SEQUENTIAL)
    im = pyvips.Image.new_from_file(file_name)
    im.write_to_file('x.tif')
except Exception as e:
    print(str(e))
finally:
    # no effect
    gc.collect()

    try:
        os.remove(file_name)
    except:
        # being used by another process
        print('{0} could not be deleted'.format(file_name))

This patch seems to work for me (for tiff images):

diff --git a/libvips/foreign/tiff2vips.c b/libvips/foreign/tiff2vips.c
index 1111111..2222222 100644
--- a/libvips/foreign/tiff2vips.c
+++ b/libvips/foreign/tiff2vips.c
@@ -2372,12 +2372,16 @@ vips__tiff_read( const char *filename, VipsImage *out,
 		return( -1 );
 
 	if( rtiff->header.tiled ) {
-		if( rtiff_read_tilewise( rtiff, out ) )
+		if( rtiff_read_tilewise( rtiff, out ) ) {
+			rtiff_free( rtiff );
 			return( -1 );
+		}
 	}
 	else {
-		if( rtiff_read_stripwise( rtiff, out ) )
+		if( rtiff_read_stripwise( rtiff, out ) ) {
+			rtiff_free( rtiff );
 			return( -1 );
+		}
 	}
 
 	return( 0 );

Originally posted in kleisauke/net-vips#12 (comment)

Unfortunately, the above patch only works for random access. It doesn't work for sequential access.

@kleisauke
Copy link
Member Author

This patch seems to work for both random and sequential access:

diff --git a/libvips/foreign/tiff2vips.c b/libvips/foreign/tiff2vips.c
index 1111111..2222222 100644
--- a/libvips/foreign/tiff2vips.c
+++ b/libvips/foreign/tiff2vips.c
@@ -557,6 +557,7 @@ rtiff_strip_read( Rtiff *rtiff, int strip, tdata_t buf )
 
 	if( length == -1 ) {
 		vips_foreign_load_invalidate( rtiff->out );
+		rtiff_free( rtiff );
 		vips_error( "tiff2vips", "%s", _( "read error" ) );
 		return( -1 );
 	}

@jcupitt
Copy link
Member

jcupitt commented Jul 19, 2019

Oh, nice!

This all feels very hacky though :( Perhaps there's a cleaner solution to all these early close problems?

I was thinking about the evalstart / eval / evalend signals that are emitted for progress feedback, and the minimise signal that's sent after an eval pipeline has run. Perhaps we could use one of those somehow?

minimise seems like the closest to what we need. Here's tilecache listening for minimise and dropping unused tiles:

https://github.com/libvips/libvips/blob/master/libvips/conversion/tilecache.c#L350

It's emitted at the end of every threadpool iteration (ie. the end of every large computation):

https://github.com/libvips/libvips/blob/master/libvips/iofuncs/threadpool.c#L987

Loaders would need to be prepared to reopen their fds if necessary, but they could maybe close on minimise of their output.

@jcupitt
Copy link
Member

jcupitt commented Jul 20, 2019

I made a branch to experiment with using minimise for early close. This signal is emitted by threadpool at the end of a loop over an image and is remitted on all upstream images, so I think it should catch all cases, for example:

vips crop huge.png x.jpg 0 0 100 100 

It'll early-close huge.png, even though we never read the final line.

It seems to work, but obv. needs more testing. I did tiff, png and jpg. Perhaps other loaders could benefit as well.

I think it should be a cleaner and more reliable way of catching end of computation. What do you think?

jcupitt added a commit that referenced this issue Jul 20, 2019
We close loaders early in order to save file handles, and on Windows to
make sure that files can be deleted as soon as possible.

Currently loaders do this by watching the Y coordinate of requests and
freeing the fd when the final line of the file is fetched. This is messy
and does not always work, since there are cases when the final line is
not fetched.

Instead, this patch gets loaders to listen for "minimise" on their
output and close on that. This signal is emitted on all upstream images
whenever a threadpool finishes a scan of an image and is usually used to
trim caches after computation.

See #1370
@jcupitt
Copy link
Member

jcupitt commented Jul 22, 2019

I added minimise handlers for gif, heif, rad, webp. The branch is here:

https://github.com/libvips/libvips/tree/loader-minimise-experiment

@kleisauke
Copy link
Member Author

Awesome! That looks much better and more reliable.

I just tested the loader-minimise-experiment branch on Windows, see:
https://gist.github.com/kleisauke/4b8382c97bdb5a96f51ce6180379a549

Output:

Successfully deleted truncated.jpg
Successfully deleted corrupted.tif
Successfully deleted trailer_unterminated.pdf
truncated.png couldn't be deleted
Loaded from niftiload instead than analyzeload, skipping test
Successfully deleted truncated.hdr
truncated.gif couldn't be deleted
Successfully deleted truncated.webp
Loaded from magickload instead than fitsload, skipping test
Successfully deleted truncated.fits
Successfully deleted truncated.svs
truncated.exr couldn't be deleted
Successfully deleted truncated.svg
Successfully deleted truncated.bmp
Successfully deleted truncated_nifti.nii.gz
truncated.heic couldn't be deleted

The PNG, GIF, HEIC, and EXR (I guess this can be ignored) loaders doesn't early close on truncated files. For PNG, it looks like it's the cause of this return statement:

I'm not sure why the other loaders don't work (perhaps the g_signal_connect is set on the wrong GObject instance?).

The above log also reveals that truncated.hdr and truncated.fits are loaded by the wrong loader, so that has to be checked as well.

@jcupitt
Copy link
Member

jcupitt commented Jul 22, 2019

Oh huh interesting, I'll have a look.

Thanks for testing!

jcupitt added a commit that referenced this issue Jul 23, 2019
Kleis pointed out a suprious return in png load minimise.

see #1370 (comment)
@jcupitt
Copy link
Member

jcupitt commented Jul 23, 2019

You're right, the logic was all tangled up in png minimise. Nice!

I'm puzzled by gif and heif failing too. Do you have your test files somewhere handy?

@kleisauke
Copy link
Member Author

Thanks, PNG seems to work fine now!

The GIF and HEIC files were created in this way:

wget https://github.com/libvips/libvips/raw/master/test/test-suite/images/cogs.gif
head -c 2000 cogs.gif > truncated.gif
wget https://github.com/libvips/libvips/raw/master/test/test-suite/images/Example1.heic
head -c 10000 Example1.heic > truncated.heic

The above gist also shows (within the comments) how the files were created.

jcupitt added a commit that referenced this issue Jul 24, 2019
we were not closing early on a read error during gif scan

see #1370 (comment)
@jcupitt
Copy link
Member

jcupitt commented Jul 24, 2019

Ooop, sorry, I missed the comment.

I think I've fixed GIF and HEIC. They were closing early if there was an error reading pixels, but they were not closing early if the error happened during a scan of the image header.

@kleisauke
Copy link
Member Author

Thanks! The GIF and HEIC loaders seems to work fine now:

Successfully deleted truncated.jpg
Successfully deleted corrupted.tif
Successfully deleted trailer_unterminated.pdf
Successfully deleted sample.mat
Successfully deleted truncated.png
Loaded from niftiload instead than analyzeload, skipping test
Successfully deleted truncated.hdr
Successfully deleted truncated.gif
Successfully deleted truncated.webp
Loaded from magickload instead than fitsload, skipping test
Successfully deleted truncated.fits
Successfully deleted truncated.svs
truncated.exr couldn't be deleted
Successfully deleted truncated.svg
Successfully deleted truncated.bmp
Successfully deleted truncated_nifti.nii.gz
Successfully deleted truncated.heic

PDF and EXR still uses the old Y coordinate logic:

/* In seq mode, we can shut down the input when we render the last
* row.
*/
if( top >= or->im->Ysize &&
load->access == VIPS_ACCESS_SEQUENTIAL )
vips_foreign_load_pdf_close( pdf );

/* In seq mode, we can shut down the input when we render the last
* row.
*/
if( top >= or->im->Ysize &&
load->access == VIPS_ACCESS_SEQUENTIAL )
vips_foreign_load_pdf_close( pdf );

/* We can't shut down the input file early for tile read, even if we
* know load is in sequential mode, since we are not inside a
* vips_sequential() and requests are not guaranteed to be in order.
*/

read_close( read );

I guess these loader can use the same minimise logic?

Also, do you plan to include this in the 8.8 branch?

@kleisauke
Copy link
Member Author

fwiw, these log messages:

Loaded from niftiload instead than analyzeload, skipping test
Loaded from magickload instead than fitsload, skipping test

Can be safely ignored, it tries to load the damaged image from the fallback loader. The original images (t00740_tr1_segm.hdr and WFPC2u5780205r_c0fx.fits) seems to load fine from the correct loader.

@jcupitt
Copy link
Member

jcupitt commented Jul 25, 2019

Great!

I added pdf/pdfium as well.

OpenEXR can't really do this, unfortunately. It would need revising to be a class so that it could get at the access hint and discover if it's in sequential mode. The whole thing needs revising really -- it uses the OEXR C API, which is really poor. It ought to be redone to use the C++ API. I'm not sure anyone uses it, so there's not much point.

Re. release: it feels like quite a big change in policy to me, and likely to have some consequences. We could probably use minimise more elsewhere too. Let's test it in master for a while and release in 8.9.

@kleisauke
Copy link
Member Author

pdf/pdfium also works with the minimise logic, thanks!

EXR images are not widely used, so I don't think it's worth revising the OpenEXR loader (especially with this weather, it's 40 degrees Celsius here 😅).

I'm fine with including this in 8.9 instead of 8.8. Perhaps we could revert commit 5e2d66d within the 8.8 branch? The minimise handlers solve this much more neatly, which also works with damaged images.

I could try to integrate these commits into the Windows build of NetVips. Performance critical environments are more likely to run on Linux, and this allows us to get feedback at an earlier stage. What do you think of this?

@jcupitt
Copy link
Member

jcupitt commented Jul 27, 2019

Heh yes it was 38C in London yesterday. I had to cycle 20 miles :(

I've removed that code from shrinkv already, I think.

Sure, try it out in NetVips. Early close is important there.

I was thinking about other ways we could use this. How about the case where you are assembling 1000s of images with something like arrayjoin?

Before, we would close input images when we read out the final line, and that would happen when the output read line swept past them as it built the image. Now though, they won't close until the end of evaluation. This will push memory use up quite a bit.

Perhaps operations like insert, composite and arrayjoin need new logic for sequential mode to send minimise signals to their inputs when they have finished with them?

@jcupitt
Copy link
Member

jcupitt commented Jul 28, 2019

I did some experiments, but I think the only good solution is close-on-last-line, as we had before.

I've put close-on-last back for jpg/tif/png. Together with the new minimise handler, we get nice behaviour for things like:

vips arrayjoin "$(echo *.jpg)" x.tif --across 10

Even if some of the inputs are damaged.

@jcupitt
Copy link
Member

jcupitt commented Jul 28, 2019

One thing we're not doing is freeing the memory associated with the jpg etc. images early. I'll see if there's something that can be added there.

@jcupitt
Copy link
Member

jcupitt commented Jul 28, 2019

Nope :( Not a simple thing.

Let's merge and close.

@jcupitt jcupitt closed this as completed Jul 28, 2019
@kleisauke
Copy link
Member Author

Many thanks for all your work! I'll integrate these commits into the Windows build of NetVips.

I've removed that code from shrinkv already, I think.

It has been removed within the master branch but not within the 8.8 branch, see for e.g. the change log:
https://github.com/libvips/libvips/blob/8.8/ChangeLog

@jcupitt
Copy link
Member

jcupitt commented Jul 28, 2019

It needs to stay in 8.8 (I think!).

We were able to remove it in master when we added support for "minimise".

@kleisauke
Copy link
Member Author

Indeed, let's keep that commit in the 8.8 branch. We can always refer users to the master branch where early close is handled more neatly.

(I thought it might cause confusion because the early shutdown behaviour in shrinkv doesn't work for damaged images)

@kleisauke
Copy link
Member Author

It looks like the recent commits that makes the GIF parser less strict breaks the truncated.gif test case.

import logging
import os
import gc

import pyvips

logging.basicConfig(level=logging.DEBUG)
pyvips.cache_set_max(0)

# wget https://github.com/libvips/libvips/raw/master/test/test-suite/images/cogs.gif
# head -c 2000 cogs.gif > truncated.gif
file_name = 'truncated.gif'

try:
    im = pyvips.Image.new_from_file(file_name, access=pyvips.Access.SEQUENTIAL)
    # im = pyvips.Image.new_from_file(file_name)
    im.write_to_file('x.jpg')
except Exception as e:
    print(str(e))
finally:
    # no effect
    gc.collect()

    try:
        os.remove(file_name)
    except:
        # being used by another process
        print('{0} could not be deleted'.format(file_name))
truncated.gif could not be deleted

(tested on Windows with libvips 8.8.2 and the minimise patches applied)

@jcupitt
Copy link
Member

jcupitt commented Aug 31, 2019

It works in master (I think). What patches are you applying? All the minimise stuff?

$ head -c 2000 cogs.gif >truncated.gif
$ python
Python 2.7.16 (default, Apr  6 2019, 01:42:57) 
[GCC 8.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyvips
>>> x = pyvips.Image.new_from_file("truncated.gif", access="sequential")
>>> x.write_to_file("x.jpg")
vips_foreign_load_gif_minimise:
>>> 

@kleisauke
Copy link
Member Author

kleisauke commented Aug 31, 2019

I tried your nice C program on master, and I see:

$ head -c 2000 cogs.gif > truncated.gif
$ gcc -g -Wall try271.c `pkg-config vips --cflags --libs`
$ ./a.out truncated.gif
processing truncated.gif ...
  open files after sniff and header read:
0  1  2  3
vips_foreign_load_gif_minimise:
  open files after processing:
0  1  2  3
  open files after cleanup:
0  1  2

So it calls vips_foreign_load_gif_minimise but the file won't be closed early. This is the output I expected (for example):

$ wget https://raw.githubusercontent.com/lovell/sharp/master/test/fixtures/truncated.jpg
$ ./a.out truncated.jpg
processing truncated.jpg ...
  open files after sniff and header read:
0  1  2
  open files after processing:
0  1  2

(a.out:28078): VIPS-WARNING **: 14:58:46.086: read gave 2 warnings

(a.out:28078): VIPS-WARNING **: 14:58:46.087: VipsJpeg: Premature end of JPEG file

  open files after cleanup:
0  1  2

@jcupitt
Copy link
Member

jcupitt commented Sep 1, 2019

Oh, I totally forgot about that. Yes, you're right, the scanner is obviously leaving something open. I'll have a look.

@jcupitt
Copy link
Member

jcupitt commented Sep 1, 2019

I think I fixed it -- it was trying to keep the FILE open between header and load, and just rewinding it.

It now has close as a proper vfunc and always closes after header read. With that fd test program I see:

$ ./a.out ~/pics/truncated.gif 
processing /home/john/pics/truncated.gif ...
  open files after sniff and header read:
0  1  2
  open files after processing:
0  1  2
  open files after cleanup:
0  1  2

Is this patch easy for you to integrate? I could put it to that minimize branch if that would be simpler.

jcupitt added a commit that referenced this issue Sep 1, 2019
We were trying to keep the FILE open for gifload between header and
load, but this meant some corrupt GIFs could keep the file open longer
than they should.

Instead, make close into a vfunc and always close between header and
load.

see #1370 (comment)
@kleisauke
Copy link
Member Author

Thanks, it's working properly now!

Patch integrated with: libvips/build-win64-mxe@6f20676.

@jcupitt jcupitt reopened this Sep 19, 2019
@jcupitt
Copy link
Member

jcupitt commented Sep 19, 2019

I thought of a horrible problem :(

Consider this code:

#!/usr/bin/env php
<?php

require __DIR__ . '/vendor/autoload.php';

use Jcupitt\Vips;

for ($i = 1; $i < count($argv); $i++) {
  $image = Vips\Image::newFromFile($argv[$i], [
    "n" => -1,
    "access" => "sequential"
  ]);

  $page_height = $image->get("page-height");
  $n_pages = $image->get("n-pages");
  echo($argv[$i] . " has " . $n_pages . " pages\n");
  for ($p = 0; $p < $n_pages; $p++) {
    echo("  rendering page " . $p . " ...\n");
    $page = $image->crop(0, $p * $page_height, $image->width, $page_height);
    $page->writeToFile($argv[$i] . "_page_" . $p . ".png");
  }
}

ie. open a PDF and write the pages out as a set of PNGs.

After the first writeToFile, the PNG save eval loop will signal minimise and the PDF loader will shut down. Now when you try to save page 2, there's no PDF load any more and you get an error. Any program which uses one sequential input image to generate several output images will have this problem.

Possible solutions:

  1. _generate() could check to see if the PDF document has been closed, and reopen it. I don't know how much this would hurt performance.
  2. We could just ban this kind of program and say people need to reopen each time they want to do a save.

Yuk!

jcupitt added a commit that referenced this issue Sep 19, 2019
We were using "minimise" to close pdf input early, but this will break
programs which make several output images from one sequential input
image. For example, loading all pages of a PDF as a toilet-roll image,
then saving pages as a set of PNGs.

This patch adds vfuncs for open and close, and makes _generate reopen
the input if necessary.

We will need similar patches for pdfiumload, gifload, gifnsload,
tiffload etc.

see #1370 (comment)
@kleisauke
Copy link
Member Author

I guess you'll need to change that example to this:

#!/usr/bin/env php
<?php

require __DIR__ . '/vendor/autoload.php';

use Jcupitt\Vips;

for ($i = 1, $count = count($argv); $i < $count; $i++) {
    $image = Vips\Image::newFromFile($argv[$i], [
        'page' => 0,
        'access' => 'sequential'
    ]);

    $n_pages = 1;
    if ($image->typeof('n-pages') !== 0) {
        $n_pages = $image->get('n-pages');
    }

    echo $argv[$i] . ' has ' . $n_pages . ' pages' . PHP_EOL;

    echo '  rendering page 0 ...' . PHP_EOL;
    $image->writeToFile($argv[$i] . '_page_0.png', [
        'strip' => true,
    ]);

    for ($p = 1; $p < $n_pages; $p++) {
        echo '  rendering page ' . $p . ' ...' . PHP_EOL;
        $page = Vips\Image::newFromFile($argv[$i], [
            'page' => $p,
            'access' => 'sequential'
        ]);
        $page->writeToFile($argv[$i] . '_page_' . $p . '.png', [
            'strip' => true,
        ]);
    }
}

Otherwise it will not handle multi-size PDF files. See for example:
https://tcpdf.org/files/examples/example_028.pdf

@jcupitt
Copy link
Member

jcupitt commented Sep 27, 2019

Yes, that's true.

For things like GIF though, we can be sure that all pages are the same size, and being about to write out a strip as a set of small files is useful.

jcupitt added a commit that referenced this issue Oct 3, 2019
heifload will restart read if necessary after minimise

see #1370
jcupitt added a commit that referenced this issue Oct 3, 2019
after minimise, we need to reopen the underlying file

passes pytest but a proper test is still to come

#1370
jcupitt added a commit that referenced this issue Oct 6, 2019
do multiple renders from one seq iage, check fds are opened and closed
as expected

see #1370
@jcupitt
Copy link
Member

jcupitt commented Oct 7, 2019

OK, I think this is all done -- I've revised all the loaders to reopen after minimise, and there's a test that checks this behaviour.

@jcupitt jcupitt closed this as completed Oct 7, 2019
@kleisauke
Copy link
Member Author

Nice! I can confirm that the #1370 (comment) test-case no longer produces errors (also tested on Windows). Performing make check with the new test_descriptors also seems to work well.

@jcupitt
Copy link
Member

jcupitt commented Oct 8, 2019

Great! Thanks for testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants