Edit: See @susemeee's comment below (the image COCO_train2014_000000167126.jpg is corrupted, and you can download a replacement at https://msvocds.blob.core.windows.net/images/262993_z.jpg)
I was trying to run prepro.py but eventually ran into an issue in scipy's pilutil package (see below).
I've installed all dependencies, run the coco_preprocess.ipynb, and downloaded train2014.zip + val2014.zip and extracted them into coco/images.
Am I missing something?
$ python prepro.py --input_json coco/coco_raw.json --num_val 5000 --num_test 5000 --images_root coco/images --word_count_threshold 5 --output_json coco/cocotalk.json --output_h5 coco/cocotalk.h5
parsed input parameters:
example processed tokens:
['a', 'woman', 'riding', 'a', 'bike', 'down', 'a', 'bike', 'trail']
... lots of info deleted for brevity ...
inserting the special UNK token
assigned 5000 to val, 5000 to test.
encoded captions to array of size (616767, 16)
processing 0/123287 (0.00% done)
... lots of percentages deleted for brevity ...
processing 60000/123287 (48.67% done)
Traceback (most recent call last):
  File "prepro.py", line 236, in <module>
  File "prepro.py", line 186, in main
    Ir = imresize(I, (256,256))
  File "/usr/local/lib/python2.7/site-packages/scipy/misc/pilutil.py", line 424, in imresize
    im = toimage(arr, mode=mode)
  File "/usr/local/lib/python2.7/site-packages/scipy/misc/pilutil.py", line 234, in toimage
    raise ValueError("'arr' does not have a suitable array shape for "
ValueError: 'arr' does not have a suitable array shape for any mode.
Thank you for posting an issue.
poof! I'm not quite sure what's up here. One strategy to follow here is to print the filenames as they are being processed, and then manually look at the filename that failed. Presumably there is something wrong with its encoding. Could you report what the filename is? And can you try opening it in some image editing program, saving it back to a jpg, replacing the original, and rerunning?
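A minimal sketch of that strategy, assuming you would rather collect every failing file in one pass than stop at the first one. The names `find_failing_files` and `process` are hypothetical and not part of prepro.py; `process` stands for whatever per-image step you want to check (for example, a function that reads and resizes one image).

```python
def find_failing_files(paths, process):
    """Call process(path) on every path and record the ones that raise.

    Returns a list of (path, error message) tuples, in input order.
    """
    failures = []
    for path in paths:
        try:
            process(path)
        except Exception as exc:
            # Keep going so that one bad file does not hide the others.
            failures.append((path, str(exc)))
    return failures
```

Each returned path can then be opened by hand in an image viewer to see what is wrong with it.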
Okay, I'm going to rerun the script now with the following change in place:
diff --git a/prepro.py b/prepro.py
index ea581da..7440963 100644
--- a/prepro.py
+++ b/prepro.py
@@ -183,7 +183,11 @@ def main(params):
   for i,img in enumerate(imgs):
     # load the image
     I = imread(os.path.join(params['images_root'], img['file_path']))
-    Ir = imresize(I, (256,256))
+    try:
+      Ir = imresize(I, (256,256))
+    except:
+      print 'failed resizing image %s' % (img['file_path'],)
+      raise
     # handle grayscale input images
     if len(Ir.shape) == 2:
       Ir = Ir[:,:,np.newaxis]
Okay here's the error I got:
failed resizing image train2014/COCO_train2014_000000167126.jpg
The file is definitely corrupted (the file size is too small and most of it is just gray). The question is if it got corrupted during decompression on my side or if it's actually like that in the original train2014.zip file. I'll try to investigate.
The image at mscoco.org: http://mscoco.org/explore/?id=167126
Note: The image COCO_train2014_000000167126.jpg was also corrupted on my side, so I replaced it with a fresh copy from http://mscoco.org/explore/?id=167126 and the preprocessing worked.
https://msvocds.blob.core.windows.net/images/262993_z.jpg is an actual URL of the image.
Thank you! I am going to close this issue and adjust the documentation to point to it.
Hello, I wrote some scripts which may be useful:
The script check_jpeg.py finds broken JPEGs and non-JPEG files.
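For readers without that script, here is a rough sketch of the same idea using only the standard library. It is not the check_jpeg.py mentioned above, and the function names are made up for illustration. It relies on a heuristic: a JPEG file starts with the SOI marker (bytes FF D8) and normally ends with the EOI marker (FF D9), so a file cut short during download or extraction usually lacks the ending marker. It does not decode the image, so it can miss other kinds of damage.

```python
import os


def check_jpeg(path):
    """Classify a file as 'ok', 'not_jpeg', or 'truncated' from its marker bytes."""
    with open(path, 'rb') as f:
        data = f.read()
    # Every JPEG begins with the start-of-image marker FF D8.
    if not data.startswith(b'\xff\xd8'):
        return 'not_jpeg'
    # A complete JPEG normally ends with the end-of-image marker FF D9.
    # Trailing zero bytes are ignored; this is a heuristic, not a full decode.
    if not data.rstrip(b'\x00').endswith(b'\xff\xd9'):
        return 'truncated'
    return 'ok'


def find_bad_jpegs(root):
    """Walk root and return (path, status) for every file that is not 'ok'."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            status = check_jpeg(path)
            if status != 'ok':
                bad.append((path, status))
    return bad
```

Calling `find_bad_jpegs('coco/images')` before running prepro.py would list suspicious files up front, assuming the images were extracted to that directory as described earlier in this thread.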