Here we do some experiments with image resizing.  
We will use the `pillow` module (a fork of `PIL`), which is included with Anaconda 2.2.  
For more info: http://pillow.readthedocs.org/ 

In [1]:
import os
import os.path
from PIL import Image

Let us set the parameters in one place:

In [2]:
src_dir = "/kaggle/retina/sample" # source directory of images to resize 
trg_dir = "/kaggle/retina/resized" # target directory of the resized images 
prefix = "resized_" # string to prepend to the resized file name
hsize = 256 # horizontal size of the resized image
vsize = 256 # vertical size of the resized image  

**Load** an image:

In [3]:
# Build a real list (not a lazy filter object) so it can be indexed and iterated repeatedly
all_files = [f for f in os.listdir(src_dir) if f.endswith(".jpeg")]
filename = all_files[0]
filepath = os.path.join(src_dir, filename)
%timeit Image.open(filepath)
im = Image.open(filepath)

The slowest run took 158.14 times longer than the fastest. This could mean that an intermediate result is being cached 
10000 loops, best of 3: 136 µs per loop
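
The very short time for `Image.open` is consistent with Pillow opening files lazily: the call identifies the file and reads its header, while the pixel data is decoded only when it is first needed (or when `load()` is called). The sketch below illustrates this on a synthetic JPEG written to a temporary directory; the file name `demo.jpeg` and the image contents are assumptions made for the illustration, not part of the retina data.

```python
import os
import tempfile
from PIL import Image

# Hypothetical stand-in for a source image: a synthetic 1024x768 JPEG.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.jpeg")
Image.new("RGB", (1024, 768), (120, 60, 30)).save(demo_path, "JPEG")

im_demo = Image.open(demo_path)   # identifies the file and reads the header only
header_size = im_demo.size        # dimensions are available before any decoding
im_demo.load()                    # forces the actual JPEG decode
first_pixel = im_demo.getpixel((0, 0))
print(header_size, first_pixel)
```

To include the decoding cost in a measurement, one option is to time `Image.open(filepath).load()` instead of `Image.open(filepath)` alone.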


**Resize** the image with default downsampling:  

In [4]:
%timeit im.resize((hsize, vsize))
resized_im = im.resize((hsize, vsize))

The slowest run took 485.36 times longer than the fastest. This could mean that an intermediate result is being cached 
1 loops, best of 3: 455 µs per loop


The LANCZOS anti-aliasing filter is recommended for downsampling by the PIL tutorial, but it is much slower:

In [5]:
%timeit im.resize((hsize, vsize), Image.LANCZOS) 

1 loops, best of 3: 192 ms per loop
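
Outside a notebook, the same comparison can be sketched with the standard `timeit` module. The example below is a rough illustration only: it uses a blank synthetic 2048x2048 image (an assumption, not the retina data), so the absolute numbers will differ from those above and depend on the machine and the Pillow version.

```python
import timeit
from PIL import Image

# Synthetic stand-in for a large source image (an assumption for this sketch).
src = Image.new("RGB", (2048, 2048), "white")
target = (256, 256)

filters = {
    "NEAREST": Image.NEAREST,
    "BILINEAR": Image.BILINEAR,
    "BICUBIC": Image.BICUBIC,
    "LANCZOS": Image.LANCZOS,
}

timings = {}
for name, flt in filters.items():
    # Best of 3 repeats, 3 calls each, reported as seconds per call.
    runs = timeit.repeat(lambda: src.resize(target, flt), number=3, repeat=3)
    timings[name] = min(runs) / 3

for name, seconds in timings.items():
    print("%-8s %.3f ms per resize" % (name, seconds * 1000))
```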


**Save** the resized image.  
A `quality` value above 95 is not recommended because the file grows much larger with minimal benefit, but file size does not matter for this experiment.  
More info on file formats can be found here: http://pillow.readthedocs.org/handbook/image-file-formats.html

In [6]:
if not os.path.exists(trg_dir):
    os.makedirs(trg_dir)

In [7]:
resized_filepath = os.path.join(trg_dir, prefix + filename)
%timeit resized_im.save(resized_filepath, "JPEG", quality = 100) 

100 loops, best of 3: 5.32 ms per loop
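
To get a feel for how `quality` affects file size without writing to disk, the image can be saved into an in-memory buffer. This is a minimal sketch: the helper `jpeg_size` and the synthetic test pattern are hypothetical names and data introduced here for illustration, and the sizes for real retina images will differ.

```python
import io
from PIL import Image

def jpeg_size(image, quality):
    """Return the size in bytes of `image` encoded as JPEG at the given quality."""
    buf = io.BytesIO()
    image.save(buf, "JPEG", quality=quality)
    return buf.tell()

# Synthetic 256x256 grayscale pattern with plenty of detail (an assumption).
pattern = bytes((x * y) % 256 for y in range(256) for x in range(256))
sample = Image.frombytes("L", (256, 256), pattern).convert("RGB")

sizes = {q: jpeg_size(sample, q) for q in (75, 95, 100)}
for q in sorted(sizes):
    print("quality=%3d: %d bytes" % (q, sizes[q]))
```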


For quick and dirty experiments we can use the default downsampling. 
Here we create downsized copies of all files in the sample directory:

In [8]:
def resize_all(method):
    """Resize every file in all_files with the given resampling filter and save it to trg_dir."""
    for filename in all_files:
        filepath = os.path.join(src_dir, filename)
        im = Image.open(filepath)
        resized_im = im.resize((hsize, vsize), method)
        resized_filepath = os.path.join(trg_dir, prefix + filename)
        resized_im.save(resized_filepath, "JPEG", quality = 100)
        
%timeit -n1 -r1 resize_all(Image.NEAREST)

1 loops, best of 1: 1.66 s per loop


Now try with LANCZOS:

In [9]:
%timeit -n1 -r1 resize_all(Image.LANCZOS)

1 loops, best of 1: 3.1 s per loop


Since the processing is dominated by the CPU-bound resizing, we can benefit from parallelization (forthcoming...)
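
One possible way to parallelize the batch, offered only as a sketch and not as the planned implementation, is to hand each file to a worker process through `concurrent.futures`. The names `resize_one` and `resize_all_parallel`, and the `executor_cls` and `workers` parameters, are hypothetical and introduced here for illustration. The code assumes Python 3.

```python
import os
from concurrent.futures import ProcessPoolExecutor
from PIL import Image

def resize_one(task):
    """Resize a single image; `task` is (src_path, dst_path, size, method)."""
    src_path, dst_path, size, method = task
    with Image.open(src_path) as im:
        im.resize(size, method).save(dst_path, "JPEG", quality=100)
    return dst_path

def resize_all_parallel(src_dir, trg_dir, size, method, prefix="resized_",
                        executor_cls=ProcessPoolExecutor, workers=None):
    """Resize every .jpeg in src_dir into trg_dir using a pool of workers."""
    os.makedirs(trg_dir, exist_ok=True)
    names = sorted(f for f in os.listdir(src_dir) if f.endswith(".jpeg"))
    tasks = [(os.path.join(src_dir, n), os.path.join(trg_dir, prefix + n), size, method)
             for n in names]
    # workers=None lets the executor choose its default number of workers.
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(resize_one, tasks))
```

With the parameters defined earlier, a call might look like `resize_all_parallel(src_dir, trg_dir, (hsize, vsize), Image.LANCZOS)`. On platforms that start worker processes by spawning rather than forking, the call usually needs to sit under an `if __name__ == "__main__":` guard in a script, and functions defined interactively may not be picklable. Any speed-up depends on the number of cores and on how much time is spent on disk I/O.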