# Downloading Images using the Flickr API

For many image processing problems, you need a set of test images.  The [ImageNet site](http://www.image-net.org/) has an enormous set of images of 1000 different classes with bounding boxes.  However, this set is large and may not include the exact image classes you are looking for.  An alternative is the [Flickr API](https://stuvel.eu/flickrapi-doc/), which gives you access to millions of images in the Flickr database.

In this demo, you will learn to:
* Load images from the Flickr API
* Resize images to a desired shape
* Save images to a file

## Installing and Loading the Flickr API package

The [`flickrapi` package](https://stuvel.eu/flickrapi-doc/) provides a simple Python interface to the Flickr API.  You will need to install it via

    pip install flickrapi
   
You can validate the installation by importing the package.

In [19]:
import flickrapi
import urllib.request
import matplotlib.pyplot as plt
import numpy as np
import skimage.io
import skimage.transform
%matplotlib inline
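
If you would rather check for the packages without triggering an `ImportError`, the standard library can report whether each one can be found.  The following is a minimal sketch; the helper name `is_installed` is hypothetical and not part of any of these packages.

```python
import importlib.util

def is_installed(package_name):
    """Return True if the named top-level package can be found for import."""
    return importlib.util.find_spec(package_name) is not None

# Report the status of the packages imported in this demo
for name in ("flickrapi", "skimage", "matplotlib", "numpy"):
    status = "found" if is_installed(name) else "missing"
    print("{0:s}: {1:s}".format(name, status))
```

A package reported as missing needs to be installed before the import cell above will run.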

To use the Flickr API, you need to apply for keys.  You can do this on the [Flickr website](https://www.flickr.com/services/api/misc.api_keys.html).  Once you have the keys, they can be set as follows.

In [20]:
api_key = u'a21e5d54a4d2270bb9beb69551fc45eb'  # Enter key here
api_secret = u'13cebf5d2d880029'               # Enter secret here
flickr = flickrapi.FlickrAPI(api_key, api_secret)
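
To keep the keys out of the notebook itself, one option is to read them from environment variables.  The sketch below assumes the variable names `FLICKR_API_KEY` and `FLICKR_API_SECRET`, which are chosen here for illustration and are not defined by Flickr or by `flickrapi`; the helper `load_flickr_keys` is likewise hypothetical.

```python
import os

def load_flickr_keys(env=os.environ):
    """Read the Flickr key and secret from a mapping of environment variables.

    FLICKR_API_KEY and FLICKR_API_SECRET are names assumed for this sketch.
    """
    try:
        return env["FLICKR_API_KEY"], env["FLICKR_API_SECRET"]
    except KeyError as err:
        raise RuntimeError("Set the environment variable %s first" % err.args[0])

# Demonstration with a stand-in mapping instead of the real environment
demo_env = {"FLICKR_API_KEY": "my-key", "FLICKR_API_SECRET": "my-secret"}
demo_key, demo_secret = load_flickr_keys(demo_env)
print(demo_key, demo_secret)
```

Calling `load_flickr_keys()` with no argument reads the real environment, and the returned pair can be passed to `flickrapi.FlickrAPI` in place of the literal strings.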

## Loading the Images
The `flickr.walk()` function provides a generator of photo records matching a particular `keyword`.  Passing `extras='url_c'` asks Flickr to include a URL with each record, when one is available.  To illustrate the method, we will get a number of car images.

In [21]:
keyword = 'car'
dir_name = 'car'
photos = flickr.walk(text=keyword, tag_mode='all', tags=keyword, extras='url_c',
                     sort='relevance', per_page=100)
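
Not every record carries a `url_c` attribute, so code that consumes the generator has to skip those entries.  The sketch below shows that filtering on stand-in records built with `xml.etree.ElementTree`, so it runs without network access.  It assumes the real records support `.get()` in the same way, as the download loop later in this demo does; the helper `first_urls` and the `example.com` URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

def first_urls(photos, nmax, size_key="url_c"):
    """Return up to nmax URLs from an iterable of photo records."""
    urls = []
    for photo in photos:
        if len(urls) >= nmax:
            break
        url = photo.get(size_key)
        if url is not None:
            urls.append(url)
    return urls

# Stand-in records; the second one has no url_c attribute and is skipped
fake_photos = [
    ET.Element("photo", {"id": "1", "url_c": "https://example.com/1.jpg"}),
    ET.Element("photo", {"id": "2"}),
    ET.Element("photo", {"id": "3", "url_c": "https://example.com/3.jpg"}),
]
print(first_urls(fake_photos, 2))
```

Because the loop stops as soon as enough URLs are collected, it only pulls as many records from the generator as it needs.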

Create a directory with the name of the keyword for the images.

In [22]:
import os
dir_exists = os.path.isdir(dir_name)
if not dir_exists:
    os.mkdir(dir_name)
    print("Making directory %s" % dir_name)
else:
    print("Will store images in directory %s" % dir_name)

Will store images in directory car
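
The same check-then-create step can be written with `os.makedirs` and `exist_ok=True`, which does not fail when the directory is already there.  This is a minimal sketch; the helper name `ensure_dir` is hypothetical, and the demonstration uses a temporary directory so that it leaves nothing behind.

```python
import os
import tempfile

def ensure_dir(path):
    """Create path if needed; return True if it was created by this call."""
    existed = os.path.isdir(path)
    os.makedirs(path, exist_ok=True)
    return not existed

# Demonstrate inside a temporary directory
with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "car")
    for _ in range(2):
        if ensure_dir(target):
            print("Making directory %s" % target)
        else:
            print("Will store images in directory %s" % target)
```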


Next, we create a routine for displaying images.

In [23]:
# Display the image
def disp_image(im):
    if (len(im.shape) == 2):
        # Gray scale image
        plt.imshow(im, cmap='gray')    
    else:
        # Color image.  
        im1 = (im-np.min(im))/(np.max(im)-np.min(im))*255
        im1 = im1.astype(np.uint8)
        plt.imshow(im1)    
        
    # Remove axis ticks
    plt.xticks([])
    plt.yticks([])
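
The color branch of `disp_image` stretches the pixel values linearly onto the range 0 to 255.  The sketch below isolates that step so it can be tried on a small array.  The helper name `rescale_to_uint8` is hypothetical, and the guard for a constant image is an addition that `disp_image` does not have.

```python
import numpy as np

def rescale_to_uint8(im):
    """Linearly map the values of im onto 0..255 and return a uint8 array."""
    im = np.asarray(im, dtype=float)
    lo, hi = im.min(), im.max()
    if hi == lo:
        # A constant image has no range to stretch; return all zeros
        return np.zeros(im.shape, dtype=np.uint8)
    return ((im - lo) / (hi - lo) * 255).astype(np.uint8)

demo = np.array([[0.0, 0.5], [1.0, 0.25]])
print(rescale_to_uint8(demo))
```

The smallest value maps to 0 and the largest to 255; intermediate values are truncated, not rounded, by the cast to `uint8`.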

Often we need all images to be the same size.  The following routine pads each image with black pixels to make it square and then resizes it to the target shape.

In [24]:
def resize_im(im0,nrow,ncol):
    """
    Resizes and pads an image to match a target output shape.
    
    im0:  The input image
    nrow, ncol:  The desired output shape.
    
    It is assumed that nrow==ncol.
    """
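    # Promote a grayscale image to three channels so the indexing below works
    if im0.ndim == 2:
        im0 = np.stack([im0, im0, im0], axis=-1)
    # Get current image shape
    nrow0 = im0.shape[0]
    ncol0 = im0.shape[1]
    nchan = im0.shape[2]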
    # Pad rows or columns with zeros if the image is not square
    if (ncol0 > nrow0):
        pad = (ncol0-nrow0)//2
        im = np.zeros((ncol0,ncol0,nchan),dtype=np.uint8)
        im[pad:pad+nrow0,:,:] = im0
    elif (nrow0 >= ncol0):
        pad = (nrow0-ncol0)//2
        im = np.zeros((nrow0,nrow0,nchan),dtype=np.uint8)        
        im[:,pad:pad+ncol0,:] = im0
        
    # Resize the image
    im = skimage.transform.resize(im,(nrow,ncol),mode='constant')
    
    return im
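
To see the padding arithmetic on its own, the sketch below applies the same centering to a tiny array using only NumPy, leaving out the `skimage` resize.  The helper name `pad_to_square` is hypothetical.

```python
import numpy as np

def pad_to_square(im0):
    """Center a (rows, cols, channels) image on a square black canvas."""
    nrow0, ncol0, nchan = im0.shape
    side = max(nrow0, ncol0)
    row_pad = (side - nrow0) // 2
    col_pad = (side - ncol0) // 2
    im = np.zeros((side, side, nchan), dtype=im0.dtype)
    im[row_pad:row_pad + nrow0, col_pad:col_pad + ncol0, :] = im0
    return im

# A 2 x 4 image becomes 4 x 4, with one black row above and one below
wide = np.full((2, 4, 3), 200, dtype=np.uint8)
square = pad_to_square(wide)
print(square.shape)
print(square[:, 0, 0])
```

When the difference between the two sides is odd, the integer division puts the extra row or column of padding on the bottom or right.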

Now, we walk through the images and save the files.  With `keyword = 'car'` and `dir_name = 'car'`, the files are saved to the paths:

    car/car_0000.jpg
    car/car_0001.jpg
    ...
    car/car_0009.jpg
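
These names come from a format string with a zero-padded, four-digit index.  A short sketch of the pattern, using the `'car'` values set earlier in this demo (the helper name `image_path` is hypothetical):

```python
def image_path(dir_name, keyword, index):
    """Build the save path for image number index, e.g. car/car_0000.jpg."""
    return '{0:s}/{1:s}_{2:04d}.jpg'.format(dir_name, keyword, index)

# First three paths for the 'car' directory and keyword
for i in range(3):
    print(image_path('car', 'car', i))
```

The zero padding keeps the files in numeric order when they are sorted by name.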
    

In [25]:
import warnings
    
nimage = 10
i = 0
full_size_fn = 'full_size'
nrow = 224
ncol = 224
for photo in photos:
    url=photo.get('url_c')
    if url is not None:
        # Save image to temporary full size
        urllib.request.urlretrieve(url, full_size_fn)
        
        # Read image from file
        im = skimage.io.imread(full_size_fn)
        
        # Resize the image
        im1 = resize_im(im,nrow,ncol)
        
        # Convert to uint8, suppress the warning about the precision loss
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            im2 = skimage.img_as_ubyte(im1)
    
        # Save the image
        local_name = '{0:s}/{1:s}_{2:04d}.jpg'.format(dir_name,keyword, i)  
        skimage.io.imsave(local_name, im2)      
        print(local_name)
        i = i + 1        
    if (i >= nimage):        
        break        

ConnectionError: HTTPSConnectionPool(host='api.flickr.com', port=443): Max retries exceeded with url: /services/rest/ (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x000001CDDD1B02E8>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.',))
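
Network failures like the one above are common when downloading many files.  One way to make the image download step more tolerant is to retry a few times with a timeout.  The sketch below is an assumption-laden illustration, not part of `flickrapi`: the helper `fetch_with_retry` and its parameters are hypothetical, and it covers only fetching an image URL with `urllib`, not the API request that `flickr.walk()` makes internally.  The demonstration uses a stand-in opener so that it runs without network access.

```python
import io
import time
import urllib.request

def fetch_with_retry(url, retries=3, delay=1.0, timeout=10,
                     opener=urllib.request.urlopen):
    """Return the bytes at url, retrying when the connection fails."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        try:
            with opener(url, timeout=timeout) as resp:
                return resp.read()
        except OSError as err:  # urllib.error.URLError is a subclass of OSError
            if attempt == retries:
                raise
            print("Attempt %d failed (%s); retrying" % (attempt, err))
            time.sleep(delay)

# Stand-in opener that returns fixed bytes instead of contacting a server
def fake_opener(url, timeout=None):
    return io.BytesIO(b"fake image bytes")

data = fetch_with_retry("https://example.com/photo.jpg", opener=fake_opener)
print(len(data))
```

With the default `opener`, the returned bytes could be written to a file and then read with `skimage.io.imread`, in the same way the loop above reads the temporary file.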

We display some of the images.  Note the black borders, which come from padding the images to a square before resizing.

In [11]:
plt.figure(figsize=(20,20))
nplot = 4
for i in range(nplot):
    fn = '{0:s}/{1:s}_{2:04d}.jpg'.format(dir_name, keyword, i)
    im = skimage.io.imread(fn)
    plt.subplot(1,nplot,i+1)
    disp_image(im)

FileNotFoundError: [Errno 2] No such file or directory: 'elephant/elephant_0000.jpg'
