# image_downloader script
This script downloads images and resizes them for classification by the 'run_model' script. It expects a CSV file containing the URLs of the images to download. See the project learning brief for more details.

## Import Statements
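The script uses requests, PIL (Pillow), pandas, certifi, and the standard-library os module. urllib3's InsecureRequestWarning is disabled to keep the output clean, since the downloads are made with certificate verification turned off. The root path is then set to the directory that contains the script.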

In [None]:
import requests
from PIL import Image
import pandas as pd
import os
import certifi
from requests.packages.urllib3.exceptions import InsecureRequestWarning

requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

In [None]:
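# Directory containing this script; the 'csv' and 'images' folders are resolved relative to it.
ROOT = os.path.dirname(os.path.abspath(__file__))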
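
The root path relies on `__file__`, which is typically not defined in an interactive notebook session. The following is a minimal sketch of one way to handle both cases; the helper name `get_root` and the fallback to the current working directory are assumptions for illustration, not part of the original script.

```python
import os

def get_root():
    """Return the script's directory, or the working directory if __file__ is undefined."""
    try:
        # Works when the code runs as a script or an imported module.
        return os.path.dirname(os.path.abspath(__file__))
    except NameError:
        # Assumption: in a notebook, the 'csv' and 'images' folders sit in the working directory.
        return os.getcwd()

ROOT = get_root()
# The script reads CSV files from ROOT/csv and writes images under ROOT/images.
csv_dir = os.path.join(ROOT, 'csv')
images_dir = os.path.join(ROOT, 'images')
```

With this fallback, the notebook has to be started from the project folder for the relative layout to resolve correctly.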

## fetch_images Function
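This helper function is used by the main image_downloader script. It takes a list of URLs (from the user's CSV; see the project's learning brief for more information) and a destination folder, which is also used by the run_model script. Each file is named after the last 17 characters of its URL, and a URL whose file already exists in the destination folder is skipped. Download errors are printed, and the loop continues with the next URL.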

In [None]:
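def fetch_images(url_list, dest_folder):
    # Create the destination folder if needed so image.save() does not fail.
    dest_path = os.path.join(ROOT, 'images', dest_folder)
    os.makedirs(dest_path, exist_ok=True)
    for url in url_list:
        # The file name is the last 17 characters of the URL.
        img_name = url[-17:]
        img_path = os.path.join(dest_path, img_name)
        if not os.path.exists(img_path):
            print('Downloading ' + img_name)
            try:
                # verify=False skips TLS certificate verification.
                image = Image.open(requests.get(url, stream=True, verify=False).raw)
                image = resize(image)
                image.save(img_path)
            except Exception as e:
                # Report the failure and continue with the next URL.
                print('Error on ' + img_name)
                print(e)
        else:
            print('Skipping ' + img_name)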
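
The file name is taken from the last 17 characters of the URL, which only yields a clean name when every URL ends in a 17-character file name. The sketch below illustrates that behavior and, as an alternative that is not part of the original script, a path-based approach. The function names and the example.com URLs are hypothetical.

```python
import os
from urllib.parse import urlparse

def name_from_tail(url, length=17):
    """Mirror the script's approach: keep the last `length` characters of the URL."""
    return url[-length:]

def name_from_path(url):
    """Alternative (assumption): use the last segment of the URL path, ignoring any query string."""
    return os.path.basename(urlparse(url).path)

long_url = "https://example.com/photos/img_000000001.jpg"
short_url = "https://example.com/photos/cat.jpg"

print(name_from_tail(long_url))   # img_000000001.jpg
print(name_from_tail(short_url))  # om/photos/cat.jpg
print(name_from_path(short_url))  # cat.jpg
```

The second result contains path separators, so a shorter file name would end up pointing into subfolders that may not exist. Slicing is fine when the URL format is fixed, as the learning brief may specify.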

## resize Function
Another helper function for the image_downloader script. It center-crops a non-square image to a square, then resizes it to 512x512 to save storage space.

In [None]:
def resize(image):
    width, height = image.size
    if width != height:
        # Center-crop to a square whose side is the shorter dimension.
        square_size = min(width, height)
        left = (width - square_size) // 2
        top = (height - square_size) // 2
        right = left + square_size
        bottom = top + square_size
        image = image.crop((left, top, right, bottom))
    # Scale the square image to 512x512.
    image_sm = image.resize((512, 512))
    return image_sm
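
To make the crop arithmetic concrete, here is a small sketch that computes only the crop box, without loading an image. The helper name `center_crop_box` is hypothetical, and it uses integer arithmetic so the box always has whole-pixel coordinates.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) for a centered square crop."""
    square_size = min(width, height)
    left = (width - square_size) // 2
    top = (height - square_size) // 2
    # Deriving right/bottom from left/top guarantees an exactly square box.
    return (left, top, left + square_size, top + square_size)

print(center_crop_box(800, 600))  # (100, 0, 700, 600)
print(center_crop_box(600, 800))  # (0, 100, 600, 700)
print(center_crop_box(512, 512))  # (0, 0, 512, 512)
```

For an 800x600 image, 100 pixels are trimmed from each side, leaving a 600x600 square that is then scaled to 512x512.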

## Main image_downloader script
First, the user is prompted for a CSV file, then for the name of the column in that file that contains the image URLs, and finally for a destination folder to store the images in. Rows with a missing value in that column, or a value that does not contain 'http', are dropped before downloading. See the project learning brief for a detailed explanation of how the CSV file should be set up.

In [None]:
if __name__ == '__main__':
    csv_file = input("What is the name of the CSV file, in the CSV folder? ")
    col = input('What is the column name of image URLs? ')
    dest_folder = input("Where would you like to put the images (inside the images folder)? ")

    df = pd.read_csv(os.path.join(ROOT, 'csv', csv_file))
    # Drop rows with no value in the URL column.
    df.dropna(axis=0, subset=[col], inplace=True)
    # Keep only the values that contain 'http'.
    urls = list(df[df[col].str.contains('http')][col])
    fetch_images(urls, dest_folder)
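
The URL filter can be illustrated without pandas. The sketch below uses the standard-library csv module in place of the script's pandas calls and applies the same rule: skip empty cells and keep values that contain 'http'. The function name `extract_urls`, the column name `image_url`, and the sample rows are all hypothetical.

```python
import csv
import io

def extract_urls(csv_text, col):
    """Return the values in column `col` that are non-empty and contain 'http'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    urls = []
    for row in reader:
        value = row.get(col)
        # Skip empty cells, then apply the same substring test as the script.
        if value and 'http' in value:
            urls.append(value)
    return urls

sample_csv = """id,image_url
1,https://example.com/photos/img_000000001.jpg
2,
3,not available
4,http://example.com/photos/img_000000002.jpg
"""

for url in extract_urls(sample_csv, 'image_url'):
    print(url)
```

Only rows 1 and 4 survive the filter. The substring test matches 'http' anywhere in the value, so it is a loose check and not a full URL validation.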