# Image processing in Python

## Cleaner

Given an image, we want to count how many of its pixels are red, green, blue, black or white, and store those counts in a file.

In [1]:
from PIL import Image
import os, shutil

There will be more than one image to process, so we loop through a directory and process every .jpg file. Results are stored in a text file with one line per image, in the format: image_identifier results.

In [2]:
directory = 'test'
fout = 'stats'

Each pixel is assigned to the group whose reference color is at the minimum distance from the pixel's RGB values.
The function returns a group identifier; the known groups are 'red', 'green', 'blue', 'black' and 'white'.

Distance is defined as a lambda expression that computes the [Manhattan distance](https://en.wikipedia.org/wiki/Taxicab_geometry) between two RGB triples.

In [3]:
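def colorgroup(pixel):
    """Return the name of the reference color closest to an (R, G, B) pixel."""
    groups = {  'red' : (255,   0,   0),
              'green' : (  0, 255,   0),
               'blue' : (  0,   0, 255),
              'black' : (  0,   0,   0),
              'white' : (255, 255, 255)}

    # Manhattan distance: sum of the absolute per-channel differences.
    dist = lambda p1,p2: abs(p1[0] - p2[0]) + abs(p1[1] - p2[1]) + abs(p1[2] - p2[2])

    # Distance from the pixel to every group; the smallest one wins.
    res = {}
    for c,v in groups.items():
        res[c] = dist(pixel,v)
    return min(res, key = res.get)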
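To make the distance rule concrete, here is a small worked example. It is only a sketch: `GROUPS` repeats the reference colors from the cell above, while `manhattan` and `distances` are hypothetical helper names introduced for illustration, not part of the notebook.

```python
# Reference colors, copied from the colorgroup cell.
GROUPS = {
    'red':   (255, 0, 0),
    'green': (0, 255, 0),
    'blue':  (0, 0, 255),
    'black': (0, 0, 0),
    'white': (255, 255, 255),
}

def manhattan(p1, p2):
    """Sum of the absolute per-channel differences between two RGB triples."""
    return sum(abs(a - b) for a, b in zip(p1, p2))

def distances(pixel):
    """Hypothetical helper: distance from `pixel` to every reference color."""
    return {name: manhattan(pixel, ref) for name, ref in GROUPS.items()}

d = distances((200, 30, 40))
print(d)                   # {'red': 125, 'green': 465, 'blue': 445, 'black': 270, 'white': 495}
print(min(d, key=d.get))   # red
```

Ties are possible: pure yellow `(255, 255, 0)` is at distance 255 from red, green and white alike. `min` returns the first minimal key it meets, so with the dictionary order used here such a pixel is counted as red.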

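Once an image is selected, the processing loops through every pixel and counts how many pixels fall into each color group. It also stores the ratio between every pair of groups; a ratio stays at -1 when either of the two counts is 0.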

In [4]:
def process(file):
    """Count the pixels of each color group in an image and the ratios between groups."""
    count = {  'red' : 0,
             'green' : 0,
              'blue' : 0,
             'black' : 0,
             'white' : 0,
             'red/green' : -1,    'red/blue' : -1,   'red/black' : -1,   'red/white' : -1,
             'green/red' : -1,  'green/blue' : -1, 'green/black' : -1, 'green/white' : -1,
              'blue/red' : -1,  'blue/green' : -1,  'blue/black' : -1,  'blue/white' : -1,
             'black/red' : -1, 'black/green' : -1,  'black/blue' : -1, 'black/white' : -1,
             'white/red' : -1, 'white/green' : -1,  'white/blue' : -1, 'white/black' : -1}
    
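    # Open inside a with block so the file handle is released before the image may be moved.
    with Image.open('{0}/{1}'.format(directory,file)) as src:
        im = src.convert('RGB')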
    
    width,height = im.size
    for w in range(width):
        for h in range(height):
            count[colorgroup(im.getpixel((w,h)))] += 1
    
    for c in {'red', 'green', 'blue', 'black', 'white'}:
        for c2 in {'red', 'green', 'blue', 'black', 'white'}:
            if c == c2 or count[c] == 0 or count[c2] == 0:
                continue
            count['{0}/{1}'.format(c,c2)] = count[c] / count[c2]
            
    return count
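The counting and ratio steps can be tried without any image file. The following is a minimal sketch, assuming the pixels are available as a plain list of RGB tuples that stands in for an image; `nearest_group`, `count_groups` and `ratios` are hypothetical names, and `collections.Counter` replaces the hand-written dictionary of the cell above.

```python
from collections import Counter
from itertools import permutations

GROUPS = {
    'red':   (255, 0, 0),
    'green': (0, 255, 0),
    'blue':  (0, 0, 255),
    'black': (0, 0, 0),
    'white': (255, 255, 255),
}

def nearest_group(pixel):
    """Name of the reference color with the smallest Manhattan distance."""
    return min(GROUPS, key=lambda name: sum(abs(a - b) for a, b in zip(pixel, GROUPS[name])))

def count_groups(pixels):
    """Count how many pixels fall into each group; absent groups stay at 0."""
    count = Counter({name: 0 for name in GROUPS})
    count.update(nearest_group(p) for p in pixels)
    return dict(count)

def ratios(count):
    """Ratio for every ordered pair of groups; -1 when either count is 0."""
    out = {}
    for a, b in permutations(GROUPS, 2):
        key = '{0}/{1}'.format(a, b)
        out[key] = count[a] / count[b] if count[a] and count[b] else -1
    return out

# Four made-up pixels: two reddish, one almost black, one bluish.
pixels = [(250, 10, 10), (240, 0, 20), (5, 5, 5), (10, 20, 230)]
count = count_groups(pixels)
print(count)
print(ratios(count)['red/blue'], ratios(count)['red/green'])
```

With these four pixels the red/blue ratio is 2.0, while red/green keeps the -1 sentinel because no pixel is classified as green.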

Now that images can be processed, it is time to determine which files in the directory are .jpg files and process them.

If the file size is 0, or the stats indicate that the image isn't valid, the image is moved to the Bin directory (the lines that would delete it instead are commented out).

An image is not valid if:
 - There are more white than black pixels.
 - There are neither red nor blue pixels.
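The two rules can be written as a small predicate, which makes the edge cases easier to see. This is a sketch only: `is_valid` is a hypothetical helper, and it reads the second rule the way the cell below tests it, that is, the image is rejected only when both the red and the blue counts are 0.

```python
def is_valid(count):
    """Hypothetical helper mirroring the two rejection rules listed above."""
    more_white_than_black = count['white'] > count['black']
    no_red_and_no_blue = count['red'] == 0 and count['blue'] == 0
    return not (more_white_than_black or no_red_and_no_blue)

# Made-up counts for illustration.
print(is_valid({'red': 10, 'green': 0, 'blue': 0, 'black': 50, 'white': 20}))  # True
print(is_valid({'red': 10, 'green': 0, 'blue': 5, 'black': 20, 'white': 50}))  # False
print(is_valid({'red': 0, 'green': 30, 'blue': 0, 'black': 50, 'white': 20}))  # False
```

Under this reading an image with equal white and black counts is still valid, and so is an image that has red pixels but no blue ones (or the other way around).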

In [5]:
bindir = 'Bin'
os.makedirs(bindir, exist_ok=True)  # the destination for rejected images must exist

# The with block closes the stats file even if processing an image fails.
with open('{0}.txt'.format(fout), 'w') as out:
    for fname in os.listdir(directory):
        if not fname.endswith('.jpg'):
            continue

        path = os.path.join(directory, fname)

        # Empty files cannot be opened as images: move them away without processing.
        if os.stat(path).st_size == 0:
            #os.remove(path)
            shutil.move(path, os.path.join(bindir, fname))
            continue

        result = process(fname)
        # Reject images with more white than black pixels, or with neither red nor blue pixels.
        if result['white'] > result['black'] or (result['red'] == 0 and result['blue'] == 0):
            #os.remove(path)
            shutil.move(path, os.path.join(bindir, fname))
            continue
        out.write('{0} {1}\n'.format(fname, result))
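Each line of the stats file is the file name, a space, and the printed form of the result dictionary. Reading such a line back could look like the sketch below. `parse_stats_line` is a hypothetical helper and the sample line is made up; it assumes the file name itself does not contain the two characters ` {` in a row.

```python
import ast

def parse_stats_line(line):
    """Hypothetical helper: split one stats line into (file name, stats dict)."""
    # The printed dictionary always starts with '{', so split at the first ' {'.
    name, sep, rest = line.rstrip('\n').partition(' {')
    if not sep:
        raise ValueError('not a stats line: {0!r}'.format(line))
    # literal_eval only accepts Python literals, so it is safer than eval here.
    return name, ast.literal_eval('{' + rest)

# Made-up, shortened line in the format written by the cell above.
line = "img_001.jpg {'red': 12, 'green': 0, 'blue': 4, 'red/blue': 3.0, 'red/green': -1}\n"
name, stats = parse_stats_line(line)
print(name, stats['red'], stats['red/blue'])
```

Writing the results as JSON or CSV would be easier to consume from other tools, but that would change the output format described at the top of the notebook.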