# Image recognition with TTR


## Bridging between perceptual and conceptual domains

Let's apply the object detection representation proposed in Dobnik & Cooper's *Interfacing language, spatial perception and cognition in TTR* to image recognition.

![Fig 8](fig/lspc-fig8.png)

Here, the whole scene is an `Image` rather than a `PointMap`, and the `reg:PointMap` field is replaced by a renamed field of yet another type, `seg:Segment`. In Cooper's case the same type can represent both the region and the whole, because a `PointMap` is a set of absolute positions. In an `Image`, positions are relative to an origin, so that origin has to be specified whenever a region is cropped out.
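
To make the point about relative positions concrete, here is a minimal sketch in plain Python, without pyttr or PIL. It assumes a segment described by a centre (`cx`, `cy`), a width `w` and a height `h`, like the `Segment` type defined below. The helper names `segment_box` and `to_image_coords` are hypothetical and only serve the illustration.

```python
def segment_box(cx, cy, w, h):
    """Return (left, upper, right, lower) for a centre/size segment."""
    left = cx - w // 2
    upper = cy - h // 2
    return (left, upper, left + w, upper + h)


def to_image_coords(seg, x, y):
    """Map a point (x, y), given relative to the segment's top-left
    corner, back to coordinates in the full image."""
    left, upper, _, _ = segment_box(seg['cx'], seg['cy'], seg['w'], seg['h'])
    return (left + x, upper + y)


# Same numbers as the Segment record queried in the first cell below.
seg = {'cx': 100, 'cy': 150, 'w': 40, 'h': 20}
print(segment_box(**seg))          # (80, 140, 120, 160)
print(to_image_coords(seg, 0, 0))  # (80, 140)
```

The 4-tuple uses the (left, upper, right, lower) order that PIL's `Image.crop` accepts. A cropped image forgets its origin, which is why the segment keeps both the parent image and the position.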

In the general case, I guess, the domain of an `ObjectDetector` function need not be the same type as the `reg` fields of its output elements.

In [2]:
import sys
sys.path.append('pyttr')
from pyttr.ttrtypes import *
from pyttr.utils import *
import PIL.Image

ttrace()

# Basic types.

Ind = BType('Ind')

Int = BType('Int')
Int.learn_witness_condition(lambda x: isinstance(x, int))
print(Int.query(365))

Image = BType('Image')
Image.learn_witness_condition(lambda x: isinstance(x, PIL.Image.Image))
img = PIL.Image.open('res/dogcar.jpg')
print(Image.query(img))

# Segment type: a rectangular area of a given image.

Segment = RecType({'i': Image, 'cx': Int, 'cy': Int, 'w': Int, 'h': Int})
print(Segment.query(Rec({'i': img, 'cx': 100, 'cy': 150, 'w': 40, 'h': 20})))

# Redefine Image.show() to work with Rec.show().
def image_show(self):
    return str(self)
PIL.Image.Image.show = image_show
show(img)

True
True
True


'<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FC7F408FDD8>'

In [3]:
Ppty = FunType(Ind, Ty)
ImageDetection = RecType({'seg': Segment, 'pfun': Ppty})
ImageDetections = ListType(ImageDetection)
ObjectDetector = FunType(Image, ImageDetections)

## Object detection model YOLO

This section requires OpenCV and [Darkflow](https://github.com/thtrieu/darkflow). The `yolo.weights` file comes from [Yolo](https://pjreddie.com/darknet/yolo/).

In [4]:
from darkflow.net.build import TFNet

tfnet = TFNet({"model": "yolo/yolo.cfg", "load": "yolo/yolo.weights",
    'config': 'yolo', "threshold": 0.1})

Parsing yolo/yolo.cfg
Loading yolo/yolo.weights ...
Successfully identified 203934260 bytes
Finished in 0.03171849250793457s
Model has a coco model name, loading coco labels.

Building net ...
Source | Train? | Layer description                | Output size
-------+--------+----------------------------------+---------------
       |        | input                            | (?, 608, 608, 3)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 608, 608, 32)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 304, 304, 32)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 304, 304, 64)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 152, 152, 64)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 152, 152, 128)
 Load  |  Yep!  | conv 1x1p0_1  +bnorm  leaky      | (?, 152, 152, 64)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 152, 152, 128)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 76, 76, 128)
 Load  |  Yep!  | conv 3x3p1_1  +bn

In [41]:
# Make preds and ptypes identifiable by their predicate names.
# From now on, use mkptype().
ptypes = dict()
def mkptype(sym, types=[Ind], vars=['v']):
    # Cache key: predicate symbol, argument types and variable names.
    key = '/'.join([sym, ','.join(show(t) for t in types), ','.join(vars)])
    if key not in ptypes:
        ptypes[key] = PType(Pred(sym, types), vars)
    return ptypes[key]

print(show(mkptype('rabbit') is mkptype('rabbit')))

True
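
The caching idea behind `mkptype` can be tried in isolation. The sketch below is plain Python and does not use pyttr: `InternedPred` is a hypothetical stand-in for a ptype, and `intern_pred` is a hypothetical name. Unlike the cell above, it uses a tuple as the cache key instead of a joined string, and immutable default arguments.

```python
class InternedPred:
    """Hypothetical stand-in for a pyttr ptype; not part of pyttr."""

    def __init__(self, sym, arity, variables):
        self.sym = sym
        self.arity = tuple(arity)
        self.variables = tuple(variables)


_cache = {}


def intern_pred(sym, arity=('Ind',), variables=('v',)):
    """Return one shared object per (symbol, arity, variables) key."""
    key = (sym, tuple(arity), tuple(variables))
    if key not in _cache:
        _cache[key] = InternedPred(sym, arity, variables)
    return _cache[key]


same = intern_pred('rabbit') is intern_pred('rabbit')
different = intern_pred('rabbit') is intern_pred('dog')
print(same, different)  # True False
```

Because each key maps to exactly one object, two detections with the same label share a single ptype, so witness conditions learned for it apply to both.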


In [46]:
import numpy as np

def xy1xy2_to_cwh(x1, y1, x2, y2):
    '''Transform to center, width and height.'''
    return {'cx': int(x1/2 + x2/2), 'cy': int(y1/2 + y2/2), 'w': x2 - x1, 'h': y2 - y1}

def yolo_detector(i):
    return [Rec({
        'seg': Rec({
            'i': i,
            **xy1xy2_to_cwh(o['topleft']['x'], o['topleft']['y'], o['bottomright']['x'], o['bottomright']['y']),
        }),
        'pfun': Fun('v', Ind, mkptype(o['label'], [Ind], ['v'])),
    }) for o in tfnet.return_predict(np.array(i))] # TODO: PIL yields RGB; check whether Darkflow expects BGR (OpenCV order).

image_detections = yolo_detector(img)

print(ImageDetections.query(image_detections))
print(ImageDetection.query(image_detections[0]))
print(Ppty.query(image_detections[0].pfun))
print(Segment.query(image_detections[0].seg))

for image_detection in image_detections:
    print(show(image_detection))

True
True
True
True
{pfun = lambda v:Ind . person(v), seg = {i = <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FC7F408FDD8>, cy = 654, w = 276, cx = 138, h = 809}}
{pfun = lambda v:Ind . person(v), seg = {i = <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FC7F408FDD8>, cy = 657, w = 706, cx = 714, h = 796}}
{pfun = lambda v:Ind . person(v), seg = {i = <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FC7F408FDD8>, cy = 888, w = 380, cx = 194, h = 381}}
{pfun = lambda v:Ind . car(v), seg = {i = <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FC7F408FDD8>, cy = 589, w = 774, cx = 490, h = 979}}
{pfun = lambda v:Ind . dog(v), seg = {i = <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FC7F408FDD8>, cy = 714, w = 687, cx = 704, h = 718}}
{pfun = lambda v:Ind . chair(v), seg = {i = <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FC7F408FDD8>, cy = 541, w 

Here's a version that also creates an individual for each detection.

In [59]:
DetectedInd = RecType({'seg': Segment, 'pfun': Ppty, 'ind': Ind})
DetectedInds = ListType(DetectedInd)

def yolo_detector_ind(i):
    return [Rec({
        'seg': Rec({
            'i': i,
            **xy1xy2_to_cwh(o['topleft']['x'], o['topleft']['y'], o['bottomright']['x'], o['bottomright']['y']),
        }),
        'pfun': Fun('v', Ind, mkptype(o['label'], [Ind], ['v'])),
        'ind': Ind.create(),
    }) for o in tfnet.return_predict(np.array(i))]

ind_detections = yolo_detector_ind(img)
print(DetectedInds.query(ind_detections))
print(show(ind_detections[0]))
print(list(r.ind for r in ind_detections))

True
{pfun = lambda v:Ind . person(v), ind = _a84, seg = {i = <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FC7F408FDD8>, cy = 654, w = 276, cx = 138, h = 809}}
['_a84', '_a85', '_a86', '_a87', '_a88', '_a89', '_a90', '_a91', '_a92', '_a93']


## Spatial relations

In [63]:
ind_det_index = dict((r.ind, r) for r in ind_detections)

def indpos(a):
    if a in ind_det_index:
        return ind_det_index[a].seg.cx, ind_det_index[a].seg.cy

Left = mkptype('left', [Ind, Ind], ['a', 'b'])
Left.learn_witness_condition(lambda ab: indpos(ab[0])[0] < indpos(ab[1])[0])
print(show(Left))

print(Left.query((ind_detections[0].ind, ind_detections[1].ind)))
print(Left.query((ind_detections[1].ind, ind_detections[2].ind)))

left(a, b)
True
False
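
The same comparison can be sketched without pyttr to show how further relations might be defined. This is an illustration under assumptions, not the notebook's method: the individual names `a`, `b`, `c` are hypothetical, and the centres are copied from the first three detections printed above. Unlike `indpos`, which returns `None` for an unknown individual (indexing that result raises a `TypeError`), this sketch simply treats unknown individuals as not standing in the relation.

```python
# Centres (cx, cy) of the first three detections shown earlier.
centres = {'a': (138, 654), 'b': (714, 657), 'c': (194, 888)}


def left_of(a, b):
    """True if a's centre lies strictly to the left of b's centre."""
    if a not in centres or b not in centres:
        return False  # unknown individuals never witness the relation
    return centres[a][0] < centres[b][0]


def above(a, b):
    """True if a's centre lies strictly above b's centre."""
    if a not in centres or b not in centres:
        return False
    return centres[a][1] < centres[b][1]  # image y grows downward


print(left_of('a', 'b'))  # True
print(left_of('b', 'c'))  # False
print(above('a', 'c'))    # True
```

Comparing centres only is a coarse approximation: two heavily overlapping boxes can still count as `left` of one another.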
