# Image recognition with TTR


## Bridging between perceptual and conceptual domains

Let's apply the object detection representation proposed in Dobnik & Cooper's *Interfacing language, spatial perception and cognition in TTR* to image recognition.

![Fig 8](fig/lspc-fig8.png)

Here we use `Image` instead of `PointMap` for the whole, and for `reg` we use yet another type, `ImgPart`. In Dobnik & Cooper's case the same type can represent both the region and the whole, because a `PointMap` is a set of absolute positions. In an `Image`, positions are relative to an origin, which has to be specified when cropping.
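
To make the difference concrete, here is a minimal sketch in plain Python, not PyTTR. The names `crop_points` and `crop_pixels` are hypothetical stand-ins: a point map is modelled as a set of absolute `(x, y)` positions, and an image as a nested list of pixel rows. A sub-region of the point map is again a point map, whereas a cropped pixel grid restarts at index `(0, 0)` and loses its position unless the origin is stored alongside it.

```python
# Hypothetical stand-ins for illustration, not PyTTR types.

def crop_points(point_map, x1, y1, x2, y2):
    '''Sub-region of a point map: still a set of absolute positions.'''
    return {(x, y) for (x, y) in point_map if x1 <= x < x2 and y1 <= y < y2}

def crop_pixels(pixels, x1, y1, x2, y2):
    '''Sub-region of a pixel grid: indices restart at 0, so keep the origin.'''
    return {'origin': (x1, y1), 'pixels': [row[x1:x2] for row in pixels[y1:y2]]}

point_map = {(x, y) for x in range(4) for y in range(4)}
# Each "pixel" stores its own absolute position, to make the shift visible.
pixels = [[(x, y) for x in range(4)] for y in range(4)]

region = crop_points(point_map, 2, 1, 4, 3)
part = crop_pixels(pixels, 2, 1, 4, 3)

print(sorted(region))        # absolute positions survive the crop
print(part['pixels'][0][0])  # the pixel at local index (0, 0) ...
print(part['origin'])        # ... came from this absolute position
```

`ImgPart` takes the second route: it keeps the whole image in `i` and records where the part lies via `cx`, `cy`, `w` and `h`.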

So in the general case, the domain of an `ObjectDetector` function need not be the same type as the `reg` field of its output elements.

In [9]:
import sys
sys.path.append('pyttr')
from pyttr.ttrtypes import *
import PIL.Image

# Basic types.

Ind = BType('Ind')

Int = BType('Int')
Int.learn_witness_condition(lambda x: isinstance(x, int))
print(Int.query(365))

Image = BType('Image')
Image.learn_witness_condition(lambda x: isinstance(x, PIL.Image.Image))
img = PIL.Image.open('res/dogcar.jpg')
print(Image.query(img))

# ImgPart type.

ImgPart = RecType({'i': Image, 'cx': Int, 'cy': Int, 'w': Int, 'h': Int})
print(ImgPart.query(Rec({'i': img, 'cx': 100, 'cy': 150, 'w': 40, 'h': 20})))

# Redefine Image.show() to work with Rec.show().
def image_show(self):
    return str(self)
PIL.Image.Image.show = image_show
show(img)

True
True
True


'<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FB257391DD8>'

In [10]:
Ppty = FunType(Ind, Ty)
ImageDetection = RecType({'reg': ImgPart, 'pfun': Ppty})
ImageDetections = ListType(ImageDetection)
ObjectDetector = FunType(Image, ImageDetections)

## Object detection model YOLO

Requires OpenCV and [Darkflow](https://github.com/thtrieu/darkflow). The `yolo.weights` file comes from the [YOLO](https://pjreddie.com/darknet/yolo/) site.

In [11]:
from darkflow.net.build import TFNet

tfnet = TFNet({"model": "yolo/yolo.cfg", "load": "yolo/yolo.weights",
    'config': 'yolo', "threshold": 0.1})

Parsing yolo/yolo.cfg
Loading yolo/yolo.weights ...
Successfully identified 203934260 bytes
Finished in 0.020205020904541016s
Model has a coco model name, loading coco labels.

Building net ...
Source | Train? | Layer description                | Output size
-------+--------+----------------------------------+---------------
       |        | input                            | (?, 608, 608, 3)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 608, 608, 32)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 304, 304, 32)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 304, 304, 64)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 152, 152, 64)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 152, 152, 128)
 Load  |  Yep!  | conv 1x1p0_1  +bnorm  leaky      | (?, 152, 152, 64)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 152, 152, 128)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 76, 76, 128)
 Load  |  Yep!  | conv 3x3p1_1  +b

In [12]:
import numpy as np

def xy1xy2_to_cwh(x1, y1, x2, y2):
    '''Transform (x1, y1, x2, y2) to (x_center, y_center, width, height).'''
    return {'cx': int(x1/2 + x2/2), 'cy': int(y1/2 + y2/2), 'w': x2 - x1, 'h': y2 - y1}
print(xy1xy2_to_cwh(10, 20, 30, 40))

def yolo_detector(i):
    return [Rec({
        'reg': Rec({
            'i': i,
            **xy1xy2_to_cwh(o['topleft']['x'], o['topleft']['y'], o['bottomright']['x'], o['bottomright']['y']),
        }),
        'pfun': Fun('v', Ind, PType(Pred(o['label'], [Ind]), ['v'])),
    }) for o in tfnet.return_predict(np.array(i))] # @todo RGB vs. BGR channel order?

{'cx': 20, 'h': 20, 'w': 20, 'cy': 30}
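
Going the other way is useful when a region has to be cut out of the image again. The sketch below adds a hypothetical helper, `cwh_to_xy1xy2` (not part of the notebook), that inverts `xy1xy2_to_cwh` under the assumption that the corner coordinates are non-negative integers. The returned tuple is in (left, upper, right, lower) order, which is the box order `PIL.Image.Image.crop` takes. `xy1xy2_to_cwh` is repeated so the snippet runs on its own.

```python
def xy1xy2_to_cwh(x1, y1, x2, y2):
    '''Copied from the cell above.'''
    return {'cx': int(x1/2 + x2/2), 'cy': int(y1/2 + y2/2), 'w': x2 - x1, 'h': y2 - y1}

def cwh_to_xy1xy2(cx, cy, w, h):
    '''Hypothetical inverse: (x_center, y_center, width, height) to (x1, y1, x2, y2).

    Assumes non-negative integer corners, so that the truncated center
    equals x1 + w // 2 (and likewise for y).
    '''
    x1 = cx - w // 2
    y1 = cy - h // 2
    return (x1, y1, x1 + w, y1 + h)

# Round trip with odd width and height, where the center gets truncated.
box = cwh_to_xy1xy2(**xy1xy2_to_cwh(10, 20, 31, 45))
print(box)
```

With Pillow available, `img.crop(box)` would then return the pixels of that region.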


In [13]:
image_detections = yolo_detector(img)

print(ImageDetections.query(image_detections))
print(ImageDetection.query(image_detections[0]))
print(Ppty.query(image_detections[0].pathvalue('pfun')))
print(ImgPart.query(image_detections[0].pathvalue('reg')))

print(show(image_detections))

True
True
True
True
[{reg = {cx = 138, h = 809, w = 276, i = <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FB257391DD8>, cy = 654}, pfun = lambda v:Ind . person(v)}, {reg = {cx = 714, h = 796, w = 706, i = <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FB257391DD8>, cy = 657}, pfun = lambda v:Ind . person(v)}, {reg = {cx = 194, h = 381, w = 380, i = <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FB257391DD8>, cy = 888}, pfun = lambda v:Ind . person(v)}, {reg = {cx = 490, h = 979, w = 774, i = <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FB257391DD8>, cy = 589}, pfun = lambda v:Ind . car(v)}, {reg = {cx = 704, h = 718, w = 687, i = <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FB257391DD8>, cy = 714}, pfun = lambda v:Ind . dog(v)}, {reg = {cx = 757, h = 210, w = 219, i = <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FB257391DD8>, cy = 541},
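
The `threshold` of 0.1 passed to `TFNet` is permissive, so the detector may return weak or overlapping boxes. Below is a minimal sketch of filtering the raw predictions before they are wrapped in `Rec`s. It assumes that each dict returned by `tfnet.return_predict` carries a `confidence` score next to the `label`, `topleft` and `bottomright` keys the notebook already reads; the sample list and the helper name `filter_predictions` are made up for illustration.

```python
# Made-up predictions in the shape the notebook reads from tfnet.return_predict();
# the 'confidence' key is an assumption about the detector's output.
predictions = [
    {'label': 'dog', 'confidence': 0.8,
     'topleft': {'x': 10, 'y': 20}, 'bottomright': {'x': 110, 'y': 220}},
    {'label': 'person', 'confidence': 0.15,
     'topleft': {'x': 0, 'y': 0}, 'bottomright': {'x': 50, 'y': 90}},
    {'label': 'car', 'confidence': 0.6,
     'topleft': {'x': 30, 'y': 40}, 'bottomright': {'x': 300, 'y': 260}},
]

def filter_predictions(preds, min_confidence):
    '''Keep only predictions whose confidence reaches min_confidence.'''
    return [p for p in preds if p['confidence'] >= min_confidence]

strong = filter_predictions(predictions, 0.5)
print([p['label'] for p in strong])
```

Raising `threshold` in the `TFNet` options would have a similar effect; filtering afterwards keeps the low-confidence candidates around for inspection.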

### Individuation function

According to lspc.ipynb, this cannot be implemented entirely within PyTTR; we need a Python function to do it (with PyTTR input and output).

In [26]:
pred_location = Pred('location', [Ind, ImgPart])  # second argument is the region (r.reg), an ImgPart
def ind_fun(r):
    if not ImageDetection.query(r):
        return None
    return RecType({
        'a': Ind,
        'loc': (Fun('v', Ind, PType(pred_location, ['v', r.reg])), ['a']),
        'c': (r.pfun, ['a']),
        # Why is property (function?) application represented with
        # just a <fun, arg> tuple? How does that help us?
    })

print(show(ind_fun(image_detections[0])))

{loc : (lambda v:Ind . location(v, {cx = 138, h = 809, w = 276, i = <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7FB257391DD8>, cy = 654}), [a]), c : (lambda v:Ind . person(v), [a]), a : Ind}
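
One way to read the `(fun, paths)` tuples in the record type above: they describe dependent fields, whose type is only fixed once the fields named in the path list have values. The sketch below mimics that idea in plain Python. It is not PyTTR's implementation; `resolve_dependent_fields` is a hypothetical helper, and strings stand in for TTR types.

```python
def resolve_dependent_fields(rec_type, record):
    '''Replace each (fun, paths) pair by fun applied to the record's values at paths.'''
    resolved = {}
    for label, field in rec_type.items():
        if isinstance(field, tuple):
            fun, paths = field
            resolved[label] = fun(*(record[p] for p in paths))
        else:
            resolved[label] = field
    return resolved

# Strings stand in for TTR types; 'reg0' stands in for an ImgPart record.
rec_type = {
    'a': 'Ind',
    'loc': (lambda v: 'location({}, reg0)'.format(v), ['a']),
    'c': (lambda v: 'person({})'.format(v), ['a']),
}

resolved = resolve_dependent_fields(rec_type, {'a': 'a0'})
print(resolved)
```

Keeping the function and the path apart lets the record type say `person(a)` for whatever individual ends up in `a`, without fixing that individual in advance.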


## Sketching spatial relations

In [19]:
def is_left_of(imgpart_s, imgpart_o):
    return imgpart_s.pathvalue('cx') < imgpart_o.pathvalue('cx')

print(is_left_of(image_detections[0].pathvalue('reg'), image_detections[1].pathvalue('reg')))
print(is_left_of(image_detections[1].pathvalue('reg'), image_detections[2].pathvalue('reg')))

True
False
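
The same center comparison can be run over every ordered pair of detections. This is only a sketch on plain dicts rather than `Rec` objects, and `left_of_pairs` is a hypothetical name, not a function from the notebook. The center coordinates are taken from the person, car and dog detections printed earlier.

```python
from itertools import permutations

def is_left_of(region_s, region_o):
    '''Same center comparison as above, on plain dicts instead of Rec objects.'''
    return region_s['cx'] < region_o['cx']

def left_of_pairs(detections):
    '''Hypothetical: ordered (subject, object) label pairs with subject left of object.'''
    return [(s['label'], o['label'])
            for s, o in permutations(detections, 2)
            if is_left_of(s['reg'], o['reg'])]

# Centers of three of the detections printed earlier.
detections = [
    {'label': 'person', 'reg': {'cx': 138, 'cy': 654}},
    {'label': 'car', 'reg': {'cx': 490, 'cy': 589}},
    {'label': 'dog', 'reg': {'cx': 704, 'cy': 714}},
]

pairs = left_of_pairs(detections)
print(pairs)
```

Comparing centers only is crude: two strongly overlapping boxes still count as left and right of each other, which may not match how the relation is used in language.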


In [None]:
left_of_detector()