# Image recognition with TTR


## Bridging between perceptual and conceptual domains

Let's apply the object detection representation proposed in Dobnik & Cooper's *Interfacing language, spatial perception and cognition in TTR* to image recognition.

![Fig 8](fig/lspc-fig8.png)

Here we use `Image` instead of `PointMap` for the whole scene, and for the region we replace `reg:PointMap` with a renamed field of yet another type, `seg:Segment`. In Cooper's case the same type can represent both the region and the whole, because a `PointMap` is a set of absolute positions. In an `Image`, positions are relative to an origin, which has to be specified when cropping.
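
To make the origin issue concrete, here is a minimal sketch in plain Python (no PyTTR or PIL needed). It assumes a segment is a dict with the `cx`, `cy`, `w`, `h` fields used for `Segment` in this notebook; the helper names `segment_to_box` and `to_crop_coords` are hypothetical. The box follows the (left, upper, right, lower) order that `PIL.Image.Image.crop` expects.

```python
def segment_to_box(seg):
    """Convert a center/size segment to a (left, upper, right, lower) box."""
    left = seg['cx'] - seg['w'] // 2
    upper = seg['cy'] - seg['h'] // 2
    return (left, upper, left + seg['w'], upper + seg['h'])

def to_crop_coords(point, seg):
    """Re-express an image position relative to the origin of the cropped segment."""
    left, upper, _, _ = segment_to_box(seg)
    return (point[0] - left, point[1] - upper)

seg = {'cx': 100, 'cy': 150, 'w': 40, 'h': 20}
box = segment_to_box(seg)
center_in_crop = to_crop_coords((100, 150), seg)
print(box, center_in_crop)
```

The same point has different coordinates in the whole image and in the crop, which is why a segment cannot simply be treated as a smaller `Image` without recording where it came from.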

I guess in the general case, the domain of an `ObjectDetector` function need not be the same as the `reg` fields in the output elements.

In [2]:
import sys
sys.path.append('pyttr')
from pyttr.ttrtypes import *
from pyttr.utils import *
import PIL.Image

ttrace()

# Basic types.

Ind = BType('Ind')

Int = BType('Int')
Int.learn_witness_condition(lambda x: isinstance(x, int))
print(Int.query(365))

Image = BType('Image')
Image.learn_witness_condition(lambda x: isinstance(x, PIL.Image.Image))
img = PIL.Image.open('res/dogcar.jpg')
print(Image.query(img))

# Segment type: a rectangular area of a given image.

Segment = RecType({#'i': Image,
    'cx': Int, 'cy': Int, 'w': Int, 'h': Int})
print(Segment.query(Rec({#'i': img,
    'cx': 100, 'cy': 150, 'w': 40, 'h': 20})))

# Redefine Image.show() to work with Rec.show().
def image_show(self):
    return str(self)
PIL.Image.Image.show = image_show
show(img)

True
True
True


'<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1080x1080 at 0x7F0B981042E8>'

In [292]:
from IPython.display import Latex

def latex(*objs):
    '''Render one or more PyTTR objects as LaTeX in the notebook.'''
    texcode = '\n\n'.join(to_ipython_latex(obj) for obj in objs)
    #print(texcode)
    return Latex(texcode)

In [293]:
latex(Segment)

<IPython.core.display.Latex object>

$Ind$ and $Image$ are basic types.

$Segment = \left[\begin{array}{rcl}
\text{cx} &:& Int\\
\text{cy} &:& Int\\
\text{w} &:& Int\\
\text{h} &:& Int\\
\end{array}\right]$

$Ppty = (Ind \rightarrow Type)$

$Object = \left[ \begin{array}{rcl}
    \text{pfun} &:& Ppty \\
    \text{seg} &:& Segment \\
\end{array} \right]$

$ObjectDetector = ( Image \rightarrow [Object] )$

In [356]:
Ppty = FunType(Ind, Ty)
Object = RecType({'seg': Segment, 'pfun': Ppty})
Objects = ListType(Object)
ObjectDetector = FunType(Image, Objects)

latex(Ppty, ObjectDetector)

<IPython.core.display.Latex object>

## Object detection model YOLO

We use an object detection model to detect and recognize objects in an image. The output is modeled as a list of TTR records.

Requires OpenCV and [Darkflow](https://github.com/thtrieu/darkflow). `yolo.weights` is from [Yolo](https://pjreddie.com/darknet/yolo/).

In [6]:
from darkflow.net.build import TFNet
import numpy as np

tfnet = TFNet({"model": "yolo/yolo.cfg", "load": "yolo/yolo.weights",
    'config': 'yolo', "threshold": 0.1})
yolo_out = dict()
def yolo(img):
    if str(img) not in yolo_out:
        yolo_out[str(img)] = tfnet.return_predict(np.array(img))
    return yolo_out[str(img)]

Parsing yolo/yolo.cfg
Loading yolo/yolo.weights ...
Successfully identified 203934260 bytes
Finished in 0.040488481521606445s
Model has a coco model name, loading coco labels.

Building net ...
Source | Train? | Layer description                | Output size
-------+--------+----------------------------------+---------------
       |        | input                            | (?, 608, 608, 3)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 608, 608, 32)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 304, 304, 32)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 304, 304, 64)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 152, 152, 64)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 152, 152, 128)
 Load  |  Yep!  | conv 1x1p0_1  +bnorm  leaky      | (?, 152, 152, 64)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 152, 152, 128)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 76, 76, 128)
 Load  |  Yep!  | conv 3x3p1_1  +b

In [295]:
# Make preds and ptypes identifiable by their predicate names.
# From now on, use mkptype().
ptypes = dict()
def mkptype(sym, types=[Ind], vars=['v']):
    id = '/'.join([sym, ','.join(show(type) for type in types), ','.join(vars)])
    if id not in ptypes:
        ptypes[id] = PType(Pred(sym, types), vars)
    return ptypes[id]

print(show(mkptype('rabbit') is mkptype('rabbit')))

True


In [296]:
def xy1xy2_to_cwh(x1, y1, x2, y2):
    '''Transform to center, width and height.'''
    return {'cx': int(x1/2 + x2/2), 'cy': int(y1/2 + y2/2), 'w': x2 - x1, 'h': y2 - y1}
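
As a quick worked example of the conversion, the sketch below repeats the function so that it runs on its own and applies it to a made-up detection. The `detection` dict is hypothetical; it only mirrors the `label`, `topleft` and `bottomright` keys that `yolo_detector()` reads.

```python
def xy1xy2_to_cwh(x1, y1, x2, y2):
    '''Transform corner coordinates to center, width and height.'''
    return {'cx': int(x1/2 + x2/2), 'cy': int(y1/2 + y2/2), 'w': x2 - x1, 'h': y2 - y1}

# Hypothetical detection, not real model output.
detection = {'label': 'dog',
             'topleft': {'x': 10, 'y': 20},
             'bottomright': {'x': 50, 'y': 40}}

seg = xy1xy2_to_cwh(detection['topleft']['x'], detection['topleft']['y'],
                    detection['bottomright']['x'], detection['bottomright']['y'])
print(seg)  # {'cx': 30, 'cy': 30, 'w': 40, 'h': 20}
```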

In [357]:
def yolo_detector(i):
    return [Rec({
        'seg': Rec({
            #'i': i,
            **xy1xy2_to_cwh(o['topleft']['x'], o['topleft']['y'], o['bottomright']['x'], o['bottomright']['y']),
        }),
        'pfun': Fun('v', Ind, mkptype(o['label'].replace(' ', '_'))),
        
    }) for o in yolo(i)] # @todo RGB/BGR?

objs = yolo_detector(img)

print(Objects.query(objs))
print(Object.query(objs[0]))
print(Ppty.query(objs[0].pfun))
print(Segment.query(objs[0].seg))

latex(objs[-1])

True
True
True
True


<IPython.core.display.Latex object>

## Individualization function

The object detection model gave us evidence that certain segments contain something that presents certain properties/classes.

Now let's recognize that there are individuals which are located at those segments and have those properties.

**Is the domain of $Individualize$ really objects *of type* $IndObj$? Can a record type be *of* another record type?**

$IndObj = \left[\begin{array}{rcl}
\text{x} &:& Ind\\
\text{c}_{prop} &:& Type\\
\text{c}_{loc} &:& Type\\
\end{array}\right]$

$Individualize : (Object \rightarrow IndObj)$

$Individualize = \lambda r : Object\ . \left[\begin{array}{rcl}
    \text{x} &:& Ind \\
    \text{c}_{prop} &:& r.\text{pfun}(\text{x}) \\
    \text{c}_{loc} &:& \text{location}(\text{x}, r.\text{seg}) \\
\end{array}\right]$

In [363]:
Loc = mkptype('location', [Ind, Segment], ['v_1', 'v_2'])
LocFun = Fun('v_1', Ind, Fun('v_2', Segment, Loc))

def individualize(r):
    return RecType({
        'x': Ind,
        'c_{prop}': r.pfun.app('x'),
        'c_{loc}': LocFun.app('x').app(r.seg),
    })
latex(list(individualize(r) for r in objs))

<IPython.core.display.Latex object>

## Combining commitments

All observed situations are combined into one, so they can be considered simultaneously.

To begin with, the function $Update$ merges two record types, embedding the first under a $\text{prev}$ field of the second.

$Update = \lambda r : RecType \ .\ \lambda s : RecType \ .\ \left[ s \cdot\wedge [\text{prev}: r ] \right]$

Then, $Combine$ applies $Update$ recursively through a list.

$Combine : ([RecType] \rightarrow RecType)$

$Combine = \lambda s_{0...i} : [RecType]\ . \begin{cases}
    s_0 & \text{if } i = 0, \\
    Update(Combine(s_{0...i-1}), s_i) & \text{otherwise} \\
\end{cases}$

The result is a deeply nested structure, which is then *flattened* and concisely *relabeled*. In the latter step we also eliminate duplicates of the singleton $Ind$ fields.

All the operations described in this section are implemented in Python (rather than PyTTR).
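
The shape of these operations can be sketched in plain Python, with dicts standing in for record types. This is only an illustration under assumptions: `update`, `combine` and `flatten` are hypothetical stand-ins, the dotted path labels are an assumed convention rather than PyTTR's, and because field values are plain strings here, dependencies such as the `x` in `dog(x)` are not re-pointed to the new labels.

```python
from functools import reduce

def update(r, s):
    """Stand-in for Update: s extended with a 'prev' field that holds r."""
    return {**s, 'prev': r}

def combine(situations):
    """Fold update over a non-empty list, as in the definition of Combine."""
    return reduce(update, situations)

def flatten(t, prefix=''):
    """Replace nested records by top-level fields with path labels."""
    flat = {}
    for label, value in t.items():
        if isinstance(value, dict):
            flat.update(flatten(value, prefix + label + '.'))
        else:
            flat[prefix + label] = value
    return flat

dog = {'x': 'Ind', 'c': 'dog(x)'}
car = {'x': 'Ind', 'c': 'car(x)'}
nested = combine([dog, car])
flat = flatten(nested)
print(nested)
print(flat)
```

Each further situation adds one more level of `prev`, so the nesting depth grows with the length of the list, which is what makes the flattening step useful.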

In [364]:
from functools import reduce
def combine_prev(*ts):
    return reduce(lambda t1, t2: RecType({'prev': t1}).merge(t2) if t2 else t1, ts)
latex(combine_prev(RecType({'a': 'A'}), RecType({'b': 'B'})))

<IPython.core.display.Latex object>

In [365]:
from itertools import product

objs_few = objs[2:5]
situations = [individualize(r) for r in objs_few]
comb = combine_prev(*situations)
latex(comb)

<IPython.core.display.Latex object>

In [366]:
combflat = comb.flatten()
latex(combflat)

<IPython.core.display.Latex object>

In [367]:
# My own copy of gensym(), just so I can reset it...
from itertools import count

my_gennum = dict()
def my_gensym(x):
    if x not in my_gennum:
        my_gennum[x] = count(1)
    return x + '_{' + str(next(my_gennum[x])) + '}'

def simplify_rectype(T):
    # Copy
    R = RecType()
    for k, v in T.comps.__dict__.items():
        R.addfield(k, v)

    # Squash inds
    ind_types = dict((show(t), t) for t in R.comps.__dict__.values() if isinstance(t, SingletonType)).values()
    for t in ind_types:
        l = my_gensym('x')
        for k, v in R.comps.__dict__.items():
            if equal(t, v):
                R.Relabel(k, l)
                
    # Prettify other labels
    for k, v in R.comps.__dict__.items():
        if not isinstance(v, SingletonType):
            l = my_gensym('c')
            R.Relabel(k, l)
    
    return R
            
simp = simplify_rectype(combflat)
latex(simp)

<IPython.core.display.Latex object>

## Spatial relations

In [369]:
# An index of Objects by Ind.
ind_objs = dict((r.x, r) for r in objs)

Left = mkptype('left', [Ind, Ind], ['a', 'b'])
Left.learn_witness_condition(lambda ab: ind_objs[ab[0]].seg.cx < ind_objs[ab[1]].seg.cx)
print(show(Left))

print(Left.query((objs[0].x, objs[1].x)))
print(Left.query((objs[1].x, objs[2].x)))

AttributeError: 'Rec' object has no attribute 'x'
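
The `AttributeError` arises because the `Object` records returned by `yolo_detector` only have `seg` and `pfun` fields, while this cell looks up `r.x`. A minimal sketch of the intended check in plain Python, assuming each detection has first been paired with an individual (the names `a_1`, `a_2`, the coordinates and the dict layout are hypothetical, not PyTTR output):

```python
# Hypothetical index from individuals to the segments they are located at.
ind_segs = {
    'a_1': {'cx': 30, 'cy': 30, 'w': 40, 'h': 20},
    'a_2': {'cx': 200, 'cy': 35, 'w': 60, 'h': 30},
}

def left(a, b):
    """a is left of b if the center of a's segment has the smaller x coordinate."""
    return ind_segs[a]['cx'] < ind_segs[b]['cx']

def spatial_facts(inds):
    """Collect every ordered pair of distinct individuals for which left holds."""
    return [('left', a, b) for a in inds for b in inds if a != b and left(a, b)]

facts = spatial_facts(list(ind_segs))
print(facts)
```

Comparing segment centers is the same witness condition as the one passed to `Left.learn_witness_condition` above; only the lookup from individual to segment is different.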

In [370]:
Rels = [Left]

def sit_rel(r, s):
    for Rel in Rels:
        if Rel.query((r.x, s.x)):
            RelFun = Fun('a', Ind, Fun('b', Ind, Rel))
            yield RecType({
                'x': SingletonType(Ind, r.x),
                'y': SingletonType(Ind, s.x),
                'c': RelFun.app('x').app('y'),
            })
        
latex(list(sit_rel(objs[0], objs[1])),
      list(sit_rel(objs[1], objs[0])))

AttributeError: 'Rec' object has no attribute 'x'

## Text parsing

In [371]:
def create_abc(prop_a, prop_b, rel):
    '''Creates a record type describing two individuals and a relation between them.'''
    return RecType({
        'x': Ind,
        'y': Ind,
        'c_{' + prop_a + '}': Fun('v', Ind, mkptype(prop_a)).app('x'),
        'c_{' + prop_b + '}': Fun('v', Ind, mkptype(prop_b)).app('y'),
        'c_{' + rel + '}': Fun('a', Ind, Fun('b', Ind, mkptype(rel, [Ind, Ind], ['a', 'b']))).app('x').app('y')
    })

print("A dog is to the left of a car")
question = create_abc('dog', 'car', 'left')
latex(question)

A dog is to the left of a car


<IPython.core.display.Latex object>

In [372]:
import nltk

grammar = nltk.grammar.FeatureGrammar.fromstring('''
%start S
S[SEM=(?a, ?b, ?prep)] -> NP[SEM=?a] 'is' Prep[SEM=?prep] NP[SEM=?b]
NP[DEF=?def, SEM=?n] -> Det[DEF=?def] N[SEM=?n]
N[SEM=<dog>] -> 'dog'
N[SEM=<car>] -> 'car'
N[SEM=<person>] -> 'person'
N[SEM=<chair>] -> 'chair'
Det -> 'a' | 'an'
Prep[SEM=<left>] -> 'to' 'the' 'left' 'of'
Prep[SEM=<right>] -> 'to' 'the' 'right' 'of'
Prep[SEM=<above>] -> 'above'
Prep[SEM=<under>] -> 'under'
''')
parser = nltk.FeatureChartParser(grammar)

texts = [
    'A dog is to the left of a car',
    'A car is to the left of a dog',
#     'There is a dog to the left of a car',
#     'Is the dog to the left of the car',
#     'Is there a dog to the left of the car',
]

def parse_abc(text):
    trees = parser.parse(text.lower().split())
    tree = list(trees)[0]
    sem = nltk.sem.root_semrep(tree)
    return create_abc(*(str(s) for s in sem))

for text in texts:
    print(text)
    r = parse_abc(text)
    print(show(r))

latex(r)

A dog is to the left of a car
{c_{dog} : dog(x), c_{left} : left(x, y), c_{car} : car(y), x : Ind, y : Ind}
A car is to the left of a dog
{c_{car} : car(x), c_{left} : left(x, y), c_{dog} : dog(y), x : Ind, y : Ind}


<IPython.core.display.Latex object>

## Checking text against image

Essentially, we would like to check whether the observed situation $A$ is a subtype of the situation $Q$ described by the text/question, i.e. whether $Q \sqsupseteq A$. A new problem here is that field labels do not match, even when the field values (the types) do. We thus need to consider all (?) relabelings of Q:

A record type $T_1$ is a *relabel-subtype* of $T_2$ if there is a relabeling $T_{1_{rlb}}$ of $T_1$ such that $T_{1_{rlb}} \sqsubseteq T_2$.

Could we forget field labels and just compare the two sets of field values? Not really, because we have dependent types, so $\text{dog}(x_1) \neq \text{dog}(x_2)$. We need to carry out each candidate *relabeling* and check the subtype relation. In practice, and in this case, relabeling the basic-type ($Ind$) fields is enough, because those are the only ones whose labels appear in dependent fields. For each relabeling of the basic fields, we can then ignore the remaining labels and only look for a subtype match between field values.
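
The search can be sketched in plain Python, independently of PyTTR. The sketch assumes a record type is given as a list of $Ind$ labels plus a set of `(predicate, arguments)` facts, and it simplifies the subtype test on dependent fields to set inclusion of identical facts; `find_relabeling` and this data layout are hypothetical.

```python
from itertools import permutations

def find_relabeling(observed, described):
    """Map the Ind labels of `described` onto those of `observed` so that
    every described fact also holds in the observation; None if impossible."""
    obs_inds, obs_facts = observed
    des_inds, des_facts = described
    # Try every injective assignment of described labels to observed labels.
    for chosen in permutations(obs_inds, len(des_inds)):
        mapping = dict(zip(des_inds, chosen))
        relabeled = {(pred, tuple(mapping[a] for a in args))
                     for pred, args in des_facts}
        if relabeled <= obs_facts:
            return mapping
    return None

observed = (['x_1', 'x_2'],
            {('dog', ('x_1',)), ('car', ('x_2',)), ('left', ('x_1', 'x_2'))})
dog_left_of_car = (['x', 'y'],
                   {('dog', ('x',)), ('car', ('y',)), ('left', ('x', 'y'))})
car_left_of_dog = (['x', 'y'],
                   {('car', ('x',)), ('dog', ('y',)), ('left', ('x', 'y'))})

print(find_relabeling(observed, dog_left_of_car))
print(find_relabeling(observed, car_left_of_dog))
```

Only the individual labels are permuted; the labels of the dependent fields never enter the comparison, which is the simplification argued for above.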

In [373]:
from itertools import permutations, combinations

def copy_rectype(T):
    R = RecType()
    for k, v in T.comps.__dict__.items():
        R.addfield(k, v)
    return R

def is_basic_type(T):
    tn = lambda T: type(T).__name__
    return (tn(T) == 'BType') if tn(T) != 'SingletonType' else is_basic_type(T.comps.base_type)

def basic_fields(T):
    return [k for k, v in T.comps.__dict__.items() if is_basic_type(v)]

def nonbasic_fields(T):
    return [k for k, v in T.comps.__dict__.items() if not is_basic_type(v)]

def find_subtype_relabeling(T, U):
    '''Could record type T be a subtype of record type U if relabeling in U is allowed?

    Returns a dict from U labels to T labels, or None if no relabeling works.'''
    # Find possible relabelings for basic-type fields
    basic_label_permutations = set(ps[:len(basic_fields(U))] for ps in permutations(basic_fields(T)))
    
    for tks in basic_label_permutations:
        # Copy U and try a basic-fields relabeling
        U2 = copy_rectype(U)
        rlb = list(zip(basic_fields(U), tks))
        for uk, tk in rlb:
            U2.Relabel(uk, tk)
        
        # For each U field, find a T field that is a subtype
        match = dict()
        for uk in nonbasic_fields(U2):
            for tk in nonbasic_fields(T):
                if T.comps.__dict__[tk].subtype_of(U2.comps.__dict__[uk]):
                    match[uk] = tk
                    break
            if uk not in match:
                break

        # Successful if all non-basic fields match.
        if len(match) == len(nonbasic_fields(U2)):
            return dict(list(rlb) + list(match.items()))
    return None

r = parse_abc(texts[1])
rlb = find_subtype_relabeling(simp, r)
print(rlb)
if rlb is not None:
    # Apply the relabeling to a copy of r and confirm the subtype relation.
    r2 = copy_rectype(r)
    for k1, k2 in rlb.items():
        r2.Relabel(k1, k2)
    print(simp.subtype_of(r2))
    print(show(r2))

None