>[Getting started](#scrollTo=JqL6-Dnri-nI)

>>[Imports](#scrollTo=j_FYRZOYR2Cx)

>>[Defining the KG schema](#scrollTo=jC_zFFk5RwyL)

>>[Loading KG triples](#scrollTo=j07E2VBsR7EX)

>[Following relations in the KG](#scrollTo=dRrFTN4_u6FW)

>[Conditionals](#scrollTo=qxaVy-iffD8u)

>[Quantifying over relations](#scrollTo=7cyyZan-zrDe)

>[Under the hood - NQL and TensorFlow](#scrollTo=K7DOLcCbfPWW)

>>[Weighted sets and counting paths](#scrollTo=f8WxHlQNTPKG)

>>[Why weights are useful](#scrollTo=t6IMb1iOgJSC)

>>[Following relations via sparse matrix multiplication](#scrollTo=3m10zmsqY_37)



# Getting started

## Imports

NQL, the Neural Query Language, is a PathQuery-like language which is closely integrated with TensorFlow.  This lab walks you through some simple use cases for NQL.  The focus is on explaining the semantics of the query language; another Colab looks at using NQL in conjunction with TensorFlow for learning.

In [0]:
%tensorflow_version 2.x
!test -d language_repo || git clone https://github.com/google-research/language.git language_repo
%cd /content/language_repo/language/nql
!pwd
!pip install .
%cd
!pwd

/content/language_repo/language/nql
/content/language_repo/language/nql
Processing /content/language_repo/language/nql
Building wheels for collected packages: nql
  Building wheel for nql (setup.py) ... done
  Created wheel for nql: filename=nql-0.0.1.dev0-cp36-none-any.whl size=48759 sha256=a9c2a673a44f830cdc58babff153e9c5521a61208cdc77a39503be00c307621b
  Stored in directory: /tmp/pip-ephem-wheel-cache-3qoxtlh9/wheels/51/1e/3d/f92e27698ae7f6d586fd5482f569704ed8ba1a16f293bb0a5c
Successfully built nql
Installing collected packages: nql
  Found existing installation: nql 0.0.1.dev0
    Uninstalling nql-0.0.1.dev0:
      Successfully uninstalled nql-0.0.1.dev0
Successfully installed nql-0.0.1.dev0
/root
/root


In [0]:
import math
import time
import importlib

import numpy as np
import tensorflow as tf
import nql
from nql import util
  
# Prints a dictionary or list in a sorted way.
def printd(obj, top=None):
  if isinstance(obj, dict):
    print(sorted(obj.items())[:top])
  elif isinstance(obj, list):
    print(sorted(obj)[:top])
  else:
    print(obj)

## Defining the KG schema

Now, let's set up a small KG to run over.  The one we'll work with is inspired by a [very old Hinton paper on learning relations](http://www.cs.toronto.edu/~fritz/absps/icml-lre.pdf). It has one type - a person type - and twelve relations, which correspond to familial relations. 

We start out by setting up a context object, to hold the data, and defining a schema.  By convention a type name will end with "_t". When we declare a relation, we will provide the name of the type of its first and second arguments.  So to define the schema for this KG we will do this:

In [0]:
context = nql.NeuralQueryContext()
relations = [
  'aunt', 'brother', 'daughter', 'father', 'husband', 'mother', 
  'nephew', 'niece', 'sister', 'son', 'uncle', 'wife' 
]
for r in relations:
  context.declare_relation(r, 'person_t', 'person_t')

print('declared relations', sorted(context.get_relation_names()))
print('declared types', sorted(context.get_type_names()))


declared relations ['aunt', 'brother', 'daughter', 'father', 'husband', 'mother', 'nephew', 'niece', 'sister', 'son', 'uncle', 'wife']
declared types ['person_t']


## Loading KG triples

Now let's load in some data.  These are the relationships between members of several European royal families.  Loading these will define the relations that we declared.

In [0]:
with open('/content/language_repo/language/nql/demos/data/royal92/royal_family.tsv') as f:
  context.load_kg(files=f)

INFO:tensorflow:0 facts loaded


The original [input](https://github.com/google-research/language/tree/master/language/nql/demos/data/royal92/royal_family.tsv) is stored as tab-separated triples, each triple being a relation name followed by two person names.  For instance, the first few lines are as follows.

In [0]:
with open('/content/language_repo/language/nql/demos/data/royal92/royal_family.tsv') as f:
  print(f.readlines()[:3])

['husband\tVictoria of house of Hanover\tAlbert Augustus Charles\n', 'wife\tAlbert Augustus Charles\tVictoria of house of Hanover\n', 'father\tVictoria Adelaide Mary\tAlbert Augustus Charles\n']
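
To make the layout concrete, here is a minimal pure-Python sketch that groups such lines by relation. It assumes each line is exactly `relation<TAB>subject<TAB>object`, as in the three lines printed above; `parse_triples` is a hypothetical helper for illustration, not part of NQL, and `context.load_kg` may well accept more than this.

```python
from collections import defaultdict

def parse_triples(lines):
    """Group (subject, object) pairs by relation name."""
    kg = defaultdict(list)
    for line in lines:
        line = line.rstrip('\n')
        if not line:
            continue  # skip blank lines
        rel, subj, obj = line.split('\t')
        kg[rel].append((subj, obj))
    return dict(kg)

# The three lines shown in the notebook output above.
sample = [
    'husband\tVictoria of house of Hanover\tAlbert Augustus Charles\n',
    'wife\tAlbert Augustus Charles\tVictoria of house of Hanover\n',
    'father\tVictoria Adelaide Mary\tAlbert Augustus Charles\n',
]
kg = parse_triples(sample)
print(sorted(kg))
```

Reading a triple as "the *relation* of *subject* is *object*" matches the queries later in this notebook, where following *husband* from Victoria yields Albert.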


# Following relations in the KG

Now that a KG is loaded in, we can use the context object as a factory to create some NQL expressions.  The *one* method creates a singleton set containing one entity, which is specified via its name and type.

In [0]:
queen_vic = context.one('Victoria of house of Hanover', 'person_t')
henry8 = context.one('Henry_VIII of house of Tudor', 'person_t')

print(queen_vic.eval())
print(henry8.eval())

{'Victoria of house of Hanover': 1.0}
{'Henry_VIII of house of Tudor': 1.0}


Expressions in NQL always evaluate to *sets* of entities.  So, if you want to inspect a particular entity, you need to construct a singleton set containing that entity.  For example, the expressions above find individual people in the KG (i.e., entities of type 'person_t').  Like TensorFlow expressions, NQL expressions are lazy, and you should evaluate them in the context of a session.

The default output of *nql_expression*.eval() is a Python dictionary mapping entity names to floats.  The floats make it possible to have a "soft" or weighted set of entities.  We'll come back to this feature later, but for now we can ignore the floats.

We can also follow relations in the KG, for example, asking 'who were the wives of Henry VIII?'

In [0]:
printd(henry8.wife().eval())

[('Anne of house of Boleyn', 1.0), ('Anne of_Cleves', 1.0), ('Catherine of house of Howard', 1.0), ('Catherine of house of Parr', 1.0), ('Catherine of_Aragon', 1.0), ('Jane of house of Seymour', 1.0)]


Alternatively you can follow relations by name, or follow the inverse of a relation.  You can also chain relations.

In [0]:
printd(queen_vic.follow('husband').eval())
printd(queen_vic.follow('wife', -1).eval())
printd(queen_vic.wife(-1).eval())
printd(queen_vic.son().son().eval())

[('Albert Augustus Charles', 1.0)]
[('Albert Augustus Charles', 1.0)]
[('Albert Augustus Charles', 1.0)]
[('Albert Victor Christian', 1.0), ('Alfred', 1.0), ('Arthur of_Connaught', 1.0), ('Charles Edward', 1.0), ('George_V of house of Windsor', 1.0), ('John Alexander', 1.0)]
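
The behavior of *follow* and its inverse can be sketched in pure Python over dictionaries of weights. This is only an illustration of the semantics under the assumption that a relation is a list of (subject, object) pairs; `TOY_KG` and this `follow` function are hypothetical stand-ins, not NQL's implementation.

```python
from collections import defaultdict

# Toy edges per relation as (subject, object) pairs, taken from the data above.
TOY_KG = {
    'husband': [('Victoria of house of Hanover', 'Albert Augustus Charles')],
    'wife': [('Albert Augustus Charles', 'Victoria of house of Hanover')],
}

def follow(weighted_set, relation, direction=1, kg=TOY_KG):
    """Follow `relation` from every member of `weighted_set`.

    direction=+1 steps subject -> object; direction=-1 steps object -> subject.
    """
    result = defaultdict(float)
    for subj, obj in kg[relation]:
        src, dst = (subj, obj) if direction == 1 else (obj, subj)
        if src in weighted_set:
            result[dst] += weighted_set[src]
    return dict(result)

queen_vic_set = {'Victoria of house of Hanover': 1.0}
print(follow(queen_vic_set, 'husband'))
print(follow(queen_vic_set, 'wife', -1))
```

Both calls reach the same person, mirroring the first two query results above; chaining corresponds to feeding the result of one `follow` call into the next.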


You can take unions and intersections of sets.

In [0]:
print('queen victoria\'s children')
print(' - sons:')
printd(queen_vic.son().eval())

print(' - daughters:')
printd(queen_vic.daughter().eval())

print(' - all:')
printd((queen_vic.son() | queen_vic.daughter()).eval())

queen victoria's children
 - sons:
[('Alfred Ernest Albert', 1.0), ('Arthur William Patrick', 1.0), ('Edward_VII of house of Wettin', 1.0), ('Leopold George Duncan', 1.0)]
 - daughters:
[('Alice Maud Mary', 1.0), ('Beatrice Mary Victoria', 1.0), ('Helena Augusta Victoria', 1.0), ('Louise Caroline Alberta', 1.0), ('Victoria Adelaide Mary', 1.0)]
 - all:
[('Alfred Ernest Albert', 1.0), ('Alice Maud Mary', 1.0), ('Arthur William Patrick', 1.0), ('Beatrice Mary Victoria', 1.0), ('Edward_VII of house of Wettin', 1.0), ('Helena Augusta Victoria', 1.0), ('Leopold George Duncan', 1.0), ('Louise Caroline Alberta', 1.0), ('Victoria Adelaide Mary', 1.0)]


In [0]:
printd((henry8.wife() & context.one('Ferdinand_V', 'person_t').daughter()).eval())

[('Catherine of_Aragon', 1.0)]


(By the  way, * and + are synonyms for & and |, so you can use either notation).
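
The reason `+` and `*` work as synonyms is easier to see in a small sketch. Assuming the weighted-set semantics this notebook describes in its "under the hood" section (union adds weights, intersection multiplies them), the two operators might be modeled on plain dictionaries as follows; `union` and `intersection` are hypothetical helpers, not NQL functions.

```python
def union(x, y):
    """Weighted union: add weights component-wise."""
    keys = set(x) | set(y)
    return {k: x.get(k, 0.0) + y.get(k, 0.0) for k in keys}

def intersection(x, y):
    """Weighted intersection: multiply weights, dropping zero entries."""
    return {k: x[k] * y[k] for k in x if k in y and x[k] * y[k] != 0.0}

sons = {'Alfred Ernest Albert': 1.0, 'Leopold George Duncan': 1.0}
daughters = {'Alice Maud Mary': 1.0}

children = union(sons, daughters)
only_alfred = intersection(children, {'Alfred Ernest Albert': 1.0})
print(sorted(children.items()))
print(only_alfred)
```

When the two sets are disjoint, as with sons and daughters, the summed weights stay at 1.0, which is why the union above looks like an ordinary set union.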

And as in TensorFlow, you can use Python functions which return expressions as macros, to build up more complex relationships.

In [0]:
def child(x): return x.son() | x.daughter()
def parent(x): return x.mother() | x.father()
def grandfather(x): return parent(x).father()

printd(child(queen_vic).eval())
printd(grandfather(henry8).eval())

[('Alfred Ernest Albert', 1.0), ('Alice Maud Mary', 1.0), ('Arthur William Patrick', 1.0), ('Beatrice Mary Victoria', 1.0), ('Edward_VII of house of Wettin', 1.0), ('Helena Augusta Victoria', 1.0), ('Leopold George Duncan', 1.0), ('Louise Caroline Alberta', 1.0), ('Victoria Adelaide Mary', 1.0)]
[('Edmund of house of Tudor #2', 1.0), ('Edward_IV', 1.0)]


Also notice that the newly-defined constructs are functions: e.g., *child(S)* returns a new set *S'* which contains the children of the members of *S.*  If it seems awkward to switch from "chaining-style" syntax (e.g., *x.son().son()*) to "function-call" syntax, there is a way around that, which we will discuss below.

# Conditionals

This dataset has no sex or gender indications, so here's a way to identify the men and women in the dataset.  Note the use of the new construct *context.all(type)* which returns a set of all entities in a type.

In [0]:
def male(x): return (x.mother().son() & x)
def female(x): return (x.mother().daughter() & x)

printd(male(context.all('person_t')).eval(), 5)
printd(female(context.all('person_t')).eval(), 5)



[('Adalbert', 7.0), ('Adalbert #2', 8.0), ('Adolphe of_Luxembourg', 2.0), ('Adolphus', 3.0), ('Adolphus 2nd', 4.0)]
[('(Daughter)', 11.0), ('(Sophia) Charlotte', 1.0), ('Adela', 10.0), ('Adelaide', 4.0), ('Adelaide Horatia Elizabeth of house of Seymour', 1.0)]


Note that the weights in these sets are not uniform -- we'll get back to this below, in the "under the hood" section.

Now, however, we will use *male* and *female* to demonstrate the syntax for conditional relation-following.  As a somewhat artificial example, suppose we want to define *opposite_sex_parent* as the father of a female person, and mother of a male person.  We could do this as follows.

In [0]:
def opposite_sex_parent(x):
  return x.father().if_any(female(x)) | x.mother().if_any(male(x))

print('opposite_sex_parent of henry:')
printd(opposite_sex_parent(henry8).eval())
print('opposite_sex_parent of victoria:')
printd(opposite_sex_parent(queen_vic).eval())

opposite_sex_parent of henry:
[('Elizabeth of_York', 1.0)]
opposite_sex_parent of victoria:
[('Edward Augustus of house of Hanover', 1.0)]
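
As a rough model of what *if_any* does, the sketch below scales one weighted set by the total weight of another, so the first set survives only when the second is non-empty. It assumes the semantics stated in this notebook's "under the hood" section (multiply **v** by the sum of the components of **w**); this `if_any` function is a hypothetical dictionary-based stand-in, not NQL's method.

```python
def if_any(x, y):
    """Scale every weight in x by the total weight in y (empty y gives {})."""
    total = sum(y.values())
    return {k: w * total for k, w in x.items() if w * total != 0.0}

father_of_vic = {'Edward Augustus of house of Hanover': 1.0}
vic_is_female = {'Victoria of house of Hanover': 1.0}  # female(queen_vic)
vic_is_male = {}                                       # male(queen_vic) is empty

print(if_any(father_of_vic, vic_is_female))
print(if_any(father_of_vic, vic_is_male))
```

The first call keeps the father because the condition set is non-empty; the second returns an empty set, so that branch contributes nothing to the union.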


# Quantifying over relations

NQL lets you "follow" specific relations.  Sometimes it's useful to variabilize the relations you follow.  We already saw the syntax *x.follow(r)* where *r* is a Python variable bound to a string naming a relation.  Here we'll discuss a way to push these "relation variables" down into Tensorflow.

We will first construct a "relation group" consisting of a selected subset of relations.  By convention relation group names end with "_g".  You usually won't need the return value of *construct_relation_group*, but it's printed here to help explain what happens under the hood, below.

In [0]:
nuclear_family_relations = ['son', 'daughter', 'father', 'mother', 'brother', 'sister', 'wife', 'husband']
g = context.construct_relation_group('nuclear_family_g', 'person_t', 'person_t', nuclear_family_relations)
printd(g.__dict__)

[('members', ['son', 'daughter', 'father', 'mother', 'brother', 'sister', 'wife', 'husband']), ('name', 'nuclear_family_g'), ('object_rel', 'nuclear_family_g_obj'), ('relation_rel', 'nuclear_family_g_rel'), ('subject_rel', 'nuclear_family_g_subj'), ('triple_prefix', 'nuclear_family_g_trip_'), ('triple_type', 'nuclear_family_g_triple_t'), ('weight_rel', 'nuclear_family_g_weight')]


The effect of this call is to add two new types and four new relations to the KG.

In [0]:
print('all relations:')
printd(context.get_relation_names())
print('all types:')
printd(context.get_type_names())
print('new relations:')
printd(list(filter(lambda x:x.startswith(g.name), context.get_relation_names())))
print('new types:')
printd(list(filter(lambda x:x.startswith(g.name), context.get_type_names())))

all relations:
dict_keys(['aunt', 'brother', 'daughter', 'father', 'husband', 'mother', 'nephew', 'niece', 'sister', 'son', 'uncle', 'wife', 'nuclear_family_g_rel', 'nuclear_family_g_subj', 'nuclear_family_g_obj', 'nuclear_family_g_weight'])
all types:
dict_keys(['person_t', 'nuclear_family_g', 'nuclear_family_g_triple_t'])
new relations:
['nuclear_family_g_obj', 'nuclear_family_g_rel', 'nuclear_family_g_subj', 'nuclear_family_g_weight']
new types:
['nuclear_family_g', 'nuclear_family_g_triple_t']


The instances of *nuclear_family_g* are the relations listed in the call to *construct_relation_group*.

In [0]:
for rel_id in range(context.get_max_id('nuclear_family_g')):
  print(context.get_entity_name(rel_id, 'nuclear_family_g'))

son
daughter
father
mother
brother
sister
wife
husband


We can now define sets of relations the same way that we can define sets of entities, for example:

In [0]:
spouse = context.one('wife', 'nuclear_family_g') | context.one('husband', 'nuclear_family_g')
printd(spouse.eval())

[('husband', 1.0), ('wife', 1.0)]


NQL lets you follow a group of relations, much as you can follow a single relation.

In [0]:
printd(henry8.follow(spouse).eval())
printd(queen_vic.follow(spouse).eval())

[('Anne of house of Boleyn', 1.0), ('Anne of_Cleves', 1.0), ('Catherine of house of Howard', 1.0), ('Catherine of house of Parr', 1.0), ('Catherine of_Aragon', 1.0), ('Jane of house of Seymour', 1.0)]
[('Albert Augustus Charles', 1.0)]


As another example, these are all of the members of Queen Victoria's nuclear family.

In [0]:
printd(queen_vic.follow(context.all('nuclear_family_g')).eval())

[('Albert Augustus Charles', 1.0), ('Alfred Ernest Albert', 1.0), ('Alice Maud Mary', 1.0), ('Arthur William Patrick', 1.0), ('Beatrice Mary Victoria', 1.0), ('Edward Augustus of house of Hanover', 1.0), ('Edward_VII of house of Wettin', 1.0), ('Helena Augusta Victoria', 1.0), ('Leopold George Duncan', 1.0), ('Louise Caroline Alberta', 1.0), ('Victoria Adelaide Mary', 1.0), ('Victoria Mary Louisa', 1.0)]


Under the hood, this is implemented using the other new type, the triple type.  Let's look at a particular triple, the one which encodes the fact that Queen Victoria's husband is Prince Albert.

In [0]:
trip = queen_vic.follow('nuclear_family_g_subj',-1) & spouse.follow('nuclear_family_g_rel',-1)
printd(trip.eval())
print(context.get_id(list(trip.eval().keys())[0], 'nuclear_family_g_triple_t'))

[('nuclear_family_g_trip_14295', 1.0)]
14295


In [0]:
t = context.one('nuclear_family_g_trip_14295', 'nuclear_family_g_triple_t')
print('triple t:')
printd(t.eval())
print('subject of t:')
printd(t.nuclear_family_g_subj().eval())
print('object of t:')
printd(t.nuclear_family_g_obj().eval())
print('relation of t:')
printd(t.nuclear_family_g_rel().eval())


triple t:
[('nuclear_family_g_trip_14295', 1.0)]
subject of t:
[('Victoria of house of Hanover', 1.0)]
object of t:
[('Albert Augustus Charles', 1.0)]
relation of t:
[('husband', 1.0)]


This lets you follow relation sets using NQL's other primitives - for instance, to follow the *spouse* relation for Queen Victoria, we first find triples that have her as the subject, and something in the set of *spouse* relations as the relation.  We can then step from these triples (in this case, there's only one) to the object.

In [0]:
queen_vics_spouse_triples = queen_vic.nuclear_family_g_subj(-1) & spouse.nuclear_family_g_rel(-1)
queen_vics_spouse = queen_vics_spouse_triples.nuclear_family_g_obj()

print(queen_vics_spouse.eval())

{'Albert Augustus Charles': 1.0}


Similarly, for Henry VIII:

In [0]:
henry8_spouse_triples = henry8.nuclear_family_g_subj(-1) & spouse.nuclear_family_g_rel(-1)
print('triples:')
printd(henry8_spouse_triples.eval())
print('wives:')
printd(henry8_spouse_triples.nuclear_family_g_obj().eval())

triples:
[('nuclear_family_g_trip_13470', 1.0), ('nuclear_family_g_trip_13472', 1.0), ('nuclear_family_g_trip_13473', 1.0), ('nuclear_family_g_trip_13474', 1.0), ('nuclear_family_g_trip_13476', 1.0), ('nuclear_family_g_trip_13478', 1.0)]
wives:
[('Anne of house of Boleyn', 1.0), ('Anne of_Cleves', 1.0), ('Catherine of house of Howard', 1.0), ('Catherine of house of Parr', 1.0), ('Catherine of_Aragon', 1.0), ('Jane of house of Seymour', 1.0)]
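
The triple-based recipe can be summarized in a few lines of plain Python. The sketch below assumes a tiny hand-written triple table; the ids `t0` to `t2` and the `follow_group` helper are hypothetical and only mirror the subject/relation/object steps described above.

```python
# Hypothetical triple table: triple id -> (subject, relation, object).
triples = {
    't0': ('Victoria of house of Hanover', 'husband', 'Albert Augustus Charles'),
    't1': ('Albert Augustus Charles', 'wife', 'Victoria of house of Hanover'),
    't2': ('Victoria Adelaide Mary', 'father', 'Albert Augustus Charles'),
}

def follow_group(entities, relations, triples):
    """Follow a weighted set of relations from a weighted set of entities."""
    result = {}
    for subj, rel, obj in triples.values():
        # A triple is selected when its subject AND its relation are selected.
        weight = entities.get(subj, 0.0) * relations.get(rel, 0.0)
        if weight:
            result[obj] = result.get(obj, 0.0) + weight
    return result

spouse_rels = {'wife': 1.0, 'husband': 1.0}
vic = {'Victoria of house of Hanover': 1.0}
print(follow_group(vic, spouse_rels, triples))
```

The multiplication plays the role of the `&` between the subject-matched and relation-matched triple sets, and the final accumulation corresponds to stepping from the selected triples to their objects.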


# Subclassing NQLExpression

Sometimes it's clearer to define methods that you can chain, rather than functions that you call.  For instance above we added 

```
def child(x): return x.son() | x.daughter()
def parent(x): return x.mother() | x.father()
def grandfather(x): return parent(x).father()
```
allowing you to write
```
print(child(queen_vic).eval())
print(grandfather(henry8).eval())
```

But it might be nicer to be able to use the same syntax for both user-defined things like *child* and KG-defined things like *son*.  Of course, we could easily define a functional version of the KG-based relations, like so:







In [0]:
def son_fun(x): return x.son()

printd(son_fun(queen_vic).eval())

[('Alfred Ernest Albert', 1.0), ('Arthur William Patrick', 1.0), ('Edward_VII of house of Wettin', 1.0), ('Leopold George Duncan', 1.0)]


Defining chaining-style operations is a little more complicated.  We need to subclass the NQLExpression class and add some new methods.  Then, because the *context* is a factory class, which generates expressions, we need to tell it to create instances of my subclass.  To avoid confusion I'll do this with a fresh context object.

In [0]:
class MyNQLExpression(nql.NeuralQueryExpression):
  def child(self): return self.son() | self.daughter()

#load a new copy of the context object
context2 = nql.NeuralQueryContext()
context2.expression_factory_class = MyNQLExpression
relations = [
  'aunt', 'brother', 'daughter', 'father', 'husband', 'mother', 
  'nephew', 'niece', 'sister', 'son', 'uncle', 'wife' 
]
for r in relations:
  context2.declare_relation(r, 'person_t', 'person_t')
with open('/content/language_repo/language/nql/demos/data/royal92/royal_family.tsv') as f:
  context2.load_kg(files=f)

# now let's try using my new chaining-style method
queen_vic2 = context2.one('Victoria of house of Hanover', 'person_t')
printd(queen_vic2.child().eval())


INFO:tensorflow:0 facts loaded
[('Alfred Ernest Albert', 1.0), ('Alice Maud Mary', 1.0), ('Arthur William Patrick', 1.0), ('Beatrice Mary Victoria', 1.0), ('Edward_VII of house of Wettin', 1.0), ('Helena Augusta Victoria', 1.0), ('Leopold George Duncan', 1.0), ('Louise Caroline Alberta', 1.0), ('Victoria Adelaide Mary', 1.0)]


# Under the hood - NQL and TensorFlow

## Weighted sets and counting paths

To understand the non-uniform weights that occur when we look for all the male or female people, we need to drill down into what's actually happening as we evaluate code like the following.


```
x = context.all('person_t')
x.mother().son() & x
```


Let's take it step by step.  The *all* sets are uniformly weighted...

In [0]:
e = context.all('person_t').eval()
printd(e, 10)


[('(Daughter)', 1.0), ('(Frederick) Christian Charles', 1.0), ('(Sophia) Charlotte', 1.0), ('5sons_1dau', 1.0), ('Ada', 1.0), ('Adalberon of_Rheims', 1.0), ('Adalbert', 1.0), ('Adalbert #2', 1.0), ('Adam of_Rowallan of house of Mure', 1.0), ('Adela', 1.0)]


...but when we take the next step, we end up with differing weights:

In [0]:
e = context.all('person_t').mother().eval()
printd(e, 10)

[('(Sophia) Charlotte', 15.0), ('Ada', 3.0), ('Adela', 5.0), ('Adelaide Horatia Elizabeth of house of Seymour', 1.0), ('Adelaide Judith', 1.0), ('Adelaide Louisa Theresa', 4.0), ('Adelaide of_Savoy', 7.0), ('Adele of_Champagne', 1.0), ('Agatha #2', 3.0), ('Agnes', 1.0)]


The weights here correspond to the *number of children* each mother has, which we can check as follows (for a sample of the values)

In [0]:
mothers = context.all('person_t').mother().eval()
for name,weight in sorted(mothers.items())[0:10]:
    print('%s has %d kid(s):' % (name, weight))
    printd(child(context.one(name,'person_t')).eval())

(Sophia) Charlotte has 15 kid(s):
[('Adolphus of_Cambridge of house of Hanover', 1.0), ('Alfred of house of Hanover', 1.0), ('Amelia of house of Hanover', 1.0), ('Augusta Sophia of house of Hanover', 1.0), ('Augustus Frederick of house of Hanover', 1.0), ('Charlotte Augusta Matilda of house of Hanover', 1.0), ('Edward Augustus of house of Hanover', 1.0), ('Elizabeth of house of Hanover', 1.0), ('Ernest Augustus_I of house of Hanover', 1.0), ('Frederick of house of Hanover', 1.0), ('George_IV of house of Hanover', 1.0), ('Mary of house of Hanover', 1.0), ('Octavius of house of Hanover', 1.0), ('Sophia of house of Hanover', 1.0), ('William_IV Henry of house of Hanover', 1.0)]
Ada has 3 kid(s):
[('David of_Huntingdon', 1.0), ('Malcolm_IV the_Maiden', 1.0), ('Willaim_I the_Lion', 1.0)]
Adela has 5 kid(s):
[('Henry of_Winchester', 1.0), ('Matilda #5', 1.0), ('Stephen', 1.0), ('Theobald', 1.0), ('William #11', 1.0)]
Adelaide Horatia Elizabeth of house of Seymour has 1 kid(s):
[('Charles Robe

What's happening is that while the set of keys is the set of all mothers, the weight for any person *x* indicates **the number of ways** that *x* can be "demonstrated to be a mother" --- i.e., when we started with the set *S* and followed the relation *mother*, the weight for *x* became the number of elements in *S* from which *x* could be reached via the *mother* relation.


This behavior cascades - for example, if we evaluated *S.mother().mother()* the weight for each *x'* would be the number of people for whom *x'* is a maternal grandmother, and we will get more extreme weights.

In [0]:
mothers = context.all('person_t').mother()
print('moms:')
printd(mothers.eval(), 10)

grandmothers = mothers.mother()
print('grandmoms:')
printd(grandmothers.eval(), 10)

moms:
[('(Sophia) Charlotte', 15.0), ('Ada', 3.0), ('Adela', 5.0), ('Adelaide Horatia Elizabeth of house of Seymour', 1.0), ('Adelaide Judith', 1.0), ('Adelaide Louisa Theresa', 4.0), ('Adelaide of_Savoy', 7.0), ('Adele of_Champagne', 1.0), ('Agatha #2', 3.0), ('Agnes', 1.0)]
grandmoms:
[('Agatha #2', 5.0), ('Agnes', 1.0), ('Alexandra of house of Windsor', 1.0), ('Alexandra of_Denmark "Alix"', 3.0), ('Alexandra of_Greece', 1.0), ('Alice Maud Mary', 12.0), ('Alice de_Courtenay', 5.0), ('Alice of house of de_Toledo', 3.0), ('Alice of_Battenberg', 6.0), ('Amalie of_Wurttemberg', 4.0)]
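
The path-counting behavior is easy to reproduce on a toy example. The sketch below uses a made-up family (the names `c1`, `m1`, `g1`, etc. and the helper `follow_edges` are hypothetical) and plain dictionaries rather than NQL, under the assumption that following a relation sums the weights of all sources that reach each target.

```python
# Hypothetical (child, mother) edges: m1 has two kids, m2 has one,
# and g1 is the mother of both m1 and m2.
mother_edges = [
    ('c1', 'm1'), ('c2', 'm1'), ('c3', 'm2'),
    ('m1', 'g1'), ('m2', 'g1'),
]

def follow_edges(weighted_set, edges):
    """Sum, for each target, the weights of all sources that reach it."""
    result = {}
    for src, dst in edges:
        if src in weighted_set:
            result[dst] = result.get(dst, 0.0) + weighted_set[src]
    return result

everyone = {name: 1.0 for edge in mother_edges for name in edge}
mothers = follow_edges(everyone, mother_edges)
grandmothers = follow_edges(mothers, mother_edges)
print(sorted(mothers.items()))
print(sorted(grandmothers.items()))
```

Here `g1` gets weight 2.0 as a mother (two children) but 3.0 as a grandmother, because three distinct two-step paths end at her.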


In this colab, we implemented a special *printd* function that prints the first few entries of an evaluated NQL expression in sorted order, so you would know exactly what was happening.  However, getting the top-weighted elements of an NQL entity set is common enough that you can do it with a special option to the *eval* method.

In [0]:
m = mothers.eval(as_top=6)
print('moms:')
printd(m)

moms:
[('(Sophia) Charlotte', 15.0), ('Anne of house of Stuart', 12.0), ('Eleanor of_Castile', 15.0), ('Elizabeth of house of Woodville', 12.0), ('Philippa of_Hainault', 12.0), ('Sophia Dorothea of house of Hanover', 13.0)]
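
For intuition, selecting the top-weighted entries of an evaluated set can be sketched with the standard library. This is not how *eval(as_top=...)* is implemented; `top_k` is a hypothetical helper, and its tie-breaking rule (comparing names) is an assumption made only to keep the example deterministic.

```python
import heapq

def top_k(weighted_set, k):
    """Return the k highest-weight entries as a dict (ties resolved by name)."""
    best = heapq.nlargest(k, weighted_set.items(), key=lambda kv: (kv[1], kv[0]))
    return dict(best)

# A few of the mother weights printed earlier in this notebook.
moms = {'(Sophia) Charlotte': 15.0, 'Ada': 3.0, 'Adela': 5.0, 'Agnes': 1.0}
top2 = top_k(moms, 2)
print(sorted(top2.items()))
```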


While you can ignore the weights, the NQL implementors believe that this behavior is actually a feature and not a bug, as the weights are often useful information.  Here, for example, the weights reflect the number of descendants.

If you want to get rid of the non-uniform weights, one way is to convert the weighted NQL expression to TensorFlow (which can be done with the *.tf* property), clip the weights using the appropriate TensorFlow op, and then convert back to NQL, as follows.

In [0]:
weighted_men = context.all('person_t').mother().son()
weighted_men_tensor = weighted_men.tf
men_tensor = tf.clip_by_value(weighted_men_tensor, 0.0, 1.0)
men = context.as_nql(men_tensor,'person_t')

printd(men.eval(), 10)

[('Adalbert', 1.0), ('Adalbert #2', 1.0), ('Adolphe of_Luxembourg', 1.0), ('Adolphus', 1.0), ('Adolphus 2nd', 1.0), ('Adolphus Frederick_V', 1.0), ('Adolphus of_Cambridge of house of Hanover', 1.0), ('Agustus', 1.0), ('Albert', 1.0), ('Albert Augustus Charles', 1.0)]


If you're just jumping out to Tensorflow for a short time, it's sometimes more concise to use the *tf_op* method, which lets you use any Python function that transforms a tensor to one with the same shape.


In [0]:
men2 = weighted_men.tf_op(lambda t:tf.clip_by_value(t, 0.0, 1.0))
printd(men2.eval(), 10)

[('Adalbert', 1.0), ('Adalbert #2', 1.0), ('Adolphe of_Luxembourg', 1.0), ('Adolphus', 1.0), ('Adolphus 2nd', 1.0), ('Adolphus Frederick_V', 1.0), ('Adolphus of_Cambridge of house of Hanover', 1.0), ('Agustus', 1.0), ('Albert', 1.0), ('Albert Augustus Charles', 1.0)]
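
The effect of the clipping step can be checked by hand on a dictionary of weights. The sketch below is a pure-Python analogue of clipping to the range [0, 1]; `clip_weights` is a hypothetical helper, and the input weights are the first few values from the *male* output shown earlier.

```python
def clip_weights(weighted_set, low=0.0, high=1.0):
    """Clamp every weight into the interval [low, high]."""
    return {k: min(max(w, low), high) for k, w in weighted_set.items()}

weighted_men = {'Adalbert': 7.0, 'Adalbert #2': 8.0, 'Adolphus': 3.0}
men = clip_weights(weighted_men)
print(sorted(men.items()))
```

Because every path count is at least 1 for members of the set, clipping at 1.0 turns the weighted set back into an ordinary, uniformly weighted one.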


We discuss the possible Tensorflow - NQL conversion processes below.

## Following relations via sparse matrix multiplication

Under the hood, what's going on here is that each weighted set is represented internally as a vector - generally a sparse vector, with one component for each entity.  To make it a little more complicated, NQL is, like most neural systems, set up to operate on minibatches, so the internal representation of a single set is a *matrix*.  For these queries, the matrix has one row and *n* columns, where *n* is the number of *person_t* entities.  A relation is represented as a matrix, and following a relation is implemented as a vector-matrix multiplication.

Let's drill down into this a little more.


There are two equivalent ways to get at this 'raw' representation.  One is to pass the *as_dicts=False* option into the *eval* method for an NQL expression.

In [0]:
m = henry8.wife().eval(as_dicts=False)
print('m', m)
print('m.shape', m.shape)
print('nonzeros in m', np.nonzero(m))


m [[0. 0. 0. ... 0. 0. 0.]]
m.shape (1, 3007)
nonzeros in m (array([0, 0, 0, 0, 0, 0]), array([1039, 1064, 1069, 1071, 1074, 1079]))


Alternatively, we can get at the underlying TensorFlow expression and evaluate it:

In [0]:
m = henry8.wife().tf.numpy()
print('m', m)
print('m.shape', m.shape)
print('nonzeros in m', np.nonzero(m))

m [[0. 0. 0. ... 0. 0. 0.]]
m.shape (1, 3007)
nonzeros in m (array([0, 0, 0, 0, 0, 0]), array([1039, 1064, 1069, 1071, 1074, 1079]))


In this representation, the nonzero column indices are ids for the wives of Henry VIII.  We can decode these ids using the *context* object.  (The row indices are all zero, since the matrix has only one row.)

In [0]:
row_indices, col_indices = np.nonzero(henry8.wife().tf.numpy())
for person_id in col_indices.tolist():
  print('person with id %d is %s' % (person_id, context.get_entity_name(person_id, 'person_t')))
# double check these values are correct
print('target set is:')
printd(henry8.wife().eval().keys())

person with id 1039 is Catherine of_Aragon
person with id 1064 is Anne of house of Boleyn
person with id 1069 is Jane of house of Seymour
person with id 1071 is Anne of_Cleves
person with id 1074 is Catherine of house of Howard
person with id 1079 is Catherine of house of Parr
target set is:
dict_keys(['Catherine of_Aragon', 'Anne of house of Boleyn', 'Jane of house of Seymour', 'Anne of_Cleves', 'Catherine of house of Howard', 'Catherine of house of Parr'])


As an alternative, we could convert m back to an NQL expression to examine it.

In [0]:
printd(context.as_nql(m, 'person_t').eval())

[('Anne of house of Boleyn', 1.0), ('Anne of_Cleves', 1.0), ('Catherine of house of Howard', 1.0), ('Catherine of house of Parr', 1.0), ('Catherine of_Aragon', 1.0), ('Jane of house of Seymour', 1.0)]


Every relation is represented as a sparse matrix.  Below we'll look at the sparse matrix for the *wife* relation, and decode a few indices as a check.

In [0]:
wife_tensor = context.get_tf_tensor('wife')
print(wife_tensor)

SparseTensor(indices=tf.Tensor(
[[   0    1]
 [  12    4]
 [   2   24]
 ...
 [2995 2996]
 [3001 3002]
 [3005 3006]], shape=(1138, 2), dtype=int64), values=tf.Tensor([1. 1. 1. ... 1. 1. 1.], shape=(1138,), dtype=float32), dense_shape=tf.Tensor([3007 3007], shape=(2,), dtype=int64))


In [0]:
henry8_id = context.get_id('Henry_VIII of house of Tudor', 'person_t')
for row_index,col_index in wife_tensor.indices.numpy().tolist():
  if col_index==henry8_id:
      print('id: %d name: %s' % (row_index, context.get_entity_name(row_index, 'person_t')))

id: 1039 name: Catherine of_Aragon
id: 1064 name: Anne of house of Boleyn
id: 1069 name: Jane of house of Seymour
id: 1071 name: Anne of_Cleves
id: 1074 name: Catherine of house of Howard
id: 1079 name: Catherine of house of Parr


*Aside:* The storage is a little counterintuitive: instead of *wife_tensor* mapping from Henry to Anne Boleyn, for instance, by making the husbands the row indices and the wives the column indices, NQL stores indices in the opposite order.  This is because TensorFlow doesn't support multiplying a dense vector **v** by a sparse matrix **S**, only multiplication of **S** by **v**.  So instead of computing **v * S**, we must compute something more like

```
transpose( transpose(S) * transpose(v) )
```
Hence storing *transpose(S)* directly is somewhat more efficient.
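
The transpose identity in this aside can be verified on a tiny example with nested lists. The sketch below uses a made-up 3-entity relation (entity 0 has wives 1 and 2); `matmul` and `transpose` are small hypothetical helpers written in pure Python so the check does not depend on TensorFlow.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

# S[husband][wife]: the "intuitive" layout, with entity 0 married to 1 and 2.
S = [[0.0, 1.0, 1.0],
     [0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
v = [[1.0, 0.0, 0.0]]  # one-row minibatch holding a one-hot vector for entity 0

direct = matmul(v, S)                      # v * S
S_stored = transpose(S)                    # layout indexed [wife][husband]
via_transpose = transpose(matmul(S_stored, transpose(v)))
print(direct, via_transpose)
```

Both routes give the same row vector, so storing the transposed matrix once avoids transposing the relation on every query.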



So finally, if we want to simulate *henry8.wife()*, we can do it as follows.

In [0]:
v = henry8.eval(as_dicts=False)
S = wife_tensor
henrys_wives = tf.transpose(tf.sparse.sparse_dense_matmul(S, tf.transpose(v)))
print(np.nonzero(henrys_wives.numpy()))

(array([0, 0, 0, 0, 0, 0]), array([1039, 1064, 1069, 1071, 1074, 1079]))


If you look above, you can see that this is indeed the correct result.

Finally, set union **v | w** is implemented as vector sum, set intersection **v & w** is component-wise vector multiplication, and
the **v**.*if_any*(**w**) construct is implemented by multiplying the vector **v** by the sum of the values of the components of **w**.  The last one is worth a little discussion, so let's break down the example code above:


```
def opposite_sex_parent(x):
  return x.father().if_any(female(x)) | x.mother().if_any(male(x))
```

which, if we expand out the definitions of *female* and *male*, is actually


```
def opposite_sex_parent(x):
  return (x.father().if_any(x.mother().daughter() & x)
          | x.mother().if_any(x.mother().son() & x))

The steps in the computation given **v** might be

1.   Compute the set **sisters** of all daughters of the mother of **v** with two matrix multiplications: **sisters = v M D**, where **M** is the mother matrix and **D** is the daughter matrix.
2.   Compute the component-wise product **sisters * v**.  If **v** is male, this is the all-zeros vector, so *a = sum*(**sisters * v**) = 0.  Otherwise the product is equal to **v**, so *a = 1*.
3.   Likewise compute **brothers = v M S**, where **S** is the son matrix, and *b = sum*(**brothers * v**).
4.   Finally compute *a* (**v F**) + *b* (**v M**), where **F** is the father matrix.
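
The four steps above can be traced on a toy family of four entities. The sketch below is a pure-Python illustration with dense lists, following the definitions of *male* and *female* given earlier in the notebook (both go through the *mother* relation); the entity ids, `relation_matrix`, `vecmat`, and `one_hot` are hypothetical names, and matrices are indexed `[subject][object]` for readability rather than in NQL's stored layout.

```python
N = 4
MOM, DAD, SON, DAUGHTER = range(N)

def relation_matrix(edges):
    """Dense N x N matrix with a 1.0 at [subject][object] for each edge."""
    m = [[0.0] * N for _ in range(N)]
    for subj, obj in edges:
        m[subj][obj] = 1.0
    return m

def vecmat(v, m):
    """Row vector times matrix."""
    return [sum(v[i] * m[i][j] for i in range(N)) for j in range(N)]

def one_hot(i):
    return [1.0 if j == i else 0.0 for j in range(N)]

M = relation_matrix([(SON, MOM), (DAUGHTER, MOM)])      # mother
F = relation_matrix([(SON, DAD), (DAUGHTER, DAD)])      # father
D = relation_matrix([(MOM, DAUGHTER), (DAD, DAUGHTER)]) # daughter
S = relation_matrix([(MOM, SON), (DAD, SON)])           # son

def opposite_sex_parent(v):
    sisters = vecmat(vecmat(v, M), D)             # step 1
    a = sum(s * x for s, x in zip(sisters, v))    # step 2
    brothers = vecmat(vecmat(v, M), S)            # step 3
    b = sum(s * x for s, x in zip(brothers, v))
    v_f, v_m = vecmat(v, F), vecmat(v, M)         # step 4
    return [a * f + b * m for f, m in zip(v_f, v_m)]

print(opposite_sex_parent(one_hot(DAUGHTER)))
print(opposite_sex_parent(one_hot(SON)))
```

For the daughter, *a* is 1 and *b* is 0, so only the father term survives; for the son the roles are swapped and the result is the mother.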




Below is a summary of the implementations of all the NQL operators.  The only things not discussed above are 

1. The *none* construct, which returns an empty set;
2. The *x.jump_to_xxx* constructs, which let you construct singleton sets, empty sets, and universal sets from an expression, rather than using a context object.
3. The operation *x.filtered_by(relation, entity_name)* which is shorthand for


```
   x & x.jump_to_one(entity_name, type_name).relation(-1)
```
where *type_name* is the type declared for the second argument of *relation*.

4. The alternative operators for union and intersection, and the alternative names for *filtered_by* and *if_any*, which are sometimes useful when you want to emphasize the behavior on weighted sets.

> NQL                              | Matrix operations
> ---------------------------------|------------------
> `context.one('bob', 'person_t')` <br/> `x.jump_to('bob', 'person_t')` | $v_\text{bob}$, one-hot vector for entity 'bob'
> `context.all('person_t')` <br/> `x.jump_to_all('person_t')`| k-hot vector for the set of all elements of type 'person_t', i.e. a ones vector
> `context.none('person_t')` <br/> `x.jump_to_none('person_t')` | k-hot vector for the empty set of elements of type 'person_t', i.e. a zeros vector
> `x.r()` <br/> `x.follow('r')` | $x \cdot M_r$
> `x \| y` <br/> `x + y` | $x + y$
> `x & y` <br/> `x * y` | $x * y$ <br/> Hadamard a.k.a. component-wise product
> `x.filtered_by('r', 'bob')` <br/> `x.weighted_by('r', 'bob')` | $x * \left( v_\text{bob} \cdot M_r^T \right)$
> `x.if_any(y)` <br/> `x.weighted_by_sum(y)` | $x * \sum y$