# Spanner Workbench Introduction
In this tutorial you will learn the basics of Spanner Workbench:
* how to import and use RGXlog
* variable assignment
* reading from a file
* declaring a relation
* adding facts
* adding rules
* queries
* using RGXlog's primitive information extractor: functional regex formulas
* using custom information extractors

# Using RGXlog
In order to use RGXlog, you must first import it:

In [26]:
import rgxlog

Now, whenever you want a cell to use RGXlog, simply type '%%spanner' at the beginning
of that cell. For example:

In [27]:
%%spanner
parent("bob", "greg")

fact
  relation
    parent
    "bob"
    "greg"

Tree(fact, [Tree(relation, [Token(NAME, 'parent'), Token(STRING, '"bob"'), Token(STRING, '"greg"')])])


# Variable assignment
RGXlog allows you to use two types of variables: strings and spans.
The assignment of a string is intuitive:

In [28]:
%%spanner
b = "bob"
b2 = b # b2's value is "bob"

start
  assign_normal_string
    b
    "bob"
  assign_var
    b2
    b

Tree(start, [Tree(assign_normal_string, [Token(NAME, 'b'), Token(STRING, '"bob"')]), Tree(assign_var, [Token(NAME, 'b2'), Token(NAME, 'b')])])


A span identifies a substring of a string by specifying its bounding indices. It is constructed from two integers.
You can assign a span value like this:

In [29]:
%%spanner
span1 = [3,7)
span2 = span1 # span2's value is [3,7)

start
  assign_span
    span1
    span
      3
      7
  assign_var
    span2
    span1

Tree(start, [Tree(assign_span, [Token(NAME, 'span1'), Tree(span, [3, 7])]), Tree(assign_var, [Token(NAME, 'span2'), Token(NAME, 'span1')])])
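
To make the span notation concrete, here is a minimal Python sketch (not RGXlog code) that models a span as a half-open interval `[start, end)` and shows which substring it selects. The `Span` class and its `extract` method are hypothetical names introduced only for illustration, and the half-open reading is an assumption inferred from the `[3,7)` notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical model of a span: a half-open interval [start, end)."""
    start: int
    end: int

    def extract(self, text: str) -> str:
        # Python slices are also half-open, so the mapping is direct.
        return text[self.start:self.end]


span1 = Span(3, 7)
span2 = span1  # span2 now holds the same value as span1, [3,7)

print(span2.extract("0123456789"))  # prints "3456"
```

Under this reading, a span of `[3,7)` covers the characters at indices 3, 4, 5 and 6, so its length is `end - start`.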


# Reading from a file
You can also perform a string assignment by reading from a file. You will need to provide a path to a file by either using a string literal or a string variable:

In [30]:
%%spanner
a = read("path/to/file")
b = "path/to/file" 
c = read(b) # c holds the same string value as a

start
  assign_string_from_file_string_param
    a
    "path/to/file"
  assign_normal_string
    b
    "path/to/file"
  assign_string_from_file_var_param
    c
    b

Tree(start, [Tree(assign_string_from_file_string_param, [Token(NAME, 'a'), Token(STRING, '"path/to/file"')]), Tree(assign_normal_string, [Token(NAME, 'b'), Token(STRING, '"path/to/file"')]), Tree(assign_string_from_file_var_param, [Token(NAME, 'c'), Token(NAME, 'b')])])
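
As a rough analogy in plain Python, the sketch below mimics the two ways of passing a path. The `read` function here is a hypothetical stand-in, not RGXlog's implementation, and the UTF-8 encoding and the temporary file name are assumptions made so the example can run anywhere.

```python
import tempfile
from pathlib import Path


def read(path: str) -> str:
    """Hypothetical stand-in for RGXlog's read: return the file's text."""
    return Path(path).read_text(encoding="utf-8")


# Create a throwaway file so the example does not depend on a real path.
with tempfile.TemporaryDirectory() as tmp:
    file_path = str(Path(tmp) / "example.txt")
    Path(file_path).write_text("Life finds a way", encoding="utf-8")

    a = read(file_path)  # path passed directly
    b = file_path        # path stored in a variable first
    c = read(b)          # same file, so the same string as a

print(a == c)  # prints "True"
```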


# Declaring a relation
RGXlog allows you to define and query relations.
You have to declare a relation before you can use it (TBD if rule heads should be declared). Each term in a relation can be either a string or a span. Here are some examples of relation declarations:

In [34]:
%%spanner
# 'brothers' is a relation with two string terms.
new brothers(str, str)
# 'angry' is a relation with one string term.
new angry(str)
# 'noun' is a relation with one string term, and one span term 
new noun(str, spn)

start
  relation_declaration
    brothers
    decl_string
    decl_string
  relation_declaration
    angry
    decl_string
  relation_declaration
    noun
    decl_string
    decl_span

Tree(start, [Tree(relation_declaration, [Token(NAME, 'brothers'), Tree(decl_string, []), Tree(decl_string, [])]), Tree(relation_declaration, [Token(NAME, 'angry'), Tree(decl_string, [])]), Tree(relation_declaration, [Token(NAME, 'noun'), Tree(decl_string, []), Tree(decl_span, [])])])


# Adding facts
RGXlog is an extension of Datalog, a declarative logic programming language. In Datalog you can declare "facts", essentially adding tuples to a relation. Here's how to do it in RGXlog:

In [59]:
%%spanner
# first declare the relation that you want to use
new noun(str, spn)
# now you can add facts (tuples) to that relation
noun("Life finds a way", [0,4))
# another example
new sisters(str, str)
sisters("alice", "rin")                          

start
  relation_declaration
    noun
    decl_string
    decl_span
  fact
    relation
      noun
      "Life finds a way"
      span
        0
        4
  relation_declaration
    sisters
    decl_string
    decl_string
  fact
    relation
      sisters
      "alice"
      "rin"

Tree(start, [Tree(relation_declaration, [Token(NAME, 'noun'), Tree(decl_string, []), Tree(decl_span, [])]), Tree(fact, [Tree(relation, [Token(NAME, 'noun'), Token(STRING, '"Life finds a way"'), Tree(span, [0, 4])])]), Tree(relation_declaration, [Token(NAME, 'sisters'), Tree(decl_string, []), Tree(decl_string, [])]), Tree(fact, [Tree(relation, [Token(NAME, 'sisters'), Token(STRING, '"alice"'), Token(STRING, '"rin"')])])])
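
One way to picture declarations and facts is as a schema plus a set of tuples. The Python sketch below is only a conceptual model: the `declare` and `add_fact` helpers are hypothetical, spans are modelled as plain `(start, end)` pairs, and the arity and type checks are an assumption about what "declare before use" implies, not a description of RGXlog's internals.

```python
from typing import Dict, Set, Tuple

# Hypothetical in-memory model: one schema and one set of tuples per relation.
schemas: Dict[str, Tuple[type, ...]] = {}
facts: Dict[str, Set[tuple]] = {}


def declare(name: str, *term_types: type) -> None:
    """Register a relation and the type of each of its terms."""
    schemas[name] = term_types
    facts[name] = set()


def add_fact(name: str, *terms) -> None:
    """Add a tuple to a declared relation after checking arity and types."""
    if name not in schemas:
        raise KeyError(f"relation {name!r} was not declared")
    expected = schemas[name]
    if len(terms) != len(expected) or not all(
        isinstance(term, ty) for term, ty in zip(terms, expected)
    ):
        raise TypeError(f"{terms!r} does not match the schema of {name!r}")
    facts[name].add(terms)


declare("noun", str, tuple)  # the span term is modelled as an (int, int) pair
add_fact("noun", "Life finds a way", (0, 4))
declare("sisters", str, str)
add_fact("sisters", "alice", "rin")

print(sorted(facts["sisters"]))  # prints "[('alice', 'rin')]"
```

Because each relation is a set, adding the same fact twice leaves the relation unchanged.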


# Adding rules
Datalog allows you to deduce new tuples for a relation.
RGXlog includes this feature as well:

In [61]:
%%spanner
new parent(str ,str)
parent("bob", "greg")
parent("greg", "alice")
# now add a rule that deduces that bob is a grandparent of alice
grandparent(x,z) <- parent(x,y), parent(y,z) # ',' is similar to the 'and' operator

start
  relation_declaration
    parent
    decl_string
    decl_string
  fact
    relation
      parent
      "bob"
      "greg"
  fact
    relation
      parent
      "greg"
      "alice"
  rule
    rule_head
      grandparent
      name_list
        x
        z
    rule_body_normal_relation
      relation
        parent
        x
        y
    rule_body_normal_relation
      relation
        parent
        y
        z

Tree(start, [Tree(relation_declaration, [Token(NAME, 'parent'), Tree(decl_string, []), Tree(decl_string, [])]), Tree(fact, [Tree(relation, [Token(NAME, 'parent'), Token(STRING, '"bob"'), Token(STRING, '"greg"')])]), Tree(fact, [Tree(relation, [Token(NAME, 'parent'), Token(STRING, '"greg"'), Token(STRING, '"alice"')])]), Tree(rule, [Tree(rule_head, [Token(NAME, 'grandparent'), Tree(name_list, [Token(NAME, 'x'), Token(NAME, 'z')])]), Tree(rule_body_normal_relation, [Tree(relation, [Token(NAME, 'parent'), Token(NAME, 'x'), Token(NAME, 'y')])]), Tree(rule_body_normal_relati
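
To see what the grandparent rule is meant to deduce, it can help to evaluate it by hand. The Python sketch below is a conceptual illustration rather than RGXlog's evaluation strategy: it treats the rule body as a join of the `parent` relation with itself on the shared variable `y`.

```python
# Facts of the 'parent' relation, stored as (parent, child) pairs.
parent = {("bob", "greg"), ("greg", "alice")}

# grandparent(x, z) <- parent(x, y), parent(y, z)
# The shared variable y forces both body atoms to agree on its value (a join).
grandparent = {
    (x, z)
    for (x, y1) in parent
    for (y2, z) in parent
    if y1 == y2
}

print(grandparent)  # prints "{('bob', 'alice')}"
```

The only pair of facts that agrees on `y` is `("bob", "greg")` with `("greg", "alice")`, which yields the single deduced tuple.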

# Queries
Querying is very simple in RGXlog. You can query by using string literals, span literals and capture variables:

In [71]:
%%spanner
# first create a relation with some facts for the example
new grandfather(str, str)
# bob and george are the grandfathers of alice and rin
grandfather("bob", "alice")
grandfather("bob", "rin")
grandfather("george", "alice")
grandfather("george", "rin")
# edward is the grandfather of john
grandfather("edward", "john")
# now for the queries
?grandfather("bob", "alice") # returns ("bob", "alice") as alice is bob's grandchild
?grandfather("edward", "alice") # returns nothing as alice is not edward's grandchild
?grandfather("george", x) # returns [("george", "alice"), ("george", "rin")] as both rin
# and alice are george's grandchildren
?grandfather(x, "rin") # returns [("bob", "rin"), ("george", "rin")] (rin's grandfathers)

start
  relation_declaration
    grandfather
    decl_string
    decl_string
  fact
    relation
      grandfather
      "bob"
      "alice"
  fact
    relation
      grandfather
      "bob"
      "rin"
  fact
    relation
      grandfather
      "george"
      "alice"
  fact
    relation
      grandfather
      "george"
      "rin"
  fact
    relation
      grandfather
      "edward"
      "john"
  query
    relation
      grandfather
      "bob"
      "alice"
  query
    relation
      grandfather
      "edward"
      "alice"
  query
    relation
      grandfather
      "george"
      x
  query
    relation
      grandfather
      x
      "rin"

Tree(start, [Tree(relation_declaration, [Token(NAME, 'grandfather'), Tree(decl_string, []), Tree(decl_string, [])]), Tree(fact, [Tree(relation, [Token(NAME, 'grandfather'), Token(STRING, '"bob"'), Token(STRING, '"alice"')])]), Tree(fact, [Tree(relation, [Token(NAME, 'grandfather'), Token(STRING, '"bob"'), Token(STRING, '"rin"')])]), Tree(fact
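
The matching behaviour described in the query comments can be sketched in Python as follows. This is a conceptual model under stated assumptions: the `Var` class and the `query` function are hypothetical names, results are returned as full tuples (mirroring the comments in the cell), and they are sorted only to keep the output stable.

```python
from typing import List, Set, Tuple


class Var:
    """Hypothetical marker for a free variable in a query, such as x."""

    def __init__(self, name: str) -> None:
        self.name = name


def query(relation: Set[Tuple[str, ...]], *pattern) -> List[Tuple[str, ...]]:
    """Return the tuples of the relation that match the pattern.

    Constants must match exactly. A Var matches any value, but the same
    Var name must bind to the same value everywhere in the pattern.
    """
    results = []
    for row in relation:
        if len(row) != len(pattern):
            continue
        bindings = {}
        for term, value in zip(pattern, row):
            if isinstance(term, Var):
                if bindings.setdefault(term.name, value) != value:
                    break  # variable already bound to a different value
            elif term != value:
                break  # constant mismatch
        else:
            results.append(row)  # no break: every term matched
    return sorted(results)


grandfather = {
    ("bob", "alice"), ("bob", "rin"),
    ("george", "alice"), ("george", "rin"),
    ("edward", "john"),
}
x = Var("x")

print(query(grandfather, "bob", "alice"))     # prints "[('bob', 'alice')]"
print(query(grandfather, "edward", "alice"))  # prints "[]"
print(query(grandfather, "george", x))  # prints "[('george', 'alice'), ('george', 'rin')]"
print(query(grandfather, x, "rin"))     # prints "[('bob', 'rin'), ('george', 'rin')]"
```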

# Functional regex formulas
RGXlog supports information extraction using regular expressions.  