# Module 7 - Programming Assignment

## Directions

1. Change the name of this file to be your JHED id as in `jsmith299.ipynb`. Be sure you use your JHED ID (it's made out of your name and not your student id which is just letters and numbers).
2. Make sure the notebook you submit is cleanly and fully executed. I do not grade unexecuted notebooks.
3. Submit your notebook back in Blackboard where you downloaded this file.

*Provide the output **exactly** as requested*

# Unification

This is actually Part I of a two part assignment. In a later module, you'll implement a Forward Planner. In order to do that, however, you need to have a unifier. It is important to note that you *only* need to implement a unifier. Although the module talked about resolution, you do not need to implement anything like "standardizing apart". From the unifier's point of view, that should already have been done.

Unification is simply the *syntactic* balancing of expressions. There are only 3 kinds of expressions: constants, lists and (logic) variables. Constants and lists are only equal to each other if they're exactly the same thing or can be made to be the same thing by *binding* a value to a variable.

It really is that simple...expressions must be literally the same (identical) except if one or the other (or both) has a variable in that "spot".
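To make "binding" concrete, here is a minimal sketch. It assumes expressions are nested Python lists of strings and that variables are strings starting with `?`. The helper name `substitute` is hypothetical and is not part of the starter code.

```python
def substitute(exp, bindings):
    """Return a copy of exp with every bound variable replaced by its value."""
    if isinstance(exp, list):
        # rebuild the list, substituting inside nested expressions as well
        return [substitute(item, bindings) for item in exp]
    # constants and unbound variables are returned unchanged
    return bindings.get(exp, exp)


left = ["loves", "Fred", "?x"]
right = ["loves", "?y", "Wilma"]
bindings = {"?x": "Wilma", "?y": "Fred"}

# after applying the bindings, both sides are literally the same expression
print(substitute(left, bindings) == substitute(right, bindings))  # True
```

The two expressions differ only in the spots that hold a variable, so a single set of bindings is enough to make them identical.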

## S-Expressions

With that out of the way, we need a language with which to express our constants, variables and predicates and that language will be based on s-expressions.

**constants** - There are two types of constants, values and predicates. Values should start with an uppercase letter. Fred is a constant value, so is Barney and Food. Predicates are named using lowercase letters. loves is a predicate and so is hates. This is only a convention. Secret: your code does not need to treat these two types of constants differently.

**variables** - these are named using lowercase letters but always start with a question mark. ?x is a variable and so is ?yum. This is not a convention.

**expressions (lists)** - these use the S-expression syntax a la LISP. (loves Fred Wilma) is an expression as is (friend-of Barney Fred) and (loves ?x ?y).

## Parsing

These functions are already included in the starter .py file.

In [13]:
import tokenize
from io import StringIO

This uses the above libraries to build a Lisp structure based on atoms. It is adapted from [simple iterator parser](http://effbot.org/zone/simple-iterator-parser.htm). The first function is the `atom` function.

In [14]:
def atom( next, token):
    if token[ 1] == '(':
        out = []
        token = next()
        while token[ 1] != ')':
            out.append( atom( next, token))
            token = next()
            if token[ 1] == ' ':
                token = next()
        return out
    elif token[ 1] == '?':
        token = next()
        return "?" + token[ 1]
    else:
        return token[ 1]

The next function is the actual `parse` function:

In [15]:
def parse(exp):
    src = StringIO(exp).readline
    tokens = tokenize.generate_tokens(src)
    return atom(tokens.__next__, tokens.__next__())

**Note** there was a change between 2.7 and 3.0 that "hid" the next() function in the tokenizer.

From a Python perspective, we want to turn something like "(loves Fred ?x)" into ["loves", "Fred", "?x"] and then work with the second representation as a list of strings. The strings then have the syntactic meaning we gave them previously.
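If the `tokenize`-based parser feels opaque, the same string-to-list mapping can be sketched with a regular expression. This is only an illustration of the target representation, not the starter parser. The name `read_sexp` is hypothetical, and the sketch assumes the input has balanced parentheses.

```python
import re


def read_sexp(text):
    """Turn an s-expression string into nested lists of strings."""
    # a token is either a single parenthesis or a run of non-space, non-paren characters
    tokens = re.findall(r"[()]|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # consume the closing parenthesis
            return items
        return token  # a constant or a variable such as "?x"

    return read()


print(read_sexp("(father_of Barney (son_of Barney))"))
# ['father_of', 'Barney', ['son_of', 'Barney']]
```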

In [16]:
parse("Fred")

'Fred'

In [17]:
parse( "?x")

'?x'

In [18]:
parse( "(loves Fred ?x)")

['loves', 'Fred', '?x']

In [19]:
parse( "(father_of Barney (son_of Barney))")

['father_of', 'Barney', ['son_of', 'Barney']]

In [20]:
parse("loves")

'loves'

## Unifier

Now that that's out of the way, here is the imperative pseudocode for unification. This is a classic recursive program with a number of base cases. Students for some reason don't like it, try the algorithm in the book, can't get it to work and then come back to this pseudocode.

Work through the algorithm by hand with your Self-Check examples if you need to but I'd suggest sticking with this implementation. It does work.

Here is imperative pseudocode for the algorithm:

```
def unification( exp1, exp2):
    # base cases
    if exp1 and exp2 are constants or the empty list:
        if exp1 = exp2 then return {}
        else return FAIL
    if exp1 is a variable:
        if exp1 occurs in exp2 then return FAIL
        else return {exp1/exp2}
    if exp2 is a variable:
        if exp2 occurs in exp1 then return FAIL
        else return {exp2/exp1}

    # inductive step
    first1 = first element of exp1
    first2 = first element of exp2
    result1 = unification( first1, first2)
    if result1 = FAIL then return FAIL
    apply result1 to rest of exp1 and exp2
    result2 = unification( rest of exp1, rest of exp2)
    if result2 = FAIL then return FAIL
    return composition of result1 and result2
```

`unification` can return...

1. `None` (if unification completely fails)
2. `{}` (the empty substitution list) or 
3. a substitution list that has variables as keys and substituted values as values, like {"?x": "Fred"}. 

Note that the middle case sometimes confuses people..."Sam" unifying with "Sam" is not a failure so you return {} because there were no variables so there were no substitutions. You do not need to further resolve variables. If a variable resolves to an expression that contains a variable, you don't need to do the substitution.
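One practical consequence of these three return values: in Python an empty dictionary is falsy, so a test like `if not result` cannot tell "matched with no bindings" from "failed". The sketch below shows the distinction. The helper name `describe` is hypothetical and only serves the illustration.

```python
def describe(result):
    """Classify a unification result without confusing {} with None."""
    if result is None:
        return "failure"
    if result == {}:
        return "success, no substitutions"
    return "success, with substitutions"


print(bool({}))                   # False, even though {} means success
print(describe(None))             # failure
print(describe({}))               # success, no substitutions
print(describe({"?x": "Fred"}))   # success, with substitutions
```

Comparing against `None` explicitly keeps the empty substitution list from being treated as a failure.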

If you think of a typical database table, there is a column, row and value. This tuple is a *relation* and in some uses of unification, the "thing" in the first spot (such as `loves` above) is called the relation. If you have a table of users with a user_id and a username, then the relation is:

`(login ?user_id ?username)`

*most* of the time, the relation name is specified. But it's not impossible for the relation name to be represented by a variable:

`(?relation 12345 "smooth_operator")`

Your code should handle this case (the pseudocode does handle this case so all you have to do is not futz with it).

Our type system is very simple. We can get by with just a few boolean functions. The first tests to see if an expression is a variable.

## Office Hours Notes
- Can go over 20 lines
- Not many unit tests needed, only for helpers like occurs
- When you get a substitution, apply it to the rest of the expression (the apply step)
- rest is one item shorter than the expression just looked at
- variable could be assigned to another variable
    - `x = y`
- can't have a variable in the first position of an expression???

In [21]:
def is_variable( exp):
    return isinstance( exp, str) and exp[ 0] == "?"

In [22]:
is_variable( "Fred")

False

In [23]:
is_variable( "?fred")

True

The second tests to see if an expression is a constant:

In [24]:
def is_constant( exp):
    return isinstance( exp, str) and not is_variable( exp)

In [25]:
is_constant( "Fred")

True

In [26]:
is_constant( "?fred")

False

In [27]:
is_constant( ["loves", "Fred", "?wife"])

False

It might also be useful to know that:

<code>
type( "a")
&lt;class 'str'>
type( "a") == str
True
type( "a") == list
False
type( ["a"]) == list
True
</code>


You need to write the `unification` function described above. It should work with two expressions of the type returned by `parse`. See `unify` for how it will be called. It should return the result of unification for the two expressions as detailed above and in the book. It does not have to make all the necessary substitutions (for example, if ?y is bound to ?x and 1 is bound to ?y, ?x doesn't have to be replaced everywhere with 1. It's enough to return {"?x":"?y", "?y":1}. For an actual application, you would need to fix this!)
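For reference only, here is a sketch of what "fixing this" could look like in an actual application: a post-processing pass that follows variable-to-variable links until it reaches a value. It is not required for the assignment. The names `walk`, `resolve` and `resolve_all` are hypothetical, and the sketch assumes variables are strings starting with `?`.

```python
def walk(term, bindings):
    """Follow variable links until reaching a non-variable or an unbound variable."""
    seen = set()  # guards against cycles such as {"?x": "?x"}
    while (isinstance(term, str) and term.startswith("?")
           and term in bindings and term not in seen):
        seen.add(term)
        term = bindings[term]
    return term


def resolve(term, bindings):
    """Fully substitute bindings into term, including nested expressions."""
    term = walk(term, bindings)
    if isinstance(term, list):
        return [resolve(item, bindings) for item in term]
    return term


def resolve_all(bindings):
    """Return a new substitution list in which every value is fully resolved."""
    return {var: resolve(value, bindings) for var, value in bindings.items()}


print(resolve_all({"?x": "?y", "?y": "Fred"}))
# {'?x': 'Fred', '?y': 'Fred'}
```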

-----

<a id="constant_or_empty"></a>
## constant_or_empty

- Checks whether both expressions are constants
- Also checks whether both expressions are lists and at least one of them is empty
- Returns True if either check holds, otherwise False

* **exp1** list or str: expression 1 which can be a list of expressions or a constant or a variable
* **exp2** list or str: expression 2 which can be a list of expressions or a constant or a variable


**returns** bool: True if both expressions are constants, or both are lists and at least one is empty; otherwise False

In [28]:
def constant_or_empty(exp1:list|str, exp2:list|str)->bool:
    con_check = is_constant(exp1) and is_constant(exp2)
    list_check = isinstance(exp1, list) and isinstance(exp2, list)
    
    empty_check = False
    if list_check:
        if len(exp1) == 0 or len(exp2) == 0:
            empty_check = True
    
    # base check case initial check
    # not implementing the return part
    if con_check or empty_check:
        return True
    
    return False


In [45]:
# verify empty lists return true
exp1_test = []
assert constant_or_empty(exp1_test, exp1_test) == True
assert constant_or_empty(["Fred"], exp1_test) == True

#verify two constants return true
exp1_test2 = "Fred"
exp2_test2 ="Bill"
assert constant_or_empty(exp1_test2, exp1_test2) == True # same constant
assert constant_or_empty(exp1_test2, exp2_test2) == True # diff constants

# verify mixed type returns false
exp1_test3 = "Fred"
exp2_test3 = "?w"
assert constant_or_empty(exp1_test3, exp2_test3) == False
# verify different size lists return False
exp1_test4 = ["Fred", "loves"]
exp2_test4 = ["Fred","loves", "?r"]
assert constant_or_empty(exp1_test4, exp1_test4) == False # same list
assert constant_or_empty(exp1_test4, exp2_test4) == False # diff list

<a id="occurs_in"></a>
## occurs_in

- Checks whether a variable occurs inside an expression list.
- If the expression is not a list (for example, it is the variable itself), False is returned

* **variable** str: a string for a variable
* **exp** list or str: an expression which can be a list of expressions or a constant or a variable


**returns** bool: True if the variable occurs in the expression list, otherwise False

In [30]:
def occurs_in(variable:str, exp:list|str)->bool:
    # only a list can contain the variable; a bare string never does,
    # so a variable compared with itself returns False
    if not isinstance(exp, list):
        return False
    # check each element, recursing into nested expressions
    return any(item == variable or occurs_in(variable, item) for item in exp)


In [46]:
var = "?Fred"

# verify returns true if variable in list
exp_test = [var, "Eric"]
assert occurs_in(var, exp_test) == True

#verify return false when value not in list
exp_test2 = ["Tom", "Eric"]
assert occurs_in(var, exp_test2) == False

# verify two variable cases 
# verify returns False if same variable
assert occurs_in(var, var) == False

# verify if two diff variables returns false
var2 = "?Brandi"
assert occurs_in(var, var2) == False


<a id="get_first"></a>
## get_first

- Returns the first item in an expression if the expression is a list
- Otherwise, the expression is returned. 

* **exp** list or str: expression 1 which can be a list of expressions or a constant or a variable

**returns** list or str: the first item in an expression (which may itself be a nested expression)

In [32]:
def get_first(exp:list|str)->str:
    if isinstance(exp, list):
        return exp[0]
    else:
        return exp # variable or constant

In [33]:
# list check
assert get_first(["hi","there"]) == "hi"
assert get_first([["Tom", "Eric"], "Zac"]) == ["Tom", "Eric"]

# variable check
assert get_first("?x") == "?x"

# constant check
assert get_first("Eric") == "Eric"

<a id="update_exp"></a>
## update_exp

- A recursive function to update all values in a list based on the mapping
- no value is returned but the expression (list) will be updated

* **list1** list: an expression that might have variables to be updated
* **mapping** dict: mapping to update variables with a new value



In [34]:
def update_exp(list1:list, mapping:dict):
    # walk by index so replacements are written back into the list in place
    for i, item in enumerate(list1):
        if isinstance(item, list):
            update_exp(item, mapping) # recurse into nested expressions
        elif is_variable(item) and item in mapping:
            list1[i] = mapping[item]

In [35]:
exp1 = ["loves","Eric","?x"]
mapping_test = {"?x":"Brandi"}

# check can update list of only constants and variables
update_exp(exp1, mapping_test)
assert exp1 == ['loves', 'Eric', 'Brandi']
exp5 = ["loves","Eric","?x"]
update_exp(exp5, {"?x":["engineer", "Brandi"]})
assert exp5 == ["loves","Eric",["engineer", "Brandi"]]
# check no change happens to list if variable doesnt match mapping
exp2 = ["loves","Eric","?y"]
update_exp(exp2, mapping_test)
assert exp2 == ['loves', 'Eric', '?y']
# check can handle nested updates
exp3 = ["love", ["son", "mike"], ["engineer", "?x"]]
update_exp(exp3, mapping_test)
assert exp3 == ['love', ['son', 'mike'], ['engineer', 'Brandi']]
exp4 = ["love", ["son", "mike"], ["engineer", ["sister", "?x"]]]
update_exp(exp4, mapping_test)
assert exp4 == ["love", ["son", "mike"], ["engineer", ["sister", "Brandi"]]]

<a id="get_rest"></a>
## get_rest

- If an expression is a list, updates the expression by the mapping
- then it returns the list with the first value removed
- If the expression is a variable or constant, an empty expression is returned
- An error in the function will return None

* **exp** list or str: expression 1 which can be a list of expressions or a constant or a variable
* **mapping** dict: mapping to update variables with a new value

**returns** list: returns an expression with updated mapping and first item removed


In [36]:
def get_rest(exp:list|str, mapping:dict)->list:
    if is_variable(exp):
        if exp in mapping:
            return []
        else:
            return None# something in previous step went wrong
    elif isinstance(exp, list):
        update_exp(exp, mapping) # apply mapping to expression
        return exp[1:] # remove first index
    else: #is_constant(exp)
        return []


In [37]:
exp1_test = ['love', 'Eric', '?x']
mapping_test = {'?x':'Brandi'}

# verify exp1 is returned with the variable updated and one less in size
new_exp1 = get_rest(exp1_test, mapping_test)
assert len(new_exp1) == 2
assert new_exp1 == ['Eric','Brandi']

# verify variable is turned into an empty list
new_exp2 = get_rest('?x', mapping_test)
assert len(new_exp2) == 0
new_exp3 = get_rest(['?x'], mapping_test)
assert len(new_exp3) == 0

# verify constant is turned into an empty list
new_exp2 = get_rest('Eric', mapping_test)
assert len(new_exp2) == 0
new_exp3 = get_rest(['Eric'], mapping_test)
assert len(new_exp3) == 0

<a id="get_final_mapping"></a>
## get_final_mapping

- Combines the results, the mappings, into one mapping
- has error checking. Will return None if results have variable mapping to two different values

* **result1** dict: mapping from first unification pass
* **result2** dict: mapping from second unification pass


**returns** dict: dictionary combined of the two results

In [38]:
def get_final_mapping(result1:dict, result2:dict)->dict:
    for k in result1:
        if k in result2:
            if result1[k] != result2[k]:
                print("Something went wrong. Multiple assignments")
                return None
    final_result = {}
    final_result.update(result1)
    final_result.update(result2)
    return final_result

In [39]:
result1_test = {"?x":"Brandi", "?y":"Eric"}
result2_test = {"?v":"loves ?x, ?y"}
result3_test = {"?x":"Tom"}

# assert a new dictionary of size 3 is returned
rall = get_final_mapping(result1_test, result2_test)
assert len(rall) == 3

# verify an empty result set doesn't cause failure
rall2 = get_final_mapping(result1_test, {})
assert len(rall2) == len(result1_test)
rall3 = get_final_mapping({}, {})
assert rall3 == {}

# verify if contradicting mapping cause a failure
rall4 = get_final_mapping(result1_test, result3_test)
assert rall4 == None


Something went wrong. Multiple assignments


<a id="unification"></a>
## unification

- Performs unification, an algorithm that uses recursion to decide whether two expressions unify or not
- Returns a dictionary if unification is successful else None if a failure

* **list_expression1** list or str: expression 1 which can be a list of expressions or a constant or a variable
* **list_expression2** list or str: expression 2 which can be a list of expressions or a constant or a variable


**returns** dict: mapping to make unification true

In [40]:
def unification( list_expression1:list|str, list_expression2:list|str)->dict:# from parse
    # base cases
    if constant_or_empty(list_expression1, list_expression2):
        if list_expression1 == list_expression2:
            return {}
        else:
            return None
    if is_variable(list_expression1):
        if occurs_in(list_expression1, list_expression2):
            return None
        else:
            return {list_expression1:list_expression2}
    if is_variable(list_expression2):
        if occurs_in(list_expression2, list_expression1):
            return None
        else:
            return {list_expression2:list_expression1}
    # inductive step
    first1 = get_first(list_expression1)
    first2 = get_first(list_expression2)
    
    result1 = unification(first1, first2)
    if result1 is None:
        return None
    # apply and get rest
    result2 = unification(get_rest(list_expression1, result1), get_rest(list_expression2, result1))
    if result2 is None:
        return None
    return get_final_mapping(result1, result2)

In [41]:
def list_check(parsed_expression):
    if isinstance(parsed_expression, list):
        return parsed_expression
    return [parsed_expression]

The `unification` pseudocode only takes lists so we have to make sure that we only pass a list.
However, this has the side effect of making "foo" unify with ["foo"], at the start.
That's ok.

<a id="unify"></a>
## unify

- First parses both strings, then calls unification function
- Returns mapping if unification is successful or None if a failure

* **s_expression1** str: expression 1 
* **s_expression2** str: expression 2 


**returns** dict: mapping to make unification successful

In [42]:
def unify( s_expression1:str, s_expression2:str)->dict:
    list_expression1 = parse(s_expression1)
    list_expression2 = parse(s_expression2)
    return unification( list_expression1, list_expression2)

**Note** If you see the error,

```
tokenize.TokenError: ('EOF in multi-line statement', (2, 0))
```
You most likely have unbalanced parentheses in your s-expression.
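If you would rather catch that problem before it reaches the tokenizer, a quick pre-check is easy to sketch. The helper name `parens_balanced` is hypothetical and is not part of the starter code.

```python
def parens_balanced(s_expression):
    """Return True if every '(' has a matching ')' in the right order."""
    depth = 0
    for ch in s_expression:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # a ')' appeared before its matching '('
                return False
    return depth == 0


print(parens_balanced("(son Barney (son ?y))"))  # True
print(parens_balanced("(son Barney (son ?y)"))   # False
```

Calling a check like this on your test strings first gives a clearer message than the `TokenError` above.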

## Test Cases

Use the expressions from the Self Check as your test cases...

In [43]:
self_check_test_cases = [
    ['(son Barney Barney)', '(daughter Wilma Pebbles)', None]
    ,['Fred', 'Barney', None]
    ,['Pebbles', 'Pebbles', {}]
    ,['(quarry_worker Fred)', '(quarry_worker ?x)', {'?x':'Fred'}]
    ,['(son Barney ?x)','(son ?y Bam_Bam)',{'?x':"Bam_Bam", '?y':"Barney"}]
    ,['(married ?x ?y)','(married Barney Wilma)',{"?x":"Barney", "?y":"Wilma"}]
    ,['(son Barney ?x)','(son ?y (son Barney))',{"?x":["son","Barney"],"?y":"Barney"}] 
    # unsure if expression should be returned like parse or not
    ,['(son Barney ?x)','(son ?y (son ?y))', {"?y":"Barney", "?x":["son","Barney"]}]
    ,['(son Barney Bam_Bam)','(son ?y (son Barney))', None]
    ,['(loves Fred Fred)','(loves ?x ?x)',{"?x":"Fred"}]
    ,['(future George Fred)','(future ?y ?y)', None]
    ]


for case in self_check_test_cases:
    exp1, exp2, expected = case
    actual = unify(exp1, exp2)
    print(f"actual = {actual}")
    print(f"expected = {expected}")
    print("\n")
    assert expected == actual

actual = None
expected = None


actual = None
expected = None


actual = {}
expected = {}


actual = {'?x': 'Fred'}
expected = {'?x': 'Fred'}


actual = {'?y': 'Barney', '?x': 'Bam_Bam'}
expected = {'?x': 'Bam_Bam', '?y': 'Barney'}


actual = {'?x': 'Barney', '?y': 'Wilma'}
expected = {'?x': 'Barney', '?y': 'Wilma'}


actual = {'?y': 'Barney', '?x': ['son', 'Barney']}
expected = {'?x': ['son', 'Barney'], '?y': 'Barney'}


actual = {'?y': 'Barney', '?x': ['son', 'Barney']}
expected = {'?y': 'Barney', '?x': ['son', 'Barney']}


actual = None
expected = None


actual = {'?x': 'Fred'}
expected = {'?x': 'Fred'}


actual = None
expected = None




Now add at least **five (5)** additional test cases of your own making, explaining exactly what you are testing. They should not be testing the same things as the self check test cases above.

In [47]:
new_test_cases = [
    ['(son Barney Barney)', '(daughter Wilma Pebbles)', None, "non-equal constants"]
    ,['(son ?x Joe)', '(cousin ?x Joe)', None, "non-equal predicates"]
    ,['(love ?x ?y)', '(love Eric Brandi ?w)', None, "non-equal lengthed expresions"]
   ,['(?x Eric Brandi)', '(love Eric Brandi)', {"?x":"love"}, "a variable in the predicate spot"] 
   ,['(love (son (engineer Tom)) ?x)', '(love ?y Brandi)', {"?x":"Brandi", "?y":["son", ["engineer", "Tom"]]}, "a variable maps to nested expression"] 
   ,['(love ?y (son (engineer ?y)))', '(love Brandi ?x)', {"?y":"Brandi", "?x":["son", ["engineer", "Brandi"]]}, "a variable inside a double nested expression is updated"] 
   ,['((love Eric Brandi) ?x (child Perrie Ivy))', '(?y (works Tom Zak) ?w)'
    , {"?y":["love", "Eric", "Brandi"],"?x":["works", "Tom", "Zak"],"?w":["child","Perrie", "Ivy"]}, "an expression of three expressions"] 
    ,['(?y ?x)', '(?y ?x)', {"?y":"?y", "?x":"?x"}, "All variables equal to each other"]
]
for case in new_test_cases:
    exp1, exp2, expected, message = case
    actual = unify(exp1, exp2)
    print(f"Testing {message}...")
    print(f"actual = {actual}")
    print(f"expected = {expected}")
    print("\n")
    assert expected == actual

Testing non-equal constants...
actual = None
expected = None


Testing non-equal predicates...
actual = None
expected = None


Testing non-equal lengthed expresions...
actual = None
expected = None


Testing a variable in the predicate spot...
actual = {'?x': 'love'}
expected = {'?x': 'love'}


Testing a variable maps to nested expression...
actual = {'?y': ['son', ['engineer', 'Tom']], '?x': 'Brandi'}
expected = {'?x': 'Brandi', '?y': ['son', ['engineer', 'Tom']]}


Testing a variable inside a double nested expression is updated...
actual = {'?y': 'Brandi', '?x': ['son', ['engineer', 'Brandi']]}
expected = {'?y': 'Brandi', '?x': ['son', ['engineer', 'Brandi']]}


Testing an expression of three expressions...
actual = {'?y': ['love', 'Eric', 'Brandi'], '?x': ['works', 'Tom', 'Zak'], '?w': ['child', 'Perrie', 'Ivy']}
expected = {'?y': ['love', 'Eric', 'Brandi'], '?x': ['works', 'Tom', 'Zak'], '?w': ['child', 'Perrie', 'Ivy']}


Testing All variables equal to each other...
actual = {'?y'

## Before You Submit...

1. Did you provide output exactly as requested?
2. Did you re-execute the entire notebook? ("Restart Kernel and Run All Cells...")
3. If you did not complete the assignment or had difficulty please explain what gave you the most difficulty in the Markdown cell below.
4. Did you change the name of the file to `jhed_id.ipynb`?

Do not submit any other files.