# Module 7 - Programming Assignment

## Directions

1. Change the name of this file to be your JHED ID, as in `jsmith299.ipynb`. Be sure to use your JHED ID (it is derived from your name), not your student ID (which is just letters and numbers).
2. Make sure the notebook you submit is cleanly and fully executed. I do not grade unexecuted notebooks.
3. Submit your notebook back in Blackboard where you downloaded this file.

*Provide the output **exactly** as requested*

# Unification

This is actually Part I of a two-part assignment. In a later module, you'll implement a Forward Planner. In order to do that, however, you need to have a unifier. It is important to note that you *only* need to implement a unifier. Although the module talked about resolution, you do not need to implement anything like "standardizing apart". From the unifier's point of view, that should already have been done.

Unification is simply the *syntactic* balancing of expressions. There are only 3 kinds of expressions: constants, lists and (logic) variables. Constants and lists are only equal to each other if they're exactly the same thing or can be made to be the same thing by *binding* a value to a variable.

It really is that simple...expressions must be literally the same (identical) except if one or the other (or both) has a variable in that "spot".

## S-Expressions

With that out of the way, we need a language with which to express our constants, variables and predicates and that language will be based on s-expressions.

**constants** - There are two types of constants, values and predicates. Values should start with an uppercase letter. Fred is a constant value, so is Barney and Food. Predicates are named using lowercase letters. loves is a predicate and so is hates. This is only a convention. Secret: your code does not need to treat these two types of constants differently.

**variables** - these are named using lowercase letters but always start with a question mark. ?x is a variable and so is ?yum. This is not a convention.

**expressions (lists)** - these use the S-expression syntax a la LISP. (loves Fred Wilma) is an expression as is (friend-of Barney Fred) and (loves ?x ?y).

## Parsing

These functions are already included in the starter .py file.

In [1]:
import tokenize
from io import StringIO

This uses the above libraries to build a Lisp structure based on atoms. It is adapted from [simple iterator parser](http://effbot.org/zone/simple-iterator-parser.htm). The first function is the `atom` function.

In [2]:
def atom( next, token):
    if token[ 1] == '(':
        out = []
        token = next()
        while token[ 1] != ')':
            out.append( atom( next, token))
            token = next()
            if token[ 1] == ' ':
                token = next()
        return out
    elif token[ 1] == '?':
        token = next()
        return "?" + token[ 1]
    else:
        return token[ 1]

The next function is the actual `parse` function:

In [3]:
def parse(exp):
    src = StringIO(exp).readline
    tokens = tokenize.generate_tokens(src)
    return atom(tokens.__next__, tokens.__next__())

**Note** there was a change between 2.7 and 3.0 that "hid" the next() function in the tokenizer.

From a Python perspective, we want to turn something like "(loves Fred ?x)" into ["loves", "Fred", "?x"] and then work with the second representation as a list of strings. The strings then have the syntactic meaning we gave them previously.

In [4]:
parse("Fred")

'Fred'

In [5]:
parse( "?x")

'?x'

In [6]:
parse( "(loves Fred ?x)")

['loves', 'Fred', '?x']

In [7]:
parse( "(father_of Barney (son_of Barney))")

['father_of', 'Barney', ['son_of', 'Barney']]

## Unifier

Now that that's out of the way, here is the imperative pseudocode for unification. This is a classic recursive program with a number of base cases. Students for some reason don't like it, try the algorithm in the book, can't get it to work and then come back to this pseudocode.

Work through the algorithm by hand with your Self-Check examples if you need to but I'd suggest sticking with this implementation. It does work.

Here is imperative pseudocode for the algorithm:

```
def unification( exp1, exp2):
    # base cases
    if exp1 and exp2 are constants or the empty list:
        if exp1 = exp2 then return {}
        else return FAIL
    if exp1 is a variable:
        if exp1 occurs in exp2 then return FAIL
        else return {exp1/exp2}
    if exp2 is a variable:
        if exp2 occurs in exp1 then return FAIL
        else return {exp2/exp1}

    # inductive step
    first1 = first element of exp1
    first2 = first element of exp2
    result1 = unification( first1, first2)
    if result1 = FAIL then return FAIL
    apply result1 to rest of exp1 and exp2
    result2 = unification( rest of exp1, rest of exp2)
    if result2 = FAIL then return FAIL
    return composition of result1 and result2
```

`unification` can return...

1. `None` (if unification completely fails)
2. `{}` (the empty substitution list) or 
3. a substitution list that has variables as keys and substituted values as values, like {"?x": "Fred"}. 

Note that the middle case sometimes confuses people: "Sam" unifying with "Sam" is not a failure, so you return {} because there were no variables and therefore no substitutions. You do not need to further resolve variables. If a variable resolves to an expression that contains a variable, you don't need to do the substitution.
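
Because `{}` is falsy in Python, a caller that tests the result with `if not result` would treat a successful match with no bindings as a failure. The following is a minimal sketch of a safer check; `describe_result` is a hypothetical helper used only for illustration and is not part of the starter code.

```python
def describe_result(result):
    """Classify the three kinds of value `unification` can return."""
    if result is None:
        # Failure must be tested with `is None`, never with truthiness.
        return "fail"
    if len(result) == 0:
        # Success, but nothing needed binding.
        return "success, no substitutions"
    return "success with substitutions"


print(describe_result(None))             # fail
print(describe_result({}))               # success, no substitutions
print(describe_result({"?x": "Fred"}))   # success with substitutions
```

The same distinction applies inside the recursion: a check such as `if result1 is None` keeps an empty substitution list from being mistaken for failure.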

If you think of a typical database table, there is a column, a row, and a value. This tuple is a *relation*, and in some uses of unification the "thing" in the first spot (`loves` above) is called the relation. If you have a table of users with a user_id and a username, then the relation is:

`(login ?user_id ?username)`

*Most* of the time, the relation name is specified. But it's not impossible for the relation name to be represented by a variable:

`(?relation 12345 "smooth_operator")`

Your code should handle this case (the pseudocode does handle this case so all you have to do is not futz with it).
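
As an illustration of why no special handling is needed, here is a minimal sketch of one-way matching of a flat pattern against a variable-free fact. It is deliberately simpler than full unification (no nesting, variables on one side only), and `match_flat` is a hypothetical name, not part of the starter code. The relation position is treated like any other position.

```python
def match_flat(pattern, fact):
    """Match a flat list that may contain ?variables against a flat list of constants."""
    if len(pattern) != len(fact):
        return None
    bindings = {}
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            # A variable binds on first sight and must agree on every later occurrence.
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings


fact = ["login", "12345", "smooth_operator"]
print(match_flat(["login", "?user_id", "?username"], fact))
# {'?user_id': '12345', '?username': 'smooth_operator'}
print(match_flat(["?relation", "12345", "smooth_operator"], fact))
# {'?relation': 'login'}
```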

Our type system is very simple. We can get by with just a few boolean functions. The first tests to see if an expression is a variable.

In [8]:
def is_variable( exp):
    return isinstance( exp, str) and exp.startswith( "?")

In [9]:
is_variable( "Fred")

False

In [10]:
is_variable( "?fred")

True

The second tests to see if an expression is a constant:

In [11]:
def is_constant( exp):
    return isinstance( exp, str) and not is_variable( exp)

In [12]:
is_constant( "Fred")

True

In [13]:
is_constant( "?fred")

False

In [14]:
is_constant( ["loves", "Fred", "?wife"])

False

It might also be useful to know that:

<code>
type( "a")
&lt;class 'str'>
type( "a") == str
True
type( "a") == list
False
type( ["a"]) == list
True
</code>


You need to write the `unification` function described above. It should work with two expressions of the type returned by `parse`. See `unify` for how it will be called. It should return the result of unification for the two expressions as detailed above and in the book. It does not have to make all the necessary substitutions (for example, if ?x is bound to ?y and ?y is bound to 1, ?x doesn't have to be replaced everywhere with 1. It's enough to return {"?x":"?y", "?y":1}. For an actual application, you would need to fix this!)
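
For reference, the "fix" mentioned above could look something like the following sketch. It is not required for the assignment. `resolve` and `resolve_all` are hypothetical names, and the sketch assumes the bindings contain no cycles (which an occurs check is meant to guarantee).

```python
def is_variable(exp):
    return isinstance(exp, str) and exp.startswith("?")


def resolve(exp, bindings):
    """Follow variable bindings until a non-variable or an unbound variable is reached."""
    while is_variable(exp) and exp in bindings:
        exp = bindings[exp]
    if isinstance(exp, list):
        # Resolve variables nested inside a list as well.
        return [resolve(e, bindings) for e in exp]
    return exp


def resolve_all(bindings):
    """Return a new substitution list in which every value is fully resolved."""
    return {var: resolve(value, bindings) for var, value in bindings.items()}


print(resolve_all({"?x": "?y", "?y": 1}))   # {'?x': 1, '?y': 1}
```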

-----

## substitute documentation:  

Applies substitutions to the given expression.  
    
This function recursively substitutes variables in the expression with their bound values based on the provided substitution list.  

Args:  
substitutions (dict): The substitution list containing variable bindings.  
exp (list/str): The expression to apply the substitutions to.  

Returns:  
list/str: The expression after applying substitutions.  

In [15]:
def substitute(substitutions, exp):
    if isinstance(exp, list):
        return [substitute(substitutions, e) for e in exp]
    if is_variable(exp) and exp in substitutions:
        return substitute(substitutions, substitutions[exp])
    return exp

In [16]:
# Test substitute
print("Running substitute tests...")

substitutions = {"?x": "John", "?y": "Mary"}
expression = ["loves", "?x", "?y"]
result = substitute(substitutions, expression)
assert result == ["loves", "John", "Mary"], f"Test #1 failed: {result} != ['loves', 'John', 'Mary']"
print("Test #1 passed.")

substitutions = {"?a": "Paris", "?b": "London"}
expression = ["travels", "?a", "?b"]
result = substitute(substitutions, expression)
assert result == ["travels", "Paris", "London"], f"Test #2 failed: {result} != ['travels', 'Paris', 'London']"
print("Test #2 passed.")

substitutions = {"?x": "apple"}
expression = ["eats", "?x"]
result = substitute(substitutions, expression)
assert result == ["eats", "apple"], f"Test #3 failed: {result} != ['eats', 'apple']"
print("Test #3 passed.")

Running substitute tests...
Test #1 passed.
Test #2 passed.
Test #3 passed.


## occurs_check documentation:  

Checks if a variable occurs in an expression.  
    
This prevents circular unification, where a variable would be bound to an expression that contains that same variable, either directly or indirectly through other bindings. If `var` occurs within `exp` (following `substitutions` recursively), unification should fail.

Args:  
var (str): The variable to check.  
exp (list/str): The expression to check against.  
substitutions (dict): The substitution list containing variable bindings.  

Returns:  
bool: True if the variable occurs in the expression, otherwise False.  

In [17]:
def occurs_check(var, exp, substitutions):
    if var == exp:
        return True
    elif isinstance(exp, list):
        # Recursively check each element in the list for occurrences
        return any(occurs_check(var, sub_exp, substitutions) for sub_exp in exp)
    elif isinstance(exp, str) and exp in substitutions:
        # Recursively check the substituted value of `exp`
        return occurs_check(var, substitutions[exp], substitutions)
    return False

In [18]:
# Test occurs_check
print("\nRunning occurs_check tests...")

# Test #1: Direct occurrence of `var` in `exp`
var = "?x"
exp = ["loves", "?x", "Mary"]
substitutions = {}
result = occurs_check(var, exp, substitutions)
assert result == True, f"Test #1 failed: {result} != True"
print("Test #1 passed.")

# Test #2: No occurrence of `var` in `exp`
var = "?y"
exp = ["loves", "John", "Mary"]
substitutions = {}
result = occurs_check(var, exp, substitutions)
assert result == False, f"Test #2 failed: {result} != False"
print("Test #2 passed.")

# Test #3: No occurrence of `var` in `exp`, with an empty substitutions dictionary
var = "?a"
exp = ["travels", "?b", "?c"]
substitutions = {}
result = occurs_check(var, exp, substitutions)
assert result == False, f"Test #3 failed: {result} != False"
print("Test #3 passed.")

# Additional Test #4: Indirect self-reference through substitutions
var = "?x"
exp = ["loves", "?y", "Mary"]
substitutions = {"?y": "?x"}  # `?y` is indirectly `?x`
result = occurs_check(var, exp, substitutions)
assert result == True, f"Test #4 failed: {result} != True"
print("Test #4 passed.")

# Additional Test #5: No self-reference in nested structure
var = "?x"
exp = ["loves", ["friend", "?y"], "Mary"]
substitutions = {"?y": "John"}  # `?y` is unrelated to `?x`
result = occurs_check(var, exp, substitutions)
assert result == False, f"Test #5 failed: {result} != False"
print("Test #5 passed.")



Running occurs_check tests...
Test #1 passed.
Test #2 passed.
Test #3 passed.
Test #4 passed.
Test #5 passed.
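
To see why the check matters, the sketch below applies a circular binding one round at a time. `apply_once` is a hypothetical helper that, unlike `substitute`, does not follow chains, so each round can be printed; a fully recursive substitution over the same binding would never terminate.

```python
def apply_once(exp, bindings):
    """Replace each bound variable by its value, without following chains."""
    if isinstance(exp, list):
        return [apply_once(e, bindings) for e in exp]
    return bindings.get(exp, exp)


cyclic = {"?x": ["f", "?x"]}   # the kind of binding an occurs check rejects
term = "?x"
for _ in range(3):
    term = apply_once(term, cyclic)
    print(term)
# ['f', '?x']
# ['f', ['f', '?x']]
# ['f', ['f', ['f', '?x']]]
```

Each round wraps the term in another `f`, so there is no finite expression that `?x` could stand for.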


## unify_var documentation:  

Unifies a variable with another expression.  
    
This function checks whether the variable can be unified with the expression, ensuring that the variable doesn't appear within the expression (occurs check). If successful, it binds the variable to the expression in the substitution list.

Args:  
var (str): The variable to unify.  
exp (list/str): The expression to unify the variable with.  
substitutions (dict): The substitution list containing variable bindings.  

Returns:  
dict or None: The updated substitution list if successful, otherwise None (indicating failure).  


In [19]:
def unify_var(var, exp, substitutions):
    print(f"Unifying variable: {var} with expression: {exp}")

    # If `var` is already in substitutions, unify its substitution with `exp`
    if var in substitutions:
        return unification(substitute(substitutions, var), exp, substitutions)
    
    # If `exp` is a variable in substitutions, unify `var` with the substitution value of `exp`
    elif isinstance(exp, str) and exp in substitutions:
        return unification(var, substitute(substitutions, exp), substitutions)
    
    # Direct occurs check to detect circular references (both direct and indirect)
    elif occurs_check(var, exp, substitutions):
        print(f"Circular reference detected: cannot unify {var} with {exp}")
        return None  # Fail unification due to circular reference

    # If no circular reference is detected, add to substitutions
    else:
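        # Store a list as its s-expression string, e.g. ['son', 'Barney'] -> '(son Barney)'.
        # Caveat: this assumes a flat list of strings, and any variables inside the
        # stored string are no longer visible to later substitute/occurs_check calls.
        substitutions[var] = f"({' '.join(exp)})" if isinstance(exp, list) else exp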
        return substitutions


In [20]:
# Test unify_var
print("\nRunning unify_var tests...")

var = "?x"
exp = "John"
substitutions = {}
result = unify_var(var, exp, substitutions)
assert result == {"?x": "John"}, f"Test #1 failed: {result} != {{'?x': 'John'}}"
print("Test #1 passed.")

var = "?z"
exp = "Sky"
substitutions = {"?y": "Tree"}
result = unify_var(var, exp, substitutions)
assert result == {"?y": "Tree", "?z": "Sky"}, f"Test #2 failed: {result} != {{'?y': 'Tree', '?z': 'Sky'}}"
print("Test #2 passed.")

var = "?p"
exp = "Car"
substitutions = {"?q": "Bike"}
result = unify_var(var, exp, substitutions)
assert result == {"?q": "Bike", "?p": "Car"}, f"Test #3 failed: {result} != {{'?q': 'Bike', '?p': 'Car'}}"
print("Test #3 passed.")


Running unify_var tests...
Unifying variable: ?x with expression: John
Test #1 passed.
Unifying variable: ?z with expression: Sky
Test #2 passed.
Unifying variable: ?p with expression: Car
Test #3 passed.


## unification documentation:  
    
Unifies two expressions.  
    
This function attempts to unify two expressions recursively. It handles base cases (constants, variables, and empty lists) and applies substitutions. For lists, it unifies elements sequentially.  

Args:  
exp1 (list/str): The first expression to unify.  
exp2 (list/str): The second expression to unify.  
substitutions (dict, optional): The substitution list containing variable bindings. Defaults to None.  

Returns:  
dict or None: The updated substitution list if successful, otherwise None (indicating failure).  

In [21]:
def unification(exp1, exp2, substitutions=None):
    if substitutions is None:
        substitutions = {}

    # Apply current substitutions to expressions
    exp1 = substitute(substitutions, exp1)
    exp2 = substitute(substitutions, exp2)

    # Check for direct match between expressions
    if exp1 == exp2:
        return substitutions  
    if is_variable(exp1):
        return unify_var(exp1, exp2, substitutions)
    if is_variable(exp2):
        return unify_var(exp2, exp1, substitutions)
    
    # Inductive case: recursively unify lists if lengths match
    if isinstance(exp1, list) and isinstance(exp2, list) and len(exp1) == len(exp2):
        first1, rest1 = exp1[0], exp1[1:]
        first2, rest2 = exp2[0], exp2[1:]
        result1 = unification(first1, first2, substitutions)
        if result1 is None:
            return None 
        return unification(rest1, rest2, result1)
    
    # Return None if unification is not possible
    return None  

In [22]:
# Test unification
print("\nRunning unification tests...")

exp1 = ["loves", "?x", "Mary"]
exp2 = ["loves", "John", "?y"]
result = unification(exp1, exp2)
assert result == {"?x": "John", "?y": "Mary"}, f"Test #1 failed: {result} != {{'?x': 'John', '?y': 'Mary'}}"
print("Test #1 passed.")

exp1 = ["travels", "?x", "?y"]
exp2 = ["travels", "Alice", "Paris"]
result = unification(exp1, exp2)
assert result == {"?x": "Alice", "?y": "Paris"}, f"Test #2 failed: {result} != {{'?x': 'Alice', '?y': 'Paris'}}"
print("Test #2 passed.")

exp1 = ["eats", "?x"]
exp2 = ["eats", "apple"]
result = unification(exp1, exp2)
assert result == {"?x": "apple"}, f"Test #3 failed: {result} != {{'?x': 'apple'}}"
print("Test #3 passed.")


Running unification tests...
Unifying variable: ?x with expression: John
Unifying variable: ?y with expression: Mary
Test #1 passed.
Unifying variable: ?x with expression: Alice
Unifying variable: ?y with expression: Paris
Test #2 passed.
Unifying variable: ?x with expression: apple
Test #3 passed.


In [23]:
def list_check(parsed_expression):
    if isinstance(parsed_expression, list):
        return parsed_expression
    return [parsed_expression]

The `unification` pseudocode only takes lists so we have to make sure that we only pass a list.
However, this has the side effect of making "foo" unify with ["foo"], at the start.
That's ok.

In [24]:
def unify( s_expression1, s_expression2):
    list_expression1 = list_check(s_expression1)
    list_expression2 = list_check(s_expression2)
    return unification( list_expression1, list_expression2)

**Note** If you see the error,

```
tokenize.TokenError: ('EOF in multi-line statement', (2, 0))
```
You most likely have unbalanced parentheses in your s-expression.

## Test Cases

Use the expressions from the Self Check as your test cases...

In [25]:
self_check_test_cases = [
    ['(son Barney Barney)', '(daughter Wilma Pebbles)', None]
]
for case in self_check_test_cases:
    exp1, exp2, expected = case
    # Parse the s-expression strings first; unify expects parsed expressions
    actual = unify(parse(exp1), parse(exp2))
    print(f"actual = {actual}")
    print(f"expected = {expected}")
    print("\n")
    assert expected == actual

actual = None
expected = None




Now add at least **five (5)** additional test cases of your own making, explaining exactly what you are testing. They should not be testing the same things as the self check test cases above.

In [26]:
new_test_cases = [
    # Original Test Case: Non-equal constants (expect no unification)
    ['(son Barney Barney)', '(daughter Wilma Pebbles)', None, "Non-equal constants (Barney vs. Wilma and Pebbles)"],
    
    # Test case 1: Different planets, no unification (expect None)
    ['(planet Earth orbits Sun)', '(planet Mars orbits Sun)', None, "Different planets (Earth vs. Mars), no unification expected"],
    
    # Test case 2: Exact match between expressions (expect empty substitution)
    ['(father Abraham Isaac)', '(father Abraham Isaac)', {}, "Exact match, expect empty unification (Abraham and Isaac)"],
    
    # Test case 3: Famous fathers and sons with differing children (expect no unification)
    ['(father Jor-El Superman)', '(father Jor-El Kal-El)', None, "Famous fathers and sons (Superman vs. Kal-El), no unification expected"],
    
    # Test case 4: Different circus animals performing the same action (expect no unification)
    ['(elephant Dumbo performs)', '(elephant Jumbo performs)', None, "Different circus animals (Dumbo vs. Jumbo), no unification expected"],
    
    # Test case 5: Simple unification between a variable and a constant
    ['?x', 'Earth', {'?x': 'Earth'}, "Simple unification between variable '?x' and constant 'Earth'"],
    
    # Test case 6: Simple unification between two identical constants
    ['Earth', 'Earth', {}, "Simple unification between two identical constants ('Earth' and 'Earth')"],
    
    # Test case 7: Simple unification between two variables
    ['?x', '?y', {'?x': '?y'}, "Simple unification between two variables ('?x' and '?y')"],
    
    # Test case 8: Unification where both arguments use the same variable (should unify ?x with 'Fred')
    ['(loves Fred Fred)', '(loves ?x ?x)', {'?x': 'Fred'}, "Unification where ?x unifies with 'Fred' in both positions"],
    
    # Test case 9: Repeated variable against two different constants (should fail: ?y cannot be both George and Fred)
    ['(future George Fred)', '(future ?y ?y)', None, "Repeated variable failure (?y cannot be both 'George' and 'Fred')"],

    # Test case 10: Non-equal constants (expect failure)
    ['Fred', 'Barney', None, "FAIL - The constants differ ('Fred' vs. 'Barney')"],
    
    # Test case 11: Identical constants, no substitution needed
    ['Pebbles', 'Pebbles', {}, "Both constants are the same ('Pebbles'), no substitution needed"],
    
    # Test case 12: Unification by substituting variable with a constant
    ['(quarry_worker Fred)', '(quarry_worker ?x)', {'?x': 'Fred'}, "Unification by substituting '?x' with 'Fred'"],
    
    # Test case 13: Unification with multiple substitutions
    ['(son Barney ?x)', '(son ?y Bam_Bam)', {'?x': 'Bam_Bam', '?y': 'Barney'}, "Unification with substitutions: '?x' -> 'Bam_Bam', '?y' -> 'Barney'"],
    
    # Test case 14: Unification by substituting multiple variables with constants
    ['(married ?x ?y)', '(married Barney Wilma)', {'?x': 'Barney', '?y': 'Wilma'}, "Unification with substitutions: '?x' -> 'Barney', '?y' -> 'Wilma'"],
    
    # Test case 15: Unification with nested expressions
    ['(son Barney ?x)', '(son ?y (son Barney))', {'?x': '(son Barney)', '?y': 'Barney'}, "Unification with nested structure: '?x' -> '(son Barney)', '?y' -> 'Barney'"],
       
    # Test case 16: Structural difference causing unification failure
    ['(son Barney Bam_Bam)', '(son ?y (son Barney))', None, "FAIL - Structure mismatch, 'Bam_Bam' cannot unify with '(son Barney)'"],
    
    # Test case 17: Both variables must unify to the same constant
    ['(loves Fred Fred)', '(loves ?x ?x)', {'?x': 'Fred'}, "Unification where '?x' must match 'Fred' in both positions"],
    
    # Test case 18: Constants differ, causing unification failure
    ['(future George Fred)', '(future ?y ?y)', None, "FAIL - Constants differ ('George' vs. 'Fred'), no unification possible"]
]

# Running the tests
for case in new_test_cases:
    exp1, exp2, expected, message = case
    actual = unify(parse(exp1), parse(exp2))
    print(f"Testing {message}...")
    print(f"actual = {actual}")
    print(f"expected = {expected}")
    assert actual == expected, f"Test failed: {exp1} vs {exp2}. Expected {expected}, got {actual}"
    print(f"Test passed for {message}.\n")


Testing Non-equal constants (Barney vs. Wilma and Pebbles)...
actual = None
expected = None
Test passed for Non-equal constants (Barney vs. Wilma and Pebbles).

Testing Different planets (Earth vs. Mars), no unification expected...
actual = None
expected = None
Test passed for Different planets (Earth vs. Mars), no unification expected.

Testing Exact match, expect empty unification (Abraham and Isaac)...
actual = {}
expected = {}
Test passed for Exact match, expect empty unification (Abraham and Isaac).

Testing Famous fathers and sons (Superman vs. Kal-El), no unification expected...
actual = None
expected = None
Test passed for Famous fathers and sons (Superman vs. Kal-El), no unification expected.

Testing Different circus animals (Dumbo vs. Jumbo), no unification expected...
actual = None
expected = None
Test passed for Different circus animals (Dumbo vs. Jumbo), no unification expected.

Unifying variable: ?x with expression: Earth
Testing Simple unification between variable '?x'

I could not solve the problem of self-referential structures:

    # Test case 16: Self-referential structure causing failure
    ['(son Barney ?x)', '(son ?y (son ?y))', None, "FAIL - Self-referential structure, '?y' cannot unify with '(son ?y)'"],

I tried directly embedding occurs_check within unify_var, to ensure it's applied each time a variable is unified with an expression, preventing self-references, but that did not work. I made my program work on all self-check test cases except this one. What is wrong?
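
One possible reading of this case, offered as a sketch rather than a diagnosis of the code above, is to run the pair through a small standalone unifier. In it, bindings are stored as parsed lists rather than strings (an assumption that differs from `unify_var` above, which stores `'(son Barney)'`), and `walk`, `occurs`, and `unify_sketch` are hypothetical names.

```python
def is_variable(exp):
    return isinstance(exp, str) and exp.startswith("?")


def walk(exp, bindings):
    """Apply bindings to exp, following chains (assumes bindings are acyclic)."""
    if isinstance(exp, list):
        return [walk(e, bindings) for e in exp]
    if is_variable(exp) and exp in bindings:
        return walk(bindings[exp], bindings)
    return exp


def occurs(var, exp):
    """True if var appears anywhere inside exp (exp already has bindings applied)."""
    if isinstance(exp, list):
        return any(occurs(var, e) for e in exp)
    return var == exp


def unify_sketch(exp1, exp2, bindings=None):
    bindings = {} if bindings is None else bindings
    exp1, exp2 = walk(exp1, bindings), walk(exp2, bindings)
    if exp1 == exp2:
        return bindings
    if is_variable(exp1):
        return None if occurs(exp1, exp2) else {**bindings, exp1: exp2}
    if is_variable(exp2):
        return None if occurs(exp2, exp1) else {**bindings, exp2: exp1}
    if isinstance(exp1, list) and isinstance(exp2, list) and len(exp1) == len(exp2):
        for a, b in zip(exp1, exp2):
            bindings = unify_sketch(a, b, bindings)
            if bindings is None:
                return None
        return bindings
    return None


# The pair from the note above: ?y binds to Barney, then ?x binds to (son Barney).
print(unify_sketch(["son", "Barney", "?x"], ["son", "?y", ["son", "?y"]]))
# {'?y': 'Barney', '?x': ['son', 'Barney']}

# A pair that really is self-referential: ?y against (son ?y).
print(unify_sketch("?y", ["son", "?y"]))
# None
```

Under these assumptions the pair succeeds, since `?x` does not occur inside `(son ?y)`; the occurs check only fires for a pair such as `?y` and `(son ?y)`. If that reading is right, the expected value `None` in the test case, rather than the occurs check, may be what needs revisiting.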

## Before You Submit...

1. Did you provide output exactly as requested?
2. Did you re-execute the entire notebook? ("Restart Kernel and Run All Cells...")
3. If you did not complete the assignment or had difficulty please explain what gave you the most difficulty in the Markdown cell below.
4. Did you change the name of the file to `jhed_id.ipynb`?

Do not submit any other files.