## Assignment-01

### A Pattern-Matching Chatbot


### 1. Pattern Match
Pattern: (I want A)  
Response: (If you had A, what would it mean to you?)  

To define and test templates, we need a special kind of symbol called a "variable", which acts as a placeholder. For example, the target "I want X" can be written as "I want ?X", where ?X is the placeholder symbol.

Given the input "I want holiday", 'holiday' is the value that '?X' stands for.

In [1]:
def is_variable(pat):
    return pat.startswith('?') and all(s.isalpha() for s in pat[1:])

def pat_match(pattern, saying):
    "Define a simple procedure that checks whether the saying matches the pattern."

    # Stop when either side runs out of tokens before a variable is reached.
    if not pattern or not saying: return False

    if is_variable(pattern[0]): return True
    else:
        if pattern[0] != saying[0]: return False
        else:
            return pat_match(pattern[1:], saying[1:])

# str.isalpha() returns True if every character in the string is alphabetic
# and False if at least one character is not.

In [2]:
pat_match('I want ?X'.split(), 'I want holiday'.split())

True

In [3]:
pat_match('I dreamed about ?X'.split(), 'I have dreamed a dog'.split())

False
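One edge case is worth noting: `is_variable` as defined above also accepts a bare `?`, because `all(...)` over an empty sequence is `True`. The sketch below contrasts it with a stricter check. The name `is_named_variable` is hypothetical and not part of the original notebook.

```python
def is_variable(pat):
    # Same check as in the notebook: '?' followed by alphabetic characters only.
    return pat.startswith('?') and all(s.isalpha() for s in pat[1:])


def is_named_variable(pat):
    # Hypothetical stricter variant: requires at least one letter after '?'.
    return len(pat) > 1 and pat.startswith('?') and pat[1:].isalpha()


for token in ['?X', '?', 'X', '?X1', '?*X']:
    print(token, is_variable(token), is_named_variable(token))
```

This matters once response templates or patterns contain a literal question mark as its own token, as in "Why do you need ?X ?".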

In [4]:
def pat_match(pattern, saying):
    "Revise the previous procedure to return the matched variable <A>."

    # No variable was reached before one side ran out of tokens.
    if not pattern or not saying: return False

    if is_variable(pattern[0]):
        return pattern[0], saying[0]
    else:
        if pattern[0] != saying[0]: return False
        else:
            return pat_match(pattern[1:], saying[1:])


In [5]:
pat_match('I want ?X'.split(), 'I want holiday'.split())

('?X', 'holiday')

In [6]:
pat_match('?X equals ?X'.split(), '2+2 equals 4'.split())
# Note that when the pattern contains two variables, the procedure returns only the first matched pair

('?X', '2+2')

In [7]:
def pat_match(pattern, saying):
    "Revise the previous procedure so that we can identify all matched pairs."
    
    if not pattern or not saying: return []
    
    if is_variable(pattern[0]):
        return [(pattern[0], saying[0])] + pat_match(pattern[1:], saying[1:])
    else:
        if pattern[0] != saying[0]: return []
        else:
            return pat_match(pattern[1:], saying[1:])
        

In [8]:
pat_match('?X is greater than ?Y'.split(), '3 is greater than 2'.split())

[('?X', '3'), ('?Y', '2')]
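The In [7] version returns `[]` both when the match fails and when it succeeds without any variables, and it keeps bindings collected before a later mismatch or a length difference. A minimal sketch of a stricter matcher follows, assuming we want `None` to signal failure and a repeated variable to bind to the same word each time. The name `pat_match_strict` is hypothetical and not part of the original notebook.

```python
def is_variable(pat):
    return pat.startswith('?') and all(s.isalpha() for s in pat[1:])


def pat_match_strict(pattern, saying, bindings=None):
    """Return a dict of bindings, or None if the saying does not match."""
    bindings = {} if bindings is None else bindings

    # Both exhausted: full match. Only one exhausted: lengths differ, so fail.
    if not pattern and not saying:
        return bindings
    if not pattern or not saying:
        return None

    head, word = pattern[0], saying[0]
    if is_variable(head):
        # A repeated variable must bind to the same word every time.
        if bindings.setdefault(head, word) != word:
            return None
    elif head != word:
        return None

    return pat_match_strict(pattern[1:], saying[1:], bindings)


print(pat_match_strict('?X is greater than ?Y'.split(), '3 is greater than 2'.split()))
print(pat_match_strict('I need ?X'.split(), 'I need an iPhone'.split()))
print(pat_match_strict('?X equals ?X'.split(), '2+2 equals 4'.split()))
```

Returning a dict also lets a caller tell an empty but successful match (`{}`) apart from a failure (`None`).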

In [9]:
# Now we want to do two more things --- a) define a procedure that translates the matched relations
# into a dictionary; and b) define a procedure that can substitute variables given a dictionary
def pat_to_dic(patterns):
    return {k: v for k, v in patterns}

def substitute(pattern, pat_dic):
    if not pattern: return []
    
    return [pat_dic.get(pattern[0], pattern[0])] + substitute(pattern[1:], pat_dic)


In [10]:
got_patterns = pat_match('I want ?X'.split(), 'I want iPhone'.split())

In [11]:
got_patterns

[('?X', 'iPhone')]

In [13]:
substitute("What do you do if you got a ?X".split(), pat_to_dic(got_patterns))

['What', 'do', 'you', 'do', 'if', 'you', 'got', 'a', 'iPhone']

In [14]:
' '.join(substitute("What do you do if you got a ?X".split(), pat_to_dic(got_patterns)))
# join the sentence together

'What do you do if you got a iPhone'

In [15]:
# try another example
john_pat = pat_match('?P needs ?X'.split(), "John needs vacation".split())

In [17]:
' '.join(substitute('Why does ?P need ?X'.split(), pat_to_dic(john_pat)))

'Why does John need vacation'
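`substitute` recurses once per token. The same lookup can also be written as a single list comprehension. A minimal equivalent sketch follows, where the name `substitute_flat` is hypothetical:

```python
def substitute_flat(pattern, pat_dic):
    # Replace each token that has a binding; leave every other token unchanged.
    return [pat_dic.get(token, token) for token in pattern]


bindings = {'?P': 'John', '?X': 'vacation'}
print(' '.join(substitute_flat('Why does ?P need ?X'.split(), bindings)))
# prints: Why does John need vacation
```

The comprehension avoids building a new list slice at every step, which may matter for long sentences.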

In [77]:
# Now we can define a pattern dictionary
defined_patterns = {
    "I need ?X": ["Imagine you will get ?X soon", "Why do you need ?X ?"], 
    "My ?X told me something": ["Talk about more about your ?X", "How do you think about your ?X ?"]
}

[key.split() for key in defined_patterns.keys()]
defined_patterns.get("I need ?X")

['Imagine you will get ?X soon', 'Why do you need ?X ?']

In [91]:
import random

def get_response(saying, pat_dic):
    "Define a procedure that returns a patterned response based on the dictionary supplied"

    if not saying or not pat_dic: return []

    response_list = []

    # Try every pattern; fill each of its response templates with the matched variables.
    for key, pat_response in pat_dic.items():
        got_patterns = pat_match(key.split(), saying.split())
        if got_patterns:
            for r in pat_response:
                response_list.append(' '.join(substitute(r.split(), pat_to_dic(got_patterns))))

    if response_list:
        return random.choice(response_list)
    else:
        return "Sorry, I don't know how to answer."


In [95]:
get_response('My mom told me something', defined_patterns)

'Talk about more about your mom'

In [96]:
get_response('I need vacation', defined_patterns)

'Why do you need vacation ?'
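Because `get_response` picks a reply with `random.choice`, its output can differ between runs. To inspect or test the matching step on its own, one option is to list every candidate reply instead of sampling one. A minimal self-contained sketch follows. The helper `all_responses` and the small `rules` dictionary are hypothetical, and `pat_match` is assumed to behave like the In [7] version.

```python
def is_variable(tok):
    return tok.startswith('?') and all(c.isalpha() for c in tok[1:])


def pat_match(pattern, saying):
    # Same behaviour as the notebook's In [7] version.
    if not pattern or not saying:
        return []
    if is_variable(pattern[0]):
        return [(pattern[0], saying[0])] + pat_match(pattern[1:], saying[1:])
    if pattern[0] != saying[0]:
        return []
    return pat_match(pattern[1:], saying[1:])


def all_responses(saying, rules):
    # Hypothetical helper: list every filled-in template instead of picking one.
    responses = []
    for pattern, templates in rules.items():
        bindings = dict(pat_match(pattern.split(), saying.split()))
        if bindings:
            for template in templates:
                responses.append(' '.join(bindings.get(t, t) for t in template.split()))
    return responses


rules = {"I need ?X": ["Why do you need ?X ?"]}
print(all_responses('I need vacation', rules))
```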

### 2. Segment Match
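The approach above can already handle some basic dialogue, but the pattern is matched token by token: "I need iPhone" matches "I need ?X", whereas "I need an iPhone" does not match it properly, because ?X can stand for only one token. What can we do?

To solve this problem, we can introduce a new variable type "?*X". The extra asterisk (*) means that the variable matches multiple tokens.

First, as before, we need a function that tests whether a token is a multi-token variable.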


In [97]:
def is_pattern_segment(pattern):
    return pattern.startswith('?*') and all(a.isalpha() for a in pattern[2:])

In [99]:
is_pattern_segment('?*PYAMC')

True

In [104]:
from collections import defaultdict

fail = [True, None]


def segment_match(pattern, saying):
    "Bind the leading ?* variable to the tokens before the first occurrence of the next pattern token."
    seg_pat, rest = pattern[0], pattern[1:]
    seg_pat = seg_pat.replace('?*', '?')

    # Nothing follows the segment variable, so it takes every remaining token.
    if not rest: return (seg_pat, saying), len(saying)

    # Otherwise the segment ends at the first token equal to the next pattern token.
    for i, token in enumerate(saying):
        if rest[0] == token:
            return (seg_pat, saying[:i]), i

    return (seg_pat, saying), len(saying)

def pat_match_with_seg(pattern, saying):
    "Revise the previous pat_match for matching ?* variable with a segment of texts"
    if not pattern or not saying: return []

    pat = pattern[0]

    if is_variable(pat):
        return [(pat, saying[0])] + pat_match_with_seg(pattern[1:], saying[1:])
    elif is_pattern_segment(pat):
        match, index = segment_match(pattern, saying)
        # Continue with the tokens after the matched segment (a slice, not a single token).
        return [match] + pat_match_with_seg(pattern[1:], saying[index:])
    elif pat == saying[0]:
        return pat_match_with_seg(pattern[1:], saying[1:])
    else:
        return fail

In [105]:
segment_match('?*P is very good and ?*X'.split(), "My dog is very good and my cat is very cute".split())

(('?P', ['My', 'dog']), 2)
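Because `segment_match` pairs a segment variable with a list of tokens (`saying[:i]`) rather than a single token, such a binding would need to be joined into a string before it can fill a response template. A minimal sketch follows, assuming bindings arrive as (variable, value) pairs whose value is either a string or a list of tokens. `bindings_to_dict` is a hypothetical helper, and the bindings below are written by hand for illustration.

```python
def bindings_to_dict(bindings):
    # Join list-valued (segment) bindings into one string; keep strings as-is.
    return {
        var: ' '.join(value) if isinstance(value, list) else value
        for var, value in bindings
    }


def substitute(pattern, pat_dic):
    # Same idea as the notebook's substitute, written as a comprehension.
    return [pat_dic.get(token, token) for token in pattern]


# Hand-written example bindings: one segment binding and one single-token binding.
bindings = [('?P', ['My', 'dog']), ('?X', 'cute')]
reply = ' '.join(substitute('Why is ?P so ?X ?'.split(), bindings_to_dict(bindings)))
print(reply)  # prints: Why is My dog so cute ?
```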