# 基于模式匹配的对话机器人实现
## Pattern Match
为了实现模板的判断和定义，我们需要定义一个特殊的符号类型，这个符号类型就叫做"variable"， 这个"variable"用来表示是一个占位符。例如，定义一个目标: "I want X"， 我们可以表示成 "I want ?X", 意思就是?X是一个用来占位的符号。
如果输入了"I want holiday"， 在这里 'holiday' 就是 '?X'

In [3]:
def is_variable(pat):
    return pat.startswith('?') and all(s.isalpha() for s in pat[1:])

定义函数：检查字符串是否以？开头，且剩下的字符均为字母组成

In [15]:
def pat_match1(pattern, saying):
    if is_variable(pattern[0]): return True
    else:
        if pattern[0] != saying[0]: return False
        else:
            return pat_match1(pattern[1:], saying[1:])

定义函数：两个列表中元素相比较：所有元素相同，返回True；出现？的元素前所有元素相同,则返回True；若两个列表元素均相同，错误

In [19]:
pat_match1('I love ?u'.split(), 'I love you'.split())

True

In [20]:
pat_match1('I love a ?u'.split(), 'I love you'.split())

False

## 获得匹配的变量

以上的函数能够判断两个 pattern 是不是相符，但是我们更加希望的是获得每个variable对应的是什么值。
我们对程序做如下修改

In [22]:
def pat_match(pattern, saying):
    if not pattern or not saying: return []
    if is_variable(pattern[0]):
        return [(pattern[0], saying[0])] + pat_match(pattern[1:], saying[1:])
    else:
        if pattern[0] != saying[0]: return []
        else:
            return pat_match(pattern[1:], saying[1:])

In [23]:
pat_match("?X greater than ?Y".split(), "3 greater than 2".split())

[('?X', '3'), ('?Y', '2')]

如果我们知道了每个变量对应的是什么，那么我们就可以很方便的使用我们定义好的模板进行替换：

为了方便接下来的替换工作，我们新建立两个函数，一个是把我们解析出来的结果变成一个 dictionary，一个是依据这个 dictionary 依照我们的定义的方式进行替换。

In [24]:
def pat_to_dict(patterns):
    return {k:v for k,v in patterns}

In [25]:
def subsitite(rule, parsed_rules):
    if not rule: return []
    return [parsed_rules.get(rule[0], rule[0])] + subsitite(rule[1:], parsed_rules)

In [30]:
got_patterns = pat_match("I want ?X".split(), "I want iPhone".split())
' '.join(subsitite("What if you mean if you got a ?X".split(), pat_to_dict(got_patterns)))

'What if you mean if you got a iPhone'

In [31]:
defined_patterns = {
    "I need ?X": ["Image you will get ?X soon", "Why do you need ?X ?"], 
    "I need ?X and ?X": ["Image you will get ?X and ?X soon", "Why do you need ?X and ?X ?"], 
    "My ?X told me something": ["Talk about more about your ?X", "How do you think about your ?X ?"]
}

In [65]:
from random import choice
def func(saying, defined):
    while True:
        for k in defined:
            if pat_match1(k.split(), saying.split()):
                pattern = pat_match(k.split(), saying.split())
                dict = pat_to_dict(pattern)
                answer1 = choice(defined[k])
                answer = ' '.join(subsitite(answer1.split(), dict))
                return answer
        print("Sorry, I don't understand you.")
        break                       

In [66]:
func("I need iPhone", defined_patterns)

'Why do you need iPhone ?'

In [69]:
func("I need iPhone and iPad", defined_patterns)

'Why do you need iPhone ?'

这里func只能识别句子中有一个？的情况，需要定义新函数

In [94]:
from random import choice
def func(saying, defined):
    while True:
        for k in defined:
            a = k.split()
            print(a)
            b = saying.split()
            print(b)
            if pat_match2(a, b):
                pattern = pat_match(a, b)
                print(pattern)
                dict = pat_to_dict(pattern)
                print(dict)
                answer1 = choice(defined[k])
                print(answer1)
                replace = subsitite(answer1.split(), dict)
                print(replace)
                answer = ' '.join()
                return answer
        print("Sorry, I don't understand you.")
        break   

In [95]:
func("I need iPhone and iPad", defined_patterns)

['I', 'need', '?X']
['I', 'need', 'iPhone', 'and', 'iPad']


IndexError: list index out of range

In [74]:
pat_match1('?X like you', 'I hate you')

True

In [139]:
def pat_match2(pattern, saying):
    if is_variable(pattern[0]):
        if len(saying) == 1:
            global a 
            a = 1 
            return True
        for i in range(len(saying)-1):
            if is_variable(pattern[i+1]):
                pat_match2(pattern[i+2:], saying[i+2:])
            else:
                if pattern[i+1] != saying[i+1]: return False
        return True
    else:
        if pattern[0] != saying[0]: return False
        else:
            pat_match2(pattern[1:], saying[1:])
            if a == 1:
                return True

定义一个新函数，之前pat_match2()界定第一个词为？时不够完善

In [140]:
pat_match2('?X like you'.split(), 'I hate you'.split())

False

In [141]:
pat_match2('?X like you'.split(), 'he like you'.split())

True

In [142]:
pat_match2('I like ?X'.split(), 'I like you'.split())

True

In [143]:
pat_match2('I like ?X too much '.split(), 'I like you very much'.split())

True

In [144]:
pat_match1('I like ?X too much '.split(), 'I like you very much'.split())

True