In [116]:
# Read the whole result file and split it into blocks on the separator line
with open('result.txt', 'r', encoding='utf8') as fs:
    blocks = fs.read().split("====================================")

In [117]:
# Split one tab-separated line into a (node name, fields dict) pair
def line2obj(line):
    col = line.split('\t')
    return col[0][:-1], {
        'token': col[1],
        'lemma': col[2],
        'tag': col[3],
        'dep': col[4] if len(col) == 5 else None
    }

# Convert the edit section into a JSON-like dict
def parse_edit(edit):
    edit = edit.replace("	Token	Lemma	Tag	Dep(to head)\n", "")
    
    template = { 'Head': {}, 'Target': {}, 'Child': [] , 'Delete': []}
    for line in edit.split('\n'):
        node, content = line2obj(line)
        if node == 'Child' or node == 'Delete':
            template[node].append(content)
        else:
            template[node] = content
            
    return template

# Get the edit type and the sentence
def parse_info(meta):
    meta = meta.split('\n')
    edit_type = meta[0].split(' ')[0][1:-1]
    sent = meta[1].split('\t')[1]
    return edit_type, sent

In [123]:
all_edits = []
for block in blocks:
    block = block.strip()
    if block == '': continue
    
    sections = block.split('\n\n')
    edit_type, sent = parse_info(sections[0])

    all_edits.append({
        'edit_type': edit_type,
        'sent': sent,
        'edits': [parse_edit(edit) for edit in sections[1:]]
    })
        
# pprint(all_edits)

In [147]:
def search(edit_type_q=None, head_q=None, target_q=None, child_q=None, delete_q=None):
    # Queries default to None (not {}) to avoid shared mutable default arguments
    groups = filter(lambda e: e['edit_type'] == edit_type_q, all_edits) if edit_type_q else all_edits
    
    def extract(group, node):
        return map(lambda e: e[node], group['edits']) 
        
    def match(nodes, query):
        for node in nodes:
            if not node: continue
            
            # A node matches when every key in the query has the same value
            if all(query[key] == node[key] for key in query):
                return True
        return False
    
    if head_q:
        groups = filter(lambda g: match(extract(g, 'Head'), head_q), groups)
    
    if target_q:
        groups = filter(lambda g: match(extract(g, 'Target'), target_q), groups)
    
    if child_q:
        get_childs = lambda g: [c for child in extract(g, 'Child') for c in child] # Extract and flatten
        groups = filter(lambda g: match(get_childs(g), child_q), groups)
    
    if delete_q:
        get_deletes = lambda g: [d for delete in extract(g, 'Delete') for d in delete] # Extract and flatten
        groups = filter(lambda g: match(get_deletes(g), delete_q), groups)
    
    def format_line(node):
        # 'dep' is None when the input line had no dependency column
        return "\t".join([node['token'], node['lemma'], node['tag'], node.get('dep') or ''])
        
#     print(list(groups))
    for group in groups:
        print("==========================")
        print("type:", group['edit_type'])
        print("sent:", group['sent'])
        print()
        
        for edit in group['edits']:
            head, target, childs, deletes = edit['Head'], edit['Target'], edit['Child'], edit['Delete']
            
            if head: print("Head:\t" + format_line(head))
            if target: print("Target:\t" + format_line(target))
                
            for child in childs:
                print("Child:\t" + format_line(child))
            for delete in deletes:
                print("Delete:\t" + format_line(delete))
            print()
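
The matching rule that `search` applies to each node can be isolated as a small standalone sketch. The helper name `node_matches` is hypothetical and not part of the notebook; the two sample nodes are copied from the printed output of the query cell and are assumed to have the shape that `line2obj` produces.

```python
def node_matches(node, query):
    """Return True when every key in `query` has the same value in `node`."""
    return all(node.get(key) == value for key, value in query.items())


# Sample nodes in the dict shape produced by line2obj (values taken from the output)
target = {'token': 'discuss', 'lemma': 'discuss', 'tag': 'VB', 'dep': 'advcl'}
deleted = {'token': 'about', 'lemma': 'about', 'tag': 'IN', 'dep': None}

print(node_matches(target, {'token': 'discuss'}))            # True
print(node_matches(deleted, {'tag': 'IN'}))                   # True
print(node_matches(deleted, {'tag': 'IN', 'token': 'of'}))    # False
```

A query with several keys is a conjunction: all of them must agree with the node. An empty query would match every node, which is why `search` skips a filter entirely when its query is empty.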
            
    


In [149]:
# import json
search(edit_type_q='Delete', target_q={'token': 'discuss'}, delete_q={'tag': 'IN'})

type: Delete
sent: Therefore we could set up a meeting to discuss [-about-] this issue and settling it out of court .

Head:	set	set	VB	
Target:	discuss	discuss	VB	advcl
Child:	to	to	TO	aux
Child:	issue	issue	NN	dobj
Child:	and	and	CC	cc
Child:	settling	settle	VBG	conj

Head:	issue	issue	NN	
Target:	this	this	DT	det

Delete:	about	about	IN	
Delete:	about	about	IN	

type: Delete
sent: I hope this unpleasant lunch does not impede our having another meeting to discuss [-about-] the contract .

Head:	impede	impede	VB	
Target:	discuss	discuss	VB	advcl
Child:	to	to	TO	aux
Child:	contract	contract	NN	dobj

Head:	contract	contract	NN	
Target:	the	the	DT	det

Delete:	about	about	IN	
Delete:	about	about	IN	

type: Delete
sent: First of all it is too unfair that people discuss [-about-] the life of this couple .

Head:	is	be	VBZ	
Target:	discuss	discuss	VBP	ccomp
Child:	that	that	IN	mark
Child:	people	people	NNS	nsubj
Child:	life	life	NN	dobj

Head:	life	life	NN	
Target:	the	the	DT	det

Delete:	a

Child:	to	to	TO	aux
Child:	it	-PRON-	PRP	dobj

Head:	discuss	discuss	VB	
Target:	it	-PRON-	PRP	dobj

Delete:	about	about	IN	
Delete:	about	about	IN	

type: Delete
sent: Dear Giovanna , It is a long time that we do n't speak about our dreams and project for the future .... Maybe there are better times to discuss [-about-] the future .... Belonging to this .... It is two months that I 'm unemployed ...

Head:	are	be	VBP	
Target:	discuss	discuss	VB	advcl
Child:	to	to	TO	aux
Child:	future	future	NN	dobj

Head:	future	future	NN	
Target:	the	the	DT	det

Delete:	about	about	IN	
Delete:	about	about	IN	

type: Delete
sent: This important event was to discuss [-about　''-] how to build safely .

Head:	was	be	VBD	
Target:	discuss	discuss	VB	xcomp
Child:	to	to	TO	aux
Child:	build	build	VB	xcomp

Head:	build	build	VB	
Target:	how	how	WRB	advmod

Delete:	about	about	IN	
Delete:	''	''	''	
Delete:	about	about	IN	
Delete:	''	''	''	

type: Delete
sent: This is great ideas to meet and discuss [-about-] th