debugging UP through dictionaries #40

Open
khaledJabr opened this issue Jul 25, 2018 · 1 comment

Comments

@khaledJabr

I have been debugging issues with UniversalPetrarch (UP), mainly the matching between the dictionaries and the extracted patterns. @ahalterman (and @philip-schrodt) suggested tracking how UP produces events by writing out the dictionary verbs and verb patterns it matched. This method was used to debug Petrarch2, and here is the relevant code snippet that does it (by @ahalterman):

Here's the code block I added to PETR2: this is the version of PETR2 with a file date of 28 June 2016
t1 = time.time()
sentence = PETRtree.Sentence(treestr, SentenceText, Date)
print(sentence.txt)
coded_events, meta = sentence.get_events()  # this is the entry point into the processing in PETRtree
# =========== new code starts here =======
for k1, v1 in meta.items():
    if k1 != 'nouns' and k1 != 'conv_code':
        fwmp.write("\n" + str(k1) + '\n')
        fwmp.write(SentenceID + '\n')
        try:
            fwmp.write(sentence.txt + '\n')
        except:
            fwmp.write("Sentence error\n")
        for lst in v1:
            # -- fwmp.write("++ " + str(lst))
            if "~" in lst:
                fwmp.write("-- " + lst)
            elif len(lst) > 1:
                if "[" in lst[1]:
                    fwmp.write("-- " + lst[0] + ": " + lst[1][:lst[1].find("[")].strip() + '\n')
                else:
                    fwmp.write("-- " + lst[0] + ": " + str(lst[1:]) + '\n')
            else:
                if lst[0]:
                    fwmp.write("-- " + lst[0] + '\n')
"""if "conv_code" in meta:
    fwmp.write(meta["conv_code"])"""  # used to figure out convert_code, which seems to be pretty innocuous
if "comb_code" in sentence.metadata:
    fwmp.write(sentence.metadata["comb_code"])
# ===== new code ends here =========
code_time = time.time() - t1
if PETRglobals.NullVerbs or PETRglobals.NullActors:
    event_dict[key]['meta'] = meta
    event_dict[key]['text'] = sentence.txt
"fwmp" is the file the patterns are written to; it is opened and closed elsewhere in the code.
This code block is in "petrarch2.py".

I am having trouble fitting this code to UP, since UP uses PETRgraph, which does not return a meta object. I would appreciate any help on how to tackle this.

@JingL1014
Collaborator

The sentence object in PETRgraph.py has an entry, triplets, that can be used for debugging.

Here is an example sentence, followed by its triplets entry:
The Syrian Observatory for Human Rights, a UK-based group that tracks the war, said eight people
were killed in an air strike by government forces in a separate, rebel-held part of the city.

{'-#18#20#4':    --> triplet ID
  {'transfermation': '~ a (b . ATTACK) SAY = a b 112\n',    --> transformation pattern matched, if any
   'meaning': 'KILL,KILL',    --> block meaning
   'verbcode': '190',
   'triple': ('-', <PETRgraph.NounPhrase instance at 0x7f47fd9dc128>, <PETRgraph.VerbPhrase instance at 0x7f47fd9dacb0>),
   'before_transfer': ([u'SYR'], ([u'---MIL'], [u'---PPL'], '190'), '010'),    --> events involved in the transformation
   'after_transfer': [([u'SYR'], [u'---MIL'], u'112')],    --> event after transformation
   'event': ([u'---MIL'], [u'---PPL'], '190'),    --> event, or the event before transformation
   'matched_txt': u'KILL'},    --> matched verb pattern, or block meaning if only the verb is matched
}
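
To get output similar to the PETR2 "fwmp" log, the triplets entry could be dumped per sentence. The following is only a minimal sketch: it assumes triplets is a plain dict keyed by triplet ID with the keys shown in the example above, and write_triplet_debug, FIELDS, and the "DEMO-1" sentence ID are hypothetical names introduced for illustration, not part of UP.

```python
import io

# Keys taken from the example dump above. 'triple' is left out because it
# holds PETRgraph object instances that do not print usefully.
FIELDS = ('matched_txt', 'meaning', 'verbcode', 'event',
          'transfermation', 'before_transfer', 'after_transfer')


def write_triplet_debug(fwmp, sentence_id, text, triplets):
    """Write one block per triplet to the open file-like object fwmp.

    Hypothetical helper: assumes `triplets` is a dict of dicts shaped like
    the example in this thread.
    """
    fwmp.write(u"\n" + sentence_id + u"\n")
    fwmp.write(text + u"\n")
    for triplet_id in sorted(triplets):
        info = triplets[triplet_id]
        fwmp.write(u"-- triplet " + triplet_id + u"\n")
        for field in FIELDS:
            value = info.get(field)
            if value:  # skip keys that are absent or empty
                fwmp.write(u"   " + field + u": " + str(value).strip() + u"\n")


# Stand-in data copied from the example; the real objects would come from
# the PETRgraph sentence after coding.
sample_triplets = {
    '-#18#20#4': {
        'transfermation': '~ a (b . ATTACK) SAY = a b 112\n',
        'meaning': 'KILL,KILL',
        'verbcode': '190',
        'triple': ('-', '<NounPhrase>', '<VerbPhrase>'),
        'before_transfer': (['SYR'], (['---MIL'], ['---PPL'], '190'), '010'),
        'after_transfer': [(['SYR'], ['---MIL'], '112')],
        'event': (['---MIL'], ['---PPL'], '190'),
        'matched_txt': 'KILL',
    },
}

out = io.StringIO()
write_triplet_debug(out, u"DEMO-1", u"Eight people were killed in an air strike.",
                    sample_triplets)
print(out.getvalue())
```

In UP itself the call would presumably go wherever the sentence is coded, passing the real triplets entry and an already opened log file in place of the StringIO buffer.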
