the meaning of [mg] in FilterEnglishTriplts.py #19
Comments
Hi, are you using the raw Freebase dump? The "[mg]" in the regex pattern means that we only keep triplets whose entities are mids beginning with "m" or "g", which is the format of entity identifiers in Freebase. Triplets in other languages are filtered out. Can you show me some triplets from your raw dump? Best,
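For illustration: in a regular expression, `[mg]` is a character class that matches a single `m` or `g`, not the literal text `[mg]`. Below is a minimal sketch, assuming a hypothetical pattern `MID_RE` (not copied from the script) applied to the last path segment of a Freebase URI.

```python
import re

# Hypothetical pattern for illustration only; the script's real patterns may differ.
# "[mg]" matches exactly one character: either "m" or "g".
MID_RE = re.compile(r"^[mg]\.[0-9a-z_]+$")


def is_mid(identifier):
    """Return True if the identifier looks like a Freebase mid (m.xxx or g.xxx)."""
    return MID_RE.match(identifier) is not None


print(is_mid("m.01v3v75"))           # a mid starting with "m."
print(is_mid("type.type.instance"))  # a schema name, not a mid
print(is_mid("[mg].01v3v75"))        # the literal text "[mg]" is not what the class matches
```

So a raw dump does not need to contain the string "[mg]" anywhere; it only needs identifiers that start with `m.` or `g.`.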
Yes, I downloaded the Freebase dump from the official website, and these are some lines from the head of freebase-rdf-dump. I cannot find [mg] in them.
The samples that you provided will be filtered out because they don't match any of the regex patterns. The triplet below (a made-up example) will match the first regex pattern, "re_ns_ns", which matches triplets whose head entity and tail entity are both mids. It's strange that you get nothing at all after running the code. You could investigate by checking which unexpected pattern, if any, a given triplet matches.
http://rdf.freebase.com/ns/m.01v3v75 http://rdf.freebase.com/ns/type.type.instance http://rdf.freebase.com/ns/m.01v3v75 .
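One way to carry out that investigation is to run each line of the dump against every pattern and print the names of the ones that match. The sketch below is only an assumption about how such a check could look: the `re_ns_ns` regex here is a hypothetical stand-in written for this example (the one in the script may differ), and `matching_patterns` is a helper name invented for illustration. The pattern also tolerates angle brackets around each URI, in case the raw dump wraps URIs that way.

```python
import re

# Hypothetical stand-in for the script's "re_ns_ns" pattern; the real one may differ.
# Head and tail must both be mids (m.xxx or g.xxx) under the Freebase namespace.
NS = r"<?http://rdf\.freebase\.com/ns/"
PATTERNS = {
    "re_ns_ns": re.compile(
        NS + r"[mg]\.[^>\s]+>?\s+"      # head entity: a mid
        + NS + r"[^>\s]+>?\s+"          # predicate: any name in the namespace
        + NS + r"[mg]\.[^>\s]+>?\s+"    # tail entity: a mid
        + r"\.\s*$"                     # terminating dot of the triple
    ),
}


def matching_patterns(line):
    """Return the names of all patterns that match the given triple line."""
    return [name for name, pattern in PATTERNS.items() if pattern.match(line)]


# The made-up triple from the comment above.
example = (
    "http://rdf.freebase.com/ns/m.01v3v75 "
    "http://rdf.freebase.com/ns/type.type.instance "
    "http://rdf.freebase.com/ns/m.01v3v75 ."
)
print(matching_patterns(example))

# Another made-up line whose tail is a string literal, not a mid.
literal_tail = (
    "http://rdf.freebase.com/ns/m.01v3v75 "
    'http://rdf.freebase.com/ns/type.object.name "Foo"@en .'
)
print(matching_patterns(literal_tail))
```

Running this over the first few hundred lines of the dump shows quickly whether any line reaches a pattern at all, or whether every line is rejected before matching.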
Thanks a lot. I finally found the problem; your regex pattern works fine.
I used this file to filter the Freebase dump, but it filters out everything. Why do you use "[mg]" in the regex pattern?