no author information in key #41
Comments
We're glad you liked it! This looks like an import problem. This query works for me, and I have the following entry in my db:
but I have FTS support. How did you import this entry? What happens if you run
@davvil, any guesses what might have caused this? |
Is there a way to quickly see whether sqlite is compiled with FTS support or not? But in any case, FTS or no FTS should only come into play in later parts of the processing, no? UPDATE: There was a new release of
EDIT: Tried this and it seems that it has FTS support:
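For reference, one way to run such a check from Python is sketched below. This is only an illustration and not necessarily the check that was run in this thread; the helper names `sqlite_has_fts` and `sqlite_compile_options` are made up for this example. It tries to create an FTS virtual table in an in-memory database and also lists the options SQLite reports it was compiled with.

```python
import sqlite3


def sqlite_has_fts(version=5):
    """Return True if the SQLite linked into Python can create an FTS table.

    `version` picks the FTS module to probe (3, 4 or 5).
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"CREATE VIRTUAL TABLE probe USING fts{version}(content)")
        return True
    except sqlite3.OperationalError:
        # Typically "no such module: ftsN" when the module is not compiled in.
        return False
    finally:
        conn.close()


def sqlite_compile_options():
    """Return the compile-time options reported by SQLite."""
    conn = sqlite3.connect(":memory:")
    try:
        return [row[0] for row in conn.execute("PRAGMA compile_options")]
    finally:
        conn.close()


if __name__ == "__main__":
    print("SQLite library version:", sqlite3.sqlite_version)
    for v in (3, 4, 5):
        print(f"FTS{v} available:", sqlite_has_fts(v))
    print([opt for opt in sqlite_compile_options() if "FTS" in opt])
```

Note that this probes the SQLite library used by the `sqlite3` module of the running Python, which may differ from the `sqlite3` command-line binary on the same machine.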
|
I don't think this has to do with FTS, as FTS involves only search and the
entry is found. What is quite suspicious is that the bibtex entry has two
author fields, the second one being "UNKNOWN", which is probably what is
taken for generating the key. I also suspect a problem when
importing/parsing the entry. Unfortunately I am also not able to reproduce
it on my system :-(
What version of pybtex are you using? Did this happen with an empty
database or did you include the entry after adding others?
|
Yes, it is quite strange to see two author fields. I agree it looks like a parsing error. Do you have time and interest to debug this? And can you provide more details about your environment (OS, Python version, etc.)? |
Oh, I didn't see the second author field above. Yes, let me dig into it a little. Sorry for repeatedly updating my previous comment; you probably did not receive separate notifications for those edits. I tried with pybtex 0.21 and 0.22 and got the same result. |
No need to apologize—we're happy to have someone point out a bug and go through the work of trying to fix it. I think it should be easy to track down: either pybtex parsing is broken (which would be strange, since this entry is fairly standard), or our code is broken. I'm curious what pybtex.Entry items look like here after parsing. |
I think there's a very weird thing going on. I tried also on my desktop, same issue. The problem is this:
This makes me think whether we are using two completely different |
What happens if you try to import this version?
@InProceedings{D18-2029,
author = {Cer, Daniel and Yang, Yinfei and Kong, Sheng-yi and Hua, Nan and Limtiaco, Nicole and St. John, Rhomni and Constant, Noah and Guajardo-Cespedes, Mario and Yuan, Steve and Tar, Chris and Strope, Brian and Kurzweil, Ray},
title = {Universal Sentence Encoder for English},
booktitle = {Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
year = {2018},
publisher = {Association for Computational Linguistics},
pages = {169--174},
location = {Brussels, Belgium},
url = {http://aclweb.org/anthology/D18-2029}
}
On Nov 20, 2018, at 9:22 AM, Ozan Çağlayan wrote:
I think there's a very weird thing going on. I tried also on my desktop, same issue. The problem is this: pybtex.Entry never has an author field for me, and that's why the code injects an UNKNOWN author. For me, all authors are inside entry.persons['author']. This is also how the pybtex documentation describes it; see https://docs.pybtex.org/api/parsing.html
>>> from pybtex.database import parse_file
>>> bib_data = parse_file('../examples/tugboat/tugboat.bib')
>>> print(bib_data.entries['Knuth:TB8-1-14'].fields['title'])
Mixing right-to-left texts with left-to-right texts
>>> for author in bib_data.entries['Knuth:TB8-1-14'].persons['author']:
... print(unicode(author))
Knuth, Donald
MacKay, Pierre
This makes me wonder whether we are using two completely different versions of pybtex, i.e. maybe an old fork that provided the authors within fields, versus the one that gets installed (for me) through pip, which does not seem to provide this.
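To make the distinction concrete, here is a minimal sketch of reading the first author from entry.persons and falling back to UNKNOWN only when no author is present. It is not bibsearch's actual key-generation code: the function name `first_author_last_name` is hypothetical, and the stand-in objects below only mimic the `fields`, `persons` and `last_names` attributes of a parsed pybtex entry so that the snippet runs without pybtex installed.

```python
from types import SimpleNamespace


def first_author_last_name(entry, fallback="UNKNOWN"):
    """Return the first author's last name, read from entry.persons.

    Looking in entry.fields instead would miss the authors entirely,
    because pybtex keeps people in entry.persons.
    """
    persons = entry.persons
    if "author" in persons and persons["author"]:
        last_names = persons["author"][0].last_names
        if last_names:
            return " ".join(last_names)
    return fallback


# Stand-in for a parsed entry (assumption: the same attribute names as pybtex).
entry = SimpleNamespace(
    fields={"title": "Universal Sentence Encoder for English", "year": "2018"},
    persons={"author": [SimpleNamespace(last_names=["Cer"]),
                        SimpleNamespace(last_names=["Yang"])]},
)
no_author = SimpleNamespace(fields={"title": "Untitled"}, persons={})

print("author" in entry.fields)            # False: authors are not in fields
print(first_author_last_name(entry))       # Cer
print(first_author_last_name(no_author))   # UNKNOWN
```

With a real pybtex entry, the same function should work unchanged as long as the installed version exposes `last_names` on its person objects.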
|
Saved it into a local file
|
Can you try to run this file?
#!/usr/bin/env python
import pybtex.database
BIBTEX="""\
@InProceedings{D18-2029,
author = {Cer, Daniel and Yang, Yinfei and Kong, Sheng-yi and Hua, Nan and Limtiaco, Nicole and St. John, Rhomni and Constant, Noah and Guajardo-Cespedes, Mario and Yuan, Steve and Tar, Chris and Strope, Brian and Kurzweil, Ray},
title = {Universal Sentence Encoder for English},
booktitle = {Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
year = {2018},
publisher = {Association for Computational Linguistics},
pages = {169--174},
location = {Brussels, Belgium},
url = {http://aclweb.org/anthology/D18-2029}
}"""
if __name__ == '__main__':
    print('Pybtex version: ', pybtex.__version__)
    library = pybtex.database.parse_string(BIBTEX, 'bibtex')
    entry = library.entries['D18-2029']
    print('author in entry.fields?', 'author' in entry.fields)
    print('author in entry.persons?', 'author' in entry.persons)
Output:
|
|
Then how can you escape from having
|
Yes, I don't understand either. I tried downgrading to pybtex 0.20.0 and even 0.19.0, but still get I'll have to look into this later tonight, or maybe @davvil has an idea. This is strange. |
If you remove your already generated bibdb, can you still add this entry correctly with author information? |
Yeah, it works perfectly.
$ mv ~/.bibsearch ~/.bibsearch.bak
$ bibsearch add d.bib # contains the entry we're playing with
Added 1 entries, skipped 0 duplicates. Skipped 0 files
$ bibsearch print
@InProceedings{cer2018a:universal,
author = "Cer, Daniel and Yang, Yinfei and Kong, Sheng-yi and Hua, Nan and Limtiaco, Nicole and St. John, Rhomni and Constant, Noah and Guajardo-Cespedes, Mario and Yuan, Steve and Tar, Chris and Strope, Brian and Kurzweil, Ray",
title = "Universal Sentence Encoder for English",
booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
year = "2018",
publisher = "Association for Computational Linguistics",
pages = "169--174",
location = "Brussels, Belgium",
url = "http://aclweb.org/anthology/D18-2029",
original_key = "D18-2029"
}
|
Looking through the code of pybtex, I see that the |
You can try to reproduce in here: |
Sorry about the delay—I'll pick this up after NAACL. |
Again, sorry about the delay. I was able to reproduce the issue on another computer and I have committed a fix for it. Please try the current master on GitHub, which should address this issue. After the three of us do some testing, we should update the pip package ASAP. I am also baffled as to why it worked before. Perhaps we were relying on a byproduct of the parsing itself. |
It seems to work on my side for the specific example above.
|
I just fixed the key generation (it took the original key before) and I also fixed an error when importing entries with unknown macros. It seems to work quite well now. @mjpost what do you think? Can you update PyPI? |
Sure, can you bump the version and add to the change log? Then I'll push. |
Done! We are now at version π. |
Pushed to PyPI. |
Hello,
Thanks for this wonderful project that I discovered this morning. I'm not sure if this is related to sqlite3 having no support for FTS, but I have a problem with author names (both during search and in the returned keys):
Looking through the sqlite file, I see this: