I managed to edit the example from the wiki into a valid BibTeX item that is not correctly parsed by bibtexparser 0.6.0.
It looks as follows (I've removed the multiline field for simplicity):
@ARTICLE{Cesar2013
, author = {Jean César}
, title = {An amazing title}
, year = {2013}
, month = jan
, volume = {12}
, pages = {12--23}
, journal = {Nice Journal}
, abstract = {This is an abstract. This line should be long enough to test}
, comments = {A comment}
, keywords = {keyword1, keyword2}
}
The comma-first syntax is valid in BibTeX: I have a reasonably big BibTeX database in a working project, and good ol' Patashnik's bibtex has no problems with it. Patashnik's parser uses a BNF coding, so it does not care where lines start or end.
On the other hand, bibtexparser only splits on commas at the end of lines (see bparser.py), which does not work for the comma-first syntax. If you change
kvs = [i.strip() for i in record.split(',\n')]
to
kvs = [i.strip() for i in record.split(',')]
at line 239 of bparser.py, it seems to do the trick and parse the file correctly.
This change should have no impact on the rest of the package, as the newline is stripped by i.strip() right away, in the same list comprehension.
I have tested this change with and without multiline fields and with and without comma-first syntax, and it seems to do fine.
If no one has anything against BibTeX comma-first syntax (Algol60 purists maybe?), I'll make a pull request in 24-48h.
After some more testing, my solution shows its flaws: conference locations very often have a comma in them, e.g. "New York, US" or "London, UK", and the solution above breaks these lines into different (broken) key-value pairs.
Another solution is to use re.split() as follows:
kvs = [i.strip() for i in re.split(r',\s*\n|\n\s*,', record)]
which overcomes the conference location problem.
Still, I will write a proper unit test before going forward with it (but the pull request will come).
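To make the difference between the three splitting strategies concrete, here is a minimal standalone sketch of what such a test could compare. The `record` string and the helper names `split_eol`, `split_any` and `split_regex` are made up for illustration and are not taken from bparser.py; only the split expressions themselves come from the discussion above.

```python
import re

# Hypothetical comma-first record body (illustration only, not from bparser.py).
record = (
    "Cesar2013\n"
    ", author = {Jean César}\n"
    ", address = {New York, US}\n"
    ", year = {2013}\n"
)


def split_eol(rec):
    # Current behaviour: split only on a comma directly followed by a newline.
    return [i.strip() for i in rec.split(',\n')]


def split_any(rec):
    # First proposal: split on every comma, wherever it appears.
    return [i.strip() for i in rec.split(',')]


def split_regex(rec):
    # Second proposal: split on a comma at the end of a line
    # or a comma at the start of a line.
    return [i.strip() for i in re.split(r',\s*\n|\n\s*,', rec)]


print(split_eol(record))    # the whole record stays in one piece
print(split_any(record))    # "New York, US" is cut in two
print(split_regex(record))  # one item per field, location intact
```

Note that the regex version would presumably still split inside a multiline field value whenever a line of that value happens to end with a comma, so a proper test should cover that case as well.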