Error for some special characters when running doi2pdf #8

Closed
kellertuer opened this issue Jul 12, 2021 · 2 comments

Comments

@kellertuer
Contributor

On my machine when running doi2pdf 10.1137/0803026

I get

usage: grep [-abcDEFGHhIiJLlmnOoqRSsUVvwxZ] [-A num] [-B num] [-C[num]]
	[-e pattern] [-f file] [--binary-files=value] [--color=when]
	[--context[=num]] [--directories=action] [--label] [--line-buffered]
	[--null] [pattern] [file ...]
Traceback (most recent call last):
  File "/Users/ronnber/Repositories/numapde/numapde-bibliography/mathbin/doi2pdfname.py", line 7, in <module>
    tree = etree.parse(sys.argv[1])
  File "src/lxml/etree.pyx", line 3521, in lxml.etree.parse
  File "src/lxml/parser.pxi", line 1859, in lxml.etree._parseDocument
  File "src/lxml/parser.pxi", line 1885, in lxml.etree._parseDocumentFromURL
  File "src/lxml/parser.pxi", line 1789, in lxml.etree._parseDocFromFile
  File "src/lxml/parser.pxi", line 1177, in lxml.etree._BaseParser._parseDocFromFile
  File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError
  File "/tmp/368219713", line 32
lxml.etree.XMLSyntaxError: Entity 'nbsp' not defined, line 32, column 60

To me this looks like an &nbsp; (non-breaking space in HTML) that is not properly handled or escaped when the script tries to generate the file name?
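
The XMLSyntaxError above is expected for strict XML parsing: nbsp is an HTML entity, not one of the five entities predefined in XML (amp, lt, gt, quot, apos). One possible workaround, sketched here with Python's standard library rather than the lxml call used in doi2pdfname.py, is to rewrite named HTML entities as numeric character references before parsing. The helper names html_entities_to_numeric and parse_lenient are hypothetical and not part of the repository.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Entities every XML parser already understands; leave these untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def html_entities_to_numeric(text):
    """Replace named HTML entities (e.g. &nbsp;) with numeric references."""
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # keep XML entities and unknown names as-is
        return "&#%d;" % name2codepoint[name]
    return _ENTITY.sub(repl, text)


def parse_lenient(text):
    """Parse XML text that may contain HTML-only named entities."""
    return ET.fromstring(html_entities_to_numeric(text))


# Hypothetical sample input, not taken from the actual downloaded file.
sample = "<title>Foo&nbsp;Bar &amp; Baz</title>"
title = parse_lenient(sample)
```

Numeric references are valid in any XML document, so the rewritten text parses without a DTD. Whether this preprocessing step fits into the existing script pipeline is an open question.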

@gerw
Owner

gerw commented Jul 12, 2021

It seems that your grep does not support -P (Perl regular expressions): the usage message it prints lists no -P option.
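
A grep without -P cannot evaluate Perl-style constructs such as lookbehind. As a rough sketch of a system-independent alternative, the matching could be done with Python's re module, which supports many (though not all) Perl-style constructs. The function name grep_o and the pattern below are hypothetical; the pattern actually used by the script is not shown in this thread.

```python
import re


def grep_o(pattern, text):
    """Return every match of pattern per line, similar in spirit to `grep -oP`."""
    matches = []
    for line in text.splitlines():
        matches.extend(m.group(0) for m in re.finditer(pattern, line))
    return matches


# Hypothetical example: pull a DOI that follows the literal prefix "doi:".
found = grep_o(r"(?<=doi:)\S+", "see doi:10.1137/0803026 for details")
```

Python's regular expression syntax differs from PCRE in some details, so a pattern written for grep -P may need small adjustments.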

@kellertuer
Contributor Author

kellertuer commented Jul 12, 2021

Hm, that is interesting, because this is the same machine on which I wrote the tutorial that is now in the Readme, and I am not aware of having changed grep.

Edit: Maybe it would be better to make these scripts a little more system independent. The only thing keeping me from starting on that is that the PHP code is not easy to read, i.e. it is a little hard to understand what it does and how it actually does it. Maybe I'll try to reimplement this in a single language without so many dependencies at some point.

2 participants