command-line-interface #27
naustica commented Apr 18, 2020 (edited)
- Command-line interface
- Unit tests for cli
- Functions to download and view PDFs
It looks good to me, but the open command does not work on my system (Ubuntu). xdg-open does work though.
Apparently you can find out the operating system from within Python, but I don't know if that is useful.
https://stackoverflow.com/questions/1854/what-os-am-i-running-on/58071295#58071295
I am not sure how to do this in Windows though.
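A minimal sketch of how the viewer command could be picked per platform, assuming the standard-library `platform.system()` check from the linked answer. The helper names `viewer_command` and `open_file` are hypothetical and not taken from the pull request. On Windows the standard library offers `os.startfile`, which only exists on that platform.

```python
import os
import platform
import subprocess


def viewer_command(path, system=None):
    """Return the argv list that opens `path`, or None on Windows.

    `system` defaults to platform.system(), which reports
    "Linux", "Darwin" (macOS) or "Windows".
    """
    system = system or platform.system()
    if system == "Darwin":
        return ["open", path]
    if system == "Windows":
        # No external command needed; os.startfile is used instead.
        return None
    # Assumption: other systems are Linux-like desktops that ship xdg-open.
    return ["xdg-open", path]


def open_file(path):
    """Open `path` with the default application of the current platform."""
    command = viewer_command(path)
    if command is None:
        os.startfile(path)  # available on Windows only
    else:
        subprocess.run(command, check=True)
```

Keeping the command selection in its own function makes it possible to unit test the branching without launching a viewer.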
It might also be good to use '.' as a default for the path argument.
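As an illustration of that default, here is a sketch that assumes the CLI is built with `argparse`. The program name, the `doi` positional and the `--path` flag spelling are hypothetical; only the idea of defaulting the path to '.' comes from the comment above.

```python
import argparse


def build_parser():
    """Build a hypothetical parser whose path option defaults to the current directory."""
    parser = argparse.ArgumentParser(prog="example-cli")
    parser.add_argument("doi", help="DOI of the article to download")
    parser.add_argument(
        "-p",
        "--path",
        default=".",  # fall back to the current working directory
        help="directory to save the PDF in (default: current directory)",
    )
    return parser
```

With this setup, omitting the option leaves `args.path` set to `"."`, so the download code does not need a separate branch for a missing path.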
Thanks, I will fix that. You mean the open command on line 301, right?
Also, is there an alternative way to express urllib.request.urlopen(pdf_link) with the requests library?
Yes, I guess you could put the response text into a BytesIO object. Something like:
from io import BytesIO
response = requests.get("http://google.com")
BytesIO(bytearray(response.text, encoding="utf-8"))
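One caveat with the snippet above: `response.text` is the decoded string form of the body, and decoding binary data such as a PDF and re-encoding it as UTF-8 does not in general give back the original bytes. `response.content` holds the raw bytes and can be wrapped directly. The sketch below is one possible variant, not the code from this pull request. The helper name `fetch_pdf_stream` and its `get` parameter are hypothetical, added so the function can be exercised without network access.

```python
from io import BytesIO


def fetch_pdf_stream(url, get=None):
    """Download `url` and return the body as an in-memory binary stream.

    `get` is a hypothetical injection point; it defaults to requests.get.
    """
    if get is None:
        import requests  # third-party; imported lazily so a stub can replace it

        get = requests.get
    response = get(url)
    response.raise_for_status()
    # .content is the undecoded byte string, so binary data stays intact.
    return BytesIO(response.content)


# Why decoding first is risky: bytes that are invalid UTF-8 do not round-trip.
raw = b"%PDF-1.4\n\xe2\xe3\xcf\xd3"
lossy = raw.decode("utf-8", errors="replace").encode("utf-8")
```

The returned object behaves like the file-like object from `urllib.request.urlopen` for readers that only need `read()` and `seek()`.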
@bganglia Thanks. Did you test the BytesIO variant with your pdfminer?
@naustica Not yet. Let me try that now.
@naustica It looks like it works the same as urllib. For example, this one works:
>>> minecart.Document(io.BytesIO(bytearray(requests.get("https://www.jns-journal.com/article/S0022-510X(20)30168-4/pdf").text,encoding="utf-8")))
WARNING:root:Cannot locate objid=215
<minecart.miner.Document object at 0x7f1a71833110>
But now that I look at it more closely, sometimes both urllib and requests fail because a link like
@bganglia That looks interesting. Let me know when you find out something new.
@bganglia Can you review my code again? I would then merge the code into the develop branch.
@naustica I am taking a look at it right now. If no PDF is available, Everything else works great on Linux. I can try it on Windows too.
@bganglia I would tackle the issue when I'm writing the remaining tests for the unpywall class, if that's OK with you.
@naustica Ok, sounds good. I would say that it looks great to merge then.