missing requirements #15

Open
ottowg opened this issue Jan 8, 2020 · 3 comments

Comments

ottowg commented Jan 8, 2020

beautifulsoup4 is missing in requirements.
pdfminer is missing in requirements.
requests-html is missing in requirements.
ray is missing in requirements.

ceteri added a commit that referenced this issue Jan 8, 2020
Contributor

ceteri commented Jan 8, 2020

Thank you -- BS4 was missing.

The others were there in requirements.txt:

  • pdfminer.six
  • ray
  • requests-html

But were there any problems using those three libraries?
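
One possible source of confusion here is that pip distribution names differ from import names: `pdfminer.six` is imported as `pdfminer`, `beautifulsoup4` as `bs4`, and `requests-html` as `requests_html`. A minimal sketch for checking which of the four distributions named in this thread are actually installed in the current environment, assuming Python 3.8+ (for `importlib.metadata`). The helper name `missing_distributions` and the `REQUIRED` list are hypothetical and not part of the project:

```python
from importlib import metadata  # standard library since Python 3.8

# Distribution (pip) names as discussed in this thread, not import names.
# This list is an assumption for illustration; the project's requirements.txt
# remains the authoritative source.
REQUIRED = ["beautifulsoup4", "pdfminer.six", "ray", "requests-html"]


def missing_distributions(names):
    """Return the names that have no installed distribution metadata."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_distributions(REQUIRED):
        print(f"not installed: {name}")
```

If this prints nothing after `pip install -r requirements.txt`, the distributions are present, and any remaining failure is more likely an import-time or version-compatibility problem than a missing requirement.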

ceteri added a commit that referenced this issue Jan 8, 2020
add missing bs4 to requirements for #15
Author

ottowg commented Jan 8, 2020 via email

Contributor

ceteri commented Jan 13, 2020

Thank you @ottowg, it is very helpful to know about the Py 3.8 errors on Ubuntu.

> I was able to download 1379 of the 1662 pdfs.
> Is this a comparable result?

Yes, that's the number that we saw for the PDF downloads without errors. There's a task in progress to troubleshoot the download process: #6

FWIW, we're running on Ubuntu on our cloud instances, although generally with Py 3.6. We'll try to troubleshoot further.
