Quota option seems to leave wpull hanging #361

Open
newmanships opened this issue Mar 28, 2017 · 0 comments
What I wanted: (Describe briefly what you want to achieve here)
Use Wpull's --quota option to stop the crawl after a given number of bytes has been downloaded to the WARC file.
What I expect: (Describe briefly how you think the program/feature will work)
I expect Wpull to stop crawling and exit once it hits the quota. Instead, it stops crawling but leaves Wpull hanging (see the output below):

The command or website causes the problem: (Copy the options provided to Wpull here)
wpull http://www.techcrunch.com --quota 1000000 --warc-file techcrunch.com -rH --warc-cdx --level 2 -Dtechcrunch.com --no-check-certificate
Reducing the quota below this value works as expected, for example:
wpull http://www.techcrunch.com --quota 100000 --warc-file techcrunch.com -rH --warc-cdx --level 2 -Dtechcrunch.com --no-check-certificate (one fewer 0)
Operating system: (Write your OS name here such as Windows 10/Ubuntu Linux 14.04 32-bit/OS X 10.10)
OS X 10.11.6
Python version: (What does python --version say?)
3.6.1
Wpull version: (What does wpull --version say?)
2.0.1
Log/Output:

```
  [           O             ] 6.0 B 0:01:49 -1.1 KiB/s
INFO Fetched ‘https://techcrunch.com/2017/03/28/uber-restarts-self-driving-passenger-pilots-in-arizona-and-pittsburgh/?ncid=mobilenavtrend’: 200 OK. Length: unspecified [text/html; charset=UTF-8].
```
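For anyone trying to reproduce this, the hang can be told apart from a slow crawl by running the command under a timeout. The following is only a rough sketch, not part of Wpull: `run_with_timeout` is a hypothetical helper name, and the timeout value is an assumption that should be set well above the time the crawl normally needs to reach the quota.

```python
import subprocess

# Command copied from the report above (the quota value that triggers the hang).
REPRO_CMD = [
    "wpull", "http://www.techcrunch.com",
    "--quota", "1000000",
    "--warc-file", "techcrunch.com",
    "-rH", "--warc-cdx", "--level", "2",
    "-Dtechcrunch.com", "--no-check-certificate",
]


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd and return (exit_code, hung).

    hung is True when the process was still running after timeout_seconds
    and had to be killed; exit_code is None in that case.
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_seconds), False
    except subprocess.TimeoutExpired:
        proc.kill()   # the process did not exit on its own
        proc.wait()   # reap the killed process
        return None, True


# Example use (the 600-second limit is an arbitrary assumption):
#   exit_code, hung = run_with_timeout(REPRO_CMD, 600)
#   print("hung" if hung else "exited with code {}".format(exit_code))
```

If `hung` comes back `True` with the larger quota but `False` with the smaller one, that would match the behavior described above.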

chfoo added the bug label Apr 14, 2017