Add option for a delay between request #133

Closed
XiangRongLin opened this issue Jan 22, 2022 · 2 comments · Fixed by #134

Comments

@XiangRongLin
Contributor

Feature description

When multiple URLs are passed in, I want to be able to specify a delay between requests to the website.

My use case is downloading all chapters from a table of contents, where I imagine I would quickly get blocked if hundreds of requests were sent as fast as possible.

Existing workarounds

Is there any way to obtain the desired effect with the current functionality?

Not that I know of, because I want the output to be combined into a single epub file.

@danburzo
Owner

Hi @XiangRongLin, thank you for the report. In general, I've avoided implementing options for fetching pages, since that opens up a whole new dimension of configuration (do we support delays / parallelism? proxies? authentication headers? etc.). Instead, you can combine the - operand with the --url option to offload fetching to a separate program (e.g. curl), as below:

curl https://example.com | percollate pdf - --url=https://example.com

For bundling multiple pages into a single EPUB, the workaround is admittedly a bit convoluted:

  1. fetch each page using curl and feed it to percollate html with the - operand and the --url option, using your desired parallelism and delay between requests.
  2. feed all local HTML pages to percollate epub.
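
The two steps above could be scripted roughly as follows. This is only a sketch, assuming curl and percollate are on the PATH and that the --output option is available; the fetch_chapters helper, the chapter-NNN.html file naming, and the directory layout are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: fetch each URL sequentially, pausing between requests,
# and store percollate's processed HTML as zero-padded files so that a shell
# glob later returns them in the original order.
# Usage: fetch_chapters <out_dir> <delay_seconds> <url>...
fetch_chapters() (
  out_dir=$1
  delay=$2
  shift 2
  mkdir -p "$out_dir"
  n=0
  for url in "$@"; do
    n=$((n + 1))
    name=$(printf 'chapter-%03d.html' "$n")
    # Step 1: curl fetches the page; percollate reads it from stdin (-)
    # and --url tells it the page's original address.
    curl -fsSL "$url" | percollate html - --url="$url" --output="$out_dir/$name"
    # Pause before the next request; no pause after the last URL.
    if [ "$n" -lt "$#" ]; then
      sleep "$delay"
    fi
  done
)

# Example invocation (URLs are placeholders):
#   fetch_chapters pages 5 https://example.com/ch1 https://example.com/ch2
# Step 2: bundle the local HTML files into a single EPUB:
#   percollate epub --output=book.epub pages/chapter-*.html
```

The requests run one at a time here, so the delay is the only knob; running several curl processes in parallel would need a different loop.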

It might make sense to introduce an option to control parallelism and delay, such as:

percollate epub --wait=N url1 url2 ...

When --wait is supplied, percollate could switch from fetching in parallel to fetching sequentially, with a delay of N seconds between requests.

danburzo added a commit that referenced this issue Jan 24, 2022
Adds the -w, --wait=<sec> global CLI option to pause between processing URLs for a number of seconds. If unspecified, URLs are processed in parallel as before. Fixes #133.
@danburzo
Owner

The --wait option has been published in percollate@2.2.0.
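
For the original use case (many chapter URLs, one EPUB), the new option could be used along these lines. This is a sketch rather than an official recipe: the build_epub_with_wait helper, the urls.txt file with one URL per line, and the default 5-second delay are assumptions made for illustration; only --wait and the epub command come from this thread.

```shell
# Hypothetical helper: read URLs from a file (one per line, blank lines
# skipped) and pass them to a single percollate invocation with --wait.
# Usage: build_epub_with_wait <url_file> <output.epub> [delay_seconds]
build_epub_with_wait() (
  url_file=$1
  output=$2
  delay=${3:-5}
  # Collect the URLs as positional parameters so each stays one argument.
  set --
  while IFS= read -r url || [ -n "$url" ]; do
    if [ -n "$url" ]; then
      set -- "$@" "$url"
    fi
  done < "$url_file"
  percollate epub --wait="$delay" --output="$output" "$@"
)

# Example invocation:
#   build_epub_with_wait urls.txt book.epub 3
```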
