why await? #10

Closed
SidKwok opened this issue Sep 6, 2018 · 10 comments
Labels
discussion Talk about features or implementation

Comments

@SidKwok

SidKwok commented Sep 6, 2018

Cool project, but I am confused about why you use await in your example:

const { Cluster } = require('puppeteer-cluster');

(async () => {
  const cluster = await Cluster.launch({
    concurrency: Cluster.CONCURRENCY_CONTEXT,
    maxConcurrency: 2,
  });
  
  // Is `await` necessary?
  await cluster.task(async ({ page, data: url }) => {
    await page.goto(url);
    const screen = await page.screenshot();
    // Store screenshot, do something else
  });

  await cluster.queue('http://www.google.com/');
  await cluster.queue('http://www.wikipedia.org/');
  // many more pages

  await cluster.idle();
  await cluster.close();
})();

When you define a task or add items to the queue, why await? I tried removing the awaits and it still works fine.

@thomasdondorf
Owner

The await there is currently not really needed (it's also not needed for cluster.queue by the way). It's there for two reasons:

  1. Consistency: All functions on cluster.* return Promises. That way you don't need to worry whether to use await or not when dealing with the cluster functions. Just use it everywhere and you are fine.
  2. Forward compatibility: Right now the functions task and queue have only synchronous calls inside, so Promises are not strictly needed. I can imagine this changing in the future, and there are several use cases for it. Imagine we want to log data asynchronously when queue or task is called. Or maybe we want to store the queued data in a database instead of an array.
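
To illustrate the forward-compatibility argument, here is a minimal sketch. `ArrayQueue`, `SlowStoreQueue`, and `enqueueAll` are hypothetical names invented for this example and are not the puppeteer-cluster implementation; the `setTimeout` delay only simulates an asynchronous write. The point is that caller code written with `await` keeps working if a synchronous method later becomes truly asynchronous.

```javascript
// Hypothetical stand-in for a job queue; NOT the puppeteer-cluster source.
class ArrayQueue {
  constructor() {
    this.jobs = [];
  }

  // Declared async so it always returns a Promise, although the body
  // contains only synchronous work today.
  async queue(data) {
    this.jobs.push(data);
  }
}

// A possible future variant that really is asynchronous, for example
// because it persists the job somewhere before resolving.
class SlowStoreQueue {
  constructor() {
    this.jobs = [];
  }

  async queue(data) {
    // setTimeout simulates an asynchronous write (assumption for the demo).
    await new Promise((resolve) => setTimeout(resolve, 10));
    this.jobs.push(data);
  }
}

// Caller code written with `await` works unchanged against both variants.
async function enqueueAll(queue, urls) {
  for (const url of urls) {
    await queue.queue(url);
  }
  return queue.jobs.length;
}

async function demo() {
  const urls = ['http://www.google.com/', 'http://www.wikipedia.org/'];
  console.log(await enqueueAll(new ArrayQueue(), urls)); // 2
  console.log(await enqueueAll(new SlowStoreQueue(), urls)); // 2
}

demo();
```

Without the `await` in `enqueueAll`, the second variant would return a count of 0 because the jobs would not have been stored yet.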

Hope that makes sense.

@thomasdondorf thomasdondorf added the discussion Talk about features or implementation label Sep 6, 2018
@SidKwok
Author

SidKwok commented Sep 7, 2018

Right now the functions task and queue have only synchronous calls inside.

Do you mean that cluster.queue('http://www.wikipedia.org/') will only be called after cluster.queue('http://www.google.com/') finishes?

@thomasdondorf
Owner

No, I meant there are no await (asynchronous) calls inside of the task and queue functions. The functions are very simple and don't need to be async right now.
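
A small sketch of what "no await inside" means in practice. `queueLike` is a hypothetical function made up for this illustration, not library code. An `async` function without any `await` in its body still returns a Promise, but the body itself runs to completion during the call:

```javascript
const log = [];

// Hypothetical function: it has the async keyword but no await inside.
async function queueLike(item) {
  log.push(`queued ${item}`); // runs immediately, during the call
}

log.push('before');
const result = queueLike('job-1'); // note: no await here
log.push('after');

console.log(result instanceof Promise); // true
console.log(log); // [ 'before', 'queued job-1', 'after' ]
```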

@SidKwok
Author

SidKwok commented Sep 7, 2018

I got it. The docs indicate that all functions on cluster.* return Promises. The example may give the impression that cluster.queue('http://www.wikipedia.org/') is only called after cluster.queue('http://www.google.com/') finishes. Take axios as an example:

await axios.get('http://api.org/get-name');
// this request starts only after the one above has finished
await axios.get('http://api.org/get-age');

// both requests start right away and run concurrently
axios.get('http://api.org/get-name');
axios.get('http://api.org/get-age');

Hope it only causes confusion for me lol.
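
The difference can be made visible without a network. The sketch below uses a hypothetical `fakeGet` helper (a `setTimeout`-based stand-in for a call like `axios.get`, invented for this example) and records when each request starts and ends:

```javascript
// Hypothetical stand-in for an HTTP call; resolves with `name` after `ms`.
function fakeGet(name, ms, events) {
  events.push(`start ${name}`);
  return new Promise((resolve) => {
    setTimeout(() => {
      events.push(`end ${name}`);
      resolve(name);
    }, ms);
  });
}

// With await between the calls, the second request waits for the first.
async function sequential() {
  const events = [];
  await fakeGet('get-name', 10, events);
  await fakeGet('get-age', 30, events);
  return events;
}

// Without await between the calls, both requests are in flight together.
async function concurrent() {
  const events = [];
  const first = fakeGet('get-name', 10, events);
  const second = fakeGet('get-age', 30, events);
  await Promise.all([first, second]);
  return events;
}

async function demo() {
  console.log(await sequential());
  // [ 'start get-name', 'end get-name', 'start get-age', 'end get-age' ]
  console.log(await concurrent());
  // [ 'start get-name', 'start get-age', 'end get-name', 'end get-age' ]
}

demo();
```

For a function whose body is fully synchronous, as described above for cluster.queue, the two styles do not differ in this way, which is why the `await` in the example can mislead.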

@thomasdondorf
Owner

Thank you. I had not thought about it that way. Maybe the await should be removed then?

I have to think about this. Feedback is welcome :)

@thomasdondorf thomasdondorf mentioned this issue Sep 7, 2018
15 tasks
@thomasdondorf
Owner

await really seems to confuse people: https://stackoverflow.com/questions/46597304/how-to-enable-parallel-tests-with-puppeteer/51408815#comment92960172_51408815

Currently I'm in favor of removing await from the documentation while still returning a Promise, so as not to break compatibility with previous versions.

@SidKwok
Author

SidKwok commented Oct 31, 2018

I think people would use queue to define a task and add it to the end of the queue. Is it really necessary to return a Promise in this case?

@thomasdondorf
Owner

thomasdondorf commented Feb 28, 2019

I implemented the execute function which hopefully makes things clearer. Right now I think there are three options on how to proceed:

  1. Remove Promise from queue function and return void instead of Promise<void>
  2. Still return a Promise, but remove all await cluster.queue(..) from documentation
  3. Continue as is.

I prefer option 2, as it keeps backward compatibility and should make the behavior clearer through the documentation.
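
A minimal sketch of why option 1 could break existing callers while option 2 does not. `queueVoid`, `queuePromise`, and the two caller functions are hypothetical names for this illustration and do not come from the library:

```javascript
// Hypothetical queue implementations illustrating options 1 and 2.
const jobs = [];

// Option 1: return void instead of a Promise.
function queueVoid(data) {
  jobs.push(data);
}

// Option 2: keep returning a Promise, although the work is synchronous.
function queuePromise(data) {
  jobs.push(data);
  return Promise.resolve();
}

// Existing caller code that chains .then() on the return value.
function chainingCaller(queueFn) {
  try {
    queueFn('http://www.google.com/').then(() => {});
    return 'ok';
  } catch (err) {
    return err.constructor.name;
  }
}

// Existing caller code that uses await; awaiting undefined is harmless.
async function awaitingCaller(queueFn) {
  await queueFn('http://www.wikipedia.org/');
  return 'ok';
}

console.log(chainingCaller(queuePromise)); // ok
console.log(chainingCaller(queueVoid)); // TypeError
awaitingCaller(queueVoid).then((status) => console.log(status)); // ok
```

Callers that only use `await` would survive either option; only code that calls methods such as `.then()` on the return value depends on the Promise.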

@SidKwok
Author

SidKwok commented Mar 6, 2019

Option 2 seems nice.

@thomasdondorf
Owner

I removed the awaits from all cluster.queue examples and added this hint to the docs:

"Be aware that this function only returns a Promise for backward compatibility reasons. This function does not run asynchronously and will immediately return."

I guess that should clear up any confusion in the future.

2 participants