Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Progress reporting #33

Closed
YurySolovyov opened this issue Jul 11, 2016 · 39 comments
Closed

Progress reporting #33

YurySolovyov opened this issue Jul 11, 2016 · 39 comments

Comments

@YurySolovyov
Copy link
Contributor

It would be nice to have an ability to hook into underlying stream(s) to have this. This might be more tricky to get right WRT multiple files, though. Opening issue first to discuss.

@sindresorhus
Copy link
Owner

What kind of progress are you interested in? Per-file or percentage? Or rather, what do you need it for?

@YurySolovyov
Copy link
Contributor Author

I need it for UI.
I'm interested in at least how many files are done copying.

Next, this is a matter of what is easier to implement:

  • total copied size + remaining size aggregated for all files.
  • copied + remainig size for current file

As writing this, I thought, maybe it is better to do it at cp-file level?
At least in terms of

  • copied + remainig size for current file

@jamestalmage
Copy link

It looks like cpy actually copies all files in parallel. I wonder how performant that is compared to doing it serially or with some limits on concurrency.

@YurySolovyov
Copy link
Contributor Author

@jamestalmage I think progress events can be throttled with some configurable or reasonable delay.
About concurrency: if it actually does unlimited parallel streams now, limiting this would be nice to have anyway.

@YurySolovyov
Copy link
Contributor Author

Seems to be doable with queue

@SamVerschueren
Copy link
Contributor

Or with promise-concurrent ;)

@YurySolovyov
Copy link
Contributor Author

Are we waiting for some action to move forward here?
I can do some prototyping and/or performance testing if needed.

@schnittstabil
Copy link
Collaborator

I can do some prototyping and/or performance testing if needed.

@YurySolovyov If you want to, please create a separate issue/PR for limiting parallel copying. Otherwise, things get too messy.

About reporting:
ECMAScript Promises and Promises/A+ don't support progress handlers atm.

Some approaches popped into my head:

  1. cpy(files, destination, …).then(onFulfilled , onRejected, onProgress) (see kriskowal/q)
  2. cpy(files, destination, {onProgress: …})
  3. cpy(files, destination, …) returns a thenable, node EventEmitter:
const c = cpy(files, destination);

c.on('progress', onProgress);
c.then(() => console.log('done'));

@YurySolovyov
Copy link
Contributor Author

@schnittstabil I thought about option 2 mostly because it does not break API and allows to detect if we don't need progress reporting, so users won't pay for the thing they don't use.

@schnittstabil
Copy link
Collaborator

@YurySolovyov I also prefer the second one, easy to switch to a promise-based solution in the future.

@SamVerschueren
Copy link
Contributor

Or do a breaking change and use Observables instead.

@YurySolovyov
Copy link
Contributor Author

@schnittstabil My concern is mainly about if this module is right place (I mean maybe cp-file is better place for it?) for this and if such functionality considered generally useful, so I don't waste time for stuff that would be rejected.

@schnittstabil
Copy link
Collaborator

schnittstabil commented Jul 13, 2016

@YurySolovyov cpy knows about the number of files and the sizes of these. Progress per file would be interesting too, but IMO can be introduced afterwards.

EDIT: Aww, the sizes aren't known by cpy atm, but cp-file doesn't know them either...

@sindresorhus
Copy link
Owner

@SamVerschueren I would prefer not making a breaking change at this point and while Observables are nice, the main use-case is just executing and knowing when it's done. Promises also have the benefit of being part of the language and it's common knowledge. Observables not so much.

@sindresorhus
Copy link
Owner

I like 3, but what kind of overhead would progress reporting really cause?

@YurySolovyov
Copy link
Contributor Author

@sindresorhus close to zero if no listener attached, and somewhat proportional to frequency of data events otherwise I guess.

@sindresorhus
Copy link
Owner

@schnittstabil Why do you prefer 2 over 3?

@SamVerschueren Which one do you prefer if we leave out Observables?

@SamVerschueren
Copy link
Contributor

Definitely 3 for me. execa kinda works the same way, it returns a thenable child process.

@YurySolovyov
Copy link
Contributor Author

EDIT: Aww, the sizes aren't known by cpy atm, but cp-file doesn't know them either...

Yeah, maybe we should put size reporting into cp-file and add file count reporting and aggregation to cpy ?

@sindresorhus
Copy link
Owner

👍

@YurySolovyov
Copy link
Contributor Author

@sindresorhus

👍

You mean that I should proceed with a PR for cp-file ? Just clarifying things :)

@sindresorhus
Copy link
Owner

sindresorhus commented Jul 13, 2016

I'm ok with the idea, but would like to hear from @schnittstabil first.

How do you plan to expose the sizes in cp-file?

@YurySolovyov
Copy link
Contributor Author

@sindresorhus not sure there is much options here: need to do stat call + count bytes on stream data events, or inspect bytesWritten with some interval, maybe add some throttling, does that sound ok?

@schnittstabil
Copy link
Collaborator

@schnittstabil Why do you prefer 2 over 3?

Mainly because 3 involves some multi-inheritance strangeness, e.g.:

(cpy()) instanceof Promise // => false

Thinking about it, a fourth solution, using delegation comes to my mind:

const c = cpy(files, destination);

c instanceof Promise // => true
c.events instanceof EventEmitter // => true

Definitely 3 for me. execa kinda works the same way, it returns a thenable child process.

I'm also fine with 3, mainly because, that would be consistent with the API-style of other sindresorhus projects.

By the way, I don't think there is real benefit for cp-file to use 3 or 4 - but that should be discussed somewhere else.

@YurySolovyov
Copy link
Contributor Author

I started with 2 for now to figure out how invasive it would be.
So far fits pretty good and keeps current API.

@schnittstabil
Copy link
Collaborator

schnittstabil commented Jul 14, 2016

@YurySolovyov In the meantime, I see a real benefit of EventEmitters:

cpy.on('start', filename => console.log(`copying of '${filename}' started`));

cpy.on('end', filename => console.log(`copying of '${filename}' ended`));

@YurySolovyov
Copy link
Contributor Author

@schnittstabil so you purpose to embed EE into promise returned by cp-file ?

@schnittstabil
Copy link
Collaborator

@schnittstabil so you purpose to embed EE into promise returned by cp-file ?

No, returned by cpy 😉

Something like:

module.exports = function (src, dest, opts) {
  // …
  var events = new EventEmitter()

  var p = globby(src, opts)
    // …
    // use events.emit or events in some way
    //...;

  p.events = events;

  return p;
}

@YurySolovyov
Copy link
Contributor Author

YurySolovyov commented Jul 14, 2016

I guess the something similar should be done at cp-file level so we can consume it here.

@schnittstabil
Copy link
Collaborator

@YurySolovyov Feel free to investigate what cp-file should transmit during copying and which approach fits best for cp-file - cpy is only one use case and consuming differences are very small.

@YurySolovyov
Copy link
Contributor Author

@schnittstabil one thing that bothers me a little with events embedding is that it mutates promise object which may lead to performance decrease. Though it is not yet known if it is the case, but it might be.

@sindresorhus
Copy link
Owner

@schnittstabil We don't have to inherit the EventEmitter, we just use it as a mixin on the Promise. That way we get events, but also the proper inheritance. With ES2015, we could even cheat it by using Symbol.hasInstance, but I would prefer the mixin way.

@schnittstabil
Copy link
Collaborator

Premature optimization is the root of all evil.
C. A. R. Hoare 😉

The mutation itself? Very, very unlikely - I can't image any problem.

Creating an EE object and emitting events, especially if there's no subscriber?
That depends, I think in most cases, the real bottleneck is the IO stuff (fs.stat, reading, writing, …).

@YurySolovyov
Copy link
Contributor Author

YurySolovyov commented Jul 22, 2016

Now we need to consume info from cp-file. Will hopefully look into that soon.

@sindresorhus
Copy link
Owner

Any interested in finishing this? No worries if not. Just thought I would ask :)

@YurySolovyov
Copy link
Contributor Author

Will try to look into it this weekend.

@YurySolovyov
Copy link
Contributor Author

Any ideas about shape of the progress object?
I'm thinking of something like:

{
  completedFiles: Number,
  totalFiles: Number,
  completedSize: Number,
  totalSize: Number
}

Thing is that getting total size might add a delay proportional to number of files to copy.
It might be ok thing, just waned to raise the question.

@sindresorhus
Copy link
Owner

Looks good, but I don't think totalSize makes sense.

@andrefgneves
Copy link
Contributor

Hi. I thought I'd take a shot at this. Let me know if you want me to change anything.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants