
Progress callbacks #990

Merged
merged 27 commits into from
Oct 25, 2012

Conversation

@ben ben (Member) commented Oct 17, 2012

Fetch, checkout, and clone currently report progress through a shared memory object, which requires a two-thread arrangement on the caller's side. This is a generally good idea, but not especially friendly to bindings.

This PR converts these three APIs to use inline callbacks for reporting progress, so no threading is necessary. It probably accomplishes the same goal as #890, and in a more binding-friendly way.

  • Checkout will report on every diff
  • Fetch will report on every object received, every object indexed, and (ideally) every 100kb or so received

data->found_submodules = true;
data->num_stages = 3;
Member

It seems that this could come fairly late in the checkout process (always in the second stage? possibly as the very last item of the second stage?). Is it possible that progress could jump back quite a bit (say, from ~100% to ~66% in the worst case)?

Member Author

It actually hits somewhere in the first stage, while missing items are being created. And yeah, I've seen the progress jump back from 50% to 33%. Is this better or worse than having an instantaneous jump from 66% to 100%?

We could also weight the third stage to be 10% of the total and always include it. It's only creating a relatively small number of directories, and I doubt we'll be implementing anything that includes fetching and checking out the submodules.

Member

I think having progress jump backwards is very problematic. It seems vastly preferable to have it jump forward. Imagine this being displayed as a progress bar in the UI: a backwards jump is not a good user experience.

By the way, I actually think that, as implemented, this jump will happen somewhere during the second phase (the first phase is removing files). You could move the jump to the first phase by replicating the "is there a submodule that would need to be checked out" logic while scanning through the deletes, but I opted not to do that since I knew that further rethinking of the progress reporting was still to come.

Another alternative is to exclude the third pass from the progress reporting completely since it just calls mkdir on the submodule. We could then require an explicit call to checkout submodules and/or add a submodule callback a la the notification callback.

Member Author

Yeah, the time spent doing mkdir is probably negligible. I'll weight that stage to 0%. 😉

@jamill jamill (Member) commented Oct 19, 2012

Thanks Ben! I think this looks pretty good to me!

Just two small questions about the progress value calculation.

  1. With the current logic, it seems that it would be possible for the progress value to jump back (if there are submodules).

  2. Does the checkout algorithm know the total number of files it is going to modify, and how many it has already updated? For instance, core Git (from observation) reports checkout progress in the following format: Checking out files: 100% (253/253), done. I assume the numbers refer to the number of files it is going to modify. Would that be a useful metric to report as well (if possible)? I am not sure how much work it would be to calculate this data from the diff structures (# of updated files, total # of files to update), or if it is even possible. Just a thought - it seems that this would also avoid the backward progress (if that is an issue).

@ben ben (Member Author) commented Oct 19, 2012

Thanks for the review!

  1. See above.
  2. I'm being a little lazy here. Yes, we know how many steps the checkout will go through, but git_diff_foreach passes in a float for progress, so that's what I used. I'll put that on my list.

ben and others added 22 commits October 19, 2012 19:34
Also removing all the *stats parameters from external
APIs that don't need them anymore.
Also converted the network example to use it.
git_index_read_tree() was exposing a parameter to provide the user with
a progress indicator. Unfortunately, due to the recursive nature of the
tree walk, the maximum number of items to process was unknown. Thus,
the indicator was only counting processed entries, without providing
any information about the number of remaining items.
Also implemented in the git2 example.
The fetch code takes advantage of this to implement a
progress callback every 100kb of transfer.
Also, now only reporting checkout progress for files that
are actually being added or removed.
@ben ben (Member Author) commented Oct 20, 2012

There, I switched the checkout callbacks to current/total.

I've also added a network-transfer callback for when you're fetching large objects (try git2 clone https://github.com/PublicMapping/DistrictBuilder /tmp/foo and watch what happens at 91% of the fetch). @carlosmn will have something to say about this.

@carlosmn carlosmn (Member) commented
The only thing would be that the callback and bytes use different sources, and bytes is bound to lag a bit behind, as it only updates when we receive a full packet.

@ben ben (Member Author) commented Oct 22, 2012

The callback gets an accumulated number of bytes that's been returned from p_recv or SSL_read, which is the same place gitno_recv gets its number from. There will be a bit of jitter from sideband packets, so the absolute number of bytes will occasionally differ, but it's brought back into sync when a GIT_PKT_DATA packet has been completed.

@ben ben (Member Author) commented Oct 23, 2012

I think this is ready. Any more commentary?

git_repository **out,
const char *origin_url,
const char *dest_path,
git_indexer_progress_callback fetch_progress_cb,
Member

This irks me. Is indexer_progress the right name for a callback that will also report network operations?

Member Author

  • git_fetch_progress_callback?
  • git_network_progress_callback?
  • git_something_happened_update_your_ui?

Member Author

Also: git_indexer_stats is used to report network-transfer progress. Just sayin.

Member Author

There, I changed everything that was git_indexer_* to git_transfer_progress_*, and sane-ified the progress member naming. What do you think?

@jamill jamill (Member) commented Oct 24, 2012

git_remote_download still passes the bytes parameter by reference. Will consumers still have to poll in order to get this value?

I guess consumers could read the bytes value as part of the callback routine (either include a pointer to it in progress_payload or some other mechanism) - but will this be sufficient to cover the times that bytes are being received? That is, will there be extended periods where we are receiving bytes but not calling any callbacks?

git_indexer_stats and friends -> git_transfer_progress*
Also made git_transfer_progress members more sanely named.
@ben ben (Member Author) commented Oct 24, 2012

@jamill: how would you feel if I nuked bytes from git_remote_download? If the caller wants progress info, they can always call git_remote_stats and watch that buffer, or provide a callback.

@jamill jamill (Member) commented Oct 24, 2012

how would you feel if I nuked bytes from git_remote_download? If the caller wants progress info, they can always call git_remote_stats and watch that buffer, or provide a callback.

👍 I think that makes sense. The other reference parameters have already been removed. I think it would be consistent to remove bytes as well.

vmg pushed a commit that referenced this pull request Oct 25, 2012
@vmg vmg merged commit 1eb8cd7 into libgit2:development Oct 25, 2012
phatblat pushed a commit to phatblat/libgit2 that referenced this pull request Sep 13, 2014