Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clone #223

Merged
merged 1 commit into from Nov 12, 2012
Merged

Clone #223

merged 1 commit into from Nov 12, 2012

Conversation

ben
Copy link
Member

@ben ben commented Oct 11, 2012

This adds Repository.Clone, building on the just-merged checkout code.

Things this should be merged after:

It relies on types that are most naturally merged with them. I'll rebase once they're merged into vNext.

UPDATE

The two PRs mentioned have been merged. Any objections?

@ben
Copy link
Member Author

ben commented Oct 11, 2012

This will have conflicts with #221 (IndexerStats, etc.), so I'll rebase once that's merged.

@ethomson
Copy link
Member

Hey, this is great...!

It would be sort of nice if there was an enum for FileOpenFlags so that I don't have to remember that 0x0400 is O_EXCL...

It would also be nice if I could pass paths to clone... as a string[] or better, an ICollection and not having to create a git_strarray myself... (Is that struct even public? I'm not sure off the top of my head...)

If that were true, I think that GitCheckoutOptions would need to expose an ICollection to callers, and then there would need to be a private implementation in UnsafeNativeMethods that was actually the struct passed to the pinvokes that actually had the git_strarray...

It looks like there's helper methods in UnsafeNativeMethods for git_strarrays -> string[], it should be easy to add a BuildStrArrayOf from an ICollection. And maybe a corresponding ReleaseStrArray() that FreeHGlobals... So if your GitCheckoutOptions struct contained a git_strarray for paths instead of an IntPtr, then I think you could do something along the lines of:

strArray->size = (uint) stringCollection.count;
strArray->strings = (sbyte**)Marshal.AllocHGlobal(IntPtr.Size * stringCollection.Count);

int i = 0;
foreach (string stringItem in stringCollection)
{
strArray->strings[i] = (sbyte*)Utf8Marshaler.FromManaged(stringItem);
i++;
}

string url,
string destination,
GitIndexerStats fetch_stats = null,
GitIndexerStats checkout_stats = null,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgive me if I'm wrong about this elsewhere in LibGit2Sharp, but we should probably use C# conventions here, not C. So fetchStats and checkoutStats.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think about it, I didn't see any unit tests that used these parameters. It looks like they return the outcome of this method, which makes it odd that they're being passed in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@haacked Those parameters are supposed to be references to volatile data so that if the Clone is being run in one thread, another thread can check on the progress and potentially report on it. Since Clone can take a long time, this was our solution for handling progress without embedding threading or awkward progress callbacks into the libgit2 internals.

I have no comment re how to name things and/or manage volatile data idiomatically in C#.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a fair amount of discussion about this over on #213 (and code on #221). I'll see if I can't convert the progress reporting to callback delegates, which is way more idiomatic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @arrbee for the explanation. @ben I think callbacks would be a definite plus. Another option is to simply have an object represent the clone operation and it'd have events for the various callbacks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this closer to the platonic ideal?

var repo = new Clone() {
  SourceUrl = "https://github.com/libgit2/libgit2sharp",
  WorkDir = @"C:\Users\me\Desktop\libgit2sharp",
  Progress = (a,b,c) => { /* ... */ }
}.Execute();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, let me read through the existing discussion more carefully. In the current proposal, does the Repository.Clone method return immediately? I assume it can't (and must therefore block) until it's complete. Which is not ideal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think @dahlbyk's comments on Fetch apply to clone. So something like:

void Clone(string sourceUrl
  , string workingDirectory
  , CheckoutOptions checkoutOptions
  , ProgressHandler handler
  , Action<Repository> onComplete);

You might need to tweak that a bit as I'm not sure if you need 3 callbacks for the 3 different GitIndexerStats type, or just a single one that has the type of callback as a property. Either way, the basic pattern is to make this have callbacks. Having read @dahlbyk's comments, the callback approach is certainly easier to setup.

And this makes it easy for us to wrap Rx around it. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other words, the API should look almost exactly like Fetch, except with a few extra parameters needed for a clone. :)

@ben
Copy link
Member Author

ben commented Oct 11, 2012

@ethomson I'm with you! When it comes time to implement Checkout, CheckoutOptions will definitely be growing an ICollection for the string array. I wanted to get some early feedback – and it looks like I'm getting it. 😃 Thanks!

@nulltoken
Copy link
Member

👍 for callbacks as well!

@ethomson @ben Even if libgit2 makes this possible, I'm not sure I'm willing to expose the CheckoutOptions right now. I'd prefer keeping the Clone() signature as simple as possible for now.

@ethomson
Copy link
Member

@nulltoken: @ben's PR already exposes CheckoutOptions...? In any case, cloning w/ constrained paths is important enough to me that I'd follow up with a PR to add it immediately. :)

@haacked
Copy link
Contributor

haacked commented Oct 12, 2012

Just my $0.02 but try to avoid using optional parameters in public APIs. I've written about the versioning hell you can easily get into. This is why ASP.NET MVC removed all of them from its public APIs.

One option is to simply have two overloads. One simple, one with all the options.

BTW (stepping on my soapbox for a minute), the reliance on static methods can make an implementation difficult to extend. Could we consider having Clone and Fetch be methods of some object? Like a GitClient object or something?

The benefit then is that if this method is virtual, people can override it. It will allow me to mock this method in my unit tests easily. And lastly, it'll allow us to write extension methods that provide simpler calling patterns. steps off soapbox

@nulltoken
Copy link
Member

@ben's PR already exposes CheckoutOptions...?

You're right. However, I already asked @ben to remove it ;)

In any case, cloning w/ constrained paths is important enough to me that I'd follow up with a PR to add it immediately. :)

@ethomson As far as I know, git.git doesn't allow one to only fetch a subset of a directory. Thus, the whole repository would be cloned. In this case, only the checkout operation would consider these paths while updating the working directory.

I don't know about your use case, but if you're willing to avoid the direct checkout of the whole HEAD commit, I'd rather propose the following:

  • make repo.Clone() expose the equivalent of the --no-checkout option (I think that internally passing GIT_CHECKOUT_DEFAULT as the checkout strategy should do the trick). @ben how do you feel about this?
  • make repo.Checkout() accept a list of paths and make it leverage @ben's awesome git_checkout_xxx() libgit2 functions. (@ethomson I'd would be delighted to review a PR proposing this ;) )

Would combining those two steps above match your need?

Note: It's possible I'm far off the track and that I haven't understood your point. If this is the case, would you be so kind as to help me understand and reformulate?

@jamill
Copy link
Member

jamill commented Oct 12, 2012

Just my $0.02 but try to avoid using optional parameters in public APIs. I've written about the versioning hell you
can easily get into. This is why ASP.NET MVC removed all of them from its public APIs.

@haacked There has been a good bit of discussion on API for Fetch() in #213 and #221 (as noted above) - I agree that we should be consistent between Fetch() and Clone(). I would not be against removing the optional parameters and having a "simple" and "complete" flavor of each method, as you suggested, if that is the agreed upon pattern.

BTW (stepping on my soapbox for a minute), the reliance on static methods can make an implementation difficult
to extend. Could we consider having Clone and Fetch be methods of some object? Like a GitClient object or
something?

Fetch as currently implemented in PR #221 is on the Remote object (and not static). I suppose Clone is a little bit different in that there is not an existing Repository object.

@ethomson
Copy link
Member

@nulltoken Gotcha, sorry, I didn't see your comment to @ben. In any case, you're right - I don't necessarily care what the mechanism is by which I can accomplish this. This is a weird use case and calling checkout explicitly after a non-checking-out clone is fine with me.

@haacked
Copy link
Contributor

haacked commented Oct 12, 2012

@jamill yeah, Clone is different in that regards. I'm guessing there are other methods that are similar though like Repository.Init and Repository.Discover. Would be nice to make those methods of a new object and not static methods of Repository.

@ben
Copy link
Member Author

ben commented Oct 12, 2012

See 32b79b6 for an API proposal; it's obviously not complete, but it represents how this could work.

Unfortunately, getting progress information from a clone operation is inherently a two-thread deal right now. One thread has to do the work, another has to watch for changes in the data structures and call the user's callbacks. It's possible that libgit2 could grow the ability to do inline callbacks, but there are performance implications.

How icky would it be to spawn a new thread to call the native method inside Execute()?

@dahlbyk
Copy link
Member

dahlbyk commented Oct 15, 2012

Just my $0.02 but try to avoid using optional parameters in public APIs. I've written about the versioning hell you
can easily get into. This is why ASP.NET MVC removed all of them from its public APIs.

We are still very far from what I think any of us would consider to be a final public API. Dropping in a new version without recompiling your app is simply asking for trouble. Maybe a TODO leading up to the 1.0 milestone would be to lock down the API with overloads, but doing so now would be premature.

@haacked
Copy link
Contributor

haacked commented Oct 15, 2012

Sure, locking down overloads would be premature at this point. But I feel like adding methods with optional parameters is also premature. Just make all the parameters required (but nullable where appropriate). These things have a tendency to be forgotten about and stick around past their expiration date. ;)

@nulltoken
Copy link
Member

See 32b79b6 for an API proposal; it's obviously not complete, but it represents how this could work. Unfortunately, getting progress information from a clone operation is inherently a two-thread deal right now. One thread has to do the work, another has to watch for changes in the data structures and call the user's callbacks. It's possible that libgit2 could grow the ability to do inline callbacks, but there are performance implications. How icky would it be to spawn a new thread to call the native method inside Execute()?

@ben I'm not a huge fan of playing with threads from within LibGit2Sharp. As far as I can remember, @carlosmn wasn't strongly opposed to a callback (cf. #213 (comment)). There's indeed a trade-off to be agreed on regarding how often it's being invoked.

In other words, the API should look almost exactly like Fetch, except with a few extra parameters needed for a clone. :)

I'm with @haacked on this. Let's keep this simple.

@nulltoken
Copy link
Member

@haacked @dahlbyk I have a love/hate relationship with optional parameters. They're useful while you're iteratively sketching an API. However, I agree that by 1.0, we should have gotten rid of them in favor of overloads.

These things have a tendency to be forgotten about and stick around past their expiration date. ;)

  • Maybe is it time to start writing some contributing guidelines? ;-)
  • When the time comes to get rid of them, I think some Cecil love should help us list all the methods bearing optional parameters.

@haacked
Copy link
Contributor

haacked commented Oct 15, 2012

Great ideas! Perhaps running FxCop (or that Mono equivalent) on the final code before releasing 1.0 should be a required step. We don't have to accept all the suggestions, but it might help flag things we've forgotten about.

@ben
Copy link
Member Author

ben commented Oct 30, 2012

I've rebased this on top of @jamill's fetch update PR, so the only commit that's really new here is the last one. That one should be merged first; I'm just trying to avoid conflicts.

I've switched to encapsulating a clone operation as a new object. This allows for a lot more optional things to be specified without bunches of constructor overloads. Here's how you do something like git clone <url>:

Repository repo = new CloneOperation(url, workdirPath).Execute();

And here's git clone --bare <url> with progress reporting:

var cloneop = new CloneOperation(url, workdirPath)
{
   Bare = true,
   TransferProgress = (stat) => Console.WriteLine("Transfer: {0}/{1}/{2}",
        stat.ReceivedObjectCount, stat.IndexedObjectCount, stat.TotalObjectCount)
};
var repo = cloneop.Execute();

@nulltoken
Copy link
Member

I'd rather have a static ´Clone()´ method exposed by the ´Repository´ type. This method could accept a ´CloneOptions´ type (similar to the ´RepositoryOptions´ for instance) for optional parameters.

@ben
Copy link
Member Author

ben commented Oct 30, 2012

Like this?

var opts = new CloneOptions
{
   Bare = true,
   TransferProgress = (stat) => Console.WriteLine("Transfer: {0}/{1}/{2}",
        stat.ReceivedObjectCount, stat.IndexedObjectCount, stat.TotalObjectCount)
};
var repo = Repository.Clone(url, workdir, opts);

Or with defaults:

var repo = Repository.Clone(url, workdir);

@dahlbyk
Copy link
Member

dahlbyk commented Oct 30, 2012

A static Repository.Clone() is discoverable, but not particularly test-friendly. I'd include it as a shortcut, but hang the "real" Clone() off an interface.

@nulltoken
Copy link
Member

@ben ❤️ it!

@ben
Copy link
Member Author

ben commented Nov 2, 2012

@dahlbyk That sounds like a good idea if we get to a point where we need to test something that calls Clone. Right now we don't need to mock it out – it's a top-level concept, nothing else uses it. Or am I missing something?

@dahlbyk
Copy link
Member

dahlbyk commented Nov 2, 2012

I'm just thinking as a consumer, if I want to mock out code that performs a Clone() (or Init(), for that matter).

@ben
Copy link
Member Author

ben commented Nov 2, 2012

From a client's perspective, the Clone method would probably live on something like an ILibgit2Sharp interface, which they can mock to verify their own app's behavior.

There's an argument to be made for shipping that interface in the box, but I don't think it would be part of this PR.

@dahlbyk
Copy link
Member

dahlbyk commented Nov 2, 2012

That works for me, just thought I'd mention it.

@nulltoken
Copy link
Member

#238 is now merged

@jamill
Copy link
Member

jamill commented Nov 5, 2012

@ben @nulltoken Is this PR still waiting on #233 (Checkout), as indicated in the comments?

@ben
Copy link
Member Author

ben commented Nov 5, 2012

@jamill Checkout is kind of up in the air right now. It's shaking up inside libgit2, and this PR is going to have conflicts with #233, no easy way around it. I figured that, since Clone is built on Fetch and Checkout, it might make sense to merge those first, but if we want to merge this first and figure out the conflicts on #233 (and, presumably, another checkout-api-adjustment PR later on), that would be fine too. Either way, Checkout won't be finished.

@ben
Copy link
Member Author

ben commented Nov 8, 2012

Rebased!

@jamill
Copy link
Member

jamill commented Nov 9, 2012

How do you feel about the pattern used to expose parameters that 1) affect operation behavior and 2) expose callbacks. The approach here seems slightly different than those used in Fetch and Checkout. Now that we have a couple of operations that require a way to control both of the aspects (options that affect behavior, and expose callbacks), maybe we should see which way we like best and want to move forward with.

Fetch and Checkout, for instance takes the callbacks individually on the method interface. Fetch takes an individual parameter that controls tag fetching behavior. I was thinking that Checkout would take an enumeration (flags) that would control the various aspects of the Checkout behavior.

Clone, has an options class that encapsulates both of these into a single parameter. I kinda like it, as it the individual options are exposed directly as properties. I know that there were some comments that consumers of these API's would rather be able to pass in the callbacks directly, instead of having to create a new object to pass in (This could also be achieved with a method overload as well).

Assert.Equal(Path.Combine(scd.RootedDirectoryPath, ".git" + Path.DirectorySeparatorChar), repo.Info.Path);
Assert.False(repo.Info.IsBare);

Assert.True(File.Exists(Path.Combine(scd.RootedDirectoryPath, "master.txt")));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be reasonable to assert the current branch / head commit are as expected?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it would! Fixed!

@jamill
Copy link
Member

jamill commented Nov 9, 2012

Looks great @ben 👍 Just added a couple of comments.

@ben
Copy link
Member Author

ben commented Nov 9, 2012

I just pushed some commits that get rid of GitCheckoutOptions and use ThreadAffinity in the proxy, thanks for pointing those out.

When the CloneOptions class came about, it seemed like there would be a lot more optional parameters with sensible defaults. Since it boiled down to just three, maybe they could be passed as parameters, and we can simplify usage with overloads. Unfortunately I have to run right now, but I'll try to push something like that a bit later today.

@ben
Copy link
Member Author

ben commented Nov 12, 2012

The two PRs this is dependent on have been merged, and I've rebased. Any more comments, or is this ready to go?

@nulltoken
Copy link
Member

Any more comments, or is this ready to go?

@ben, very nice job. Terse and to the point! 👍

Could you please squash the commits in order to clean up the history?

</Project>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that really necessary?

@ben
Copy link
Member Author

ben commented Nov 12, 2012

Squashed, and I removed the silly change to the csproj file.

@nulltoken nulltoken merged commit ac88bb8 into vNext Nov 12, 2012
@nulltoken
Copy link
Member

TeamCity is happy... and so am I!

💎 ‼️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants