Clone #223
This will have conflicts with #221 ( |
Hey, this is great...!
It would be sort of nice if there was an enum for FileOpenFlags so that I don't have to remember that 0x0400 is O_EXCL...
It would also be nice if I could pass paths to clone... as a string[] or better, an ICollection and not having to create a git_strarray myself... (Is that struct even public? I'm not sure off the top of my head...)
If that were true, I think that GitCheckoutOptions would need to expose an ICollection to callers, and then there would need to be a private implementation in UnsafeNativeMethods that was actually the struct passed to the pinvokes that actually had the git_strarray...
It looks like there's helper methods in UnsafeNativeMethods for git_strarrays -> string[], it should be easy to add a BuildStrArrayOf from an ICollection. And maybe a corresponding ReleaseStrArray() that FreeHGlobals...
So if your GitCheckoutOptions struct contained a git_strarray for paths instead of an IntPtr, then I think you could do something along the lines of:
strArray->size = (uint) stringCollection.count; int i = 0;
string url,
string destination,
GitIndexerStats fetch_stats = null,
GitIndexerStats checkout_stats = null,
Forgive me if I'm wrong about this elsewhere in LibGit2Sharp, but we should probably use C# conventions here, not C. So fetchStats and checkoutStats.
Now that I think about it, I didn't see any unit tests that used these parameters. It looks like they return the outcome of this method, which makes it odd that they're being passed in.
@haacked Those parameters are supposed to be references to volatile data so that if the Clone is being run in one thread, another thread can check on the progress and potentially report on it. Since Clone can take a long time, this was our solution for handling progress without embedding threading or awkward progress callbacks into the libgit2 internals.
I have no comment re how to name things and/or manage volatile data idiomatically in C#.
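For illustration, the polling model described above might look roughly like this in C#. This is only a sketch: CloneStats and FakeClone are hypothetical stand-ins invented for the example, not LibGit2Sharp or libgit2 types, and the counters are made up.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a stats object shared between two threads.
class CloneStats
{
    private int _received;

    public int Total { get; set; }

    // Volatile read so the polling thread sees the latest value.
    public int Received => Volatile.Read(ref _received);

    public void Increment() => Interlocked.Increment(ref _received);
}

static class Program
{
    // Hypothetical long-running operation that updates the shared stats.
    static void FakeClone(CloneStats stats)
    {
        for (int i = 0; i < stats.Total; i++)
        {
            Thread.Sleep(10);   // simulate network/indexing work
            stats.Increment();
        }
    }

    static void Main()
    {
        var stats = new CloneStats { Total = 20 };
        Task worker = Task.Run(() => FakeClone(stats));

        // The calling thread polls the shared object while the worker runs.
        while (!worker.Wait(50))
        {
            Console.WriteLine("Received {0}/{1}", stats.Received, stats.Total);
        }
        Console.WriteLine("Done: {0}/{1}", stats.Received, stats.Total);
    }
}
```

The caller owns the second thread here; the operation itself never invokes user code, which is the trade-off being described.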
Is this closer to the platonic ideal?
var repo = new Clone() {
SourceUrl = "https://github.com/libgit2/libgit2sharp",
WorkDir = @"C:\Users\me\Desktop\libgit2sharp",
Progress = (a,b,c) => { /* ... */ }
}.Execute();
Hmm, let me read through the existing discussion more carefully. In the current proposal, does the Repository.Clone method return immediately? I assume it can't (and must therefore block) until it's complete. Which is not ideal.
Ok, I think @dahlbyk's comments on Fetch apply to clone. So something like:
void Clone(string sourceUrl
, string workingDirectory
, CheckoutOptions checkoutOptions
, ProgressHandler handler
, Action<Repository> onComplete);
You might need to tweak that a bit, as I'm not sure if you need 3 callbacks for the 3 different GitIndexerStats types, or just a single one that has the type of callback as a property. Either way, the basic pattern is to make this have callbacks. Having read @dahlbyk's comments, the callback approach is certainly easier to set up.
And this makes it easy for us to wrap Rx around it. :)
In other words, the API should look almost exactly like Fetch, except with a few extra parameters needed for a clone. :)
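To make the callback shape concrete, here is a rough, self-contained sketch of how a caller might use such a method. Everything in it is an assumption for illustration: FakeRepository, Cloner, and this two-argument ProgressHandler are hypothetical and not the actual LibGit2Sharp signatures.

```csharp
using System;

// Hypothetical types for illustration; not the LibGit2Sharp API.
class FakeRepository
{
    public string WorkingDirectory { get; }

    public FakeRepository(string workdir)
    {
        WorkingDirectory = workdir;
    }
}

delegate void ProgressHandler(int completed, int total);

static class Cloner
{
    // Blocks until finished, reporting through callbacks along the way.
    public static void Clone(string sourceUrl, string workingDirectory,
                             ProgressHandler onProgress,
                             Action<FakeRepository> onComplete)
    {
        const int total = 3;
        for (int step = 1; step <= total; step++)
        {
            onProgress?.Invoke(step, total);   // a null callback is skipped
        }
        onComplete?.Invoke(new FakeRepository(workingDirectory));
    }
}

static class Program
{
    static void Main()
    {
        Cloner.Clone("https://example.invalid/repo.git", "repo",
            (done, total) => Console.WriteLine("{0}/{1}", done, total),
            repo => Console.WriteLine("Cloned into " + repo.WorkingDirectory));
    }
}
```

Because progress arrives as plain delegate calls, a consumer can adapt it to events or observables without the library knowing about either.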
@ethomson I'm with you! When it comes time to implement Checkout, |
@nulltoken: @ben's PR already exposes CheckoutOptions...? In any case, cloning w/ constrained paths is important enough to me that I'd follow up with a PR to add it immediately. :) |
Just my $0.02 but try to avoid using optional parameters in public APIs. I've written about the versioning hell you can easily get into. This is why ASP.NET MVC removed all of them from its public APIs. One option is to simply have two overloads. One simple, one with all the options. BTW (stepping on my soapbox for a minute), the reliance on static methods can make an implementation difficult to extend. Could we consider having The benefit then is that if this method is virtual, people can override it. It will allow me to mock this method in my unit tests easily. And lastly, it'll allow us to write extension methods that provide simpler calling patterns. steps off soapbox |
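As a sketch of the "two overloads" suggestion above (names are hypothetical, not the proposed API): with an optional parameter the default value is compiled into each calling assembly, whereas with overloads the default stays inside the library.

```csharp
using System;

// Hypothetical API surface, for illustration only.
static class CloneApi
{
    // Optional-parameter form: the default (false) is copied into every
    // call site at compile time, so changing it later does not affect
    // callers that were compiled against the old version.
    public static string CloneOptional(string url, string workdir, bool bare = false)
    {
        return Describe(url, workdir, bare);
    }

    // Overload form: the simple overload forwards to the full one,
    // so the default value lives inside the library.
    public static string Clone(string url, string workdir)
    {
        return Clone(url, workdir, false);
    }

    public static string Clone(string url, string workdir, bool bare)
    {
        return Describe(url, workdir, bare);
    }

    private static string Describe(string url, string workdir, bool bare)
    {
        return string.Format("{0} -> {1} (bare: {2})", url, workdir, bare);
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(CloneApi.CloneOptional("url", "dir"));
        Console.WriteLine(CloneApi.Clone("url", "dir"));
        Console.WriteLine(CloneApi.Clone("url", "dir", true));
    }
}
```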
You're right. However, I already asked @ben to remove it ;)
@ethomson As far as I know, git.git doesn't allow one to only fetch a subset of a directory. Thus, the whole repository would be cloned. In this case, only the checkout operation would consider these paths while updating the working directory. I don't know about your use case, but if you're willing to avoid the direct checkout of the whole
Would combining those two steps above match your need? Note: It's possible I'm far off the track and that I haven't understood your point. If this is the case, would you be so kind as to help me understand and reformulate? |
@haacked There has been a good bit of discussion on API for
Fetch as currently implemented in PR #221 is on the |
@nulltoken Gotcha, sorry, I didn't see your comment to @ben. In any case, you're right - I don't necessarily care what the mechanism is by which I can accomplish this. This is a weird use case and calling checkout explicitly after a non-checking-out clone is fine with me. |
@jamill yeah, |
See 32b79b6 for an API proposal; it's obviously not complete, but it represents how this could work. Unfortunately, getting progress information from a clone operation is inherently a two-thread deal right now. One thread has to do the work, another has to watch for changes in the data structures and call the user's callbacks. It's possible that libgit2 could grow the ability to do inline callbacks, but there are performance implications. How icky would it be to spawn a new thread to call the native method inside |
We are still very far from what I think any of us would consider to be a final public API. Dropping in a new version without recompiling your app is simply asking for trouble. Maybe a TODO leading up to the 1.0 milestone would be to lock down the API with overloads, but doing so now would be premature. |
Sure, locking down overloads would be premature at this point. But I feel like adding methods with optional parameters is also premature. Just make all the parameters required (but nullable where appropriate). These things have a tendency to be forgotten about and stick around past their expiration date. ;) |
@ben I'm not a huge fan of playing with threads from within LibGit2Sharp. As far as I can remember, @carlosmn wasn't strongly opposed to a callback (cf. #213 (comment)). There's indeed a trade-off to be agreed on regarding how often it's being invoked.
I'm with @haacked on this. Let's keep this simple. |
@haacked @dahlbyk I have a love/hate relationship with optional parameters. They're useful while you're iteratively sketching an API. However, I agree that by 1.0, we should have gotten rid of them in favor of overloads.
Great ideas! Perhaps running FxCop (or that Mono equivalent) on the final code before releasing 1.0 should be a required step. We don't have to accept all the suggestions, but it might help flag things we've forgotten about. |
I've rebased this on top of @jamill's fetch update PR, so the only commit that's really new here is the last one. That one should be merged first; I'm just trying to avoid conflicts.
I've switched to encapsulating a clone operation as a new object. This allows for a lot more optional things to be specified without bunches of constructor overloads. Here's how you do something like
Repository repo = new CloneOperation(url, workdirPath).Execute();
And here's
var cloneop = new CloneOperation(url, workdirPath)
{
    Bare = true,
    TransferProgress = (stat) => Console.WriteLine("Transfer: {0}/{1}/{2}",
        stat.ReceivedObjectCount, stat.IndexedObjectCount, stat.TotalObjectCount)
};
var repo = cloneop.Execute();
I'd rather have a static Clone() method exposed by the Repository type. This method could accept a CloneOptions type (similar to the RepositoryOptions for instance) for optional parameters.
Like this?
var opts = new CloneOptions
{
    Bare = true,
    TransferProgress = (stat) => Console.WriteLine("Transfer: {0}/{1}/{2}",
        stat.ReceivedObjectCount, stat.IndexedObjectCount, stat.TotalObjectCount)
};
var repo = Repository.Clone(url, workdir, opts);
Or with defaults:
var repo = Repository.Clone(url, workdir);
A static |
@ben ❤️ it! |
@dahlbyk That sounds like a good idea if we get to a point where we need to test something that calls Clone. Right now we don't need to mock it out – it's a top-level concept, nothing else uses it. Or am I missing something? |
I'm just thinking as a consumer, if I want to mock out code that performs a |
From a client's perspective, the There's an argument to be made for shipping that interface in the box, but I don't think it would be part of this PR. |
That works for me, just thought I'd mention it. |
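The consumer-side wrapper discussed in the comments above could be sketched like this. All names here (StaticRepo, IRepositoryCloner, Deployer) are hypothetical and only stand in for a static library call and the code that depends on it.

```csharp
using System;

// Hypothetical static API standing in for a library call.
static class StaticRepo
{
    public static string Clone(string url, string workdir)
    {
        return workdir + "/.git";
    }
}

// Interface owned by the consumer, not shipped by the library.
interface IRepositoryCloner
{
    string Clone(string url, string workdir);
}

// Production implementation simply forwards to the static method.
class RealCloner : IRepositoryCloner
{
    public string Clone(string url, string workdir)
    {
        return StaticRepo.Clone(url, workdir);
    }
}

// Test double that records the call instead of touching the network.
class FakeCloner : IRepositoryCloner
{
    public string LastUrl;

    public string Clone(string url, string workdir)
    {
        LastUrl = url;
        return "fake-path";
    }
}

// Consumer code depends only on the interface.
class Deployer
{
    private readonly IRepositoryCloner _cloner;

    public Deployer(IRepositoryCloner cloner)
    {
        _cloner = cloner;
    }

    public string Deploy(string url)
    {
        return _cloner.Clone(url, "deploy-dir");
    }
}

static class Program
{
    static void Main()
    {
        var fake = new FakeCloner();
        string path = new Deployer(fake).Deploy("https://example.invalid/repo.git");
        Console.WriteLine("{0} cloned to {1}", fake.LastUrl, path);
    }
}
```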
#238 is now merged |
@ben @nulltoken Is this PR still waiting on #233 (Checkout), as indicated in the comments? |
@jamill Checkout is kind of up in the air right now. It's shaking up inside libgit2, and this PR is going to have conflicts with #233, no easy way around it. I figured that, since Clone is built on Fetch and Checkout, it might make sense to merge those first, but if we want to merge this first and figure out the conflicts on #233 (and, presumably, another checkout-api-adjustment PR later on), that would be fine too. Either way, Checkout won't be finished. |
Rebased! |
How do you feel about the pattern used to expose parameters that 1) affect operation behavior and 2) expose callbacks? The approach here seems slightly different from those used in Fetch and Checkout. Now that we have a couple of operations that need a way to control both aspects (options that affect behavior, and callbacks), maybe we should see which way we like best and want to move forward with. Fetch and Checkout, for instance, take the callbacks individually on the method interface. Fetch takes an individual parameter that controls tag fetching behavior. I was thinking that Checkout would take an enumeration (flags) that would control the various aspects of the Checkout behavior. Clone has an options class that encapsulates both of these into a single parameter. I kinda like it, as the individual options are exposed directly as properties. I know that there were some comments that consumers of these APIs would rather be able to pass in the callbacks directly, instead of having to create a new object to pass in (this could also be achieved with a method overload).
Assert.Equal(Path.Combine(scd.RootedDirectoryPath, ".git" + Path.DirectorySeparatorChar), repo.Info.Path);
Assert.False(repo.Info.IsBare);

Assert.True(File.Exists(Path.Combine(scd.RootedDirectoryPath, "master.txt")));
Would it be reasonable to assert the current branch / head commit are as expected?
Yes it would! Fixed!
Looks great @ben 👍 Just added a couple of comments. |
I just pushed some commits that get rid of When the |
The two PRs this is dependent on have been merged, and I've rebased. Any more comments, or is this ready to go? |
@ben, very nice job. Terse and to the point! 👍 Could you please squash the commits in order to clean up the history? |
</Project> |
Is that really necessary?
Squashed, and I removed the silly change to the |
TeamCity is happy... and so am I! 💎 |
This adds Repository.Clone, building on the just-merged checkout code.
Things this should be merged after:
Update working directory on Checkout #233 (Checkout)
Fetch should report progress through callbacks. #238 (Fetch update)
It relies on types that are most naturally merged with them. I'll rebase once they're merged into vNext.
UPDATE
The two PRs mentioned have been merged. Any objections?