Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make TargetId of TreeEntry public #231

Closed
wants to merge 1 commit into from
Closed

Conversation

gorbach
Copy link
Contributor

@gorbach gorbach commented Oct 21, 2012

No description provided.

Signed-off-by: Gorbach Alexey <gorbach.alexey@gmail.com>
@nulltoken
Copy link
Member

Hey @gorbach. Glad to see you back!

Have you experienced some performance related issue in this area? Is it what motivated this PR?

Indeed, being able to access the ObjectId of the target without being compelled to load it would provide the consumer with a nice speed bump.

And I agree with you, we should try to make this happen.

However, I'm not sure about merging this PR in its current state.

  • First of all, the tests do not pass ;)
  • On a separate note, the TargetId might look a bit redundant to the user. Without knowing the inner workings of it, it's not easy to understand the value of using the TargetId property rather than accessing the Id property of the Target object.

So, I was thinking about a slightly different approach (greatly inspired by the work you've performed with the Blob object).

How about tweaking every GitObject derived type so that they're always lazily loaded? They would be built by only passing an ObjectId and a Repository as constructor parameters. Then, when one of their properties is accessed, the ObjectSafeWrapper would be used to retrieve the handle, fill the property and eagerly load the remaining properties of the object.

How does that sound? Feasible?

@gorbach
Copy link
Contributor Author

gorbach commented Oct 26, 2012

Hi @nulltoken ! Glad to see you too!

We are working on a new version of GitExtenstions, which is going to use libgit2sharp for most of it git access. The problem, that I faced with is a performance problem. Here are some more details.

The method gets list of TreeEntries. GitExtensions need the name of the entry, it's type, Id and filename. The old implementation was like this:

        public List<IGitItem> GetTreeOld(string id, bool full)
        {
            string args = "-z";
            if (full)
                args += " -r";
            var tree = this.RunCachableCmd(Settings.GitCommand, "ls-tree " + args + " \"" + id + "\"", GitModule.SystemEncoding);

            return GitItem.CreateIGitItemsFromString(this, tree);
        }

The new one (without using TreeId) was like this:

public List<IGitItem> GetTreeNew(string id, bool full)
{
    using (var repo = new LibGit2Sharp.Repository(WorkingDir))
    {
        var tree = repo.Lookup<Tree>(id);
        return tree.Where(t => t.Type == GitObjectType.Tree || t.Type == GitObjectType.Blob)
                   .Select(t => (IGitItem)new GitItem(this)
                        {
                            Mode = ((int) t.Mode).ToString(),
                            ItemType = t.Type.ToString().ToLower(),
                            Guid = t.Target.Sha,
                            Name = t.Name,
                            FileName = t.Name
                        })
                    .ToList();

    }
}

Implementation using TargetId is almost the same, but instead of Guid = t.Target.Sha use Guid = t.TargetId.Sha.

In terms of the performance GetTreeNew method with t.Target.Sha executes about 2.8 ms, but most of the time( 2.7 ms ) spends in looking up itself, while result is not needed. So the last one implementation is much faster.

About implementation. Of cause, I'll fix unit test, but I'm not sure too, if this solution fits overall design of libgit2sharp. The current API is more clear, but has some performance issues :(. Your suggestion is looking very attractive, but I'm not sure if it easy to implement, Do we need some kind of proxy, that intercepts all call to object properties?

@nulltoken
Copy link
Member

@gorbach I've paired with @yorah on this topic.

nulltoken/libgit2sharp@7a9721e0d98088bbeef09d9427f0a67a2ca1e998 is the best approach we've been able to come up with considering the usual performance/readability/usability factors...

  • The retrieval of the ObjectId costs nothing.
  • This allows to define "groups" of lazy properties that are batch-evaluated.
  • This should also allow us to get rid of the BuildFromPtr() pattern.

Feel free to shoot at it, improve....

/cc @dahlbyk

@nulltoken
Copy link
Member

@gorbach I've updated the spike. Check the nulltoken/lazy-groups branch

@nulltoken
Copy link
Member

A proposal to drop the ParentCount property has been pushed as well.
The idea would be to favor the use of c.Parents.Count()

/cc @yorah @dahlbyk

@dahlbyk
Copy link
Member

dahlbyk commented Oct 29, 2012

Rebound: https://github.com/dahlbyk/libgit2sharp/compare/lazy-groups

Big take-away is that an interface removes the need for the base class.

It strikes me as odd that LazyGroup isn't responsible for the lock around TriggerEvaluation(). I'm toying with moving that into LazyGroup instead.

@nulltoken
Copy link
Member

@dahlbyk. I like most of it. The only issue I see with it is the removal of the IList<> interface which triggers the load of the full list when only the number of entries are required. See the comment in the commit for a better explanation.

@dahlbyk
Copy link
Member

dahlbyk commented Oct 29, 2012

@yorah
Copy link
Contributor

yorah commented Oct 29, 2012

I love it! 👍

@nulltoken
Copy link
Member

I can't play with it atm, but it looks like a very elegant approach! 👍

@nulltoken
Copy link
Member

Guys, I'm more and more sold on this ´GitObject´ types redesign proposal.

  • A protected ctor with Repository and ObjectId parameters
  • "Groupable" lazy loaded properties

This should provide @gorbach with a little speed bump without cluttering the API.
As a bonus, we should also be able to get rid of the Commit.ParentCount property.

Is there anyone willing to take a stab at a full featured PR? Warning, there may be dragons out there ;-)

@nulltoken
Copy link
Member

https://github.com/nulltoken/libgit2sharp/tree/231_spike

@dahlbyk could you please peek at this? Also, I've slightly upgraded the LazyGroup<>to make it cope with several sub groups of cohesive properties. I've separated the change in a dedicated commit.

@gorbach How does it look to you?

@dahlbyk
Copy link
Member

dahlbyk commented Nov 5, 2012

The lazy group level implementation looks good, but I wonder about the performance impact. Since enumerating over commit info, in particular, is a common operation, the overhead might not be worth the (IMO) minor readability improvement in Commit/TagAnnotation. Another alternative would be to make a static ILazy<T> GitObjectLazyGroup.Singleton<T>(repo, id, func) to simplify creating single-member groups.

And if Commit.ParentsCollection is going to implement ICollection<>, it might as well implement the methods that it can (Contains(), CopyTo(), IsReadOnly) and throw a NotSupportedException (or whatever ReadOnlyCollection does) rather than NotImplementedException.

@dahlbyk
Copy link
Member

dahlbyk commented Nov 5, 2012

Come to think of it, ILazy<T> GitObjectLazyGroup.Singleton<T>(repo, id, func) wouldn't even need to use the group mechanics under the hood; if Lazy<T> also implemented ILazy<T> it could use the obvious simpler implementation.

@nulltoken
Copy link
Member

if Lazy also implemented ILazy it could use the obvious simpler implementation.

Sadly, we're supposed to be compatible with the .Net 4.0 type

@nulltoken
Copy link
Member

And if Commit.ParentsCollection is going to implement ICollection<>, it might as well implement the methods that it can (Contains(), CopyTo(), IsReadOnly) and throw a NotSupportedException (or whatever ReadOnlyCollection does) rather than NotImplementedException.

As ParentsCollection is only going to be used in this very place, I don't see the point of implementing the other methods (Or am I missing something?). However, I agree that NotSupportedException would be a better choice. :)

@dahlbyk
Copy link
Member

dahlbyk commented Nov 5, 2012

Sadly, we're supposed to be compatible with the .Net 4.0 type

private class LazyWrapper<T> : Lazy<T>, ILazy<T> { ... }?

As ParentsCollection is only going to be used in this very place, I don't see the point of implementing the other methods (Or am I missing something?). However, I agree that NotSupportedException would be a better choice. :)

I guess since we return IEnumerable<Commit> it probably doesn't matter. Still, IsReadOnly should return true. :)I wonder if Enumerable.Contains() has smarts to check for ICollection<>? Not that I expect that would be a common operation...

@nulltoken
Copy link
Member

private class LazyWrapper : Lazy, ILazy { ... }?

Ok. Are you willing to give it a stab?

@nulltoken
Copy link
Member

I wonder if Enumerable.Contains() has smarts to check for ICollection<>? Not that I expect that would be a common operation...

Grmpff. I'm starting to hate you 😉

  public static bool Contains<TSource>(this IEnumerable<TSource> source, TSource value)
    {
      ICollection<TSource> collection = source as ICollection<TSource>;
      if (collection != null)
        return collection.Contains(value);
      else
        return Enumerable.Contains<TSource>(source, value, (IEqualityComparer<TSource>) null);
    }

@nulltoken
Copy link
Member

Count() and Contains() are the only methods that check if the underlying type is a ICollection

@dahlbyk
Copy link
Member

dahlbyk commented Nov 5, 2012

Grmpff. I'm starting to hate you 😉

I ask because I care

@dahlbyk
Copy link
Member

dahlbyk commented Nov 5, 2012

Ok. Are you willing to give it a stab?

Sure

@nulltoken
Copy link
Member

I ask because I care

Thank you so much for this ... and for being such an awesome reviewer! 💖

@dahlbyk
Copy link
Member

dahlbyk commented Nov 6, 2012

nulltoken:231_spike...dahlbyk:231_spike

Blob and Tree could also use singletons, or we could leave the group in should future related properties be added. Either way...

@dahlbyk
Copy link
Member

dahlbyk commented Nov 6, 2012

Or reducing a bit of noise...

nulltoken:231_spike...dahlbyk:231_spike_too

@nulltoken
Copy link
Member

@dahlbyk
Copy link
Member

dahlbyk commented Nov 6, 2012

👍

@nulltoken
Copy link
Member

Merged into vNext ‼️

@gorbach gorbach closed this Nov 13, 2012
@choffmeister
Copy link

I'm not sure if I'm right, but when I have a repository with a sub module, then there is a tree entry of type Commit. Accessing treeEntry.Target.Sha throws an exception:

TreeEntry target of type 'Commit' are not supported.

Hence having access to the ObjectId directly would be nice.

@dahlbyk
Copy link
Member

dahlbyk commented Mar 21, 2013

/cc #312

@dahlbyk
Copy link
Member

dahlbyk commented Mar 21, 2013

Hence having access to the ObjectId directly would be nice.

bf7c434 adds a test for this in the context of new Submodule support. If you need the fix now, you can cherry-pick just the TreeEntry change and it "just works".

@nulltoken
Copy link
Member

@choffmeister vNext has been updated with @dahlbyk's fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants