WIP use TreeDefinition instead of index to make trees #524

Closed
wants to merge 3 commits into
from

5 participants
@kgybels
Contributor

kgybels commented Jan 10, 2014

No description provided.

Contributor

sc68cal commented Jan 10, 2014

YES

Contributor

KindDragon commented Jan 10, 2014

👍

Contributor

KindDragon commented Jan 10, 2014

Clone in this PR works several times faster for me :+1: :+1: :+1:

My mistake, it is only a little faster.

GitTfs/Core/GitHelpers.cs
@@ -237,7 +237,7 @@ private Process Start(string[] command)
protected virtual Process Start(string [] command, Action<ProcessStartInfo> initialize)
{
var startInfo = new ProcessStartInfo();
- startInfo.FileName = "git";

@pmiossec

pmiossec Jan 10, 2014

Owner

I think you should keep the previous value (and add git to your PATH). Not everyone installs git in the "Program Files" folder...
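A minimal sketch of the idea, not the code in this PR: keep the bare "git" file name so the operating system resolves it through PATH, and only use an explicit path when the user asks for one. The class name, the method name and the GITTFS_GIT_EXE environment variable are hypothetical and are not defined by git-tfs.

```csharp
using System;
using System.Diagnostics;

public static class GitProcess
{
    // Hypothetical override variable; git-tfs does not define it.
    public const string OverrideVariable = "GITTFS_GIT_EXE";

    public static ProcessStartInfo CreateStartInfo(string arguments)
    {
        var overridePath = Environment.GetEnvironmentVariable(OverrideVariable);
        return new ProcessStartInfo
        {
            // A bare "git" is looked up on PATH, so no install location is assumed.
            FileName = string.IsNullOrEmpty(overridePath) ? "git" : overridePath,
            Arguments = arguments,
            UseShellExecute = false,
            RedirectStandardOutput = true,
        };
    }
}
```

With the variable unset, `GitProcess.CreateStartInfo("log --no-color")` yields a start info whose `FileName` is just "git", which is the previous behaviour.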

Contributor

sc68cal commented Jan 10, 2014

@KindDragon Awesome - do you have a comparison in runtime between this PR and using the temporary index? It would be a useful stat to see how much faster we get from not shelling out to the Git binary.

Owner

pmiossec commented Jan 10, 2014

I have made some tests with my codeplex test repository (with some interesting cases inside) and I end up with exactly the same sha1. So, that's good...

So, after a little cleanup, I am fine with merging.

PS :

  • I can't see (or measure) improvements because I am constrained by the network.
  • Some tests failed on the Travis build but not on my computer. I hope Travis will pass on the next build, or we will have to look into why before merging...
Contributor

sc68cal commented Jan 10, 2014

Going to kick travis again - that's how badly I want this to get merged 😄

Contributor

sc68cal commented Jan 10, 2014

Ah. I know why. It's because the path to Git is hard-coded, and Travis runs the build and tests on Mono+Linux.

@kgybels please fix that and we should be good to go!

---- System.ComponentModel.Win32Exception : ApplicationName='c:\program files (x86)\git\bin\git.exe', CommandLine='log --no-color --pretty=medium HEAD --', CurrentDirectory=''
Contributor

sc68cal commented Jan 10, 2014

@pmiossec Would it be useful to investigate automating a smoke test, that does a clone from that codeplex test repository and makes sure the SHA ends up being the right one?

Owner

pmiossec commented Jan 10, 2014

@sc68cal It could be done, but this repository is the one I use for development, and there are 2 cases that are not supported by the current version of git-tfs, only by my work in progress. The current version of git-tfs can't clone the TFS repository :(

@sc68cal sc68cal referenced this pull request Jan 10, 2014

Open

Functional testing #525

Contributor

KindDragon commented Jan 10, 2014

@KindDragon Awesome - do you have a comparison in runtime between this PR and using the temporary index? It would be a useful stat to see how much faster we get from not shelling out to the Git binary.

My mistake, the new version is only a little faster.
But the new code consumes a lot more memory: 1.5 GB after 2 hours (the old code used around 0.5 GB). Maybe we have some memory leaks.

Contributor

KindDragon commented Jan 10, 2014

The main problem with GitRepository.ParseEntries is not solved.
[image: gittfs3]

Contributor

kgybels commented Jan 10, 2014

@KindDragon

Main problem with GitRepository.ParseEntries not solved

We need a case-insensitive TreeDefinition to be able to get rid of GitRepository.ParseEntries.
See comment on libgit2/libgit2sharp#587 for more information.

Contributor

kgybels commented Jan 10, 2014

VS2013 insists on adding BOMs to all my new code files and changed project files. Is it a problem to leave them in?

kgybels added some commits Jan 10, 2014

WIP for code review
Work in progress for code review.
This is a first attempt at removing the slow ParseEntries() method.

LibGit2Sharp's TreeDefinition needs to operate in case-insensitive mode. I
made a quick hack to it to be able to test my git-tfs changes.

Unit tests should pass.

@kgybels kgybels referenced this pull request in libgit2/libgit2sharp Jan 10, 2014

Closed

Exposing low level method for external application #602

Owner

spraints commented Jan 10, 2014

🎸

VS2013 insists on adding BOMs to all my new code files and changed project files. Is it a problem to leave them in?

BOMs should be fine.

Main problem with GitRepository.ParseEntries not solved

We need a case-insensitive TreeDefinition to be able to get rid of GitRepository.ParseEntries.
See comment on libgit2/libgit2sharp#587 for more information.

I wonder if we should replace initialTree with something that can do the same thing (i.e. same interface), but make it lazy and based off of a LibGit2Sharp Tree?

Owner

spraints commented Jan 10, 2014

... and then I look at the code, and see that you've started doing this. Doing the case-insensitive tree manipulation should be doable. Instead of _treeDefinition.Add(path, file, Mode.ToFileMode(mode));, we could do something like this (make-believe code with make-believe APIs):

  var parts = path.Split('/');
  var tree = _treeDefinition.Tree;
  // Walk every path segment except the last one, which is the file name.
  foreach(var part in parts.Take(parts.Length - 1)) {
    var subdir = tree.Entries.SingleOrDefault(e => string.Equals(e.Name, part, StringComparison.OrdinalIgnoreCase));
    if(subdir == null) {
      subdir = new Tree();
      tree.Add(part, subdir);
    }
    tree = subdir;
  }
  tree.Add(parts.Last(), file, mode);
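
The same walk can be sketched as runnable code against a plain in-memory structure. This is only an illustration under stated assumptions: CaseInsensitiveTree is a hypothetical stand-in, not a LibGit2Sharp type, and blob ids are plain strings. Directory and file names are matched with StringComparer.OrdinalIgnoreCase, so a path that differs only in casing lands in the existing directory and keeps the casing that was stored first.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a tree; not a LibGit2Sharp type.
public sealed class CaseInsensitiveTree
{
    // Child directories and files, both keyed case-insensitively.
    private readonly Dictionary<string, CaseInsensitiveTree> _dirs =
        new Dictionary<string, CaseInsensitiveTree>(StringComparer.OrdinalIgnoreCase);
    private readonly Dictionary<string, string> _files =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Adds or replaces a file; existing directories are reused whatever their casing.
    public void Add(string path, string blobId)
    {
        var parts = path.Split('/');
        var tree = this;
        for (var i = 0; i < parts.Length - 1; i++)
        {
            CaseInsensitiveTree subdir;
            if (!tree._dirs.TryGetValue(parts[i], out subdir))
            {
                subdir = new CaseInsensitiveTree();
                tree._dirs.Add(parts[i], subdir);
            }
            tree = subdir;
        }
        // The indexer overwrites the value but keeps the key that was stored first.
        tree._files[parts[parts.Length - 1]] = blobId;
    }

    // Yields every file path using the casing that was stored first.
    public IEnumerable<string> Paths(string prefix = "")
    {
        foreach (var file in _files.Keys) yield return prefix + file;
        foreach (var dir in _dirs)
            foreach (var path in dir.Value.Paths(prefix + dir.Key + "/"))
                yield return path;
    }
}
```

Adding "Src/Core/a.cs" and then "SRC/CORE/A.CS" leaves a single entry under "Src/Core/", which is the behaviour the case-insensitive TreeDefinition is meant to provide.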
Contributor

kgybels commented Jan 10, 2014

@spraints

I wonder if we should replace initialTree with something that can do the same thing (i.e. same interface), but make it lazy and based off of a LibGit2Sharp Tree?

That is what I did in my last 2 commits, except I used TreeDefinition instead of Tree.

I also think we can consolidate IGitTreeBuilder and IGitTreeInformation. The tree with some changes already applied to it should be sufficient; I don't think we need the tree exactly as it was "initially" (the parent tree) while applying changes. The responsibility for preserving the file mode when a change is an update can also be moved to IGitTreeBuilder entirely. To summarize, with IGitTreeBuilder working case-insensitively and handling the file mode on its own, IGitTreeInformation becomes redundant.
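
A rough sketch of what such a consolidated builder could look like, assuming a flat path-to-entry map. ModePreservingTreeBuilder, Entry and the string-typed mode are hypothetical names for illustration and do not mirror the real IGitTreeBuilder interface. An update keeps the mode (and the path casing) of the existing entry, so the caller no longer has to look it up in a separate "initial tree".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical builder; it only loosely mirrors the IGitTreeBuilder idea.
public sealed class ModePreservingTreeBuilder
{
    public sealed class Entry
    {
        public string Path;
        public string BlobId;
        public string Mode;
    }

    // Paths are matched case-insensitively.
    private readonly Dictionary<string, Entry> _entries =
        new Dictionary<string, Entry>(StringComparer.OrdinalIgnoreCase);

    // Adds a new file with the given mode, or updates an existing file
    // while keeping its original mode and path casing.
    public void AddOrUpdate(string path, string blobId, string modeForNewFiles = "100644")
    {
        Entry existing;
        if (_entries.TryGetValue(path, out existing))
            existing.BlobId = blobId;
        else
            _entries.Add(path, new Entry { Path = path, BlobId = blobId, Mode = modeForNewFiles });
    }

    public bool Remove(string path)
    {
        return _entries.Remove(path);
    }

    // Returns the entry for the path, or null when there is none.
    public Entry Find(string path)
    {
        Entry entry;
        return _entries.TryGetValue(path, out entry) ? entry : null;
    }
}
```

For example, adding "tools/build.sh" with mode "100755" and then updating "Tools/Build.sh" without a mode leaves one entry that still has mode "100755" and the new blob id.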

Contributor

kgybels commented Jan 10, 2014

@spraints

... and then I look at the code, and see that you've started doing this. Doing the case-insensitive tree manipulation should be doable. Instead of _treeDefinition.Add(path, file, Mode.ToFileMode(mode));

I made TreeDefinition itself handle the case-insensitive part (like this). Still need to think about how to properly expose both use cases, though.

Remove AssertTemporaryIndexClean
No need to assert that index is clean, because we no longer use it.
Note that this was causing a lot of unnecessary overhead.

Need to get rid of all the temporary index stuff, but this was the most
important one for now.
Contributor

sc68cal commented Jan 11, 2014

Looks great @kgybels ! Keep up the great work!

Contributor

kgybels commented Jan 12, 2014

@KindDragon Would you be so kind as to rerun your performance benchmark with the latest version of this PR? It would also be nice to know how big the repository you are testing against is. I am using a repository with about 18000 files and the improvement is not that impressive, but I hope the improvement scales with the size of the repository.

Contributor

KindDragon commented Jan 13, 2014

Clone is approximately 10-20% faster.

[image: gittfs4]

We can also try to do something about StructureMap.Container.GetInstance (12.16%).

Owner

spraints commented Jan 15, 2014

#519 had major conflicts with this. I spent some time with the two branches and git-imerge and I think I have a good merge of the two (spraints/git-tfs@1f75d48).

We need to get libgit2sharp back to the libgit2 repo before merging this. @kgybels - can you work on a PR based on the libgit2sharp version this branch is using (https://github.com/kgybels/libgit2sharp/compare/experimental)?

Contributor

kgybels commented Jan 15, 2014

@spraints It's not sure yet that this is something that LibGit2Sharp wants to support (see libgit2/libgit2sharp#587 and libgit2/libgit2sharp#607 for discussion). However, I'm willing to make the PR for it, but if we want any chance of getting it in, it needs to be decent, preferably with unit tests.

The first commit is something we could already put on master, even without the case-insensitive TreeDefinition. It already gets rid of calling git update-index and git write-tree for creating the tree. Once we have a version of LibGit2Sharp with the case-insensitive TreeDefinition, I'll rebase the rest of the work on git-tfs's master.

@KindDragon I took a quick look at the code where that GetInstance is used. It seems like a bad idea to use it in a hotspot like that. However, it seems trivial to rework that code a bit to get rid of it, if only I had more time!
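
One possible shape for that rework, sketched without touching the actual git-tfs code: resolve the dependency once and reuse it, instead of asking the container inside the loop. CachedService is a hypothetical helper, and the Func<T> passed to it stands in for a container lookup. Note the assumption that the service is safe to reuse; if the container is expected to hand out a fresh instance per call, caching changes that behaviour.

```csharp
using System;

// Resolves a service once and reuses it, instead of asking a container on every call.
public sealed class CachedService<T> where T : class
{
    private readonly Lazy<T> _instance;

    // "resolve" stands in for a container lookup; it runs at most once.
    public CachedService(Func<T> resolve)
    {
        _instance = new Lazy<T>(resolve);
    }

    public T Value
    {
        get { return _instance.Value; }
    }
}
```

A hot loop that reads `cached.Value` a thousand times triggers the resolve delegate only on the first access.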

@spraints spraints referenced this pull request Jan 17, 2014

Merged

No more GitIndexInfo #529

Owner

spraints commented Jan 17, 2014

@spraints It's not sure yet that this is something that LibGit2Sharp wants to support (see libgit2/libgit2sharp#587 and libgit2/libgit2sharp#607 for discussion). However, I'm willing to make the PR for it, but if we want any chance of getting it in, it needs to be decent, preferably with unit tests.

I just read up on that, and I'd be fine with making this solely a concern in git-tfs. IGitTreeDefinition and IGitTreeBuilder both provide places where we can do the necessary work.

The first commit is something we could already put on master, even without the case-insensitive TreeDefinition. It already gets rid of calling git update-index and git write-tree for creating the tree. Once we have a version of LibGit2Sharp with the case-insensitive TreeDefinition, I'll rebase the rest of the work on git-tfs's master.

#529 has just the first commit (and some minor cleanup). Also, merge commits aren't a big deal to me. Feel free to use spraints/git-tfs@1f75d48 as a starting point for working on the case-sensitivity stuff.

Owner

pmiossec commented Jun 20, 2014

Seems merged with #529 (no?)

@pmiossec pmiossec closed this Jun 20, 2014