
More efficient counting of tags/remotes/heads #121

Closed
wants to merge 8 commits into from

3 participants

Nathan Van der Auwera Brandon Keepers Dmitriy Zaporozhets
Nathan Van der Auwera

GitLab shows the number of tags by calling repo.tags.count. On our very large repositories this call takes a long time and slows down the whole UI.

Instead of building the full list of tags just to take its size, I added a method count_all to Ref. It only counts the lines of the ref listing, and Repo uses it to implement tag_count, head_count and remote_count.
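The difference between the two paths can be sketched in plain Ruby, without Grit. The sample string, the RefStub struct and both helper names below are hypothetical stand-ins rather than Grit's API. In the real library the slow path also creates a commit per ref, and for tags the old Tag.find_all called repo.git.commit_from_sha per tag, which this sketch does not model.

```ruby
# Hypothetical stand-in for the "name sha" lines of a ref listing.
SAMPLE_REFS = "v1.0 1111111\nv1.1 2222222\nv2.0 3333333\n"

# Hypothetical stand-in for a Ref object.
RefStub = Struct.new(:name, :id)

# Slow path: build one object per line, then count the resulting list.
def build_all(output)
  output.split("\n").map do |line|
    name, id = line.split(' ')
    RefStub.new(name, id)
  end
end

# Fast path: count the lines without building any objects.
def count_all(output)
  output.split("\n").size
end

puts build_all(SAMPLE_REFS).size
puts count_all(SAMPLE_REFS)
```

Both calls report the same number; only the amount of work differs.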

Some timings, starting with the grit repository itself:


ruby-1.9.3-p125@grit ~/work/git/grit (master) $ irb
1.9.3p125 :001 > require 'grit'
 => true 
1.9.3p125 :002 > repo = Grit::Repo.new('.')
 => #<Grit::Repo "/home/nathan/work/git/grit/.git"> 
1.9.3p125 :003 > repo.tag_count
 => 15 
1.9.3p125 :004 > repo.tags.count
 => 15 
1.9.3p125 :005 > require 'benchmark'
 => true 
1.9.3p125 :006 > Benchmark.measure { repo.tag_count }
 =>   0.000000   0.000000   0.000000 (  0.000788)

1.9.3p125 :007 > Benchmark.measure { repo.tags.count }
 =>   0.030000   0.000000   0.030000 (  0.024960)

Then, for good measure, on our huge repository:

ruby-1.9.3-p125@grit ~/work/git/grit (master) $ irb 
1.9.3p125 :001 > require 'grit'
 => true 
1.9.3p125 :002 > require 'benchmark'
 => true 
1.9.3p125 :004 > repo = Grit::Repo.new('../../vasco/git/ttt')
 => #<Grit::Repo "/home/nathan/work/vasco/git/ttt/.git"> 
1.9.3p125 :005 > Benchmark.measure { repo.tag_count }
 =>   0.020000   0.000000   0.020000 (  0.023077)

1.9.3p125 :006 > Benchmark.measure { repo.tags.count }
 =>   4.130000   0.140000   4.270000 (  4.283126)
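For reading the numbers above: Benchmark.measure prints user CPU time, system CPU time, their total, and elapsed real time in parentheses. A comparable measurement can be reproduced without a large repository. The sketch below uses synthetic input and a hypothetical helper name (time_both), so its absolute numbers say nothing about Grit itself.

```ruby
require 'benchmark'

# Hypothetical helper: times "count only" against "build a list, then count"
# on newline-separated "name sha" text. Returns elapsed real seconds.
def time_both(output)
  count = Benchmark.measure { output.split("\n").size }
  build = Benchmark.measure do
    output.split("\n").map { |line| line.split(' ') }.size
  end
  { count: count.real, build: build.real }
end

# Synthetic input (an assumption for the demo): 50,000 fake refs.
fake_refs = (1..50_000).map { |i| "tag-#{i} #{i.to_s(16).rjust(40, '0')}" }.join("\n")

timings = time_both(fake_refs)
puts format('count only: %.6f s', timings[:count])
puts format('build list: %.6f s', timings[:build])
```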

I added tests as well. Any remarks or suggestions?

and others added some commits May 21, 2012
Dmitriy Zaporozhets GITLAB patch: More stable raw commit parsing 810e3c1
Nathan Van der Auwera Added a method count_all for faster counting. Added methods called tag_count, head_count and remote_count. These just count how many there are, without building the actual items. On big git repositories this makes a lot of difference.
27f17f6
Nathan Van der Auwera Alias head_count to branch_count. cbe9a70
Dmitriy Zaporozhets Merge pull request #1 from nathanvda/master
More efficient counting of tags/remotes/heads
9536f30
Dmitriy Zaporozhets travis file added 1c56688
Dmitriy Zaporozhets Gemfile added 205c7da
Nathan Van der Auwera Make sure that Ref does not create the commit when doing a `find_all`, but instead only looks for the commit when it is needed. Improves lookup time enormously.
7fd3723
Nathan Van der Auwera Fix build on ci. b3b214e
Brandon Keepers
Collaborator

Grit is no longer maintained. See #183 and check out libgit2/rugged.

Brandon Keepers bkeepers closed this February 03, 2014

Showing 8 unique commits by 2 authors.

May 21, 2012
Dmitriy Zaporozhets GITLAB patch: More stable raw commit parsing 810e3c1
Jun 03, 2012
Nathan Van der Auwera Added a method count_all for faster counting. Added methods called tag_count, head_count and remote_count. These just count how many there are, without building the actual items. On big git repositories this makes a lot of difference.
27f17f6
Nathan Van der Auwera Alias head_count to branch_count. cbe9a70
Dmitriy Zaporozhets Merge pull request #1 from nathanvda/master
More efficient counting of tags/remotes/heads
9536f30
Dmitriy Zaporozhets travis file added 1c56688
Dmitriy Zaporozhets Gemfile added 205c7da
Jun 04, 2012
Nathan Van der Auwera Make sure that Ref does not create the commit when doing a `find_all`, but instead only looks for the commit when it is needed. Improves lookup time enormously.
7fd3723
Jun 13, 2012
Nathan Van der Auwera Fix build on ci. b3b214e
6  .travis.yml
@@ -0,0 +1,6 @@
+branches:
+  only:
+    - 'master'
+rvm:
+  - 1.9.2
+script: "bundle exec rake test"
9  Gemfile
@@ -0,0 +1,9 @@
+source "http://rubygems.org"
+
+group :development, :test do
+  gem 'rake'
+  gem 'posix-spawn', "~> 0.3.6"
+  gem 'mime-types', "~> 1.15"
+  gem 'diff-lcs', "~> 1.1"
+  gem 'mocha'
+end
20  Gemfile.lock
@@ -0,0 +1,20 @@
+GEM
+  remote: http://rubygems.org/
+  specs:
+    diff-lcs (1.1.3)
+    metaclass (0.0.1)
+    mime-types (1.18)
+    mocha (0.11.4)
+      metaclass (~> 0.0.1)
+    posix-spawn (0.3.6)
+    rake (0.9.2.2)
+
+PLATFORMS
+  ruby
+
+DEPENDENCIES
+  diff-lcs (~> 1.1)
+  mime-types (~> 1.15)
+  mocha
+  posix-spawn (~> 0.3.6)
+  rake
4  README.md
@@ -1,6 +1,8 @@
-Grit
+Grit [![build status](https://secure.travis-ci.org/gitlabhq/grit.png)](https://secure.travis-ci.org/gitlabhq/grit)
 ====
 
+
+
 Grit gives you object oriented read/write access to Git repositories via Ruby.
 The main goals are stability and performance. To this end, some of the
 interactions with Git repositories are done by shelling out to the system's
10  lib/grit/commit.rb
@@ -142,6 +142,12 @@ def self.list_from_string(repo, text)
       commits = []
 
       while !lines.empty?
+        # GITLAB patch
+        # Skip all garbage unless we get real commit
+        while !lines.empty? && lines.first !~ /^commit [a-zA-Z0-9]*$/
+          lines.shift
+        end
+
         id = lines.shift.split.last
         tree = lines.shift.split.last
 
@@ -159,6 +165,10 @@ def self.list_from_string(repo, text)
         # not doing anything with this yet, but it's sometimes there
         encoding = lines.shift.split.last if lines.first =~ /^encoding/
 
+        # GITLAB patch
+        # Skip Signature and other raw data
+        lines.shift while lines.first =~ /^ /
+
         lines.shift
 
         message_lines = []
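The two skip rules in this patch can be tried in isolation. The sketch below is a simplified, hypothetical parser (parse_raw_log and SAMPLE_LOG are not part of Grit): it keeps only the commit id and the header keys. It applies rule 1 (drop lines until a commit header) and rule 2 (drop header continuation lines that start with a space, such as a mergetag body or signature).

```ruby
# Hypothetical, simplified parser; Grit's Commit.list_from_string does more.
def parse_raw_log(text)
  lines = text.split("\n")
  commits = []
  until lines.empty?
    # Rule 1: drop everything until a line that looks like a commit header.
    lines.shift while !lines.empty? && lines.first !~ /^commit [a-zA-Z0-9]*$/
    break if lines.empty?

    commit = { id: lines.shift.split.last, headers: [] }
    # Read header lines up to the blank line that precedes the message.
    while !lines.empty? && lines.first != ''
      line = lines.shift
      # Rule 2: a leading space marks continuation data; skip it.
      next if line.start_with?(' ')
      commit[:headers] << line.split.first
    end
    commits << commit
  end
  commits
end

# Made-up sample input with noise before the first commit and a mergetag body.
SAMPLE_LOG = [
  'warning: unrelated noise before the first commit',
  'commit aaa111',
  'tree bbb222',
  'mergetag object ccc333',
  ' type commit',
  ' tag demo',
  '',
  '    First message',
  '',
  'commit ddd444',
  'tree eee555',
  '',
  '    Second message'
].join("\n")

parse_raw_log(SAMPLE_LOG).each { |c| puts "#{c[:id]} #{c[:headers].join(',')}" }
```

Indented message lines never match the commit-header pattern, so rule 1 also steps over them on the way to the next commit.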
34  lib/grit/ref.rb
@@ -4,6 +4,16 @@ class Ref
 
     class << self
 
+      # Count all Refs
+      #   +repo+ is the Repo
+      #   +options+ is a Hash of options
+      #
+      # Returns int
+      def count_all(repo, options = {})
+        refs = repo.git.refs(options, prefix)
+        refs.split("\n").size
+      end
+
       # Find all Refs
       #   +repo+ is the Repo
       #   +options+ is a Hash of options
@@ -13,8 +23,7 @@ def find_all(repo, options = {})
         refs = repo.git.refs(options, prefix)
         refs.split("\n").map do |ref|
           name, id = *ref.split(' ')
-          commit = Commit.create(repo, :id => id)
-          self.new(name, commit)
+          self.new(name, repo, id)
         end
       end
 
@@ -27,22 +36,34 @@ def prefix
     end
 
     attr_reader :name
-    attr_reader :commit
 
     # Instantiate a new Head
     #   +name+ is the name of the head
     #   +commit+ is the Commit that the head points to
     #
     # Returns Grit::Head (baked)
-    def initialize(name, commit)
+    def initialize(name, repo, commit_id)
       @name = name
-      @commit = commit
+      @commit_id = commit_id
+      @repo_ref = repo
+      @commit = nil
+    end
+
+    def commit
+      @commit ||= get_commit
     end
 
     # Pretty object inspection
     def inspect
       %Q{#<#{self.class.name} "#{@name}">}
     end
+
+    protected
+
+    def get_commit
+      Commit.create(@repo_ref, :id => @commit_id)
+    end
+
   end # Ref
 
   # A Head is a named reference to a Commit. Every Head instance contains a name
@@ -64,8 +85,7 @@ def self.current(repo, options = {})
       head = repo.git.fs_read('HEAD').chomp
       if /ref: refs\/heads\/(.*)/.match(head)
         id = repo.git.rev_parse(options, 'HEAD')
-        commit = Commit.create(repo, :id => id)
-        self.new($1, commit)
+        self.new($1, repo, id)
       end
     end
 
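The change to Ref is a lazy, memoized lookup: the constructor stores only the commit id, and the commit is resolved on first access. A minimal sketch of the same pattern, using hypothetical stand-in classes (FakeRepo and LazyRef are not Grit classes, and load_commit is an invented method that just counts calls):

```ruby
# Hypothetical repository stub that counts how often a commit is looked up.
class FakeRepo
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  def load_commit(id)
    @lookups += 1
    "commit:#{id}"
  end
end

# Hypothetical ref that defers the lookup until #commit is first called.
class LazyRef
  attr_reader :name

  def initialize(name, repo, commit_id)
    @name = name
    @repo = repo
    @commit_id = commit_id
    @commit = nil
  end

  # Memoized: the lookup runs at most once per ref.
  def commit
    @commit ||= @repo.load_commit(@commit_id)
  end
end

repo = FakeRepo.new
refs = %w[master stable].map { |n| LazyRef.new(n, repo, "#{n}-sha") }
puts repo.lookups  # listing the refs triggered no lookups
refs.first.commit
refs.first.commit
puts repo.lookups  # only the ref that was accessed paid for a lookup
```

Code that only needs ref names, such as a branch list, never pays for commit lookups under this scheme.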
13  lib/grit/repo.rb
@@ -213,7 +213,12 @@ def heads
       Head.find_all(self)
     end
 
+    def head_count
+      Head.count_all(self)
+    end
+
     alias_method :branches, :heads
+    alias_method :branch_count, :head_count
 
     def get_head(head_name)
       heads.find { |h| h.name == head_name }
@@ -278,6 +283,10 @@ def tags
       Tag.find_all(self)
     end
 
+    def tag_count
+      Tag.count_all(self)
+    end
+
     # Finds the most recent annotated tag name that is reachable from a commit.
     #
     #   @repo.recent_tag_name('master')
@@ -310,6 +319,10 @@ def remotes
       Remote.find_all(self)
     end
 
+    def remote_count
+      Remote.count_all(self)
+    end
+
     def remote_list
       self.git.list_remotes
     end
17  lib/grit/tag.rb
@@ -7,17 +7,6 @@ class Tag < Ref
     lazy_reader :tagger
     lazy_reader :tag_date
 
-    def self.find_all(repo, options = {})
-      refs = repo.git.refs(options, prefix)
-      refs.split("\n").map do |ref|
-        name, id = *ref.split(' ')
-        sha = repo.git.commit_from_sha(id)
-        raise "Unknown object type." if sha == ''
-        commit = Commit.create(repo, :id => sha)
-        new(name, commit)
-      end
-    end
-
     # Writes a new tag object from a hash
     #  +repo+ is a Grit repo
     #  +hash+ is the hash of tag values
@@ -97,6 +86,12 @@ def lazy_source
       end
       self
     end
+
+    def get_commit
+      sha = @repo_ref.git.commit_from_sha(@commit_id)
+      raise "Unknown object type." if sha == ''
+      Commit.create(@repo_ref, :id => sha)
+    end
   end
 
 end
77  test/test_commit_parse.rb
@@ -0,0 +1,77 @@
+require File.dirname(__FILE__) + '/helper'
+
+class TestCommitParse < Test::Unit::TestCase
+  def setup
+@output = %Q{
+commit 36a1987cd891fa82d9981886c3abbbe82c428c0d
+tree 26f2c1ebc2d0485de222f13ebf812456ee8a7cb8
+parent 31ae98359d26ff89b745c4f8094093cbf6ccbdc6
+parent 0d9f4f135eb6dea06bdcb7065b1e4ff78274a5e9
+author Linus Torvalds <torvalds@linux-foundation.org> 1337273075 -0700
+committer Linus Torvalds <torvalds@linux-foundation.org> 1337273075 -0700
+mergetag object 0d9f4f135eb6dea06bdcb7065b1e4ff78274a5e9
+ type commit
+ tag md-3.4-fixes
+ tagger NeilBrown <neilb@suse.de> 1337229653 +1000
+ 
+ md: 2 fixes for 3.4
+ tagger NeilBrown <neilb@suse.de> 1337229653 +1000
+ 
+ md: 2 fixes for 3.4
+ 
+ one fixes a bug in the new raid10 resize code so is relevant
+ to 3.4 only
+ Other fixes a bug in the use of md by dm-raid, so is relevant
+ to any kernel with dm-raid support
+ -----BEGIN PGP SIGNATURE-----
+ Version: GnuPG v2.0.18 (GNU/Linux)
+ 
+ iQIVAwUAT7SBkznsnt1WYoG5AQK3wQ//Q2sPicPHb5MNGTTBpphYo1QWo+l9jFHs
+ ZDBM+MaiNJg3kBN5ueUU+MENvLcaA5+zoxsGVBXBKyXr70ffqiQcLXyU7fHwrGu3
+ 5MD36p55ZPnq2pemCrp4qdTXEUabmDb+0/R7e5lywnzNdbmCAfh4uYih0VPiaClV
+ ihq/Ci12TDnezmLjksc09OCquhm0s3zH2BnMCVdmSAkhnXCxTeZ45s/ob71Y2xvj
+ cJ15SYlAG4t0QCikL5R8pZtkh0h2SuUhufDE09eD8yT4RGO4PHSQ4oHujajftzey
+ 9sB0NGH7Yla8gOXjA+EpzKPaiqtZxJB+1v/bhqA2FoOYAks8VoFfeqgwUbPYE7bk
+ GIfGB4hFsUXaJo13uzofyJXBIp9mM/J5Sk1VJsiLE85P7wewg6N199B8lpC3lFDw
+ tMLjfTMJzFOUqZBESjJoxyrc4fairZ9VCUWwpqjuioLO50e+lOi/jQHTspX78e+w
+ GxgjHp8hh0RqQiTkl7vIz9KVcQIeOTG9uzz61IuDp15cRSrMs6E8gVKoX8gKW9g2
+ Hec17fdG/H6ZeZa7MB9GzUD4HCj0PRbODQ3/fPhUdsbgtQjOvsVUH8LCRRU0U6cb
+ YF+qsDFtUF7QT2kNbrs9R6adGj97c2HWUMyRWMQAXGuL5TkstvhrRv/rk1+bv2VG
+ w7ptbiklj7o=
+ =9zxe
+ -----END PGP SIGNATURE-----
+
+    Merge tag 'md-3.4-fixes' of git://neil.brown.name/md
+    
+    Pull two md fixes from NeilBrown:
+     "One fixes a bug in the new raid10 resize code so is relevant to 3.4
+      only.
+    
+      The other fixes a bug in the use of md by dm-raid, so is relevant to
+      any kernel with dm-raid support"
+    
+    * tag 'md-3.4-fixes' of git://neil.brown.name/md:
+      MD: Add del_timer_sync to mddev_suspend (fix nasty panic)
+      md/raid10: set dev_sectors properly when resizing devices in array.
+
+commit 31ae98359d26ff89b745c4f8094093cbf6ccbdc6
+tree 26f2c1ebc2d0485de222f13ebf812456ee8a7cb8
+parent 31ae98359d26ff89b745c4f8094093cbf6ccbdc6
+parent 0d9f4f135eb6dea06bdcb7065b1e4ff78274a5e9
+author Linus Torvalds <torvalds@linux-foundation.org> 1337273075 -0700
+committer Linus Torvalds <torvalds@linux-foundation.org> 1337273075 -0700
+
+    Simple Commit
+}
+  end
+
+  def test_list_from_string
+    commits = Grit::Commit.list_from_string(nil, @output)
+
+    assert_equal 2, commits.size
+    assert_equal "36a1987cd891fa82d9981886c3abbbe82c428c0d", commits.first.id
+    assert_equal "31ae98359d26ff89b745c4f8094093cbf6ccbdc6", commits.last.id
+    assert_equal "Merge tag 'md-3.4-fixes' of git", commits.first.message[0..30]
+  end
+end
+
1  test/test_head.rb
@@ -43,6 +43,7 @@ def test_is_head
 
   def test_head_count
     assert_equal 5, @r.heads.size
+    assert_equal 5, @r.head_count
   end
 
 
4  test/test_remote.rb
@@ -11,4 +11,8 @@ def test_inspect
     remote = @r.remotes.first
     assert_equal %Q{#<Grit::Remote "#{remote.name}">}, remote.inspect
   end
+
+  def test_remote_count
+    assert_equal 4, @r.remote_count
+  end
 end
4  test/test_tag.rb
@@ -13,6 +13,10 @@ def test_list_from_string_size
     assert_equal 5, @r.tags.size
   end
 
+  def test_tag_count
+    assert_equal 5, @r.tag_count
+  end
+
   # list_from_string
 
   def test_list_from_string