[Question] Can you tell me the test files? #52

Closed
Shougo opened this Issue · 29 comments

5 participants

@Shougo

https://raw.githubusercontent.com/junegunn/i/master/fzf-commandt-ctrlp-unite.gif

I read the page and saw that unite.vim's cache performance was very poor.
But I have since improved file_rec/async performance in unite.vim.

I would like to re-run the test on the same files.

@junegunn
Owner

Hi, Shougo.

I didn't prepare a specific set of files for the test. It was run in a directory where I keep assorted git repositories, so I don't have an exact snapshot of it from the time of writing. But if I recall correctly, the directory contained more than 150 git repositories, with over 150K files in total. Actually, I don't think the content of the directory matters: if you have that many files, the result should be similar.

To be more specific, the test was run on my MacBook Pro, which has 4 cores and an SSD. The four plugins were run simultaneously in synchronized tmux split panes, so one could say it was not a perfectly fair test for Command-T, which can use multiple cores.

I'm glad to hear that you improved the performance of Unite and I'm willing to help you with your progress. If you want me to run the test again, let me know. :smiley:

@junegunn junegunn added the question label
@Shougo

Thank you!

I'm glad to hear that you improved the performance of Unite and I'm willing to help you with your progress. If you want me to run the test again, let me know.

Can you test it again? The latest unite.vim and vimproc are needed.

I measured your previous test.

fzf: 4.9[s]
Command-T: 7.0[s]
ctrlp: 35.5[s]
unite.vim: Over 240[s]

@junegunn
Owner

[GIF: showdown]

I can clearly see that the scan performance of Unite.vim has been significantly improved. In this case, Unite.vim was lucky that the match popped up in an earlier batch. But the entire scan still took longer than ctrlp.

One thing I noticed, though, is that Unite.vim found a smaller number of files, around 100k, as opposed to 120k from fzf or ctrlp. FYI, the default command fzf uses is as follows:

find * -path '*/\.*' -prune -o -type f -print -o -type l -print 2> /dev/null
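
To make it easier to see what that command includes and skips, here is a small sketch that runs it in a throwaway directory. The file names are made up for illustration, and `sort` is added only to make the output order stable. Hidden entries below the top level are pruned by `-path '*/\.*'`, top-level hidden entries are never passed in because `*` does not expand to dotfiles, and both regular files and symlinks are printed.

```shell
# Hypothetical layout, for illustration only.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p repo/.git repo/src
touch repo/README repo/src/main.c repo/.git/config repo/.hidden .top-hidden
ln -s src/main.c repo/link.c

# Same find command as above; only the sort is an addition.
listing=$(find * -path '*/\.*' -prune -o -type f -print -o -type l -print 2> /dev/null | LC_ALL=C sort)
echo "$listing"
```

In this layout only `repo/README`, `repo/link.c` and `repo/src/main.c` should be listed.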
@junegunn
Owner

I hope that answered your question. FYI, I updated the blog post and added the link to this issue.

@junegunn junegunn closed this
@wellle

I updated the blog post

Where can I find that blog post?

@wellle

Good read! You might want to link that in the Readme :+1:

@junegunn
Owner

Thanks. It already is there :)

@wellle

Oh, now I see it, nevermind then :)

@Shougo

Thank you so much!

I can clearly see that the scan performance of Unite.vim has been significantly improved. In this case, Unite.vim was lucky that the match popped up in an earlier batch. But the entire scan still took longer than ctrlp.

Oh, unite.vim is slower than ctrlp...
But I think your unite.vim configuration is not tuned for full performance.

  1. Can you enable if_lua in your Vim? unite.vim is optimized for an if_lua environment.
  2. Can you install the ag command? Ag is faster than find.
  3. Can you use :Unite file_rec/async -sync? It blocks Vim, but it is faster.
  4. If it is a git repository, "file_rec/git" is faster than "file_rec/async".

One thing I noticed, though, is that Unite.vim found a smaller number of files, around 100k, as opposed to 120k from fzf or ctrlp. FYI, the default command fzf uses is as follows:

unite.vim ignores the files matched by g:unite_source_rec_ignore_pattern automatically.

@junegunn
Owner

Can you install the ag command? Ag is faster than find.

Actually, as far as I know, this is not true. For just blindly traversing directories, find is much faster. ag is slower (and useful) because it does more work than find, like filtering by the patterns in .gitignore, etc. And more importantly, other plugins can also be configured to use ag.

If it is a git repository, "file_rec/git" is faster than "file_rec/async".

I assume that it uses git ls-files, right? Yes, it is faster than plain find, but other plugins can also benefit from the command. And the test directory was not a single git repository, but a collection of over 100 git repos.

unite.vim ignores the files matched by g:unite_source_rec_ignore_pattern automatically.

It was undefined and not used in the test.


So I'm rerunning the test with lua enabled, but contrary to my expectation, it doesn't seem to make a lot of difference. Maybe there's something wrong with my configuration? This is all I have.

let g:unite_source_rec_max_cache_files = 0
call unite#filters#matcher_default#use(['matcher_fuzzy'])

And by the way, do you have your own benchmark result?

@Shougo

Actually, as far as I know, this is not true. For just blindly traversing directories, find is much faster. ag is slower (and useful) because it does more work than find, like filtering by the patterns in .gitignore, etc. And more importantly, other plugins can also be configured to use ag.

OK, but I tested it. Ag is faster than find. Why?

It was undefined and not used in the test.

It is set automatically by unite.vim. To disable it, you must set it to "". But it does not ignore ".git" directories.

let g:unite_source_rec_ignore_pattern = ''
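
For a rough idea of how many files an ignore pattern drops from a listing, one could filter the find output through grep and count the matches. This is only a sketch: the pattern and the file names below are made-up stand-ins, not unite.vim's actual default.

```shell
# Hypothetical stand-in pattern; unite.vim's real default differs.
ignore_pattern='\.(o|pyc|swp)$'

# Throwaway directory with made-up files.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir src
touch src/a.c src/a.o src/b.pyc notes.txt

# Count all files, then the ones the pattern would hide.
total=$(find * -type f | wc -l | tr -d ' ')
ignored=$(find * -type f | grep -E -c "$ignore_pattern")
echo "total=$total ignored=$ignored kept=$((total - ignored))"
```

Running the same two counts in the real test directory, with the pattern unite.vim actually sets, would show whether the ignore pattern accounts for the gap in file counts.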

So I'm rerunning the test with lua enabled, but contrary to my expectation, it doesn't seem to make a lot of difference. Maybe there's something wrong with my configuration? This is all I have.

Is it :Unite file_rec/async -sync?

And by the way, do you have your own benchmark result?

Unfortunately, no...

@junegunn
Owner

OK, but I tested it. Ag is faster than find. Why?

Hmm, this is what I got:

> time find * | wc -l
  177118

real    0m0.837s
user    0m0.146s
sys     0m0.420s

> time find * -path '*/\.*' -prune -o -type f -print -o -type l -print 2> /dev/null | wc -l
  129134

real    0m0.995s
user    0m0.577s
sys     0m0.432s

> time ag -g "" | wc -l
  113153

real    0m1.971s
user    0m1.413s
sys     0m0.567s

It is set automatically by unite.vim. To disable it, you must set it to "".

Okay. I see. I'll set it on the next run.

Is it :Unite file_rec/async -sync?

I've tried the option, but I haven't yet found time to precisely measure and compare the results. I'll let you know, hopefully in a couple of days.

@Shougo
% time ag -g "" | wc -l
40424
ag -g ""  0.39s user 1.02s system 56% cpu 2.479 total
wc -l  0.01s user 0.00s system 0% cpu 2.479 total
% time find * -path '*/\.*' -prune -o -type f -print -o -type l -print 2> /dev/null | wc -l
40844
find * -path '*/\.*' -prune -o -type f -print -o -type l -print 2> /dev/null  0.08s user 0.12s system 91% cpu 0.222 total
wc -l  0.00s user 0.01s system 3% cpu 0.222 total
% time find * | wc -l
68540
find *  0.12s user 0.54s system 41% cpu 1.601 total
wc -l  0.00s user 0.02s system 1% cpu 1.600 total

This is my result. Thanks. I will improve it.

@Shougo

I improved "file_rec/async" source. Can you test it?

So I'm rerunning the test with lua enabled, but contrary to my expectation, it doesn't seem to make a lot of difference. Maybe there's something wrong with my configuration? This is all I have.

I think this configuration is better for performance.

let g:unite_source_rec_max_cache_files = 0
let g:unite_winheight = 10
call unite#filters#matcher_default#use(['matcher_fuzzy'])
call unite#custom#source('file_rec/async', 'converters', [])
call unite#custom#source('file_rec/async', 'sorters', [])
call unite#custom#source('file_rec/async', 'max_candidates', 10)

This is the maximum performance I can get. I gave up.

@junegunn
Owner

Okay, I'll rerun the test with the configuration tomorrow or the day after tomorrow and let you know. But you don't need to wait for my result: the test is very simple, so you can easily run a similar one, and the result probably won't contradict mine.

@Shougo

Thank you!

@Shougo Shougo referenced this issue in Shougo/unite.vim
Closed

Use of ag as async command #622

@wincent

Command-T has some benchmarks to test the speed of the matcher (unlike what you're measuring here, which is scanning speed).

Would be interesting to test the other matchers against this:

https://github.com/wincent/Command-T/blob/master/bin/benchmarks/matcher.rb

@junegunn
Owner

@wincent

Thanks! I'm interested in both scanning performance and matcher performance (although fzf itself doesn't scan but simply delegates), and ultimately the integrated user experience as a whole. Unfortunately these concerns are not clearly separated in this thread.

My conclusion so far is this: regarding matcher performance, Command-T is the fastest, but since it blocks until the list is ready, it is possible for the asynchronous fzf to finish before Command-T, as shown in the GIF. I know that Command-T caches the list, so it's quite likely that it will outperform fzf in subsequent tests, and since this observation is not stated in the article, I understand that you might find it a bit unfair in that sense. :)

@markwu

@Shougo @junegunn A better idea, can I use fzf with unite? Then, I can combine both of them, and get super power.

@Shougo

No, because fzf has its own interactive UI and requires a terminal emulator.
If fzf worked like find, git ls-files, or ag, unite could use fzf.

@Shougo Shougo referenced this issue in Shougo/unite.vim
Closed

Speed of file source #625

@markwu

I see, too bad.

@junegunn
Owner

@Shougo Hey Shougo, I was trying to test again but I ran into this error, any idea?

Error detected while processing function <SNR>35_call_unite_empty..unite#start..unite#start#standard..unite#view#_init_cursor..unite#view#_set_cursor_line..unite#view#_match_line:
line    1:
E117: Unknown function: matchaddpos
E15: Invalid expression: has('patch7.4.340') ? matchaddpos(a:highlight, [a:line], 10, a:id) : matchadd(a:highlight, '^\%'.a:line.'l.*', 10, a:id)

FYI vim --version shows:

VIM - Vi IMproved 7.4 (2013 Aug 10, compiled Jun 16 2014 17:50:59)
MacOS X (unix) version
Included patches: 1-273
Compiled by Homebrew
Huge version without GUI.  Features included (+) or not (-):
+acl             +farsi           +mouse_netterm   +syntax
+arabic          +file_in_path    +mouse_sgr       +tag_binary
+autocmd         +find_in_path    -mouse_sysmouse  +tag_old_static
-balloon_eval    +float           +mouse_urxvt     -tag_any_white
-browse          +folding         +mouse_xterm     -tcl
++builtin_terms  -footer          +multi_byte      +terminfo
+byte_offset     +fork()          +multi_lang      +termresponse
+cindent         -gettext         -mzscheme        +textobjects
-clientserver    -hangul_input    +netbeans_intg   +title
+clipboard       +iconv           +path_extra      -toolbar
+cmdline_compl   +insert_expand   -perl            +user_commands
+cmdline_hist    +jumplist        +persistent_undo +vertsplit
+cmdline_info    +keymap          +postscript      +virtualedit
+comments        +langmap         +printer         +visual
+conceal         +libcall         +profile         +visualextra
+cryptv          +linebreak       +python          +viminfo
+cscope          +lispindent      -python3         +vreplace
+cursorbind      +listcmds        +quickfix        +wildignore
+cursorshape     +localmap        +reltime         +wildmenu
+dialog_con      +lua             +rightleft       +windows
+diff            +menu            +ruby            +writebackup
+digraphs        +mksession       +scrollbind      -X11
-dnd             +modify_fname    +signs           -xfontset
-ebcdic          +mouse           +smartindent     -xim
+emacs_tags      -mouseshape      -sniff           -xsmp
+eval            +mouse_dec       +startuptime     -xterm_clipboard
+ex_extra        -mouse_gpm       +statusline      -xterm_save
+extra_search    -mouse_jsbterm   -sun_workshop    -xpm
   system vimrc file: "$VIM/vimrc"
     user vimrc file: "$HOME/.vimrc"
 2nd user vimrc file: "~/.vim/vimrc"
      user exrc file: "$HOME/.exrc"
  fall-back for $VIM: "/usr/local/share/vim"
Compilation: /usr/bin/clang -c -I. -Iproto -DHAVE_CONFIG_H   -F/usr/local/Frameworks -DMACOS_X_UNIX  -Os -w -pipe -march=native -mmacosx-version-min=10.9 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=1
Linking: /usr/bin/clang   -L. -L/usr/local/lib -L/usr/local/lib -F/usr/local/Frameworks -Wl,-headerpad_max_install_names -o vim        -lm  -lncurses -liconv -framework Cocoa  -L/usr/local/lib -llua  -framework Python   -lruby.2.0.0 -lobjc
@junegunn
Owner

I reverted Unite.vim as suggested in Shougo/unite.vim#627 (comment).

So here is the new GIF (sorry the demo is a bit silly):

[GIF: showdown-20140619]

Configuration used:

let g:ctrlp_max_files    = 0

let g:CommandTMaxFiles    = 200000
let g:CommandTFileScanner = 'find'

let g:unite_source_rec_max_cache_files = 0
let g:unite_winheight = 10
call unite#filters#matcher_default#use(['matcher_fuzzy'])
call unite#custom#source('file_rec/async', 'converters', [])
call unite#custom#source('file_rec/async', 'sorters', [])
call unite#custom#source('file_rec/async', 'max_candidates', 10)
@wincent

Great gif. I'm still puzzled at how FZF can scan the file-system so fast. I get that it's doing it async and starting to populate the results before it's finished scanning, but still, eyeballing it, it looks like it takes about 2 seconds to scan about 130k files (compared to Command-T, which looks to take about 5 seconds, even though it's using the relatively quick find-based scanner).

I haven't looked at the source yet, but it's odd that Ruby appears to be scanning faster than find (which is written in C).

@junegunn
Owner

@wincent No, fzf doesn't scan. It simply delegates to the find command.

> time find * -path '*/\.*' -prune -o -type f -print -o -type l -print | wc -l
  129134

real    0m1.012s
user    0m0.589s
sys     0m0.435s
@wincent

I guess the difference is that Command-T is calling back into Vim for each path in order to exclude files matched by the 'wildignore' pattern:

https://github.com/wincent/Command-T/blob/master/ruby/command-t/scanner/file_scanner/find_file_scanner.rb#L63

...which itself calls...

https://github.com/wincent/Command-T/blob/master/ruby/command-t/scanner/file_scanner.rb#L72-77

The Watchman scanner doesn't do this, so is much faster. But now I'm tempted to make a FastFindFileScanner that does the faster, but less correct, thing. (Benchmarks, eh...)

@junegunn
Owner

@wincent Ah yes, that should be the reason for the difference. fzf obviously does not do that. I guess it would be much faster if you could translate wildignore into arguments to the find command instead of calling a Vim function for every path.
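
A minimal sketch of that idea, assuming the ignore list holds only simple basename globs such as `*.o`. The function name is hypothetical, and Vim's real 'wildignore' matching has more rules than `-name` covers, so this is not a faithful translation.

```shell
# Hypothetical helper: list files under $1, excluding every glob in the
# comma-separated list $2. The body is a subshell, so the IFS and
# "set -f" changes do not leak into the caller.
wildignore_to_find() (
  dir=$1
  list=$2
  set --                # reuse the positional parameters as find arguments
  IFS=','
  set -f                # keep the shell from expanding the globs itself
  for pattern in $list; do
    set -- "$@" '!' -name "$pattern"
  done
  find "$dir" -type f "$@" -print
)

# Throwaway directory with made-up files.
demo=$(mktemp -d)
touch "$demo/a.c" "$demo/a.o" "$demo/b.pyc"
kept=$(cd "$demo" && wildignore_to_find . '*.o,*.pyc')
echo "$kept"
```

With this approach the filtering happens inside a single find process, so no per-path call back into Vim is needed.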

@Shougo

@junegunn Thanks. I fixed the error.

So here is the new GIF (sorry the demo is a bit silly):

Thanks. I will improve unite.vim performance.
