Too many data will crash native fruzzy matcher #19

Closed
mars90226 opened this issue May 16, 2019 · 12 comments

@mars90226

When I use Denite with the command_history source, which lists the Vim command history, together with the native fruzzy matcher, it crashes. Detailed bug information can be found in Shougo/denite.nvim#636.
When I start with nvim -i NONE to ignore the previous command history, the problem goes away, so I think the cause is probably that my Vim command history is too large.

@raghur
Owner

raghur commented May 16, 2019

Can you turn off usenative and check how many entries you have in your command history, so I can test with something similar?

@mars90226
Author

Sure, I have 8305 entries in my command history.

@mars90226
Author

I used set history=[limit], as suggested by Shougo, to find the limit. For me, the limit is 1548: with set history=1549, fruzzy crashes. The corresponding command history size is between 27461 and 27485 bytes (counted by opening q:, selecting all, and pressing g <C-g>).

raghur added a commit that referenced this issue May 17, 2019
@raghur
Owner

raghur commented May 17, 2019

@mars90226 - I just added a test with 2400 lines (120k file size). At least with pytest, this passes (I'm testing this on Win 8.1). That suggests the problem could lie at the denite/fruzzy boundary.

What OS are you on? Also, if you aren't on Windows, is it possible to run pytest? See instructions on the README.

@mars90226
Author

I've run FUZZY_CMOD=1 pytest --log-level=debug -s. Here's the result:

==================================================================================== test session starts =====================================================================================
platform linux -- Python 3.6.7, pytest-4.5.0, py-1.8.0, pluggy-0.11.0
benchmark: 3.2.2 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /root/.vim/plugged/fruzzy/rplugin
plugins: benchmark-3.2.2
collected 18 items

fruzzy_test.py .................lenght of results - 10
.


--------------------------------------------------------------------------------------------------------- benchmark: 5 tests ---------------------------------------------------------------------------------------------------------
Name (time in us)                               Min                 Max               Mean             StdDev             Median                IQR            Outliers  OPS (Kops/s)   Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_measure_baseline_native_call           11.4143 (1.0)      421.0388 (3.68)     13.3437 (1.0)      21.3837 (5.06)     11.8725 (1.0)       0.1984 (1.0)      147;1799       74.9420 (1.0)    34313           1
test_must_prefer_longer_match               32.2582 (2.83)     114.5201 (1.0)      39.3677 (2.95)      5.5705 (1.32)     42.4199 (3.57)      9.4557 (47.67)    7012;128       25.4015 (0.34)    21841           1
test_must_prefer_match_at_end               40.8161 (3.58)     137.7100 (1.20)     48.9782 (3.67)      5.7976 (1.37)     51.1087 (4.30)     10.4345 (52.60)     1400;24       20.4172 (0.27)     4296           1
test_must_prefer_match_after_separators     48.7715 (4.27)     159.2608 (1.39)     58.9746 (4.42)      4.5775 (1.08)     58.9928 (4.97)      1.7174 (8.66)    1608;1959       16.9564 (0.23)    13202           1
test_must_score_cluster_higher              49.7187 (4.36)     132.0997 (1.15)     59.4936 (4.46)      4.2253 (1.0)      59.7546 (5.03)      1.0524 (5.31)    1320;2613       16.8085 (0.22)    13013           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
================================================================================= 18 passed in 4.47 seconds ==================================================================================

Everything seems normal, so I'm not sure what the problem is. Maybe I should try ctrlp with fruzzy?

@raghur
Owner

raghur commented May 17, 2019

Thanks - much appreciated. I suppose there's a bug there somewhere - just that I haven't been able to find it. It doesn't seem to be the actual size of the data though. I'm going to try to inject the same dataset through denite and see if I can repro your crash.

Could you try one more test? On the same branch, replace the contents of neomru_file_big with your command history and run the tests. Alternatively, if you send me your command history, I can spare you the trouble.

@mars90226
Author

Here's the result with neomru_file_big replaced with my command history. It seems the history contains an invalid UTF-8 byte sequence.

==================================================================================== test session starts =====================================================================================
platform linux -- Python 3.6.7, pytest-4.5.0, py-1.8.0, pluggy-0.11.0
benchmark: 3.2.2 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /root/.vim/plugged/fruzzy/rplugin
plugins: benchmark-3.2.2
collected 0 items / 1 errors

=========================================================================================== ERRORS ===========================================================================================
______________________________________________________________________________ ERROR collecting fruzzy_test.py _______________________________________________________________________________
fruzzy_test.py:143: in <module>
    biglist = [line.strip() for line in fh.readlines()]
/usr/lib/python3.6/codecs.py:321: in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfd in position 3605: invalid start byte
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
================================================================================== 1 error in 0.08 seconds ===================================================================================

I've also found that :Denite register crashes too; it's caused by a register whose content is hhi<80><fc>^Hn^[l. I'm not sure where this <80><fc> comes from, though.
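
A note on the traceback above: the test run fails while reading the file, during collection, because the text-mode read decodes strictly as UTF-8 and rejects the 0xfd byte. A minimal sketch of a tolerant reader follows, assuming it is acceptable to replace undecodable bytes with U+FFFD. The helper name read_lines_tolerant and the demo file contents are hypothetical and are not part of fruzzy_test.py.

```python
import os
import tempfile


def read_lines_tolerant(path):
    """Read a file as UTF-8 text, replacing undecodable bytes with U+FFFD.

    Hypothetical helper for illustration; not taken from fruzzy_test.py.
    """
    with open(path, encoding="utf-8", errors="replace") as fh:
        return [line.strip() for line in fh]


# Demo with a throwaway file that contains a stray 0xfd byte,
# the byte value reported in the UnicodeDecodeError above.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"echo ok\ncall \xfd_demo()\n")
    demo_path = tmp.name

demo_lines = read_lines_tolerant(demo_path)
os.remove(demo_path)

for line in demo_lines:
    print(repr(line))
```

This only gets the data past the file read so the matcher can be exercised with it; whether the native matcher then copes with the replaced characters is a separate question.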

@mars90226
Author

mars90226 commented May 17, 2019

Two commands contain corrupt Unicode:

call ������2_last_tab()
echo ������2_last_tab()

Deleting them makes native fruzzy work. It's odd, though, that non-native fruzzy handles them while native fruzzy can't.

The hex dump of the corrupt commands is:

┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 63 61 6c 6c 20 fd bf bf ┊ ba b4 83 32 5f 6c 61 73 │call ×××┊×××2_las│
│00000010│ 74 5f 74 61 62 28 29 0a ┊ 65 63 68 6f 20 fd bf bf │t_tab()_┊echo ×××│
│00000020│ ba b4 83 32 5f 6c 61 73 ┊ 74 5f 74 61 62 28 29 0a │×××2_las┊t_tab()_│
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘
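
To locate such entries without deleting them by hand, the history can be scanned as raw bytes and each line decoded separately. The sketch below is one way to do that, using the exact bytes from the hex dump above as input. The function name find_invalid_utf8_lines is hypothetical. The 0xfd byte is the culprit in both lines, since it never occurs in well-formed UTF-8.

```python
def find_invalid_utf8_lines(data: bytes):
    """Return (line_number, byte_offset_in_line, bad_byte) for every line
    that fails strict UTF-8 decoding.

    Hypothetical helper for illustration; not part of fruzzy or denite.
    """
    problems = []
    for lineno, raw in enumerate(data.split(b"\n"), start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first undecodable byte.
            problems.append((lineno, exc.start, raw[exc.start]))
    return problems


# The two corrupt commands, copied byte for byte from the hex dump.
history = bytes.fromhex(
    "63 61 6c 6c 20 fd bf bf ba b4 83 32 5f 6c 61 73"
    "74 5f 74 61 62 28 29 0a 65 63 68 6f 20 fd bf bf"
    "ba b4 83 32 5f 6c 61 73 74 5f 74 61 62 28 29 0a"
)

for lineno, offset, bad in find_invalid_utf8_lines(history):
    print(f"line {lineno}: invalid byte 0x{bad:02x} at offset {offset}")
```

Both lines are reported at offset 5, directly after "call " and "echo ".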

@raghur
Owner

raghur commented May 17, 2019 via email

mars90226 added a commit to mars90226/dotvim that referenced this issue May 20, 2019
The reason is that native fruzzy cannot handle unicode sequence and
crash. Delete the commands that contain unicode sequence can avoid this
problem.

Ref: raghur/fruzzy#19
@mars90226
Author

So, I think I can close this issue as a duplicate of #2?

@raghur
Owner

raghur commented May 23, 2019

Yeah - I just have to get around to it.

@raghur
Owner

raghur commented May 24, 2019

@mars90226 - I tried some basic Unicode support, but the timings are about the same as Python 3 (370 - 420 us). Nim's unicode API isn't my strong suit, and if all it's going to result in is a solution that's only as performant as plain Python, then it calls into question whether it's even worth it.

I've pushed it to the branch I created for this bug, in case I pick it up again.

mars90226 added a commit to mars90226/dotvim that referenced this issue Jan 22, 2020
The reason is that native fruzzy cannot handle unicode sequence and
crash. Delete the commands that contain unicode sequence can avoid this
problem.

Ref: raghur/fruzzy#19