New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reimplemented ctags-search command #2625

Open
wants to merge 14 commits into
base: master
from

Conversation

Projects
None yet
4 participants
@andreyorst
Copy link
Contributor

andreyorst commented Dec 8, 2018

Conversation: #662

I have reimplemented ctags-search command so it could search for any arbitrary sequence of symbols within the opened file, thus making this command more reliable. Current awk-based implementation fails to find many tags in non-C files. For example in asciidoc files:

image
We search for Scratch Buffers tag, and jump to it's location, but end up on = Buffers tag for some reason.

My implementation doesn't have this bug, but more importantly, it handles this:
image
So there's no need to truncate tag when we find first opening curly brace, like it is done in current awk command.

The downside of this implementation is that it is slower when we have lots of files with identical tag names, e.g. searching for Kakoune tag in Kakoune sources will take around 2 seconds to build menu on my machine. I think that this part can be optimized, but I don't see a way of doing it.

@occivink

This comment has been minimized.

Copy link
Contributor

occivink commented Dec 9, 2018

The performance snag you're hitting is likely due to all the calls to sed and cut in a tight loop. When operating on small input, the cost of spawning a subprocess completely dwarfs anything else. The awk implementation doesn't have that problem since it's only one subprocess. Is there a reason why the awk implementation couldn't be improved/rewritten to handle any symbol?

@andreyorst

This comment has been minimized.

Copy link
Contributor Author

andreyorst commented Dec 9, 2018

The performance snag you're hitting is likely due to all the calls to sed and cut in a tight loop.

I understand it, that's why I've put as much as I've could to single sed call.

Is there a reason why the awk implementation couldn't be improved/rewritten to handle any symbol?

I don't know awk. I've made a symbol command to navigate through the current file symbols, in which I've used awk implementation from ctags-search and noticed cracks in it's design, as it handled tags badly or wrong in different situations. So I've teared down awk command, understood what it does and rewritten it with sed and cut.

@andreyorst

This comment has been minimized.

Copy link
Contributor Author

andreyorst commented Dec 10, 2018

I'll try to implement it in awk

@lenormf

This comment has been minimized.

Copy link
Contributor

lenormf commented Dec 10, 2018

I downloaded the latest v4.9 longterm kernel and created tag files in the root directory, in net and in net/ipv4, then opened tcp_output.c from the net/ipv4/ directory and set the option:

set global ctagsfiles tags ../tags ../../tags

Looking up the tcp_sk tag took more or less 2 seconds on my system with the builtin implementation of ctags-search, and about a second with the one from this PR, so it looks encouraging (I don't know if caching took a role in this).

@andreyorst

This comment has been minimized.

Copy link
Contributor Author

andreyorst commented Dec 10, 2018

@lenormf I suppose that the only slow part in this implementation is building the menu when there's multiple files where tags with identical name are defined. I didn't change the rest of this function, so if there's only one instance of tcp_sk tag in those tag files, then it will be as fast as current builin implementation.

@andreyorst

This comment has been minimized.

Copy link
Contributor Author

andreyorst commented Dec 10, 2018

I've also added selection of actual tag in the line. Because there are lots of places like this:
notes2

@mawww
Copy link
Owner

mawww left a comment

My understanding is that this just fixes the escaping problems with the previous implementation. I think I would prefer to keep the awk based implementation that I find generally cleaner (and should be easier to make more efficient, as we do not need to fork many seds and cuts, especially as awk is providing sed/cut as built-in (it already supports field splitting, and sed like line matching).

Show resolved Hide resolved rc/base/ctags.kak Outdated
Show resolved Hide resolved rc/base/ctags.kak Outdated
@andreyorst

This comment has been minimized.

Copy link
Contributor Author

andreyorst commented Dec 23, 2018

My understanding is that this just fixes the escaping problems with the previous implementation.

@mawww Yes, that's the point of this PR. It shows that we can support full featured tag search in file, without problems with using characters that are used in balanced strings, which seems the core issue with previous implementation that author faced back then.

I think I would prefer to keep the awk based implementation that I find generally cleaner (and should be easier to make more efficient, as we do not need to fork many seds and cuts, especially as awk is providing sed/cut as built-in (it already supports field splitting, and sed like line matching).

I know that this is better to use awk, since it doesn't spam lots of small processes, but unfortunately I don't know awk. I've tried to implement it in awk, but failed and I don't really understand why it wasn't working. If anyone can transit this escaping mechanism and candidate selection to old awk implementation I will be happy. If I understand it correctly, awk can do the same type substitutions which sed does, but I don't understand the syntax.

@andreyorst

This comment has been minimized.

Copy link
Contributor Author

andreyorst commented Jan 29, 2019

I've figured out how awk works and now I'm re-implementing this in it, using all suggestions that @mawww commented out in reviews. This isn't compete yet right now, but I'll finish it soon with some more commits.

@andreyorst

This comment has been minimized.

Copy link
Contributor Author

andreyorst commented Jan 30, 2019

The first half is working fine. How do I generate tags file with line numbers instead of search patterns in it?

I'm trying with ctags -x --_xformat="%-20N %-16{input} %4n", and replacing all spaces with tabs between words, but that might be wrong format compared to what actually being used. It works, both old original implementation and my implementation.

@andreyorst

This comment has been minimized.

Copy link
Contributor Author

andreyorst commented Jan 30, 2019

With current changes command is ready for testing. I've implemented the same behaviors for both search-pattern tags and line-number tags. Everything should work the same as shown in the gifs above.

@andreyorst

This comment has been minimized.

Copy link
Contributor Author

andreyorst commented Feb 2, 2019

I've tested this code with daily usage of my symbol command (I don't use ctags-search much, because of LSP, but I generate temporary tags file to navigate within current file), which shares the same implementation, and was able to jump on every tag I wanted to jump.

andreyorst added some commits Feb 13, 2019

@andreyorst

This comment has been minimized.

Copy link
Contributor Author

andreyorst commented Feb 13, 2019

@mawww what do you think about externalizing the awk part from ctags-search function into ctags-jump function? Two my other plugins and one custom command rely on this part of code, but I can't use ctags-search in those cases directly, because it does additional stuff which I do myself and differently. Because of that I need to duplicate the code in each realisation instead of using builtin one.

@mawww

This comment has been minimized.

Copy link
Owner

mawww commented Feb 14, 2019

What would that ctags-jump command do ?

@andreyorst

This comment has been minimized.

Copy link
Contributor Author

andreyorst commented Feb 14, 2019

It will do the same thing as ctags-search does now, and awk implementation will stay the same. Core difference will be that ctags-search won't generate a menu by itself, as only thing it should do is "find a tag in some file", then we will be able to pass a tag file and a tag to ctags-jump via parameters. Then ctags-jump will build a menu for this tag, and jump.

Maybe this is overkill, but this way the awk-parser thing can be reused in different tag related places.

@andreyorst

This comment has been minimized.

Copy link
Contributor Author

andreyorst commented Feb 17, 2019

I've tried to, but it overcomplicates the code because of the need to search in several tag files while producing a single menu. So I'll keep it as is now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment