-
Notifications
You must be signed in to change notification settings - Fork 620
Enable local variables for C/C++ and improve autocompletion #3185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
What the heck, why a new PR? So much invaluable discussion in the old PR that won't be directly connected. |
See the end of the discussion - I wanted to rebase the PR because there were many fixups and reverts and other mess. If I rebased it there, it wouldn't be clear what the discussion was about because the commits wouldn't correspond to the discussion. The PR is linked at the top of this PR so I don't see any problem. |
Tested in the same place as before, and some other places, works as expected. This is such a big improvement on previous capabilities it should be merged and improvements can be added later. If there are no changes and no objections I will merge as is after the weekend. Then hopefully it gets some testing and any actual performance issues can be identified rather than more premature optimisation. (The idea is to merge before you think of something else and mess this PR up too 😜 ) |
Good strategy :-). But I'm pretty happy with the result after testing it a little more (and discovering the easy-to-fix problems with inheritance from other files). So yeah, I agree. Maybe #3176 could be merged too at the same time - I think it's a low-risk and simple patch, also related to autocompletion and hopefully not very controversial. |
I approved #3176 a while ago, question is why it's not merged already :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really happy with b710cf3
- it makes a lot of assumption
- it seems really expensive
- works only for C/C++ by hardcoding
- has nothing to do with local variables
Please lets discuss that feature in a separate PR. Also, if we want that, then it may make sense to expose that "find header" logic somehow to implement a "go to header" feature (I think ProjectOrganizer already has something like that?)
src/symbols.c
Outdated
foreach_ptr_array(tmtag, i, tags) | ||
{ | ||
if (tmtag->type & tm_tag_local_var_t && | ||
(doc->tm_file != tmtag->file || | ||
current_line < tmtag->line || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tm_parser_var_valid_before_declare()
should be called here, no?
Also, the whole condition is basically a duplicate of the same in is_valid_autocomplete_tag()
. Can is_valid_autocomplete_tag()
be made callable from here (perhaps with a different name)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tm_parser_var_valid_before_declare() should be called here, no?
In fact, I'm less convinced whether tm_parser_var_valid_before_declare()
is a good idea now - even though you can declare (i.e. assign) variables after their first appearance in file, I think this pattern is not so common and like in the C/C++ case because of this you will get invalid variables in the autocompletion in many cases which IMO is not worth it.
Also, the whole condition is basically a duplicate of the same in is_valid_autocomplete_tag(). Can is_valid_autocomplete_tag() be made callable from here (perhaps with a different name)?
I was thinking about it but there's small difference - while with autocompletion we have only prefix, here we have the whole symbol name so less "false positives" in the result so I thought I could leave it this way. In any case, I don't have a strong opinion about this and these two could indeed be unified. This function could go to tm_tag.c.
@@ -601,6 +601,7 @@ static void show_autocomplete(ScintillaObject *sci, gsize rootlen, GString *word | |||
} | |||
/* store whether a calltip is showing, so we can reshow it after autocompletion */ | |||
calltip.set = (gboolean) SSM(sci, SCI_CALLTIPACTIVE, 0, 0); | |||
SSM(sci, SCI_AUTOCSETORDER, SC_ORDER_CUSTOM, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read the docs but I couldn't grasp the difference between SC_ORDER_PRESORTED
and SC_ORDER_CUSTOM
. In either case scintilla does not sort by itself. SC_ORDER_PRESORTED
seems to assume the list is presorted alphabetically but what's the deal with that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't explore the logic of how we show the popup and if we need it (but @elextr seemed to experience some problem without it). The way I understand it is that when we call SCI_AUTOCSHOW
, Scintilla takes care of filtering the list based on what the user types unless we update it ourselves. When the list is sorted, Scintilla can bisect where the start/end of the range to be displayed is and show the values in between (this really is my assumption, I haven't checked Scintilla sources). When not sorted, it first creates an index that corresponds to alphabetic ordering and does the bisection over this index. This makes the complexity 2*log(N)
(plus the initial creation of the index) instead of N
if it had to go through all the values every time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A quick scan of the code suggests Scintilla makes the index in all cases, but only sorts it in the SC_ORDER_CUSTOM
case, so the sort is the extra cost. Then as @techee said it binary searches the index for the selection each time you type, so when we passed SC_ORDER_PRESORTED
by leaving the default it was binary searching an unsorted index and that gives undefined results, which is probably why @techee didn't see it and I did.
for (i = 0; i < src_len && num > 0; i++) | ||
{ | ||
TMTag *tag = *src; | ||
if (predicate(tag, info) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every predicate function contains the is_autocomplete_tag() check. I think it makes sense to have that call here (before calling the predicate). The function could be suitably named copy_autocomplete_tags()
That should perform a bit better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can do, I would just move it after predicate()
- is_autocomplete_tag()
potentially makes string comparison of scopes and it's better to eliminate those with the other conditions first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm ok. I was more concerned about calling a function via pointer and hoped it could be avoided. But a string comparison isn't negligible either. And I don't really want to start doing micro-benchmarks here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, lets leave it and start worrying if it ever actually causes an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call over pointer doesn't matter at all, it's still the same function in the loop and the processor will handle it for free. Having to compare scope for local variables is another story - with local variables enabled we get 3x more tags than before and eliminating the comparison early by checking whether we are in the right file definitely helps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regardless of the order, IMO the call to is_autocomplete_tag()
doesn't need to be duplicated in each predicate function but can be called in one place here, especially if the compiler whats to inline it.
Indirect calls aren't as free as direct calls but it lets not focus on that, I don't care much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regardless of the order, IMO the call to is_autocomplete_tag() doesn't need to be duplicated in each predicate function but can be called in one place here, especially if the compiler whats to inline it.
Agree.
And if performance is your concern, we can actually do better. I didn't think of it when writing the patch but these calls
copy_tags(dst, found, count, name_table, max_num - dst->len, is_local_tag, info);
copy_tags(dst, found, count, name_table, max_num - dst->len, is_current_file_tag, info);
copy_tags(dst, found, count, name_table, max_num - dst->len, is_header_file_tag, info);
could be modified not to perform on the workspace tags but only the tags from the current file or the current header (that is, from info->file->tags_array
or info->header->tags_array
(of course after performing tm_tags_find() on them before). That should speed up the search for local tags, current file's tags and header (and possibly include) tags significantly.
Basically the only assumption that this makes is that sources and corresponding headers have the same file name and headers have some of the commonly used header extensions.
Note that this doesn't use any file system operations - this works over TMSourceFile structures which represent parsed source files, which, without any extensions, are only open documents (with something like ProjectOrganizer it can be whole projects). It's true this function does string comparisons which can be a bit expensive but the number of files tends to be 2 orders of magnitude less than the number of tags.
This is true but I don't see any sign that any of the other ctags parsers would get the capabilities of the cxx parser soon so scope autocompletion will unfortunately be C/C++ only feature for the foreseeable future.
It doesn't but neither other things in this PR do - this is about improving scope autocompletion. When you type something like
it makes a difference whether the code first looks for a member In fact, I was thinking about improving this by looking at Now thinking about it I should have implemented it differently - I could do it inside
I can separate it but in some form I want to keep this feature because it improves the results of scope autocompletion.
The difference between ProjectOrganizer and Geany is that Geany doesn't know about all project files (there's no notion of a file belonging to a project in Geany). For every project file in ProjectOrganizer there is TMSourceFile, in Geany just for open files and this functionality cannot be easily replicated in Geany itself unless Geany would search the file system for the header. |
I wasn't sure whether LGBI is sufficient approval ;-). Merged now. |
Yea, a lot of changes that don't relate to local tags, true. Ideally these would all be separate PRs. It makes reviewing the necessary changes for local tags harder than necessary. However, the "non-header" changes are less invasive contentious from what I saw so I'm willing to accept them as part of this PR. With ProjectOrganizer there can be lots of open files so I'm a bit concert about simply iterating through all of them. Also, there might be header files with the same name, so some kind of clever resolution is needed. Anyway, the patch for that feature is large enough to warrant a separate PR. |
Self correction - scope completion will probably find the right class (though looking at tags from headers definitely won't hurt). But it's definitely useful for non-scope autocompletion. |
Please separate PRs |
@kugel- why do you come in at the last minute and require a PR to be split wasting the effort others have done to test and check it? Its not like this project has masses of spare resources, and making it harder is not the way to get improvements included. |
Why have you changed? My question related to your review and if I should dismiss it. |
I can't follow. Yesterday I tried to review the whole thing for the first time. I was striked how large the change has become since my initial, very coarse, look (that wasnt a review) and have found that a lot of unrelated changes (w.r.t. to local tags) were added. GitHub even hides the diff for tm_workspace by default!! Since I also want this feature I'm trying to accept most of them in one PR although I don't feel truly confident. But the header feature a bit too much for my taste and should be dealt with separately. From what I can see its a single commit that has to be backed out, provably not much of a hassle. For the future I wish that PRs stay focused on what they were created for and don't accumulate many unrelated changes. |
For me personally, the problem is your attitude. You didn't indicate in #3175 that you are going to spend some more time with this PR and the conversation was mostly between me and @elextr. From the beginning of this PR you started bitching about "why a new PR", what you are "willing to accept" and about (in your opinion) "unrelated changes" that were introduced to the PR. The effect on me is significant reduction of willingness "to obey". Those "unrelated changes" as you call them come from the testing of how autocompletion works with local variables enabled. It offers many more results than before and it's not sufficient to just simply enable local variables. I spent quite a bit of time testing how autocompletion works and the patches on the way are the result of the problems I (or @elextr ) run into (like the sorting of non-scope autocompletion tags was a very good idea Lex had and having the tags from the header means that in C++ you have members of classes - which you typically access frequently - at the top of the list). I normally try to separate patches as much as possible but at the same time I wanted to make the result useful and not annoying because you will see scope completion much more frequently and you will see more symbols in non-scope autocompletion. No need to remind me about making compact PRs, I try to do that with truly unrelated code but this isn't the case. In any case, right now I suggest not to merge this PR until all problems are resolved and everything is discussed - I think there's no rush. You mention you "don't feel truly confident" so please take your time and post the things that are questionable in your opinion here. At the same time, I'd like to have the discussion about the header patch here too - I'm not going to spend time to remove it just because you happen not to like it - I'd like to see some real arguments (I tried to reply to your initial comments) and know what you suggest instead. As I mentioned, I'd like to generalize it so symbols from included files are also sorted towards the top of the list - this would probably mean to change the implementation of the function checking whether the file is a header (or include) by having a hash table (simulating a set) with the included file names, going through TMSourceFile's and checking whether the file name is in the table. |
The "look for files that could be headers corresponding to this C file" is clearly a new feature that has nothing to do with local tags. I did not realize that this feature was added to the previous PR until starting an in-depth review on this PR. I also never said that I don't like it or do not want it in Geany, I just want to discuss and review it separately because I have some doubts about the implemented assumptions and I don't want the "local tags" effort to be blocked by that. It's also a big chunk of the overall diff and I like to review smaller diffs. I followed your discussion on the other PR but I simply didn't have the time for a in-depth review yet. Sorry for being late to the party. If you're not going to spend time to remove it then I won't spend any more time to review this PR. Clearly you have @elextr already be happy enough to get this over the finish line without my involvement - and that's perfectly fine. There is also nothing in the diff that's fatally wrong so just go ahead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(github wants me to comment)
@kugel- I don't see the searching and sorting and header parts as something separate from enabling local variables, they are an integral part of presenting the locals to the user in a usable manner. You complained about separation of the conversation on #3175 but it does not appear you understood it. Otherwise you would have understood that without the filtering and the ordering parts of the PR unrelated globals from the C/C++ global tags files overwhelm the locals in the autocomplete list and make them unusable. And without the work for selecting the most likely base symbol for scope autocomplete it is also less useful since it will be more likely to present completions for the wrong variable. Locals are inherently scoped, and inner scopes hide outer scopes. So "enabling locals" inherently involves handling scopes, and those rules are inherently language specific, see the examples in #3175. Geany does not have exact scope detection, especially with include files, so the sorting and filtering, including (for C/C++) prioritising header file symbols over globals provides a heuristic for presenting the most likely candidates at the top of the autocomplete list, and choosing the most likely base for scope autocomplete so making them usable Splitting the PR now means more work, double testing, the initial PR presenting a bad experience, and the risk of introducing bugs in the split. For all the above reasons I do not support splitting this PR. I did react negatively to your blocking and seeming to demand it be split after seeming to accept the PR as I pointed out above, and I apologise for that. Perhaps you didn't mean to seem to accept it, and you didn't mean to be demanding without reason. English is imprecise and usages vary around the world, here "please do x" means "I would like you to do x if its reasonable", whereas "do x please" means "I demand you do x and am saying please just to be polite", but I am not sure its the same in other English speaking countries, let alone when English is not a first language. |
@kugel- All I want is to have the header stuff discussed now so I know what you think is wrong about it. It means some extra work for me and if it turns out that the discussion evolves into "let's leave it the way it is" then the work is wasted. This is why I'm not really thrilled about comments like "Please separate PRs" without proper discussion about what you think is wrong, what you'd suggest to change or whether you want to scrap the feature altogether. I'm totally fine to rework the patch, separate it to a new PR or scrap it if it turns out not to be a good idea but after a proper discussion and not just because someone orders me to do so. If your primary concern is performance, I suggest to change the implementation so we add a hash table geany/src/tagmanager/tm_workspace.h Line 27 in 55aea4c
so we have a fast access to TMSourceFile based on file name. Then we can perform a fast lookup of a header (constructing possible header candidates using common C/C++ header extensions and looking them up in the hash table - that is 6 lookups for the headers I used which is super-cheap) and this could also serve for included file lookups if we start using the include tags.
|
With regard to performance, I suggest that unless someone can provide a real world usage that demonstrates poor performance that we leave it as is and see how it goes. Reasoning about performance with modern chips with multi-level caches, and optimising compilers is questionable to say the least, all the old rules of thumb are wrong, they were the first thing optimised away 😄 There are regularly comments about string comparisons being "slow" (in lots of issues/PRs and not just Geany, not picking on @techee comment above) but modern compilers require So unless an implementation has gross performance problems (like time is exponential in size) then unless we have an actual usage thats a problem, or benchmarks, then use the simplest to write and maintain implementation first. |
I don't want to cause some tensions here with my PRs and don't want to introduce code others don't agree with so I removed the header-related code. I believe I also addressed all the other comments @kugel- had, improved the performance of non-scope autocompletion by searching in per-file tag arrays when applicable and made a few cleanups. @kugel- Does it look good to you? What about the proposal related to handling header files (and possibly include files) by having a hash map in |
@elextr I agree with most of the stuff about performance and I think we shouldn't do any crazy optimizations. On the other hand, I don't feel comfortable when we do some things unnecessarily and there's a simple fix like in this case. |
@techee can we finish this, what else do you need? |
I'm not quite sure we settled on something which is why I mostly abandoned this PR ;-).
Basically from the code-related things, unless I don't overlook something, the remaining stuff was the missing comment I just added and also documentation. Regarding documentation, I changed what @elextr proposed to something a bit different - will add an explanation in the next post. |
Despite everyone's love to tables, graphs and similar cool stuff, I believe the table is an overkill here. The current sorting is not something that has to be preserved, could change in the future if we find a better heuristic, and it's just a proxy to "show the most relevant symbols first". So
is completely sufficient IMO. The scope autocompletion part:
isn't correct. In scope autocompletion we always get all the members sorted alphabetically - when you think of e.g. members of a struct or class, these are defined in a single file and we have no guess which of the members the user wants. So I used this wording instead:
|
So if you squash a bit (especially reverts of earlier commits) then I'll immediately approve. The documentation is good enough for me. If there's any desire to improve it it can be done later on, IMO. I don't feel such desire right now. |
OK (if others are OK with it). I'm not sure how much time I'll have in the coming weeks so it may be a bit delayed on my side. |
As I won't get any benefit from this PR since my C++ is done elsewhere I won't waste any more time on it. Suffice to say that until the last of the changes (those that have now been removed) there was no real improvement in C++ usability, and in fact it was possibly worse due to more noise in autocompletes. @techee I am sorry to have wasted your time on trying to improve it. @kugel- I do not know if it adds any worthwhile capabilities to C, I only tested lots of C++. Its up to you to merge it if you think it adds anything useful, I won't. |
Hmm, but if nothing else, wasn't it at least useful to see local variables in the non-scope autocompletion popup (and placed at the top)? From my testing, it was the non-scope autocompletion which benefited most from this PR and started offering more relevant results thanks to the ordering. The usefulness of the scope autocompletion may indeed be iffy since we see only the statically declared things and don't see things like |
Its now stretching my ... erm ... erm .. oh yeah memory!!! But IIRC: Maybe its the way different C++ is written, but in the range of C++ I tested most code was in class member functions, and class data member names were the most common, but they were declared in the class definitions, which were in the header files and so without header preferential sorting they were buried down in the alphabetic part of the autocomplete, which grew significantly. Hence my dissatisfaction at header preferential sorting removal. Locals were second most common, but mostly short so they didn't even trigger 4 character autocomplete, so it didn't help that much with those1. Also many locals were declared within The main codes I tested had a certain amount of repetition between classes, and since its not indicated where names in the Geany autocomplete list come from, my inference about why things appeared where they did may be incorrect. Since that testing I have used vscode for some short experimental coding where I didn't bother to set up build info, and the autocompletes show a similar list to what I remember getting with the header file sorting enabled3 As I suggested C (or Cish C++) may get greater benefit from locals, since it doesn't have class member names that are visible in member functions without any qualification. And older C codes with the "declare at the top" idiom may tend to longer local names, although newer C following "don't declare until its initialise-able" may tend to shorter names and smaller scopes. Footnotes
|
Dammit, I have to apologise to vscode, my experimental code used several |
@techee Hey. Any progress on this? Just some squashing was requested from my side. |
tm_tag_file_t is unused in Geany (and we'll probably never need tags for files) so let's use it for local variable tags.
Enable generating these tags both for local variables and function parameters - those are more or less identical for what we will be using local tags for so they can be mapped to the same type. Local tags aren't interesting for tag files so filter them out when generating these (but this also means that we cannot create unit tests for them).
We have to ignore local variables that: 1. Are from a different file 2. Are declared later in the file than where the current cursor is 3. Have different scope than the current function scope
The code removes invalid local variables from other functions and behind current position. In addition, it sorts the found tags corresponding to the name in the editor for which we perform scope completion so local variables are searched first for their members, followed by tags from the current file, followed by workspace tags and finally using global tags.
We only want to use local variables within the current function for the goto. This means we want to filter out local variable tags that: 1. Are from a different file 2. Are declared later in the file than where the current cursor is 3. Have different scope than the current function scope Fundamentally the same requirements as for (non-scope) autocompletion.
We are interested in pure type name and ctags returns types including pointers, references, arrays, template parameters and keywords such as "const" or "struct". Strip all those. Also move strip_type() above so it's usable by other functions.
The code simply checks for inherited classes, strips unneeded stuff like templates, etc. from inherited classes, detects multiple inheritance by splitting the string using "," and calling tm_workspace_get_parents() recursively on every parent class and collecting the returned tags. The code limits the recursion depth to 10 to avoid possible inheritance cycles and infinite recursion. Also remove #if 0'd old code from TM implementing inheritance that we kept for reference as we now have a better implementation.
When performing scope completion for void A::foo() { bar. // <-- } we need to determine what 'bar' is. It could be a global variable, in which case we should look for variable tags, it could however also be a member of A in which case we should look for member tags. To determine this, the previous code checked whether there's a tag named 'bar' with the same scope as the method tag. One drawback of this approach is that it doesn't take namespace manipulation functions like 'using namespace' into account so for header namespace X { class A { Baz bar; }; }; and source using namespace X; void A::foo() { bar. // <-- } it wouldn't find 'bar' because ctags reports foo() to have scope A and bar to have scope X::A. Another drawback of this approach is that it doesn't take inheritance into account so it wouldn't find 'bar' when defined in a super-class, such as class B { Baz bar; }; class A : B { void foo(); } To avoid these problems, this patch rewrites member_at_method_scope() (and renames it to member_accessible()) so it gets the class name from the scope of the method in which we are (A in the above example) and returns all members of A (including the super-classes members). Afterwards, it checks if one of the members is really the member tag we are interested in ('bar' in the above example).
To present more relevant results at the top of the list, sort them in the following order: local variables tags from current file workspace tags global tags and alphabetically within a single group. In addition, we need to remove duplicates from the list of displayed names. Finally, since the entries are not sorted alphabetically, we need to call SSM(sci, SCI_AUTOCSETORDER, SC_ORDER_CUSTOM, 0); so Scintilla knows it shouldn't sort the list alphabetically.
In functions where we pass the current file as an argument, we can get the used language from the file and this extra parameter is unnecessary.
As disussed on #3266 I pushed a slightly-modified history. The diff is the same all around. Unless anyone speaks up that can be merged in my opinion. |
Great, thanks! |
This is rebased #3175 with cleaned up commit history but no functional change.