-
Notifications
You must be signed in to change notification settings - Fork 613
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sync ctags parsers with big changes with Geany #2991
Conversation
Warning Brain dump: Of the first list the only one I would suspect may have been modified in Geany independent of ctags is Lua. Of the second list I'm pretty sure Verilog and VHDL were added to ctags then brought into Geany but probably have developed from that initial version so the ctags version should be good.
I presume that any parser not in the lists is the same in both Geany and ctags so won't need updating? |
Thanks, let's see if others have some more information.
After #2984 is merged, I'd like to open a pull request switching us to the new c/c++ parser from uctags so these 2 most terrible languages (parsing-wise) should be eliminated from c.c. The rest of course remains. I remember the pre-babies version of Colomban talking about about separating those parsers from c.c but I suspect this task is delayed by about 15 years now ;-). Anyway, the purpose of this PR is to switch to those parsers where we know we can just grab the uctag version - we'll have to deal with the rest later parser by parser.
Those with minimal or no changes are in PR #2990 |
I'm here because I searched python ctag in pull requests to see if it was being addressed anywhere after being told that the GDScript ctags file that I adapted from the python ctags file I found in the geany repository was actually based on a decrepit version of the python parser. I intend to drag in the uctags python parser right after I'm finished redoing my GDScript alterations to the newer version of the python parser in ctags, unless someone else is doing this already. |
…ing uctags version
The tests only differ in added "()" to all functions in generated tags.
The section name seems to be parsed correctly now.
There seem to be missing () which aren't present in the source anyway so probably OK. Test1 procedure is correctly named here, not as the class name TTest.
The new go tags contain extra scope information.
PHP and zephir now use "\\" as the context separator instead of "::". For zephir, return values of methods are now parsed.
The new parser parses also method arguments.
Scope information seems to be correctly parsed now.
425d350
to
cb1c3e8
Compare
I just took the parsers from the first group plus python and tried them against our unit tests to see what happens. Fortran seems to produce different output in many situations so I dropped it for now. The asm parser uses the cpreprocessor.c/h that the new cxx parser uses and depends on some things from the new cxx parser so I dropped asm from now too. Finally, the new python parser generates different tags for imports and we should make some modifications in Geany to adjust for that so it's dropped for now too. The rest of the parsers seem to work fine and the updates in unit tests seem to do the right thing. I pushed the updates to unit tests language by language including comments about the changes to simplify review. I can of course drop some parser from this pull request if someone knows some updates were made to our parser that should go upstream first. To summarize, this is the result of this pull request: Parsers included here:
Different parsers (longer-term stuff):
Parsers for which I will make separate PRs soon:
|
@Davidy22 I'll prepare a separate PR for python soon - not sure if the version I use for all the parsers here from the end of October is new enough for you but when the python parser is in, it should be no problem to update it to the version you need. |
I've already been muddling around with my modified for gdscript python parser so I think I have a handle on the tags in it. I can update my gdscript PR in a bit so you can yoink the not-gdscript file changes maybe |
@Davidy22 Are all your changes submitted here? https://github.com/universal-ctags/ctags The point of all these pull requests is that we don't have any diffs of our own against universal ctags and all changes have to go there first and we just copy the corresponding parsers to Geany. |
Yeah the one on the ctags repo is my latest work, haven't updated my PR here in a bit, probably will once I get merged in lexilla and ctags, my open PRs linked in the pull request here. Geany does have a little extra layer of mappings that I fiddled to make the symbol table on the left show the right things |
Then I'd suggest to wait until the PR with python I'm planning to make lands in Geany so we don't have to deal with 2 different parsers at the same time. |
Applied #2990 and #2991 to Geany at eabc09a, works for me.
Also checked any filetypes I happened to have a source file lying around for, and knew enough to check what should be generated, none seem to have any different failures from a version of Geany before they were applied. If nobody beats me or objects I'd be happy to push #2990 and #2991 in a week (or so). |
Oh my work (#3012) is on a new parser for a language with similarities to python, no worries about colliding implementations |
I had a quick look on the changes and this looks good, we'll get some more new tag types and probably the newer uctags code is less buggy than the current ones. Though at least in the Pascal parser we will lose two features:
This two features are not in the uctags Pascal parser. I don't know where they came from, maybe we added them ourselves in the past. Since these could be useful for others as well, it might be best to push these two upstream to uctags, so we'll get them back with the next uctags update. Until then, it might be ok to have the two features removed (which will also solve #2987) by this PR and get them back with the next uctags sync. An alternative could be to remove the Pascal parser from this PR and merge #2987 in the meantime. I prefer the first option: use the uctags Pascal parser with this PR. What do you think? |
Is it from fixing the crash on typing "type"? But anyway it would be nice to push improvements upstream. I would agree that using the uctags parser is the preferable option instead of drawing out the alignment to upstream even further. Out of interest, do Pascalians get any new features with the uctags parser in return for losing "type" for a release or so? |
Well, "fixing the crash on typing "type"" is the consequence of the feature that "type" is parsed at all which seems to be an exclusive Geany feature. though not sure if this is what you asked :D.
Nope, the parsers seems equal apart from the "type" tag type and the argument list parsing. |
It is what I asked, which just exposes my confusion :-)
Ok, probably should add to release news something like "Pascal parser has been upgraded to align with ctags, the Geany specific extensions of 'type' and argument list parsing have been pushed upstream and will return next update." Unless the next Geany release takes so long that changes pushed upstream are accepted in a new parser beforehand and we can import it. Having synchronised with uctags at that stage importing should be simple if the symbols.c handling of those features is left in place. |
I think they have many more in uctags. In Geany though I think we don't even have to care about the parsers themselves (those are tested in uctags already) but rather the mapping to our internal representation (which is revealed in our tags files). So it's only important that all the tag types we support are generated by the tests. |
Ok. Let's go this way. I'll try to prepare PRs for uctags with the two missing features. If/When they are merged, we can sync the Pascal parser. |
This reverts commit 955e0eb.
Cool, I was going to prepare the PR myself - you were too fast :-) I just reverted the parser back to the original one (if desired, I can squash it with the previous commits). |
:) However, I only PR'ed the argument list parsing feature. type
TTest=class
procedure Test1;
procedure Test2;
procedure Test3;
end; Will create a tag for "TTest" but also for the procedures "Test2" and "Test3" which are wrong. And "Test1" is missing for some reason (fun fact: even our test case expects that broken behavior in https://github.com/geany/geany/blob/master/tests/ctags/bug612019.pas.tags). I guess the parsing of "type" declarations needs to implemented in a more sophisticated way. At least I don't want to submit it upstream in the current state. |
Yep, agreed. I just added the iniconf parser to this pull request too - after looking at it in more detail, there are just extra things added and there are no changes in our unit tests. |
…hp separators This patch introduces a new function tm_parser_update_scope() which can be used to modify scope information when the provided value from ctags doesn't suit our needs. This patch uses it to replace php scope separator \ with :: so there's a single scope separator for this language.
"." is used as the context separator in ruby now, the comment is probably just some historic artifact.
I think #2990 is OK to be merged (just added a comment there regarding what I did). I would still wait with this one. First, I guess merging #2990 will cause some merge conflicts here so these will have to be resolved first. Second, there is the problem with the go parser I mentioned in #3032 (comment) and subsequent comment and I'd like to know the opinion of @masatake. Also, someone might want to have a look at the patch I added yesterday here to rewrite PHP scope information if I didn't do something stupid there. |
# Conflicts: # ctags/Makefile.am
This patch allows us to enable/disable ctags roles for certain languages and kinds. Roles are currently disabled only for the Go kind 'p' for the reasons mentioned in the comment message in the patch.
@elextr I think I'm more or less done here. Merge conflicts are resolved, the scope separator problem of PHP should be fixed now and the Go problem from universal-ctags/ctags#3211 too. All the bugs introduced by me are your responsibility now :-P. |
Ok, so long as you fix #3032 to add all the new capabilities to Geany :-) Will try this on the weekend and merge next week if it WFM, giving others time to try it too. |
This patch is similar to the patch allowing us to enable/disable roles, it just does this for ctags kinds. This can be useful when there is a bug in a ctags parser causing incorrect parsing when certain kind is enabled - which is what happens when parsing cython source files with the 'z' flag enabled.
Enabling some of the kinds may be much easier than making sure nothing breaks here ;-) I added one more patch here to allow us to enable/disable certain kinds in ctags - will help to resolve the cython problem in #3031 until it's fixed in ctags and will allow us to easily handle such problems in the future. |
Instead of enabling all kinds and then manually disabling them if there are problems with them, we can enable/disable them based on whether we actually use them - i.e., when they are mapped to something else than tm_tag_undef_t.
Sorry for the last-minute push I just did, I just got an idea we could enable/disable kinds automatically based on whether we use them for the given language or not instead of enabling them all and this is just much better than hard-coding some kinds when there's some problem for them. This could also speed up parsing as the corresponding code paths can be skipped in individual parsers (and potentially reduces crashes because less code from the parser will be executed). This should fix the problem with cython parameters automatically as we don't use the 'z' kind. |
Still WFM, tests pass, merged before @techee can sneak more in :-) |
This pull request isn't meant to be merged, it should just serve as a discussion place which parsers apart from those in #2990 to select for syncing. I'm not so familiar with the state of Geany parsers and if there are some patches that should be first applied to uctags before we can grab the uctags version, it would be nice if people who know more than me (@b4n @eht16 @elextr and others) could help with this.
Just to start somewhere, I included a few parsers to this commit:
I went through all the parsers and divided them into 2 categories:
Parsers where uctags implementation is similar to the Geany implementation:
Parsers which differ a lot:
Apart from python, the languages from the first group seem to be the most obvious candidates for syncing. So does anyone have a better knowledge of the state of these parsers?