[RFC] Multi pass parsing over multi source files #1495
…m signed to unsigned
A negative value will be passed in the stages of multi pass parsing over multi source files.
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
…source files Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
…rser typedefs gathered in the "-1" mm pass are stored in the barrel. At the start of the "-2" mm pass, the typedefs are registered as keywords.
    $ cat a.sv
    class test;
     t_user t_user_memb;
    endclass
    $ cat b.sv
    typedef int t_user;
    $ ./ctags -o - a.sv b.sv
    t_user b.sv /^typedef int t_user;$/;" T
    t_user_memb a.sv /^ t_user t_user_memb;$/;" v class:test
    test a.sv /^class test;$/;" C

t_user_memb is captured as v.
After thinking it over, I found that the barrel is not needed here. Adding the typedef'ed type names to the keyword table in the -1 pass will be enough...
Another interesting application of mm is the "unknown" kind of the Python parser.
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
What I wrote is something like a linker, so I can borrow many concepts and ideas from linkers.
I have the intention of dedicating part of my weekend to Universal Ctags.
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
I updated the documents. I didn't take an example from the SystemVerilog parser because I don't know it well.
Looking good and the number of added lines isn't as big as I would have expected, which is nice :)
I think some code could be made common. Please review my comments and analyze whether that is possible.
I also remembered "libraries". Code that is central and used frequently in the project but that is rarely changed. It would make sense to be able to "seed" the second pass with such libraries... but this is beyond the scope of this change, I guess.
main/keyword.h
extern void addKeyword (const char *const string, langType language, int value);

/* addKeywordStrdup does strdup `string'.
   Duplicated string is freed in freeKeywordTable() */
Extra space in comment
Thank you. I will make a fixup! commit.
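For readers who have not seen the keyword table, here is a minimal sketch of the ownership rule that the comment on addKeywordStrdup describes: the table keeps its own copy of the string and releases it when the table is freed. Everything prefixed with `sketch` is hypothetical and written only for illustration; it is not the code in main/keyword.c, and the real table is not necessarily a linked list.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature keyword table; NOT the real main/keyword.c. */
typedef struct sketchEntry {
	char *string;               /* freed by the table only when `owned` is set */
	int language;
	int value;
	int owned;
	struct sketchEntry *next;
} sketchEntry;

static sketchEntry *sketchTable = NULL;

static void sketchAdd (char *string, int language, int value, int owned)
{
	sketchEntry *e = malloc (sizeof *e);
	if (e == NULL)
		abort ();
	e->string = string;
	e->language = language;
	e->value = value;
	e->owned = owned;
	e->next = sketchTable;
	sketchTable = e;
}

/* The caller keeps ownership of `string` (typically a string literal). */
static void sketchAddKeyword (const char *string, int language, int value)
{
	sketchAdd ((char *) string, language, value, 0);
}

/* The table duplicates `string`, so the caller may reuse its buffer. */
static void sketchAddKeywordStrdup (const char *string, int language, int value)
{
	size_t len = strlen (string) + 1;
	char *copy = malloc (len);
	if (copy == NULL)
		abort ();
	memcpy (copy, string, len);
	sketchAdd (copy, language, value, 1);
}

/* Returns the keyword value, or -1 if `string` is not a keyword of `language`. */
static int sketchLookup (const char *string, int language)
{
	const sketchEntry *e;
	for (e = sketchTable; e != NULL; e = e->next)
		if (e->language == language && strcmp (e->string, string) == 0)
			return e->value;
	return -1;
}

/* Releases every entry, and the duplicated strings along with them. */
static void sketchFreeTable (void)
{
	while (sketchTable != NULL)
	{
		sketchEntry *next = sketchTable->next;
		if (sketchTable->owned)
			free (sketchTable->string);
		free (sketchTable);
		sketchTable = next;
	}
}
```

The duplicating variant matters for names found while parsing, such as typedef'ed type names, because the token buffer they come from is reused for the next token.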
@@ -122,7 +124,8 @@ static kindDefinition SystemVerilogKinds [] = {
  { true, 'P', "program", "programs" },
  { false,'Q', "prototype", "prototypes" },
  { true, 'R', "property", "properties" },
- { true, 'T', "typedef", "type declarations" }
+ { true, 'T', "typedef", "type declarations" },
+ { true, 'v', "member", "member elements" },
In the meantime I started thinking about 'o' and 'object'.
I'm raising this now, because it might make sense to maintain various "groups". E.g.: instances of classes are objects, instances of typedef are "custom types".
Verilog is a hardware description language that allows a special type of "object" called an "instance". The design architecture is divided into "modules", which are instantiated in other, higher-level modules. Having ctags parse these as special types would allow editors that support these tags to easily show the overall design structure.
Designing kinds is the most important task in ctags development.
I don't know Verilog. So if you are o.k., I'm o.k.
@RadekRR, this is your chance to have your ideas reflected in the software you are using :-).
	else
		return RESCAN_MM;
}
Why make the parser aware of passes? In my mind I think we could modify the keyword table to indicate which types we are looking for in the first pass, and which are only valid in the second.
Core ctags could take care of the rest, by providing a smaller keyword table to the parser in the first pass and then the complete table in the second. The parser itself would be simplified and any other parser could easily use this new architecture.
Do you like the idea? Or am I missing something basic that will not allow this?
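To make the proposal concrete, here is a rough sketch of what a pass-aware keyword table could look like. It illustrates the idea in this comment only, not what the pull request implements; the `passKeyword` struct, the `PASS_*` flags, and `lookupForPass` are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags naming the passes in which a keyword is visible. */
enum { PASS_FIRST = 1 << 0, PASS_SECOND = 1 << 1 };

typedef struct {
	const char *name;
	int id;                 /* nonzero keyword id */
	unsigned int passes;    /* bitmask of PASS_* values */
} passKeyword;

/* Only `typedef` is visible in the first pass; everything is visible
   in the second. */
static const passKeyword sketchKeywords [] = {
	{ "typedef", 1, PASS_FIRST | PASS_SECOND },
	{ "class",   2, PASS_SECOND },
	{ "module",  3, PASS_SECOND },
};

/* Returns the keyword id if `name` is visible in `pass`, otherwise 0.
   The core would call this instead of the parser checking the pass. */
static int lookupForPass (const char *name, unsigned int pass)
{
	size_t n = sizeof sketchKeywords / sizeof sketchKeywords [0];
	size_t i;
	for (i = 0; i < n; i++)
		if ((sketchKeywords [i].passes & pass) != 0
			&& strcmp (sketchKeywords [i].name, name) == 0)
			return sketchKeywords [i].id;
	return 0;
}
```

With such a table, a token like `class` would look like an ordinary identifier in the first pass and like a keyword in the second, without the parser itself branching on the pass number.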
> Why make the parser aware of passes?
The main part doesn't know which source file(s) should be rescanned.
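A small sketch of that division of labor, under stated assumptions: each parser reports whether its file needs another pass, and the main part only collects those answers and rescans the flagged files. The `sketch`-prefixed names are hypothetical, and the toy parser's rule (request a rescan for every `.sv` file) is just a stand-in for a real parser's decision.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical return values; modeled on the RESCAN_MM seen in the diff. */
typedef enum { SKETCH_RESCAN_NONE, SKETCH_RESCAN_MM } sketchRescan;

/* Toy parser: in pass 1 it requests a second pass for ".sv" files only.
   A real parser would decide from what it saw in the file. */
static sketchRescan sketchParse (const char *fileName, int pass)
{
	const char *ext = strrchr (fileName, '.');
	if (pass == 1 && ext != NULL && strcmp (ext, ".sv") == 0)
		return SKETCH_RESCAN_MM;
	return SKETCH_RESCAN_NONE;
}

/* Main part: run pass 1 over every file, remember which files asked for
   a rescan, then run pass 2 over only those. Returns the number of
   files that were rescanned; rescanned[i] is set to 1 for each of them. */
static size_t sketchRunPasses (const char *const *files, size_t count,
							   int *rescanned)
{
	size_t n = 0;
	size_t i;
	for (i = 0; i < count; i++)
		rescanned [i] = (sketchParse (files [i], 1) == SKETCH_RESCAN_MM);
	for (i = 0; i < count; i++)
		if (rescanned [i])
		{
			sketchParse (files [i], 2);
			n++;
		}
	return n;
}
```

The main part never inspects file contents here; the only knowledge about which files matter comes back from the parser.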
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The main part of Universal-ctags provides APIs for Multi passes
parsing over Multi source files (MM). The main part applies
Don't understand the meaning of "MM". First "M" is for multi, but what is the second for?
Typically you would do something like: "Multi source Files (MF)". That is, capitalize the words used in the abbreviation.
MM means "M"ulti pass parsing over "M"ulti source files.
Does that make sense?
@vhda, I feel there is a gap between what you want and what I wrote. Please look at the document explaining MM: https://github.com/masatake/ctags/blob/6ea782371d7dc78b884a7882dd4dedd81a03f63c/docs/barrel.svg
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
I read the original discussion again. My understanding of what you want:

pass 1. Just make a hash table and fill it with type names (or names of something else). You want to make no tag entries in this stage.
pass 2. Make tag entries and emit them to the tags file.

This flow can be implemented on MM. All you have to do is ignore the barrel. You can create the SystemVerilog parser's own type name table with the functions declared in hashtable.h. SystemVerilog parser However, you should emit some of the tag entries in pass 1 if possible. Because, newly introduced
Hi @masatake,

Could I please inquire what the status is of this pull request? The functionality proposed seems highly interesting in any case!

Also a related question: the C++ parser appears to already be able to identify definitions for variables or objects of a type created with typedef or class. If I understand the discussion well, that's something that this pull request was intending to add as well. For the SystemVerilog parser the same functionality does not appear to work yet though. So am I correct in assuming that the C++ parser has solved this problem in the parser itself? Or has some framework been added that other parsers could make use of to add similar functionality?

Thanks!
Wim
There is no progress other than the changes I proposed here. The C++ parser may use heuristics. It is not perfect, e.g. in handling template variables. Before implementing multi-pass/multi-file parsing, I have to improve the infrastructure for multi-pass/single-file parsing. See #2115. If you seriously want mm and want to implement it, I will explain what kinds of issues there are.
Close #80.
@vhda, what do you think about these changes? See the commits whose logs start with [TEMPORARY]. I will update docs/internal.rst after getting your approval.
In my version, the SystemVerilog parser emits most of the tags in the first (-1) pass. However, it hands some of them over to the next mm pass via the "barrel". In the second (-2) pass, the tags in the barrel are stored in the keyword table by the setupMM method of the SystemVerilog parser.
For testing I chose tags of the typedef kind. In the -2 pass, the SystemVerilog parser can recognize the token that follows a typedef'ed type. If such a token is recognized in a class context, it can be tagged with the "member" kind.
I didn't choose the kind name "variable", though I chose the letter 'v', because a member of a class is not a variable.
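The flow described above can be sketched as a toy program. This is only an illustration under simplifying assumptions: the "barrel" is a fixed-size set of names, tokens are split on whitespace and semicolons, and `nameSet`, `sketchPass1`, and `sketchPass2` are hypothetical names, not functions from this pull request.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX  16
#define SKETCH_NAME 32

/* Hypothetical stand-in for both the barrel and the keyword table. */
typedef struct {
	char names [SKETCH_MAX][SKETCH_NAME];
	size_t count;
} nameSet;

static void nameSetAdd (nameSet *s, const char *name)
{
	if (s->count < SKETCH_MAX)
	{
		strncpy (s->names [s->count], name, SKETCH_NAME - 1);
		s->names [s->count][SKETCH_NAME - 1] = '\0';
		s->count++;
	}
}

static int nameSetHas (const nameSet *s, const char *name)
{
	size_t i;
	for (i = 0; i < s->count; i++)
		if (strcmp (s->names [i], name) == 0)
			return 1;
	return 0;
}

/* Pass 1 ("-1"): for each "typedef <base> <name>", put <name> in the barrel. */
static void sketchPass1 (const char *source, nameSet *barrel)
{
	char buf [256];
	char *tok;
	int state = 0;   /* 0: idle, 1: saw typedef, 2: saw the base type */
	strncpy (buf, source, sizeof buf - 1);
	buf [sizeof buf - 1] = '\0';
	for (tok = strtok (buf, " \t\n;"); tok != NULL; tok = strtok (NULL, " \t\n;"))
	{
		if (strcmp (tok, "typedef") == 0)
			state = 1;
		else if (state == 1)
			state = 2;
		else if (state == 2)
		{
			nameSetAdd (barrel, tok);
			state = 0;
		}
	}
}

/* Pass 2 ("-2"): inside class ... endclass, the token that follows a
   known type name is recorded as a member. */
static void sketchPass2 (const char *source, const nameSet *keywords,
						 nameSet *members)
{
	char buf [256];
	char *tok;
	int inClass = 0, afterType = 0;
	strncpy (buf, source, sizeof buf - 1);
	buf [sizeof buf - 1] = '\0';
	for (tok = strtok (buf, " \t\n;"); tok != NULL; tok = strtok (NULL, " \t\n;"))
	{
		if (strcmp (tok, "class") == 0)
			{ inClass = 1; afterType = 0; }
		else if (strcmp (tok, "endclass") == 0)
			{ inClass = 0; afterType = 0; }
		else if (inClass && afterType)
			{ nameSetAdd (members, tok); afterType = 0; }
		else if (inClass && nameSetHas (keywords, tok))
			afterType = 1;
	}
}
```

Running `sketchPass1` over both a.sv and b.sv before running `sketchPass2` is what makes the result independent of the order of the files on the command line: the typedef in b.sv is already known when a.sv is scanned the second time.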
This will not work well in interactive mode. I have to research the area more.