[RFC] Multi pass parsing over multi source files #1495

Open
wants to merge 11 commits into
base: master
727 changes: 727 additions & 0 deletions docs/barrel.svg
107 changes: 107 additions & 0 deletions docs/internal.rst
@@ -420,3 +420,110 @@ An example can be found in DTS parser:

Setting `requestAutomaticFQTag` to `TRUE` implies setting
`useCork` to `TRUE`.

Multi-pass parsing over multiple input files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The main part of Universal-ctags provides APIs for Multi-pass
parsing over Multiple source files (MM). The main part applies
Contributor

Don't understand the meaning of "MM". First "M" is for multi, but what is the second for?
Typically you would do something like: "Multi source Files (MF)". That is, capitalize the words used in the abbreviation.

Member Author

MM means "M"ulti pass parsing over "M"ulti source files.
Does that make sense?

a parser to the same input files more than once. Each application
is called a "pass". Tags captured in one pass can be handed to the next
pass via a data structure called a "barrel". A parser can use the
"barrel" for two purposes: filling in fields of tags and hinting.


What you can do with MM
......................................................................

Consider the following Python code (``input.py``):

.. code-block:: python

    from X import Y
    ...

.. code-block:: console

    $ ctags --extras=+r --fields=+rK -o - input.py
    X input.py /^from X import Y$/;" module role:namespace
    Y input.py /^from X import Y$/;" unknown role:imported

The kind of "Y" is "unknown": when parsing only ``input.py``, ctags
cannot know the kind of "Y". "Y" can be a class, a variable, a
function, etc. To decide the kind, ctags needs at least some knowledge
about module "X" (``X.py``), for example:

.. code-block:: python

    class Y:
        ...

If ``X.py`` is passed to ctags as an input file, the Python parser can
tell that "Y" is a class.

.. code-block:: console

    $ ctags --extras=+r --fields=+rK -o - X.py
    Y X.py /^class Y:$/;" class

However, the Python parser cannot know this while parsing ``input.py``.

This is where MM comes in.

.. code-block:: console

    $ ctags --extras=+r --fields=+rK -o - input.py X.py

In the first pass, the Python parser puts the following tags, captured
from the two source files, into the "barrel"::

    Y input.py /^from X import Y$/;" unknown role:imported
    Y X.py /^class Y:$/;" class

In the second pass, the Python parser receives the "barrel" above and
decides the kind of "Y" referenced in ``input.py``, like this::

    Y input.py /^from X import Y$/;" class role:imported

In the first pass, writing the tag for "Y" in ``input.py`` to the tags
file can be delayed. In the second pass, the Python parser writes the
tag for "Y" to the tags file with its kind field decided.

This is just one example of how useful MM is. What ctags can do in
the second pass is similar to what gcc can do at the linking stage.
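
The second-pass lookup described above can be sketched in C as follows.
This is only an illustration, not code from ctags: ``BarrelTag`` and
``resolveImportedKind`` are hypothetical names, and the real "barrel" holds
``tagEntryInfo`` entries rather than this simplified struct.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a tag stored in the barrel. */
typedef struct {
    const char *name;     /* tag name, e.g. "Y" */
    const char *file;     /* input file the tag was captured from */
    const char *kind;     /* "class", "unknown", ... */
    int         imported; /* 1 if this is a reference tag with role "imported" */
} BarrelTag;

/* Search the barrel for a definition (non-reference) tag named `name`
 * and return its kind. Return "unknown" when no definition was captured. */
static const char *resolveImportedKind (const BarrelTag *barrel, size_t count,
                                        const char *name)
{
    for (size_t i = 0; i < count; i++)
    {
        if (!barrel[i].imported && strcmp (barrel[i].name, name) == 0)
            return barrel[i].kind;
    }
    return "unknown";
}
```

With the two tags from the example in the barrel, looking up "Y" yields
"class", which is the kind written for the reference tag in ``input.py``.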

MM/Barrel API
......................................................................

.. figure:: barrel.svg
    :scale: 80%

An MM parser must enable cork and use the ``parser2`` method instead of
the ``parser`` method. The ``parser2`` method takes a ``passCount``
parameter. In MM processing, a negative integer is passed to ``parser2``
via ``passCount``: -1 for the first pass, -2 for the second pass,
and so on. Positive integers are used for multi-pass parsing over a
single input file.
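
A parser could decode this sign convention with small helpers like the ones
below. This is a sketch only: ``isMMPass`` and ``mmPassOrdinal`` are
hypothetical names that do not exist in ctags, and only the meaning of
negative values is taken from the text above.

```c
#include <stdbool.h>

/* Hypothetical helper: true when passCount denotes an MM pass,
 * which the text above encodes as a negative value. */
static bool isMMPass (int passCount)
{
    return passCount < 0;
}

/* Hypothetical helper: ordinal of an MM pass, -1 -> 1, -2 -> 2.
 * Returns 0 when passCount is not an MM pass. */
static int mmPassOrdinal (int passCount)
{
    return passCount < 0 ? -passCount : 0;
}
```

A ``parser2`` implementation could then branch on ``isMMPass (passCount)``
to choose between its MM logic and its ordinary single-file logic.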

If a parser wants to process the current input file again in the next
pass, it returns ``RESCAN_MM`` from the ``parser2`` method of its
``parserDefinition``.

The ctags main part gathers all input files for which parsers
request a next pass with ``RESCAN_MM``. After all input files have
been processed in the current pass, the main part starts the
next pass. Before applying the ``parser2`` methods of the parsers
to the input files marked ``RESCAN_MM``, the main part
calls the ``setupMM`` method of those parsers with the "Barrel".
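
The pass scheduling described above can be sketched as a loop. Everything
here is a simplified assumption for illustration: ``RescanSketch``,
``Parser2Sketch``, ``runPasses`` and ``requestSecondPass`` are hypothetical,
the barrel and ``setupMM`` call are reduced to a comment, and the real
driver is ``mmRun`` in the main part.

```c
#include <stddef.h>

#define MAX_FILES 16

/* Hypothetical result codes; only the idea of RESCAN_MM comes from the text. */
typedef enum { RESCAN_NONE_SKETCH, RESCAN_MM_SKETCH } RescanSketch;

/* Hypothetical, reduced form of a parser2 callback. */
typedef RescanSketch (*Parser2Sketch) (const char *file, int passCount);

/* Example parser: asks for one more pass during the first pass only. */
static RescanSketch requestSecondPass (const char *file, int passCount)
{
    (void) file;
    return passCount == -1 ? RESCAN_MM_SKETCH : RESCAN_NONE_SKETCH;
}

/* Run passes until no file requests another one or maxPasses is reached.
 * Returns the number of passes that were run. */
static int runPasses (Parser2Sketch parser2, const char **files,
                      size_t count, int maxPasses)
{
    const char *current[MAX_FILES];
    const char *next[MAX_FILES];
    size_t nCurrent = count < MAX_FILES ? count : MAX_FILES;
    int pass = 0;

    for (size_t i = 0; i < nCurrent; i++)
        current[i] = files[i];

    while (nCurrent > 0 && pass < maxPasses)
    {
        size_t nNext = 0;
        pass++;
        /* From the second pass on, a real driver would hand the barrel
         * to the parser (setupMM) before parsing the gathered files. */
        for (size_t i = 0; i < nCurrent; i++)
        {
            /* MM passes are numbered -1, -2, ... */
            if (parser2 (current[i], -pass) == RESCAN_MM_SKETCH)
                next[nNext++] = current[i];
        }
        for (size_t i = 0; i < nNext; i++)
            current[i] = next[i];
        nCurrent = nNext;
    }
    return pass;
}
```

With ``requestSecondPass`` and the two files from the Python example, the
loop runs exactly two passes: both files are gathered after the first pass
and neither asks for a third.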

A tag passed to ``makeTagEntry`` has two possible destinations:
the tags file and/or the "Barrel".

If the ``placeholder`` field of a tag is 0, the tag is
written to the tags file when the corkQueue is flushed.

If the ``barrel`` field of a tag is 1, the tag is stored in the
"Barrel". The helper function ``handOverEntryToNextMMPass`` sets
the ``barrel`` field to 1.

If the ``placeholder`` field of a tag is 0 and the ``barrel`` field of
the same tag is 1, the tag is both written to the tags file and stored
in the "Barrel".
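
The two flags combine as in the sketch below. ``TagFlagsSketch`` and the
two predicates are hypothetical names used only to restate the rules above;
the real fields live in ``tagEntryInfo``.

```c
#include <stdbool.h>

/* Hypothetical mirror of the two one-bit fields described above. */
typedef struct {
    unsigned int placeholder : 1; /* 1: keep out of the tags file */
    unsigned int barrel      : 1; /* 1: hand over to the next MM pass */
} TagFlagsSketch;

/* The tag is written to the tags file when placeholder is 0. */
static bool goesToTagsFile (TagFlagsSketch f)
{
    return f.placeholder == 0;
}

/* The tag is stored in the barrel when barrel is 1. */
static bool goesToBarrel (TagFlagsSketch f)
{
    return f.barrel == 1;
}
```

A tag with ``placeholder`` set to 1 and ``barrel`` set to 1 goes only to the
barrel, which matches delaying the output of "Y" until the second pass.
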
20 changes: 18 additions & 2 deletions main/entry.c
@@ -44,6 +44,7 @@
#include "fmt.h"
#include "kind.h"
#include "main.h"
#include "mm.h"
#include "options.h"
#include "ptag.h"
#include "read.h"
@@ -1062,8 +1063,15 @@ extern void clearParserFields (tagEntryInfo *const tag)
}
}

-static void clearTagEntryInQueue (tagEntryInfo* slot)
+extern void clearTagEntry (tagEntryInfo* slot)
{
if (slot->barrel)
{
slot->barrel = 0;
pourEntryToBarrel (slot);
return;
}

if (slot->pattern)
eFree ((char *)slot->pattern);
eFree ((char *)slot->inputFileName);
@@ -1224,7 +1232,7 @@ extern void uncorkTagFile(void)
makeQualifiedTagEntry (tag);
}
for (i = 1; i < TagFile.corkQueue.count; i++)
-clearTagEntryInQueue (TagFile.corkQueue.queue + i);
+clearTagEntry (TagFile.corkQueue.queue + i);

memset (TagFile.corkQueue.queue, 0,
sizeof (*TagFile.corkQueue.queue) * TagFile.corkQueue.count);
@@ -1558,3 +1566,11 @@ extern const char* getTagFileDirectory (void)
{
return TagFile.directory;
}

void handOverEntryToNextMMPass (unsigned int n)
{
Assert (n != CORK_NIL);

tagEntryInfo *e = getEntryInCorkQueue (n);
e->barrel = 1;
}
4 changes: 4 additions & 0 deletions main/entry.h
@@ -51,6 +51,7 @@ struct sTagEntryInfo {
unsigned int placeholder :1; /* This is just a part of scope context.
Put this entry to cork queue but
don't print it to tags file. */
unsigned int barrel: 1; /* store to barrel */

unsigned long lineNumber; /* line number of tag */
const char* pattern; /* pattern for locating input line
@@ -172,6 +173,9 @@ void uncorkTagFile(void);
tagEntryInfo *getEntryInCorkQueue (unsigned int n);
tagEntryInfo *getEntryOfNestingLevel (const NestingLevel *nl);
size_t countEntryInCorkQueue (void);
void handOverEntryToNextMMPass (unsigned int n);

extern void clearTagEntry (tagEntryInfo* slot);

extern void makeFileTag (const char *const fileName);

24 changes: 20 additions & 4 deletions main/keyword.c
@@ -28,6 +28,7 @@ typedef struct sHashEntry {
const char *string;
langType language;
int value;
bool dynamic;
} hashEntry;

/*
@@ -87,14 +88,15 @@ static unsigned int hashValue (const char *const string, langType language)
}

static hashEntry *newEntry (
-const char *const string, langType language, int value)
+const char *const string, langType language, int value, bool dynamic)
{
hashEntry *const entry = xMalloc (1, hashEntry);

entry->next = NULL;
entry->string = string;
entry->language = language;
entry->value = value;
entry->dynamic = dynamic;

return entry;
}
@@ -104,15 +106,15 @@ static hashEntry *newEntry (
* should be added in lower case. If we encounter a case-sensitive language
* whose keywords are in upper case, we will need to redesign this.
*/
-extern void addKeyword (const char *const string, langType language, int value)
+static void addKeywordCommon (const char *const string, langType language, int value, bool dynamic)
{
const unsigned int index = hashValue (string, language) % TableSize;
hashEntry *entry = getHashTableEntry (index);

if (entry == NULL)
{
hashEntry **const table = getHashTable ();
-table [index] = newEntry (string, language, value);
+table [index] = newEntry (string, language, value, dynamic);
}
else
{
@@ -131,11 +133,23 @@ extern void addKeyword (const char *const string, langType language, int value)
if (entry == NULL)
{
Assert (prev != NULL);
-prev->next = newEntry (string, language, value);
+prev->next = newEntry (string, language, value, dynamic);
}
}
}

extern void addKeyword (const char *const string, langType language, int value)
{
addKeywordCommon (string, language, value, false);
}

extern void addKeywordStrdup (const char *const string, langType language, int value)
{

char *s = eStrdup (string);
addKeywordCommon (s, language, value, true);
}

static int lookupKeywordFull (const char *const string, bool caseSensitive, langType language)
{
const unsigned int index = hashValue (string, language) % TableSize;
@@ -179,6 +193,8 @@ extern void freeKeywordTable (void)
while (entry != NULL)
{
hashEntry *next = entry->next;
if (entry->dynamic)
eFree ((char *)entry->string);
eFree (entry);
entry = next;
}
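
The ownership rule that the ``dynamic`` flag introduces in ``main/keyword.c``
can be illustrated in isolation. This is a standalone sketch, not the ctags
implementation: ``EntrySketch``, ``pushEntry`` and ``freeEntries`` are
hypothetical, and plain ``malloc``/``free`` stand in for ``xMalloc``,
``eStrdup`` and ``eFree``.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node: `dynamic` records whether `string` is owned. */
typedef struct sEntrySketch {
    struct sEntrySketch *next;
    const char *string;
    bool dynamic; /* true when `string` was duplicated and must be freed */
} EntrySketch;

/* Prepend an entry. With duplicate == true the string is copied, so the
 * caller's buffer may change or go away afterwards. On allocation failure
 * the list is returned unchanged. */
static EntrySketch *pushEntry (EntrySketch *head, const char *s, bool duplicate)
{
    EntrySketch *e = malloc (sizeof *e);
    if (e == NULL)
        return head;

    if (duplicate)
    {
        size_t len = strlen (s) + 1;
        char *copy = malloc (len);
        if (copy == NULL)
        {
            free (e);
            return head;
        }
        memcpy (copy, s, len);
        e->string = copy;
    }
    else
        e->string = s;

    e->dynamic = duplicate;
    e->next = head;
    return e;
}

/* Free the list, releasing only the strings it owns.
 * Returns how many strings were freed. */
static int freeEntries (EntrySketch *head)
{
    int freed = 0;
    while (head != NULL)
    {
        EntrySketch *next = head->next;
        if (head->dynamic)
        {
            free ((char *) head->string);
            freed++;
        }
        free (head);
        head = next;
    }
    return freed;
}
```

Statically allocated keywords are stored by pointer and never freed, while
keywords built at run time are copied and released together with the table.
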
6 changes: 6 additions & 0 deletions main/keyword.h
@@ -25,7 +25,13 @@
/*
* FUNCTION PROTOTYPES
*/

/* `string' should be allocated statically. */
extern void addKeyword (const char *const string, langType language, int value);

/* addKeywordStrdup does strdup `string'.
Duplicated string is freed in freeKeywordTable() */
Contributor

Extra space in comment

Member Author

Thank you. I will make a fixup! commit.

extern void addKeywordStrdup (const char *const string, langType language, int value);
extern int lookupKeyword (const char *const string, langType language);
extern int lookupCaseKeyword (const char *const string, langType language);
extern void freeKeywordTable (void);
3 changes: 3 additions & 0 deletions main/main.c
@@ -70,6 +70,7 @@
#include "error.h"
#include "field.h"
#include "keyword.h"
#include "mm.h"
#include "main.h"
#include "options.h"
#include "read.h"
@@ -481,6 +482,8 @@ static void batchMakeTags (cookedArgs *args, void *user CTAGS_ATTR_UNUSED)
if (! files && Option.recurse)
resize = recurseIntoDirectory (".");

resize = mmRun ()? true: resize;

timeStamp (1);

if ((! Option.filter) && (!Option.printLanguage))