JavaScript: Ignoring certain tag patterns for JS #1680

Open
kristijanhusak opened this issue Feb 6, 2018 · 20 comments

@kristijanhusak

I assume there is already a way to do this; I just wasn't able to figure it out.
I'm using ctags on a Node.js project that uses the Node module structure with require()/module.exports.

I would like to skip generating tags for constants that are assigned from a require() call.
For example, this is a line in the tags file that I would like to skip generating:

CustomerValidator	lib/domain/user/signup_validator.js	/^const CustomerValidator = require('..\/customer\/customer_validator');$/;"	C

I know I could skip generating the C kind, but I would still love to keep it for other things.

Thanks!

@masatake
Member

masatake commented Feb 6, 2018

There is no way to do so in ctags.

grep may help you to strip unwanted items like:

ctags -o - the_node_file.js | grep -v require > ./tags

I don't know Node.js well, so I wonder why you want to skip the require lines.
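
The grep filter above drops every tag line that mentions require anywhere. A slightly narrower filter could look like the sketch below. It assumes the default tab-separated tags format (name, file, pattern, then extension fields), and `isRequireAlias` and `filterTags` are hypothetical helper names, not part of ctags.

```javascript
// Sketch: drop tag entries whose search pattern is a `... = require(...)` line.
// Assumes the default tags format: name<TAB>file<TAB>pattern;"<TAB>kind...
function isRequireAlias(tagLine) {
  const fields = tagLine.split('\t');
  if (fields.length < 3) return false; // not a regular tag entry
  const pattern = fields[2];
  // ctags escapes "/" as "\/" inside the pattern, but "require(" is left as-is.
  return /=\s*require\(/.test(pattern);
}

// Keep every line that is not a require() alias.
function filterTags(text) {
  return text
    .split('\n')
    .filter((line) => !isRequireAlias(line))
    .join('\n');
}

const sampleTags = [
  "CustomerValidator\tlib/domain/user/signup_validator.js\t/^const CustomerValidator = require('..\\/customer\\/customer_validator');$/;\"\tC",
  "createSchema\tlib/domain/customer/customer_validator.js\t/^const createSchema = Joi.object().keys({$/;\"\tC",
].join('\n');

const filteredTags = filterTags(sampleTags);
console.log(filteredTags); // only the createSchema line remains
```

Unlike a plain `grep -v require`, this keeps tags whose name or file path merely contains the word "require".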

@kristijanhusak
Author

I want to skip them because when I use go to definition in Vim, it goes to the file where I imported that tag instead of going to that tag directly.

@codebrainz
Contributor

@kristijanhusak not a direct solution, but Node supports the standard JS import mechanism; maybe that will be handled better by ctags?

@masatake
Member

masatake commented Feb 7, 2018

I understand what you wrote as follows:

At the JavaScript language level, CustomerValidator is defined twice:
once as a const in signup_validator.js, and once as something in customer_validator.js.
When you do a "go to definition" operation in Vim, you expect it to take you to the latter one. However, it shows both and asks you to choose.

Am I correct?

If your answer is yes, I would like to know about the customer_validator.js side.
I wonder how CustomerValidator is defined in customer_validator.js.

At the JavaScript level, there is no solution for your trouble. The code const foo = ... defines foo, and capturing definitions is what ctags should do; ctags should capture as many definitions as possible. You can apply a filter like grep to the ctags output, or you can use a smarter (or customizable) front end that chooses the proper one when a name is tagged twice or more. What I have written here is the fundamental design policy of u-ctags.

However, u-ctags supports sub-languages built on a base language.
(http://docs.ctags.io/en/latest/running-multi-parsers.html#tagging-definitions-of-higher-upper-level-language-sub-base)

I wonder whether I can do something interesting for Node.js input.
Knowing how CustomerValidator is defined in customer_validator.js is the first step toward implementing a nodejs subparser.
However, even if u-ctags handles Node.js in a special way, u-ctags will just emit more tags for CustomerValidator. The result will not be what you want.

@kristijanhusak
Author

@codebrainz if you are talking about ESM (import module from 'file'), that is not an option, since this is an existing project, and a big one. But yeah, ctags works much better with those types of imports, which I noticed on a frontend React project.

@masatake It's more than once; it's everywhere I required it.

CustomerValidator	lib/domain/appointment/validation/schema.js	/^const CustomerValidator = require('..\/..\/customer\/customer_validator');$/;"	C
CustomerValidator	lib/domain/user/signup_validator.js	/^const CustomerValidator = require('..\/customer\/customer_validator');$/;"	C
CustomerValidator	test/generators/appointment_generator.js	/^const CustomerValidator = require('..\/..\/lib\/domain\/customer\/customer_validator');$/;"	C
CustomerValidator	test/unit/lib/domain/customer/create_customer_use_case_spec.js	/^const CustomerValidator = require('..\/..\/..\/..\/..\/lib\/domain\/customer\/customer_validator/;"	C
CustomerValidator	test/unit/lib/domain/customer/customer_validator_spec.js	/^const CustomerValidator = require('..\/..\/..\/..\/..\/lib\/domain\/customer\/customer_validator/;"	C

The main problem is that customer_validator.js itself doesn't have a proper definition from which a tag can be generated. These are the contents of customer_validator.js:

const Joi = require('../validation/joi');
const Validator = require('../validation/basic_validator');

const createSchema = Joi.object().keys({
  phone: Joi.string().phone().optional().allow(''),
  email: Joi.string().email().optional().allow(''),
});

module.exports = {
  validateCreate: data => Validator.validate(data, createSchema),
};

When I do something like this in customer_validator.js, the tag line for it gets generated:

// ..

const CustomerValidator = {
  validateCreate: data => Validator.validate(data, createSchema),
};
module.exports = CustomerValidator;

In tags I get:

CustomerValidator	lib/domain/customer/customer_validator.js	/^const CustomerValidator = {$/;"	c

I understand why the tag cannot be generated for this particular file, but I would like to avoid having bad tags even if I don't have the right one.

@masatake
Member

masatake commented Feb 7, 2018

I wrote a long reply in English, but I mistakenly closed the page before submitting it.
So I wrote it in C instead.

[yamato@master]~/var/ctags-github% cat customer_validator.js
const Joi = require('../validation/joi');
const Validator = require('../validation/basic_validator');

const createSchema = Joi.object().keys({
  phone: Joi.string().phone().optional().allow(''),
  email: Joi.string().email().optional().allow(''),
});

module.exports = {
  validateCreate: data => Validator.validate(data, createSchema),
};
[yamato@master]~/var/ctags-github% git diff 
diff --git a/main/lregex.c b/main/lregex.c
index 6bdbe0a8..1014b57d 100644
--- a/main/lregex.c
+++ b/main/lregex.c
@@ -967,6 +967,35 @@ static void parseKinds (
 *   Regex pattern matching
 */
 
+static char *
+translate(const char *input, const char* engine)
+{
+	int i;
+	bool seen_underscore = false;
+	vString *buf = vStringNew();
+
+	for (i = 0; i < strlen(input); i++)
+	{
+		if (i == 0)
+			seen_underscore = true;
+
+		if (seen_underscore && isalnum(input[i]))
+		{
+			vStringPut(buf,
+					   (islower (input[i]))
+					   ? input[i] - ('a' - 'A')
+					   : input[i]);
+			seen_underscore = false;
+		}
+		else if (isalnum(input[i]))
+			vStringPut (buf, input[i]);
+		else if (input[i] == '-' || input[i] == '_')
+			seen_underscore = true;
+		else if (input[i] == '.')
+			break;
+	}
+	return vStringDeleteUnwrap (buf);
+}
 
 static vString* substitute (
 		const char* const in, const char* out,
@@ -976,14 +1005,27 @@ static vString* substitute (
 	const char* p;
 	for (p = out  ;  *p != '\0'  ;  p++)
 	{
-		if (*p == '\\'  &&  isdigit ((int) *++p))
+		if (*p == '\\')
 		{
-			const int dig = *p - '0';
-			if (0 < dig  &&  dig < nmatch  &&  pmatch [dig].rm_so != -1)
+			p++;
+			if (isdigit ((int) *p))
 			{
-				const int diglen = pmatch [dig].rm_eo - pmatch [dig].rm_so;
-				vStringNCatS (result, in + pmatch [dig].rm_so, diglen);
+				const int dig = *p - '0';
+				if (0 < dig  &&  dig < nmatch  &&  pmatch [dig].rm_so != -1)
+				{
+					const int diglen = pmatch [dig].rm_eo - pmatch [dig].rm_so;
+					vStringNCatS (result, in + pmatch [dig].rm_so, diglen);
+				}
 			}
+			else if (((int)(*p)) == 'F')
+			{
+				const char *f0 = getInputFileName ();
+				char *f1 = translate (baseFilename(f0), "CamelCase");
+				vStringCatS (result, f1);
+				eFree (f1);
+			}
+			else
+				/* ???*/;
 		}
 		else if (*p != '\n'  &&  *p != '\r')
 			vStringPut (result, *p);
[yamato@master]~/var/ctags-github% cat nodejs.ctags
--langdef=nodejs{base=JavaScript}
--kinddef-nodejs=m,module,modules
--extradef-nodejs=implicitDefinedModule,implicitly defined module that can be passed to require
--regex-nodejs=/^module.exports *= *\{/\F/m/{_extra=implicitDefinedModule}{translator=Basename,CameCase}

[yamato@master]~/var/ctags-github% ./ctags --options=./nodejs.ctags  --fields=+lK  --extras-nodejs=+'{implicitDefinedModule}' -o - customer_validator.js
CustomerValidator	customer_validator.js	/^module.exports = {$/;"	module	language:nodejs
Joi	customer_validator.js	/^const Joi = require('..\/validation\/joi');$/;"	constant	language:JavaScript
Validator	customer_validator.js	/^const Validator = require('..\/validation\/basic_validator');$/;"	constant	language:JavaScript
createSchema	customer_validator.js	/^const createSchema = Joi.object().keys({$/;"	constant	language:JavaScript
exports	customer_validator.js	/^module.exports = {$/;"	class	language:JavaScript	class:module
validateCreate	customer_validator.js	/^  validateCreate: data => Validator.validate(data, createSchema),$/;"	property	language:JavaScript	class:module.exports
[yamato@master]~/var/ctags-github% 

ctags captures CustomerValidator from customer_validator.js as an extra tag.
(The translators are stubs; they are not implemented yet.)

If you tell your editor or file viewer, assuming it is smart enough, to give nodejs tags higher priority than JavaScript tags, you can go directly to customer_validator.js. It is up to your tool :-P.
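
What such a front end might do with the language field can be sketched as follows. This is only an illustration of the idea, assuming tags were generated with --fields=+lK as in the transcript above; `parseTagLine` and `pickDefinition` are hypothetical names, and the file paths in the sample are made up for the example.

```javascript
// Parse one tags line; only the name, file and "language:" extension field are used.
function parseTagLine(line) {
  const [name, file, ...rest] = line.split('\t');
  const tag = { name, file, language: null };
  for (const field of rest) {
    if (field.startsWith('language:')) tag.language = field.slice('language:'.length);
  }
  return tag;
}

// Pick the candidate whose language comes first in preferredLanguages.
// Languages not in the list rank last; ties keep their input order.
function pickDefinition(lines, name, preferredLanguages) {
  const rank = (tag) => {
    const i = preferredLanguages.indexOf(tag.language);
    return i === -1 ? preferredLanguages.length : i;
  };
  const candidates = lines.map(parseTagLine).filter((t) => t.name === name);
  candidates.sort((a, b) => rank(a) - rank(b)); // sort is stable since ES2019
  return candidates[0] || null;
}

const tagLines = [
  "CustomerValidator\tlib/domain/user/signup_validator.js\t/^const CustomerValidator = require('..\\/customer\\/customer_validator');$/;\"\tconstant\tlanguage:JavaScript",
  "CustomerValidator\tlib/domain/customer/customer_validator.js\t/^module.exports = {$/;\"\tmodule\tlanguage:nodejs",
];

const chosenTag = pickDefinition(tagLines, 'CustomerValidator', ['nodejs', 'JavaScript']);
console.log(chosenTag.file); // lib/domain/customer/customer_validator.js
```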

@masatake
Member

masatake commented Feb 7, 2018

translator (or translate) is a bad name; I think I should call it transformer. You can define your own transformers in C and apply them to a string picked up by a regex pattern. \F can be used like \1; it represents the name of the current input file.

If I find enough supporters of this idea, I will finish the patch.

@masatake
Member

masatake commented Feb 7, 2018

I should use \{input} instead of \F.
A cryptic short name is difficult to find via web search.

@kristijanhusak
Author

That would be awesome! I have a lot of files like that, for which I don't have a proper tag definition.
From what I can read in your code (I'm not so good with C), you do this for all files that have an underscore in their name. Is that right?

If so, maybe we should limit creating these to situations where we don't have a proper tag definition in a file, like this module.exports = {} thing.

If this gets added, I believe it should be optional, since some people probably don't want this to happen by default.

@masatake
Member

masatake commented Feb 7, 2018

It does NOT generate these tags for all input files with an underscore in the name. It generates tags only for files matching the specified pattern.

The change consists of two parts: one is written in C and the other is written in a ctags option file.
The C part replaces \F in the --regex-... option with the input file name converted to camel case.
Currently the converter (transformer?) is hard-coded, but it should not be in the future.

What I would like you to look at is the ctags option part. It does many things.
The nodejs parser is defined on top of the JavaScript parser, as a subparser. It is activated only when the JavaScript parser runs. The nodejs parser has its own kind, module, and its own extra, implicitDefinedModule. What the subparser does is very simple: it searches for the pattern module.exports *= *\{. If the pattern matches and --extras-nodejs=+'{implicitDefinedModule}' is given, ctags records the current input file name, after the transformation, as a tag.

The C part I wrote must be added to ctags itself, of course. However, the .ctags option file part should be supplied by the user, which makes it "optional" as you wrote.

I will wait for more comments from other people.
However, I think this is a good hack.
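
For readers who, like the reporter, are not comfortable with C: the hard-coded conversion in the diff above can be restated in JavaScript roughly as below. This is a reading aid that mirrors the logic of the prototype's translate() function as posted, not the code that ctags runs; `translateToPascalCase` and `baseName` are hypothetical names.

```javascript
// Strip the directory part, like baseFilename() in the C prototype.
function baseName(filePath) {
  return filePath.slice(filePath.lastIndexOf('/') + 1);
}

// Mirror of the prototype's translate(): capitalize the first alphanumeric
// character and every alphanumeric character that follows '-' or '_',
// drop the separators, and stop at the first '.'.
function translateToPascalCase(input) {
  let out = '';
  let capitalizeNext = true; // the first alphanumeric is always capitalized
  for (const ch of input) {
    const isAlnum = /[A-Za-z0-9]/.test(ch);
    if (isAlnum && capitalizeNext) {
      out += ch.toUpperCase();
      capitalizeNext = false;
    } else if (isAlnum) {
      out += ch;
    } else if (ch === '-' || ch === '_') {
      capitalizeNext = true;
    } else if (ch === '.') {
      break; // everything from the first dot on is ignored
    }
  }
  return out;
}

const moduleTagName = translateToPascalCase(baseName('lib/domain/customer/customer_validator.js'));
console.log(moduleTagName); // CustomerValidator
```

Note that the conversion only ever runs on the name of a file in which the module.exports pattern matched, so the underscore handling by itself never creates a tag.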

@kristijanhusak
Author

Ok, thanks. Looking forward to it.

@masatake
Member

masatake commented Feb 7, 2018

https://stackoverflow.com/questions/48613460/using-universal-ctags-how-do-i-tag-a-variable-that-references-an-entire-file

The question on that page is closely related to what we discussed here.
Making tags "defined by their filename" is the feature I added.

@codebrainz
Contributor

It might be better to name the option after CommonJS or RequireJS or whichever library provides this pattern/convention rather than making it Node-specific. Later similar hacks could also be added for the competing module systems. This might make more sense than naming the option after Node which is just bog standard JS.

@masatake
Member

@codebrainz, sure, I should not use "nodejs" as the language name. I think the pattern and the language should be defined by users. My proposal is the {input} notation in the --regex- option.

@masatake
Member

I will make a pull request.

masatake self-assigned this Feb 14, 2018
@masatake
Member

masatake commented Mar 4, 2018

A new prototype works. I wrote a very small virtual machine for transforming names.

In the prototype, I introduce the following notation in --regex-<LANG>:

  \{data|transformer0|transformer1|...}

With the new notation, what you want can be written as:

\{input|basename|deleteExntension|PascalCase}

https://en.wikipedia.org/wiki/Camel_case

$ cat /tmp/signup_validator.js 
const Joi = require('../validation/joi');
const Validator = require('../validation/basic_validator');

const createSchema = Joi.object().keys({
  phone: Joi.string().phone().optional().allow(''),
  email: Joi.string().email().optional().allow(''),
});

module.exports = {
  validateCreate: data => Validator.validate(data, createSchema),
};
[yamato@master]~/var/ctags-github% cat mynodejs.ctags
--langdef=mynodejs{base=JavaScript}
--kinddef-mynodejs=m,module,modules
--extradef-mynodejs=implicitDefinedModule,implicitly defined module that can be passed to require
--regex-mynodejs=/^module.exports *= *\{/\{input|basename|deleteExntension|PascalCase}/m/{_extra=implicitDefinedModule}
[yamato@master]~/var/ctags-github% ./ctags --fields=+Kl --options=mynodejs.ctags --extras-mynodejs=+'{implicitDefinedModule}' -o - /tmp/signup_validator.js 
Joi	/tmp/signup_validator.js	/^const Joi = require('..\/validation\/joi');$/;"	constant	language:JavaScript
SignupValidator	/tmp/signup_validator.js	/^module.exports = {$/;"	module	language:mynodejs
Validator	/tmp/signup_validator.js	/^const Validator = require('..\/validation\/basic_validator');$/;"	constant	language:JavaScript
createSchema	/tmp/signup_validator.js	/^const createSchema = Joi.object().keys({$/;"	constant	language:JavaScript
exports	/tmp/signup_validator.js	/^module.exports = {$/;"	class	language:JavaScript	class:module
validateCreate	/tmp/signup_validator.js	/^  validateCreate: data => Validator.validate(data, createSchema),$/;"	property	language:JavaScript	class:module.exports
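
The semantics of the \{data|transformer0|transformer1|...} notation can be illustrated with a small JavaScript sketch. This is not the prototype's virtual machine, only an assumption about what each step does; the operator names follow the thread (with deleteExtension spelled out, whereas the transcript above spells it deleteExntension), and `evaluatePipeline` is a hypothetical name.

```javascript
// Assumed meaning of each transformer; the real prototype may differ in edge cases.
const transformers = new Map([
  ['basename', (s) => s.slice(s.lastIndexOf('/') + 1)],
  ['deleteExtension', (s) => {
    const dot = s.lastIndexOf('.');
    return dot > 0 ? s.slice(0, dot) : s;
  }],
  ['PascalCase', (s) => s
    .split(/[-_]+/)
    .filter(Boolean)
    .map((w) => w[0].toUpperCase() + w.slice(1))
    .join('')],
]);

// Evaluate "data|t0|t1|...": look up the data source, then apply each
// transformer to the result of the previous one, left to right.
function evaluatePipeline(spec, data) {
  const [source, ...steps] = spec.split('|');
  if (!Object.prototype.hasOwnProperty.call(data, source)) {
    throw new Error(`unknown data source: ${source}`);
  }
  return steps.reduce((value, step) => {
    const fn = transformers.get(step);
    if (!fn) throw new Error(`unknown transformer: ${step}`);
    return fn(value);
  }, data[source]);
}

const pipelineResult = evaluatePipeline(
  'input|basename|deleteExtension|PascalCase',
  { input: '/tmp/signup_validator.js' }
);
console.log(pipelineResult); // SignupValidator
```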

@masatake
Member

masatake commented Apr 5, 2018

As I wrote in the last comment, I planned to introduce a shell-like language that uses | to represent a chain of function calls. However, after trying to solve #1577, I think a PostScript-like language is better for the purpose.

shell style:

\{input|basename|deleteExntension|PascalCase}

PostScript style:

\{ input basename deleteExtension PascalCase}

They do not look so different at a glance.
However, with the PostScript style we can implement conditional jumps and loops.
It may even be possible to define a procedure if you want.
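
To make the difference concrete, here is a minimal JavaScript sketch of how a PostScript-style program could be evaluated on an operand stack. It is an illustration under assumptions, not the planned implementation: `runNameProgram` is a hypothetical name, the operator behavior is guessed from the names, and no conditionals or loops are included.

```javascript
// Evaluate a whitespace-separated program on an operand stack.
// Each operator pops its arguments from the stack and pushes its result.
function runNameProgram(program, inputFileName) {
  const stack = [];
  const pop = () => {
    if (stack.length === 0) throw new Error('stack underflow');
    return stack.pop();
  };
  const operators = new Map([
    ['input', () => stack.push(inputFileName)],
    ['basename', () => {
      const s = pop();
      stack.push(s.slice(s.lastIndexOf('/') + 1));
    }],
    ['deleteExtension', () => {
      const s = pop();
      const dot = s.lastIndexOf('.');
      stack.push(dot > 0 ? s.slice(0, dot) : s);
    }],
    ['PascalCase', () => {
      const s = pop();
      stack.push(s.split(/[-_]+/).filter(Boolean)
        .map((w) => w[0].toUpperCase() + w.slice(1)).join(''));
    }],
  ]);
  for (const token of program.trim().split(/\s+/)) {
    const op = operators.get(token);
    if (!op) throw new Error(`unknown operator: ${token}`);
    op();
  }
  return pop(); // the value left on top of the stack is the tag name
}

const stackResult = runNameProgram(' input basename deleteExtension PascalCase', '/tmp/signup_validator.js');
console.log(stackResult); // SignupValidator
```

Because every operator communicates only through the stack, operators that take more than one operand (a condition plus two branches, for example) fit the same model, which a simple left-to-right pipe chain cannot express as easily.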

masatake added this to the Feature plan milestone Apr 5, 2018
masatake changed the title from "Ignoring certain tag patterns for JS" to "JavaScript: Ignoring certain tag patterns for JS" Jan 24, 2019
@gp42

gp42 commented Dec 18, 2019

Are there any updates on this issue?
I have another use case with the Groovy language, where the class name is taken from the file name.

@laxman20

This sounds like exactly what I'm looking for, and I hope we can get it merged. I'm working on an AngularJS project where the convention is that all classes are defined as anonymous class exports.

Given a file named hello-world.controller.js

export default class {
...
}

The tag for this would be HelloWorldController
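
The mapping described here can be sketched in JavaScript as below. It is an assumption about the desired rule (strip only the last extension, then treat '-', '_' and '.' as word separators), and `tagNameFromFile` is a hypothetical name. Note that it differs from the C prototype earlier in the thread, which stops at the first dot and would yield HelloWorld.

```javascript
// Derive a PascalCase tag name from a file name such as hello-world.controller.js.
function tagNameFromFile(filePath) {
  const base = filePath.slice(filePath.lastIndexOf('/') + 1);
  const lastDot = base.lastIndexOf('.');
  const stem = lastDot > 0 ? base.slice(0, lastDot) : base; // drop ".js" only
  return stem
    .split(/[-_.]+/)          // '-', '_' and '.' all separate words
    .filter(Boolean)
    .map((w) => w[0].toUpperCase() + w.slice(1))
    .join('');
}

const controllerTag = tagNameFromFile('src/app/hello-world.controller.js');
console.log(controllerTag); // HelloWorldController
```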

@ctulocal1

Along these same lines, I don’t think ES6 modules are supported. The syntax for them is export function {function name} (). That is, it’s essentially the same as a normal named function definition, but with the keyword export prepended. These definitions are found in the module (usually with a .mjs extension) and are then brought into another module or JavaScript source file using import {function name} from '{filename}', where * can be used to import all exported function names. Is there some way for me to add this with --regexp or a similar flag / feature?
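
As an illustration of the kind of pattern being asked about, the sketch below extracts names from export function declarations with a JavaScript regular expression. It is a line-based approximation, not a parser and not a ctags option; `exportFunctionRe` and `exportedFunctionNames` are hypothetical names, and the pattern would need adapting to whatever regex dialect ctags accepts before it could be used in a --regex-<LANG> option.

```javascript
// Matches lines such as:
//   export function add(a, b) {
//   export async function load () {
//   export default function* gen() {
// and captures the function name.
const exportFunctionRe =
  /^\s*export\s+(?:default\s+)?(?:async\s+)?function\b\s*\*?\s*([A-Za-z_$][\w$]*)\s*\(/gm;

function exportedFunctionNames(source) {
  return Array.from(source.matchAll(exportFunctionRe), (m) => m[1]);
}

const moduleSource = [
  'export function add(a, b) { return a + b; }',
  'export async function load () {}',
  'function internal() {}',
  'export default class {}',
].join('\n');

const exportedNames = exportedFunctionNames(moduleSource);
console.log(exportedNames); // [ 'add', 'load' ]
```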
