More nuance for match classes (class title? function title? etc.) #2521

joshgoebel · 2020-04-30T20:41:10Z

Table of some specific suggestions:

Name	Description
title.class	A class name
title.function	A function name
title.namespace	A namespace name
comment.block	A block comment
meta.shebang	A shebang line for a script file

Is your request related to a specific problem you're having?

Yes, I'd like to have a way to label/highlight names of a given type outside of their original context.

For example:

class Cat extends Pet { ...

Turns into <class><title>Cat</title><keyword>extends</keyword><title>Pet</title>. And if we wanted to style it a certain way we could target .hljs-class .hljs-title. Although Pet is semantically different here and that's not indicated at all.

The same is true for functions:

function cleanMyRoom() {}

Turns into <function><title>cleanMyRoom</title... and hence could be targeted by a theme with .hljs-function .hljs-title... but now we have the usage of these things outside of their definition:

var kitty = new Cat();
Cat.prototype.pet = function() { ... }

In both cases [re: Cat] we can easily figure out this is a "class" object and we could apply title to it... but is it a class? a namespace? a function name? We've lost all context.

And for functions:

let launchTheRockets = () => { ...

Technically launchTheRockets is a name/title here (and we could easily detect that and highlight)... but it's not the function itself (which is () => {...}... so again we have a name without context... and we're forced to fall back to the generic title.

And this problem get worse for languages that also have namespaces that could also be syntactically inferred... all we have is title and perhaps context at the definition site - although we have no namespace class...

The solution you'd prefer / feature you'd like to see added...

I think we may need to consider some new classes, or expanding existing ones with sub classes:

Alternative 1: New classes class_name, function_name, namespace_name
Alternative 2: Expand existing classes with subclass: title.function, title.class, title.namespace...

The way alternative 2 would unfolding during rendering CSS might look something like:

<span class="hljs-title hljs-title-function">launchTheRockets</span>

IE, the existing "broader" class is preserved. This would require no changes to any existing styles, while allowing new themes (or upgrades) to target items with more specificity. I'm a little more attached to system 2 as it seems more easily "backwards compatible".

Any alternative solutions you considered...

Nope, I outlined my two thoughts above.

Additional context...

I spent some time reviewing how Prism handles this and it has a lot more named options for tracking what "kind of thing" a given identifier might be:

constant
class-name
function-name (though most grammars use function)
namespace

This allows it to make some really nice distinctions when highlighting (like treating classes differently than function name, etc).

The text was updated successfully, but these errors were encountered:

joshgoebel · 2020-04-30T20:45:29Z

Option 2 also leaves a lot of runway for expanding other classes:

variable
variable.declaration
title.constant
title.function
function.invocation
meta.doctype
meta.shebang
comment.block
number.hex
literal.boolean

This allows grammars to start slowly becoming more expressive while also still thinking in terms of our initial list of classes to avoid all our existing themes from suddenly becoming invalid overnight. Some of these would even come for "free" to existing grammars simply by our upgrading the mode library to be more explicit, such as "comment.block"...

joshgoebel · 2020-04-30T20:57:47Z

I should note that technically it's possible to do this today in a grammar:

      { // ES6 class
        className: 'class', beginKeywords: 'class', ...,
        contains: [
          { skipBefore:/extends\s+/, begin: IDENT_RE, className: "title title-class title-class-super" },
          hljs.inherit(hljs.UNDERSCORE_TITLE_MODE, {className: "title title-class"})
        ]
      },

Ok, scratch that ALMOST possible (or possible if you want to hack the CSS a bit)... we only add the hljs prefix at the beginning of the string so you end up with:

hljs-title title-class

If someone called this a bug though I'm not sure I'd disagree. But I'd like to formalize this and come up with naming guidance so that over time we can see themes start to taken advantage of these things.

joshgoebel · 2020-04-30T21:10:28Z

I get tripped up on things like function.call/function.invoke vs title.function-call. I think one theme might decide it's nice to highlight function invocations while another might decide that's too noisy.

So filing them all under title.function is technically correct in both cases, yet semantically not enough information to be truly useful.

egor-rogov · 2020-05-01T20:31:23Z

Nice idea.
I've had a look into the classes reference and saw names like meta-keywords and meta-string, that indicates a desire for some hierarchical system.

joshgoebel · 2020-05-02T02:28:33Z

Yes, meta is the only thing we have broken up like that. It's a very good idea. TextMate goes a little overboard with it... going 3-5 levels deep - you can get VERY nuanced. :-)

And if we did it the way I'm suggesting above we could technically switch those to meta.keyword and meta.string and they'd output would still be exactly the same as it's always been.

joshgoebel · 2020-05-02T02:29:58Z

Oh Actually we have builtin-name also, which pairs very poorly with built_in (dash vs underscore, separation vs one word)... so our naming now isn't the most consistent. :-)

egor-rogov · 2020-05-02T11:28:33Z

I'm not sure you really need that level of details in highlighting, especially given that you can't 100% correctly parse source code with regexps... But consistency in classes naming (and usage) certainly won't hurt us (:

joshgoebel · 2020-05-02T19:19:46Z

Oh I wasn't suggesting we go that crazy, but it is easily possible for us to do things like contextually detect differences (in many cases) from a class name and say a function name, etc. So those are the easy wins.

TheElectronWill · 2020-08-17T16:09:36Z

Being more precise is a good idea!
Is there a reason why Pet in class Cat extends Pet is labeled as a "title"? It does seem wrong to me: what we are declaring here is Cat, this is the title, this is what is important here. Pet should be a type IMHO.

joshgoebel · 2020-08-17T16:17:47Z

Just how it's always been done? OTTOMH I think your logic sounds right. It's also still a class though, just not the one being defined... that might make it different than say string, not sure.

joshgoebel · 2020-10-18T12:35:58Z

The title below would also become name.function as grammars were updated.

return options.noHighlightRe.test(languageName);
//             \             ^-- name.function
//              `-- variable.property

Spending some time thinking about this... and it seems "variable.property" and "name.function" would add a lot of fidelity (often requested) and also be fairly easy to pattern recognize for at least most C-like languages. So instead of adding these and no themes having support, and then perhaps waiting forever (or living with only a very few themes that are "enhanced")... I think we could try our best to reuse the existing styling (even if it's not perfect)... so then immediately existing styles pick up the new highlighting and if someone wants to tweak them... I think this might cause some slight pain in the immediate future but have the best long-term result. So here is what I'm currently thinking:

title renamed to "name" => name, name.class, name.function, etc.
variable adds nuance => variable, variable.property, variable.constant, etc.
normal (unhighlighted) used to auto-generate => punctuation (~85% opacity)
string or number used to auto-generate => operator (tweaked opacity and small hue shift)

All existing themes would receive the new styles via programatic updates during the build process, ie these new styles will simply be added when missing. If a theme wants to opt-out (or override the defaults) they need only add the missing selector to the style to prevent the auto-upgrade from adding a selector.

The "hierarchy" here would be handled by the build process... for example a CSS rule for .name would be auto-expanded to cover .name-function (name.function), .name-class (name.class), etc... I like the . notation, so I'd probably support that in the grammars, but we wouldn't necessarily have to - could just use the older - syntax there. So the generated HTML would include only a single class:

class <span class="name-class">Person</span>

The alternative (for hierarchy) is generating a bunch of extra HTML which doesn't seem like a good idea to me. Rather a few additional selectors than a bunch of new HTML.

Thoughts? @allejo @egor-rogov

joshgoebel · 2020-10-18T12:47:17Z

At this point I think we could also perhaps consider allowing enhanced grammars to define their own semantic subgroups... Ie we dictate the top-level domain space "name" but then might allow name.whatever if 100% of programmers in that language would agree whatever is the appropriate contextual naming. This could also be solved with language aliases though (and is probably how we should do it)... So the fictional "Bicycle" language would just define an alias from "name.bicycle" to say "name.class"... then they could use the more descriptive name.bicycle when writing their grammar... this is a trick Prism uses to great effect.

joshgoebel · 2020-10-18T13:10:16Z

One final note about what I mean when I say:

I think this might cause some slight pain in the immediate future

I worry that some (not all) of our themes may have skewed towards being optimized for "scarcity"... ie perhaps "title" pops more than it should because this class was handed out more rarely in the past. That some themes may not handle the sudden flood of color as well as others... but again I think this could be dealt with on a case by case basis - even using the auto-generation logic to purposely "tune" or "exclude" some themes that got flagged as doing very poorly. This isn't an entirely new problem though, I noticed this a bit with the recent change to start highlighting class functions, etc.

Torxed · 2021-02-26T17:00:27Z

Not sure if this would catch:

handle = WithContext()

For instance? It would be nice to be able to highlight the class instantiation of WithContect() in this case.
Much like github does here. Compared to highlight js:

joshgoebel · 2021-02-26T17:04:21Z

I'm not sure what language that is meant to be... in many languages there isn't enough absolutely information in such a snippet to know WithContext is a constructor vs just any old function...

Torxed · 2021-02-27T08:39:51Z

I'm not sure what language that is meant to be... in many languages there isn't enough absolutely information in such a snippet to know WithContext is a constructor vs just any old function...

It was a subset of a larger code. Since higlight.js clearly detects function definitions and classes, it aught to be able to detect the initation/calls to those definitions? And by doing so add more nuance and highlight to the initiation calls.
The broader example where the snippet was taken from, was a Python code:

class WithContext():
	def __init__(self):
		self.value = 5

	def __enter__(self):
		return self

	def __exit__(self, *args, **kwargs):
		if args and args[1] == KeyError:
			pass
		return True

	def multiply(self):
		return self.value * 2

	def crash(self):
		raise KeyError("crashed")

for i in range(100000000):
	handle = WithContext()
	handle.multiply()

joshgoebel · 2021-02-27T17:36:23Z

Since higlight.js clearly detects function definitions and classes, it aught to be able to detect the initiation/calls to those definitions?

We are not a full language parser and we therefore (with small exceptions) do not track any context or parsing state. Once we highlight class WithContext on line 1 we forget about it entirely and when we see WithContext later it has no special meaning (apart from "immediate" context). This is by design as we are a pattern matching highlighter at heart.

Now some languages you have "immediate" context such as JS:

let x = new Widget();

It's possible for us to understand what Widget is here in Javascript (with mere pattern matching) because it's immediately proceeded by new. In other languages this can be determined by casing alone. ThisIsAClass, THIS_IS_A_CONSTANT, this_is_something_else...

But that's not possible in all languages.

joshgoebel · 2021-02-27T17:44:21Z

I've mentioned elsewhere that I'd be curious to see how we might allow 3rd party grammars to integrate a full parser of their own and pair that parser with our own output engine - but that is not a desire for the core library.

Torxed · 2021-02-28T09:21:21Z

I understand, I just started using highlight.js and came across this issue when I wanted to highlight the classes/functions in the code, not just the definition. I get that the parsing might be hell so I fully understand if that might never be implemented. It's fine the way it is :)

joshgoebel · 2021-02-28T17:22:34Z

I wanted to highlight the classes/functions in the code, not just the definition

If all the classes in Python are camel case by STRONG convention it's still possible with just patten matching, and I think we'd be open to such a PR.

joshgoebel · 2021-03-26T17:29:12Z

See #3078. We already have classes with - and _ in them:

template-tag
built_in
etc...

Since . isn't allowed in class names I wonder if we need a unique sequence for the new sub scope support? IE: title.class turns into what (in it's most expanded form)?

Right now I'm using .title-class... but I wonder if perhaps we shouldn't a bit grander:

.title--class
.title__class

Turnerj · 2021-03-27T12:21:03Z

These classes will still have the leading hljs- right? If so probably hljs-title--class over hljs-title__class (in my opinion).

joshgoebel · 2021-04-06T16:35:18Z

This may still change again because of the discussion happening over in PrismJS/prism#2849.

Removed the builtin-name replacing it with built_in in the few (4-5) grammars it was used. Also only 4 themes made any distinction between built_in and builtin-name, the rest choosing to style these exactly the same - so it's never been clear what the difference is or there is none. If more nuance is needed here it should be addressed via #2500 and #2521. In only one grammar was both built_in and builtin-name used side by side so other than that (in combo with very particular themes) this should be a quiet change.

joshgoebel · 2021-05-14T14:41:36Z

Progress has been made on this with v11 (see all the new advanced scopes, etc) and I'm closing this and considering it folded into #2500 which will remain open.

joshgoebel added enhancement An enhancement or new feature theme parser discuss/propose Proposal for a new feature/direction labels Apr 30, 2020

joshgoebel changed the title ~~[Proposal] Improve our ability to highlight names distinctly~~ [Proposal] Improve our ability to highlight important identifiers distinctly Apr 30, 2020

joshgoebel mentioned this issue Apr 30, 2020

Discuss: Higher fidelity language highlighting (in general) #2500

Open

joshgoebel pinned this issue Apr 30, 2020

joshgoebel changed the title ~~[Proposal] Improve our ability to highlight important identifiers distinctly~~ [Proposal] Add additional nuance to highlight classes (Ex: class title? function title? constant title?) Apr 30, 2020

joshgoebel changed the title ~~[Proposal] Add additional nuance to highlight classes (Ex: class title? function title? constant title?)~~ Propose: More nuance for match classes (class title? function title? etc.) Apr 30, 2020

joshgoebel mentioned this issue Sep 11, 2020

enh(R) major overhaul for the R language grammar #2680

Merged

6 tasks

joshgoebel mentioned this issue Sep 29, 2020

WIP: Proper highlighting for the Wolfram Language (Mathematica) #2706

Merged

joshgoebel mentioned this issue Oct 19, 2020

WIP: enh(parser) Better regular expression internals highlighting #2771

Closed

3 tasks

joshgoebel mentioned this issue Nov 6, 2020

(Julia) Improved highlighter #2832

Closed

This was referenced Mar 26, 2021

(JavaScript) It doesn't highlight properties or object methods #3079

Closed

enh(cpp) simpler struct matching #3078

Merged

joshgoebel changed the title ~~Propose: More nuance for match classes (class title? function title? etc.)~~ More nuance for match classes (class title? function title? etc.) Apr 5, 2021

joshgoebel mentioned this issue Apr 6, 2021

Document all token types PrismJS/prism#2849

Open

This was referenced Apr 8, 2021

(chore) consistency improvements with SAS #3116

Merged

Task: Create theme maintainers team and see who is still wanting to help maintain themes #3118

Closed

chore(themes) remove builtin-name CSS class #3119

Merged

joshgoebel mentioned this issue Apr 13, 2021

[Theme Maint] Write up on changes for theme maintainers #3134

Closed

joshgoebel closed this as completed May 14, 2021

joshgoebel unpinned this issue Sep 12, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

More nuance for match classes (class title? function title? etc.) #2521

More nuance for match classes (class title? function title? etc.) #2521

joshgoebel commented Apr 30, 2020 •

edited

Loading

joshgoebel commented Apr 30, 2020 •

edited

Loading

joshgoebel commented Apr 30, 2020 •

edited

Loading

joshgoebel commented Apr 30, 2020 •

edited

Loading

egor-rogov commented May 1, 2020

joshgoebel commented May 2, 2020

joshgoebel commented May 2, 2020

egor-rogov commented May 2, 2020

joshgoebel commented May 2, 2020

TheElectronWill commented Aug 17, 2020 •

edited

Loading

joshgoebel commented Aug 17, 2020 •

edited

Loading

joshgoebel commented Oct 18, 2020 •

edited

Loading

joshgoebel commented Oct 18, 2020

joshgoebel commented Oct 18, 2020

Torxed commented Feb 26, 2021

joshgoebel commented Feb 26, 2021

Torxed commented Feb 27, 2021

joshgoebel commented Feb 27, 2021 •

edited

Loading

joshgoebel commented Feb 27, 2021

Torxed commented Feb 28, 2021

joshgoebel commented Feb 28, 2021

joshgoebel commented Mar 26, 2021 •

edited

Loading

Turnerj commented Mar 27, 2021

joshgoebel commented Apr 6, 2021

joshgoebel commented May 14, 2021 •

edited

Loading

More nuance for match classes (class title? function title? etc.) #2521

More nuance for match classes (class title? function title? etc.) #2521

Comments

joshgoebel commented Apr 30, 2020 • edited Loading

joshgoebel commented Apr 30, 2020 • edited Loading

joshgoebel commented Apr 30, 2020 • edited Loading

joshgoebel commented Apr 30, 2020 • edited Loading

egor-rogov commented May 1, 2020

joshgoebel commented May 2, 2020

joshgoebel commented May 2, 2020

egor-rogov commented May 2, 2020

joshgoebel commented May 2, 2020

TheElectronWill commented Aug 17, 2020 • edited Loading

joshgoebel commented Aug 17, 2020 • edited Loading

joshgoebel commented Oct 18, 2020 • edited Loading

joshgoebel commented Oct 18, 2020

joshgoebel commented Oct 18, 2020

Torxed commented Feb 26, 2021

joshgoebel commented Feb 26, 2021

Torxed commented Feb 27, 2021

joshgoebel commented Feb 27, 2021 • edited Loading

joshgoebel commented Feb 27, 2021

Torxed commented Feb 28, 2021

joshgoebel commented Feb 28, 2021

joshgoebel commented Mar 26, 2021 • edited Loading

Turnerj commented Mar 27, 2021

joshgoebel commented Apr 6, 2021

joshgoebel commented May 14, 2021 • edited Loading

joshgoebel commented Apr 30, 2020 •

edited

Loading

joshgoebel commented Apr 30, 2020 •

edited

Loading

joshgoebel commented Apr 30, 2020 •

edited

Loading

joshgoebel commented Apr 30, 2020 •

edited

Loading

TheElectronWill commented Aug 17, 2020 •

edited

Loading

joshgoebel commented Aug 17, 2020 •

edited

Loading

joshgoebel commented Oct 18, 2020 •

edited

Loading

joshgoebel commented Feb 27, 2021 •

edited

Loading

joshgoebel commented Mar 26, 2021 •

edited

Loading

joshgoebel commented May 14, 2021 •

edited

Loading