Add editor syntax highlighting for jank #24

Open
5 tasks
jeaye opened this issue Apr 15, 2023 · 13 comments

@jeaye
Member

jeaye commented Apr 15, 2023

jank has support for a new special form, called native/raw. It works in place of Clojure's interop syntax and allows for inline C++. But it also supports interpolating jank expressions into that C++. Docs on the rationale and final solution are here: https://github.com/jank-lang/jank/blob/main/DESIGN.md#interop

Right now, all of this gets highlighted as a string in vim/emacs/vscode, but the interpolated forms are Clojure code and should be highlighted accordingly. Ideally, normal code completion, REPL behavior, etc. would work from within those forms too, but we can take this one step at a time.

We may be able to make the syntax highlighting changes here in https://tree-sitter.github.io/tree-sitter/ and call it a day. But we might also consider getting into the vim/emacs/vscode configurations for Clojure and then forking them for jank to add this support. It would be your call on how to tackle this. I use vim and would love to have this working there, but we'll want great tooling for everyone, so we might as well start with whatever you use.

  • vim
  • emacs
  • vscode
  • sublime
  • pulsar
@mauricioszabo

A question: what do you think about replicating what ClojureScript already does? In CLJS, this is a way to do interop:

(js* "10 + ~{}" 30)

Maybe a (native/raw "std::cout << ~{} << \"\n\"" 20) would need less editor support and Kondo could work without changes.

@jeaye
Member Author

jeaye commented Jun 15, 2023

Someone brought this up at the Conj as well. I actually wasn't familiar with the CLJS interpolation syntax, but I think consistency makes sense. Kondo would still need to understand that native/raw is a special form, but the code for ~{} could likely be reused.
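
For anyone picking up the tooling side, here is a minimal sketch of how a linter or highlighter might locate ~{} placeholders in a native/raw string and compare them against the number of arguments. It is written in Python purely for illustration. The names `split_template` and `check_arity` are hypothetical and not part of jank, clj-kondo, or ClojureScript, and the sketch assumes a placeholder is always the literal text ~{} with no escaping rules.

```python
import re

# Assumption: a placeholder is exactly the three characters ~{}.
PLACEHOLDER = re.compile(r"~\{\}")


def split_template(template):
    """Split a template into ("cpp", text) and ("placeholder", "~{}") parts."""
    parts = []
    pos = 0
    for match in PLACEHOLDER.finditer(template):
        # Everything between the previous placeholder and this one is C++.
        if match.start() > pos:
            parts.append(("cpp", template[pos:match.start()]))
        parts.append(("placeholder", match.group()))
        pos = match.end()
    # Trailing C++ after the last placeholder, if any.
    if pos < len(template):
        parts.append(("cpp", template[pos:]))
    return parts


def check_arity(template, args):
    """Return True when the placeholder count matches the argument count."""
    count = sum(1 for kind, _ in split_template(template) if kind == "placeholder")
    return count == len(args)


if __name__ == "__main__":
    print(split_template('std::cout << ~{} << "\\n"'))
    print(check_arity("10 + ~{}", [30]))
```

The "cpp" spans are what an editor would hand to a C++ highlighter, while each placeholder lines up positionally with one of the trailing jank arguments.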

@mauricioszabo

I'll try to implement this in Pulsar, but first this ticket on Tree-Sitter needs to be done: sogaiu/tree-sitter-clojure#52.

Otherwise, "injected" C++ grammar will highlight the whole string, not the contents.

@jeaye
Member Author

jeaye commented Jun 15, 2023

> I'll try to implement this in Pulsar, but first this ticket on Tree-Sitter needs to be done: sogaiu/tree-sitter-clojure#52.
>
> Otherwise, "injected" C++ grammar will highlight the whole string, not the contents.

Excellent! Thanks for the help. I'm looking forward to seeing this. Having great syntax highlighting support for the C++ inside native/raw and the jank inside its interpolation is going to be a huge win. Right now, it looks bad.

[image: raw]

@jeaye
Member Author

jeaye commented Aug 9, 2023

Following up on Vim support for this. I've done some research into foreign syntax regions, and managed to get something working, but it's clunky and it flashes when I move the cursor. I suspect it's due to the performance of the jankNativeRawString regex. When it's just +\zs[^"]*\ze"+, which supports only one line, it works reliably. Multi-line matching is much harder.

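" Load the C++ syntax items into the @CPP cluster.
syn include @CPP syntax/cpp.vim
" Match the (possibly multi-line) string body and highlight it as C++.
syn match jankNativeRawString +"\zs\_.\{-}\ze"+ contains=@CPP
syn region jankNativeRaw matchgroup=Special start=+(native/raw+ end=+)+ contains=jankNativeRawString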

This is a good starting point for moving forward, but I wonder if this is complex enough to require semantic highlighting via LSP instead. Either way, if someone wants to pick this up and run with it, I'd love to get C++ highlighting for all of those functions.

Of course, this doesn't support going back to Clojure with interpolation.

@mauricioszabo

Well, it is possible:
[image]

There are some issues with a string inside a string, because it needs to be escaped... but it's better than having nothing I guess :)

@jeaye
Member Author

jeaye commented Sep 21, 2023

Nice! What editor did you do that in? When I had it around that point in vim, with foreign syntax regions, it flickered whenever I moved around or edited text. Are you seeing that?

@mauricioszabo

mauricioszabo commented Sep 21, 2023

I did it in the Pulsar editor, which is a fork of Atom. I did it with tree-sitter injections, which basically "inject" one language into another.

In Clojure, a string is usually represented as (str_lit), but in this patched tree-sitter version it is represented as (str_lit " (str_content) "). That lets me say "if I have a (str_content) that is a child of a str_lit, which is a child of a list_lit whose first element is native/raw, then inject the C++ language into it".

All the magic happens on this PR: pulsar-edit/pulsar#729, but more specifically, on these lines: https://github.com/pulsar-edit/pulsar/pull/729/files#diff-ed2a3159c63d8f3c78945b50e99346ef08662b9107182d3db84b9be755bd581fR54-R61

As this is all a feature of the editor (injections are used everywhere, all the time), I don't really experience any flicker.
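
The selection rule described above can be sketched outside of any editor. The Python below is only an illustration over a toy tree, not the query from the Pulsar PR. The node kinds str_lit, str_content, and list_lit come from the description above, while `sym_lit`, the `Node` class, and `injection_targets` are assumptions made for the example. The toy tree also leaves out the parenthesis and quote tokens a real parse tree would contain.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a tree-sitter node (hypothetical, for illustration)."""
    kind: str
    text: str = ""
    children: list = field(default_factory=list)


def injection_targets(node):
    """Yield str_content nodes that should be highlighted as C++.

    A node qualifies when it sits inside a str_lit that is the first
    argument of a list_lit whose head symbol is native/raw.
    """
    if node.kind == "list_lit" and node.children:
        head = node.children[0]
        if head.kind == "sym_lit" and head.text == "native/raw":
            # Only the first argument after the head symbol is considered.
            for arg in node.children[1:2]:
                if arg.kind == "str_lit":
                    for sub in arg.children:
                        if sub.kind == "str_content":
                            yield sub
    # Keep walking so nested forms are found as well.
    for child in node.children:
        yield from injection_targets(child)


if __name__ == "__main__":
    # Toy tree for: (native/raw "int x = 1;")
    form = Node("list_lit", children=[
        Node("sym_lit", "native/raw"),
        Node("str_lit", children=[Node("str_content", "int x = 1;")]),
    ])
    for target in injection_targets(form):
        print(target.text)
```

Ordinary strings, such as the argument to println, are skipped because their enclosing list does not start with native/raw.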

@jeaye
Member Author

jeaye commented Sep 21, 2023

Oh, very cool! Great work. I'll add Pulsar to the list at the top of this thread and we can mark it off once everything's merged.

@sogaiu

sogaiu commented Sep 21, 2023

IIUC, Neovim and Emacs do or will have ways to work with tree-sitter-clojure as-is:

We haven't reached a decision about whether to modify tree-sitter-clojure yet.

Below are some of the bits we've been considering regarding this situation.

  • As mentioned in the latter part of this comment, there appear to be other grammars that have a structural similarity to tree-sitter-clojure (WRT strings and their delimiters).

  • As remarked at the end of this comment:

    I haven't seen any recommendations in the official tree-sitter docs regarding structuring nodes in one's grammar to work better with injections; maybe it could be suggested as an addition. Though at this point I wonder how much good it would do.

    Given the large number of grammars in existence (> 200), on the surface it seems unlikely to me that they will all be made to parse strings in a particular manner.

  • There have been some discussions about trying to share some or parts of queries among some editors. I don't know what the status of this is, but it seems to me that if that is pursued in some fashion, support for certain predicates (e.g. #offset! -- IIUC this is used in Neovim to help with some injection cases) might increase.

@dannyfreeman

I don't know much about this jank project, but at a glance it sounds very neat. Are there any new syntax constructs in jank that aren't present in Clojure, apart from the interpolation in the native/raw strings? If there are, we could consider supporting them in tree-sitter-clojure or perhaps a derivative of tree-sitter-clojure.

dannyfreeman added a commit to clojure-emacs/clojure-ts-mode that referenced this issue Sep 24, 2023
This creates a new derived mode for the clojure dialect jank
https://jank-lang.org/

See issue #23 for future work and
jank-lang/jank#24 for the expressed desire to
support nested c++
@jeaye
Member Author

jeaye commented Sep 24, 2023

> Are there any new syntax constructs in jank that aren't present in Clojure, apart from the interpolation in the native/raw strings?

At this moment, no. jank is meant to be strongly Clojure[Script] source compatible and the only differences it should have will be around how it handles interop. For now, that's just via native/raw, but it's possible that the syntax is later extended to more seamlessly support working with C++. I don't foresee this happening in the coming few years, at least.

> If there are, we could consider supporting them in tree-sitter-clojure or perhaps a derivative of tree-sitter-clojure.

That would be superb and I really appreciate the support. There are three primary things needed for strong native/raw support:

  1. Highlighting the contents as C++ (and ideally getting LSP to work with the C++ here, too, understanding the scope of the expression)
  2. Highlighting the interpolation as jank again (and ideally getting back to jank/Clojure LSP in there)
  3. Handling braces/indentation as C++ would, rather than how Clojure would

It's worth noting that these can nest infinitely, so jank -> native/raw -> interpolation -> native/raw -> interpolation and so on is possible. This is a more pathological case, though, and I don't think we'd want our happy path to be hindered to support this, if it came to that.

Again, thanks for the interest here. A great developer tooling experience can make a language blossom just as much as a poor one can ensure it wilts.

@dannyfreeman

dannyfreeman commented Sep 25, 2023 via email
