A Tree-sitter parser for syntax highlighting of Tact language on GitHub.com, and in many editors and IDEs #302

Closed
novusnota opened this issue Aug 23, 2023 · 29 comments · Fixed by #425
Assignees
Labels
Approved This proposal is approved by the committee Developer Tool Related to tools or utilities used by developers

Comments

@novusnota
Contributor

novusnota commented Aug 23, 2023

Summary

The Tact programming language lacks proper syntax highlighting and other quality-of-life developer tools in most editors and IDEs other than VS Code and IntelliJ-based ones. Additionally, there's no syntax highlighting for it on GitHub.com itself.

To address those issues, this proposal suggests developing a Tree-sitter parser for the Tact programming language and contributing it to GitHub's language detection tool, Linguist.

Context

Tree-sitter is a parser generator tool and an incremental parsing library. It can build a concrete syntax tree for a source file and efficiently update the syntax tree as the source file is edited.

Tree-sitter has built-in support for syntax highlighting, via the tree-sitter-highlight library, which is currently used on GitHub.com for highlighting code written in several languages.

Besides its use on GitHub.com, Tree-sitter is used in many editors and IDEs, including NeoVim, Emacs, Helix, and Zed. Although VS Code is currently not on that list, one of the core members of the VS Code team has recently assigned himself to this issue of using Tree-sitter in VS Code, with no estimated release date just yet.

Having a robust Tree-sitter parser would greatly benefit the overall growth and adoption of the Tact programming language and, as a result, TON Blockchain as a whole.

Goals

  • Foster the adoption of the Tact language by offering more accessible developer tooling for the many editors and IDEs that use Tree-sitter
  • Add syntax highlighting for the Tact programming language on GitHub.com

Deliverables

  • Develop a robust Tree-sitter parser for the Tact programming language
  • Contribute Tree-sitter grammars to Linguist, GitHub's language detection tool, to provide syntax highlighting on the platform

Definition of Done

  • GitHub repo with the Tree-sitter parser for Tact
  • Guidelines on usage and integration for editors supporting Tree-sitter
  • PR to the Linguist repo

Reward

  • Standard TON Footstep NFT
  • 3000 USD in TON equivalent

Total: $3000

Estimated Release Date

One month (or less) from approval and assignment.

@novusnota novusnota added the footstep This is a TON Footstep issue label Aug 23, 2023
@novusnota
Contributor Author

novusnota commented Aug 23, 2023

@delovoyhomie, @SwiftAdviser if it's approved, please assign me to it, as I've made some progress on it already when doing Footstep 288

@novusnota
Contributor Author

novusnota commented Aug 23, 2023

Also:

We try only to add languages once they have some usage on GitHub. In most cases we prefer that each new file extension be in use in at least 200 unique :user/:repo repositories before supporting them in Linguist.
Contributing to Linguist

And:

I know this is subjective and open to debate so the loose rules I'll be using are along the lines of:
• at least 2000 files per extension indexed in the last year (the number you see at the top of the search results), unless the extension is expected to only occur once per repo, then 200 files.
• with a reasonable distribution across unique :user/:repo combinations assessed by manually and randomly clicking through the results.
FYI: Temporary change in language and extension popularity assessment

So guys and gals, I need as many .tact files in your repos as possible :)

UPD: According to this search query, GitHub.com reports about 750 files with the .tact extension at the moment. So my point still stands: we need more .tact files! In fact, the required goal is 2k files with a reasonable distribution across unique user/repo combinations, so we can't just put more examples under tact-lang for this to work :)
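A rough sketch of that arithmetic, assuming only the loose thresholds quoted above (2000 indexed files per extension, or 200 when the extension is expected once per repo). The function name is made up for illustration, and the manual check of distribution across user/repo combinations is not modelled:

```python
def files_still_needed(indexed_files: int, once_per_repo: bool = False) -> int:
    """Return how many more indexed files are needed to reach the loose threshold."""
    # Thresholds taken from the quoted Linguist guidance, not from any Linguist API.
    required = 200 if once_per_repo else 2000
    # Never report a negative shortfall once the threshold is met.
    return max(0, required - indexed_files)


if __name__ == "__main__":
    # Roughly 750 .tact files were reported by the search at the time of writing.
    print(files_still_needed(750))
```

With about 750 files indexed, the shortfall under these assumptions is 1250 files.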

I think it's a good idea to ask people who did TON Smart Contract Contests to also include Tact examples in all of their public repositories, that way we'll get a lot more .tact files across GitHub fast

@novusnota novusnota changed the title from "A tree-sitter parser for efficient Tact language syhighlighting, incremental selection, folding, indentation, etc." to "A tree-sitter parser for Tact language highlighting on GitHub.com, and in many editors and IDEs" Aug 23, 2023
@novusnota novusnota changed the title from "A tree-sitter parser for Tact language highlighting on GitHub.com, and in many editors and IDEs" to "A Tree-sitter parser for syntax highlighting of Tact language on GitHub.com, and in many editors and IDEs" Aug 23, 2023
@delovoyhomie
Collaborator

@Naltox, @krigga, what do you think?

@novusnota
Contributor Author

@delovoyhomie Moved the estimated release date from 20.09.23 to "One month (or less) from approval and assignment". I hope that is ok

@SwiftAdviser
Contributor

LGTM! Exactly what @Hiyorimi dreams of

@SwiftAdviser
Contributor

SwiftAdviser commented Sep 25, 2023

Btw, will this task add Tact language to the Markdown files too or not? Like

```tact
contract HelloWorld {
    get fun greeting(): String {
        return "hello world";
    }
}
```

How would that be possible?

@novusnota
Contributor Author

novusnota commented Sep 25, 2023

@SwiftAdviser

Btw, will this task add Tact language to the Markdown files too or not?

Yes! That's the beauty: this would enable highlighting throughout GitHub once they accept the new grammar ;)
And it would also cover this footstep too: #320.

I already have some progress on it, ready to get assigned :)
And, arguably, it needs a reward bump to 3k.

@SwiftAdviser
Contributor

Very nice. Let's do this

@delovoyhomie delovoyhomie added Approved This proposal is approved by the committee Developer Tool Related to tools or utilities used by developers and removed footstep This is a TON Footstep issue labels Oct 10, 2023
@SwiftAdviser
Contributor

@novusnota approved

@novusnota
Contributor Author

novusnota commented Oct 10, 2023

@SwiftAdviser great! I'm very close to finishing #320 (it had its own challenges 😅), but once completed I'll switch to this project ASAP!

@anton-trunov

Looking forward to seeing this implemented! A heads up: we are going to introduce more language constructs, like +=, number literals with underscores, and some more, so making the grammar a bit more general would certainly help here.
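As an illustration of what "number literals with underscores" could mean for a lexer, here is a minimal sketch. It assumes decimal digits separated by single underscores; that rule is an assumption for the example, and the actual Tact grammar may define the literals differently. All names are hypothetical:

```python
import re

# Assumed rule: one or more digit groups joined by single underscores.
# This is an illustration, not the rule from Tact's grammar.
INT_WITH_UNDERSCORES = re.compile(r"[0-9]+(?:_[0-9]+)*")


def is_integer_literal(text: str) -> bool:
    """Check whether the whole string is a decimal literal under the assumed rule."""
    return INT_WITH_UNDERSCORES.fullmatch(text) is not None


def literal_value(text: str) -> int:
    """Convert an accepted literal to its integer value by dropping the underscores."""
    if not is_integer_literal(text):
        raise ValueError(f"not an integer literal: {text!r}")
    return int(text.replace("_", ""))


if __name__ == "__main__":
    for sample in ["42", "1_000_000", "1__000", "_1", "1_"]:
        print(sample, is_integer_literal(sample))
```

Leading, trailing, and doubled underscores are rejected here by design; a grammar that wants to accept them would need a looser pattern.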

@novusnota
Contributor Author

@anton-trunov will do! Tact's Ohm grammar file is very convenient to use as a single source of truth for the language reference for anyone who builds Tact-related tooling out there :)

@novusnota
Contributor Author

@delovoyhomie Submitted a Questbook proposal for this task. Submission deadline is set to 10th of November, 2023. Continuing working to deliver great results on time 🚀

@novusnota
Contributor Author

novusnota commented Nov 10, 2023

Progress report: Almost done. Technically, it's working rather well already, so the deadline (November 10th) is met, but I'll be adding more tests, improving docs, uploading it to a public repository and sending a PR to Linguist tomorrow-ish!

UPD: Probably will take another day or two, but this won't affect #352 :)

@anton-trunov

Amazing! Looking forward to trying it out

@mbaneshi
Contributor

With LSP integrated, we'll get an improved DX: features like intelligent auto-completion, real-time diagnostics, and jump-to-definition.
Thanks to @novusnota, this will be an amazing DX improvement.
As a Neovim user suffering from the lack of a language server implementation, I can't wait to try it.
Meanwhile, to make it ready to use, we need some technical tweaks to integrate it seamlessly. For instance, we should configure Neovim to use the installed language server. We can use plugins like nvim-lspconfig for easy setup.
I'm curious whether this is also part of the task or not.

@anton-trunov

@novusnota Perhaps you could share your repo with the tree-sitter grammar so we could start the preliminary review process? If it's not something you'd like to do, could you then provide some timeline for us to plan our work ahead?

@novusnota
Contributor Author

novusnota commented Nov 27, 2023

@anton-trunov Terribly sorry for the delay, I've been very sick for the last two weeks and am now almost fully recovered. I'm done with the grammar and its testing, and will try to do all the necessary sharing, PRs and related things before the end of Wednesday (29.11). That also includes #352. Thank you for your patience. I'll deliver good results, as always :)

UPD: It may take two more days (i.e. until 01.12), just to be on the safe side with the recovery process

@anton-trunov

Sure, sure, no rush is needed. I wish you a speedy recovery

@novusnota
Contributor Author

@anton-trunov @delovoyhomie Thank you for letting me take extended time off. I'm finally done with my recovery and feeling great! Now, I'll swiftly go over the results of my past work in the next 1-2 days, and I'll make sure to do all the necessary publications and PRs to mark the completion of this bounty and #352. Will stay in touch from now on ✌

@novusnota
Contributor Author

novusnota commented Dec 7, 2023

Brushing up and going over all kinds of optimizations and tests now, expect completion real soon!

By the way, once I make the repository public, should I then move it over to tact-lang right away @anton-trunov?

@anton-trunov

@novusnota Sorry, I missed your comment. Yep, moving new projects to tact-lang org is our default mode.

Any updates on the project you can share?

@anton-trunov

Great stuff! Hoping to see the final result soon. Btw, in the meantime, can you provide your TON wallet address for this SBT https://society.ton.org/contribute-to-tact-compiler? All your great contributions certainly qualify for this

@anton-trunov

Yay! @novusnota has made awesome progress, see here: https://github.com/tact-lang/tree-sitter-tact. I think that once the README.md section on editor integration and other usages is complete and a PR to Linguist is opened (even if unmerged), this bounty can be fully paid, as we cannot realistically expect thousands of Tact files across hundreds of projects to appear overnight.

@anton-trunov

Btw, I tried tree-sitter highlight on some Tact files and the output looks really great [modulo the standard tree-sitter's color scheme :)]

@novusnota
Contributor Author

novusnota commented Jan 31, 2024

It's finally done :)
Introducing a fully-featured 🌳 Tree-sitter grammar & parser for the ⚡ Tact contract programming language!


Deliverables:

  • A robust Tree-sitter parser for the Tact programming language (already moved to tact-lang org) — tact-lang/tree-sitter-tact
    • Fully test covered: general, highlighting and tag tests
    • Usage instructions for Neovim and Helix (with more to come!)
  • Editor queries for Tree-sitter, which enable different applications of the parser:
    • Generic
      • highlights.scm — syntax highlighting queries (for Tree-sitter CLI & GitHub)
      • locals.scm — fixed set of capture names to track local scopes and variables (and alike), used for highlighting
      • tags.scm — tagging queries for code navigation systems (as used on GitHub)
    • Neovim
      • highlights.scm
      • locals.scm — used to extract keyword definitions, scopes, references, etc., but NOT used for highlighting (unlike Generic or Helix queries)
      • injections.scm — highlighting of TODO, FIXME and related in single-line comments
      • folds.scm — syntax folds (note that folding has to be enabled in the editor config in order to use them)
      • indents.scm — indentation levels
      • textobjects.scm — syntax aware text-objects, select, move, swap, and peek support.
      • context.scm — shows sticky context on top of the editor as you scroll through file contents
    • Helix
      • highlights.scm
      • injections.scm
      • indents.scm
      • textobjects.scm
  • PR to the Linguist repo — Add language: Tact, and extension to JSON

About the PR:
It turns out there's a large number of .tact files out there which are unrelated to tact-lang and are, in fact, JSON files following a certain schema used by a haptics gaming company. Because of that, the PR was extended to correctly distinguish between the two by providing a small set of heuristics, tests and code samples. Once the PR is accepted, both .tact as a specific JSON schema and .tact as the actual Tact language will be highlighted appropriately.
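To give a feel for the kind of disambiguation involved, here is a minimal sketch. It is not the heuristic from the Linguist PR; the keyword list and function name are assumptions made up for this example:

```python
import json
import re

# Hypothetical set of top-level Tact keywords; the real PR may match on something else.
TACT_KEYWORDS = re.compile(
    r"^\s*(?:contract|trait|message|struct|import)\b", re.MULTILINE
)


def classify_tact_file(source: str) -> str:
    """Guess whether a .tact file is JSON data or Tact source code."""
    stripped = source.lstrip()
    # JSON documents of interest are objects, so they start with an opening brace.
    if stripped.startswith("{"):
        try:
            json.loads(stripped)
            return "json"
        except json.JSONDecodeError:
            pass  # Not valid JSON; fall through to the keyword check.
    if TACT_KEYWORDS.search(source):
        return "tact"
    return "unknown"


if __name__ == "__main__":
    print(classify_tact_file('{"name": "effect", "duration": 10}'))
    print(classify_tact_file('contract HelloWorld {\n    get fun greeting(): String { return "hi"; }\n}'))
```

Parsing the whole file as JSON is a simplification for this sketch; a classifier that runs on every file of a repository would more likely rely on cheap pattern matching alone.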

About the GitHub integration:
Linguist (and dependent tools) use TextMate grammars. GitHub, however, uses Linguist as well as Tree-sitter grammars.

That is, Tree-sitter supersedes TextMate, but in order to do that, one needs to:

  • First, make a PR with a TextMate grammar to Linguist (this by itself will add syntax highlighting to GitHub). Done!
  • Then, once it's accepted, make a request to the GitHub team to use the Tree-sitter grammar (this will add code navigation & better highlighting). I'll do it once the prior PR to Linguist gets through.

Thankfully, we have a nice VSCode extension for Tact, which features the TextMate grammar, so I've pointed to it in the PR. Many thanks to @logvik for making and maintaining the extension! That's it for step 1, and for now we've all got to write more Tact to get it included in Linguist (and GitHub) in the first place :)

I'll continue maintaining tact-lang/tree-sitter-tact in the meantime, just as I do with tact-lang/tact.vim and prism.js's grammar.

P.S.: @anton-trunov As a bonus I quickly made a TextMate grammar for Ohm and submitted it to Linguist as well, so that in the near future the grammar.ohm may get a nice highlighting on GitHub!

@delovoyhomie @SwiftAdviser

@anton-trunov

@novusnota Fantastic work!

I tried it with Helix and needed to do some extra steps that were not in the instructions, so I opened an issue here: tact-lang/tree-sitter-tact#1.

But that is some minor thing we can fix later, so I consider this project finished.

Looking forward to seeing the Ohm support on GitHub as well :)

@novusnota
Contributor Author

@anton-trunov I've extended Helix instructions, thank you for your fantastic support!

Linguist folks make a release approximately every 3-4 months (to match GitHub Enterprise Server updates), and their last one was in December, so it's rather safe to say that the next release in ~March will feature Ohm support. And, if we're lucky with 800 more .tact files across GitHub, Tact as well :)

@delovoyhomie
Collaborator

Rewards sent!

Thank you for the contribution!
