Add static linting (light) for TeX files #323

clason · 2020-11-19T09:53:17Z

I don't know your plans for texlab 3.0 (and I hesitate to put more on your plate), but would it make sense to add basic(!) static linting to diagnostics for TeX files, similar to what you have for Bib files?

I don't mean anything like ChkTeX (which is too opinionated for my taste anyway), just some basic syntax errors like

mismatched braces (not parentheses/brackets, just to catch something like \frac{e^{\pi}{2})
mismatched environments (\begin{equation*}...\end{equation})
...

(I usually find the TeX compiler output less than helpful, either due to opaque messages or to wrong locations especially with beamer, and these things would be nice to catch before compiling a 400 page manuscript and failing...)

The text was updated successfully, but these errors were encountered:

pfoerster · 2020-11-19T10:04:13Z

This seems perfectly reasonable. In my opinion, ChkTeX is a bit annoying, too. In particular #304 and #186 still linger around. Things like detecting mismatched braces and mismatched environments are pretty easy to implement, since we already do that when parsing. At the moment, I am working on making the parsers for LaTeX and BibTeX incremental so TexLab does not need to reparse on every keystroke and some general parsing improvements. I am almost done with LaTeX parsing and BibTeX should be pretty easy. At the moment, the incremental LaTeX parser is able to reparse large with more documents in less than 1 ms on my machine. This should solve most performance problems.

clason · 2020-11-19T10:13:37Z

That's great; I was hoping that these would be rather easy as byproducts of the parsing. Would it make sense to just push any parsing errors straight to diagnostics? Otherwise it's probably best to start simple and add errors as they are suggested.

Incremental parsing sounds like an excellent improvement; I'm quite excited about that. On that note, have you looked at tree-sitter for parsing? There's no (good) grammar for it yet, but it's built for fast, incremental, error-recovering parsing.

pfoerster · 2020-11-19T10:29:00Z

Would it make sense to just push any parsing errors straight to diagnostics?

Yes, the BibTeX parser already does something like that.

On that note, have you looked at tree-sitter for parsing? There's no (good) grammar for it yet, but it's built for fast, incremental, error-recovering parsing.

I have looked into tree-sitter. Actually, I have two grammars for LaTeX and BibTeX lying around. For BibTeX, tree-sitter is working fine and is producing good results. However, for LaTeX, it has performance problems. I think, the problem is, that LaTeX parse trees are very shallow and not deeply nested like most languages. In particular, parsing plain text seems to be the problem. With no modification to the grammar, tree-sitter is slower at parsing incrementally than our old parser, which is not incremental. The performance can be improved by storing the text as a linked list, which is convenient for the parser because it can reuse more nodes but it makes it harder to work with. In addition to that, I encountered lots of problems with the borrow checker when trying to make the semantic analysis incremental too.

At the moment, I am using a hand-written incremental packrat parser based on the following paper:

Patrick Dubroy and Alessandro Warth. 2017. Incremental packrat parsing. In Proceedings of the 10th ACM SIGPLAN International Conference on Software Language Engineering (SLE 2017). Association for Computing Machinery, New York, NY, USA, 14–25. DOI:https://doi.org/10.1145/3136014.3136022

The hand-written (and not that much optimized. There is stil a paper left which optimizes the invalidation process of the memoization table) parser is about 4 times as fast as the optimized tree-sitter parser and way easier to use. For the syntax tree design, I mostly followed the way Roslyn handles parse trees (green and red trees): Boths trees are immutable. The green tree is constructed bottom-up and can be reused in the next parse, while the red tree is constructed top-down on-demand and is thrown away every parse.

clason · 2020-11-19T10:41:44Z

Thank you, that's very interesting to hear. The packrat parser sounds fascinating, really cool!

Would you be willing to share the tree-sitter parsers? They might not be competitive for texlab, but neovim has started leveraging tree-sitter for an increasing variety of things (syntax highlighting, folding, indentation, structural text editing), and it can't be slower than the current regex parsing :) If it was made public, I'm sure there are quite a few people interested in contributing.

oblitum · 2020-11-19T21:48:57Z

Would love this, chktex can be useful and very annoying at the same time.

pfoerster · 2020-11-21T07:58:04Z

Would you be willing to share the tree-sitter parsers?

I think https://github.com/Aerijo/tree-sitter-bibtex/blob/master/grammar.js already does a good job for BibTeX. It just needs to be modified to handle Windows line endings as well but this change should be straightforward.
Regarding the LaTeX grammar, I can say that our grammar is very much suited to the use in TexLab. In particular, the grammar now needs to be updated to implement parsing of key value pairs as well in order to implement #324. Once that one is done, I can think I can update the tree-sitter grammar and share it. Maybe some one has a use for it.

clason · 2020-11-21T12:07:08Z

I think https://github.com/Aerijo/tree-sitter-bibtex/blob/master/grammar.js already does a good job for BibTeX. It just needs to be modified to handle Windows line endings as well but this change should be straightforward.

Thank you, I'll take a look at it! (EDIT Seems it's not compatible with the latest version of tree-sitter. Bummer.)

Regarding the LaTeX grammar, I can say that our grammar is very much suited to the use in TexLab. In particular, the grammar now needs to be updated to implement parsing of key value pairs as well in order to implement #324. Once that one is done, I can think I can update the tree-sitter grammar and share it. Maybe some one has a use for it.

I think there's definite interest in the https://github.com/nvim-treesitter/nvim-treesitter community! As you wrote, the grammar will probably have to be adapted to that use, but that's no problem -- it'd very helpful to have a working parser to start from. (I looked into writing one from scratch, and quickly got lost...) As such, I wouldn't worry about putting in any more work into the grammar if you're not using it yourself. It's also not urgent by any means, so whenever you get around to it would be fine.

pfoerster · 2021-06-19T10:50:53Z

@clason Do you think, there are more types of (safe) errors that we can report after parsing/analysis that are not annoying for the user? Unfortunately, LaTeX is not as restrictive as BibTeX for example, where we can tell for sure that something is an error.

clason · 2021-06-19T10:53:31Z

If you can catch undefined commands or environments, that might be useful?

pfoerster · 2021-06-19T10:57:42Z

I thought about this one too, but it can be very hard in some cases. Especially if you delve into TeX territory. I am currently updating the latex-completion-data script with a new (more accurate) method of determining the defined commands and environments. Maybe this can be helpful. I think that this should definitely be a warning in the beginning due to the number of false positives that can occur but I'll keep it in mind.

clason · 2021-06-19T11:08:05Z

Yeah, I assumed that this was hard to get (right). Honestly, mismatched or missing braces and environments pretty much already cover 99% of syntax errors I make, so coming up with more is hard ;)

Maybe math commands outside math mode? But you're right, people can redefine them at will, and there's no macro definition so crazy that someone isn't using it...

The only thing that might be safe is (multiple) subscript/superscript?

pfoerster · 2021-06-19T11:13:06Z

The only thing that might be safe is (multiple) subscript/superscript?

This should definitely be doable and pretty easy to implement with the current parser.

clason · 2021-09-27T12:56:56Z

Another thing I would like is to control whether compiler errors and (especially!) warnings are forwarded to diagnostics. (I do not need to see all the overfull boxes... unless I want to verify that the server is running ;))

pfoerster · 2021-09-28T17:00:28Z

Another thing I would like is to control whether compiler errors and (especially!) warnings are forwarded to diagnostics. (I do not need to see all the overfull boxes... unless I want to verify that the server is running ;))

I agree with you that the overfull box warnings are mostly useless as some of them simply cannot/should not be fixed.
Do you think we need a more sophisticated filtering mechanism or is a simple flag for errors and warnings enough at the moment?

clason · 2021-09-28T17:07:36Z

Being able to disable warnings (and optionally errors) is already useful. But I'm sure there will be requests for increased granularity eventually...

VimTeX allows providing a list of regular expressions which are matched against diagnostics to filter out, so you could just list { 'LaTeX Warning', 'Marginpar on page', 'Overfull', 'Underfull' } and what else bugs you.

I think both would be of independent use (some just want to turn off warnings or even all output; others will want more fine-grained control). Start with the easiest, I'd say, and wait until someone requests more ;)

pfoerster · 2022-06-16T16:44:25Z

Another thing I would like is to control whether compiler errors and (especially!) warnings are forwarded to diagnostics. (I do not need to see all the overfull boxes... unless I want to verify that the server is running ;))

@clason There is a PR (#655), which provides simple support for filtering diagnostics based on regular expressions. What do you think?

### Added - Add support for escaping placeholders in forward search ([#649](latex-lsp/texlab#649)) - Add support for diagnostic filtering ([#323](latex-lsp/texlab#323)) - Add pre-built binaries for the following targets: - `aarch64-unknown-linux-gnu` - `armv7-unknown-linux-gnueabihf` - `x86_64-unknown-linux-musl` - `aarch64-pc-windows-msvc` - `i686-pc-windows-msvc` ### Fixed - Parse incomplete server options correctly ([#651](latex-lsp/texlab#651)) ## [4.1.0] - 12.06.2022 ### Added - Add server commands to clean build directory ([#607](latex-lsp/texlab#607)) ### Changed - Improve output when hovering over BibTeX strings - Improve the heuristic for finding build artifacts ([#635](latex-lsp/texlab#635)) ### Fixed - Allow brackets in included file paths ([#639](latex-lsp/texlab#639)) - Allow commands in included file paths ([#641](latex-lsp/texlab#641))

pfoerster self-assigned this Nov 19, 2020

pfoerster added the enhancement New feature or request label Nov 19, 2020

pfoerster mentioned this issue Nov 21, 2020

Complete cite= in acros \DeclareAcronym #324

Closed

clason mentioned this issue Jan 7, 2021

Texlab: No (error) diagnostics after Build Error neovim/nvim-lspconfig#551

Closed

pfoerster mentioned this issue May 3, 2021

Texlab exits with non-zero exit code #405

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add static linting (light) for TeX files #323

Add static linting (light) for TeX files #323

clason commented Nov 19, 2020 •

edited

Loading

pfoerster commented Nov 19, 2020

clason commented Nov 19, 2020

pfoerster commented Nov 19, 2020

clason commented Nov 19, 2020

oblitum commented Nov 19, 2020

pfoerster commented Nov 21, 2020

clason commented Nov 21, 2020 •

edited

Loading

pfoerster commented Jun 19, 2021

clason commented Jun 19, 2021

pfoerster commented Jun 19, 2021

clason commented Jun 19, 2021

pfoerster commented Jun 19, 2021

clason commented Sep 27, 2021

pfoerster commented Sep 28, 2021

clason commented Sep 28, 2021

pfoerster commented Jun 16, 2022

Add static linting (light) for TeX files #323

Add static linting (light) for TeX files #323

Comments

clason commented Nov 19, 2020 • edited Loading

pfoerster commented Nov 19, 2020

clason commented Nov 19, 2020

pfoerster commented Nov 19, 2020

clason commented Nov 19, 2020

oblitum commented Nov 19, 2020

pfoerster commented Nov 21, 2020

clason commented Nov 21, 2020 • edited Loading

pfoerster commented Jun 19, 2021

clason commented Jun 19, 2021

pfoerster commented Jun 19, 2021

clason commented Jun 19, 2021

pfoerster commented Jun 19, 2021

clason commented Sep 27, 2021

pfoerster commented Sep 28, 2021

clason commented Sep 28, 2021

pfoerster commented Jun 16, 2022

clason commented Nov 19, 2020 •

edited

Loading

clason commented Nov 21, 2020 •

edited

Loading