add support for markup language #40
Comments
Yeah, it would be nice to support markdown or LaTeX as you said. However, there are some problems with doing that. First, it's comparatively easy to strip markdown into plain text. There are already some tools to strip it, and I can also use a markdown parser library via a scripting language interface like The main problem is that the position (line, column, offset) of a grammatical error is not correct after stripping markdown or LaTeX. LanguageTool will return results for the stripped plain text, not for the original markdown or LaTeX code. So we need to use a source map to maintain the relation between positions in the markdown or LaTeX code and positions in the converted plain text. As far as I could tell from googling, there is no markdown/LaTeX conversion tool that supports source maps. So I would need to write one, but I don't have the resources to do that currently. Thank you for your suggestion, but my current opinion is that it's hard to support. |
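To make the source-map idea in the comment above concrete, here is a minimal sketch in Python. It is not the plugin's code and not an existing tool: the `MARKUP` pattern, `strip_with_sourcemap` and `to_line_column` are hypothetical names, and the pattern covers only a tiny subset of Markdown (heading markers, `**`, `*` and backticks). The point is only to show that recording the original offset of every kept character is enough to translate a position reported for the plain text back to the source.

```python
import re

# Hypothetical, deliberately tiny subset of Markdown syntax to drop:
# heading markers at line start, emphasis asterisks, inline-code backticks.
MARKUP = re.compile(r"^#{1,6} +|\*\*|[*`]", re.MULTILINE)


def strip_with_sourcemap(source):
    """Return (plain_text, offsets); offsets[i] is the index in `source`
    of the character that ended up at plain_text[i]."""
    plain = []
    offsets = []
    pos = 0
    for match in MARKUP.finditer(source):
        # Copy everything between the previous match and this one.
        for i in range(pos, match.start()):
            plain.append(source[i])
            offsets.append(i)
        pos = match.end()
    for i in range(pos, len(source)):
        plain.append(source[i])
        offsets.append(i)
    return "".join(plain), offsets


def to_line_column(source, offset):
    """Convert a 0-based offset in `source` to a 1-based (line, column)."""
    line = source.count("\n", 0, offset) + 1
    column = offset - (source.rfind("\n", 0, offset) + 1) + 1
    return line, column


source = "# Title\nThis is **an** error.\n"
plain, offsets = strip_with_sourcemap(source)

# Pretend the checker flagged "an" in the plain text.
flagged = plain.index("an")
print(to_line_column(source, offsets[flagged]))
```

A real implementation would need a proper Markdown or LaTeX parser in place of the regular expression; the offset list is the part that carries over.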
I understand that it's hard to support and that's why I initially made a request in A solution might be to simply replace the "markup" with spaces/newlines and disable the formatting (spaces, newlines, ...) errors of Then again, I understand it's not straightforward. |
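The "replace the markup with spaces" suggestion can be sketched as follows. This is an illustration under assumptions, not anything the plugin does: the `HIDDEN` pattern and the function names are hypothetical, and only `\cite`, `\label` and `\ref` are handled. Because every removed character is replaced by exactly one space (newlines are kept), the masked text has the same length and line layout as the source, so positions reported for it apply to the original unchanged.

```python
import re

# Hypothetical pattern: commands whose whole text, argument included,
# should be hidden from the checker (e.g. \cite{key}, \label{sec:x}).
HIDDEN = re.compile(r"\\(?:cite|label|ref)(?:\[[^\]]*\])?\{[^}]*\}")


def blank(match):
    """Replace every character of the match except newlines with a space."""
    return re.sub(r"[^\n]", " ", match.group(0))


def mask_markup(source):
    """Return text of identical length with the hidden commands blanked."""
    return HIDDEN.sub(blank, source)


source = "As shown in \\cite{knuth84}, this are wrong.\n"
masked = mask_markup(source)
print(repr(masked))
```

The blanked spans leave runs of spaces behind, which is why the comment pairs this approach with disabling the checker's whitespace-related errors.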
I would love to use your plugin with LaTeX documents. TeXstudio supports LanguageTool (not perfectly, but it is still useful). Maybe reading their code would give you some clues on how to implement something similar. |
You already offer the possibility to ignore everything but comments, and I think I have read that you identify comments by the highlight group. If that is the case, would it be possible to do the reverse and specify highlight groups we want to ignore? |
For LaTeX, simply ignoring all commands (from highlight groups) could become a real issue. For instance, I hacked a special-purpose solution for my own needs with LaTeX by adding a preprocessor that replaces most of the commands I frequently use with the plain-text representations + enough spaces to avoid shifting the result locations. I have integrated this into the plugin by locally setting
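where
#!/usr/bin/env python3
import os
import shlex
import subprocess
import sys

# Directory containing this wrapper; detex.py is expected to sit next to it.
dir_path = os.path.dirname(os.path.realpath(__file__))

# Pipe the file (last argument) through detex.py and into languagetool,
# passing all other arguments through. Arguments are quoted for the shell.
subprocess.call('cat ' + shlex.quote(sys.argv[-1]) + ' | '
                + shlex.quote(os.path.join(dir_path, 'detex.py')) + ' | '
                + 'languagetool ' + ' '.join(map(shlex.quote, sys.argv[1:-1])),
                shell=True)
and |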
Hello, @languitar's comment looks like a viable workaround. Could this not be integrated into the plugin? |
Sorry for double-posting, but here is a possible workaround.
I created this small Ruby script:
def get_commands(line)
  fullreplace = ["cite", "label"]
  partreplace = ["section", "subsection"]
  # Blank out the whole command, argument included, with as many spaces.
  fullreplace.each do |fr|
    line = line.gsub(/\\#{fr}\{[^}]*\}/) { |m| " " * m.length }
  end
  # Keep the argument text in place; blank only "\name{" and "}".
  partreplace.each do |pr|
    line = line.gsub(/\\#{pr}\{([^}]*)\}/) { " " * (pr.length + 2) + $1 + " " }
  end
  puts line
end

ARGF.each_line do |line|
  get_commands(line)
end
You can pipe TeX into it and it outputs the text with the LaTeX commands replaced by the correct amount of whitespace. I know it's not pretty, just a proof of concept, but couldn't this be used if we specify all LaTeX commands in the two arrays? |
Just for reference, this is my custom script: https://gist.github.com/languitar/2037fccd8520586639aa9f1227bbf8e6 It handles a few more cases. |
This is related: This is interesting too: |
@real-or-random unfortunately, opendetex/detex does not work well with real-life LaTeX documents. I would say that all the tools for converting (La)TeX to text that I have checked offer only very basic functionality. Maybe pandoc will be good enough at some point, but so far it has some problems too. |
Since we don't (yet) have a tool to strip markdown with source maps, would it be possible to open the plain text version in a split and leave it up to the user to find the error in the source? This wouldn't be a huge deal for markdown and would really help. |
@rhysd there is finally a (platform-independent) solution for the problem that blocks you from implementing a markup parser: TeXtidote!
Your Problem
Solution
Can you try to port their logic to |
@krishnakumarg1984 thanks for the info about TeXtidote, it looks pretty interesting. |
YaLafi does filter LaTeX text, too. |
It would be great if TeXtidote support was added; it even supports JSON results for easy parsing. It's a great wrapper for LanguageTool and would make this plugin work on LaTeX and Markdown. |
I wonder if this can be fixed in the plugin instead of LanguageTool. With more and more highlighters supporting fenced languages, Vim now knows which parts of my markdown doc are markdown and which are code blocks. Even if the markdown inline syntax causes a few errors, it would be great if the code blocks could be ignored, since those cause a huge number of problems and are annoying to skip over every time. |
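Ignoring code blocks does not strictly need highlighter information. As a rough sketch outside Vim (the function name `blank_fenced_code` is hypothetical, and the fence detection is much simpler than what the CommonMark specification requires), fenced blocks can be replaced by empty lines before the text is handed to the checker, so the line numbers of the remaining prose stay valid:

```python
def blank_fenced_code(markdown):
    """Replace fenced code blocks (fence lines included) with empty lines,
    keeping the line numbers of the surrounding prose unchanged."""
    out = []
    fence = None  # marker that opened the current block, or None outside one
    for line in markdown.split("\n"):
        stripped = line.lstrip()
        if fence is None:
            if stripped.startswith("```") or stripped.startswith("~~~"):
                fence = stripped[:3]
                out.append("")
            else:
                out.append(line)
        else:
            # Simplification: any line starting with the same marker closes.
            if stripped.startswith(fence):
                fence = None
            out.append("")
    return "\n".join(out)


doc = "Some text.\n```sh\nls -la\n```\nMore text with a error.\n"
print(blank_fenced_code(doc))
```

Inline markup is left untouched here, which matches the comment's trade-off: a few false positives from inline syntax are tolerable, whole code blocks are not.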
If I'm not mistaken, this should be possible with tree-sitter. |
Recently I switched to LTeX, which is based on LanguageTool and does a reasonable job of supporting LaTeX and other markup languages. It can simply be used as a language server in vim/neovim, see https://valentjn.github.io/ltex/installation-usage.html. It tends to use excessive amounts of memory and so far I wasn't able to add words to the dictionary, but otherwise it seems fine. Maybe it's an option for With tree-sitter, I am rather doubtful. I am still waiting for a simple dictionary-type spell checker that works as well as vim's default. There is spellsitter, but it hasn't convinced me to switch yet. |
It would be nice to have support for markup languages like LaTeX or Markdown.
Using grammarous with those languages raises a lot of errors due to the tags and automatic styling (for LaTeX mostly, because the LaTeX compiler handles a lot of style issues).
Furthermore, having things like pieces of code in the text can make using grammarous less pleasant due to the number of errors raised. Supporting those markup languages would mean being able to disable spell checking for things like the verbatim environment in LaTeX or code quotes (``) in Markdown.
I've asked LanguageTool if they could add support for markup languages; their answer is that it's up to the editor to handle the markup language parsing.
This looks like an enhanced version of #10 (if I've understood the issue).
An idea could be to have a dictionary of commands (by filetype). Those commands would parse the file and return the raw text, without the markup and ignored text (like the content of the verbatim environment).
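One way to picture the per-filetype dictionary proposed above is the following Python sketch. Everything in it is an assumption made for illustration: the issue talks about commands, while this uses in-process functions, and `strip_tex`, `strip_markdown`, `FILTERS` and `raw_text` are hypothetical names with intentionally crude filters.

```python
import re


def strip_tex(text):
    """Hypothetical LaTeX filter: drop verbatim environments and command names."""
    text = re.sub(r"\\begin\{verbatim\}.*?\\end\{verbatim\}", "", text,
                  flags=re.DOTALL)
    return re.sub(r"\\[A-Za-z]+\*?", "", text)


def strip_markdown(text):
    """Hypothetical Markdown filter: drop inline code spans."""
    return re.sub(r"`[^`\n]*`", "", text)


# One filter per filetype; unknown filetypes are passed through unchanged.
FILTERS = {
    "tex": strip_tex,
    "markdown": strip_markdown,
}


def raw_text(filetype, text):
    """Return the text to hand to the grammar checker for `filetype`."""
    return FILTERS.get(filetype, lambda t: t)(text)


print(raw_text("markdown", "Run `ls` now."))
```

Filters like these shorten the text, so on their own they do not solve the position-mapping problem raised elsewhere in the thread; they only show how the filetype lookup could be organised.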