This repository has been archived by the owner on Feb 5, 2021. It is now read-only.
Sentence splitting with maths replacements for English texts #22
With
tex2txt.py --lang en ...
, the LaTeX input currently results in this plain text version:
It seems that LanguageTool (LT) does not recognise the dot in 'V.' as the end of a sentence, since 'V.' might be the abbreviation of a first name. Consequently, LT will not complain about the lower-case 'this' starting a new sentence.
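The suspected effect can be illustrated with a toy splitter. This is only a sketch under assumptions: the rule "a single capital letter followed by a dot is an initial" is a guess at the kind of heuristic involved, not LT's actual implementation, and `naive_split` as well as the placeholder `Vvv` are hypothetical names chosen for the illustration.

```python
import re

# Hypothetical heuristic, NOT LanguageTool's real algorithm:
# a token ending in a dot closes a sentence, unless the token is
# a single capital letter plus dot (treated as a name initial).
INITIAL = re.compile(r"^[A-Z]\.$")

def naive_split(text):
    """Split text into sentences using the toy rule above."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        if token.endswith(".") and not INITIAL.match(token):
            sentences.append(" ".join(current))
            current = []
    if current:
        sentences.append(" ".join(current))
    return sentences

# 'V.' looks like an initial, so no boundary is found before 'this'.
single_letter = naive_split("We obtain V. this proves the claim.")
# A longer placeholder word is not mistaken for an initial.
longer_word = naive_split("We obtain Vvv. this proves the claim.")
print(len(single_letter), len(longer_word))
```

Under this toy rule, the lower-case 'this' only becomes a visible sentence start when the replacement is longer than one letter, which matches the motivation for choosing different replacements.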
According to some experiments, the following settings for maths replacements are more appropriate in English texts.
Now, LT's sensitivity seems to be almost as good as for German texts with the current replacement collections ('D1D', 'I1I', ...). Word repetitions due to missing punctuation in equations and missing white space around \text{...} parts are detected as before.
Still, there is at least one difference from the German version. In the following snippet, the missing dot is not detected in the English variant: LT does not complain about the capital 'This'.
But LT also won't generate a message for