Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TextLevelRule conflicts with Java rules usage in LibreOffice plugin #669

Closed
TiagoSantos81 opened this issue Feb 17, 2017 · 11 comments
Closed
Labels
office-integration the LibreOffice/OpenOffice add-on

Comments

@TiagoSantos81
Copy link
Contributor

TiagoSantos81 commented Feb 17, 2017

After the implementation of the new caching mechanism, several issues appeared in the LibreOffice plugin.
The first and most obvious and pressing is the issue with UppercasecaseSentenceStart, that does not trigger properly inside LibreOffice, as seen in the sentence below.
"An error example, of this regression. it is possible to find the error in the beginning of this sentence."

Other related issues are the conflicts with prioritization in Java rules, seen in the Portuguese rules based in AbstractReplace class, and that affect other languages as well.

Last but not least, is the conflict with bitext rules, that demanded it to be disabled in the HTTP server, which may make it unavailable also in off-line usage of some plug-ins.

I would like to work on it, but I do not know where to find the relevant performance data that shows the benefits of the new caching mechanism, so that I can avoid further regressions.
In LanguageTool Regression test pagethe performance graphs do not show any statistically significant difference on any of the performance metrics, so I believe that the metrics that guide this change are not available on that portal.

Where can I find the relevant data? Am I allowed to work on it, or more work on that function is yet to be pushed?

PS - Not that not of the reported issues affect the website, that is working perfectly with all the previous functions.

@danielnaber
Copy link
Member

After the implementation of the new caching mechanism, several issues appeared in the LibreOffice plugin.

The cache itself isn't used in LibreOffice. It's only used when the JLanguageTool object gets created with a cache, which is only the case for the HTTP server so far. So this problem is only related to the cache because I moved some rules to be TextLevelRules. Help with fixing the issues is very welcome.

In LanguageTool Regression test page the performance graphs do not show any statistically significant difference

That's because that check only tests a tiny one-word (or so) "sentence" that's so short it doesn't benefit from caching. You can use org.languagetool.rules.patterns.PerformanceTest to see that the cache actually work, if you modify it to run more often on the same text.

@TiagoSantos81
Copy link
Contributor Author

TiagoSantos81 commented Feb 17, 2017

Excellent. Many thanks.
I will take this as an incentive to learn Java properly.

Many Java rules seem to already work properly, so maybe it is only the ones that are yet to be fully converted that have misbehave, although the UppercaseSentenceRule was actually converted.

Can the issue in it be related with the column adjustement workaround?
237c11a

The rule triggers in the beginning of a paragraph but not inside it, so:

 -              if (match.getLine() == 0) {
 -                newMatch.setColumn(match.getColumn() + 1);
 -              } else {
 -                newMatch.setColumn(match.getColumn());
 -              }
 +                newMatch.setColumn(match.getColumn() + 1);

Can the removal of the conditional in this way work without affecting the HTTP behaviour?

@danielnaber
Copy link
Member

I'm not sure, I guess LO/LT integration works on character positions instead of lines and columns.

@TiagoSantos81
Copy link
Contributor Author

You are right. It doesn't. I tried it now and it does not change, but I am having odd results. If I change the language, e.g. to italian, I get the orthographical error triggers and the sentence trigger. For the untrained eye (like mine) it seemed a much easier solution. I will report any advances, but all tips are welcome.

@danielnaber danielnaber added the office-integration the LibreOffice/OpenOffice add-on label Feb 24, 2017
milekpl added a commit that referenced this issue Apr 24, 2017
@milekpl
Copy link
Member

milekpl commented Apr 24, 2017

Ignore my commit in this thread, this was for issue #699. My bad.

FredKruse added a commit that referenced this issue Aug 24, 2017
@FredKruse
Copy link
Contributor

I set the tokenizeText in the check for paragraph to 'true'.
This solved the Problem with the upperCaseRule and other textLevelRules like SentencesWhitespaceRule. I hope the value 'false' was just a bug and there are no other side effects. I tested it for German texts. It works well. Please test it for other languages.
But there is a general problem with the office extension. The textLevelRules works only for paragraphs. This is OK for rules like Uppercase or whitespace but isn't much satisfactory for rules like wordCoherencyRule.
This is because the office interface which is used works only on the level of paragraphs.
Is there anyone in the community who is so familiar with the office interface to advance the extension for the whole text?

@TiagoSantos81
Copy link
Contributor Author

Brilliant work, Fred.
I made several attempts and spent a lot of time on that file without results. I wouldn't have remembered to fiddle with that variable.
After testing it summarily, all seems to be working as expected. Since this should be tested in more languages, I would suggest closing in one week, if there are no reasonable complains.
The usefulness of LibreOffice and LanguageTool would be greatly improved, if there were more cross-development.

@FredKruse
Copy link
Contributor

Thank you Tiago,
There are two points left so far I see:

  1. A way should be found to analyze the whole text.
  2. Rules that replaces a dot at the end of a sentences for example by a question mark don't work in the dialog box (F7) (out of the text using the right mouse button they work fine). The question mark is inserted before the dot (this produces the next error). The behavior is strange. I don't know if LT handles the dialog box and the suggestions in the text different or if there is a bug inside of libre office.
    I would like to work on both points but it will take time because I am not familiar with the interface and also not with the LT-code. So I have to do a lot of research.

@TiagoSantos81
Copy link
Contributor Author

Rules that replaces a dot at the end of a sentences for example by a question mark don't work in the dialog box (F7)

I can confirm this. I tested it with the 'greeting and farewells' rules that I localized from German to Portuguese, since they have the exact same behaviour in both languages. For example, it replaces the dot in best regards. with best regards,. in an infinite cycle.
The rule works fine outside of the 'spelling and grammar' dialog (F7).

I do not know if this behaviour is new. In the coming days, I will test with an older release and see if it also happens, to see if it is somehow related with TextLevelRules.

I would like to work on both points but it will take time because I am not familiar with the interface and also not with the LT-code. So I have to do a lot of research.

If someone is working on it it is great. This part of the code links to too many parts, that have to be understood deeply. It would be great if you picked it up, but I know how disheartening spending time with this can be.

Regarding these problems, should a new issue be opened?

@FredKruse
Copy link
Contributor

The problem with the dialog is not related with the TextLevelRules. I testet it with rules on sentences level.
Yes, I think, we should open two issues (they are not related, I think):

  1. TextLevelRules are only working on the level of paragraphs
  2. Sentences ending dot can't be over written in dialog box

@TiagoSantos81
Copy link
Contributor Author

This issue seems solved.
@FredKruse
I opened two new issue with the problems we were debating (#781 and #782). TextLevelRules are only working on the level of paragraphs this issue is self evident, but may benefit from more information.

Mbodin pushed a commit to Mbodin/languagetool that referenced this issue Oct 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
office-integration the LibreOffice/OpenOffice add-on
Projects
None yet
Development

No branches or pull requests

4 participants