New LanguageTool release 6.4 #29

Closed
lpla opened this issue Apr 2, 2024 · 11 comments

Comments

@lpla

lpla commented Apr 2, 2024

Hi.

Just to notify that a new LanguageTool version has been released: https://github.com/languagetool-org/languagetool/blob/master/languagetool-standalone/CHANGES.md#64-2024-03-28

@meyayl
Owner

meyayl commented Apr 2, 2024

Thank you for the reminder. I just pushed the updated image.

@ovizii

ovizii commented Apr 3, 2024

It seems the built-in healthcheck is broken. My instance was on "auto-update" and has been in a restart loop since yesterday, when it updated to the new version.

@ovizii

ovizii commented Apr 3, 2024

Even after disabling the healthcheck, it was stuck on 100% CPU and wasn't working. I couldn't see any errors in the logs, so I went back to meyay/languagetool:6.3a-5 and everything is working perfectly again.

@meyayl
Owner

meyayl commented Apr 4, 2024

Interesting observation. I am not sure why you are seeing this behaviour.

On my container host it makes no difference whether I use 6.4 or 6.3a-5.

Have you tried setting the environment LOG_LEVEL to DEBUG to gather more information?

Can you share your compose file, so I can try to reproduce with your exact settings?
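
For reference, a minimal sketch of how that variable could be set in a compose file. The service name and the `latest` tag are assumptions taken from elsewhere in this thread, not a prescribed setup:

```yaml
services:
  languagetool:
    image: meyay/languagetool:latest
    environment:
      # Raises the log verbosity of the container; meant for troubleshooting only.
      - LOG_LEVEL=DEBUG
```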

@ovizii

ovizii commented Apr 4, 2024

Well, let me add a few words before sharing the compose file. I use autoheal to restart unhealthy containers. When the healthcheck of LanguageTool broke, autoheal kept restarting it.

I tried disabling the healthcheck with:

    healthcheck:
      test: ["NONE"]

Which did the trick, except it was stuck on 100% CPU and wasn't working, so I went back to meyay/languagetool:6.3a-5 and all is perfectly working again.
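
As a side note, the Compose specification also offers a `disable` flag that is equivalent to the `test: ["NONE"]` form used above. A minimal sketch, assuming the service is named `languagetool`:

```yaml
services:
  languagetool:
    healthcheck:
      # Turns off any healthcheck defined by the image, same effect as test: ["NONE"].
      disable: true
```

Either form only stops Docker from marking the container unhealthy; it does not address the CPU load described here.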

I will gladly try with the log-level set to debug this evening and report back. Here is my docker-compose.yml in case you can spot something wrong there.

services:

  languagetool:
    image: meyay/languagetool:6.3a-5
    container_name: languagetool
    hostname: languagetool
    restart: "no"
    cap_drop:
      - ALL
    cap_add:
      - CAP_SETUID
      - CAP_SETGID
    security_opt:
      - no-new-privileges
    environment:
      - TZ=Europe/Berlin
      - MAP_UID=1000
      - MAP_GID=1000
      - JAVA_XMS=256m  # OPTIONAL: Setting a minimal Java heap size of 256 MiB
      - JAVA_XMX=3G  # OPTIONAL: Setting a maximum Java heap size of 3 GiB
      - download_ngrams_for_langs=en,de
      - langtool_languageModel=/ngrams
      - langtool_fasttextModel=/fasttext/lid.176.bin
      - langtool_pipelinePrewarming=true
      - langtool_pipelineCaching=true
      - langtool_maxPipelinePoolSize=500 # no clue about optimal value
      - langtool_pipelineExpireTimeInSeconds=3600 # no clue about optimal value
#      - langtool_maxWorkQueueSize=50 # no clue about optimal value
      - langtool_cacheSize=500 # size of internal cache in number of sentences (optional, default: 0)
#      - langtool_maxTextLength=50000 # no clue about optimal value
      - langtool_maxCheckThreads=4 # 10 are default, maybe try 2x CPU cores
    networks:
      languagetool:
        ipv4_address: 192.168.192.22
      traefik_languagetool:
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=traefik_languagetool"
      - "traefik.http.routers.languagetool.tls=true"
      - "traefik.http.routers.languagetool.entrypoints=websecure"
      - "traefik.http.routers.languagetool.rule=Host(`lang.mydomain.tld`)"
      - "traefik.http.routers.languagetool.middlewares=secHeaders@file"
      - "traefik.http.routers.languagetool.service=languagetool"
      - "traefik.http.services.languagetool.loadbalancer.server.port=8010"
    volumes:
      - /opt/languagetool/ngrams:/ngrams
      - /opt/languagetool/fasttext:/fasttext
      - /dev/shm:/tmp
    cpus: 2
    mem_limit: 4G
#    healthcheck:
#      test: ["NONE"]

@ovizii

ovizii commented Apr 4, 2024

OK, I switched back to the latest tag with LOG_LEVEL=DEBUG. The healthcheck still fails, CPU usage stays pretty high, and LT is not reachable.

I don't see anything useful in the logs. Here are some excerpts; the rest is just more of the same. Not sure why Russian is mentioned here at all: I don't speak or read it, and it's not selected in my Chrome extension to be checked. I am the only user of this instance. I checked the logs, and there really is nobody else accessing it.

Disabling the healthcheck changes nothing; the behaviour is the same as described above. After switching back to image: meyay/languagetool:6.3a-5, everything works perfectly fine again.

languagetool  | 2024-04-04 11:15:39 +0000 Setting up thread pool with 4 threads
languagetool  | 2024-04-04 13:15:39.462 GMT+02:00 DEBUG org.languagetool.tools.LtThreadPoolFactory Create new threadPool with corePool: 4 maxThreads: 4 maxTa
skInQueue: 0 identifier: lt-server-thread daemon: false exceptionHandler: org.languagetool.server.Server$$Lambda/0x0000000800181e48@1033576a
languagetool  | 2024-04-04 13:15:40.520 GMT+02:00 INFO  org.languagetool.language.identifier.DefaultLanguageIdentifier Started fastText process for language
identification: Binary /usr/bin/fasttext with model @ /fasttext/lid.176.bin
languagetool  | 2024-04-04 13:15:40.521 GMT+02:00 DEBUG org.languagetool.tools.LtThreadPoolFactory Create new threadPool with corePool: 4 maxThreads: 4 maxTa
skInQueue: 8 identifier: lt-text-checker-thread daemon: false exceptionHandler: org.languagetool.server.TextChecker$$Lambda/0x000000080020acb8@65a15628
languagetool  | 2024-04-04 13:15:40.698 GMT+02:00 INFO  org.languagetool.server.TextChecker Prewarming pipelines...
languagetool  | 2024-04-04 13:15:40.705 GMT+02:00 DEBUG org.languagetool.server.PipelinePool Requesting pipeline; pool has 0 active objects, 0 idle; pipeline
 settings: org.languagetool.server.PipelineSettings@625abb97[lang=Russian,motherTongue=<null>,query=org.languagetool.server.TextChecker$QueryParams@5b1f29fa[
altLanguages=[],enabledRules=[],disabledRules=[WHITESPACE_RULE],enabledCategories=[],disabledCategories=[],useEnabledOnly=false,useQuerySettings=true,allowIn
completeResults=true,enableHiddenRules=true,premium=false,enableTempOffRules=false,regressionTestMode=false,mode=TEXTLEVEL_ONLY,level=PICKY,callback=<null>,i
nputLogging=true],globalConfig=org.languagetool.GlobalConfig@745f,user=UserConfig{dictionarySize=0, maxSpellingSuggestions=0, userDictName='default', configu
rableRuleValues={}, linguServices=null, filterDictionaryMatches=false, textSessionId=null, hidePremiumMatches=false, abTest='null'}]
languagetool  | WARN: ngram index dir /ngrams/ru not found for Russian
languagetool  | 2024-04-04 13:15:41.770 GMT+02:00 DEBUG org.languagetool.server.PipelinePool Fetching pipeline took 1066ms; pool has 1 active objects, 0 idle
; pipeline settings: org.languagetool.server.PipelineSettings@625abb97[lang=Russian,motherTongue=<null>,query=org.languagetool.server.TextChecker$QueryParams
@5b1f29fa[altLanguages=[],enabledRules=[],disabledRules=[WHITESPACE_RULE],enabledCategories=[],disabledCategories=[],useEnabledOnly=false,useQuerySettings=tr
ue,allowIncompleteResults=true,enableHiddenRules=true,premium=false,enableTempOffRules=false,regressionTestMode=false,mode=TEXTLEVEL_ONLY,level=PICKY,callbac
k=<null>,inputLogging=true],globalConfig=org.languagetool.GlobalConfig@745f,user=UserConfig{dictionarySize=0, maxSpellingSuggestions=0, userDictName='default
', configurableRuleValues={}, linguServices=null, filterDictionaryMatches=false, textSessionId=null, hidePremiumMatches=false, abTest='null'}]
languagetool  | 2024-04-04 13:15:43.430 GMT+02:00 DEBUG org.languagetool.AnalyzedTokenReadings '' is immunized by antipattern in line 13507
languagetool  | 2024-04-04 13:15:43.430 GMT+02:00 DEBUG org.languagetool.AnalyzedTokenReadings '' is immunized by antipattern in line 13507
languagetool  | 2024-04-04 13:15:43.430 GMT+02:00 DEBUG org.languagetool.AnalyzedTokenReadings 'LanguageTool' is immunized by antipattern in line 13507
...
...
...
languagetool  | WARN: ngram index dir /ngrams/pt not found for Portuguese (Brazil)
languagetool  | 2024-04-04 13:15:50.093 GMT+02:00 DEBUG org.languagetool.server.PipelinePool Fetching pipeline took 2108ms; pool has 1 active objects, 6 idle
; pipeline settings: org.languagetool.server.PipelineSettings@f4cfd90[lang=Portuguese (Brazil),motherTongue=English,query=org.languagetool.server.TextChecker
$QueryParams@7ae9a33a[altLanguages=[],enabledRules=[],disabledRules=[WHITESPACE_RULE],enabledCategories=[],disabledCategories=[],useEnabledOnly=false,useQuer
ySettings=true,allowIncompleteResults=true,enableHiddenRules=true,premium=false,enableTempOffRules=false,regressionTestMode=false,mode=TEXTLEVEL_ONLY,level=P
ICKY,callback=<null>,inputLogging=true],globalConfig=org.languagetool.GlobalConfig@745f,user=UserConfig{dictionarySize=0, maxSpellingSuggestions=0, userDictN
ame='default', configurableRuleValues={}, linguServices=null, filterDictionaryMatches=false, textSessionId=null, hidePremiumMatches=false, abTest='null'}]
languagetool  | 2024-04-04 13:15:50.775 GMT+02:00 DEBUG org.languagetool.server.PipelinePool Requesting pipeline; pool has 1 active objects, 6 idle; pipeline
 settings: org.languagetool.server.PipelineSettings@f4cfd90[lang=Portuguese (Brazil),motherTongue=English,query=org.languagetool.server.TextChecker$QueryPara
ms@7ae9a33a[altLanguages=[],enabledRules=[],disabledRules=[WHITESPACE_RULE],enabledCategories=[],disabledCategories=[],useEnabledOnly=false,useQuerySettings=
true,allowIncompleteResults=true,enableHiddenRules=true,premium=false,enableTempOffRules=false,regressionTestMode=false,mode=TEXTLEVEL_ONLY,level=PICKY,callb
ack=<null>,inputLogging=true],globalConfig=org.languagetool.GlobalConfig@745f,user=UserConfig{dictionarySize=0, maxSpellingSuggestions=0, userDictName='defau
lt', configurableRuleValues={}, linguServices=null, filterDictionaryMatches=false, textSessionId=null, hidePremiumMatches=false, abTest='null'}]

2024-04-04 13:18:04.964 GMT+02:00 DEBUG org.languagetool.server.PipelinePool Requesting pipeline; pool has 0 active objects, 141 idle; pipeline
 settings: org.languagetool.server.PipelineSettings@7109b603[lang=Russian,motherTongue=Russian,query=org.languagetool.server.TextChecker$QueryParams@5b1f29fa[a
ltLanguages=[],enabledRules=[],disabledRules=[WHITESPACE_RULE],enabledCategories=[],disabledCategories=[],useEnabledOnly=false,useQuerySettings=true,allowIncom
pleteResults=true,enableHiddenRules=true,premium=false,enableTempOffRules=false,regressionTestMode=false,mode=TEXTLEVEL_ONLY,level=PICKY,callback=<null>,inputL
ogging=true],globalConfig=org.languagetool.GlobalConfig@745f,user=UserConfig{dictionarySize=0, maxSpellingSuggestions=0, userDictName='default', configurableRu
leValues={}, linguServices=null, filterDictionaryMatches=false, textSessionId=null, hidePremiumMatches=false, abTest='null'}]
languagetool  | 2024-04-04 13:18:21.405 GMT+02:00 DEBUG org.languagetool.server.PipelinePool Fetching pipeline took 16441ms; pool has 1 active objects, 141 idl
e; pipeline settings: org.languagetool.server.PipelineSettings@7109b603[lang=Russian,motherTongue=Russian,query=org.languagetool.server.TextChecker$QueryParams
@5b1f29fa[altLanguages=[],enabledRules=[],disabledRules=[WHITESPACE_RULE],enabledCategories=[],disabledCategories=[],useEnabledOnly=false,useQuerySettings=true
,allowIncompleteResults=true,enableHiddenRules=true,premium=false,enableTempOffRules=false,regressionTestMode=false,mode=TEXTLEVEL_ONLY,level=PICKY,callback=<n
ull>,inputLogging=true],globalConfig=org.languagetool.GlobalConfig@745f,user=UserConfig{dictionarySize=0, maxSpellingSuggestions=0, userDictName='default', con
figurableRuleValues={}, linguServices=null, filterDictionaryMatches=false, textSessionId=null, hidePremiumMatches=false, abTest='null'}]
languagetool  | 2024-04-04 13:18:21.407 GMT+02:00 DEBUG org.languagetool.server.PipelinePool Requesting pipeline; pool has 1 active objects, 141 idle; pipeline
 settings: org.languagetool.server.PipelineSettings@7109b603[lang=Russian,motherTongue=Russian,query=org.languagetool.server.TextChecker$QueryParams@5b1f29fa[a
ltLanguages=[],enabledRules=[],disabledRules=[WHITESPACE_RULE],enabledCategories=[],disabledCategories=[],useEnabledOnly=false,useQuerySettings=true,allowIncom
pleteResults=true,enableHiddenRules=true,premium=false,enableTempOffRules=false,regressionTestMode=false,mode=TEXTLEVEL_ONLY,level=PICKY,callback=<null>,inputL
ogging=true],globalConfig=org.languagetool.GlobalConfig@745f,user=UserConfig{dictionarySize=0, maxSpellingSuggestions=0, userDictName='default', configurableRu
leValues={}, linguServices=null, filterDictionaryMatches=false, textSessionId=null, hidePremiumMatches=false, abTest='null'}]
languagetool  | 2024-04-04 13:18:50.574 GMT+02:00 DEBUG org.languagetool.server.PipelinePool Fetching pipeline took 29167ms; pool has 2 active objects, 141 idl
e; pipeline settings: org.languagetool.server.PipelineSettings@7109b603[lang=Russian,motherTongue=Russian,query=org.languagetool.server.TextChecker$QueryParams
@5b1f29fa[altLanguages=[],enabledRules=[],disabledRules=[WHITESPACE_RULE],enabledCategories=[],disabledCategories=[],useEnabledOnly=false,useQuerySettings=true
,allowIncompleteResults=true,enableHiddenRules=true,premium=false,enableTempOffRules=false,regressionTestMode=false,mode=TEXTLEVEL_ONLY,level=PICKY,callback=<n
ull>,inputLogging=true],globalConfig=org.languagetool.GlobalConfig@745f,user=UserConfig{dictionarySize=0, maxSpellingSuggestions=0, userDictName='default', con
figurableRuleValues={}, linguServices=null, filterDictionaryMatches=false, textSessionId=null, hidePremiumMatches=false, abTest='null'}]
languagetool  | 2024-04-04 13:18:50.576 GMT+02:00 DEBUG org.languagetool.server.PipelinePool Requesting pipeline; pool has 2 active objects, 141 idle; pipeline
 settings: org.languagetool.server.PipelineSettings@7109b603[lang=Russian,motherTongue=Russian,query=org.languagetool.server.TextChecker$QueryParams@5b1f29fa[a
ltLanguages=[],enabledRules=[],disabledRules=[WHITESPACE_RULE],enabledCategories=[],disabledCategories=[],useEnabledOnly=false,useQuerySettings=true,allowIncom
pleteResults=true,enableHiddenRules=true,premium=false,enableTempOffRules=false,regressionTestMode=false,mode=TEXTLEVEL_ONLY,level=PICKY,callback=<null>,inputL
ogging=true],globalConfig=org.languagetool.GlobalConfig@745f,user=UserConfig{dictionarySize=0, maxSpellingSuggestions=0, userDictName='default', configurableRu
leValues={}, linguServices=null, filterDictionaryMatches=false, textSessionId=null, hidePremiumMatches=false, abTest='null'}]

@meyayl
Owner

meyayl commented Apr 8, 2024

I tried your compose file, and it does indeed result in high CPU utilization. This might be related to the prewarming, which seems to try to load the ngrams of all known languages.

languagetool-test  | 2024-04-08 23:55:51.964 GMT+02:00 INFO  org.languagetool.server.TextChecker Prewarming pipelines...
languagetool-test  | WARN: ngram index dir /ngrams/ru not found for Russian
languagetool-test  | WARN: ngram index dir /ngrams/es not found for Spanish
languagetool-test  | WARN: ngram index dir /ngrams/fr not found for French
languagetool-test  | WARN: ngram index dir /ngrams/pt not found for Portuguese (Brazil)
languagetool-test  | WARN: ngram index dir /ngrams/it not found for Italian

The CPU utilization gets even worse if you set the environment variable JAVA_GC=G1GC. By default, JAVA_GC is set to SerialGC, which is the reason only a single CPU is used/clogged.
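
A minimal sketch of pinning the garbage collector explicitly, assuming the variable sits alongside the other `environment` entries of the `languagetool` service. Only the two values named in this comment are known from the thread:

```yaml
services:
  languagetool:
    environment:
      # SerialGC is the image default according to this comment.
      # G1GC reportedly makes the CPU load worse in this scenario.
      - JAVA_GC=SerialGC
```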

If you comment out the custom configuration for the prewarming, the CPU clogging disappears.

Please feel free to ask anything regarding the mechanics of the image itself, but when it comes to the implementation of LanguageTool itself, I can only suggest that you raise an issue in the LanguageTool GitHub repository and file a bug report for version 6.4.

@ovizii

ovizii commented Apr 9, 2024

Thanks for having a look, I totally understand it's related to LT, so I will check out their forums and reply here if there are any results.

@ovizii

ovizii commented Apr 10, 2024

By the way, I set langtool_pipelinePrewarming=false, switched back to image: meyay/languagetool:latest, and everything is perfectly fine. Very weird behaviour: up to and including meyay/languagetool:6.3a-5, the prewarming worked just fine, and the CPU only spiked for maybe 30 seconds or so after starting.
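
The relevant part of the compose file then looks roughly like this. It is a trimmed sketch based on the file shared earlier in this thread, with the remaining settings omitted:

```yaml
services:
  languagetool:
    image: meyay/languagetool:latest
    environment:
      - download_ngrams_for_langs=en,de
      - langtool_languageModel=/ngrams
      # Was true before; with 6.4 the true setting led to sustained high CPU load.
      - langtool_pipelinePrewarming=false
      - langtool_pipelineCaching=true
```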

I opened my own thread on the LT page here: languagetool-org/languagetool#10488

@meyayl
Owner

meyayl commented Apr 17, 2024

Can I close this issue, as it's not really related to the image mechanics itself?

@ovizii

ovizii commented Apr 17, 2024

Sure, thanks for asking and taking the time to look into it.

meyayl closed this as completed Apr 28, 2024