Invalid command line xml output (COMPOUND) #3181

Closed
kijerk opened this issue Jun 30, 2020 · 6 comments

@kijerk

kijerk commented Jun 30, 2020

After upgrading from LT 4.8 to 5.0, the XML output sometimes (depending on the input) begins with lines like COMPOUND: ..., which makes the file invalid for further processing.

For example:

COMPOUND: Kühlschmier-mittel
COMPOUND: Wartungs-zugang
COMPOUND: Gliedmassen
COMPOUND: Gliedmassen
COMPOUND: Klettbänder
COMPOUND: Wechslerarms
COMPOUND: Destillatgleiches
COMPOUND: Herunterfahrens
COMPOUND: Wechslergabel
COMPOUND: Spindelkopf
COMPOUND: Spannzangenschlüssel
<?xml version="1.0" encoding="UTF-8"?>
...

Can you please remove the leading COMPOUND: ... lines from the output? I do not know whether this was introduced with version 4.9 or 5.0. Switching back to 4.8 does not produce these lines. Thank you.

Just another output from a different file. Hope it helps:

COMPOUND: Lagerfreundliches
COMPOUND: Gießharzsystem
COMPOUND: Umgebungs-temperatur
COMPOUND: Umgebungs-temperatur
COMPOUND: Strangverlauf
COMPOUND: Strangverlauf
10:33:34.375 [ForkJoinPool-1-worker-2] INFO org.languagetool.rules.de.GermanSpellerRule - UNKNOWN: hochbeständigen
COMPOUND: Hochwärmebeständige
COMPOUND: Strangführung
COMPOUND: Strangführung
COMPOUND: Strangführung
COMPOUND: Strangverlauf
...

@danielnaber
Member

We use these to find words not known to the dictionary. Quick workaround: filter out lines that match ^COMPOUND: before interpreting as XML.
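
A minimal sketch of that filter in Python, assuming the raw command-line output has already been captured as a string. The function name `strip_compound_lines` is hypothetical and not part of LanguageTool.

```python
import re

# Matches the diagnostic lines quoted in this issue, e.g. "COMPOUND: Spindelkopf".
COMPOUND_LINE = re.compile(r"^COMPOUND:")


def strip_compound_lines(output: str) -> str:
    """Drop every line that starts with 'COMPOUND:' and keep all other lines unchanged."""
    kept = [
        line
        for line in output.splitlines(keepends=True)
        if not COMPOUND_LINE.match(line)
    ]
    return "".join(kept)
```

This only removes lines with the `COMPOUND:` prefix; any other stray line in the output is passed through untouched.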

@kijerk
Author

kijerk commented Jun 30, 2020

This is the kind of workaround I had in mind, too. But it is just a workaround. I don't know exactly what else can appear in front of the real XML content. And as you can see, the line 10:33:34... doesn't start with COMPOUND:.

These lines are not XML, so the file is not valid. Why is this kind of output in the XML file? I would really like it if you could fix the source of the problem.
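
One way to cope with prefixes other than COMPOUND: (such as the timestamped log line above) is to discard everything before the XML declaration. A minimal Python sketch, assuming all stray lines precede the `<?xml` declaration as in the first example; `drop_preamble` is a hypothetical helper, not part of LanguageTool.

```python
XML_DECLARATION = "<?xml"


def drop_preamble(output: str) -> str:
    """Return the output starting at the first XML declaration.

    Assumption: all non-XML noise comes before the declaration.
    Raises ValueError if no declaration is present.
    """
    start = output.find(XML_DECLARATION)
    if start == -1:
        raise ValueError("no XML declaration found in output")
    return output[start:]
```

This does not help if diagnostic lines are also interleaved with the XML itself, so it remains a workaround rather than a fix.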

@kijerk
Author

kijerk commented Jul 1, 2020

@danielnaber Is there a chance of a fix from your side? It would be really unfortunate if the XML output did not work by default...

@danielnaber
Member

The fix is easy, but so far there are no plans to have a 5.0.1 because of this bug, so the fix would be in the snapshots only.

@kijerk
Author

kijerk commented Jul 1, 2020

Great, good news and good to know!

@valentjn

valentjn commented Jul 2, 2020

This breaks all tools that use the Java interface but depend on a clean stdout (e.g., language servers whose stdout is directly connected to a client via RPC). An ugly workaround is to silence LanguageTool with https://stackoverflow.com/q/4799006.
