Soft hyphens in text break grammar API #23

Closed
snomos opened this issue Sep 18, 2020 · 11 comments
Labels
bug Something isn't working

Comments

@snomos
Member

snomos commented Sep 18, 2020

I am not able to get GramDivvun to work in MS Word (the local app) when using the UiT account. It can be installed, it loads, and looks the way it should in the initial screen. But when clicking "Check", it almost immediately dies with the following errors in the console:

[Error] Failed to load resource: the server responded with a status of 500 () (se, line 0)
[Error] Failed to get grammar check API response
TypeError: undefined is not an object (evaluating 'e.filter')
p  index.ts:165
(anonym funksjon)  api.ts:58
(anonym funksjon)  app.0990553d7b12bc912a5b.js:21:37456
s  app.0990553d7b12bc912a5b.js:21:36328
u  bluebird.js:5370
(anonym funksjon)  bluebird.js:3366
(anonym funksjon)  bluebird.js:3423
(anonym funksjon)  bluebird.js:3468
(anonym funksjon)  bluebird.js:3548
l  bluebird.js:145
c  bluebird.js:138
(anonym funksjon)  bluebird.js:154
(anonym funksjon)  bluebird.js:67
(anonym funksjon)  bluebird.js:4620
	(anonym funksjon) (app.0990553d7b12bc912a5b.js:21:38425)
	(anonym funksjon) (app.0990553d7b12bc912a5b.js:21:37456)
	s (app.0990553d7b12bc912a5b.js:21:36328)
	u (app.0990553d7b12bc912a5b.js:1:200125)
	(anonym funksjon) (app.0990553d7b12bc912a5b.js:1:173146)
	(anonym funksjon) (app.0990553d7b12bc912a5b.js:1:173967)
	(anonym funksjon) (app.0990553d7b12bc912a5b.js:1:174655)
	(anonym funksjon) (app.0990553d7b12bc912a5b.js:1:176008)
	l (app.0990553d7b12bc912a5b.js:1:127337)
	c (app.0990553d7b12bc912a5b.js:1:127262)
	(anonym funksjon) (app.0990553d7b12bc912a5b.js:1:128376)
	(anonym funksjon) (app.0990553d7b12bc912a5b.js:1:127207)
	(anonym funksjon) (app.0990553d7b12bc912a5b.js:1:189711)
[Error] Unhandled Promise Rejection: TypeError: undefined is not an object (evaluating 'm.message')
	(anonym funksjon) (word-mac-16.00.js:26:310515)
	promiseReactionJob

Screenshot of the same:

[Image: screenshot dated 18 09 2020, 10:27]

I have tested various setups, and most work, but not this one. The ones I have tested are:

  • windows, app (365), UiT account - works
  • windows, browser (Edge), UiT account - works
  • windows, browser (Edge), private account - works
  • mac, browser (Chrome), private account - works
  • mac, browser (Chrome), UiT account - works
  • mac, browser (Safari), private account - DOES work
  • mac, browser (Safari), UiT account - does NOT work
  • mac, app (2016), private account - DOES work
  • mac, app (2016), UiT account - does NOT work

The Safari+UiT problem manifests differently, so it seems to be a separate issue, and it is easily worked around by using another browser. This bug report therefore targets only the locally installed Office 2016 app.

@snomos
Member Author

snomos commented Sep 18, 2020

Turns out the problem is soft hyphens in the text sent from Word. So the categorisation above is probably invalid, and just a happy coincidence of the test data used.

To trigger the error, use the following text - it should contain two soft hyphens:

 Áigot nannet sámiid konsulta­šuvdna­rievtti
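Because soft hyphens (U+00AD) are invisible in most renderings, it can help to check a test string programmatically before sending it to the checker. The following is only a minimal sketch: `count_soft_hyphens` and `strip_soft_hyphens` are hypothetical helpers made up for illustration, not functions from GramDivvun or libdivvun, and nothing here is meant to describe how the add-in actually handles this.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN, normally invisible


def count_soft_hyphens(text: str) -> int:
    """Return how many soft hyphens the text contains."""
    return text.count(SOFT_HYPHEN)


def strip_soft_hyphens(text: str) -> str:
    """Return the text with all soft hyphens removed (hypothetical helper)."""
    return text.replace(SOFT_HYPHEN, "")


# The trigger text from this report, with the soft hyphens written explicitly.
sample = " Áigot nannet sámiid konsulta\u00adšuvdna\u00adrievtti"

print(count_soft_hyphens(sample))
print(strip_soft_hyphens(sample))
```

Note that stripping characters shifts character offsets, so any error positions a checker returns for the stripped text would have to be mapped back to the original.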

@bbqsrc bbqsrc changed the title from "GramDivvun in Word/Mac does not work - JS errors" to "Soft hyphens in text break grammar API" Sep 27, 2020
@snomos
Member Author

snomos commented Dec 18, 2020

Seems the error is in libdivvun - @unhammer could you have a look?

@bbqsrc
Member

bbqsrc commented Dec 18, 2020

The issue this time was that the character \x1f was found.

@unhammer
Member

> The issue this time was that the character \x1f was found.

INFORMATION SEPARATOR ONE?

@bbqsrc
Member

bbqsrc commented Dec 18, 2020

Everyone's favourite codepoint! The input was dutkama ja luonddu\x1fdiehtaga,

@unhammer
Member

What is libdivvun doing wrong? I get

$ echo ' Áigot nannet sámiid konsulta­šuvdna­rievtti' | src/divvun-checker -l se 
{"errs":[["konsulta",21,29,"typo","Ii leat sátnelisttus",["konsula"],"Čállinmeattáhus"],["šuvdna",30,36,"typo","Ii leat sátnelisttus",["šuvona","govdna"],"Čállinmeattáhus"]],"text":" Áigot nannet sámiid konsulta­šuvdna­rievtti"}

$ echo ' Áigot nannet sámiid konsulta­šuvdna­rievtti' | src/divvun-checker -l se |hl-nonprinting 
{"errs":[["konsulta",21,29,"typo","Ii leat sátnelisttus",["konsula"],"Čállinmeattáhus"],["šuvdna",30,36,"typo","Ii leat sátnelisttus",["šuvona","govdna"],"Čállinmeattáhus"]],"text":" Áigot nannet sámiid konsulta-šuvdna-rievtti"}⁋

from the command line with the newest giella-sme-speller (hl-nonprinting is just a script that uses sed to turn \xad into a dash and EOL into ⁋).
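For anyone who wants to reproduce this without that script, here is a rough Python stand-in. It is a sketch, not the actual hl-nonprinting: the function name `highlight_nonprinting` is made up, and rendering other C0 control characters in caret notation (for example U+001F as `^_`) is an added assumption on top of the two substitutions described above.

```python
def highlight_nonprinting(text: str) -> str:
    """Make soft hyphens, line ends and C0 control characters visible."""
    out = []
    for ch in text:
        if ch == "\u00ad":
            out.append("-")          # soft hyphen shown as a plain dash
        elif ch == "\n":
            out.append("⁋\n")        # mark end of line, keep the newline
        elif ord(ch) < 0x20 and ch != "\t":
            # Assumption: caret notation for other control characters.
            out.append("^" + chr(ord(ch) + 0x40))
        else:
            out.append(ch)
    return "".join(out)


print(highlight_nonprinting("konsulta\u00adšuvdna\u00adrievtti\n"), end="")
```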

@bbqsrc
Member

bbqsrc commented Dec 18, 2020

@snomos has conflated two issues. The \x1f issue seems to be coming from libdivvun, whereas the soft hyphen issue is our problem.

@unhammer
Member

unhammer commented Dec 18, 2020

$ printf ' dutkama ja luonddu\x1fdiehtaga' | src/divvun-checker -l se
{"errs":[],"text":" dutkama ja luonddudiehtaga"}

$ printf ' dutkama ja luonddu\x1fdiehtaga' | src/divvun-checker -l se |hl-nonprinting
{"errs":[],"text":" dutkama ja luonddu^_diehtaga"}⁋

– should we be removing it from input, or are we somehow introducing \x1f's, or am I not reproducing the issue correctly here? (Some "expected this, but got that" examples would be nice ;))

@bbqsrc
Member

bbqsrc commented Dec 18, 2020

Sorry I am trying to go on vacation, haha.

The error was "control character (\\u0000-\\u001F) found while parsing a string". It looked like it was coming from libdivvun, but perhaps it isn't. January's problem now.
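That kind of error is what a strict JSON parser reports when a string value contains a raw control character instead of an escape. The sketch below reproduces the effect with Python's standard `json` module, used purely as a stand-in; it is an assumption that the parser behind the API behaves the same way, and `parses` is a hypothetical helper.

```python
import json

# A JSON document whose string value contains a raw U+001F byte.
raw = '{"text": " dutkama ja luonddu\x1fdiehtaga"}'

# The same document with the character written as a JSON escape sequence.
escaped = '{"text": " dutkama ja luonddu\\u001fdiehtaga"}'


def parses(doc: str) -> bool:
    """Return True if the document is accepted by a strict JSON parser."""
    try:
        json.loads(doc)
        return True
    except json.JSONDecodeError:
        return False


print(parses(raw))
print(parses(escaped))
```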

unhammer added a commit to divvun/libdivvun that referenced this issue Dec 18, 2020
Might help with
divvun/divvun-gramcheck-web#23

https://tools.ietf.org/html/rfc7159 says "escape U+0000 through
U+001F", we were escaping up to but not including 1F, simple fix :)
@unhammer
Member

There was a bug in libdivvun: that character should've been escaped according to the JSON spec. Fixed now, which will hopefully help with this issue.
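To illustrate the off-by-one, here is a small Python sketch of JSON string escaping. It is not the libdivvun code (which is not shown in this thread); `escape_json_string` and its `last_escaped` parameter are invented for the example. Passing `last_escaped=0x1E` mimics "escaping up to but not including 1F", while the default escapes the full U+0000 through U+001F range.

```python
import json

# Characters that have short escape forms in JSON.
SHORT_ESCAPES = {
    '"': '\\"', "\\": "\\\\", "\n": "\\n", "\r": "\\r",
    "\t": "\\t", "\b": "\\b", "\f": "\\f",
}


def escape_json_string(text: str, last_escaped: int = 0x1F) -> str:
    """Quote text as a JSON string, escaping code points up to last_escaped."""
    out = []
    for ch in text:
        if ch in SHORT_ESCAPES:
            out.append(SHORT_ESCAPES[ch])
        elif ord(ch) <= last_escaped:
            out.append(f"\\u{ord(ch):04x}")  # e.g. U+001F becomes \u001f
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'


text = " dutkama ja luonddu\x1fdiehtaga"
fixed = escape_json_string(text)                      # escapes U+001F
buggy = escape_json_string(text, last_escaped=0x1E)   # leaves U+001F raw

print(fixed)
print(json.loads(fixed) == text)
```

With the buggy bound, the output still contains the raw control character, which a strict parser on the receiving side would reject.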

@snomos snomos added the bug Something isn't working label Jan 26, 2022
@snomos
Member Author

snomos commented Jan 26, 2022

> Everyone's favourite codepoint! The input was dutkama ja luonddu\x1fdiehtaga,

This seems to be fixed; at least it is not causing any trouble in either Word or GDocs. Even

konsulta­šuvdna­rievtti

(containing two soft hyphens) seems to be fixed. Closing.

@snomos snomos closed this as completed Jan 26, 2022

3 participants