handle non-ASCII chars when returning inference results. #241

Closed
safsoun opened this issue Dec 7, 2021 · 6 comments
Comments


safsoun commented Dec 7, 2021

Hello,

After testing with French, detection works fine, but the string returned for a slot is garbled when the word contains accented characters such as 'é' or 'è'.
For example, for the word "éteindre", the returned slot string is "éteindre", which is not user-friendly!
I don't know how you handle this. I propose replacing each accented letter with its basic Latin letter, for example:
'é', 'è' -> e
à -> a
etc ..

PS: I tried entering "eteindre" instead of "éteindre" in the Rhino console, but the dictionary rejected the former :-(
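
For illustration only, here is a minimal sketch of the accent folding proposed above. It is not part of Rhino or any Picovoice SDK; the names `fold_c3` and `fold_accents` are hypothetical, the input is assumed to be UTF-8, and only a handful of French letters (all encoded with lead byte 0xC3) are covered.

```c
#include <stddef.h>

/* Hypothetical helper: map the second byte of a 0xC3-led UTF-8 pair
 * to a plain ASCII letter. Returns '\0' when the pair is not handled. */
static char fold_c3(unsigned char second)
{
    switch (second) {
    case 0xA0: case 0xA2:                       /* à â */
        return 'a';
    case 0xA7:                                  /* ç */
        return 'c';
    case 0xA8: case 0xA9: case 0xAA: case 0xAB: /* è é ê ë */
        return 'e';
    case 0xAE: case 0xAF:                       /* î ï */
        return 'i';
    case 0xB4:                                  /* ô */
        return 'o';
    case 0xB9: case 0xBB: case 0xBC:            /* ù û ü */
        return 'u';
    default:
        return '\0';
    }
}

/* Copy src to dst, replacing the handled accented letters with their
 * base letter. Anything else is copied byte for byte.
 * dst must have room for strlen(src) + 1 bytes. */
void fold_accents(const char *src, char *dst)
{
    size_t i = 0, j = 0;
    while (src[i] != '\0') {
        char folded = '\0';
        if ((unsigned char)src[i] == 0xC3 && src[i + 1] != '\0') {
            folded = fold_c3((unsigned char)src[i + 1]);
        }
        if (folded != '\0') {
            dst[j++] = folded;   /* two input bytes become one letter */
            i += 2;
        } else {
            dst[j++] = src[i++];
        }
    }
    dst[j] = '\0';
}
```

Calling `fold_accents("\xC3\xA9" "teindre", buf)` leaves "eteindre" in `buf`. Folding loses information (for instance, "é" and "è" both become "e"), so it is a display workaround rather than a fix for the encoding itself.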

@safsoun safsoun added the bug Something isn't working label Dec 7, 2021
Contributor

ErisMik commented Dec 7, 2021

Thank you for the report. This is a text-rendering issue specific to our web-based SDKs, and it is currently in our queue to fix. Desktop and other SDKs do not have this issue and render the correct UTF-8 accented characters.

@ErisMik ErisMik self-assigned this Dec 7, 2021
@kenarsa kenarsa changed the title from "Rhino enhancement : returned texts with specific language symbol are not user-friendly" to "handle non-ASCII chars when returning inference results." Dec 7, 2021
Author

safsoun commented Dec 8, 2021

Thank you for your quick responses on all the threads I have opened so far 💯

Contributor

ErisMik commented Dec 15, 2021

@safsoun The returned slot strings in both the console and the web SDKs should now render the correct UTF-8 characters. For the web SDKs, please update to the latest versions of the *-worker and *-factory packages to get this fix. If you run into any more problems, feel free to open another issue.

@ErisMik ErisMik closed this as completed Dec 15, 2021
Author

safsoun commented Feb 7, 2022

Hello,

I am coming back to this issue. It appears it was not fixed; I am still getting garbled non-ASCII characters.
For example, for the word "éteindre", I get the text "éteindre" in the console (even if I re-import the .yml file) and "C)teindre" in the STM32F769-DISCO board output.
Tested with lib v2.1.

Member

mrrostam commented Feb 7, 2022

Hello,

I am a little confused. Are you saying the Picovoice Console shows the word "éteindre" correctly, whereas the board gives "C)teindre"? If so, I suspect the software you are using to read data from the serial port has trouble with non-ASCII characters.

What software did you use to monitor the serial output?
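
One way to see how "C)teindre" could arise, offered as an assumption rather than a confirmed diagnosis: in UTF-8, "é" is the two bytes 0xC3 0xA9, and clearing the top bit of each (as a serial link or terminal set to 7 data bits would) yields 0x43 0x29, that is, "C)". The sketch below simulates that; `strip_high_bit` is a hypothetical name, not an SDK function.

```c
#include <stddef.h>

/* Simulate a 7-bit serial path by clearing bit 7 of every byte.
 * dst must have room for strlen(src) + 1 bytes. */
void strip_high_bit(const char *src, char *dst)
{
    size_t i = 0;
    for (; src[i] != '\0'; i++) {
        dst[i] = (char)((unsigned char)src[i] & 0x7F);
    }
    dst[i] = '\0';
}
```

With the UTF-8 bytes of "éteindre" as input, written as `"\xC3\xA9" "teindre"`, the output buffer holds "C)teindre". If that matches what the monitor shows, the string on the board is likely intact and only the display path is dropping the eighth bit.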

Author

safsoun commented Feb 8, 2022

Hello,

Please consider this issue as resolved.
I ran a test in my firmware (on the STM32F769-DISCO) that compares the returned value with the string "éteindre", and the strings are equal.
-> strcmp("éteindre", inference->values[i]) returns 0.

Thank you.
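
A note on the check above: `strcmp("éteindre", ...)` only tests for UTF-8 if the source file is saved as UTF-8 and the compiler keeps those bytes in the literal. The sketch below spells the bytes out instead and adds a hex dump, so the value can be inspected even when the serial monitor mangles non-ASCII output. Both function names are hypothetical, and the value is assumed to be a NUL-terminated string such as `inference->values[i]`.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if value is exactly the UTF-8 encoding of "éteindre".
 * The literal is split so the hex escape cannot absorb later characters. */
int is_eteindre_utf8(const char *value)
{
    static const char expected[] = "\xC3\xA9" "teindre";
    return strcmp(value, expected) == 0;
}

/* Write the bytes of s into out as space-separated uppercase hex.
 * Stops early if out would overflow; out is always NUL-terminated
 * when out_size > 0. */
void hex_dump(const char *s, char *out, size_t out_size)
{
    size_t used = 0;
    if (out_size == 0) {
        return;
    }
    out[0] = '\0';
    /* Each byte needs at most 3 characters plus the terminator. */
    for (size_t i = 0; s[i] != '\0' && used + 4 <= out_size; i++) {
        used += (size_t)snprintf(out + used, out_size - used,
                                 i == 0 ? "%02X" : " %02X",
                                 (unsigned char)s[i]);
    }
}
```

For a correctly encoded value, the dump starts with `C3 A9 74`, which is pure ASCII and therefore survives a 7-bit terminal unchanged.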
