nvda can read emojies #6523

mrdin8877 · 2016-10-30T10:09:48Z

Hi, and sorry for asking this. I know that if you install a specific add-on, NVDA can read emoticons, but those are the text type, for example :) :( :@ etc. What I mean here are the symbols that represent emojis. Sorry if my information is not enough. Thanks.

feerrenrut · 2016-11-01T01:44:21Z

P3. This is becoming more important as more and more sites / apps are using emojis rather than emoticons. There is some work to implement this, and some thought will be required on how to deal with translations.

This may also be a good candidate for an add-on. Could the existing one be extended?

kaveinthran · 2016-11-01T09:49:31Z

It would also be nice if we had emoji, symbol and emoticon selectors.

On 11/1/16, Reef Turner notifications@github.com wrote:

P3. This is becoming more important as more and more sites / apps are using
emojis rather than emoticons. There is some work to implement this, and
some thought will be required on how to deal with translations.

This may also be a good candidate for an add-on. Could the existing one be
extended?


nvdaes · 2016-11-01T11:59:11Z

Hi, about the question on the Emoticons add-on, whose main author is Chris Leo: it's being extended to speak emojis too. Work is being done to also insert emojis as well as emoticons.
Anyway, the add-on works with speech dictionaries (braille is supported in symbols insertion only).
For the add-on see this branch:
https://github.com/nvdaaddons/emoticons/tree/emojis

LeonarddeR · 2018-02-02T06:23:33Z

The translation part will probably be a bit difficult. I wonder whether Apple, for example, does all the emoji translations internally. I don't think we can expect our translation teams to translate all the emojis.

PratikP1 · 2018-02-02T15:58:15Z

To what extent should the speaking of emoji characters be left to synthesizers? It might be worth testing OneCore, its language packs in different languages, and the Windows 10 emoji panel when the emoji panel functionality becomes available in other languages. People using synthesizers in languages other than English can also test by using this emoji catalog. For development purposes this GitHub iamcal/emoji-data repository looks useful.

jcsteh · 2018-03-28T02:05:43Z

I looked into this a bit. The Unicode CLDR (Common Locale Data Repository) includes TTS descriptions for emoji in a whole bunch of languages. You can find them in the common/annotations and common/annotationsDerived directories.

common directory in CLDR svn trunk, under which you can find annotations and annotationsDerived
Downloads for official CLDR releases

Some notes/thoughts:

The data is in XML. I imagine we'd process that data into our own symbol dictionary format for each language and allow NVDA to load this as an additional symbols dictionary.
The TTS descriptions are an annotation with type "tts". For example:

<annotation cp="😂" type="tts">face with tears of joy</annotation>
We'd need to include the data from both annotations and annotationsDerived. The derived annotations include emoji combined with skin tone modifiers, country flags (with info derived from the Unicode territory name data), etc.
Derived annotations include punctuation which we'd probably want to strip; e.g.

<annotation cp="👶🏻" type="tts">baby: light skin tone</annotation>
We could also derive annotations ourselves at runtime using code. While it would decrease dictionary size (since we wouldn't have to include multiple instances of each modified emoji), it'd be a fair bit of work and we'd have to include separate country data, etc.
For English, we need to include en_001 as well as en. en is for US English. Some other English locales derive directly from en_001 instead of en, but it seems a lot of stuff is still only in en, and NVDA doesn't have anything other than "en" anyway.
These dictionaries are going to be several hundred kb each. It's possibly worth it - emoji are used a lot these days - but we'll have to keep an eye on performance and memory usage at runtime.
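
The processing step described in these notes could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `tts_descriptions` and the inline sample document are hypothetical, the keyword line in the sample is made-up data, and the returned plain dict stands in for whatever symbol dictionary format NVDA would actually use.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like the CLDR annotation files quoted above.
# The keyword line (no type attribute) is illustrative, not copied from CLDR.
SAMPLE = """<ldml>
  <annotations>
    <annotation cp="😂">face | joy | laugh | tear</annotation>
    <annotation cp="😂" type="tts">face with tears of joy</annotation>
    <annotation cp="👶🏻" type="tts">baby: light skin tone</annotation>
  </annotations>
</ldml>"""


def tts_descriptions(xml_text):
    """Map each emoji to its TTS description, skipping keyword annotations."""
    root = ET.fromstring(xml_text)
    result = {}
    for node in root.iter("annotation"):
        if node.get("type") != "tts":
            continue  # only the type="tts" entries are spoken descriptions
        # Strip the punctuation that derived annotations introduce.
        text = (node.text or "").replace(":", "").replace(",", "")
        result[node.get("cp")] = " ".join(text.split())
    return result


descriptions = tts_descriptions(SAMPLE)
for emoji, description in descriptions.items():
    print(emoji, description)
```

Running the same function over both the annotations and annotationsDerived files for a language, and merging the results, would cover the skin tone and flag variants without deriving them at runtime.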

LeonarddeR · 2018-07-31T14:26:30Z

@jcsteh commented on 28 Mar 2018 04:05 CEST:

The data is in XML. I imagine we'd process that data into our own symbol dictionary format for each language and allow NVDA to load this as an additional symbols dictionary.

This makes sense.

We could also derive annotations ourselves at runtime using code. While it would decrease dictionary size (since we wouldn't have to include multiple instances of each modified emoji), it'd be a fair bit of work and we'd have to include separate country data, etc.

This sounds suitable for a version 2 of the implementation.

For English, we need to include en_001 as well as en. en is for US English. Some other English locales derive directly from en_001 instead of en, but it seems a lot of stuff is still only in en, and NVDA doesn't have anything other than "en" anyway.

It's not entirely clear to me what the differences are. Could you give an example of what you observed?

jcsteh · 2018-07-31T22:58:29Z

@leonardder commented on Aug 1, 2018, 12:26 AM GMT+10:

We could also derive annotations ourselves at runtime using code. While it would decrease dictionary size (since we wouldn't have to include multiple instances of each modified emoji), it'd be a fair bit of work and we'd have to include separate country data, etc.

This sounds suitable for a version 2 of the implementation.

Note that if the database doesn't cover some of the languages we need, our translators may want to translate them. At that point, we won't want to go for a "version 2" because we'll break all of their work.

For English, we need to include en_001 as well as en. en is for US English. Some other English locales derive directly from en_001 instead of en, but it seems a lot of stuff is still only in en, and NVDA doesn't have anything other than "en" anyway.

It's not entirely clear to me what the differences are. Could you give an example of what you observed?

As an example, the 😂 only appears in en (which is US). en_001 (which is the base for English) doesn't include it, nor does en_GB (which inherits directly from en_001). That means that en_GB doesn't include 😂 at all. See English Inheritance for details about inheritance for specific English locales.

LeonarddeR · 2018-08-01T05:18:41Z

@jcsteh commented on 1 Aug 2018 00:58 CEST:

Note that if the database doesn't cover some of the languages we need, our translators may want to translate them. At that point, we won't want to go for a "version 2" because we'll break all of their work.

Fair point. Furthermore, it seems that the derived annotations mainly just stick the main annotations together, separated with a colon which we'd like to ignore anyway. So if I'm correct, sticking to the base annotations gives us results that are close to what we want, unless:

The derived annotations include emoji combined with skin tone modifiers, country flags (with info derived from the Unicode territory name data), etc.

I've seen these skin tone modifiers as part of the main annotations, but maybe this is not the case for country flags?

LeonarddeR · 2018-08-01T18:09:03Z

Looks like this repository might be a good way to go, it is perfectly kept up to date and contains exactly what we want: https://github.com/fujiwarat/cldr-emoji-annotation

LeonarddeR · 2018-08-02T15:14:20Z

@jcsteh commented on 1 Aug 2018 00:58 CEST:

As an example, the 😂 only appears in en (which is US). en_001 (which is the base for English) doesn't include it, nor does en_GB (which inherits directly from en_001). That means that en_GB doesn't include 😂 at all. See English Inheritance for details about inheritance for specific English locales.

Based on this, it looks like en is currently the dictionary that is the most complete one, and as we don't specify an English dialect in NVDA, using en here might make sense.

LeonarddeR · 2018-08-02T15:32:36Z

I have a prototype implementation in the @LeonarddeR i6523 branch (note that I'm using my private fork).

It loads an additional emojis.dic file per language. I chose emojis as the plural for the sake of code clarity.
It includes https://github.com/fujiwarat/cldr-emoji-annotation as a submodule and generates dictionaries from both annotations and annotationsDerived when running scons source. See emojiDict_sconscript in the root of the repo.

jcsteh · 2018-08-02T21:21:50Z

I think you might need to include en_001 and en, as en_001 is supposed to be the base for all English locales according to that inheritance document. In practice, I'm not sure whether there's anything in en_001 that isn't in en.

LeonarddeR · 2018-08-03T11:50:57Z

@jcsteh commented on 2 Aug 2018, 23:21 CEST:

I think you might need to include en_001 and en, as en_001 is supposed to
be the base for all English locales according to that inheritance document.

You're right. I've updated the code to support multiple sources per locale, which is also required for pt_BR, which must include both the pt and pt_BR sources.
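
A minimal sketch of what "multiple sources per locale" might mean in practice is shown here. It is not the code from the prototype branch: the `LOCALE_SOURCES` table, the `merge_sources` function and the sample data are all hypothetical. The only idea taken from the discussion is that a base source (such as en_001 or pt) is loaded first and a more specific one (en or pt_BR) is layered on top.

```python
# Hypothetical mapping from an NVDA locale to the CLDR sources it needs,
# ordered from most general to most specific.
LOCALE_SOURCES = {
    "en": ["en_001", "en"],
    "pt_BR": ["pt", "pt_BR"],
}

# Made-up sample data; real entries would come from the CLDR annotation files.
SAMPLE_DATA = {
    "en_001": {"©": "copyright"},
    "en": {"😂": "face with tears of joy"},
}


def merge_sources(locale, data_by_source):
    """Layer the sources for a locale; later (more specific) sources win."""
    merged = {}
    for source in LOCALE_SOURCES.get(locale, [locale]):
        merged.update(data_by_source.get(source, {}))
    return merged


merged_en = merge_sources("en", SAMPLE_DATA)
print(merged_en)
```

With this ordering, an entry that exists only in the base source is kept, and an entry present in both takes the value from the more specific source.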

A different problem though, is whether we want the user to enable or disable processing of emojis. When we either disable or enable it in the gui, we can simply invalidate all data in the data map, so emojis either will or won't be loaded. However, config profile switches are more complex, as we don't want to invalidate all data for every config profile switch.

Also, how should we treat emojis that are added as part of the user dictionary? It is easy to exclude the built-in emoji dictionaries as they are in different files. The only way I can think of in the case of user-added emojis is adding an emojis category to the symbols files, besides symbols and complexSymbols. But then, data invalidation is still required, unless we use a separate regex for emojis within the symbol processor.

LeonarddeR · 2018-08-10T17:31:16Z

I noticed that the annotations also contain symbols like ®, © and ™ which, strictly speaking, are not emojis, I believe.

LeonarddeR · 2018-08-25T12:29:21Z

cc @feerrenrut @michaelDCurran @josephsl, Do you have any thoughts about this issue? The emoji dictionary creation code is there and works quite well, the question is how to integrate these annotations into NVDA. Is it really necessary to have the ability to enable/disable them?

nvdaes · 2018-08-25T13:41:29Z

I mentioned Chris, @Christianlm, the main author of the emoticons add-on, but not correctly.
In case he has something to say, I hope this helps.
Thanks

kvark128 · 2018-10-02T19:23:17Z

Sorry for the emotions, but this is a very, very bad innovation.
Today I opened the Punctuation / symbol dialog and found more than 2000 little-used emoji, many of which have a very bad translation into my language.
Why was it necessary to add all this to the base NVDA distribution? For users who need it there are appropriate add-ons.
And there is not even a way to disable this shit. The only option is to remove the file cldr.dic.

kvark128 · 2018-10-02T19:42:33Z

Found the checkbox "Include Unicode Consortium data (including emoji) when processing characters and symbols". Thanks so much for the ability to disable this garbage.

LeonarddeR · 2018-10-03T06:05:09Z

@kvark128 : what language are you using? Could you elaborate on what's so bad about the emoji translations for your language?

marlon-sousa · 2018-12-18T02:05:31Z

Hello,

I have found what I believe to be a bug.

How to reproduce?

1- Open up Notepad and paste the text below:

doesn't include 😂 at all. See

2- press home to go to the beginning of the line.

3- Press nvda + down arrow to read all the text.

4- You should hear "doesn't include face with tears of joy at all. See"

5- Up until now everything is OK.

6- Now press home again and press twice ctrl + right arrow. You will hear doesn't include.

7- Now, press the right arrow. You will hear space and if you press right arrow again ... you will hear symbol d 8 3 d

This demonstrates that when reading character by character the symbol is being somehow not processed or processed incorrectly.

If you keep pressing control + right arrow though the symbol is read correctly.

I had the same results using eSpeak NG in English or Portuguese, with NVDA itself also in either English or Portuguese. OneCore voices (the two Brazilian ones), with NVDA in either Portuguese or English, showed much the same behavior, but in that case the character was not read at all: there was silence when reading with the arrow keys. The symbol was read correctly by read all or when using ctrl + arrows.
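
The "d 8 3 d" heard in step 7 matches the first UTF-16 code unit of 😂 (U+1F602), which is encoded as a surrogate pair. That is consistent with character navigation stepping over one code unit instead of the whole character, although the thread does not confirm the root cause. The sketch below only shows the encoding; the helper names are hypothetical.

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of text as lowercase hex strings."""
    raw = text.encode("utf-16-be")
    return [raw[i:i + 2].hex() for i in range(0, len(raw), 2)]


def is_high_surrogate(unit_hex):
    """True if the code unit is the first half of a surrogate pair."""
    return 0xD800 <= int(unit_hex, 16) <= 0xDBFF


units = utf16_code_units("😂")
print(units)  # ['d83d', 'de02']
print(is_high_surrogate(units[0]))  # True
```

A dictionary lookup keyed on the full emoji cannot match a lone `d83d`, which would explain why only the raw code unit is announced.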

LeonarddeR · 2018-12-18T06:00:52Z

This should have been fixed in Alpha builds of NVDA.

marlon-sousa · 2018-12-18T07:20:15Z

I am using NVDA 2018.4. This behavior occurred in Firefox and Notepad as well. Thanks, Marlon

On 18 Dec 2018, at 04:00, Leonard de Ruijter <notifications@github.com> wrote:

…

This should have been fixed in Alpha builds of NVDA.

feerrenrut added the p4 https://github.com/nvaccess/nvda/blob/master/projectDocs/issues/triage.md#priority label Nov 1, 2016

jcsteh mentioned this issue Jul 16, 2017

Emoticons, Don't Speak, but instead punctuations/symbols #1608

Closed

dkager mentioned this issue Aug 10, 2017

Where can i find this complete emoji list #7481

Closed

jcsteh mentioned this issue Jul 31, 2018

When reading emojis, count replicated emojis rather than reading each individually #8499

Closed

LeonarddeR mentioned this issue Sep 1, 2018

NVDA should support unicode Emoticons in both Braille and speech #5135

Closed

LeonarddeR mentioned this issue Sep 18, 2018

Use Unicode CLDR to create speech symbol dictionaries with emojis #8758

Merged

michaelDCurran closed this as completed in #8758 Sep 25, 2018

nvaccessAuto added this to the 2018.4 milestone Sep 25, 2018

Nardol mentioned this issue Oct 5, 2018

Emojis support brailcom/speechd#49

Closed

DrSooom mentioned this issue Feb 18, 2019

Cannot read Braille unicode Characters. #6341

Closed

Pallas1303 mentioned this issue Dec 6, 2023

About the use of some file of Festcat project for creation not Catalan Voices. FestCat/festival-ca#7

Open
