Add profanity/offensive words filter attribute #72

Open
Ninajoy opened this issue Nov 16, 2019 · 3 comments


Ninajoy commented Nov 16, 2019

I am not sure I am on the right track as to why curse words appear differently in the transcript of SpeechRecognitionResult in different browsers, so I thought it best to open an issue here.

Question
If browsers implement the transcript of SpeechRecognitionResult in such a way that the output differs, could a profanity filter attribute be useful, so that the developer using the API has a choice in the matter? For example, an offensiveWordFilter attribute of type boolean?

Background Story
While experimenting with the SpeechRecognition interface in the phrase-matcher from https://github.com/mdn/web-speech-api/, the following occurred:

  1. When using Chrome and saying a curse word like shit, the transcript in SpeechRecognitionResult is censored as s****.
  2. When using Firefox Nightly and saying a curse word like shit, the transcript in SpeechRecognitionResult is not censored.
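One way to check for this difference in a page is to test each transcript for the masking pattern. The following is only a sketch: `looksMasked` and `MASKED_WORD` are hypothetical names, and the pattern assumes masking always takes the form of a single leading letter followed by asterisks, as in the Chrome output described above.

```typescript
// Assumption: a masked word is one leading letter followed by one or more
// asterisks (e.g. "s***"), standing alone as a word in the transcript.
const MASKED_WORD = /(^|\s)[A-Za-z]\*+(?=$|\s|[.,!?])/;

// Returns true when the transcript contains at least one word that looks
// like it was masked by the recognition service.
function looksMasked(transcript: string): boolean {
  return MASKED_WORD.test(transcript);
}
```

In a browser, the transcript string would typically come from `event.results[i][0].transcript` inside the recognizer's `result` event handler.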

Neither Chrome nor Nightly applies this type of censoring to the speechSynthesis interface, as used in speak-easy-synthesis.

In my search into why this happens, I found the following:
In https://github.com/chromium/chromium/blob/master/content/browser/speech/speech_recognition_engine.cc, filter_profanities is set to false on line 277, so on line 579 it should result in pFilter=0. According to https://stackoverflow.com/questions/15030339/remove-profanity-censor-from-google-speech-recognition/15071054, the setting pfilter=0 removes the profanity filter. Since Chrome still censors the transcript, this could lead to the conclusion that Chrome changes this value somewhere else. I do not feel confident in this conclusion, however.
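To make the mapping described above concrete, here is a minimal sketch of how a boolean flag could become the query parameter. This is an illustration, not Chromium's code: `profanityFilterParam` is a hypothetical helper, and the value sent when the filter is enabled (`enabledValue`, defaulting to "2") is an assumption. Only pFilter=0 is confirmed by the sources linked above.

```typescript
// Illustration only, not taken from Chromium.
// `enabledValue` is an assumed placeholder for the "filter on" case;
// the linked sources only establish that 0 disables the filter.
function profanityFilterParam(
  filterProfanities: boolean,
  enabledValue: string = "2"
): string {
  return `pFilter=${filterProfanities ? enabledValue : "0"}`;
}
```

With the default of false mentioned above, `profanityFilterParam(false)` yields `pFilter=0`, which is why the censored output in Chrome is surprising.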

In Nightly, I have found no reference to a profanity filter in the code: https://dxr.mozilla.org/mozilla-central/source/dom/media/webspeech/recognition

marcoscaceres (Collaborator) commented

That seems like a bug in Chrome (or Google's speech service). The recognition engine should be profanity agnostic. The consuming application should then do its own filtering.
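As a rough sketch of what filtering in the consuming application could look like, the function below masks words from an application-supplied blocklist after recognition. `maskWords` and the blocklist are hypothetical; nothing here is part of the Web Speech API, and a real application would likely need a more careful word-matching strategy.

```typescript
// Hypothetical application-level filter: masks every word found in
// `blocklist` (case-insensitive), keeping the first letter and replacing
// the rest with asterisks. All other text is returned unchanged.
function maskWords(transcript: string, blocklist: readonly string[]): string {
  const blocked = new Set(blocklist.map((w) => w.toLowerCase()));
  return transcript.replace(/[A-Za-z']+/g, (word) =>
    blocked.has(word.toLowerCase())
      ? word[0] + "*".repeat(word.length - 1)
      : word
  );
}
```

Keeping this step in the application means the same unfiltered transcript can be shown as-is, masked, or rejected, depending on the use case.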

Ninajoy (Author) commented Dec 13, 2019

Thank you for your answer.

I can see that a bug was registered for this in Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=804812&q=speech%20censored&colspec=ID%20Pri%20M%20Stars%20ReleaseBlock%20Component%20Status%20Owner%20Summary%20OS%20Modified

The HTML Speech Incubator document at https://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/ includes the following in chapter 7.1.2.3, Builtin Default Grammars: "It is recommended that speech services support a filter parameter that can be set to the value noOffensiveWords to represent a desire to not recognize offensive words."

To prevent further misunderstandings about this subject, would it therefore be useful to change my request so that the speech-api documentation states that the engine should be profanity agnostic?

evanbliu commented

FYI, Chrome just updated its implementation of the Web Speech API to remove profanity masking. This change will take effect in release M127.
