webkitSpeechRecognition for desktop apps? #1115

isimmons · 2013-09-14T18:18:00Z

I was attempting to add speech recognition to a desktop application, but speechRecognition does not seem to work, probably because the application uses local file URLs and there is no way to grant permission as there is when running in Chrome.

TalAter/annyang#44

As you can see in the linked issue, trying to run speechRecognition by itself in the console does nothing in node-webkit.

var sr = new webkitSpeechRecognition();
sr.continuous = true;
sr.interimResults = true;
sr.lang = 'en';
sr.onresult = function(e) {
    // Log the top alternative of the most recent result.
    console.log(e.results[e.results.length-1][0].transcript);
};
sr.start(); // nothing happens and no errors

So would it be correct to say this is not currently possible in node-webkit, or is there a known workaround? I'm aware of the getUserMedia API, but it seems to only capture audio without doing any speech recognition.

If it's not currently possible will it be possible in future releases?
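When start() appears to do nothing, attaching a handler for every lifecycle event before calling it can show which stage the session reaches. The following is a minimal sketch, assuming the standard Web Speech API event names; `attachDiagnostics` and its `log` parameter are hypothetical helpers for illustration, not part of node-webkit or the API itself.

```javascript
// Hypothetical helper: attaches a logging handler for each Web Speech API
// lifecycle event so a silent failure shows which stage was (not) reached.
// Note: this overwrites any handlers already set on the object.
function attachDiagnostics(recognition, log) {
  var events = ['start', 'audiostart', 'speechstart', 'speechend',
                'audioend', 'result', 'nomatch', 'error', 'end'];
  events.forEach(function (name) {
    recognition['on' + name] = function (e) {
      // Error events carry an `error` code string; other events do not.
      var detail = (e && e.error) ? ': ' + e.error : '';
      log(name + detail);
    };
  });
  return recognition;
}

// Usage inside the app page (runs only where the constructor exists):
if (typeof webkitSpeechRecognition !== 'undefined') {
  var sr = attachDiagnostics(new webkitSpeechRecognition(),
                             console.log.bind(console));
  sr.lang = 'en';
  sr.start();
}
```

If not even `start` or `error` is logged, the recognizer is likely not wired up in the runtime at all, rather than being blocked by a permission prompt.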

caolan · 2013-10-12T11:01:37Z

+1

tommoor · 2013-10-12T11:11:26Z

Seems that permission needs to be pre-authorised in the same way it was for getUserMedia calls and screensharing.

Would love to see support for this @rogerwang and feels like a possible quick win?

deanshub · 2013-10-17T20:12:12Z

+1

X4 · 2013-12-01T19:06:07Z

You could write a native-client app for chrome that is a wrapper for this. I guess this could work with phantomjs too.

TalAter · 2014-01-15T23:07:43Z

+1

manuelpaulo · 2014-01-16T05:41:40Z

+1

jshemas · 2014-02-07T21:46:07Z

+1

Akkuma · 2014-03-31T16:22:54Z

Seriously no comment on this after 7 months? +1

ThomasAy · 2014-04-01T15:16:23Z

@isimmons @caolan @tommoor @deanshub @X4 @TalAter @manuelpaulo @jshemas

Did any of you find a workaround?

X4 · 2014-04-01T16:51:27Z

@ThomasAy I'd have to debug it, as I've only skimmed the node-webkit source code, but it looks like you could patch https://github.com/rogerwang/node-webkit/blob/master/src/media/media_capture_devices_dispatcher.cc to show debug messages when you call sr.start().
However, I haven't seen any reference to webkitSpeechRecognition, so it seems this is not implemented yet. Correct me if I'm wrong, but here is what I think is missing in node-webkit to support this: http://git.chromium.org/gitweb/?p=chromium.git;a=tree;f=chrome/browser/speech

ghost · 2014-04-05T02:08:13Z

+1

alanjames1987 · 2014-05-07T07:07:08Z

+1

taskinegemen · 2014-07-22T10:30:06Z

What's the status of this?

PlasmaPower · 2014-09-07T22:16:56Z

+1

byourselves · 2014-09-11T16:57:06Z

+1

PlasmaPower · 2014-09-11T16:59:40Z

This can be done to some extent through Google's web API for speech recognition, but it would be nice to have it integrated into node-webkit (or, even nicer, as a Node.js module).

bram-dingelstad · 2014-10-03T17:53:29Z

+1

willemmulder · 2014-10-07T17:56:15Z

+1

dzautner · 2014-10-15T10:38:08Z

+1

netanelgilad · 2014-10-16T09:15:27Z

+1

ghost · 2014-10-16T23:09:37Z

+1

roccolucatallarita · 2014-10-29T05:25:27Z

+1

ghost · 2014-11-05T20:15:12Z

+1

miller9904 · 2014-11-07T17:43:48Z

+1 I would LOVE this feature!

Lezeper · 2014-11-25T18:29:54Z

+1

anishtr4 · 2014-12-04T17:43:29Z

Is this feature still not working? I want it badly.

miller9904 · 2014-12-04T23:05:55Z

Me too!

askbeka · 2014-12-05T09:39:58Z

+1

RobinMalfait · 2014-12-21T21:25:27Z

Still no fix?!

Edit from 2020: I'm sorry that I responded like this!

tommoor · 2014-12-22T01:41:35Z

@RobinMalfait this isn't a bug, it's a feature request 😉 Chrome uses Google's voice recognition service, there is no equivalent for node webkit.

TemaSM · 2015-10-10T02:09:04Z

+1

zackm0571 · 2015-11-04T18:31:01Z

+1

tom-s · 2015-12-01T16:32:35Z

+1

Aaronik · 2015-12-24T16:58:11Z

+1

ghostoy · 2015-12-28T00:41:23Z

It should be supported by NW13.

jhm-ciberman · 2016-01-11T03:15:25Z

Is this supported already? I'm using NW13 beta 2.
This code throws a SpeechRecognitionError (error="network", message=""):

var recognition = new webkitSpeechRecognition();
recognition.onresult = function(e) {console.log(e);}
recognition.onerror =  function(e) {console.log(e);}
// Fails with and without this line: 
//recognition.serviceURI = 'wami.csail.mit.edu';
recognition.start();
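Logging the raw event makes the `error` code easy to miss. As a small sketch, the helper below maps the error codes listed in the Web Speech API draft to short explanations; `describeSpeechError` and `SPEECH_ERROR_HINTS` are hypothetical names, and the explanation strings are paraphrases rather than quotations from the spec.

```javascript
// Error codes listed in the Web Speech API draft, with short explanations.
// The wording of each hint is this sketch's own, not quoted from the spec.
var SPEECH_ERROR_HINTS = {
  'no-speech': 'No speech was detected.',
  'aborted': 'Speech input was aborted.',
  'audio-capture': 'Audio capture failed (for example, no microphone).',
  'network': 'Network communication needed to complete recognition failed.',
  'not-allowed': 'The user agent disallowed speech input (permission or policy).',
  'service-not-allowed': 'The user agent disallowed the requested speech service.',
  'bad-grammar': 'There was an error in the speech recognition grammar.',
  'language-not-supported': 'The language is not supported.'
};

// Hypothetical helper: turns an error event into a readable one-line message.
function describeSpeechError(event) {
  var code = event && event.error;
  var known = Object.prototype.hasOwnProperty.call(SPEECH_ERROR_HINTS, code);
  var hint = known ? SPEECH_ERROR_HINTS[code] : 'Unrecognized error code.';
  var message = (event && event.message) ? ' (' + event.message + ')' : '';
  return code + ': ' + hint + message;
}
```

It could be used as `recognition.onerror = function(e) { console.log(describeSpeechError(e)); }`. A "network" code points at the recognition backend being unreachable or rejecting the request, not at the microphone.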

willemmulder · 2016-01-11T07:57:44Z

I assume wami.csail.mit.edu returns an error or requires a key of sorts? What do you see in the network requests?

ghost · 2016-01-11T14:37:04Z

Does this feature work without embedding a black box binary blob from Google that accesses the microphone without user consent (and without any web app making explicit use of it)?

Please note while I might sound paranoid, this is apparently what chromium used to do to support this: https://www.privateinternetaccess.com/blog/2015/06/google-chrome-listening-in-to-your-room-shows-the-importance-of-privacy-defense-in-depth/

If this is what you plan to add to nwjs as well, maybe make the binary blob download only triggered at runtime by the web app so no app using nwjs is FORCED to run this potential spyware code, or if this needs to be embedded into nwjs from now on then please provide a separate download that doesn't contain this feature if it relies on a google blob handling the microphones.

TL;DR: make sure apps that don't want to use speech recognition don't need to get shipped with nw.js versions that potentially spy on users with the microphone without asking. Again, see the URL above.

jhm-ciberman · 2016-01-11T15:34:12Z

If I use the code above in NW, it fails with a SpeechRecognitionError. If I try (with or without the wami.csail.mit.edu) in regular Google Chrome, it works fine and detects the speech correctly.
In neither case are network requests generated or shown in the devtools panel.

Nelderson · 2016-02-29T15:34:13Z

I want this soooooo bad.

frenchbread · 2016-04-20T22:18:37Z

+1

jhermsmeier · 2016-04-22T09:24:26Z

@jhm-ciberman @willemmulder I just recently ran into the same network error with the 0.14.0-sdk, might get around to looking into what's going on there in the next days.

willemmulder · 2016-04-22T09:50:36Z

Thanks @jhermsmeier ! Really looking forward to your findings.

jhermsmeier · 2016-04-23T13:31:33Z

I've just done some testing with Chrome, Chromium and nw.js - attempting to use a local speech recognition backend running on localhost via serviceURI. First finding: The SpeechRecognition API stopped giving a flying fuck about the serviceURI property. It's completely ignored from what I can tell... i.e. with Chromium no connection to localhost could be observed, yet the speech recognition worked perfectly fine. Might find the time to dig into the source and file a bug, if there isn't one already.

willemmulder · 2016-04-23T17:25:30Z

Hmm, I'm pretty sure that using a different URL (I used wami.csail.mit.edu) worked before in Chrome. However, if I read this thread https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/82LcTDrhshw correctly, there's some talk about not supporting a full URL, whatever that means. In other words: I'm just as surprised as you are.

jhermsmeier · 2016-04-23T17:42:25Z

Looks like it's been removed / reverted 4 months ago: https://bugs.chromium.org/p/chromium/issues/detail?id=480516#c6

Can't see a valid reason for why they did though, as from what I can see, it's still in the draft/spec (although on the other hand; that's in flux, and I have difficulties finding a recent document).

There are some mentions of going through chrome://speech-recognition to use custom recognizers – which I haven't quite fully understood, yet. Will really need to dig into the issue tracker and source to figure out what's happening with that I guess.

bromagosa · 2017-02-20T11:41:09Z

Any updates on this?

chrome.tts doesn't seem to work at all in nwjs.io, neither does window.speechSynthesis. Has anyone succeeded?

I thought it may be an issue of adding the right permissions to ttsEngine in the manifest, but that doesn't seem to do the trick either.

rogerwang · 2017-02-22T02:45:23Z

@bromagosa I just tested, and chrome.tts works for me. This issue is about speech recognition, not text-to-speech. Please file another issue if there is a bug on your side.

rogerwang · 2017-02-22T02:47:54Z

To all: I just ran a test, and speech recognition works with a valid Google API key. So I'm closing this issue.

I'm testing with https://www.google.com/intl/en/chrome/demos/speech.html.

Check here to see how to get a key: https://www.chromium.org/developers/how-tos/api-keys
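For reference, the Chromium API keys page describes supplying keys at runtime through the environment variables GOOGLE_API_KEY, GOOGLE_DEFAULT_CLIENT_ID and GOOGLE_DEFAULT_CLIENT_SECRET. The sketch below builds such an environment for a launcher script. `buildLaunchEnv` and the shape of its `keys` argument are hypothetical, and whether a particular NW.js build reads these variables is an assumption worth verifying.

```javascript
// Hypothetical launcher helper. The three variable names come from Chromium's
// API keys page; whether a given NW.js build honors them is an assumption.
function buildLaunchEnv(baseEnv, keys) {
  // Copy first so the caller's environment object is not mutated.
  var env = Object.assign({}, baseEnv);
  env.GOOGLE_API_KEY = keys.apiKey;
  env.GOOGLE_DEFAULT_CLIENT_ID = keys.clientId;
  env.GOOGLE_DEFAULT_CLIENT_SECRET = keys.clientSecret;
  return env;
}
```

From a Node launcher this could be passed along as `require('child_process').spawn('nw', ['.'], { env: buildLaunchEnv(process.env, keys), stdio: 'inherit' })`, where `nw` being on the PATH is also an assumption.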

bromagosa · 2017-02-22T08:53:05Z

@rogerwang sorry, you're right. I was hunting for solutions for both because speech recognition wasn't working either, but I'll take a look at your links. Thanks!

p.s. I found out chrome.tts is supported in Windows, Mac and Chromebooks, but (seemingly arbitrarily) not in GNU/Linux, so that's why I can't get it to work. I may need to work around that by adding a Node binding to festival :(

mscreenie · 2017-03-19T09:39:17Z

The Speech API looks great but the current limits imposed by Google make it unusable for developers.

There needs to be a really good offline ASR so that you never have to worry about Google or being online, but that probably won't happen anytime soon.

Sugarcaen · 2017-03-22T21:23:03Z

It looks like you could use the Google Cloud Platform for speech rec, or spawn a chrome process and try piping everything through there.

mscreenie · 2017-04-13T23:10:02Z

I think @jhermsmeier's comment is interesting. serviceURI was removed in Chromium 49 and I can't find any information as to why. I know the API and spec are in draft, shit happens and things change. But a serviceURI param would have been useful.

This really sucks because of vendor lock-in, and implementing your own self-hosted speech-to-text is now really difficult. I don't know if this is Google wanting all the info to flow into their veins, but it is frustrating to say the least, and the removal of the param is a step in the wrong direction in the spirit of openness.

I am probably in the minority here, so there is little chance anything will change now. It would be nice to be able to voice this to the Chromium team at least.

jhermsmeier · 2017-07-03T19:06:53Z

According to the latest draft of the Web Speech API, the serviceURI hasn't been removed from the specification.

So I went digging a bit, which did turn up some things I haven't discovered before, but most discussion around this seems to have happened out of band, or somewhere inaccessible to the public.

There was Chromium Issue N° 480516 filed ~2 years ago, in which support was added, and subsequently reverted again a few months later. Discussion related to this happened on blink-dev/82LcTDrhshw/pGKPgrXOUaAJ, yet it is still unclear to me what exactly happened there.

It might be worth filing a new Chromium Issue to get an update on the situation.

mscreenie · 2017-08-01T22:07:07Z

Thanks for digging. I've done some myself and have come to the same conclusion. Everything is still a little unclear, and it seems odd for serviceURI to be removed after being added; I can't establish why. It's also a little frustrating.

I am hopeful it may return, for now I'm exploring other things.

Firefox has VoiceFill with similar functionality, I haven't dug into how it works completely but this may be interesting to you:

https://github.com/mozilla/speaktome
https://github.com/mozilla/speech-proxy

VoiceFill is written as a WebExtension, unless I am mistaken. If Chromium had WebExtension support, there could be hope of porting it as a viable replacement without much effort, given that is the purpose of WebExtension.

https://developer.chrome.com/extensions

A node plugin but may be short lived.
Maybe posting in the Chromium issue tracker and linking here would be a good start into investigating.

vsharmaMitel · 2020-06-05T13:48:01Z

Hello, may I know why this issue was closed? I really want to implement webkitSpeechRecognition for my node-webkit app. Can someone please tell me if there is any solution or workaround?

thestonechat · 2021-02-27T15:52:07Z

+1

thestonechat · 2021-02-27T15:52:50Z

I'm searching for an Electron alternative for my app just for speech recognition, and I hope this works!

mscreenie · 2021-03-19T07:27:49Z

Chromium has recently released SODA (works offline). This may be a better solution than this.

rogerwang closed this as completed Feb 22, 2017
