webkitSpeechRecognition for desktop apps? #1115

Closed
isimmons opened this issue Sep 14, 2013 · 86 comments

@isimmons

I was attempting to add speech recognition to a desktop application, but speechRecognition does not seem to work, probably because the application uses local file URLs and there is no way to grant permission the way you can when running it in Chrome.

TalAter/annyang#44

As you can see in the linked related issue, trying to run speechRecognition by itself in the console does nothing in node-webkit.

var sr = new webkitSpeechRecognition();
sr.continuous = true;
sr.interimResults = true;
sr.lang = 'en';
sr.onresult = function(e) {
    // Log the transcript of the most recent result.
    console.log(e.results[e.results.length - 1][0].transcript);
};
sr.start(); // nothing happens and no errors

So would it be correct to say this is not currently possible in node-webkit, or is there a known workaround? I'm aware of the getUserMedia API, but it seems that it only captures audio and doesn't do any speech recognition.

If it's not currently possible, will it be possible in future releases?
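
A silent start() is easier to diagnose when every lifecycle callback is wired up, not only onresult. Below is a minimal sketch of that idea, assuming the standard Web Speech API handlers (onstart, onerror, onend). The helper name createLoggedRecognizer and its log parameter are made up for illustration and are not part of node-webkit or the original report.

```javascript
// Hypothetical helper: wraps a SpeechRecognition constructor and logs every
// lifecycle event, so a silent failure shows up as a missing "start" entry.
function createLoggedRecognizer(Recognition, log) {
    var sr = new Recognition();
    sr.continuous = true;
    sr.interimResults = true;
    sr.lang = 'en';
    sr.onstart = function () { log('start'); };
    sr.onerror = function (e) { log('error: ' + e.error); };
    sr.onend = function () { log('end'); };
    sr.onresult = function (e) {
        log('result: ' + e.results[e.results.length - 1][0].transcript);
    };
    return sr;
}

// Only runs where the constructor exists (Chrome, or a build that ships it).
if (typeof webkitSpeechRecognition !== 'undefined') {
    createLoggedRecognizer(webkitSpeechRecognition, console.log.bind(console)).start();
}
```

If nothing at all is logged after start(), the recognizer never began a session, which would point at the runtime rather than at the page code.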


@caolan

caolan commented Oct 12, 2013

+1

@tommoor

tommoor commented Oct 12, 2013

Seems that permission needs to be pre-authorised in the same way it was for getUserMedia calls and screensharing.

Would love to see support for this, @rogerwang. It feels like a possible quick win?

@deanshub

+1

@X4

X4 commented Dec 1, 2013

You could write a native-client app for chrome that is a wrapper for this. I guess this could work with phantomjs too.

@TalAter

TalAter commented Jan 15, 2014

+1

@manuelpaulo

+1

@jshemas

jshemas commented Feb 7, 2014

+1

@Akkuma

Akkuma commented Mar 31, 2014

Seriously no comment on this after 7 months? +1

@ThomasAy

ThomasAy commented Apr 1, 2014

@isimmons @caolan @tommoor @deanshub @X4 @TalAter @manuelpaulo @jshemas

Has any of you found a workaround?

@X4

X4 commented Apr 1, 2014

@ThomasAy I'd have to debug it, as I've only skimmed the node-webkit source code, but to me it looks like you could patch https://github.com/rogerwang/node-webkit/blob/master/src/media/media_capture_devices_dispatcher.cc to show debug messages when you call sr.start();
However, I haven't seen a single reference to webkitSpeechRecognition, so it seems this is not implemented yet. Correct me if I'm wrong, but here is what I think is missing in node-webkit to support this: http://git.chromium.org/gitweb/?p=chromium.git;a=tree;f=chrome/browser/speech

@ghost

ghost commented Apr 5, 2014

+1

@alanjames1987

+1

@taskinegemen

What's the status of this?

@PlasmaPower

+1

@byourselves

+1

@PlasmaPower

This can be done to some extent through Google's web API for speech recognition, but it would be nice to have it integrated into node-webkit (or even nicer, as a Node.js module).

@bram-dingelstad

+1

@willemmulder

+1

@dzautner

+1

@netanelgilad

+1

@ghost

ghost commented Oct 16, 2014

+1

@roccolucatallarita

+1

@ghost

ghost commented Nov 5, 2014

+1

@miller9904

+1 I would LOVE this feature!

@Lezeper

Lezeper commented Nov 25, 2014

+1

@anishtr4

anishtr4 commented Dec 4, 2014

Is this feature still not working? I want it badly.

@miller9904

Me too!

@askbeka

askbeka commented Dec 5, 2014

+1

@RobinMalfait

RobinMalfait commented Dec 21, 2014

Still no fix?!

Edit from 2020: I'm sorry that I responded like this!

@tommoor

tommoor commented Dec 22, 2014

@RobinMalfait this isn't a bug, it's a feature request 😉 Chrome uses Google's voice recognition service; there is no equivalent for node-webkit.

@TemaSM

TemaSM commented Oct 10, 2015

+1

@zackm0571

+1

@tom-s

tom-s commented Dec 1, 2015

+1

@Aaronik

Aaronik commented Dec 24, 2015

+1

@ghostoy
Member

ghostoy commented Dec 28, 2015

It should be supported by NW13.

On Fri, Dec 25, 2015 at 00:58, Aaron Sullivan notifications@github.com wrote:

+1



@jhm-ciberman

Is this supported already? I'm using NW13 beta 2,
and this code throws a SpeechRecognitionError (error="network", message=""):

var recognition = new webkitSpeechRecognition();
recognition.onresult = function(e) { console.log(e); };
recognition.onerror = function(e) { console.log(e); };
// Fails with and without this line:
// recognition.serviceURI = 'wami.csail.mit.edu';
recognition.start();

@willemmulder

I assume wami.csail.mit.edu returns an error or requires a key of sorts? What do you see in the network requests?

@ghost

ghost commented Jan 11, 2016

Does this feature work without embedding a black box binary blob from Google that accesses the microphone without user consent (and without any web app making explicit use of it)?

Please note while I might sound paranoid, this is apparently what chromium used to do to support this: https://www.privateinternetaccess.com/blog/2015/06/google-chrome-listening-in-to-your-room-shows-the-importance-of-privacy-defense-in-depth/

If this is what you plan to add to nwjs as well, maybe make the binary blob download only triggered at runtime by the web app so no app using nwjs is FORCED to run this potential spyware code, or if this needs to be embedded into nwjs from now on then please provide a separate download that doesn't contain this feature if it relies on a google blob handling the microphones.

TL;DR: make sure apps that don't want to use speech recognition don't need to get shipped with nw.js versions that potentially spy on users with the microphone without asking. Again, see the URL above.

@jhm-ciberman

If I use the code above in NW, it fails with a SpeechRecognitionError. If I try it (with or without the wami.csail.mit.edu line) in regular Google Chrome, it works fine and detects the speech correctly.
In neither case are any network requests generated or shown in the devtools panel.
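
Since the devtools network panel shows nothing, the error code on the event is the main clue available to the page. The sketch below maps the codes listed in the Web Speech API draft to short descriptions, paraphrased from the draft. The names SPEECH_ERROR_HINTS and describeSpeechError are hypothetical, and the descriptions say nothing about why NW.js in particular reports "network".

```javascript
// Descriptions paraphrase the error codes listed in the Web Speech API draft.
var SPEECH_ERROR_HINTS = {
    'no-speech': 'No speech was detected.',
    'aborted': 'Speech input was aborted.',
    'audio-capture': 'Audio capture failed (check the microphone).',
    'network': 'Network communication needed for recognition failed.',
    'not-allowed': 'Speech input is blocked for security, privacy or preference reasons.',
    'service-not-allowed': 'The requested speech service is not allowed.',
    'bad-grammar': 'There was an error in the speech recognition grammar.',
    'language-not-supported': 'The language is not supported.'
};

// Hypothetical helper: turns an error event into a one-line message.
function describeSpeechError(e) {
    var known = Object.prototype.hasOwnProperty.call(SPEECH_ERROR_HINTS, e.error);
    var hint = known ? SPEECH_ERROR_HINTS[e.error] : 'Unknown error code.';
    return e.error + ': ' + hint + (e.message ? ' (' + e.message + ')' : '');
}

// Possible use with the earlier snippet:
// recognition.onerror = function (e) { console.log(describeSpeechError(e)); };
```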

@Nelderson

I want this soooooo bad.

@frenchbread

+1

@jhermsmeier

@jhm-ciberman @willemmulder I recently ran into the same network error with the 0.14.0-sdk. I might get around to looking into what's going on there in the next few days.

@willemmulder

Thanks @jhermsmeier ! Really looking forward to your findings.

@jhermsmeier

I've just done some testing with Chrome, Chromium and nw.js - attempting to use a local speech recognition backend running on localhost via serviceURI. First finding: The SpeechRecognition API stopped giving a flying fuck about the serviceURI property. It's completely ignored from what I can tell... i.e. with Chromium no connection to localhost could be observed, yet the speech recognition worked perfectly fine. Might find the time to dig into the source and file a bug, if there isn't one already.
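
One quick way to see whether a given runtime still exposes the attribute at all is a plain `in` check on a fresh instance. This is only a sketch: supportsServiceURI is a made-up helper name, and a true result would not prove the value is honored, since the observation above is that it gets ignored even where it can be set.

```javascript
// Hypothetical helper. Call it BEFORE assigning serviceURI: an assignment
// creates an ordinary property even on engines that dropped the attribute,
// which would make the check pass for the wrong reason.
function supportsServiceURI(recognition) {
    return 'serviceURI' in recognition;
}

// Only runs where the constructor exists (Chrome, Chromium, NW.js).
if (typeof webkitSpeechRecognition !== 'undefined') {
    var probe = new webkitSpeechRecognition();
    console.log('serviceURI exposed:', supportsServiceURI(probe));
}
```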

@willemmulder

Hmm, I'm pretty sure that using a different URL (I used wami.csail.mit.edu) worked before in Chrome. However, if I read this thread https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/82LcTDrhshw correctly, there's some talk about not supporting a full URL, whatever that means. In other words: I'm just as surprised as you are.

@jhermsmeier

jhermsmeier commented Apr 23, 2016

Looks like it's been removed / reverted 4 months ago: https://bugs.chromium.org/p/chromium/issues/detail?id=480516#c6

Can't see a valid reason for why they did though, as from what I can see, it's still in the draft/spec (although on the other hand; that's in flux, and I have difficulties finding a recent document).

There are some mentions of going through chrome://speech-recognition to use custom recognizers – which I haven't quite fully understood, yet. Will really need to dig into the issue tracker and source to figure out what's happening with that I guess.

@bromagosa

Any updates on this?

chrome.tts doesn't seem to work at all in nwjs.io, neither does window.speechSynthesis. Has anyone succeeded?

I thought it may be an issue of adding the right permissions to ttsEngine in the manifest, but that doesn't seem to do the trick either.

@rogerwang
Member

@bromagosa I just tested, and chrome.tts works for me. This issue is about speech recognition, not text-to-speech. Please file another issue if there is a bug on your side.

@rogerwang
Member

rogerwang commented Feb 22, 2017

To all: I just ran a test, and speech recognition works with a valid Google API key. So I'm closing this issue.

I'm testing with https://www.google.com/intl/en/chrome/demos/speech.html.

Check here to see how to get a key: https://www.chromium.org/developers/how-tos/api-keys
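
The comment above does not say how the key was handed to NW.js. The Chromium API-keys page describes supplying keys at runtime through the environment variables GOOGLE_API_KEY, GOOGLE_DEFAULT_CLIENT_ID and GOOGLE_DEFAULT_CLIENT_SECRET. The sketch below starts the app with those variables set. It is an assumption, not something confirmed in this thread, that a given NW.js build reads them. The helper names, nwPath and appDir are placeholders.

```javascript
// Variable names are the ones the Chromium API-keys page lists for supplying
// keys at runtime. ASSUMPTION: the NW.js build in use reads them too.
function withGoogleKeys(baseEnv, keys) {
    var env = Object.assign({}, baseEnv);
    env.GOOGLE_API_KEY = keys.apiKey;
    env.GOOGLE_DEFAULT_CLIENT_ID = keys.clientId;
    env.GOOGLE_DEFAULT_CLIENT_SECRET = keys.clientSecret;
    return env;
}

// Hypothetical launcher: nwPath and appDir stand for your own paths.
function launchNw(nwPath, appDir, keys) {
    var spawn = require('child_process').spawn;
    return spawn(nwPath, [appDir], {
        env: withGoogleKeys(process.env, keys),
        stdio: 'inherit'
    });
}

// Example call (not executed here):
// launchNw('/path/to/nw', '/path/to/app', { apiKey: '...', clientId: '...', clientSecret: '...' });
```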

@bromagosa

@rogerwang sorry, you're right. I was hunting for solutions for both because speech recognition wasn't working either, but I'll take a look at your links. Thanks!

p.s. I found out chrome.tts is supported in Windows, Mac and Chromebooks, but (seemingly arbitrarily) not in GNU/Linux, so that's why I can't get it to work. I may need to work around that by adding a Node binding to festival :(

@mscreenie
Contributor

The Speech API looks great, but the current limits imposed by Google make it unusable for developers.

There needs to be a really good offline ASR so you never have to worry about Google and being online, but that probably won't happen anytime soon.

@Sugarcaen

It looks like you could use the Google Cloud Platform for speech rec, or spawn a chrome process and try piping everything through there.
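
For the Cloud Platform route, a rough sketch of what a request to the Cloud Speech-to-Text REST endpoint (speech:recognize) could look like from Node or NW.js. It only builds the request and does not send it. It assumes the audio is raw 16-bit linear PCM already captured elsewhere and that apiKey is a Cloud API key of your own. The function name buildRecognizeRequest is made up for illustration.

```javascript
// Builds (but does not send) a request for the Cloud Speech-to-Text REST
// endpoint. ASSUMPTIONS: pcmBuffer is a Node Buffer of raw 16-bit linear PCM,
// and apiKey is a Cloud API key you created yourself.
function buildRecognizeRequest(pcmBuffer, apiKey, languageCode, sampleRateHertz) {
    return {
        url: 'https://speech.googleapis.com/v1/speech:recognize?key=' +
            encodeURIComponent(apiKey),
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            config: {
                encoding: 'LINEAR16',
                sampleRateHertz: sampleRateHertz,
                languageCode: languageCode
            },
            // The REST API expects the audio bytes base64-encoded.
            audio: { content: pcmBuffer.toString('base64') }
        })
    };
}
```

The returned object can be passed to whatever HTTP client the app already uses. Quotas and pricing for the Cloud service are separate from the browser API discussed in this issue.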

@mscreenie
Contributor

mscreenie commented Apr 13, 2017

I think @jhermsmeier's comment is interesting. serviceURI was removed in Chromium 49 and I can't find any information as to why. I know the API and spec are in draft, shit happens and things change. But a serviceURI param would have been useful.

This really sucks because it means vendor lock-in, and implementing your own self-hosted speech-to-text is now really difficult. I don't know if this is Google wanting all the info to flow into their veins, but it's frustrating to say the least, and removing the param is a step in the wrong direction in the spirit of openness.

I am probably in the minority here, so there is little chance anything will change now. It would be nice to be able to voice this to the Chromium team at least.

@jhermsmeier

According to the latest draft of the Web Speech API, the serviceURI hasn't been removed from the specification.

So I went digging a bit, which did turn up some things I haven't discovered before, but most discussion around this seems to have happened out of band, or somewhere inaccessible to the public.

There was Chromium Issue N° 480516 filed ~2 years ago, in which support was added, and subsequently reverted again a few months later. Discussion related to this happened on blink-dev/82LcTDrhshw/pGKPgrXOUaAJ, yet it is still unclear to me what exactly happened there.

It might be worth filing a new Chromium Issue to get an update on the situation.

@mscreenie
Contributor

mscreenie commented Aug 1, 2017

Thanks for digging. I've done some myself and have come to the same conclusion. Everything is still a little unclear, and it seems odd for serviceURI to be removed after being added. I can't establish why, which is also a little frustrating.

I am hopeful it may return, for now I'm exploring other things.

  1. Firefox has VoiceFill with similar functionality. I haven't dug into how it works completely, but this may be interesting to you:

https://github.com/mozilla/speaktome
https://github.com/mozilla/speech-proxy

VoiceFill is written as a WebExtension unless I am mistaken. If Chromium had WebExtension support there could be hope to port it as a viable replacement easily and without much effort given that is the purpose of WebExtension.

https://developer.chrome.com/extensions

  2. A node plugin, though it may be short-lived.

  3. Maybe posting in the Chromium issue tracker and linking here would be a good start to investigating.

@vsharmaMitel

Hello, may I know why this issue was closed? I really want to implement webkitSpeechRecognition for my node-webkit app. Can someone please tell me if there is any solution or workaround?

@thestonechat

+1

@thestonechat

I'm searching for an Electron alternative for my app just for speech recognition, and I hope this works!

@mscreenie
Contributor

Chromium has recently released SODA (works offline). This may be a better solution than this.
