lsmithso/nrtvr

* Near Real Time Voice Recognition

** Notes
GapTimer:
Used to flush flac to disk and send a message to the vr client.
Detects silence (gaps) using the level plugin.
If silent, doesn't record to disk.
If non-silent, start recording and start the timer.
If recording, record for min 10 secs, max 20 secs (the window).
If silence in the window, stop recording and flush.
If end of window, stop recording.
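
A minimal sketch of the GapTimer rules above, written as a small state machine. The class layout, the method name update and the returned strings are made up for illustration; the -50 dB silence threshold is borrowed from the level snippet further down, and "window" is read here as "flush on silence only after 10 secs, always flush at 20 secs", which is one possible interpretation of the notes.

```python
class GapTimer:
    """Decide when to start, keep, and flush a recording from level readings."""

    def __init__(self, silence_db=-50.0, min_secs=10.0, max_secs=20.0):
        self.silence_db = silence_db   # assumed threshold, peaks below this are silence
        self.min_secs = min_secs       # record at least this long
        self.max_secs = max_secs       # never record longer than this
        self.started_at = None         # None means we are not recording

    def update(self, now, peak_db):
        """Feed one level reading; return 'idle', 'start', 'record' or 'flush'."""
        silent = peak_db < self.silence_db
        if self.started_at is None:
            if silent:
                return "idle"          # silence: nothing goes to disk
            self.started_at = now      # first sound: start recording and the timer
            return "start"
        elapsed = now - self.started_at
        if elapsed >= self.max_secs or (silent and elapsed >= self.min_secs):
            self.started_at = None     # end of window, or a gap inside the window
            return "flush"
        return "record"
```

The caller would feed it the timestamp and peak of each level message and close the flac file whenever it returns 'flush'.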


Uses gst to capture voice from mic, http or skype and pipes it to Google's
voice recognition API in real time. Results can be displayed in near
real time, e.g. radio program transcriptions or skype calls, for
deafblind people, braille users etc.

2x process:

. Capture process captures audio from the chosen input stream and
transcodes it to flac. Uses the level plugin to detect silence and count
word boundaries. Streams flac to a file and, when there are sufficient
words, closes the file and sends the filename on a pipe to the vr process.

. vr process waits for filenames on its input pipe and then makes the
Google vr API call, displaying the results.
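
A rough sketch of the vr side of that hand-off, assuming the capture process writes one filename per line on the pipe. The names vr_loop and fake_recognise are hypothetical, and fake_recognise only stands in for the Google call, which these notes do not show.

```python
def vr_loop(filenames_in, recognise, display=print):
    """Read one filename per line until EOF, recognise each file, display the text."""
    handled = 0
    for line in filenames_in:
        path = line.strip()
        if not path:
            continue                    # ignore blank lines on the pipe
        display(recognise(path))
        handled += 1
    return handled


def fake_recognise(path):
    # Stand-in for the Google voice recognition call (not part of these notes).
    return "transcript of %s" % path
```

In the real vr process this could be run as vr_loop(sys.stdin, recognise), with the capture process holding the write end of the pipe.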

Tried on r4 - pretty crappy results. Maybe need a filter? audiofx for voice.
Carry on anyway.
Code up python launcher for silence detection.
+ pipe to process for tuning filters etc.



----
gst-launch playbin2 uri="mms://wmlive-acl.bbc.co.uk/wms/bbc_ami/radio4/radio4_bb_live_eq1_sl1?BBC-UID=743f7d2b70f86c6814c7231b812545a243f9ae4c10900184a4dfd476c840ce2a&SSO2-UID="



gst-launch pulsesrc device=3 ! flacenc rate=16000 ! filesink location=z.flac

gst-launch -e pulsesrc device=3 ! audio/x-raw-int,rate=16000,channels=1 ! flacenc  ! filesink location=z.flac

mms://wmlive-acl.bbc.co.uk/wms/bbc_ami/radio4/radio4_bb_live_eq1_sl1?BBC-UID=743f7d2b70f86c6814c7231b812545a243f9ae4c10900184a4dfd476c840ce2a&SSO2-UID=
gst-launch -e souphttpsrc location=$U  ! audio/x-raw-int,rate=16000,channels=1 ! flacenc  ! filesink location=z.flac


gst-launch -e mmssrc location=$U ! asfdemux name=demux  demux.audio_00  ! audio/x-raw-int,rate=16000,channels=1 ! flacenc  ! filesink location=z.flac

gst-launch mmssrc location=mms://195.37.219.74:8080 ! asfdemux name=demux demux.audio_00 ! demux.video_00 ! { queue ! ffdec_msmpeg4 ! ffcolorspace ! xvimagesink }
gst-launch mmssrc location=mms://195.37.219.74:8080 ! asfdemux name=demux demux.video_00 ! { queue ! ffdec_wmv2 ! xvimagesink } demux.audio_00 ! { queue ! ffdec_wmav2 ! osssink }

gst-launch filesrc location=TheChubbChubbs.avi ! avidemux name=demux demux.audio_00 ! { queue ! filesink location=chubb.mp3 }
gst-launch filesrc location=TheChubbChubbs.avi ! avidemux name=demux demux.audio_00 ! { queue ! mad ! wavenc ! filesink location=chubb.wav }


gst-launch -e mmssrc location=$U ! asfdemux name=demux  demux.audio_00  ! multiqueue !   filesink location=z

gst-launch -e mmssrc location=$U ! asfdemux name=demux  demux.audio_00  ! multiqueue ! ffdec_wmav2   ! filesink location=z



gst-launch -e mmssrc location=$U ! asfdemux name=demux  demux.audio_00  ! multiqueue ! ffdec_wmav2   ! audioconvert ! flacenc ! filesink location=z




gst-launch -e mmssrc location=$U ! asfdemux name=demux  demux.audio_00  ! multiqueue ! ffdec_wmav2  ! audioresample    ! audio/x-raw-int,rate=6000,channels=2  !  audioconvert ! audio/x-raw-int,rate=6000,channels=1  ! flacenc ! filesink location=z

gst-launch -e mmssrc location=$U ! asfdemux name=demux  demux.audio_00  ! multiqueue ! ffdec_wmav2  ! audioresample    ! audio/x-raw-int,rate=16000,channels=2  !  audioconvert ! audio/x-raw-int,rate=16000,channels=1  ! flacenc ! filesink location=/tmp/z
export U="mms://wmlive-acl.bbc.co.uk/wms/bbc_ami/radio4/radio4_bb_live_eq1_sl1?BBC-UID=743f7d2b70f86c6814c7231b812545a243f9ae4c10900184a4dfd476c840ce2a&SSO2-UID="
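
The last pipeline above is the one a python launcher would most likely start from. As a sketch only, the same description can be assembled from parameters; the function name flac_capture_pipeline is made up here, and the element names are simply the GStreamer 0.10 ones used in the commands above.

```python
def flac_capture_pipeline(uri, location, rate=16000):
    """Return a gst-launch style description: mms source -> mono FLAC file."""
    stages = [
        'mmssrc location="%s"' % uri,
        "asfdemux name=demux demux.audio_00",
        "multiqueue",
        "ffdec_wmav2",
        "audioresample",
        "audio/x-raw-int,rate=%d,channels=2" % rate,   # resample first
        "audioconvert",
        "audio/x-raw-int,rate=%d,channels=1" % rate,   # then mix down to mono
        "flacenc",
        'filesink location="%s"' % location,
    ]
    return " ! ".join(stages)
```

The returned string has the same shape as the gst-launch arguments, so it could be handed to gst.parse_launch instead of being typed on the command line.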


----
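import gobject
import pygst
pygst.require('0.10')
import gst

def bus_event(bus, message, *args):
    # Only 'level' messages carry a 'peak' list (dB, one entry per channel).
    if message.structure is None or message.structure.get_name() != 'level':
        return True
    peak = message.structure['peak'][0]
    if peak < -50:
        print('silence on the cable! help!')
    return True

mainloop = gobject.MainLoop()

# Decode an Ogg/Vorbis stream; level posts a message every 5 s (interval is in ns).
s = 'gnomevfssrc location="http://local-stream:8000/local.ogg" ! '\
      'oggdemux ! vorbisdec ! audioconvert ! '\
      'level message=true interval=5000000000 ! fakesink'

pipeline = gst.parse_launch(s)
pipeline.get_bus().add_signal_watch()
i = pipeline.get_bus().connect('message::element', bus_event)
pipeline.set_state(gst.STATE_PLAYING)
mainloop.run()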


cutter plugin?
fdsink - open fd to write to?

You can set the sink to READY then change the location.

gst-inspect multifilesink


* Plan B

Since filesink can't change location on the fly: open a pipe to a
sub-process and write raw data to it via fdsink. When it is time to
change file location, send SIGUSR1. The child process reads stdin to
get raw data till EOF, saving to a file as it does so. The SIGUSR1 handler
flushes data to the file and closes it, then opens the next file etc. The
signal also sends a message to the Google vr client to read the file,
encode it and call the API.
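
A sketch of what the child process could look like, under a few assumptions: the class name RotatingWriter, the chunk-NNNN.raw file naming and the on_closed callback are all invented for illustration. One deliberate difference from the note: the signal handler only sets a flag and the actual file switch happens on the next write, which keeps file I/O out of the handler.

```python
import os
import signal
import sys


class RotatingWriter:
    """Append raw bytes to numbered files; a rotate request closes one and opens the next."""

    def __init__(self, directory, on_closed=lambda path: None):
        self.directory = directory
        self.on_closed = on_closed      # e.g. tell the vr client which file is ready
        self.index = 0
        self.rotate_requested = False
        self.out = open(self._path(), "wb")

    def _path(self):
        return os.path.join(self.directory, "chunk-%04d.raw" % self.index)

    def request_rotate(self, signum=None, frame=None):
        # Usable as a signal handler: only set a flag, write() does the real work.
        self.rotate_requested = True

    def rotate(self):
        closed = self._path()
        self.out.close()                # close() flushes buffered data to the file
        self.on_closed(closed)
        self.index += 1
        self.out = open(self._path(), "wb")
        self.rotate_requested = False

    def write(self, data):
        if self.rotate_requested:
            self.rotate()
        self.out.write(data)

    def close(self):
        closed = self._path()
        self.out.close()
        self.on_closed(closed)


def run(directory, stream=None):
    """Copy stdin to rotating files until EOF, switching files on SIGUSR1."""
    stream = stream or sys.stdin.buffer
    writer = RotatingWriter(directory, on_closed=print)
    signal.signal(signal.SIGUSR1, writer.request_rotate)
    for block in iter(lambda: stream.read(4096), b""):
        writer.write(block)
    writer.close()
```

Here run prints each finished filename on stdout, so its output could be piped straight into the vr process.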

Maybe the child process can flac encode on the fly, via a gstreamer
pipe. At SIGUSR1, it puts an EOS on the stream and stops feeding
data to it, thus flushing the filesink. Or it flac encodes after the
file switch. Or the Google API does it.

Stage 1: Modify nrtv.py to send data to a subprocess via a pipe. Make sure
data is written to file. Modify the child process to pipe to audioplay to
make sure it's gapless.



/tmp/x.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 32 bit, mono 16000 Hz
ffmpeg -i /tmp/x.wav
ffmpeg version 0.7.6-4:0.7.6-0ubuntu0.11.10.3, Copyright (c) 2000-2011 the Libav developers
  built on Jan 24 2013 19:25:26 with gcc 4.6.1
  configuration: --extra-version='4:0.7.6-0ubuntu0.11.10.3' --arch=i386 --prefix=/usr --enable-vdpau --enable-bzlib --enable-libgsm --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-libvorbis --enable-pthreads --enable-zlib --enable-libvpx --enable-runtime-cpudetect --enable-vaapi --enable-gpl --enable-postproc --enable-swscale --enable-x11grab --enable-libdc1394 --enable-shared --disable-static
  WARNING: library configuration mismatch
  avutil      configuration: --extra-version='4:0.7.6ubuntu0.11.10.3' --arch=i386 --prefix=/usr --enable-vdpau --enable-bzlib --enable-libgsm --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-libvorbis --enable-pthreads --enable-zlib --enable-libvpx --enable-runtime-cpudetect --enable-vaapi --enable-libopenjpeg --enable-gpl --enable-postproc --enable-swscale --enable-x11grab --enable-libdirac --enable-libmp3lame --enable-librtmp --enable-libx264 --enable-libxvid --enable-libvo-aacenc --enable-version3 --enable-libvo-amrwbenc --enable-version3 --enable-libdc1394 --shlibdir=/usr/lib/i686/cmov --cpu=i686 --enable-shared --disable-static --disable-ffmpeg --disable-ffplay
  avcodec     configuration: --extra-version='4:0.7.6ubuntu0.11.10.3' --arch=i386 --prefix=/usr --enable-vdpau --enable-bzlib --enable-libgsm --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-libvorbis --enable-pthreads --enable-zlib --enable-libvpx --enable-runtime-cpudetect --enable-vaapi --enable-libopenjpeg --enable-gpl --enable-postproc --enable-swscale --enable-x11grab --enable-libdirac --enable-libmp3lame --enable-librtmp --enable-libx264 --enable-libxvid --enable-libvo-aacenc --enable-version3 --enable-libvo-amrwbenc --enable-version3 --enable-libdc1394 --shlibdir=/usr/lib/i686/cmov --cpu=i686 --enable-shared --disable-static --disable-ffmpeg --disable-ffplay
  avformat    configuration: --extra-version='4:0.7.6-0ubuntu0.11.10.3' --arch=i386 --prefix=/usr --enable-vdpau --enable-bzlib --enable-libgsm --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-libvorbis --enable-pthreads --enable-zlib --enable-libvpx --enable-runtime-cpudetect --enable-vaapi --enable-gpl --enable-postproc --enable-swscale --enable-x11grab --enable-libdc1394 --shlibdir=/usr/lib/i686/cmov --cpu=i686 --enable-shared --disable-static --disable-ffmpeg --disable-ffplay
  avdevice    configuration: --extra-version='4:0.7.6-0ubuntu0.11.10.3' --arch=i386 --prefix=/usr --enable-vdpau --enable-bzlib --enable-libgsm --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-libvorbis --enable-pthreads --enable-zlib --enable-libvpx --enable-runtime-cpudetect --enable-vaapi --enable-gpl --enable-postproc --enable-swscale --enable-x11grab --enable-libdc1394 --shlibdir=/usr/lib/i686/cmov --cpu=i686 --enable-shared --disable-static --disable-ffmpeg --disable-ffplay
  avfilter    configuration: --extra-version='4:0.7.6-0ubuntu0.11.10.3' --arch=i386 --prefix=/usr --enable-vdpau --enable-bzlib --enable-libgsm --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-libvorbis --enable-pthreads --enable-zlib --enable-libvpx --enable-runtime-cpudetect --enable-vaapi --enable-gpl --enable-postproc --enable-swscale --enable-x11grab --enable-libdc1394 --shlibdir=/usr/lib/i686/cmov --cpu=i686 --enable-shared --disable-static --disable-ffmpeg --disable-ffplay
  swscale     configuration: --extra-version='4:0.7.6-0ubuntu0.11.10.3' --arch=i386 --prefix=/usr --enable-vdpau --enable-bzlib --enable-libgsm --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-libvorbis --enable-pthreads --enable-zlib --enable-libvpx --enable-runtime-cpudetect --enable-vaapi --enable-gpl --enable-postproc --enable-swscale --enable-x11grab --enable-libdc1394 --shlibdir=/usr/lib/i686/cmov --cpu=i686 --enable-shared --disable-static --disable-ffmpeg --disable-ffplay
  postproc    configuration: --extra-version='4:0.7.6-0ubuntu0.11.10.3' --arch=i386 --prefix=/usr --enable-vdpau --enable-bzlib --enable-libgsm --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-libvorbis --enable-pthreads --enable-zlib --enable-libvpx --enable-runtime-cpudetect --enable-vaapi --enable-gpl --enable-postproc --enable-swscale --enable-x11grab --enable-libdc1394 --shlibdir=/usr/lib/i686/cmov --cpu=i686 --enable-shared --disable-static --disable-ffmpeg --disable-ffplay
  libavutil    51.  7. 0 / 51.  7. 0
  libavcodec   53.  6. 0 / 53.  6. 0
  libavformat  53.  3. 0 / 53.  3. 0
  libavdevice  53.  0. 0 / 53.  0. 0
  libavfilter   2.  4. 0 /  2.  4. 0
  libswscale    2.  0. 0 /  2.  0. 0
  libpostproc  52.  0. 0 / 52.  0. 0
[wav @ 0x84a4a40] max_analyze_duration reached
Input #0, wav, from '/tmp/x.wav':
  Duration: 09:19:13.40, bitrate: 0 kb/s
    Stream #0.0: Audio: pcm_s32le, 16000 Hz, 1 channels, s32, 512 kb/s
At least one output file must be specified



Bins
----
If you want to add bins to your pipeline, you can do so by specifying 
  <bintype> . ( <properties> <elements> )
in your pipeline. This adds a bin to your pipeline and puts all elements that
are specified between the brackets inside this bin. You can specify properties
of the bin directly after the opening bracket.
There are two special bins: if you don't specify a bintype and no dot either, the
type of the bin defaults to "bin". And you can use curly brackets { } to get a 
bin of type "thread".
  example:#> gst-launch \( { fakesrc pipeline . \( fakesink \) } \)
This will put a fakesrc element inside a thread inside a bin and a fakesink into
a pipeline element inside the thread inside the bin.
Please note that this pipeline would not work, even if the elements were 
connected properly, because the pipeline only specifies one top level element, 
the element is not put inside a pipeline but returned directly. So if you don't
want your elements to be put into a pipeline, just add a bin of whatever type you
wish around the pipeline.
