JavaScript library for detecting synthesized sounds

Picognizer

Picognizer is a 100% JavaScript library for detecting synthesized sounds (e.g. video game sound effects, replayed speech or music, and alert sounds from home appliances). It runs in your web browser without any server-side system.

Demo movie

https://www.youtube.com/watch?v=-2MtArZAtfg (Japanese)

https://www.youtube.com/watch?v=CoYJmNdxPNY (English)

Browser example (works best with Firefox on PC/Mac/Android)

Click this: https://qurihara.github.io/picognizer/script.html?cri=48&surl=https://rawgit.com/qurihara/picognizer/gh-pages/scripts/bg_red.js&src=https://rawgit.com/Fulox/FullScreenMario-JSON/master/Sounds/Sounds/mp3/Coin.mp3&frame=0.04&dur=0.01

This example detects the famous coin sound effect from Super Mario Brothers. Press the "Picognize" button and grant permission to use your mic. The "play" button plays the target sound so that you can emulate a detection. The browser's background turns red when the sound is detected. The "fire" button lets you check what happens after a detection. Move the slider to adjust the threshold.

You can execute any JavaScript code on detection. See https://github.com/qurihara/picognizer/blob/gh-pages/scripts/bg_red.js for an example. The setup() function is executed only once, when the page is loaded. The onfire() function is executed on each detection.
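
The setup()/onfire() contract described above can be sketched as follows. This is a minimal, hypothetical script and not the contents of bg_red.js: the `state` object and the `paint` helper are names introduced here for illustration, and the choice of white as the initial color is an assumption.

```javascript
// Hypothetical detection script following the setup()/onfire() contract.
// `state` and `paint` are illustrative names, not part of Picognizer.
var state = { fired: 0, background: null };

// Remember the color and, when running in a browser, apply it to the page.
function paint(color) {
  state.background = color;
  if (typeof document !== 'undefined' && document.body) {
    document.body.style.backgroundColor = color;
  }
}

// Called once when the page is loaded.
function setup() {
  state.fired = 0;
  paint('white');
}

// Called on each detection.
function onfire() {
  state.fired += 1;
  paint('red');
}
```

Keeping the page update in one small helper makes it easy to swap the reaction (for example, playing a sound or sending a request) without touching the two entry points.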

Node.js example with beefy

The current implementation of Picognizer is web-oriented. However, you can also use Picognizer in a Node.js coding style with beefy (https://www.npmjs.com/package/beefy).

npm i beefy browserify debug -g
beefy node_beefy_example.js --live --open

Then a browser opens to run the example. If you don't use Firefox as your default browser, reopen the page with Firefox.

See node_beefy_example.js for the actual coding style.

How to code

// for web browser
var Pico = require('picognizer');
// for Node.js/beefy
//var Meyda = require("./lib/meyda.min");
//var Pico = require('./picognizer');

var P = new Pico();

// parameters
var option = {
  bufferSize: Math.pow(2, 10), // Feature buffer size (default: automatic)
  windowFunc: "hamming", // Window function (default: hamming)
  feature: ["mfcc"], // Feature name (default: ["powerSpectrum"])
  mode: "direct", // Cost calculation algorithm (default: direct)
  inputType: "audio", // Input type (default: mic)
  file: "input_audio_name", // Filename of the input audio when you use inputType:"audio"
  framesec: 0.1, // Interval in seconds between feature extractions (default: 0.02)
  duration: 1.0, // Interval in seconds between cost calculations (default: 1.0)
  slice: [0, 1.0] // Slice time in seconds for the target feature; slice[0] is the start and slice[1] is the end
};
P.init(option); // set parameters

P.oncost('url_for_audiofilename.mp3', function(cost){
  // do something with cost
});
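
One common thing to do with the cost is to compare it against a threshold, as the slider in the browser example suggests. The sketch below assumes that a lower cost means a closer match; `makeDetector`, its cooldown parameter, and the threshold value in the usage comment are hypothetical and not part of Picognizer.

```javascript
// Hypothetical helper: turns a stream of cost values into detection events.
// Assumes a LOWER cost means a closer match to the target sound.
function makeDetector(threshold, cooldownMs, onDetect) {
  var lastFired = -Infinity;
  // nowMs is optional and only exists to make the helper easy to test.
  return function (cost, nowMs) {
    var now = (nowMs === undefined) ? Date.now() : nowMs;
    if (cost < threshold && now - lastFired >= cooldownMs) {
      lastFired = now;
      onDetect(cost);
      return true;
    }
    return false;
  };
}

// Possible usage with the callback shown above (threshold chosen arbitrarily):
// P.oncost('url_for_audiofilename.mp3', makeDetector(40, 1000, function (cost) {
//   console.log('detected, cost =', cost);
// }));
```

The cooldown keeps a single long sound from firing the handler on every cost calculation.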

option

bufferSize

"bufferSize" is the size of the buffer from which each feature is extracted. When you use spectral features, it must be a power of two greater than the number of samples in framesec. If bufferSize is undefined, it is calculated automatically from framesec.

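The power-of-two rule for bufferSize can be illustrated with a small calculation. This is a sketch under assumptions, not Picognizer's own automatic calculation: the function names are hypothetical, the sample rate must be supplied by the caller (44100 Hz is only an example), and a sample count that is already a power of two is accepted as is.

```javascript
// Smallest power of two that is not smaller than n.
function nextPowerOfTwo(n) {
  var p = 1;
  while (p < n) {
    p *= 2;
  }
  return p;
}

// Hypothetical helper: buffer size for a given frame interval and sample rate.
function bufferSizeFor(framesec, sampleRate) {
  var samples = Math.ceil(framesec * sampleRate); // samples in one framesec interval
  return nextPowerOfTwo(samples);
}

// Example: at 44100 Hz, framesec 0.02 covers about 882 samples,
// so the buffer would be 1024 samples long.
var exampleBufferSize = bufferSizeFor(0.02, 44100);
```
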
mode

This option sets the cost calculation algorithm. The cost between the target feature vector and the input feature vector is calculated either with dynamic time warping ("dtw") or with direct comparison ("direct").
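
To make the difference between the two modes concrete, here is a generic sketch of both ideas on sequences of feature frames. It is a textbook formulation and not Picognizer's implementation: the function names, the Euclidean frame distance, and the averaging in the direct case are all assumptions.

```javascript
// Euclidean distance between two feature frames of equal length.
function frameDistance(a, b) {
  var sum = 0;
  for (var i = 0; i < a.length; i++) {
    var d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Direct comparison: frame i of the target is compared with frame i of the input.
function directCost(target, input) {
  var n = Math.min(target.length, input.length);
  var total = 0;
  for (var i = 0; i < n; i++) {
    total += frameDistance(target[i], input[i]);
  }
  return n > 0 ? total / n : Infinity;
}

// Dynamic time warping: frames may be stretched or compressed in time
// before being compared, so tempo differences cost nothing extra.
function dtwCost(target, input) {
  var n = target.length;
  var m = input.length;
  var prev = new Array(m + 1).fill(Infinity);
  prev[0] = 0;
  for (var i = 1; i <= n; i++) {
    var cur = new Array(m + 1).fill(Infinity);
    for (var j = 1; j <= m; j++) {
      var d = frameDistance(target[i - 1], input[j - 1]);
      cur[j] = d + Math.min(prev[j], cur[j - 1], prev[j - 1]);
    }
    prev = cur;
  }
  return prev[m];
}
```

Direct comparison is cheaper per calculation, while dynamic time warping tolerates an input that is played slightly faster or slower than the target.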

inputType

Select the input source: "mic" for the microphone or "audio" for an audio file. If "audio" is set, you must also specify the audio file with "file".
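
The dependency between inputType and file can be checked before calling init. The helper below is a hypothetical pre-check written for this README, not something Picognizer provides; it only encodes the rule stated above and the documented default of "mic".

```javascript
// Hypothetical pre-check for the inputType/file rule; not part of Picognizer.
function checkInputOption(option) {
  var inputType = option.inputType || 'mic'; // documented default is mic
  if (inputType !== 'mic' && inputType !== 'audio') {
    throw new Error('inputType must be "mic" or "audio"');
  }
  if (inputType === 'audio' && !option.file) {
    throw new Error('inputType "audio" requires the "file" option');
  }
  return inputType;
}

// Microphone input needs no file; audio input does.
var micOption = { inputType: 'mic' };
var audioOption = { inputType: 'audio', file: 'input_audio_name' };
```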

slice

If the target sound source is long, you can cut out a section of it and extract feature vectors from that section only. slice[0] is the start time and slice[1] is the end time, both in seconds.
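
As an illustration of what slicing by seconds means, the sketch below converts a [start, end] pair into sample indices and cuts a sample array accordingly. How Picognizer slices internally is not described here, so the function name, the rounding, and the clamping to the array bounds are assumptions.

```javascript
// Hypothetical helper: cut [start, end] seconds out of an array of samples.
function sliceSamples(samples, sampleRate, slice) {
  var start = Math.max(0, Math.round(slice[0] * sampleRate));
  var end = Math.min(samples.length, Math.round(slice[1] * sampleRate));
  return samples.slice(start, end);
}

// Example: 10 samples at 2 samples per second, keeping seconds 1 to 3.
var section = sliceSamples([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 2, [1, 3]);
```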

For the feature-related parameters (bufferSize, windowFunc, feature), please see the Meyda wiki, since Meyda is used for feature extraction.