greysun/audio-sound-and-speech

audio-sound-and-speech

A repository of papers, tools, and docs related to audio, sound, and speech.

Papers

https://github.com/google/uis-rnn This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization. https://arxiv.org/abs/1810.04719

https://github.com/philipperemy/deep-speaker

https://github.com/qqueing/DeepSpeaker-pytorch

https://arxiv.org/abs/1604.07160 Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Detection

SURREY-CVSSP SYSTEM FOR DCASE2017 CHALLENGE TASK4

https://arxiv.org/find/all/1/all:+dcase/0/1/0/all/0/1

https://wenku.baidu.com/view/b223255b3186bceb18e8bb71.html

https://etymo.io/search/Dcase

https://arxiv.org/abs/1612.01611v1

https://arxiv.org/abs/1607.03681v2

https://arxiv.org/abs/1703.06902v1

https://arxiv.org/abs/1609.06026v3

http://karol.piczak.com/papers/Piczak2015-ESC-ConvNet.pdf

https://arxiv.org/pdf/1609.05234.pdf (https://github.com/spragunr/deep_q_rl)

https://vijaychan.github.io/Publications/2011%20-%20Survey%20and%20evaluation%20of%20audio%20fingerprinting%20schemes%20for%20mobile%20audio%20search.pdf SURVEY AND EVALUATION OF AUDIO FINGERPRINTING SCHEMES FOR MOBILE QUERY-BY-EXAMPLE APPLICATIONS

Tools and code

https://github.com/google/uis-rnn This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization. https://arxiv.org/abs/1810.04719

https://github.com/dake/openVP Voiceprint (speaker) recognition

https://github.com/tensorflow/models/tree/master/research/audioset CNN Architectures for Large-Scale Audio Classification

http://projects.csail.mit.edu/soundnet/ SoundNet: Learning Sound Representations from Unlabeled Video

https://github.com/tyiannak/pyAudioAnalysis http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0144610 https://github.com/librosa/librosa https://github.com/readbeyond/aeneas https://github.com/CPJKU/madmom https://github.com/aalireza/SimpleAudioIndexer https://github.com/craffel/mir_eval

Audio/Sound event detection:

https://github.com/gorinars/dcase16-cnn https://github.com/liuhuang31/dcase17_cnn https://github.com/kahst/AcousticEventDetection https://github.com/nationalparkservice/acoustic_discovery

Visualization:

https://github.com/TUT-ARG/sed_vis

https://github.com/TUT-ARG/TUT_Rare_sound_events_mixture_synthesizer

http://tut-arg.github.io/sed_eval/ Evaluation tool

https://github.com/TUT-ARG/sed_vis Visualization tool

https://github.com/znichols/racKet https://github.com/justinsalamon/UrbanSound8K-JAMS http://bmcfee.github.io/papers/scipy2015_librosa.pdf

https://github.com/andabi/voice-vector A deep neural network for finding text-independent speaker embedding written in tensorflow and tensorpack

Music fingerprinting systems:

https://github.com/echonest/echoprint-server Server components for Echoprint

https://github.com/beetbox/pyacoustid Python bindings for Chromaprint acoustic fingerprinting and the Acoustid Web service

https://acoustid.org AcoustID is a project providing a complete audio identification service, based entirely on open source software.

https://labrosa.ee.columbia.edu/matlab/audfprint/ audfprint is a (compiled) Matlab script that takes a list of sound files and creates a database of landmarks, then takes one or more query audio files and matches them against the previously created database.

https://github.com/dpwe/audfprint Landmark-based audio fingerprinting

https://github.com/spotify/echoprint-server Server for the Echoprint audio fingerprint system

https://github.com/worldveil/dejavu Audio fingerprinting and recognition in Python

https://github.com/jameslyons/python_speech_features This library provides common speech features for ASR including MFCCs and filterbank energies.

Documents

https://github.com/bootphon/phonemizer Simple text to phonemes converter for multiple languages

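https://mp.weixin.qq.com/s?__biz=MzU2OTA0NzE2NA==&mid=2247501030&idx=1&sn=31fe4c7f596e377afc3a473bbffc84e2&chksm=fc8625f5cbf1ace38f098b1e3d641a66a9618926ffe4a3c11abf549ddf95aeb25b5f99e261bb&mpshare=1&scene=24&srcid=110566sXYXbPEX9X6HQqQvYV#rd A comprehensive collection of introductory materials, papers, code, and products for the speech recognition field, covering speech recognition, speech synthesis, voiceprint recognition, and more. A single article to bring you into the world of speech recognition.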

https://www.zhihu.com/question/53707809/answer/181292755 https://zhuanlan.zhihu.com/p/24362279 https://www.zhihu.com/question/21505605 https://www.zhihu.com/question/265075184/answer/291146573 https://zhuanlan.zhihu.com/p/26482011

MFCC emotion recognition:

http://blog.csdn.net/u011108244/article/details/51661186 https://my.oschina.net/jamesju/blog/193343 http://practicalcryptography.com/miscellaneous/machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs/ http://blog.csdn.net/audio_algorithm/article/details/78709422 https://wenku.baidu.com/view/39b761f20242a8956bece4a3.html

Detection and Classification of Acoustic Scenes and Events Outcome of the DCASE

https://www.cs.tut.fi/sgn/arg/dcase2017/ https://github.com/yongxuUSTC/dcase2017_task4_cvssp https://github.com/DeepLJH0001/DCASE2016 http://www.sohu.com/a/193907127_642762 https://www.cs.tut.fi/sgn/arg/dcase2017/challenge/download

https://www.zhihu.com/question/56816282/answer/150639596 https://github.com/qiuqiangkong/DCASE2016_Task3 https://www.zhihu.com/question/57658184/answer/245420536

http://www.sohu.com/a/117638110_465975 https://www.zhihu.com/question/23497307/answer/24772167

https://www.zhihu.com/question/20398418/answer/18080841 https://www.zhihu.com/question/24342192/answer/225984574 https://zhuanlan.zhihu.com/p/33464788 https://zhuanlan.zhihu.com/p/33144046 https://ccrma.stanford.edu/~jos/filters/ https://zhuanlan.zhihu.com/p/28848339

https://blog.csdn.net/yutianzuijin/article/details/21446401 Introduction to music retrieval

https://max.book118.com/html/2017/0221/92851570.shtm Content-based audio information retrieval

https://github.com/musescore/MuseScore MuseScore is an open source and free music notation software.

http://willdrevo.com/fingerprinting-and-audio-recognition-with-python/ Audio Fingerprinting with Python and Numpy

https://www.zhihu.com/question/265066896/answer/291395259 https://www.zhihu.com/question/265209086/answer/301313983 What competitions are there in speech recognition?

https://blog.naaln.com/2013/08/music-algorithm-for-fingerprint-framework/ Music fingerprinting: the algorithm framework

https://github.com/ybayle/awesome-deep-learning-music List of articles related to deep learning applied to music
