Curriculum Vitae - Quan Wang

Current Occupation

Staff Software Engineer & Tech Lead Manager
Google LLC
New York, NY

Contact

Email: quanrpi@gmail.com
Web: wangquan.me
LinkedIn: www.linkedin.com/in/wangquan

News

My award-winning book "Voice Identity Techniques: From core algorithms to engineering practice" (Chinese) can be purchased here.

Research Interests

Recent

  • Speaker identification and diarization
  • Source separation and speech enhancement
  • Text-to-speech synthesis
  • Deep learning

Previous

  • Learning-based font loading (MLFont)
  • Optical character recognition
  • Biomedical image analysis (mainly 2D and 3D segmentation)
  • Geometric models, shape models, and face models
  • Occupancy sensing and reconstruction for smart lighting

Media Coverage

Education

  • 2010/08 – 2014/10, Ph.D., Rensselaer Polytechnic Institute, NY, USA

    • Signal Analysis and Machine Perception Laboratory (SAMPL)
    • Department of Electrical, Computer, and Systems Engineering (ECSE)
    • Advisor: Prof. Kim L. Boyer
    • Thesis: Exploiting Geometric and Spatial Constraints for Vision and Lighting Applications
    • GPA: 4.0/4.0
  • 2006/08 – 2010/08, B.Eng. in Automation, Tsinghua University, Beijing, China

    • Department of Automation, Class of Fundamental Sciences
    • Advisor: Prof. Qionghai Dai
    • Thesis: Implementation and Study of Light-Field-Based 3D Object Retrieval System
    • Major GPA: 91.3/100

Work Experience

  • 2015/11 – Current, Staff Software Engineer and Tech Lead Manager, Google, New York City, NY, USA

    • Manager: Dr. Ignacio Lopez Moreno
    • "OK Google" voice search & actions
    • Speaker identification and speaker diarization
    • VoiceFilter source separation
    • Learning-based font loading (MLFont)
  • 2014/11 – 2015/10, Machine Learning Scientist, Amazon, Cambridge, MA, USA

    • Manager: Dr. Shiv Vitaladevuni
    • Amazon Firefly: Optical character recognition
    • Amazon Echo: Speech recognition
  • 2013/05 – 2013/08, Research Intern, IBM Almaden Research Center, San Jose, CA, USA

    • Manager: Dr. Tanveer Syeda-Mahmood
    • Automated segmentation and heart disease detection from echocardiogram images
    • The Medical Sieve project (in Java)
  • 2012/05 – 2012/08, Research Intern, Siemens Corporate Research, Princeton, NJ, USA

    • Managers: Dr. Dijia Wu and Dr. Shaohua Kevin Zhou
    • Learning-based automatic knee cartilage segmentation in 3D MR images (in C++)
  • 2009/06 – 2009/07, Intern Programmer, Northking Technology Corporation, Beijing, China

    • Development of the Business Operation System of Northking Technology Corporation (with the JSF framework)

Awards

  • Distinguished Author of Year 2020

    • Awarded by Publishing House of Electronics Industry (PHEI).
  • The Allen B. Dumont Prize, 2015

    • This prize is awarded to a graduate student who has demonstrated high scholastic ability and has made a substantial contribution to their field.

Publications

Books

  • Quan Wang, "Voice Identity Techniques: From core algorithms to engineering practice" (Chinese), Publishing House of Electronics Industry (PHEI), September 2020. [GitHub] [JD] [TMall] [DangDang]

Journal Publications

  • Quan Wang, Kim L. Boyer, "The active geometric shape model: A new robust deformable shape model and its applications", Computer Vision and Image Understanding, Volume 116, Issue 12, December 2012, Pages 1178-1194, ISSN 1077-3142, doi:10.1016/j.cviu.2012.08.004. [link] [PDF] [slides] [software]

  • Quan Wang, Xinchi Zhang, Kim L. Boyer, "Occupancy distribution estimation for smart light delivery with perturbation-modulated light sensing", Journal of Solid State Lighting 2014 1:17, ISSN 2196-1107, doi:10.1186/s40539-014-0017-2. [link] [PDF] [software]

  • Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Hector Delgado, Andreas Nautsch, Nicholas Evans, Md Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sebastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika, Takashi Kaneda, Yuan Jiang, Li-Juan Liu, Yi-Chiao Wu, Wen-Chin Huang, Tomoki Toda, Kou Tanaka, Hirokazu Kameoka, Ingmar Steiner, Driss Matrouf, Jean-Francois Bonastre, Avashna Govender, Srikanth Ronanki, Jing-Xuan Zhang, Zhen-Hua Ling, "ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech", Computer Speech & Language, Volume 64, doi:10.1016/j.csl.2020.101114. [link] [PDF]

Conference Publications

  • Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ding Zhao, Yiteng (Arden) Huang, Arun Narayanan, Ian McGraw, "Personalized Keyphrase Detection using Speaker and Environment Information", arXiv:2104.13970 [eess.AS], 2021. [PDF]

  • Roza Chojnacka, Jason Pelecanos, Quan Wang, Ignacio Lopez Moreno, "SpeakerStew: Scaling to Many Languages with a Triaged Multilingual Text-Dependent and Text-Independent Speaker Verification System", arXiv:2104.02125 [eess.AS], 2021. [PDF]

  • Jason Pelecanos, Quan Wang, Ignacio Lopez Moreno, "Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition", arXiv:2104.01989 [cs.CL], 2021. [PDF]

  • Yiling Huang, Yutian Chen, Jason Pelecanos, Quan Wang, "Synth2Aug: Cross-domain speaker recognition with TTS synthesized speech", IEEE Spoken Language Technology Workshop (SLT), 2021. [PDF]

  • Shaojin Ding, Ye Jia, Ke Hu, Quan Wang, "Textual Echo Cancellation", arXiv:2008.06006 [eess.AS], 2020. [PDF]

  • Quan Wang, Ignacio Lopez Moreno, Mert Saglam, Kevin Wilson, Alan Chiao, Renjie Liu, Yanzhang He, Wei Li, Jason Pelecanos, Marily Nika, Alexander Gruenstein, "VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition", Interspeech 2020. [PDF] [website] [Google AI Blog]

  • Quan Wang, Ignacio Lopez Moreno, "Version Control of Speaker Recognition Systems", arXiv:2007.12069 [eess.AS], 2020. [PDF]

  • Shaojin Ding, Quan Wang, Shuo-yiin Chang, Li Wan, Ignacio Lopez Moreno, "Personal VAD: Speaker-Conditioned Voice Activity Detection", Proc. Odyssey 2020 The Speaker and Language Recognition Workshop. [PDF]

  • Li Wan, Prashant Sridhar, Yang Yu, Quan Wang, Ignacio Lopez Moreno, "Tuplemax Loss for Language Identification", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019). [PDF] [poster]

  • Quan Wang, Hannah Muckenhirn, Kevin Wilson, Prashant Sridhar, Zelin Wu, John Hershey, Rif A. Saurous, Ron J. Weiss, Ye Jia, Ignacio Lopez Moreno, "VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking", Interspeech 2019. (ORAL) [PDF] [samples]

  • Aonan Zhang, Quan Wang, Zhenyao Zhu, John Paisley, Chong Wang, "Fully Supervised Speaker Diarization", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019). [PDF] [code] [poster] [Google AI Blog]

  • Yutian Chen, Yannis Assael, Brendan Shillingford, David Budden, Scott Reed, Heiga Zen, Quan Wang, Luis C. Cobo, Andrew Trask, Ben Laurie, Caglar Gulcehre, Aäron van den Oord, Oriol Vinyals, Nando de Freitas, "Sample Efficient Adaptive Text-to-Speech", International Conference on Learning Representations (ICLR 2019). [PDF] [samples] [poster] [Google AI Blog]

  • Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio Lopez Moreno, Yonghui Wu, "Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis", Advances in Neural Information Processing Systems (NeurIPS 2018). [PDF] [samples] [poster] [Google AI Blog]

  • Quan Wang, Carlton Downey, Li Wan, Philip Andrew Mansfield, Ignacio Lopez Moreno, "Speaker Diarization with LSTM", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018). [PDF] [poster] [code] [wiki]

  • Li Wan, Quan Wang, Alan Papir, Ignacio Lopez Moreno, "Generalized End-to-End Loss for Speaker Verification", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018). (ORAL) [PDF] [slides] [wiki]

  • F A Rezaur Rahman Chowdhury, Quan Wang, Ignacio Lopez Moreno, Li Wan, "Attention-Based Models for Text-Dependent Speaker Verification", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018). [PDF] [poster]

  • Alejandro Luebs, Bastiaan Kleijn, Felicia Lim, Florian Stimberg, Jan Skoglund, Quan Wang, Thomas Walters, "Wavenet Based Low-Rate Speech Coding", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018). [PDF] [poster]

  • Quan Wang, Xinchi Zhang, Kim L. Boyer, "3D Scene Estimation with Perturbation-Modulated Light and Distributed Sensors", 10th IEEE Workshop on Perception Beyond the Visible Spectrum (PBVS). (ORAL) [PDF]

  • Quan Wang, Yan Ou, A. Agung Julius, Kim L. Boyer and Min Jun Kim, "Tracking Tetrahymena Pyriformis Cells using Decision Trees", 21st International Conference on Pattern Recognition (ICPR), Pages 1843-1847, 11-15 Nov. 2012. [PDF] [shotgun] [poster] [software]

  • Quan Wang, Dijia Wu, Le Lu, Meizhu Liu, Kim L. Boyer, and Shaohua Kevin Zhou, "Semantic Context Forests for Learning-Based Knee Cartilage Segmentation in 3D MR Images", MICCAI 2013: Workshop on Medical Computer Vision. (ORAL) [PDF] [poster] [slides] [software]

  • Quan Wang, Xin Shen, Meng Wang, Kim L. Boyer, "Label Consistent Fisher Vectors for Supervised Feature Aggregation", 22nd International Conference on Pattern Recognition (ICPR), 2014. [PDF] [poster] [software] [demo]

  • Quan Wang, Xinchi Zhang, Meng Wang, Kim L. Boyer, "Learning Room Occupancy Patterns from Sparsely Recovered Light Transport Models", 22nd International Conference on Pattern Recognition (ICPR), 2014. (ORAL) [PDF]

  • Quan Wang, Kim L. Boyer, "Feature Learning by Multidimensional Scaling and its Applications in Object Recognition", 26th SIBGRAPI Conference on Graphics, Patterns and Images (Sibgrapi). IEEE, 2013. (ORAL) [PDF](https://github.com/wq2012/SimpleMatrix/blob/master/documentation/MDS_SIBGRAPI_2013.pdf) [slides] [software]

  • Tanveer Syeda-Mahmood, Quan Wang, Patrick McNeillie, David Beymer, Colin Compas, "Discriminating Normal and Abnormal Left Ventricular Shapes in Four-Chamber View 2D Echocardiography", International Symposium on Biomedical Imaging (ISBI), 2014.

  • Quan Wang, Yu Wang, Zuoguan Wang, "Online Smart Face Morphing Engine with Prior Constraints and Local Geometry Preservation", International Workshop on Multimodal pattern recognition of social signals in human computer interaction (MPRSS 2014). (ORAL) [PDF]

  • Xinchi Zhang, Quan Wang, Kim L. Boyer, "Illumination Adaptation with Rapid-Response Color Sensors", SPIE Optical Engineering + Applications, 2014. (ORAL) [PDF]

Technical Reports and Theses

  • Jin Shi, Quan Wang, Yeming Fang, Gang Feng, Zhengying Chen, Jason Pelecanos, Ignacio Lopez Moreno, Andrea Chu, Pedro Moreno Mengibar, "Utterance Augmentation for Speaker Recognition", Technical Disclosure Commons, Defensive Publications Series, 2020. [link] [PDF]

  • Quan Wang, Yiran Mao, "Learning Better Font Slicing Strategies from Data", Technical Disclosure Commons, Defensive Publications Series, 2017. [link] [PDF] [wiki]

  • Philip Andrew Mansfield, Quan Wang, Carlton Downey, Li Wan, Ignacio Lopez Moreno, "Links: A High-Dimensional Online Clustering Method", arXiv:1801.10123 [stat.ML], 2018. [PDF]

  • Quan Wang, "GMM-Based Hidden Markov Random Field for Color Image and 3D Volume Segmentation", arXiv:1212.4527 [cs.CV], 2012. [PDF]

  • Quan Wang, "HMRF-EM-image: Implementation of the Hidden Markov Random Field Model and its Expectation-Maximization Algorithm", arXiv:1207.3510 [cs.CV], 2012. [PDF]

  • Quan Wang, "Kernel Principal Component Analysis and its Applications in Face Recognition and Active Shape Models", arXiv:1207.3538 [cs.CV], 2012. [PDF]

  • Quan Wang, "Exploiting Geometric and Spatial Constraints for Vision and Lighting Applications", Rensselaer Polytechnic Institute Ph.D. dissertation, 2014.

  • Quan Wang, "Implementation and Study of Light-Field-Based 3D Object Retrieval System", Tsinghua University Undergraduate Thesis, 2010. [PDF] [poster] [demo]

Acknowledged by

  • Soumi Maiti, Hakan Erdogan, Kevin Wilson, Scott Wisdom, Shinji Watanabe, John R. Hershey, "End-to-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings", arXiv:2105.02096 [cs.SD]. [PDF]

  • ASVspoof 2019: Automatic Speaker Verification Spoofing and Countermeasures Challenge Evaluation Plan, 2019. [PDF]

  • Ye Jia, Ron J. Weiss, Fadi Biadsy, Wolfgang Macherey, Melvin Johnson, Zhifeng Chen, Yonghui Wu, "Direct speech-to-speech translation with a sequence-to-sequence model", arXiv:1904.06037 [cs.CL]. [PDF]

  • Aonan Zhang, "Composing Deep Learning and Bayesian Nonparametric Methods", Ph.D. Dissertation, 2019. [PDF]

  • Yu Wang, "A broadly applicable three-dimensional neuron reconstruction framework based on deformable models and software system with parallel GPU implementation", Ph.D. Dissertation, 2011.

Patents

Reviewing

Teaching and Mentoring

Students

  • Wei Xia, 2021 Google summer intern, Ph.D.
  • Shaojin Ding, 2019 & 2020 Google summer intern, Ph.D.
  • Aonan Zhang, 2018 Google summer intern & 2018 Google Student Researcher, Ph.D.
  • Hannah Muckenhirn, 2018 Google summer intern, Ph.D.
  • F A Rezaur Rahman Chowdhury, 2017 Google summer intern, Ph.D. (cohost)
  • Carlton Downey, 2017 Google summer intern, Ph.D.
  • Xinchi Zhang, 2013-2014 undergraduate student at Smart Lighting Engineering Research Center

Teaching Assistant

  • 2011/01 – 2012/12, Teaching Assistant, Rensselaer Polytechnic Institute, Troy, NY, USA
    • Spring 2011, Embedded Control [ENGR 2350], by Prof. Russell P. Kraft
    • Spring 2011, Real-Time Applications in Control & Communications [ECSE 4760], by Prof. Russell P. Kraft
    • Fall 2011, Introduction to Engineering Analysis [ENGR 1100], by Prof. Mark W. Olles
    • Spring 2012, Electric Circuits [ECSE 2010], by Prof. Jeffrey Braunstein
    • Spring 2012, Biological Image Analysis [ECSE 4960], by Dr. Jens Rittscher and Dr. Dirk Padfield
    • Fall 2012, Modeling and Analysis of Uncertainty [ENGR 2600], by Prof. Charles J. Malmborg
