Topic: Speech Recognition
These are working notes for a topic area.
Grade Progression: What should students know?
Grades K-2: Some types of devices can recognize human speech. These include most cellphones and home entertainment systems such as Amazon's Echo or Google Home.
Grades 3-5: Speech recognition systems use grammatical knowledge to disambiguate homophones such as bear/bare or there/their/they're. Example: "There is no hot water" vs. "Their hot water is off" vs. "They're waiting for the hot water to come back on".
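As a rough illustration of the Grades 3-5 idea, the sketch below chooses among there/their/they're by looking only at the word that follows. The `FOLLOWERS` table and the `pick_spelling` function are hypothetical, with hand-picked word lists taken from the example sentences; real recognizers rely on statistical language models rather than a lookup table like this one.

```python
# Toy context rules: which spelling fits, given the word that follows it.
# These word lists are illustrative assumptions, not a real grammar.
FOLLOWERS = {
    "there": {"is", "are", "was", "were"},
    "their": {"hot", "water", "house", "dog"},
    "they're": {"waiting", "going", "coming"},
}


def pick_spelling(next_word):
    """Return the homophone whose follower list contains next_word."""
    for spelling, followers in FOLLOWERS.items():
        if next_word.lower() in followers:
            return spelling
    return None  # not enough context to decide


# All three words sound the same; only the context differs.
for next_word in ["is", "hot", "waiting"]:
    print(pick_spelling(next_word), next_word)
```

With students, a useful follow-up is to ask which following words would leave the table unable to decide, and what extra context a person would use in that case.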
Readings for Working Group
Machine Learning is Fun Part 6: How to do Speech Recognition with Deep Learning. Adam Geitgey, Medium, December 2016. medium.com
How Siri Works -- Interview with Tom Gruber, CTO of SIRI. Nova Spivack, NovaSpivack.com, January 26, 2010. novaspivack.com
Old Readings (replaced)
a. Brief Explanation of AI for Layman medium.com
b. Making the Leap from Speech to Dialogue: The Challenge for Human to Machine Communication medium.com
c. CACM January 2014 - A historical perspective of speech recognition. Xuedong Huang, James Baker, and Raj Reddy. Commun. ACM 57, 1 (January 2014), 94-103. DOI: acm.org
d. CACM April 2018 - Speech emotion recognition: two decades in a nutshell, benchmarks, and ongoing trends. Björn W. Schuller. Commun. ACM 61, 5 (April 2018), 90-99. DOI: doi.org/10.1145/3129340 acm.org
e. Video: Speech Emotion Recognition. youtube.com
- Demo: Speech Recognition in Chrome
- Alexa, Siri, Cortana
- Use Audacity (free download) to record a speech segment and display the spectrogram.
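For groups that want to see what lies behind a spectrogram display like the one in Audacity, here is a minimal sketch using only the Python standard library. It cuts a signal into short overlapping frames and measures how much energy each frame has at each frequency. The function name, frame length, hop size, sample rate, and the synthetic 1000 Hz test tone are all assumptions chosen for illustration; this is not a description of how Audacity itself computes its display, and a practical version would use a fast Fourier transform library instead of the slow loop shown here.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return one row per frame; each row holds magnitudes per frequency bin."""
    # A Hann window tapers the edges of each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        # Naive discrete Fourier transform for bins 0 .. frame_len / 2.
        for k in range(frame_len // 2 + 1):
            total = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            row.append(abs(total))
        rows.append(row)
    return rows


sample_rate = 8000  # samples per second (assumed)
# A quarter second of a pure 1000 Hz tone stands in for recorded speech.
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(2000)]

spec = spectrogram(tone)
bin_hz = sample_rate / 64  # width of each frequency bin in Hz
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * bin_hz
print(len(spec), "frames,", len(spec[0]), "bins, strongest at", peak_hz, "Hz")
```

Real speech would show several strong bands that shift from frame to frame instead of a single steady peak.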
Miscellaneous concepts to incorporate
Audio -> Formants -> Phones -> Syllables -> Words -> Phrases
How neural nets improved speech recognition: use of massive training data.
Grammar: recognition does best with conversational English
"How to recognize speech" == "How to wreck a nice beach"
Languages other than English
Accents; child voices
Applications: Alexa, Siri, Cortana. What do they do? How are they useful?
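The step from phones to words in the chain above is where ambiguities like "recognize speech" vs. "wreck a nice beach" arise. That pair is only a near match at the phone level, so the sketch below uses a pair that matches exactly, "ice cream" and "I scream". The `LEXICON` table and the `segmentations` function are hypothetical, and the ARPAbet-style phone spellings are simplified for illustration.

```python
# Toy pronunciation lexicon using ARPAbet-style phone symbols.
# Hypothetical and simplified; a real lexicon has many thousands of entries.
LEXICON = {
    "I": ("AY",),
    "ice": ("AY", "S"),
    "scream": ("S", "K", "R", "IY", "M"),
    "cream": ("K", "R", "IY", "M"),
}


def segmentations(phones):
    """Return every way to split a phone sequence into lexicon words."""
    phones = tuple(phones)
    if not phones:
        return [[]]  # exactly one way to segment nothing: no words
    results = []
    for word, pron in LEXICON.items():
        # Try each word whose pronunciation matches the start of the sequence.
        if phones[:len(pron)] == pron:
            for rest in segmentations(phones[len(pron):]):
                results.append([word] + rest)
    return results


heard = ["AY", "S", "K", "R", "IY", "M"]
for words in segmentations(heard):
    print(" ".join(words))
```

Both word sequences fit the same sounds equally well, which is why a recognizer also needs knowledge about which phrases are likely.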