QBASR

This repository accompanies a paper that investigates question answering with noisy input.

We experiment with generating a large synthetic corpus and compare accuracy on it against accuracy on human-recorded data.
We introduce a confidence model that directly incorporates the confidence from an automatic-speech-recognition (ASR) system into a question answering neural network.
We compare model accuracy on data with "unknowns" decoded by the ASR system and data where a known vocabulary prediction is always forced.

Paper

The paper can be found at: https://www.isca-speech.org/archive/Interspeech_2019/pdfs/3154.pdf

Code

Code for the Deep Averaging Network, the Information Retrieval system, and relevant visualizations is provided here.
Additional data generation code (intended for a Slurm cluster) can be found at https://github.com/DenisPeskov/QBASR_GenerateData

Data

Data is stored at qbasr.umiacs.io. There are three folders:

Human Original Data
QANTA
SearchQA

Human Original Data contains the .mp3 files (and the .wav, .lat, and .sau files) needed to generate the final text output. The processed text files derived from this data are stored in the respective folders below.

QANTA contains the processed text data for Quizbowl. The files asr_qanta.{split}.2018.04.18.json contain the text-to-speech-generated questions, where split can be train, dev, or test. Additionally, there are two extension types for dev/test data: 1) first and 2) joined. First contains just the first sentence, which is the most difficult one. Joined contains the entire Quizbowl question.

qb.human.json contains the human-recorded questions. The file containing the decoded human-recorded questions, joined so that each entry is one Quizbowl question, is at: http://qbasr.umiacs.io/QANTA/qb.human.joined.json

SearchQA contains the processed text data for Jeopardy. Unmodified ASR-decoded data is located at searchqa.{split}.json, and the forced-decoding version at searchqa.exp.{split}.json, where split can be train, dev, or test. The path to the expanded test version is:
http://qbasr.umiacs.io/SearchQA/searchqa.exp.test.json
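
The SearchQA naming scheme above can be sketched as a small helper. This is only an illustration, not part of the repository: the function name `searchqa_url` and its `forced_decoding` parameter are hypothetical, and the assumption that every split sits directly under http://qbasr.umiacs.io/SearchQA/ is extrapolated from the single test URL listed above.

```python
# Base folder taken from the example URL in this README; whether the other
# splits are hosted at the same location is an assumption.
BASE_URL = "http://qbasr.umiacs.io/SearchQA"
SPLITS = ("train", "dev", "test")


def searchqa_url(split, forced_decoding=False):
    """Build the assumed URL of a SearchQA file for the given split.

    forced_decoding=False -> searchqa.{split}.json      (unmodified ASR output)
    forced_decoding=True  -> searchqa.exp.{split}.json  (forced-decoding version)
    """
    if split not in SPLITS:
        raise ValueError(f"split must be one of {SPLITS}, got {split!r}")
    prefix = "searchqa.exp" if forced_decoding else "searchqa"
    return f"{BASE_URL}/{prefix}.{split}.json"


print(searchqa_url("test", forced_decoding=True))
```

Only the forced-decoding test URL is confirmed by this README; treat the other combinations as guesses until checked against the server.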

Auto-generated data is not stored, in the interest of space (~300 GB). A separate repository containing the code needed to generate audio data with Google Text to Speech and decode it with Kaldi is provided at: https://github.com/DenisPeskov/QBASR_GenerateData

Example of Quizbowl data (the same sentence across different speakers):

Speaker 1: http://qbasr.umiacs.io/HumanOriginalData/HumanData_MultiSpeaker/0/2_0.mp3
Speaker 2: http://qbasr.umiacs.io/HumanOriginalData/HumanData_MultiSpeaker/1/2_0.mp3
Auto Generated: http://qbasr.umiacs.io/HumanOriginalData/HumanData_MultiSpeaker/999/-997_0.mp3

Example of Jeopardy data:

Alex Trebek's voice: http://qbasr.umiacs.io/HumanOriginalData/HumanData_9.26.18_FullJeopardyGame/2_1.mp3

Jeopardy Show #7828 (http://www.j-archive.com/showgame.php?game_id=6112) is hand-parsed.

Files are named columnNumber_QuestionNumber (1-indexed, not 0-indexed). So 2_1 corresponds to the first question in the second column (Name the Novel): "If the picture was to alter, it was to alter. That was all...not one blossom of his loveliness would ever fade".
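
As a minimal sketch of the naming convention for the Jeopardy game files, the helper below splits a file name into its column and question numbers. The names `ClipId` and `parse_clip_name` are hypothetical and not part of this repository. The sketch covers only the columnNumber_QuestionNumber pattern described here, not the multi-speaker Quizbowl files.

```python
from typing import NamedTuple


class ClipId(NamedTuple):
    column: int    # 1-indexed board column
    question: int  # 1-indexed question within that column


def parse_clip_name(path):
    """Parse a name such as '2_1.mp3' (or a URL ending in it) into a ClipId."""
    name = path.rsplit("/", 1)[-1]   # drop any directory or URL prefix
    if name.endswith(".mp3"):
        name = name[: -len(".mp3")]  # drop the audio extension
    column_text, question_text = name.split("_")
    column, question = int(column_text), int(question_text)
    if column < 1 or question < 1:
        raise ValueError(f"indices are 1-based, got {name!r}")
    return ClipId(column, question)


clip = parse_clip_name("2_1.mp3")
print(f"column {clip.column}, question {clip.question}")
```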
