ASR LLM Rescoring

Instructions

Run preprocess_data.py to generate dictionaries containing the n-best ASR scores for each utterance.
Run llm_scoring.py to update the dictionaries with LLM scores (GPT-2 and BERT) for each utterance.
Run combined_scores.py with the --lambda_param argument to combine the ASR and LLM scores.
Run compute_error_rate.py to compute the error rate for a given hypothesis dictionary.
Run gridsearch.sh to test error rates over a range of lambda values.
hyp_comb_10_dict_test_other.json contains the hypotheses and all the scores for the automasking experiment.
hyp_comb_masks_10_dict_test_other.json contains the hypotheses and all the scores for the selective mask-based experiment.
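
The combine, score, and sweep steps above can be sketched roughly as follows. This is an illustrative sketch, not the code in combined_scores.py or compute_error_rate.py: the dictionary layout (keys "text", "asr_score", "llm_score"), the function names, and the interpolation rule asr_score + lambda * llm_score are all assumptions made for the example, and the error rate shown is a plain word error rate.

```python
from typing import Dict, List


def combine_scores(hyps: Dict[str, List[dict]], lambda_param: float) -> Dict[str, str]:
    """Pick, per utterance, the hypothesis with the highest combined score.

    Assumed rule (not confirmed by the repository):
    combined = asr_score + lambda_param * llm_score.
    """
    best = {}
    for utt_id, entries in hyps.items():
        top = max(entries, key=lambda e: e["asr_score"] + lambda_param * e["llm_score"])
        best[utt_id] = top["text"]
    return best


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance, computed one row at a time."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def word_error_rate(refs: Dict[str, str], hyps: Dict[str, str]) -> float:
    """Total word edits divided by total reference words."""
    errors = sum(edit_distance(refs[u].split(), hyps.get(u, "").split()) for u in refs)
    words = sum(len(refs[u].split()) for u in refs)
    return errors / words


# Hypothetical toy data in the assumed layout: utterance id -> n-best list.
hyps = {
    "utt1": [
        {"text": "the cat sat on the mat", "asr_score": -4.0, "llm_score": -12.0},
        {"text": "the cat sat on a mat", "asr_score": -3.5, "llm_score": -15.0},
    ],
}
refs = {"utt1": "the cat sat on the mat"}

# A tiny sweep over lambda, in the spirit of gridsearch.sh.
for lam in (0.0, 0.5):
    picked = combine_scores(hyps, lam)
    print(f"lambda={lam}: WER={word_error_rate(refs, picked):.3f}")
```

With lambda set to 0 the ASR score alone decides the winner. Raising lambda gives the LLM score more influence, which is why a sweep over lambda is needed to find the best trade-off.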
