Improve evaluate output format #55

thisismatu · 2022-11-08T20:54:51Z

Update progress bar characters and width and clear it on finish
Group both ASR and NLU output better
Use our fork of Needleman-Wunsch alignment for ASR and NLU output. This is a rather simple tool that shows where the ground truth and the prediction differ.
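
For readers unfamiliar with the technique, here is a minimal sketch of character-level Needleman-Wunsch alignment with a configurable gap character. It is not the code from our fork: the function name `align` and the scores (match +1, mismatch -1, gap -1) are assumptions chosen for illustration only.

```go
package main

import "fmt"

// align performs a global (Needleman-Wunsch) alignment of a and b and
// returns both strings padded with gap so that they have equal length.
// Hypothetical helper; the scores below are illustrative assumptions.
func align(a, b string, gap rune) (string, string) {
	const match, mismatch, gapPenalty = 1, -1, -1
	x, y := []rune(a), []rune(b)
	n, m := len(x), len(y)

	// score[i][j] is the best score for aligning x[:i] with y[:j].
	score := make([][]int, n+1)
	for i := range score {
		score[i] = make([]int, m+1)
		score[i][0] = i * gapPenalty
	}
	for j := 0; j <= m; j++ {
		score[0][j] = j * gapPenalty
	}
	for i := 1; i <= n; i++ {
		for j := 1; j <= m; j++ {
			s := mismatch
			if x[i-1] == y[j-1] {
				s = match
			}
			best := score[i-1][j-1] + s
			if v := score[i-1][j] + gapPenalty; v > best {
				best = v
			}
			if v := score[i][j-1] + gapPenalty; v > best {
				best = v
			}
			score[i][j] = best
		}
	}

	// Trace back from the bottom-right corner, building both rows in reverse.
	var ra, rb []rune
	i, j := n, m
	for i > 0 || j > 0 {
		s := mismatch
		if i > 0 && j > 0 && x[i-1] == y[j-1] {
			s = match
		}
		switch {
		case i > 0 && j > 0 && score[i][j] == score[i-1][j-1]+s:
			ra, rb = append(ra, x[i-1]), append(rb, y[j-1])
			i, j = i-1, j-1
		case i > 0 && score[i][j] == score[i-1][j]+gapPenalty:
			ra, rb = append(ra, x[i-1]), append(rb, gap)
			i--
		default:
			ra, rb = append(ra, gap), append(rb, y[j-1])
			j--
		}
	}
	reverse(ra)
	reverse(rb)
	return string(ra), string(rb)
}

// reverse flips a rune slice in place.
func reverse(r []rune) {
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
}

func main() {
	truth, pred := align("SPEECHLY PODCAST", "SPEECH PODCAST", '*')
	fmt.Println("Ground truth:", truth) // SPEECHLY PODCAST
	fmt.Println("Prediction:  ", pred)  // SPEECH** PODCAST
}
```

When several alignments share the best score, the order of the traceback checks decides which one is printed, so gap placement can differ between inputs that look similar.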

Progress bar

Transcribing  50%  █████████▌░░░░░░░░░░  (2/4, 36 utt/min) [5s:3s]

NLU output

speechly evaluate nlu <app_id> test.txt

Line: 2
└─ Ground truth: *add_to_cart i want [two|2](amount) of those
└─ Prediction:   *add_to_cart i want *two*********** of those

Line: 4
└─ Ground truth: *set_delivery_date delivery [tomorrow|2022-01-*01](delivery_date)
└─ Prediction:   *set_delivery_date delivery [tomorrow|2022-11-10*](delivery_date)

Accuracy: 0.50 (2/4)

ASR output

speechly evaluate asr <app_id> test.jsonl

Audio: podcast1.wav
└─ Ground truth: WELCOME TO ANOTHER EPISODE OF THE SPEECHLY PODCAST
└─ Prediction:   WELCOME TO ANOTHER EPISODE OF THE SPEECH** PODCAST

Audio: podcast3.wav
└─ Ground truth: THIS CONCEPT OF VOICE BEING AN EXPERT UI COULD YOU MAY***BE UNPACK THAT CONCEPT
└─ Prediction:   THIS CONCEPT OF VOICE BEING AN EXPERT UI COULD YOU MIGHT BE UNPACK THAT CONCEPT

Word Error Rate (WER): 0.04 (3/68)
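
As a rough illustration of how a figure like the one above can be derived, the sketch below computes word-level edit distance (substitutions, insertions, deletions) and divides it by the number of reference words. This is the common definition of WER; whether `speechly evaluate` computes it exactly this way is an assumption, and `wordErrors` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// wordErrors returns the word-level edit distance between ref and hyp
// and the number of words in ref. Hypothetical helper for illustration.
func wordErrors(ref, hyp string) (errs, refWords int) {
	r, h := strings.Fields(ref), strings.Fields(hyp)

	// prev[j] holds the distance between the reference words seen so far
	// and the first j hypothesis words.
	prev := make([]int, len(h)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(r); i++ {
		cur := make([]int, len(h)+1)
		cur[0] = i
		for j := 1; j <= len(h); j++ {
			cost := 1
			if r[i-1] == h[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j-1]+cost, prev[j]+1, cur[j-1]+1)
		}
		prev = cur
	}
	return prev[len(h)], len(r)
}

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

func main() {
	pairs := [][2]string{
		{"WELCOME TO ANOTHER EPISODE OF THE SPEECHLY PODCAST",
			"WELCOME TO ANOTHER EPISODE OF THE SPEECH PODCAST"},
		{"THIS CONCEPT OF VOICE BEING AN EXPERT UI COULD YOU MAYBE UNPACK THAT CONCEPT",
			"THIS CONCEPT OF VOICE BEING AN EXPERT UI COULD YOU MIGHT BE UNPACK THAT CONCEPT"},
	}
	totalErrs, totalWords := 0, 0
	for _, p := range pairs {
		e, w := wordErrors(p[0], p[1])
		totalErrs += e
		totalWords += w
	}
	// For the two utterances shown above this gives 3 errors over 22 words.
	fmt.Printf("WER: %.2f (%d/%d)\n",
		float64(totalErrs)/float64(totalWords), totalErrs, totalWords)
}
```

Under this definition the two utterances shown contribute 3 errors (one substitution in the first, one substitution plus one insertion in the second), which is consistent with the numerator in the summary line; the remaining reference words come from the rest of the test set.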

Known issues

Long lines will wrap, but at least with these changes they are easier to distinguish. This PR does not try to change that, since it's a rabbit hole, believe me. I was in that hole for a brief moment before I managed to pull myself out of it...

Group items better and use Needleman-Wunsch alignment for asr evaluation output

bigdatabaracus · 2022-11-09T07:13:44Z

Thanks @thisismatu, nice! One comment right off the bat: I think using the Needleman-Wunsch alignment for NLU as well, to highlight differences, would make a lot of sense. As an example:

Line: 2
└─ Ground truth: *add_to_cart i want -two----------- of those
└─ Prediction:   *add_to_cart i want [two|2](amount) of those

bigdatabaracus · 2022-11-09T08:06:53Z

Thanks @thisismatu for the update. Out of curiosity, does the second NLU example in the PR description look like this now?

Line: 4
└─ Ground truth: *set_delivery_date delivery [tomorrow|2022-10--09](delivery_date)
└─ Prediction:   *set_delivery_date delivery [tomorrow|2022-1-1-09](delivery_date)

thisismatu · 2022-11-09T08:22:38Z

@bigdatabaracus it varies, hence I didn't add it in the first place. As there's no customization, our options are to remove it for NLU or live with this.

A few example outputs, with ground truth on top and prediction on the bottom:

[tomorrow|2022-10-11]
[tomorrow|2022-11-10]

[tomorrow|2021-01--01]
[tomorrow|2022-11-10-]

[tomorrow|1998-12-21-]
[tomorrow|2022-11--10]

bigdatabaracus · 2022-11-09T08:59:52Z

The hyphen is an unfortunate choice of alignment character in the Go library. We are just about to add that character to our ASR output as well. 🤔

Mathias Lindholm added 2 commits November 8, 2022 22:37

Update progress style and clear on finish

30dfed9

Improve evaluate accuracy output

0ba8cea

Group items better and use Needleman-Wunsch alignment for asr evaluation output

thisismatu requested review from bigdatabaracus and teelisyys November 8, 2022 20:54

Format nlu output using nw algo

2c8d3ba

teelisyys previously approved these changes Nov 9, 2022

Use our nwalgo fork and specify char

a0d76a9

thisismatu dismissed teelisyys’s stale review via a0d76a9 November 9, 2022 11:01

thisismatu requested a review from teelisyys November 9, 2022 11:05

bigdatabaracus approved these changes Nov 9, 2022

thisismatu merged commit bb8c46b into master Nov 9, 2022

thisismatu deleted the format-evaluate-output branch November 9, 2022 11:39
