Thanks for reporting this @wodka. This has been under discussion internally for the past week. #18 should address the optional speaker property of the wordBase type, and I updated the description of the speaker property of an utterance to note that it is the predicted speaker based on all the words in the utterance rather than a definitive speaker. It is derived without the diarizer's input.
Once that PR is merged and released, I'll close this issue as the product & engineering teams will handle any changes to the API moving forward and those changes are out of scope for this repository.
What is the current behavior?
The word base type (https://github.com/deepgram/node-sdk/blob/main/src/types/wordBase.ts) does not currently include a speaker property. The utterance type (https://github.com/deepgram/node-sdk/blob/main/src/types/utterance.ts) does have a speaker property, but that does not mean that all of the utterance's words belong to that speaker.
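To make the mismatch concrete, here is a condensed sketch of the two types as described above. The field names and shapes are assumptions based on this issue; see the linked files for the actual definitions.

```typescript
// Condensed sketch (not the actual SDK source) of the types discussed in
// this issue: src/types/wordBase.ts and src/types/utterance.ts.
interface WordBase {
  word: string;
  start: number;
  end: number;
  confidence: number;
  // Note: no `speaker` field here, even though the raw API response
  // includes one per word when diarization is enabled.
}

interface Utterance {
  start: number;
  end: number;
  confidence: number;
  transcript: string;
  speaker: number; // a single speaker, even if the words came from several
  words: WordBase[];
}
```

The problem in a nutshell: the per-word speaker from the API response is dropped, while the single utterance-level speaker silently claims all the words.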
Steps to reproduce
An audio file in which multiple speakers talk with less than 0.8 seconds of silence between them.
Expected behavior
One of two outcomes; both would require changes to the API result, as each has inconsistencies that are tricky to spot:
Each utterance belongs to exactly one speaker
Start a new utterance whenever the speaker changes. Then the word type does not need a speaker property, and the speaker property on the utterance can stay a single number.
Each Word has a speaker property
This would require adding the speaker property to the word type as well (it already exists in the API response). To indicate that there can be multiple speakers within one utterance, the speaker property at the utterance level would become
number[]
(as there can be multiple speakers within one utterance).
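The first option above can be sketched as a small helper that splits a run of diarized words into one utterance per contiguous speaker. This is illustrative only, under the assumption that each raw word carries a numeric speaker from the API response; none of these names (DiarizedWord, SpeakerUtterance, splitBySpeaker) are part of the SDK.

```typescript
// Hypothetical helper illustrating option 1: one utterance per contiguous
// speaker run. `speaker` on each word is assumed to come from the raw
// diarized API response.
interface DiarizedWord {
  word: string;
  start: number;
  end: number;
  speaker: number;
}

interface SpeakerUtterance {
  speaker: number;
  words: DiarizedWord[];
}

function splitBySpeaker(words: DiarizedWord[]): SpeakerUtterance[] {
  const result: SpeakerUtterance[] = [];
  for (const w of words) {
    const last = result[result.length - 1];
    if (last && last.speaker === w.speaker) {
      // Same speaker as the previous word: extend the current utterance.
      last.words.push(w);
    } else {
      // Speaker changed (or first word): start a new utterance.
      result.push({ speaker: w.speaker, words: [w] });
    }
  }
  return result;
}
```

With this shape, each utterance has exactly one speaker and the utterance-level speaker property can stay a plain number.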
Other information
https://developers.deepgram.com/documentation/features/diarize/