Remove LONG_SPEAKER_ID and instead only use SPEAKER_ID #1643

jrobble · 2023-02-10T02:42:30Z

Related to #1674.

Having both LONG_SPEAKER_ID and SPEAKER_ID is a little confusing to end users. The format of LONG_SPEAKER_ID is <start>-<stop>-# where "start" and "stop" refer to the frames or times (in ms) of a media segment. (Currently, only video files are segmented by the Workflow Manager.) The format of SPEAKER_ID is just the # part of that.

The LONG_SPEAKER_ID is always set by speech-to-text components. In some cases, however, the SPEAKER_ID is set to 0 to indicate that it should not be used and that the LONG_SPEAKER_ID should be used instead. This happens when a video is segmented. Since each segment is processed independently, speakers can only be identified relative to their own segment. For example, a speaker with id 0 in segment A may not be the same person as the speaker with id 0 in segment B. The <start>-<stop>- prefix ensures that each of these speakers has a unique id.
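To make the uniqueness argument concrete, here is a minimal sketch of building and splitting an id in the <start>-<stop>-# format. The function names and the segment bounds are hypothetical, chosen only for illustration; they are not taken from any OpenMPF component, and the sketch assumes start, stop, and the speaker number are non-negative integers.

```python
def make_long_speaker_id(start: int, stop: int, speaker: int) -> str:
    """Join segment bounds and a per-segment speaker number as <start>-<stop>-#."""
    return f"{start}-{stop}-{speaker}"


def split_long_speaker_id(long_id: str) -> tuple[int, int, int]:
    """Recover (start, stop, speaker) from a <start>-<stop>-# string.

    Assumes all three parts are non-negative integers, so splitting on "-"
    yields exactly three fields.
    """
    start, stop, speaker = long_id.split("-")
    return int(start), int(stop), int(speaker)


# Hypothetical segments: the same local speaker number 0 appears in both,
# but the segment prefix keeps the resulting ids distinct.
segment_a = make_long_speaker_id(0, 4999, 0)
segment_b = make_long_speaker_id(5000, 9999, 0)
```

Both ids end in the same local speaker number, yet they differ as whole strings, which is why the prefixed form is safe to compare across segments.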

Because the SPEAKER_ID is sometimes valid and sometimes not, which causes confusion, we've decided that moving forward we will only use the full <start>-<stop>-# format to represent speaker ids. Specifically, we're dropping LONG_SPEAKER_ID from the JSON output object and repurposing the existing SPEAKER_ID to use the long <start>-<stop>-# format.
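For anyone consuming the JSON output, the change can be pictured with the sketch below. It is only an illustration: the flat property dictionaries, the string-valued fields, and the helper name read_speaker_id are assumptions, since this issue does not show the actual structure of the output object.

```python
def read_speaker_id(props: dict) -> str:
    """Return the segment-qualified speaker id from old- or new-style properties."""
    # Old style (assumed shape): LONG_SPEAKER_ID carries the full id and
    # SPEAKER_ID may be "0", meaning "do not use".
    if "LONG_SPEAKER_ID" in props:
        return props["LONG_SPEAKER_ID"]
    # New style (assumed shape): SPEAKER_ID itself holds <start>-<stop>-#.
    return props["SPEAKER_ID"]


# Hypothetical detection properties before and after the change.
old_props = {"SPEAKER_ID": "0", "LONG_SPEAKER_ID": "5000-9999-2"}
new_props = {"SPEAKER_ID": "5000-9999-2"}
```

A reader written this way returns the same id for both shapes, which may help downstream tools that need to handle output from before and after the release that includes this change.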


cdglasz · 2023-02-20T18:24:55Z

This will be pushed to the next major release. All speech components have temporary logic to rename SPEAKER_ID to LONG_SPEAKER_ID and overwrite SPEAKER_ID with 0. Completing the change only requires three steps: remove this temporary logic, update the unit tests to reference SPEAKER_ID rather than LONG_SPEAKER_ID, and alter openmpf-python-component-sdk/detection/component_util/mpf_component_util/job_config.py to set speaker_id according to SPEAKER_ID rather than LONG_SPEAKER_ID.

jrobble added the enhancement label Feb 10, 2023

jrobble added this to the Milestone 3 milestone Feb 10, 2023

jrobble assigned cdglasz Feb 10, 2023

jrobble added this to To do in OpenMPF: Development via automation Feb 10, 2023

jrobble moved this from To do to Planned in OpenMPF: Development Feb 10, 2023

This was referenced Feb 14, 2023

Update SPEAKER_ID logic, set LONG_SPEAKER_ID=0 openmpf/openmpf-components#321

Merged

Remove logic for overwriting speaker IDs openmpf/openmpf-python-component-sdk#72

Merged

cdglasz moved this from Planned to In Progress in OpenMPF: Development Feb 17, 2023

jrobble added the breaks compatibility label Feb 21, 2023

This was referenced Feb 21, 2023

Remove logic for overwriting speaker IDs (#72) openmpf/openmpf-python-component-sdk#73

Merged

Update SPEAKER_ID logic, remove LONG_SPEAKER_ID (#321) openmpf/openmpf-components#322

Merged

cdglasz mentioned this issue Mar 6, 2023

Replace LONG_SPEAKER_ID with SPEAKER_ID openmpf/openmpf-components#325

Closed

jrobble mentioned this issue Apr 24, 2023

Update SPEAKER_ID logic, set LONG_SPEAKER_ID=0 #1674

Closed

brosenberg42 mentioned this issue Jul 12, 2023

Feat/wfm trigger openmpf/openmpf-components#336

Merged

jrobble closed this as completed Nov 29, 2023

OpenMPF: Development automation moved this from In Progress to Closed Nov 29, 2023
