Having both LONG_SPEAKER_ID and SPEAKER_ID is a little confusing to end users. The format of LONG_SPEAKER_ID is <start>-<stop>-# where "start" and "stop" refer to the frames or times (in ms) of a media segment. (Currently, only video files are segmented by the Workflow Manager.) The format of SPEAKER_ID is just the # part of that.
The LONG_SPEAKER_ID is always set by speech-to-text components. The SPEAKER_ID, on the other hand, is sometimes set to 0 to indicate that it should not be used and that the LONG_SPEAKER_ID should be used instead. This happens when a video is segmented. Since each segment is processed independently, speakers can only be identified relative to their own segment. For example, a speaker with id 0 in segment A may not be the same person as the speaker with id 0 in segment B. The <start>-<stop>- prefix ensures that each of these speakers has a unique id.
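To illustrate why the prefix makes ids unique across segments, here is a minimal Python sketch. The helper names `make_long_speaker_id` and `parse_long_speaker_id` are hypothetical and are not part of the component SDK; the sketch assumes only the <start>-<stop>-# format described above, with non-negative integer start and stop values.

```python
from typing import Tuple


def make_long_speaker_id(start: int, stop: int, local_id: int) -> str:
    """Build an id in the <start>-<stop>-# format (hypothetical helper)."""
    return f"{start}-{stop}-{local_id}"


def parse_long_speaker_id(long_id: str) -> Tuple[int, int, int]:
    """Split a <start>-<stop>-# id back into its three parts.

    Assumes all three parts are non-negative integers, so "-" only
    appears as a separator.
    """
    start, stop, local_id = long_id.split("-")
    return int(start), int(stop), int(local_id)


# Two segments that each contain a speaker with segment-local id 0.
speaker_in_a = make_long_speaker_id(0, 999, 0)
speaker_in_b = make_long_speaker_id(1000, 1999, 0)

# The local ids collide, but the prefixed ids do not.
print(speaker_in_a, speaker_in_b)
```

The segment boundaries (0-999 and 1000-1999) are made-up values chosen for illustration.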
Because the SPEAKER_ID is valid in some cases and not in others, which causes confusion, we've decided that moving forward we will only use the full <start>-<stop>-# format to represent speaker ids. Specifically, we're dropping LONG_SPEAKER_ID from the JSON output object and re-purposing the existing SPEAKER_ID to use the long <start>-<stop>-# format.
This will be pushed to the next major release. All speech components currently have temporary logic that renames SPEAKER_ID to LONG_SPEAKER_ID and overwrites SPEAKER_ID with 0. To make the change, we only need to remove this temporary logic, update the unit tests to reference SPEAKER_ID rather than LONG_SPEAKER_ID, and alter openmpf-python-component-sdk/detection/component_util/mpf_component_util/job_config.py to set speaker_id according to SPEAKER_ID rather than LONG_SPEAKER_ID.
Related to #1674.
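As a rough sketch of the before-and-after behavior described in this comment, assuming detection properties are held in a plain string-to-string dictionary: the function names below are hypothetical and do not correspond to actual code in the speech components.

```python
from typing import Dict


def apply_temporary_logic(props: Dict[str, str]) -> Dict[str, str]:
    """Temporary behavior as described above (hypothetical sketch):
    the long-format value moves to LONG_SPEAKER_ID and SPEAKER_ID
    is overwritten with 0."""
    updated = dict(props)
    updated["LONG_SPEAKER_ID"] = updated["SPEAKER_ID"]
    updated["SPEAKER_ID"] = "0"
    return updated


def apply_planned_logic(props: Dict[str, str]) -> Dict[str, str]:
    """Planned behavior: SPEAKER_ID keeps the <start>-<stop>-# value
    and no LONG_SPEAKER_ID property is produced."""
    return dict(props)


# Example properties; the values are made up for illustration.
props = {"SPEAKER_ID": "0-999-1", "TRANSCRIPT": "hello"}

current = apply_temporary_logic(props)
planned = apply_planned_logic(props)
print(current)
print(planned)
```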