Currently, ExoPlayer's TextRenderer/TextOutput infrastructure is mostly unhelpful for music lyric formats. Lyrics are typically presented by showing all lines on screen at once and highlighting the active one. With the current design, a TextOutput only receives the currently active cues and knows nothing about future ones, so such a view cannot be implemented on top of it.
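`void onCues(CueGroup cueGroup);` (media/libraries/exoplayer/src/main/java/androidx/media3/exoplayer/text/TextOutput.java, line 44 in 630c1af)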
Also, the Cue class has various properties that do not make sense for lyrics, such as position.
`public final float position;` (media/libraries/common/src/main/java/androidx/media3/common/text/Cue.java, line 258 in 630c1af)
I'm mostly throwing the idea out there: it would be nice to have a text infrastructure in ExoPlayer that supports music lyrics as well. Up to the renderer level, the requirements are mostly similar. Lyrics formats are often a map from presentation timestamp to text line, and formats that already have parsers in ExoPlayer, such as SRT or TTML, are sometimes used for sidecar lyrics files or even embedded wholesale into music tags.
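To illustrate the "map from presentation timestamp to text line" shape, here is a minimal sketch of parsing simple LRC text. `LrcSketch` and its `parse` method are hypothetical names made up for this example and are not part of ExoPlayer. The sketch only handles one `[mm:ss.xx]` timestamp per line and ignores everything else.

```java
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of ExoPlayer: parses simple LRC text into a timestamp-to-line map. */
final class LrcSketch {
  // Matches "[mm:ss.xx]text"; the fractional part is optional and may have 1 to 3 digits.
  private static final Pattern LINE =
      Pattern.compile("^\\[(\\d+):(\\d{2})(?:\\.(\\d{1,3}))?\\](.*)$");

  /** Returns lyric lines keyed by presentation time in milliseconds, sorted by time. */
  static TreeMap<Long, String> parse(String lrc) {
    TreeMap<Long, String> lines = new TreeMap<>();
    for (String raw : lrc.split("\\r?\\n")) {
      Matcher m = LINE.matcher(raw.trim());
      if (!m.matches()) {
        continue; // Skip metadata tags such as [ar:...] and malformed lines.
      }
      long minutes = Long.parseLong(m.group(1));
      long seconds = Long.parseLong(m.group(2));
      String fraction = m.group(3) == null ? "" : m.group(3);
      // Right-pad the fraction to three digits so ".5" and ".50" both mean 500 ms.
      long millis = Long.parseLong((fraction + "000").substring(0, 3));
      lines.put((minutes * 60 + seconds) * 1000 + millis, m.group(4).trim());
    }
    return lines;
  }
}
```

A sorted map like this already gives a lyrics view everything it needs to lay out all lines up front, which is exactly the information a TextOutput does not receive today.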
Summarizing the unique needs of lyrics:
- Past and future cues are displayed alongside the active one.
- Because there is no video, position metadata doesn't make sense.
- Some lyrics formats such as TTML can carry information about who is singing or speaking a specific line, which views can use, for example, to align text to the opposite side.
- Some lyrics formats support syllable-level synchronization, which views can use to display a gradient for karaoke-style singing along.
- As with subtitles, there are text-based formats such as LRC, TTML or ID3's SYLT tag, and bitmap-based formats such as https://en.wikipedia.org/wiki/MP3%2BG. However, the bitmap-based formats are closer to a video codec than to a text file and would thus be out of scope for TextRenderer (examples of a bitmap-based format: https://www.youtube.com/watch?v=gmYPQq7JcEg or https://www.youtube.com/watch?v=k35mUJkZNRM).
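To make the needs above a bit more concrete, here is a rough sketch of what a lyrics-oriented model could look like. Everything in it (`LyricsSketch`, `Line`, `Syllable`, `activeLineIndex`) is a hypothetical name invented for this example and not an existing or proposed ExoPlayer API. The idea is that the UI receives the whole track at once, with an optional speaker and optional syllable timings per line and no position metadata, and then looks up the active line for the current playback position.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical model, not an ExoPlayer API: a full lyrics track handed to the UI at once. */
final class LyricsSketch {
  /** One syllable (or word) with its own start time, for karaoke-style highlighting. */
  static final class Syllable {
    final long startTimeUs;
    final String text;

    Syllable(long startTimeUs, String text) {
      this.startTimeUs = startTimeUs;
      this.text = text;
    }
  }

  /** One lyric line; speaker may be null and syllables may be empty if the format lacks them. */
  static final class Line {
    final long startTimeUs;
    final String text;
    final String speaker;
    final List<Syllable> syllables;

    Line(long startTimeUs, String text, String speaker, List<Syllable> syllables) {
      this.startTimeUs = startTimeUs;
      this.text = text;
      this.speaker = speaker;
      this.syllables = Collections.unmodifiableList(new ArrayList<>(syllables));
    }
  }

  /** All lines, past and future, sorted by start time. */
  final List<Line> lines;

  LyricsSketch(List<Line> lines) {
    List<Line> sorted = new ArrayList<>(lines);
    sorted.sort((a, b) -> Long.compare(a.startTimeUs, b.startTimeUs));
    this.lines = Collections.unmodifiableList(sorted);
  }

  /** Returns the index of the last line starting at or before positionUs, or -1 if none has started. */
  int activeLineIndex(long positionUs) {
    int lo = 0;
    int hi = lines.size() - 1;
    int result = -1;
    while (lo <= hi) { // Binary search over the sorted start times.
      int mid = (lo + hi) >>> 1;
      if (lines.get(mid).startTimeUs <= positionUs) {
        result = mid;
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    return result;
  }
}
```

A view could render every entry in `lines` and call `activeLineIndex` with the current player position on each update to decide which line to highlight.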
Would this be something the ExoPlayer team is interested in supporting?