Support for the Language tag #1377

Open
18 tasks
certuna opened this issue Sep 28, 2021 · 12 comments
Labels
enhancement go Go code javascript Javascript code

Comments

@certuna
Contributor

certuna commented Sep 28, 2021

This is a placeholder Issue for implementation of support for the Language tag (TLAN in id3, LANGUAGE in Vorbis Comments, etc). It is defined in the id3v2 tag specification as an ISO 639-2 code ("eng", "spa").

Edited to reflect the latest state of discussions on how to implement this.

Things to consider:

  • what to do with multi-language songs & albums. If we support multiple values properly, we'll have to set up a new table and m2m relationships in the database and expose language as an array in the API, just like with Genres. This may be more complex than it's worth. One idea is to initially implement Language as a single-value tag: MusicBrainz Picard, for example, only assigns one language per song, and if there are more it tags mul ("multiple languages")
  • when rolling up Language to the album level, the single-value approach becomes more problematic, since there are way more multi-language albums around than multi-language songs. The MusicBrainz database rolls up an album with 10 songs in eng and 1 song in spa to 1 album language mul. This is...probably not what we want.
  • another case is an album with 1 song in eng and 10 songs without lyrics zxx (="no language"), we probably want the album to be eng.
  • Language cannot be exposed to Subsonic clients since it's not in the API specs
  • We could break 1NF and implement it like Navidrome does with the all_artist_ids field: store multiple languages in one field (language = "eng fra spa"), use LIKE in the sql query, and split the string client-side in the WebUI.
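A minimal Go sketch of the single-field idea from the last bullet, under the assumption that every stored code is exactly three letters and codes are separated by single spaces. The helper names (joinLanguages, splitLanguages, likePattern) are hypothetical and not existing Navidrome functions.

```go
package main

import (
	"fmt"
	"strings"
)

// joinLanguages packs language codes into one space-separated field,
// e.g. "eng fra spa". Hypothetical helper, for illustration only.
func joinLanguages(codes []string) string {
	return strings.Join(codes, " ")
}

// splitLanguages is the inverse operation, as a client could do before
// rendering the values.
func splitLanguages(field string) []string {
	return strings.Fields(field)
}

// likePattern builds the argument for a query such as
// `WHERE language LIKE ?`. Because every code is assumed to be exactly
// three letters and codes are separated by spaces, a substring match
// cannot straddle two neighbouring codes.
func likePattern(code string) string {
	return "%" + code + "%"
}

func main() {
	field := joinLanguages([]string{"eng", "fra", "spa"})
	fmt.Println(field)
	fmt.Println(likePattern("fra"))
	fmt.Println(splitLanguages(field))
}
```

The trade-off is that a leading-wildcard LIKE cannot use a regular index, so this is only attractive while the column is mostly informational.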

Milestone 1: single-value support only for songs

Serverside:

  • read the tag & sanitize the string to drop all non-ISO 639-2 values
  • add a language column in the media_file table
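The sanitizing step could look roughly like the sketch below. It assumes a lookup set of valid codes; the knownCodes map here is a tiny illustrative subset, not the full ISO 639-2 list, and sanitizeLanguage is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// knownCodes is a small illustrative subset of ISO 639-2 codes.
// A real implementation would need the complete list.
var knownCodes = map[string]bool{
	"eng": true, "spa": true, "fra": true, "por": true,
	"mul": true, // multiple languages
	"zxx": true, // no linguistic content
}

// sanitizeLanguage normalizes a raw tag value and returns the code,
// or "" when the value is not a recognized code.
func sanitizeLanguage(raw string) string {
	code := strings.ToLower(strings.TrimSpace(raw))
	if knownCodes[code] {
		return code
	}
	return ""
}

func main() {
	for _, raw := range []string{"eng", " SPA ", "English", ""} {
		fmt.Printf("%q -> %q\n", raw, sanitizeLanguage(raw))
	}
}
```

Returning an empty string for unknown values keeps the language column free of free-text entries such as "English", which the filter box could not match reliably.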

Clientside (Web UI):

  • use the Intl.DisplayNames JavaScript API to convert the ISO 639-2 codes to localized language names
  • add a SongList column for Language, and allow the column to be toggled in the Album tracklist, Playlist and Songs views
  • add a filter box in the Songs and Playlist views

Milestone 2: single-value roll-up to album and artist

Serverside:

  • add a language column in the album and artist table
  • roll up the language from songs to album and from songs to artist (using MostFrequent?)
  • add language to the album and artist API endpoint (as arrays?)
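One possible shape for the roll-up, as a sketch only: pick the most frequent song language, ignoring untagged songs and zxx, so that an album with one eng song and ten instrumentals still rolls up to eng. The function name rollUpLanguage and the alphabetical tie-break are assumptions made for this example; it does not use Navidrome's own helpers.

```go
package main

import "fmt"

// rollUpLanguage returns the most common language among the songs.
// Untagged songs ("") and instrumentals ("zxx") are ignored unless
// nothing else is present. Ties are broken alphabetically so the
// result is deterministic (an assumption for this sketch).
func rollUpLanguage(songLangs []string) string {
	counts := map[string]int{}
	sawInstrumental := false
	for _, l := range songLangs {
		switch l {
		case "":
			// untagged song: no vote
		case "zxx":
			sawInstrumental = true
		default:
			counts[l]++
		}
	}
	best, bestN := "", 0
	for l, n := range counts {
		if n > bestN || (n == bestN && l < best) {
			best, bestN = l, n
		}
	}
	if best == "" && sawInstrumental {
		return "zxx"
	}
	return best
}

func main() {
	album := []string{"eng", "zxx", "zxx", "zxx", ""}
	fmt.Println(rollUpLanguage(album))
}
```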

Clientside (Web UI):

  • use the Intl.DisplayNames JavaScript API to convert the ISO 639-2 codes to localized language names
  • add AlbumList and ArtistList columns for Language
  • add a filter box in the Albums and Artists views

Milestone 3: multi-value support for songs, albums and artists

Serverside:

  • m2m tables: language, album-language and song-language
  • remove language column from the media_file table
  • read the tag & populate the table/m2m relations
  • add language to the album API endpoint (multi-valued)

Clientside (Web UI):

  • implement Language like Genres in the Songlists
  • add Language in the Album view
  • add a filter box in the Albums views
@deluan
Member

deluan commented Sep 28, 2021

Hey @certuna, what would be the usage of this tag(s)? Just information, or any kind of filtering on the UI? If it is just informational, we can have one text field to store multiple values, and parse it into an array when sending it to the UI

@certuna
Contributor Author

certuna commented Sep 28, 2021

I was envisioning we'd have a similar dropdown filter as we have now for Genre, so you could filter in the Song or Album view for language="Spanish" & genre = "Hip-Hop" etc.

I agree that if we do this without the filter option and just show it in the Song list as an informative column, you could store multiple languages as a comma separated string in the media_file table, like you say.

@metalheim
Contributor

metalheim commented Oct 4, 2021

I would find this very useful for smart playlists in the long run, e.g.:

Songs with 'genre=Hip-Hop' & 'lang=fra'
To find French Hip-Hop

That said, I don't have my song languages tagged as multi-valued.
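As a rough illustration of that kind of rule, here is a small Go sketch that filters an in-memory list by genre and language. The Song type and filterSongs function are hypothetical and do not reflect how Navidrome evaluates smart playlists.

```go
package main

import (
	"fmt"
	"strings"
)

// Song is a hypothetical, simplified record for this example.
type Song struct {
	Title    string
	Genre    string
	Language string // single ISO 639-2 code, e.g. "fra"
}

// filterSongs keeps songs whose genre matches case-insensitively and
// whose language code matches exactly.
func filterSongs(songs []Song, genre, lang string) []Song {
	var out []Song
	for _, s := range songs {
		if strings.EqualFold(s.Genre, genre) && s.Language == lang {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	library := []Song{
		{Title: "A", Genre: "Hip-Hop", Language: "fra"},
		{Title: "B", Genre: "Hip-Hop", Language: "eng"},
		{Title: "C", Genre: "Rock", Language: "fra"},
	}
	for _, s := range filterSongs(library, "hip-hop", "fra") {
		fmt.Println(s.Title)
	}
}
```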

@deluan
Member

deluan commented Oct 4, 2021

I'm actually not sure if we need to support multiple languages. The use case I see is what @metalheim just said above: to filter songs by language in a smart playlist (or even in a UI filter). For this, IMHO, what really matters is the main language, right?

@certuna
Contributor Author

certuna commented Oct 11, 2021

Many songs exist with two or more languages (is Don Omar & Lucenzo's "Danza Kuduro" a Portuguese or Spanish language song?), but I think it's fine to implement this initially as single-valued instead of multi-valued. We can always make it multi-valued later, and check if the performance/complexity hit is worth it.

Edit: we should probably make language an array in the song API from the start, even if it initially always has 1 element.
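A sketch of what "an array from the start" could mean on the Go side, assuming a JSON response struct. The type songResponse, its field names, and the helper functions are illustrative assumptions, not the actual Navidrome API model.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// songResponse is a hypothetical API shape where language is always
// an array, even though it holds at most one element for now.
type songResponse struct {
	Title    string   `json:"title"`
	Language []string `json:"language"`
}

// newSongResponse wraps a single stored code in a slice. An untagged
// song gets an empty, non-nil slice so it encodes as [] and not null.
func newSongResponse(title, language string) songResponse {
	langs := []string{}
	if language != "" {
		langs = append(langs, language)
	}
	return songResponse{Title: title, Language: langs}
}

// encodeSong returns the JSON encoding, or "" on error.
func encodeSong(s songResponse) string {
	b, err := json.Marshal(s)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encodeSong(newSongResponse("Example Song", "spa")))
	fmt.Println(encodeSong(newSongResponse("Untagged Song", "")))
}
```

With this shape, a later move to real multi-value storage would only change how the slice is filled, not what clients receive.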

@deluan deluan added go Go code javascript Javascript code labels Oct 13, 2021
@upsuper
Contributor

upsuper commented Feb 13, 2023

In addition to showing the language embedded in the metadata, this information may also be critical for assigning the right lang attribute when rendering the song information (e.g. title) in the UI. Most of the problems I mentioned in #2174 are relevant here as well.

@isle9

isle9 commented Jun 23, 2023

Are there any plans to implement milestone 1 on the server side in the near future?

@certuna
Contributor Author

certuna commented Jun 28, 2023

I'm currently working on multiple artist support, but once that's done support for Key and Language are next.

@metalheim
Contributor

There is an excellent Go library/database that can be used to look up and sanitize language strings (drop all non-ISO 639-2 values):
https://github.com/barbashov/iso639-3
It gets its data from the official SIL.org source for language standardization.
Despite its name, it also supports ISO 639-2:

https://github.com/barbashov/iso639-3/blob/1f4ffb2d8d1cb8137d29ac6deb9e28b7e216b767/iso6393.go#L47

@certuna
Contributor Author

certuna commented Jul 6, 2023

That's cool, definitely useful.

ISO 639-3 is the current standard, but id3v2.4 was published over twenty years ago when 639-2 was the latest and greatest so I guess that's what I'll use. As usual, Vorbis has nothing standardized.

I'm now looking into the Intl.DisplayNames specs for the Web UI, it seems that it can use both ISO 639-1 two-letter codes ("en") and 639-2 three-letter codes ("eng") so it looks like it's OK to sanitize 639-2 tags & store them in the database. Then 639-3 is not needed.

@metalheim metalheim mentioned this issue Sep 16, 2023
25 tasks
@certuna
Contributor Author

certuna commented Sep 17, 2023

By the way, I just found out that in the Musicbrainz DB, language is not a multi-valued property: a song can have only one language, or the tag "multiple languages" ("mul"), or "no language" aka instrumental ("zxx"). We can do the same in Navidrome, it's probably best to aggregate to "most common" language of the album instead of making an album "multi-language" as soon as there's even one song with another language.

@deluan
Member

deluan commented Dec 28, 2023

By the way, I just found out that in the Musicbrainz DB, language is not a multi-valued property: a song can have only one language, or the tag "multiple languages" ("mul"), or "no language" aka instrumental ("zxx"). We can do the same in Navidrome, it's probably best to aggregate to "most common" language of the album instead of making an album "multi-language" as soon as there's even one song with another language.

Agreed. And what should we do if some songs have the tag with the same language, but a few don't have the tag at all? Go with the majority? Also, if one is instrumental (zxx) and the rest are, for example, eng, we should use eng, right?

Ok, after reading your comment again, I realized my questions were already answered:

it's probably best to aggregate to "most common" language of the album instead of making an album "multi-language"

5 participants