Kodi will not add video with Unicode > 0x10000 in filename to mysql/mariadb library #23312

Open
1 of 7 tasks
scott967 opened this issue May 20, 2023 · 0 comments
Labels
Triage: Needed (managed by bot!) issue that was just created and needs someone looking at it

Comments

@scott967
Contributor

Bug report

Describe the bug

Here is a clear and concise description of what the problem is:

Video files with emoji characters in the filename are not added to the video library when scanned if a MySQL or MariaDB database is used. The files are added when the default SQLite database is used.

Note: the music database was not tested but probably has a similar issue.

Expected Behavior

Here is a clear and concise description of what was expected to happen:

Files should be scanned and their local or scraped info added to the MySQL/MariaDB database.

Actual Behavior

When scanning, Kodi logs an error

error <general>: SQL: [MyVideos121] Undefined MySQL error: Code (1366)
Query: INSERT INTO files (idFile, idPath, strFileName, playCount, lastPlayed, dateAdded) VALUES(NULL, 5817, '🏝댄스 커버🏝 WJSN - Boogie Up │우주소녀 - 부기업 COVER DANCE 안무 영상(1).mp4', NULL, NULL, '2020-03-30 08:08:13')
                                                   
error <general>: CVideoDatabase::AddFile unable to addfile ()

Possible Fix

Discovered while moving from SQLite to MariaDB. While investigating, I found issue #16328, which seems to be inactive but likely describes the root cause: the MySQL/MariaDB databases need to be created with utf8mb4_general_ci collation, not utf8_general_ci (utf8 being an alias for utf8mb3). utf8mb3 stores at most 3 bytes per character, so 4-byte UTF-8 data can't be inserted into the tables. Note that 4-byte UTF-8 characters cover more than just emojis: they include everything outside the Unicode BMP.
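
To make the 3-byte limit concrete, here is a minimal sketch of how one could check whether a filename contains a 4-byte UTF-8 sequence. `HasFourByteUtf8` is a hypothetical helper written for this report, not a Kodi function, and it assumes the input is already valid UTF-8.

```cpp
#include <string>

// Hypothetical helper (not part of Kodi): returns true if the string
// contains at least one 4-byte UTF-8 sequence, i.e. a code point above
// U+FFFF (outside the BMP). Assumes valid UTF-8; only lead bytes are checked.
bool HasFourByteUtf8(const std::string& s)
{
  for (unsigned char c : s)
  {
    // A 4-byte sequence starts with a lead byte of the form 11110xxx.
    if ((c & 0xF8) == 0xF0)
      return true;
  }
  return false;
}
```

For the filename in this report, the leading 🏝 (U+1F3DD, bytes F0 9F 8F 9D) is what trips the check. The Hangul characters are 3-byte sequences and fit in utf8mb3 on their own.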

I tried a naive fix by just changing

        snprintf(sqlcmd, sizeof(sqlcmd),
                 "CREATE DATABASE `%s` CHARACTER SET utf8 COLLATE utf8mb4_general_ci", db.c_str());

in the connect function of dbwrappers/mysqldataset.cpp, but that failed with error 1250.
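
One thing that stands out in the statement above is that it names CHARACTER SET utf8 together with a utf8mb4 collation. MySQL and MariaDB require a collation to belong to the character set it is paired with, so this mismatch may be part of why the attempt failed. A minimal sketch of a statement with a matching pair follows. `BuildCreateDatabaseSql` is a hypothetical helper for illustration, and this has not been tested against Kodi.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not Kodi code): builds a CREATE DATABASE statement
// whose character set and collation match.
std::string BuildCreateDatabaseSql(const std::string& db)
{
  char sqlcmd[512];
  // utf8mb4_general_ci is a collation of the utf8mb4 character set,
  // so the character set is changed to utf8mb4 as well.
  snprintf(sqlcmd, sizeof(sqlcmd),
           "CREATE DATABASE `%s` CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci", db.c_str());
  return sqlcmd;
}
```

Even if this statement is accepted, it would probably not be a complete fix: the connection character set and databases that already exist would presumably also need to move to utf8mb4.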

See also the MySQL 8.0 Reference Manual on the deprecation of utf8/utf8mb3.

To Reproduce

Steps to reproduce the behavior:

  1. Scan a video file into the library (in my case a music video with a local NFO) whose filename contains characters outside the Unicode BMP (in this case U+1F3DD): 🏝댄스 커버🏝 WJSN - Boogie Up │우주소녀 - 부기업 COVER DANCE 안무 영상(1).mp4

Debuglog

The debuglog can be found here:

log

Screenshots

Here are some links or screenshots to help explain the problem:

Additional context or screenshots (if appropriate)

Here is some additional context or explanation that might help:

Your Environment

Used Operating system:

  • Android

  • iOS

  • tvOS

  • Linux

  • macOS

  • Windows

  • Windows UWP

  • Operating system version/name: Win 10 x64 22H2

  • Kodi version: 20.1.0 x64

note: Once the issue is made we require you to update it with new information or Kodi versions should that be required.
Team Kodi will consider your problem report however, we will not make any promises the problem will be solved.

@xbmc-gh-bot xbmc-gh-bot bot added the Triage: Needed (managed by bot!) issue that was just created and needs someone looking at it label May 20, 2023