Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] External Chapters #1257

Merged
merged 5 commits into from
Feb 15, 2022
Merged

Conversation

j2oc
Copy link

@j2oc j2oc commented Aug 20, 2021

Presentation

This feature allows the user to add chapters to any (except internet based) video file or override the embedded chapters in any video file that possess them.

Motivation

This is very useful to me and can be for others (I surmise). There can be many advantages to this but instead of being too exhaustive (and exhausting) some of them can be gleaned from here.

Definition

To achieve the described functionality a text (UTF-8 or ANSI) file with the same file stem of the video file but with the .XCHP file extension should be placed in the same directory of the corresponding video file.

The .XCHP file should have multiple lines (one for each chapter) and each line should have 12 characters defining the Chapter Reference Time, in the form: HH:MM:SS.DDD (which is the same used in MPC and referred in the GUI as High Precision). If a line does not adhere to that format it'll be ignored.

Any characters after 12th position in the line are considered to be the Chapter Name. It can be omitted (empty) though. Any leading and/or trailing whitespace of the Chapter Name will be trimmed.

An example of a possible .XCHP file follows:

00:00:00.000 Exposition
00:41:34.744 Climax
01:18:12:002 Denouement

Implementation notes

I've tried to (re)use existing code where feasible but with the smallest footprint possible. However there is some redundancy in the implementation. The cost of avoiding it was modifying well established code.

One clear case of this was the String to Reference Time conversion function (StringToReftime) that I had to duplicate almost verbatim inside the new SetupExternalChapters member because the internal time format was different. So here is a clear opportunity for a future refactoring.

@ale5000-git
Copy link

ale5000-git commented Aug 31, 2021

To be more future proof I suggest to add an obligatory field in the first line that contain the codepage.
For UTF-8 maybe just UTF-8 text but for ANSI the exact codepage.
Example CP=850 or CP=UTF-8

I already imagine when million of chapters files start circulating on the internet in different encodings and player struggling to decode them without any info.

PS: It would also be nice to have "Chapter" afer "Open" and "Dub" in "Open File/URL" dialog (that works also for streamings).

@ale5000-git
Copy link

I also suggest to add 3 specifics chapter names that can be later used for this: #1221
These:

  • Opening
  • Ending
  • ADS

Later maybe they can be displayed in red (or different color) and maybe can also be added an option for auto-skipping.

@clsid2
Copy link
Owner

clsid2 commented Aug 31, 2021

To be more future proof I suggest to add an obligatory field in the first line that contain the codepage.

No. This seems to be a new format specification. So just require it to be UTF8 for non-ascii stuff. If users do something else it is their fault.

@ArcticGems
Copy link

To be more future proof I suggest to add an obligatory field in the first line that contain the codepage.

No. This seems to be a new format specification. So just require it to be UTF8 for non-ascii stuff. If users do something else it is their fault.

Maybe give a error message saying that user should save the file in UTF-8.

@ale5000-git
Copy link

ale5000-git commented Aug 31, 2021

The error message is good also otherwise there will be many reports of users that get text not displayed correctly.
Also I think that it will be good to strictly require that the file start with the UTF-8 BOM (not just UTF-8 text inside) because if edited with some text editors without UTF-8 BOM the file may be corrupted.

@j2oc
Copy link
Author

j2oc commented Sep 1, 2021

To be more future proof I suggest to add an obligatory field in the first line that contain the codepage.

No. This seems to be a new format specification. So just require it to be UTF8 for non-ascii stuff. If users do something else it is their fault.

I second this. Since the feature allows for a user-originated override it should fall on that same user the responsibility to ensure the proper format (be it the encoding and/or the Chapter Reference Time) when saving the .XCHP file.

The main rationale(s) behind this feature was/were to be as simple as possible (including the format definition itself) and to leverage the support for Chapters that was already there. What this implementation does is to add an extra (this time, external to the video file itself) source for the Chapters' info to be loaded into the player (and the first to be attempted, hence the -- inherent -- override capability). There is no error, per se, because if no valid line(s) from a corresponding .XCHP file are found (for whatever reason) it just skips to the next possible (internal) Chapters' info source. There is no error because any invalid (or non-valid) line(s) will be silently discarded/ignored.

As for the encoding, UTF-8 should be the de facto standard for any text based storage. The UTF-8 Everywhere Manifesto makes a great case for why this should be, so I won't. That is why I think that adding anything like as a codepage definition is redundant, unnecessary and detracting from the intended simplicity. In much the same way adding a BOM is also redundant and unnecessary and a bad idea (the Unicode Standard states that it's neither required nor recommended; see here.)

Finally, a quick reference to Specially Named Chapters. Since this implementation only adds an extra (possible) source for Chapters' info, anything related to Specially Named Chapters would (should) have to go (in my opinion) in the layer above (when the Chapters' info is already loaded and ready to be used) and would be valid to all Chapters' info sources, because, once the info is loaded the player doesn't care anymore about its origin/source. In short, Specially Named Chapters has nothing do to with loading info (which the External Chapters implementation addresses) but everything to do with adding new/extra functionalities to the current Chapters capabilities across the player (and so, IMO, out of the scope of this implementation).

@ale5000-git
Copy link

ale5000-git commented Sep 2, 2021

I had some experience with subrip subtitles and trust me every thing you leave "free" to choose will be taken wrong.
De facto standard is very relative because there are infinite configuration also inside a single OS but in addition the file can be edited in Windows / Linux / macOS, the default of notepad of Windows 7 is ANSI, and I know there are other text editors that do it wrong.

I had an idea to create a site like opensubtitles but for chapters and don't check the encoding is like asking for problems.
Maybe for purist BOM is bad but from programmer point of view have a certainty of the encoding require less guessing.

For files with BOM the text editor just read it and already know the encoding.
For files without BOM the text editor should look all the text and guess so it is slow.
You just considered the player but you don't consider the people that will actually create the chapter files.

@clsid2
Copy link
Owner

clsid2 commented Sep 2, 2021

ANSI works totally fine unless your codepage is different than that of the creator. When distributing to others, you must use UTF8. Period. Discussing it is pointless.

@ale5000-git
Copy link

ale5000-git commented Sep 2, 2021

The discussion about ANSI was already ended, I was just talking about UTF-8 with BOM vs UTF-8 without BOM.

@j2oc: If you save a file with UTF-8 without BOM and the file doesn't contain any special char outside ANSI then the next time you open it the text editor (it obviously depend from the text editor) may open it as ANSI and if you later insert for example an accented character it is inserted as ANSI.
UTF-8 BOM prevent this, but if the player doesn't support BOM then people will end before or after with an ANSI file.

@clsid2 clsid2 merged commit 7193a35 into clsid2:develop Feb 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants