Add support for latin encondig according to Brazilian Digital Television System (SBTVD / ISDB-Tb). #16

Open
wants to merge 6 commits into
base: master

@andrelcm andrelcm commented Oct 18, 2020

Dear maintainers,

This pull request adds support for the Latin encoding used by the Brazilian Digital Television System (SBTVD / ISDB-Tb), according to the official standard ABNT 15606-1 (2013). It is related to issue #8.
The standard is written in Portuguese, so I will try to translate the relevant parts.

11.4 Character encoding
11.4.1 8-bit character codes
Character encoding using 8 bits must comply with ARIB STD-B5 and the technique described in ARIB STD-B24:2007, volume 1, subsection 7.1, with the adaptations needed to include Latin characters, as follows.
The coding structure used by SBTVD must comply with the technique described in ARIB STD-B24:2007, volume 1, part 2, subsection 7.1.1.1, with the following changes:
a) inclusion of the “latin extension” character codes in the GP character codes. Table 13 presents the “latin extension” character codes and Table 9 presents the special codes for the GP character codes;
b) changing the initial state of the GL page to “alphanumeric” and the initial state of the GR page to “latin extension” (see Figure 6). Invocation and designation methods should not be used in the Brazilian broadcasting system;
c) classification of the code sets and final bytes according to Table 15;
d) inclusion of the graphic set of Latin characters (latin extension) and special characters according to Table 15.
NOTE 1 Table 13 was adapted from ISO / IEC 8859-15: 1999.
NOTE 2 Table 15 presents the modified excerpt from Table 7-3 of ARIB STD-B24: 2007 for SBTVD.
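As a rough illustration of item b), the sketch below classifies an 8-bit code into the GL or GR area, assuming the usual 94-character layout (0x21 to 0x7E for GL, 0xA1 to 0xFE for GR). The names `sbtvd_area_t` and `sbtvd_classify_byte` are hypothetical and are not part of the pull request, and the actual glyph mapping of Table 13 is not reproduced here.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
typedef enum {
    SBTVD_AREA_OTHER, /* control codes, SP, DEL, 0xA0, 0xFF */
    SBTVD_AREA_GL,    /* 0x21..0x7E: "alphanumeric" in the initial state */
    SBTVD_AREA_GR     /* 0xA1..0xFE: "latin extension" in the initial state */
} sbtvd_area_t;

/*
 * Classify one byte and store its 7-bit code (0x21..0x7E) in *index.
 * *index is set to 0 when the byte is not a graphic code.
 */
static sbtvd_area_t sbtvd_classify_byte(uint8_t byte, uint8_t *index)
{
    uint8_t low = byte & 0x7F;

    if (low < 0x21 || low > 0x7E) {
        *index = 0;
        return SBTVD_AREA_OTHER;
    }
    *index = low;
    return (byte & 0x80) ? SBTVD_AREA_GR : SBTVD_AREA_GL;
}
```

A decoder would then look up `*index` in whichever character set is currently invoked into the returned area.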

11.6 Captions and overlapping characters
The encoding of subtitles and overlapping characters must comply with the method described in ARIB STD-B24:2007, volume 1, part 3, with the following changes:
-- the initial state of the system (presented in ARIB STD-B24:2007, volume 1, part 3, Table 8-2) is changed according to the values presented in Table 16;
-- G0 and G2 are used as the initial state;
-- G3 is used through the SS3 code (0x1D). SS3 invokes a G3 code set by placing it in the GL area temporarily.
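The single-shift behaviour can be sketched as a small state machine. This is only my reading of the list above: it assumes G0 is invoked into GL and G2 into GR initially, and that SS3 affects only the next GL character. `shift_state_t` and its functions are hypothetical names, not code from the pull request, and the case of a GR byte arriving right after SS3 is deliberately left simple.

```c
#include <stdint.h>

#define CODE_SS3 0x1D /* single shift 3, value quoted from the standard */

/* Hypothetical state: which G set (0..3) is invoked into each area. */
typedef struct {
    int gl;           /* G set invoked into GL */
    int gr;           /* G set invoked into GR */
    int single_shift; /* G set for the next GL character only, or -1 */
} shift_state_t;

static void shift_state_init(shift_state_t *s)
{
    s->gl = 0; /* assumed: G0 in GL */
    s->gr = 2; /* assumed: G2 in GR */
    s->single_shift = -1;
}

/*
 * Feed one byte. Returns the G set index (0..3) that should decode it,
 * or -1 when the byte is not a graphic character (SS3 included).
 */
static int shift_state_feed(shift_state_t *s, uint8_t byte)
{
    uint8_t low = byte & 0x7F;
    int set;

    if (byte == CODE_SS3) {
        s->single_shift = 3; /* G3 applies to the next GL character */
        return -1;
    }
    if (low < 0x21 || low > 0x7E)
        return -1;           /* other control codes, SP, DEL */
    if (byte & 0x80)
        return s->gr;        /* GR area is not affected by the shift */

    set = (s->single_shift >= 0) ? s->single_shift : s->gl;
    s->single_shift = -1;    /* the shift lasts for one character */
    return set;
}
```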

Table 16 specifies the following designations:

  • G0: alphanumeric set
  • G1: alphanumeric set
  • G2: latin extension set
  • G3: special characters

Thus, I created an initialization for the Latin decoder with the following code:
decoder->handle_g0 = decoder_handle_alnum_latin;
decoder->handle_g1 = decoder_handle_alnum_latin;
decoder->handle_g2 = decoder_handle_latin_extension;
decoder->handle_g3 = decoder_handle_latin_special;

Since the standard does not specify a technique for switching to these designations, I added code to 'parse_caption_management_data' to detect the language. If it is Portuguese or Spanish (this standard is also used in Argentina), an attribute in the instance is set to select the Latin initialization.
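A minimal sketch of that language check, assuming the language arrives as a three-letter ISO 639-2 code ("por" for Portuguese, "spa" for Spanish). The function name `language_uses_latin_init` is hypothetical and the actual code in the pull request may differ.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Return true when the 3-byte language code should select the Latin
 * initialization. The list of codes is an assumption for illustration.
 */
static bool language_uses_latin_init(const uint8_t code[3])
{
    static const char *const latin_langs[] = { "por", "spa" };
    size_t i;

    for (i = 0; i < sizeof latin_langs / sizeof latin_langs[0]; i++) {
        if (memcmp(code, latin_langs[i], 3) == 0)
            return true;
    }
    return false;
}
```

Any other code, such as "jpn", leaves the existing initialization in place.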
I tested the code with VLC and the files supplied in #15. The Japanese decoding does not seem to be broken, but I am not sure, since I don't know Japanese. If you want, you can also test with a dump I made from a Brazilian broadcast: test.ts.

Regards,
André Moreira

…f Brazilian Digital Television System (SBTVD / ISDB-Tb).
@andrelcm andrelcm marked this pull request as draft October 18, 2020 06:48
@andrelcm andrelcm marked this pull request as ready for review October 18, 2020 07:35
@andrelcm andrelcm changed the title from "Add support for latin encondig according of Brazilian Digital Television System (SBTVD / ISDB-Tb)." to "Add support for latin encondig according to Brazilian Digital Television System (SBTVD / ISDB-Tb)." Oct 18, 2020
@fcartegnie
Contributor

fcartegnie commented Oct 19, 2020

"Latin" already means something else compared to ASCII.
Do we need to name the encoding "latin"? Is there no more specific name?
Maybe we should name it after the standard?

@andrelcm
Author

"Latin" is the term employed by the standard. Not sure if it is the best one, but I'd be ok to use another word.

@andrelcm
Author

Hi, Cartegnie. Do you want me to change something?

Contributor

@fcartegnie fcartegnie left a comment

> Hi, Cartegnie. Do you want me to change something?

There are already remarks in the review comments here.

@@ -66,6 +66,8 @@ typedef struct arib_instance_t
bool b_generate_drcs;
bool b_use_private_conv;
bool b_replace_ellipsis;
bool is_latin;
Contributor

use enum

Author

Should I use an enum for the Japanese and Brazilian standards?

}
}

void arib_initialize_decoder_latin( arib_decoder_t* decoder )
Contributor

add generic api with encoding parameter

Author

Sorry for asking, but I did not catch what you mean by "generic api".

Contributor

We won't add a new init API for each decoder. Use one init with an encoding parameter:
init( &dec, ARIB_B25 )
init( &dec, SBTVD )
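One possible shape for such a generic API, combining the enum and the encoding parameter requested in the review, is sketched below. All names (`toy_encoding_t`, `toy_decoder_t`, `toy_decoder_init`) are hypothetical and the decoder is a toy stand-in for `arib_decoder_t`. Only the SBTVD designations listed from Table 16 are filled in; the default branch is left as a placeholder because the Japanese initial designations are not listed in this thread.

```c
#include <stddef.h>

/* Hypothetical encoding selector passed to the single init function. */
typedef enum {
    TOY_ENCODING_DEFAULT, /* existing Japanese behaviour (placeholder) */
    TOY_ENCODING_SBTVD    /* Brazilian Latin initialization */
} toy_encoding_t;

/* Hypothetical character set identifiers. */
typedef enum {
    TOY_SET_UNSPECIFIED,
    TOY_SET_ALNUM,
    TOY_SET_LATIN_EXTENSION,
    TOY_SET_LATIN_SPECIAL
} toy_charset_t;

/* Toy decoder: records the encoding and the G0..G3 designations. */
typedef struct {
    toy_encoding_t encoding;
    toy_charset_t g[4];
} toy_decoder_t;

/* Initialize the decoder for the given encoding. Returns 0, or -1 on NULL. */
static int toy_decoder_init(toy_decoder_t *dec, toy_encoding_t encoding)
{
    size_t i;

    if (dec == NULL)
        return -1;

    dec->encoding = encoding;
    for (i = 0; i < 4; i++)
        dec->g[i] = TOY_SET_UNSPECIFIED;

    if (encoding == TOY_ENCODING_SBTVD) {
        /* Designations as listed from Table 16 earlier in this thread. */
        dec->g[0] = TOY_SET_ALNUM;
        dec->g[1] = TOY_SET_ALNUM;
        dec->g[2] = TOY_SET_LATIN_EXTENSION;
        dec->g[3] = TOY_SET_LATIN_SPECIAL;
    }
    return 0;
}
```

With this shape, the boolean `is_latin` in the instance could become a field of the enum type, and adding another regional variant would only need a new enum value and a new branch.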

xtne6f added a commit to xtne6f/TVCaptionMod2 that referenced this pull request Jan 9, 2022
Referred to the following discussion:
[Add support for latin encondig according to Brazilian ~ by andrelcm]( nkoriyama/aribb24#16 )