Commit 18a24a1
Added CEA-708 support to the open-source project.
Issue: #1807

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=144726542
cdrolle authored and ojw28 committed Jan 17, 2017
1 parent 1ffe775 commit 18a24a1
Showing 3 changed files with 1,299 additions and 0 deletions.
SubtitleDecoderFactory.java
@@ -17,6 +17,7 @@

import com.google.android.exoplayer2.Format;
import com.google.android.exoplayer2.text.cea.Cea608Decoder;
import com.google.android.exoplayer2.text.cea.Cea708Decoder;
import com.google.android.exoplayer2.text.subrip.SubripDecoder;
import com.google.android.exoplayer2.text.ttml.TtmlDecoder;
import com.google.android.exoplayer2.text.tx3g.Tx3gDecoder;
@@ -58,6 +59,7 @@ public interface SubtitleDecoderFactory {
* <li>SubRip ({@link SubripDecoder})</li>
* <li>TX3G ({@link Tx3gDecoder})</li>
* <li>Cea608 ({@link Cea608Decoder})</li>
* <li>Cea708 ({@link Cea708Decoder})</li>
* </ul>
*/
SubtitleDecoderFactory DEFAULT = new SubtitleDecoderFactory() {
@@ -78,6 +80,9 @@ public SubtitleDecoder createDecoder(Format format) {
|| format.sampleMimeType.equals(MimeTypes.APPLICATION_MP4CEA608)) {
return clazz.asSubclass(SubtitleDecoder.class).getConstructor(String.class, Integer.TYPE)
.newInstance(format.sampleMimeType, format.accessibilityChannel);
} else if (format.sampleMimeType.equals(MimeTypes.APPLICATION_CEA708)) {
return clazz.asSubclass(SubtitleDecoder.class).getConstructor(Integer.TYPE)
.newInstance(format.accessibilityChannel);
} else {
return clazz.asSubclass(SubtitleDecoder.class).getConstructor().newInstance();
}
@@ -105,6 +110,8 @@ private Class<?> getDecoderClass(String mimeType) {
case MimeTypes.APPLICATION_CEA608:
case MimeTypes.APPLICATION_MP4CEA608:
return Class.forName("com.google.android.exoplayer2.text.cea.Cea608Decoder");
case MimeTypes.APPLICATION_CEA708:
return Class.forName("com.google.android.exoplayer2.text.cea.Cea708Decoder");
default:
return null;
}
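
For readers skimming the diff, the reflective branches above amount to the direct calls below. This is a sketch only, not part of the commit: the constructor shapes are taken from the getConstructor(...) calls in the diff, and the reflection in the real code is presumably there so that unused decoder classes can be stripped from the app.

```java
import com.google.android.exoplayer2.Format;
import com.google.android.exoplayer2.text.SubtitleDecoder;
import com.google.android.exoplayer2.text.cea.Cea608Decoder;
import com.google.android.exoplayer2.text.cea.Cea708Decoder;
import com.google.android.exoplayer2.util.MimeTypes;

// Sketch: a non-reflective rendering of the CEA branches in the factory above.
final class CeaDecoderSelectionSketch {
  static SubtitleDecoder createCeaDecoder(Format format) {
    String mimeType = format.sampleMimeType;
    if (MimeTypes.APPLICATION_CEA608.equals(mimeType)
        || MimeTypes.APPLICATION_MP4CEA608.equals(mimeType)) {
      // Mirrors getConstructor(String.class, Integer.TYPE) in the diff.
      return new Cea608Decoder(mimeType, format.accessibilityChannel);
    } else if (MimeTypes.APPLICATION_CEA708.equals(mimeType)) {
      // Mirrors getConstructor(Integer.TYPE) in the diff; accessibilityChannel
      // presumably identifies which CEA-708 caption service to decode.
      return new Cea708Decoder(format.accessibilityChannel);
    }
    // Other subtitle formats use the no-argument constructor path in the diff.
    return null;
  }
}
```
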
Cea708Cue.java
@@ -0,0 +1,67 @@
/*
* Copyright (C) 2016 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.google.android.exoplayer2.text.cea;

import android.text.Layout.Alignment;
import com.google.android.exoplayer2.text.Cue;

/**
* A {@link Cue} for CEA-708.
*/
/* package */ final class Cea708Cue extends Cue implements Comparable<Cea708Cue> {

/**
* An unset priority.
*/
public static final int PRIORITY_UNSET = -1;

/**
* The priority of the cue box.
*/
public final int priority;

/**
* @param text See {@link #text}.
* @param textAlignment See {@link #textAlignment}.
* @param line See {@link #line}.
* @param lineType See {@link #lineType}.
* @param lineAnchor See {@link #lineAnchor}.
* @param position See {@link #position}.
* @param positionAnchor See {@link #positionAnchor}.
* @param size See {@link #size}.
* @param windowColorSet See {@link #windowColorSet}.
* @param windowColor See {@link #windowColor}.
* @param priority See {@link #priority}.
*/
public Cea708Cue(CharSequence text, Alignment textAlignment, float line, @LineType int lineType,
@AnchorType int lineAnchor, float position, @AnchorType int positionAnchor, float size,
boolean windowColorSet, int windowColor, int priority) {
super(text, textAlignment, line, lineType, lineAnchor, position, positionAnchor, size,
windowColorSet, windowColor);
this.priority = priority;
}

@Override
public int compareTo(Cea708Cue other) {
if (other.priority < priority) {
return -1;
} else if (other.priority > priority) {
return 1;
}
return 0;
}

}

8 comments on commit 18a24a1

@peddisri (Contributor)

I have some suggestions for this patch.

As per the CEA specs, both 608 and 708 may be present in ATSC DTV transmissions. So, for example, if ExoPlayer is to be used to render digital cable TV, there cannot be a single MIME type.

In my opinion, like most players (VLC, for example), Exo should identify 4 caption tracks (CC1, CC2, CC3 & CC4); if CC1 or CC2 is selected, render channel 1 or channel 2 of CEA-608, and if CC3 or above is selected, render CEA-708 captions. We can map the service numbers of CEA-708 to channels from CC3 upward, i.e. primary service number = CC3, secondary service number = CC4, and other service numbers are CCX.

This means that support for CEA-708 and CEA-608 cannot be exclusive; the parsing code should handle both simultaneously and be able to switch between them at run time based on the selected track.

Please let me know if I am missing something in my understanding.

Here is my proposal:
The main class that is the entry point for decoding captions could be CEADecoder (which is currently an abstract class); internally it creates two "parsers", CEA608Parser and CEA708Parser. The current CEA608Decoder can be renamed to CEA608Parser and would no longer derive from CEADecoder.
In this way, the CEADecoder decode function first parses the CC type and, based on the type, invokes one of the parsers.
Each parser knows whether a channel (or track) is enabled based on the track currently selected by the app. If a track is enabled, it generates cues; otherwise it skips through the data.
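
To make the proposal concrete, here is a rough sketch of the shape being suggested. The class and method names (CeaDecoderSketch, CcParser, parsePacket) are hypothetical and only follow the wording above; the cc_type values follow the cc_data_pkt layout discussed later in this thread.

```java
// Hypothetical sketch of the proposal, not existing ExoPlayer API: a single
// entry point that routes each cc_data_pkt to a 608 or 708 parser based on
// cc_type, while each parser drops data for tracks the app has not selected.
final class CeaDecoderSketch {

  interface CcParser {
    // Consumes one cc_data_pkt payload; emits cues only if its track/service
    // is currently enabled.
    void parsePacket(int ccType, byte data1, byte data2);
  }

  private final CcParser cea608Parser; // e.g. the renamed Cea608Decoder
  private final CcParser cea708Parser;

  CeaDecoderSketch(CcParser cea608Parser, CcParser cea708Parser) {
    this.cea608Parser = cea608Parser;
    this.cea708Parser = cea708Parser;
  }

  void decode(byte[] ccData) {
    // Each cc_data_pkt is 3 bytes: a marker/valid/type byte and two data bytes.
    for (int i = 0; i + 2 < ccData.length; i += 3) {
      boolean ccValid = (ccData[i] & 0x04) != 0;
      if (!ccValid) {
        continue;
      }
      int ccType = ccData[i] & 0x03; // 0/1 = 608 field 1/2, 2/3 = DTVCC data/start
      if (ccType == 0 || ccType == 1) {
        cea608Parser.parsePacket(ccType, ccData[i + 1], ccData[i + 2]);
      } else {
        cea708Parser.parsePacket(ccType, ccData[i + 1], ccData[i + 2]);
      }
    }
  }
}
```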

@ojw28 (Contributor) commented on 18a24a1 Jan 19, 2017

708 support isn't wired up end-to-end yet; this is just a step toward it. I think our intention is to do the track selection at a higher level, though. There's no particular reason to require that a single decoder be capable of decoding all of the tracks simultaneously just because they're muxed into a single stream. As an analogy, we don't require a single decoder to be capable of decoding both audio and video because they're muxed together. They get split out before they reach the decoders, and I think our intention is to do something similar for 608/708 as well (if it's hard to do this without decoding, then all of the data can be delivered, with a flag set on the format of each stream to indicate which track should actually be kept during decode). But yeah, this is all work in progress.

@zsmatyas (Contributor) commented on 18a24a1 Jan 19, 2017

I think it is a little bit more complicated than handling 4 channels. 608 differentiates 2 main channels defined by the "line" the stream is coming from; "line 21" of the analog transmission, for example, is one incoming stream (2 bytes in every frame). Within this single stream, every 2-byte pair carries a bit serving as a flag for which "sub-channel" the command belongs to. This way "line 21" can carry 2 channels itself, but they share the bandwidth. The analog frame rate was 30, so we have at most 60 useful bytes per second shared between Channel 1 and 2. Even if they were shared equally and there were no overhead or commands, we could still easily hit scenarios where 30 characters per second are not enough to show everything correctly; with the positioning, styling and other overhead, the number of useful bytes is much less. So in practice, we cannot fit 2 languages into Channel 1 and 2, as they must share the 60 bytes/second bandwidth. That is why it became quite standard in the US that Channel 1 (of line 21) carries English (the primary channel), while Spanish subtitles are transmitted on an independent line (Channel 3), also a primary channel. Channels 3 and 4 similarly share a single "line" of transmission, just as channels 1 and 2 do. As channels 2 and 4 only use the leftover bandwidth, they usually convey much less data and are rarely used.

These were all defined in the 608 standard. There are other modes also allowed, outside of the scope of our discussion.

But this means that you need to read every single byte to check the flag that differentiates between Channel 1 and 2; you cannot parse them independently. Similarly, channels 3 and 4 are not independent of each other. As you cannot tell which channels are used (what those bit flags are) unless you read through the entire stream, most players immediately show 4 possible subtitle streams for the user to select from. There is no header or metadata indicating which streams will be used (as it was originally a continuous stream of shows on analog television, it can change mid-stream at any time). Usually you need to pick Channel 1 for English and 3 for Spanish, but any combination is allowed.

Then came 708 with digital transmission, increasing the bandwidth and adding some headers to the new streams as part of MPEG-2, but the original 608 bytes are still transmitted without header or metadata information, so the limitations are still present. 608 and 708 bytes are transmitted in a "Closed Caption Data Packet" (cc_data_pkt), which has a type field (2 bits) that differentiates the 2 lines of the 608 stream (values 0 and 1) and defines values 2 and 3 as DTVCC_PACKET_DATA and DTVCC_PACKET_START. So you need to collect the incoming stream into a buffer to have a full 708 packet before you can interpret it correctly, and you still need to read through the entire stream of all 4 possible 608 channels to figure out whether they are really present or not (any one of them can be a stream of continuous 0 values). See LINK
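
A minimal sketch of the buffering step described above, assuming the cc_type values just mentioned (2 = DTVCC_PACKET_DATA, 3 = DTVCC_PACKET_START). The class and method names are hypothetical, and real implementations may instead use the packet-size field in the DTVCC packet header to detect completion.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch: assembling complete DTVCC packets from cc_data_pkt
// entries. A packet is opened by DTVCC_PACKET_START, extended by
// DTVCC_PACKET_DATA, and here treated as finished when the next START arrives.
final class DtvccPacketAssembler {
  private static final int DTVCC_PACKET_DATA = 2;
  private static final int DTVCC_PACKET_START = 3;

  private final ByteArrayOutputStream currentPacket = new ByteArrayOutputStream();
  private boolean packetOpen = false;

  /** Feeds one cc_data_pkt; returns a completed DTVCC packet, or null. */
  byte[] feed(int ccType, byte data1, byte data2) {
    byte[] completed = null;
    if (ccType == DTVCC_PACKET_START) {
      // A new START implicitly finishes the previous packet, if any.
      completed = packetOpen ? currentPacket.toByteArray() : null;
      currentPacket.reset();
      packetOpen = true;
    } else if (ccType != DTVCC_PACKET_DATA || !packetOpen) {
      // 608 bytes (types 0/1), or DATA with no open packet: nothing to buffer.
      return null;
    }
    currentPacket.write(data1);
    currentPacket.write(data2);
    return completed;
  }
}
```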

While interpreting the 708 packets: a packet has 3 bits for the service number, but when value 7 is used an additional 6 bits identify the extended service, so there are 7 main services and 63 extended services. Services are like independent subtitle channels, but in 708 it is a more generic term; a service can be weather information, traffic updates, age rating, time left in the current show or lots of other similar things. Again, there are no headers or metadata indicating which services will be used in a stream (for live TV it could change at any time), so you cannot tell how many of the possible 63 services (subtitle channels/streams) are used.
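
A small sketch of the service-number layout described above. The method name is hypothetical; the bit positions follow the 3-bit service number plus 6-bit extension that the comment describes, read from the first byte of a DTVCC service block header.

```java
// Hypothetical sketch: extracting the service number from a DTVCC service
// block header, using the 3-bit field plus 6-bit extended field described above.
final class ServiceNumberReader {
  static int readServiceNumber(byte[] dtvccPacket, int offset) {
    int header = dtvccPacket[offset] & 0xFF;
    int serviceNumber = (header >> 5) & 0x07; // top 3 bits: service number
    // (The low 5 bits of the same byte carry the block size.)
    if (serviceNumber == 7) {
      // Value 7 escapes to an extra header byte whose low 6 bits carry the
      // extended service number, addressing services beyond the first few.
      serviceNumber = dtvccPacket[offset + 1] & 0x3F;
    }
    return serviceNumber;
  }
}
```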

So there are 4 possible 608 channels (2 useful and 2 using leftover bandwidth), and 63 possible 708 channels, usually repeating the same streams that are also transmitted in parallel as 608 streams. But there are many more options.

I expect the main goal would be to correctly interpret 2 channels of 608 subtitles (though figuring out which 2 is stream dependent), and the 7 main services of 708 should probably cover the main use cases for almost any stream.

@ojw28 (Contributor) commented on 18a24a1 Jan 19, 2017

@peddisri (Contributor) commented on 18a24a1 Jan 20, 2017

@ojw28
So this means the SeiReader in ExtractorSampleSource will indicate that it supports both CEA-608 and CEA-708 tracks and, depending on what the user selects, the corresponding decoder will be invoked. Is this understanding correct? If so, the current design is good.

@arifdi commented on 18a24a1 Feb 14, 2017

@zsmatyas nice explanation, very accurate. As a side clarification, the service information for 608 and 708 captions is carried as part of the PSIP data in broadcasts, via the Caption Service Descriptor in the PMT, or in the EIT where available. This descriptor identifies each available service by service number (CC1-CC4 for 608, 1...64 for 708), its type (608/708) and its language.

@zsmatyas (Contributor)

@arifdi
@arifdi
I didn't know about that. Do you mean that the Caption Service Descriptor must always correctly indicate which captions are present? It does not seem to be optional, so we should be able to show UI selectors for the subtitles based only on the Caption Service Descriptor.

@AquilesCanta (Contributor)

It ultimately depends on the TV standard, I guess. The descriptor @arifdi mentions belongs to ATSC (though I am not sure it is mandatory [1]).

The TS Extractor's API will soon support both mechanisms (the code is already available; I am waiting for a blocking commit to go in):

  • Either passing a list of CC services that might be available, so we expose all the tracks asked for; they will be rendered only if there is actual content for those services.
  • Or, if the PMT has Caption Service Descriptors, using that info to declare the CC tracks.

[1]: The CEA-708 spec says that, when carried in NAL units, bandwidth for CC should be preallocated so that captions can be added at a later stage of transmission without requiring remultiplexing of the stream. In that case, I guess someone could add captions for a service that is not already declared by the PMT's descriptors. This is just a guess, though; I am just considering that adding a descriptor to the PMT could require you to split it across multiple packets.
