Dubbed videos should not include subtitles of other languages #5359

Closed
radinamatic opened this issue Dec 8, 2016 · 16 comments

@radinamatic
Member

Summary

This is an old issue that was fixed previously, but I see it again in 0.17b3. For context see #3913 and this comment.

In short, dubbed videos should not include or be linked to subtitles: all KA subtitles in non-English languages are translations of the original English transcript, and are thus useless when the user is viewing a dubbed video.

System information

Ubuntu 15.10 with Firefox, but it's a contentpack issue, and most probably not related to a specific system.

Screenshots

Video in Spanish with the "subtitle" (or rather, the transcript) in English.

[Screenshot]

@benjaoming
Contributor

Hi @radinamatic :)

In short, dubbed videos should not include or be linked to subtitles

Why not? If you don't have sound, it seems useful?

They're off by default, anyways?

Not sure how they should have been removed and re-appeared, AFAIK there's always been English subtitles in the English content pack.

@radinamatic
Member Author

@benjaoming

Why not? If you don't have sound, it seems useful?

Captions are indeed very useful, IF they are made/transcribed from the video they are used with, and that is not the case here. The English captions/subtitles here are transcribed from the original English video and do not exactly describe the audio track and video content of the "dubbed" video in Spanish (mainly because the video is not really "dubbed" but recreated from scratch, following a script similar to the original; see my comment here).

Accessibility bonus feature here would be if we had captions of the Spanish "dubbed" videos themselves, but unfortunately we don't, and offering the caption of the original English video that is out of sync with what happens in the Spanish one is not really useful, and can be confusing.

@benjaoming
Contributor

@radinamatic

Would this be a correct description?

Subtitles for other languages than the active language should only be available when the video isn't already dubbed for the active language

@radinamatic
Member Author

@benjaoming Yes! That would be essentially correct for our use case. 😃

@benjaoming benjaoming changed the title from "Dubbed videos should not include English subtitles" to "Dubbed videos should not include subtitles of other languages" Dec 13, 2016
@benjaoming
Contributor

@radinamatic - I'm not sure if this is trivial to fix.

It wasn't fixed in #3913 as you refer to - only the issue was closed and discussed in Kolibri. I don't know if I want to do anything at this point.

As for the data on which we can base a fix, here's an example of extra_fields - I think we could potentially use translated_youtube_lang, and if it's set to a non-English language then only display subtitles matching it.

{"image_url":"https:\/\/cdn.kastatic.org\/googleusercontent\/OIAdB4ITJNXl0oo_M0GIklCCgOoEH90OJuXh-k__OatTBXFQMA2tieQe83Kdwe9de4Z4hFnq7h61QpDz0jAxoe4","duration":219,"readable_id":"subtraction-introduction","relative_url":"\/video\/subtraction-introduction","license_name":"CC BY-NC-SA (KA default)","related_exercise_url":"\/exercise\/subtraction_1","format":"mp4","sha":"6dad65537fc2eb227e3da12bf484bc1a30507ae5","translated_youtube_lang":"en","keywords":""}
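A minimal sketch of that idea, assuming extra_fields is stored as a JSON string shaped like the example above and that the subtitle language codes available for a video are already known. The function name `subtitles_to_display` and the `available_subtitles` argument are hypothetical and are not KA Lite API; only the `translated_youtube_lang` key comes from the data above.

```python
import json


def subtitles_to_display(extra_fields_json, available_subtitles):
    """Return the subtitle language codes to offer for one video.

    Hypothetical helper: if the video is in a non-English language
    (according to translated_youtube_lang), keep only subtitles in
    that same language; otherwise keep everything.
    """
    extra_fields = json.loads(extra_fields_json)
    # Assumption: a missing value means the original English video.
    video_lang = extra_fields.get("translated_youtube_lang", "en")
    if video_lang == "en":
        return list(available_subtitles)
    return [code for code in available_subtitles if code == video_lang]


original = '{"translated_youtube_lang": "en"}'
dubbed = '{"translated_youtube_lang": "es"}'

print(subtitles_to_display(original, ["en", "es"]))  # ['en', 'es']
print(subtitles_to_display(dubbed, ["en"]))          # []
```

This only filters on language. It cannot tell whether a matching-language subtitle was transcribed from the dubbed video or translated from the English transcript.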

@benjaoming
Contributor

Unmarking as release blocker since it's not a new issue.

@radinamatic
Member Author

@benjaoming Not sure which PR fixed it, but the contentpacks we built for the first release of 0.16 back in April did not have this issue.

We can discuss the possible solutions and whether it deserves the release blocker status on the syncup tomorrow.

@benjaoming
Contributor

@radinamatic the annotation of existing contents works like this:

  1. A content pack is downloaded
  2. The content pack is unpacked
  3. The annotation system looks for subtitle files
  4. For each file found, the video with the appropriate id is marked with subtitles for the language in question

The annotation happens for all installed languages, and subtitles should be displayed for these. I'm not seeing anything in this behaviour that was altered in 0.16. Maybe you have experienced something with a content pack that was installed but not correctly annotated.
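A rough sketch of steps 3 and 4, not the actual KA Lite annotation code. The directory layout (`<root>/<lang>/<video_id>.vtt`) and the function name `annotate_subtitles` are assumptions made for illustration only.

```python
import os
import tempfile


def annotate_subtitles(subtitle_root):
    """Map each video id to the subtitle language codes found on disk.

    Assumed (hypothetical) layout: <subtitle_root>/<lang>/<video_id>.vtt
    """
    annotations = {}
    for lang in sorted(os.listdir(subtitle_root)):
        lang_dir = os.path.join(subtitle_root, lang)
        if not os.path.isdir(lang_dir):
            continue
        for filename in sorted(os.listdir(lang_dir)):
            video_id, ext = os.path.splitext(filename)
            if ext != ".vtt":
                continue
            # Step 4: mark the video with this id as having subtitles in lang.
            annotations.setdefault(video_id, []).append(lang)
    return annotations


# Build a tiny fake "unpacked content pack" to exercise the function.
with tempfile.TemporaryDirectory() as root:
    for lang, video_id in [("en", "abc123"), ("es", "abc123"), ("es", "xyz789")]:
        os.makedirs(os.path.join(root, lang), exist_ok=True)
        open(os.path.join(root, lang, video_id + ".vtt"), "w").close()
    result = annotate_subtitles(root)

print(result)  # {'abc123': ['en', 'es'], 'xyz789': ['es']}
```

In this sketch the match is made on video id alone; nothing checks which audio track a subtitle file was written for.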

@benjaoming benjaoming removed this from the 0.17.0 milestone Dec 17, 2016
@radinamatic
Member Author

radinamatic commented Dec 19, 2016

@benjaoming Then the problem is how the annotation system assigns "subtitles" to dubbed videos. It happens consistently with Spanish videos with all the 0.17 installers I tested (even this morning with the latest one by @mrpau-richard).

Now, we may conclude that it is not feasible to fix this for the final 0.17 release, but the underlying problem remains.

If you have time, read my lengthy explanation(s) in the discussion we had back in May with @66eli77 & @jamalex, in order to avoid a similar issue in Kolibri. The two most important points:

  • do not use "in the same language as video" as the ONLY criterion to display captions.

  • we need a way to unequivocally link video A with its own captions B0, B1, B2, B3..., so none of those captions get (conf)used as correct for videos C, D, E, F, which are subsequently re-created versions of A in other languages for exactly the same topic, made with the same script.

To use the above naming example in KA Lite video "Intro to subtraction":
http://127.0.0.1:8008/learn/khan/math/early-math/cc-early-math-add-sub-basics/cc-early-math-add-sub-intro/subtraction-introduction/

THIS IS CORRECT
Original video in English with Sal talking (let's call it video A), with subtitle B0 (English transcript) displayed.

[Screenshot]

THIS IS ALSO CORRECT
The same original video in English A, with subtitle B1 displayed, which is a Spanish translation of the English transcript B0.

[Screenshot]

THIS IS WRONG
"Dubbed" video in Spanish with a female narrator (let's call it video C), with the same subtitle B1 displayed as above. Content of the subtitle corresponds to audio track of video A, and not video C as it should be.

[Screenshot]
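The three cases can be modelled in a few lines. This is only an illustration of the "link each caption to the video it was made from" idea; the `Video` and `Caption` classes and the `source_video_id` field are hypothetical and do not exist in KA Lite or in the content packs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    video_id: str
    audio_lang: str


@dataclass(frozen=True)
class Caption:
    caption_id: str
    lang: str
    source_video_id: str  # the video whose audio track this caption follows


def caption_is_valid_for(caption, video):
    """A caption is usable only with the video it was transcribed or translated from."""
    return caption.source_video_id == video.video_id


video_a = Video("A", "en")   # original English video
video_c = Video("C", "es")   # re-created Spanish version
b0 = Caption("B0", "en", "A")  # English transcript of A
b1 = Caption("B1", "es", "A")  # Spanish translation of B0

print(caption_is_valid_for(b0, video_a))  # True
print(caption_is_valid_for(b1, video_a))  # True
print(caption_is_valid_for(b1, video_c))  # False
```

Note that B1 and video C share the same language, yet B1 is rejected for C, which is why a language match alone is not a sufficient test.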

@benjaoming
Contributor

So you're saying: No subtitles at all for dubbed videos? I don't think we have data as to which transcript the subtitles originate from.

@benjaoming
Contributor

Also, thanks for explaining - I'm still pretty sure the issue wasn't addressed, looking at the old closed issue - it got closed out of a decision (your own?) that it wasn't to be fixed in KA Lite, but in Kolibri :) We can still fix it by making a simple decision such as "don't display subtitles if the video language isn't English", but I have to reconcile that with what kind of data is actually available in the content database, because I would draw the line at having to change the way we build the content packs.

@radinamatic
Member Author

radinamatic commented Dec 19, 2016

I cannot vouch that the issue is 100% valid for all the languages, but knowing how KA videos are produced & translated through crowd-sourced efforts, chances are good it is. I could only wish that accessibility was held in such a high regard to actually invest the effort to create captions of dubbed versions of the video, but knowing the real world, all the non-English subtitles are almost certainly translations of original English transcripts created prior to dubbed videos, and are thus (almost) useless.

Ref: almost - @jamalex recently mentioned that in some cases "dubbed" videos are indeed/actually dubbed, meaning the original (English) video was overlaid with audio track which is a (spoken) translation of original English audio, and not completely re-made (both the video & audio tracks) by following very closely Sal's "script" from the original video. Not ideal, but this increases the chance of translated subtitles being actually useful as captions for dubbed videos.

Ref: when was the issue addressed - Now that you mention putting your foot down on changing contentpacks this late, I believe that's exactly how we fixed it at the time. @aronasorman worked some magic of his own on the contentpack maker, and cooked the contentpacks so that they included:

  • all dubbed videos (if any for a given language), with no subtitles
  • all the remaining English videos IF there was an available subtitle in the given language for those videos
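Read literally, those two inclusion rules could look something like the sketch below. This is a guess at the logic from the bullets only, not the contentpack maker's actual code; the entry format (`english_id`, `dubbed`, `subtitles`) and the function name `select_pack_contents` are made up for illustration.

```python
def select_pack_contents(entries, pack_lang):
    """Return (video_id, subtitle_langs) pairs for a language pack.

    Each entry is a dict with hypothetical keys:
      english_id - id of the original English video
      dubbed     - dict mapping language code -> id of the dubbed video
      subtitles  - language codes with a subtitle for the English video
    """
    selected = []
    for entry in entries:
        dubbed_id = entry["dubbed"].get(pack_lang)
        if dubbed_id is not None:
            # Rule 1: include the dubbed video, with no subtitles at all.
            selected.append((dubbed_id, []))
        elif pack_lang in entry["subtitles"]:
            # Rule 2: include the English video only if a subtitle
            # in the pack language exists for it.
            selected.append((entry["english_id"], [pack_lang]))
    return selected


entries = [
    {"english_id": "A", "dubbed": {"es": "C"}, "subtitles": ["en", "es"]},
    {"english_id": "D", "dubbed": {}, "subtitles": ["en", "es"]},
    {"english_id": "E", "dubbed": {}, "subtitles": ["en"]},
]

print(select_pack_contents(entries, "es"))  # [('C', []), ('D', ['es'])]
```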

Any of this doable for the 0.17 release?

@benjaoming
Contributor

I don't know, as I haven't looked properly into it. But I agree on your description and think the proposed solution sounds like the most realistic fix.

@radinamatic radinamatic added this to the 0.18 milestone Jan 15, 2017
@radinamatic
Member Author

I'm punting this to 0.18, but we should keep working on it, if nothing else as an exercise for how to approach the same problem with KA videos in Kolibri!

@mrpau-eugene
Contributor

Hi! @radinamatic I managed to remove subtitles/captions on videos that are dubbed.
[Screenshot]

All English videos with available English subtitles will have them turned on by default (as suggested by @benjaoming in issue #5464).

[Screenshot]

@radinamatic
Member Author

This has been fixed in the latest 0.17.2 release candidates, so closing!
