[ADN] impossible to extract subtitles #12724
Comments
|
the problem in the issue can be fixed simply by decoding the decrypted subtitle. however, the real problem is that they change the decryption key frequently, so this won't be fixed until there is a js interpreter that can handle the key construction js code. |
|
Can you (@remitamine) explain how to get the encryption key? I see that you regularly change the key in the extractor. |
|
the key can be found in http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js:
function(){var a=function(){var a,b=[a=114874,a+=-1521,a+=-19814,a+=-75638,a+=45570,a+=46993,a+=-66124,a+=28122];b[3]=[b[4],b[4]=b[3]][0],b[5]=[b[2],b[2]=b[5]][0],b[2]=44262*b[2]%(2<<16),b[2]=6159*b[2]%(2<<16),b[5]=b[6]^b[2],b[3]=[b[0],b[0]=b[3]][0],b[0]=b[7]^b[0],b[5]=4906*b[5]%(2<<16),b[5]=b[7]^b[0],acopifakofuwil(b.map(function(a){return("0000"+a.toString(16)).substr(-4)}).join(""))};a(),a=null}()
what i did was copy what is inside the inner function, with the final call changed to console.log:
var a, b = [a = 114874, a += -1521, a += -19814, a += -75638, a += 45570, a += 46993, a += -66124, a += 28122];
b[3] = [b[4], b[4] = b[3]][0], b[5] = [b[2], b[2] = b[5]][0], b[2] = 44262 * b[2] % (2 << 16), b[2] = 6159 * b[2] % (2 << 16), b[5] = b[6] ^ b[2], b[3] = [b[0], b[0] = b[3]][0], b[0] = b[7] ^ b[0], b[5] = 4906 * b[5] % (2 << 16), b[5] = b[7] ^ b[0], console.log(b.map(function(a) {
    return ("0000" + a.toString(16)).substr(-4)
}).join(""))
the result is:
diff --git a/youtube_dl/extractor/adn.py b/youtube_dl/extractor/adn.py
index 66caf6a81..50cfdcdee 100644
--- a/youtube_dl/extractor/adn.py
+++ b/youtube_dl/extractor/adn.py
@@ -45,7 +45,7 @@ class ADNIE(InfoExtractor):
# http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
bytes_to_intlist(base64.b64decode(enc_subtitles[24:])),
- bytes_to_intlist(b'\nd\xaf\xd2J\xd0\xfc\xe1\xfc\xdf\xb61\xe8\xe1\xf0\xcc'),
+ bytes_to_intlist(b'\xec\xe1\xba\xc9\x23\x00\xc0\xba\x45\xed\xf7\xef\xad\x34\x1b\x0e'),
bytes_to_intlist(base64.b64decode(enc_subtitles[:24]))
))
subtitles_json = self._parse_json( |
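The JavaScript quoted in this comment only uses integer arithmetic, so it can also be evaluated without a browser. The following is a rough Python sketch, a hand translation of that one snippet; the name `derive_key` is made up for illustration and is not part of youtube-dl. It would need to be redone whenever the site changes its code.

```python
def derive_key():
    """Hand translation of the key construction JS quoted above (hypothetical helper)."""
    # var a, b = [a = 114874, a += -1521, ...]: each element is a running sum
    acc = 0
    b = []
    for delta in (114874, -1521, -19814, -75638, 45570, 46993, -66124, 28122):
        acc += delta
        b.append(acc)

    mod = 2 << 16  # 131072, same expression as in the JS

    b[3], b[4] = b[4], b[3]    # b[3]=[b[4],b[4]=b[3]][0] is a swap
    b[5], b[2] = b[2], b[5]
    b[2] = 44262 * b[2] % mod
    b[2] = 6159 * b[2] % mod
    b[5] = b[6] ^ b[2]
    b[3], b[0] = b[0], b[3]
    b[0] = b[7] ^ b[0]
    b[5] = 4906 * b[5] % mod
    b[5] = b[7] ^ b[0]         # overwrites the two earlier b[5] results

    # ("0000" + a.toString(16)).substr(-4): keep the last four hex digits
    return "".join(("0000" + format(x, "x"))[-4:] for x in b)


key_hex = derive_key()
key_bytes = bytes.fromhex(key_hex)  # 16 raw bytes, the form passed to bytes_to_intlist
print(key_hex)
print(key_bytes)
```

The resulting bytes are the same as the new key in the diff above, which suggests the translation follows the JS faithfully for this version of the script.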
|
Isn't it possible to execute this automatically (via an external server, for example)? |
as i said before:
it's possible with a js interpreter. |
|
I do not really have any knowledge in js interpreter but maybe there is something that would be appropriate? : |
javascript project
we are using python in this project.
the decryption code has been changed again (they applied more obfuscation, but it is still simple to deobfuscate). the other change that needs to be applied in the code is the user agent (they banned the user agent used by youtube-dl).
diff --git a/youtube_dl/extractor/adn.py b/youtube_dl/extractor/adn.py
index 66caf6a81..09e46cc34 100644
--- a/youtube_dl/extractor/adn.py
+++ b/youtube_dl/extractor/adn.py
@@ -38,14 +38,16 @@ class ADNIE(InfoExtractor):
enc_subtitles = self._download_webpage(
'http://animedigitalnetwork.fr/' + sub_path,
- video_id, fatal=False)
+ video_id, headers={
+ 'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0'
+ }, fatal=False)
if not enc_subtitles:
return None
# http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
bytes_to_intlist(base64.b64decode(enc_subtitles[24:])),
- bytes_to_intlist(b'\nd\xaf\xd2J\xd0\xfc\xe1\xfc\xdf\xb61\xe8\xe1\xf0\xcc'),
+ bytes_to_intlist(b'\xba\x11\x86\x24\x55\xa6\x40\xf8\x50\xb8\xb0\xe7\x46\x4d\x90\x13'),
bytes_to_intlist(base64.b64decode(enc_subtitles[:24]))
))
subtitles_json = self._parse_json( |
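The diff slices the downloaded text at offset 24: the first 24 base64 characters decode to 16 bytes (one AES block, used as the IV) and the rest is the ciphertext. A minimal sketch of just that split, using only the standard library; `split_payload` is a hypothetical helper and the payload is synthetic, not real site data.

```python
import base64


def split_payload(enc_subtitles):
    """Split the base64 text into (iv, ciphertext), mirroring the slicing in the diff."""
    iv = base64.b64decode(enc_subtitles[:24])          # 24 base64 chars -> 16 bytes
    ciphertext = base64.b64decode(enc_subtitles[24:])  # everything after the IV
    return iv, ciphertext


# Synthetic payload for illustration only
fake_iv = bytes(range(16))
fake_ciphertext = b"A" * 32
fake = base64.b64encode(fake_iv).decode() + base64.b64encode(fake_ciphertext).decode()

iv, ciphertext = split_payload(fake)
print(len(iv), len(ciphertext))
```

The actual decryption in the extractor is then done by `aes_cbc_decrypt` with the key from the site's script, as shown in the diff.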
|
Otherwise, could we create an option where we could enter the key manually? |
|
Ah yeah, which can easily be obtained via an external service in JS (in a GH Page, for example) |
|
@remitamine, at which line exactly does the new obfuscation system start (after unminifying)? We could also write a notice explaining how to easily get this decryption key and then specify it in -subdk. |
|
Hey, update :
|
i will point directly to obfuscated part related to the decryption key.
the second part can be found in(the real value is
options should be generic, normally we don't add options specific for one of the extractors. |
|
oh, we could use --video-password. it would be a little weird, but it is an option if we cannot create a new argument. |
|
Hi, i still get this issue any idea ? |
it will be fixed in the next version. |
|
Ow...
|
|
re possible ? |
|
I feel like @testlog0 has not completely understood what the problem is. The issue here is that the site creates a new key for the subtitles, similar to another issue. The site uses date, time and key to obfuscate the code so that the user cannot understand it. Think of it this way: if you had a valuable file, you would do whatever is possible to prevent anyone from copying it. Your subtitles are protected that way. To break it you need to deobfuscate the code; then it can be parsed by the necessary function and you will get your subtitles. This can sometimes be done directly, and if the results are positive the code can be pushed here. @remitamine please give one more example so that I can prepare the regex function in the meantime |
i will put the unminified code as it will be the one that is needed to be matched.
the second part can be found in:
|
|
@remitamine thanks for the 2nd example. I found out they use the date as a GET value, followed by something similar to a version id. |
|
they use a minified version and then append the obfuscated code to it. I am just not able to understand which part to use. @remitamine please guide me, as I am simply using parts of the code; part of the code is at https://gist.github.com/siddht1/ed9837e6d2af205b4ccdb25459ba20e4 |
|
@remitamine '\x70\x72\x65\x70\x61\x72\x65\x53\x75\x62\x74\x69\x74\x6c\x65\x73' actually means that |
|
prepareSubtitles |
|
@remitamine I know '\x70\x72\x65\x70\x61\x72\x65\x53\x75\x62\x74\x69\x74\x6c\x65\x73' means prepareSubtitles, '\x73\x75\x62\x73\x74\x72\x69\x6e\x67' means substring and '\x70\x61\x72\x73\x65' means parse. I am just searching for where it gets invoked |
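The hex-escaped strings mentioned here do not have to be decoded by hand. A small Python sketch, assuming the literals contain only `\xNN` escapes as in the examples above (`decode_js_escapes` is a hypothetical helper name):

```python
def decode_js_escapes(literal):
    """Turn a string of \\xNN escapes, as found in the obfuscated JS, into plain text."""
    # The raw text is ASCII; unicode_escape interprets the \xNN sequences
    return literal.encode("ascii").decode("unicode_escape")


samples = [
    r"\x70\x72\x65\x70\x61\x72\x65\x53\x75\x62\x74\x69\x74\x6c\x65\x73",
    r"\x73\x75\x62\x73\x74\x72\x69\x6e\x67",
    r"\x70\x61\x72\x73\x65",
]
decoded = [decode_js_escapes(s) for s in samples]
print(decoded)
```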
|
loadSubtitles |
|
https://gist.github.com/siddht1/ed9837e6d2af205b4ccdb25459ba20e4#file-adn-part-code-L1060 @siddht1 if you're looking for a better understanding of the code, i think the first thing that you have to do is to deobfuscate the code:
this will give you a clear source code that will let you understand the flow. |
|
@siddht4 if you're looking for a better understanding of the code, i think the first thing that you have to do is to deobfuscate the code:
this will give you a clear source code that will let you understand the flow.
That is pretty much what I was doing, but as you already know, developers obfuscate the code so that it is next to impossible to get it back into a readable format (next to impossible, not impossible). My Mozilla deobfuscator doesn't seem to work, so I have been using document.write() to see what each part means. It is lengthy work, hence the chit-chat with @remitamine. If @remitamine has a partial deobfuscation of the code, please share it. I will keep a complete copy of the js for each day so that I can verify that there is indeed a function working with the GET values of date and version |
it's possible to convert the code into a readable format.
i wrote before a script to automatically get an fresh deobfuscated version of the js code, however i can't access it now because i can't access the HDD. |
this problem is tracked at #17084. |
My bad, thanks for answer. |
As you want, gimme your name somewhere I can contact you. |
|
After fixing the urllib.request call with a user agent and doing some tweaks because player_config was moved, it was still working, but since today the m=re.search used to get the key doesn't work anymore and returns None. Does anyone have an idea how to fix this? |
|
@remitamine Doesn't work anymore, they've changed it just after the release of One Punch Man Season 2. |
|
(btw, here's my crappy way to get new player config info with re.findall if someone need)
|
|
hope @remitamine or @persi-persu gonna be able to break that new js ! Here's the latest "deobfuscated_js version" from today |
|
the manual method that I've explained before still works without a problem (deobfuscation, extract key function, change the last call to
_0x3108d8[0x3] = 0x7732 * _0x3108d8[0x3] % (0x2 << 0x10), _0x3108d8[0x3] = 0x7cae * _0x3108d8[0x3] % (0x2 << 0x10), _0x3108d8[0x3] = _0x3108d8[0x0] * _0x3108d8[0x1];
for (var _0x57989e = 0x0; _0x57989e < 0x12; _0x57989e++) _0x3108d8[0x1] += _0x57989e;
_0x3108d8[0x1] = _0x3108d8[0x2] * _0x3108d8[0x3], _0x3108d8[0x0] = [_0x3108d8[0x2], _0x3108d8[0x2] = _0x3108d8[0x0]][0x0];
for (_0x57989e = 0x4; _0x57989e < 0x11; _0x57989e++) _0x3108d8[0x1] += _0x57989e;
_0x3108d8[0x0] = 0x95cd * _0x3108d8[0x0] % (0x2 << 0x10), _0xb03201(_0x3108d8['map'](function(_0x365834) {
return ('3111' + _0x365834['toString'](0x10))['substr'](-0x4);
})['join']('')); |
|
Thanks @remitamine! |
Could you explain how it works? I haven't understood it. EDIT: Thanks, it works, but it's pretty boring to do it manually every day. Anyway, thanks for the answer. |
|
updating the instruction again!!!
import base64
import os
import re
import urllib.request
from collections import deque

import jsbeautifier
from jsbeautifier.unpackers import UNPACKERS

# replace the stock javascriptobfuscator unpacker with one that handles this script
for unpacker in UNPACKERS:
    if 'javascriptobfuscator' in unpacker.__name__:
        def unpack(code):
            # match the string array, the rotation amount and the lookup function name
            matches = re.search(r"(?s)((?:var\s+)?(_0x[0-9a-f]+)\s*=\s*\[\s*(.+?)\s*\].+?}\(\2\s*,\s*(0x[0-9a-f]+)\)\);\s*)(?:var\s+)?(_0x[0-9a-f]+)", code)
            if matches:
                repl_array, shift, repl_func = matches.group(3, 4, 5)
                # entries are base64 encoded string literals
                repl_array = deque(base64.b64decode(x[1:-1].encode().decode('unicode_escape')).decode().replace(r"'", r"\'") for x in repl_array.split(','))
                repl_array.rotate(-int(shift, 16))
                code = code[len(matches.group(1)):]
                # inline every lookup call with the string it refers to
                code = re.sub(r"%s\('(0x[0-9a-f]+)'\)" % repl_func, lambda mobj: "'%s'" % repl_array[int(mobj.group(1), 16)], code)
            return code
        unpacker.unpack = unpack
        break

with urllib.request.urlopen(urllib.request.Request('http://www.animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js', headers={'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:67.0) Gecko/20100101 Firefox/67.0'})) as adn_min:
    opts = jsbeautifier.default_options()
    opts.eol = os.linesep
    opts.unescape_strings = True
    code = jsbeautifier.beautify(adn_min.read().decode(), opts)
    with open('adn.js', 'wb') as adn_unmin:
        adn_unmin.write(code.encode())
now we have to locate the key construction function. the subtitle decryption code is located at
the key in this case is constructed with two parts. we only need the second part (it changes frequently), which is the second operand in:
_0xfc90c3 = _0xd52c11 + _0x278d86
that gets set by _0xb03201 function:
_0xb03201 = function() {
    _0x278d86 = arguments[0x0];
};
that gets called in:
function() {
    var _0x876a2f = function() {
        var _0x99882, _0x3108d8 = [_0x99882 = 0x7651, _0x99882 += 0x9ad8, _0x99882 += -0xa16c, _0x99882 += 0x14631];
        _0x3108d8[0x3] = 0x7732 * _0x3108d8[0x3] % (0x2 << 0x10), _0x3108d8[0x3] = 0x7cae * _0x3108d8[0x3] % (0x2 << 0x10), _0x3108d8[0x3] = _0x3108d8[0x0] * _0x3108d8[0x1];
        for (var _0x57989e = 0x0; _0x57989e < 0x12; _0x57989e++) _0x3108d8[0x1] += _0x57989e;
        _0x3108d8[0x1] = _0x3108d8[0x2] * _0x3108d8[0x3], _0x3108d8[0x0] = [_0x3108d8[0x2], _0x3108d8[0x2] = _0x3108d8[0x0]][0x0];
        for (_0x57989e = 0x4; _0x57989e < 0x11; _0x57989e++) _0x3108d8[0x1] += _0x57989e;
        _0x3108d8[0x0] = 0x95cd * _0x3108d8[0x0] % (0x2 << 0x10), _0xb03201(_0x3108d8['map'](function(_0x365834) {
            return ('3111' + _0x365834['toString'](0x10))['substr'](-0x4);
        })['join'](''));
    };
    _0x876a2f(), _0x876a2f = {};
}()
now, we get the inner code of the function:
var _0x99882, _0x3108d8 = [_0x99882 = 0x7651, _0x99882 += 0x9ad8, _0x99882 += -0xa16c, _0x99882 += 0x14631];
_0x3108d8[0x3] = 0x7732 * _0x3108d8[0x3] % (0x2 << 0x10), _0x3108d8[0x3] = 0x7cae * _0x3108d8[0x3] % (0x2 << 0x10), _0x3108d8[0x3] = _0x3108d8[0x0] * _0x3108d8[0x1];
for (var _0x57989e = 0x0; _0x57989e < 0x12; _0x57989e++) _0x3108d8[0x1] += _0x57989e;
_0x3108d8[0x1] = _0x3108d8[0x2] * _0x3108d8[0x3], _0x3108d8[0x0] = [_0x3108d8[0x2], _0x3108d8[0x2] = _0x3108d8[0x0]][0x0];
for (_0x57989e = 0x4; _0x57989e < 0x11; _0x57989e++) _0x3108d8[0x1] += _0x57989e;
_0x3108d8[0x0] = 0x95cd * _0x3108d8[0x0] % (0x2 << 0x10), _0xb03201(_0x3108d8['map'](function(_0x365834) {
    return ('3111' + _0x365834['toString'](0x10))['substr'](-0x4);
})['join'](''));
change the last call in the code to console.log:
var _0x99882, _0x3108d8 = [_0x99882 = 0x7651, _0x99882 += 0x9ad8, _0x99882 += -0xa16c, _0x99882 += 0x14631];
_0x3108d8[0x3] = 0x7732 * _0x3108d8[0x3] % (0x2 << 0x10), _0x3108d8[0x3] = 0x7cae * _0x3108d8[0x3] % (0x2 << 0x10), _0x3108d8[0x3] = _0x3108d8[0x0] * _0x3108d8[0x1];
for (var _0x57989e = 0x0; _0x57989e < 0x12; _0x57989e++) _0x3108d8[0x1] += _0x57989e;
_0x3108d8[0x1] = _0x3108d8[0x2] * _0x3108d8[0x3], _0x3108d8[0x0] = [_0x3108d8[0x2], _0x3108d8[0x2] = _0x3108d8[0x0]][0x0];
for (_0x57989e = 0x4; _0x57989e < 0x11; _0x57989e++) _0x3108d8[0x1] += _0x57989e;
_0x3108d8[0x0] = 0x95cd * _0x3108d8[0x0] % (0x2 << 0x10), console.log(_0x3108d8['map'](function(_0x365834) {
    return ('3111' + _0x365834['toString'](0x10))['substr'](-0x4);
})['join'](''));
and execute the code in the web browser console, get the printed key and replace the one in the adn.py file.
@Asusagawa as i stated before, i don't use the website, and the same would apply for me if they decide to make the automatic detection of the key construction function more difficult.
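For readers who would rather not paste code into a browser console, the inner function from the instructions above can be translated by hand. This is only a sketch of that single snapshot of the site's script; `derive_key_part` is a made-up name, and the translation assumes every intermediate value stays below 2^53, so that Python integers and JavaScript numbers agree (which holds for the constants shown).

```python
def derive_key_part():
    """Hand translation of the inner key function shown above (hypothetical helper)."""
    # [_0x99882 = 0x7651, _0x99882 += 0x9ad8, ...]: each element is a running sum
    acc = 0
    b = []
    for delta in (0x7651, 0x9ad8, -0xa16c, 0x14631):
        acc += delta
        b.append(acc)

    mod = 0x2 << 0x10  # 131072

    b[3] = 0x7732 * b[3] % mod
    b[3] = 0x7cae * b[3] % mod
    b[3] = b[0] * b[1]              # overwrites the two results above
    for i in range(0x0, 0x12):
        b[1] += i
    b[1] = b[2] * b[3]              # overwrites the loop result
    b[0], b[2] = b[2], b[0]         # the [x, x = y][0] idiom is a swap
    for i in range(0x4, 0x11):
        b[1] += i
    b[0] = 0x95cd * b[0] % mod

    # ('3111' + n.toString(16)).substr(-4): keep the last four hex digits
    return "".join(("3111" + format(x, "x"))[-4:] for x in b)


key_part = derive_key_part()
print(key_part)
```

As the instructions say, this value is only the second, frequently changing part of the key, so the sketch has to be redone each time the site ships new code.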
|
|
@remitamine just when i figured out how u did u provide tuto haha :) nice ! Thanks again |
|
Thanks for this explanation, sorry if I seemed offensive, that wasn't the idea. |
|
@remitamine Sorry to bother you once again, it doesn't work anymore.
|
|
will be fixed in the next version. |
|
Thanks. |
|
@remitamine I've got this issue, do you know what the problem is, please?
|
|
will be fixed in the next version. |
|
It works, thanks ! |
|
Hello I have a problem : WARNING: Unable to download webpage: HTTP Error 403: Forbidden |
|
i get this error when downloading subs from adn |
|
@hexodgo read the comment on how to extract and update the decryption key. |
|
I'm stuck at this step: "that gets set by _0xb03201 function:" ?
the line is
I don't understand the next step ... sorry |
|
need help :) |
|
@hexodgo everything has been explained before, read previous comments. |
Please follow the guide below
Put an x into all the boxes [ ] relevant to your issue (like that [x])
Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2017.04.11. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.
Before submitting an issue make sure you have:
What is the purpose of your issue?
The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your issue
If the purpose of this issue is a bug report, site support request or you are not completely sure provide the full verbose output as follows:
Add -v flag to your command line you run youtube-dl with, copy the whole output and insert it here. It should look similar to one below (replace it with your log inserted between triple ```):
If the purpose of this issue is a site support request please provide all kinds of example URLs support for which should be included (replace following example URLs by yours):
Description of your issue, suggested solution and other information
impossible to extract subtitles.