Youku.com download fails #7627

Closed
nxtreaming opened this issue Nov 24, 2015 · 10 comments

@nxtreaming nxtreaming commented Nov 24, 2015

root@ubuntu:/home/ubuntu# youtube-dl --verbose --user-agent "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36" "http://v.youku.com/v_show/id_XMTM5MjM5NTAyMA==.html"
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'--verbose', u'--user-agent', u'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36', u'http://v.youku.com/v_show/id_XMTM5MjM5NTAyMA==.html']
[debug] Encodings: locale ANSI_X3.4-1968, fs ANSI_X3.4-1968, out ANSI_X3.4-1968, pref ANSI_X3.4-1968
[debug] youtube-dl version 2015.11.24
[debug] Python version 2.7.6 - Linux-3.13.0-32-generic-x86_64-with-Ubuntu-14.04-trusty
[debug] exe versions: none
[debug] Proxy map: {}
[youku] XMTM5MjM5NTAyMA: Downloading JSON metadata 1
ERROR: Unable to download JSON metadata: [Errno 104] Connection reset by peer (caused by error(104, 'Connection reset by peer')); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 329, in _request_webpage
return self._downloader.urlopen(url_or_request)
File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 1878, in urlopen
return self._opener.open(req, timeout=self._socket_timeout)
File "/usr/lib/python2.7/urllib2.py", line 404, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 422, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
result = func(*args)
File "/usr/local/bin/youtube-dl/youtube_dl/utils.py", line 691, in http_open
req)
File "/usr/lib/python2.7/urllib2.py", line 1187, in do_open
r = h.getresponse(buffering=True)
File "/usr/lib/python2.7/httplib.py", line 1045, in getresponse
response.begin()
File "/usr/lib/python2.7/httplib.py", line 409, in begin
version, status, reason = self._read_status()
File "/usr/lib/python2.7/httplib.py", line 365, in _read_status
line = self.fp.readline(_MAXLINE + 1)
File "/usr/lib/python2.7/socket.py", line 476, in readline
data = self._sock.recv(self._rbufsize)

root@ubuntu:/home/ubuntu# youtube-dl --version
2015.11.24

@yan12125 yan12125 added the broken-IE label Nov 24, 2015

@freeyoung freeyoung commented Nov 24, 2015

Same here. This blog post could be useful: http://www.cnblogs.com/zhxilin/p/4993074.html


@nxtreaming nxtreaming commented Nov 25, 2015

I can confirm that the blog's algorithm works.

My local test (PHP code) appears to succeed:

    public function getM3U8Url() {
        if (false) {
            // Disabled branch: the getPlayList endpoint
            $content = $this->mycurl('http://v.youku.com/player/getPlayList/VideoIDS/'.$this->ykid.'/Pf/4/ctype/12/ev/1');
            $content = json_decode($content,true);
            $metadata0 = $content['data'][0];
            $source_ep = $metadata0['ep'];
            $ip = $metadata0['ip'];
        } else {
            // Active branch: the get.json endpoint
            $content = $this->mycurl('http://play.youku.com/play/get.json?vid='.$this->ykid.'&ct=12');
            $content = json_decode($content,true);
            $metadata0 = $content['data']['security'];
            $source_ep = $metadata0['encrypt_string'];
            $ip = $metadata0['ip'];
        }
        // Derive ep, sid and token, then assemble the m3u8 playlist URL
        $res = $this->generate_ep($this->ykid,$source_ep);
        $m3u8_url = 'http://pl.youku.com/playlist/m3u8?ctype=12&ep='.urlencode($res["ep"]).'&ev=1&keyframe=1&oip='.$ip.'&sid='.$res["sid"].'&token='.$res["token"].'&ts='.time().'&type='.$this->type.'&vid='.$this->ykid;
        return $m3u8_url;
    }

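Please compare the two branches of the `if (false) { ... } else { ... }` block: the `else` branch, which calls `play.youku.com/play/get.json`, is the one that works.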
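
For readers who want to see the URL layout in Python (the language of youtube-dl), here is a minimal sketch of the final step of the PHP snippet. The function name `build_m3u8_url` and its parameters are hypothetical, not part of youtube-dl; the query parameters are copied from the PHP code above. Note one small difference: the PHP code URL-encodes only `ep`, while `urlencode` here encodes every value, which should give the same result for plain alphanumeric values.

```python
import time
from urllib.parse import urlencode


def build_m3u8_url(vid, ep, sid, token, oip, video_type="mp4", ts=None):
    """Assemble the playlist URL with the same query parameters as the PHP snippet.

    Hypothetical helper: ep, sid and token are assumed to come from a
    generate_ep-style step, and oip from the 'ip' field of the metadata JSON.
    """
    if ts is None:
        ts = int(time.time())  # the PHP code uses time()
    params = [
        ("ctype", 12),
        ("ep", ep),
        ("ev", 1),
        ("keyframe", 1),
        ("oip", oip),
        ("sid", sid),
        ("token", token),
        ("ts", ts),
        ("type", video_type),  # flv, mp4 or hd2 according to the PHP comment
        ("vid", vid),
    ]
    return "http://pl.youku.com/playlist/m3u8?" + urlencode(params)
```

Passing `ts` explicitly makes the result reproducible, which is convenient when comparing against a URL produced by the PHP version.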


@yan12125 yan12125 commented Nov 25, 2015

Thanks @freeyoung. The author of that blog post has confirmed that we can use the algorithm in youtube-dl.


@catlovemouse catlovemouse commented Dec 2, 2015

@yan12125

***@Debian:/usr/local/lib/python2.7/dist-packages/youtube_dl$ youtube-dl --version
2015.11.27.1
The latest youtube-dl still fails for Youku URLs.

The latest source code does not use the algorithm posted at http://www.cnblogs.com/zhxilin/p/4993074.html.

When will Youku be supported again?

Thank you~


@roses007 roses007 commented Dec 8, 2015

Apologies ahead of time for this comment and my cluelessness, but I have a few questions regarding this issue with youku.com.

I know the code on youtube-dl has yet to be updated, but is the solution via the algorithm linked above still valid? I checked the blog page again and there seemed to be an addendum posted on it today, but since I had to rely on Google translate to understand it, I'm confused as to whether it's just an additional explanation or if the algorithm had to be changed again.

Additionally, youtube-dl was basically a life saver for me when I found out that I could use it for Youku and Tudou as both of those sites are horribly incompatible with my internet/location. I'd originally relied on this one website that could access and stream videos from both places, but neither site is compatible with it now (first Tudou videos stopped working, and since a few weeks ago, Youku). I'm thankful that I can still access/download Tudou videos via youtube-dl, but is there any alternative solution for clueless people like me to implement the Youku code manually while I wait for youtube-dl to update the code?

I tried downloading you-get instead for the time being, but I only ended up coming across problems trying to understand it and Python 3. All of that and the errors I kept encountering were too much for my inexperience (so much so that I literally feel nauseous every time I think about fixing the errors or attempting to execute that program again).

Sorry once again for this comment (along with any misuse of technical terms) and for not being more patient. I know it takes time to fix this kind of stuff, and I'm extremely grateful that youtube-dl exists and that it is as easily executable as it is for someone with little experience like me. But I just wanted to check if there is something that I can do on my end (and if there's any chance that someone wouldn't mind explaining it in layman's terms) in order to extract Youku videos during the wait.

I noticed that a test had been performed by @nxtreaming above, but as I mentioned, I have little knowledge of coding and related tasks/programs, so I don't know if that can actually be implemented as an alternative solution or if the test was just a simulation of sorts. A similar explanation applies to my earlier question regarding the algorithm on cnblogs, in case that can still somehow be used in a temporary solution.

Thank you so much ahead of time if anyone may have the time and patience to help!


@nxtreaming nxtreaming commented Dec 9, 2015

I just updated the PHP code to take HTTP cookies and the HTTP Referer header into account. With that change, Youku videos play back fine in m3u8 format.

The following is the full PHP code which I use to play youku's video:

    [...]getM3U8Url();
    setCookieAndLoaction($furl , $_GET["vid"]);

    class YKParser {
        private $ykid;
        private $type; /* flv,mp4,hd2 */

        public function __construct($vid, $type='mp4') {
            $this->ykid = $vid;
            $this->type = $type;
        }

        public function getM3U8Url() {
            if (false) {
                $content = $this->mycurl('http://v.youku.com/player/getPlayList/VideoIDS/'.$this->ykid.'/Pf/4/ctype/12/ev/1');
                $content = json_decode($content,true);
                $metadata0 = $content['data'][0];
                $source_ep = $metadata0['ep'];
                $ip = $metadata0['ip'];
            } else {
                $content = $this->mycurl_2('http://play.youku.com/play/get.json?vid='.$this->ykid.'&ct=12', $this->ykid);
                $content = json_decode($content,true);
                $metadata0 = $content['data']['security'];
                $source_ep = $metadata0['encrypt_string'];
                $ip = $metadata0['ip'];
            }
            $res = $this->generate_ep($this->ykid,$source_ep);
            $m3u8_url = 'http://pl.youku.com/playlist/m3u8?ctype=12&ep='.urlencode($res["ep"]).'&ev=1&keyframe=1&oip='.$ip.'&sid='.$res["sid"].'&token='.$res["token"].'&ts='.time().'&type='.$this->type.'&vid='.$this->ykid;
            return $m3u8_url;
        }

        // Plain GET request without a Referer header
        private function mycurl($url) {
            $curl = curl_init();
            curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
            curl_setopt($curl, CURLOPT_URL, $url);
            curl_setopt($curl, CURLOPT_TIMEOUT, 10);
            curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
            curl_setopt($curl, CURLOPT_HEADER, 0);
            $res = curl_exec($curl);
            curl_close($curl);
            return $res;
        }

        // Same as mycurl(), but sends the video page as the Referer
        private function mycurl_2($url, $vid) {
            $curl = curl_init();
            curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
            curl_setopt($curl, CURLOPT_URL, $url);
            curl_setopt($curl, CURLOPT_TIMEOUT, 10);
            curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
            curl_setopt($curl, CURLOPT_REFERER, "http://v.youku.com/v_show/id_${vid}");
            curl_setopt($curl, CURLOPT_HEADER, 0);
            $res = curl_exec($curl);
            curl_close($curl);
            return $res;
        }

        private function generate_ep($vid, $ep) {
            $f_code_1 = 'becaf9be';
            $f_code_2 = 'bf7e5f01';
            $e_code = $this->trans_e($f_code_1, base64_decode($ep));
            // explode() replaces split(), which was removed in PHP 7
            $tmp = explode("_",$e_code);
            $new_ep = $this->trans_e($f_code_2, $tmp[0]."_".$vid."_".$tmp[1]);
            $res["ep"] = base64_encode($new_ep);
            $res["sid"] = $tmp[0];
            $res["token"] = $tmp[1];
            return $res;
        }

        private function trans_e($a, $c) {
            $f = $h = 0;
            for($i=0; $i<256; $i++) $b[] = $i;
            $result = "";
            while ($h<256) {
                $f = ($f + $b[$h] + ord($a[$h % strlen($a)])) % 256;
                list($b[$f],$b[$h]) = array($b[$h],$b[$f]);
                $h++;
            }
            $q = $f = $h = 0;
            while ($q < strlen($c)) {
                $h = ($h + 1) % 256;
                $f = ($f + $b[$h]) % 256;
                list($b[$f],$b[$h]) = array($b[$h],$b[$f]);
                if(is_int($c[$q])) {
                    $result .= chr($c[$q] ^ $b[($b[$h] + $b[$f]) % 256]);
                } else {
                    $result .= chr(ord($c[$q]) ^ $b[($b[$h] + $b[$f]) % 256]);
                }
                $q++;
            }
            return $result;
        }
    }

    function setCookieAndLoaction($furl , $vid){
        $ch = curl_init('http://play.youku.com/play/get.json?vid='.$vid.'&ct=12');
        curl_setopt($ch, CURLOPT_REFERER, "http://v.youku.com/v_show/id_${vid}");
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        // get headers too with this line
        curl_setopt($ch, CURLOPT_HEADER, 1);
        $result1 = curl_exec($ch);
        // get cookie
        // multi-cookie variant contributed by @Combuster in comments
        preg_match_all('/^Set-Cookie:\s*([^;]*)/mi', $result1, $matches);
        $cookies = array();
        foreach($matches[1] as $item) {
            parse_str($item, $cookie);
            $cookies = array_merge($cookies, $cookie);
        }

        $cookies_str = "";
        foreach($cookies as $key => $value) {
            $cookies_str .= "$key=";
            $cookies_str .= "$value; ";
        }
        $cookies_str .= "domain=.youku.com; ";
        $cookies_str .= "path=/; ";
        $cookies_str .= "secure";
        header("Set-Cookie: $cookies_str; expires=".gmstrftime("%A, %d-%b-%Y %H:%M:%S GMT",time()+60*5));
        header("Location: ".$furl);
    }
    ?>
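
Reading the PHP above, `trans_e` has the shape of the standard RC4 stream cipher (key scheduling followed by keystream XOR), and `generate_ep` decrypts the server's `encrypt_string` with one key and re-encrypts `sid_vid_token` with another. The following is a minimal Python sketch of those two functions, not the code that ended up in youtube-dl. It assumes the decrypted string is ASCII and contains at least one underscore; the two key constants are copied from the PHP code.

```python
import base64

# Key constants copied from the PHP script above.
F_CODE_1 = b"becaf9be"
F_CODE_2 = b"bf7e5f01"


def trans_e(key: bytes, data: bytes) -> bytes:
    """RC4-style transform: the same call encrypts and decrypts."""
    # Key scheduling: permute 0..255 under control of the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Keystream generation, XORed with the input byte by byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def generate_ep(vid: str, encrypt_string: str) -> dict:
    """Derive ep, sid and token from the base64 'encrypt_string' field."""
    decoded = trans_e(F_CODE_1, base64.b64decode(encrypt_string)).decode("ascii")
    sid, token = decoded.split("_")[:2]  # PHP uses $tmp[0] and $tmp[1]
    new_ep = trans_e(F_CODE_2, f"{sid}_{vid}_{token}".encode("ascii"))
    return {
        "ep": base64.b64encode(new_ep).decode("ascii"),
        "sid": sid,
        "token": token,
    }
```

Because the transform is symmetric, applying `trans_e` twice with the same key returns the original bytes, which gives a quick way to sanity-check a port against the PHP version.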

@Celthi Celthi commented Dec 11, 2015

After setting the cookie and adding the Referer header, I was able to download the JSON data (attached below). The JSON data differs from what the extractor code expects: some keys are missing, such as data1['seed'], dt['no'], and so on.
I'm not sure whether I fetched the wrong JSON data or whether the JSON format has changed. I tried to fix the code on the assumption that the format has changed, but I don't know exactly what some of the keys and values mean, so I don't know which keys and values should replace the wrong ones. I would appreciate ANY HELP! I am trying my best to make it work.
The JSON data was obtained for this URL: http://v.youku.com/v_show/id_XNzEyMDc4Mzc2.html?from=y1.2-1-98.3.3-1.1-1-1-2-0
Json_download.txt

Update, 2015.12.11 20:15: I have opened a pull request. With reference to you-get, I fixed the video URL construction a bit so that it produces the right video URLs. I have tested some cases, and it works well on my branch https://github.com/Celthi/youtube-dl/tree/youku_bugfix
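
The Referer and cookie setup described in this thread could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the code from the pull request: the helper names `build_metadata_request` and `extract_security` are hypothetical, the `cookie` argument stands for whatever cookie string the server expects (this thread does not name it), and the JSON layout (`data` → `security` → `encrypt_string`, `ip`) is taken from the PHP script earlier in the thread. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_metadata_request(vid, cookie=None):
    """Build (but do not send) a get.json request carrying a Referer header."""
    url = "http://play.youku.com/play/get.json?vid=%s&ct=12" % vid
    headers = {"Referer": "http://v.youku.com/v_show/id_%s" % vid}
    if cookie:
        # Assumption: the caller supplies a ready-made "name=value" string.
        headers["Cookie"] = cookie
    return urllib.request.Request(url, headers=headers)


def extract_security(payload):
    """Pull the fields used by the PHP script out of the JSON response text."""
    security = json.loads(payload)["data"]["security"]
    return security["encrypt_string"], security["ip"]
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` and feeding the decoded body to `extract_security`.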


@yan12125 yan12125 commented Dec 11, 2015

I've implemented the M3U8 approach at https://github.com/yan12125/youtube-dl/tree/wip-youku, based on the algorithm proposed in http://www.cnblogs.com/zhxilin/p/4993074.html and some ideas from @nxtreaming's PHP script. However, the download speed is terribly slow (974.91B/s) and errors sometimes occur:

$ youtube-dl -v "http://v.youku.com/v_show/id_XMTM5MjM5NTAyMA==.html" --hls-prefer-native
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-v', 'http://v.youku.com/v_show/id_XMTM5MjM5NTAyMA==.html', '--hls-prefer-native']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2015.12.06
[debug] Git HEAD: 47f48f5
[debug] Python version 3.5.1 - Linux-4.3.0-1-ARCH-x86_64-with-arch-Arch-Linux
[debug] exe versions: ffmpeg 2.8.3, ffprobe 2.8.3, rtmpdump 2.4
[debug] Proxy map: {}
[youku] XMTM5MjM5NTAyMA: Downloading JSON metadata 2
[debug] Invoking downloader on 'http://pl.youku.com/playlist/m3u8?oip=2356209368&ev=1&ep=cSaQE0GMVc4D5CHfgT8bYnjjdiQJXJZ3kmKA%2FIgfBcVAOaHC6FHTxJS5&vid=XMTM5MjM5NTAyMA&ctype=12&sid=3449813031209125d12c6&keyframe=1&token=2399&type=3gphd'
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 296
[download] Destination: 中国之星 151121-XMTM5MjM5NTAyMA.mp4
[download]   0.0% of ~34.92MiB at  974.91B/s ETA 10:27:48ERROR: content too short (expected 123704 bytes and served 72783)
Traceback (most recent call last):
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 1607, in process_info
    success = dl(filename, info_dict)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 1549, in dl
    return fd.download(name, info)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/downloader/common.py", line 342, in download
    return self.real_download(filename, info_dict)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/downloader/hls.py", line 101, in real_download
    success = ctx['dl'].download(frag_filename, {'url': frag_url})
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/downloader/common.py", line 342, in download
    return self.real_download(filename, info_dict)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/downloader/http.py", line 236, in real_download
    raise ContentTooShortError(byte_counter, int(data_len))
youtube_dl.utils.ContentTooShortError: (72783, 123704)
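
The error in the log above means that a fragment delivered fewer bytes than the server announced. One common way to cope with this is to re-request the fragment a few times before giving up. The sketch below illustrates that idea only; it is not how youtube-dl handles the error, and `download_fragment`, `ContentTooShort` and the `fetch` callable are hypothetical names introduced for the example.

```python
class ContentTooShort(Exception):
    """Raised when fewer bytes arrive than were announced."""


def download_fragment(fetch, expected_len, max_attempts=3):
    """Call fetch() until it returns at least expected_len bytes.

    fetch is any zero-argument callable returning bytes (hypothetical;
    in practice it would wrap an HTTP request for one fragment URL).
    """
    served = 0
    for _ in range(max_attempts):
        data = fetch()
        served = len(data)
        if served >= expected_len:
            return data
    raise ContentTooShort(
        "expected %d bytes and served %d" % (expected_len, served)
    )
```

Whether retrying helps depends on why the server cut the response short, so this is a mitigation to try rather than a guaranteed fix.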

For those who would like to try my branch wip-youku:

  1. Install Python 2.6+ or 3.2+
  2. Download https://github.com/yan12125/youtube-dl/archive/wip-youku.zip and extract it
  3. Run python -m youtube_dl from the extracted directory

I'm in Taiwan. Has anyone, inside or outside Mainland China, encountered similar problems?


@yan12125 yan12125 commented Dec 12, 2015

Thanks, @Celthi. Youku videos will work again in the next version. Thanks for the report, and thanks to everyone who worked on it.

@yan12125 yan12125 closed this Dec 12, 2015