[mixcloud] can't download periodically (very often) #14088

Closed
PSlava opened this issue Aug 31, 2017 · 12 comments

Comments


@PSlava PSlava commented Aug 31, 2017

  • I've verified and I assure that I'm running youtube-dl 2017.08.27.1

Before submitting an issue make sure you have:

  • At least skimmed through the README, most notably the FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

youtube-dl intermittently fails to download from mixcloud.com. Sometimes it downloads without problems, but more often it fails, even though I've tried the same link many times.


./youtube-dl -v https://www.mixcloud.com/aemidesign/mdma/
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'https://www.mixcloud.com/aemidesign/mdma/']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2017.08.27.1
[debug] Python version 2.7.10 - Darwin-15.6.0-x86_64-i386-64bit
[debug] exe versions: ffmpeg 3.2.2, ffprobe 3.2.2
[debug] Proxy map: {}
[mixcloud] aemidesign-mdma: Downloading webpage
ERROR: Unable to extract play info; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
File "./youtube-dl/youtube_dl/YoutubeDL.py", line 776, in extract_info
ie_result = ie.extract(url)
File "./youtube-dl/youtube_dl/extractor/common.py", line 434, in extract
ie_result = self._real_extract(url)
File "./youtube-dl/youtube_dl/extractor/mixcloud.py", line 108, in _real_extract
r'm-play-info="([^"]+)"', webpage, 'play info')
File "./youtube-dl/youtube_dl/extractor/common.py", line 797, in _search_regex
raise RegexNotFoundError('Unable to extract %s' % _name)
RegexNotFoundError: Unable to extract play info; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

[Screenshot: 2017-08-31 17:34:26]
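
The traceback shows the extractor failing at the `m-play-info="([^"]+)"` search. Here is a minimal Python 3 sketch of why that could happen only some of the time, assuming the server alternates between a page that carries the attribute and one that does not. Both HTML snippets are made-up stand-ins, not real Mixcloud markup, and `find_play_info` is a hypothetical helper, not part of youtube-dl.

```python
import re

# Pattern copied from the traceback (extractor/mixcloud.py, line 108).
PLAY_INFO_RE = r'm-play-info="([^"]+)"'

# Hypothetical stand-ins for the two page variants; real pages are far larger.
old_page = '<span class="play-button" m-play-info="ZW5jb2RlZA=="></span>'
new_page = '<div id="react-root"></div>'


def find_play_info(webpage):
    """Return the captured attribute value, or None if the page lacks it."""
    match = re.search(PLAY_INFO_RE, webpage)
    return match.group(1) if match else None


print(find_play_info(old_page))  # attribute present, so the value is returned
print(find_play_info(new_page))  # attribute absent, so the result is None
```

In youtube-dl the second case surfaces as the `RegexNotFoundError` shown above rather than as `None`.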

Contributor

@kayb94 kayb94 commented Aug 31, 2017

I can reproduce this. Do you encounter this with other media from mixcloud too?
The webpage downloaded at line 85 of extractor/mixcloud.py evidently differs between requests (I received two different versions):
webpage = self._download_webpage(url, track_id)

With one version the download succeeds; with the other it does not. I don't know that extractor well enough to fix this myself, I think. Anyway, I can offer you a quick and dirty workaround (command line):
$ false; while [[ $? != 0 ]]; do youtube-dl https://www.mixcloud.com/aemidesign/mdma/ --limit-rate 500K; done
This just restarts youtube-dl every time it fails (which worked for me).
To speed things up, hit Ctrl+C as soon as the download rate drops. If you want to stop the loop before it's done, hit Ctrl+Z (which actually only suspends it) or just close the terminal.
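
The same retry idea can be sketched in Python 3 with an upper bound on attempts, so the loop cannot spin forever. This is only an illustration: `retry` and its parameters are hypothetical names, and the action is any callable that returns an exit status (0 meaning success).

```python
import time


def retry(action, max_attempts=10, delay_seconds=1.0):
    """Call action() until it returns 0 or max_attempts is reached.

    Returns the number of attempts used; raises RuntimeError on giving up.
    """
    for attempt in range(1, max_attempts + 1):
        if action() == 0:
            return attempt
        if attempt < max_attempts:
            # Pause briefly so the server is not hammered with requests.
            time.sleep(delay_seconds)
    raise RuntimeError('gave up after %d attempts' % max_attempts)
```

To wrap the command above, the action could be `lambda: subprocess.call(['youtube-dl', '--limit-rate', '500K', url])`, since `subprocess.call` returns the process exit status.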

Contributor

@ishitatsuyuki ishitatsuyuki commented Sep 2, 2017

It seems they're running an A/B test on their React frontend. I have analyzed the format and successfully cracked the XOR cipher. The last remaining step is to come up with a regex.

Contributor

@kayb94 kayb94 commented Sep 3, 2017

A regex for what?

Contributor

@ishitatsuyuki ishitatsuyuki commented Sep 4, 2017

@kayb94 For extracting the key. The code is heavily obfuscated, so I will probably use a partial known-plaintext attack to determine the first bytes of the key.
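
The thread does not disclose the actual scheme, so the following Python 3 sketch only illustrates the general technique, assuming a repeating-key XOR and a plaintext that starts with a known prefix such as `https://`. The key, the URL, and both function names are invented for the example.

```python
from itertools import cycle


def xor_bytes(data, key):
    """XOR data with a repeating key; the same call encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


def recover_key_prefix(ciphertext, known_prefix):
    """Recover the first len(known_prefix) key bytes from a known plaintext start."""
    return bytes(c ^ p for c, p in zip(ciphertext, known_prefix))


# Made-up key and URL, purely for illustration.
secret_key = b'example-key'
plaintext = b'https://stream.example.invalid/audio.m4a'
ciphertext = xor_bytes(plaintext, secret_key)

# XORing ciphertext with the known plaintext start cancels the plaintext
# and leaves the matching key bytes.
prefix = recover_key_prefix(ciphertext, b'https://')
print(prefix == secret_key[:8])
print(xor_bytes(ciphertext, secret_key) == plaintext)
```

A known prefix only reveals as many key bytes as the prefix is long, which is why the attack is described as partial.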

Contributor

@ishitatsuyuki ishitatsuyuki commented Sep 5, 2017

I'm working on this, and it's mostly a success. Stay tuned!

(Hiding details as the mixcloud guys are also watching)


@mixcloud-downloader mixcloud-downloader commented Sep 5, 2017

Let me hijack this issue. They're preparing to roll out a new version. This is what you'll get for https://www.mixcloud.com/peatnoise/peat-noise-hypotech-podcast-008/:

<!DOCTYPE html><html><head><title>Mixcloud</title><link rel="dns-prefetch" href="//thumbnailer.mixcloud.com" /><link rel="dns-prefetch" href="//waveform.mixcloud.com" /><link rel="icon" href="https://www.mixcloud.com/media/images/www/global/favicon.ico" sizes="16x16 32x32"><link rel="icon" href="https://www.mixcloud.com/media/images/www/global/favicon-32.png" type="image/png" sizes="32x32"/><link rel="icon" href="https://www.mixcloud.com/media/images/www/global/favicon-48.png" type="image/png" sizes="48x48"/><link rel="icon" href="https://www.mixcloud.com/media/images/www/global/favicon-64.png" type="image/png" sizes="64x64"/><meta name="application-name" content="Mixcloud" /><meta name="msapplication-tooltip" content="Launch Mixcloud"><meta name="msapplication-TileColor" content="#FFFFFF"><meta name="msapplication-TileImage" content="https://www.mixcloud.com/media/images/www/global/MS-TileImage.png"><link rel="apple-touch-icon" href="https://www.mixcloud.com/media/images/www/global/touch-icon-60.png"><link rel="apple-touch-icon" sizes="76x76" href="https://www.mixcloud.com/media/images/www/global/touch-icon-76.png"><link rel="apple-touch-icon" sizes="120x120" href="https://www.mixcloud.com/media/images/www/global/touch-icon-120.png"><link rel="apple-touch-icon" sizes="152x152" href="https://www.mixcloud.com/media/images/www/global/touch-icon-152.png"><meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1"><script type="text/javascript">window.NREUM||(NREUM={}),__nr_require=function(e,n,t){function r(t){if(!n[t]){var o=n[t]={exports:{}};e[t][0].call(o.exports,function(n){var o=e[t][1][n];return r(o||n)},o,o.exports)}return n[t].exports}if("function"==typeof __nr_require)return __nr_require;for(var o=0;o<t.length;o++)r(t[o]);return r}({1:[function(e,n,t){function r(){}function o(e,n,t){return function(){return i(e,[c.now()].concat(u(arguments)),n?null:this,t),n?void 0:this}}var 
i=e("handle"),a=e(2),u=e(3),f=e("ee").get("tracer"),c=e("loader"),s=NREUM;"undefined"==typeof window.newrelic&&(newrelic=s);var p=["setPageViewName","setCustomAttribute","setErrorHandler","finished","addToTrace","inlineHit","addRelease"],d="api-",l=d+"ixn-";a(p,function(e,n){s[n]=o(d+n,!0,"api")}),s.addPageAction=o(d+"addPageAction",!0),s.setCurrentRouteName=o(d+"routeName",!0),n.exports=newrelic,s.interaction=function(){return(new r).get()};var m=r.prototype={createTracer:function(e,n){var t={},r=this,o="function"==typeof n;return i(l+"tracer",[c.now(),e,t],r),function(){if(f.emit((o?"":"no-")+"fn-start",[c.now(),r,o],t),o)try{return n.apply(this,arguments)}finally{f.emit("fn-end",[c.now()],t)}}}};a("setName,setAttribute,save,ignore,onEnd,getContext,end,get".split(","),function(e,n){m[n]=o(l+n)}),newrelic.noticeError=function(e){"string"==typeof e&&(e=new Error(e)),i("err",[e,c.now()])}},{}],2:[function(e,n,t){function r(e,n){var t=[],r="",i=0;for(r in e)o.call(e,r)&&(t[i]=n(r,e[r]),i+=1);return t}var o=Object.prototype.hasOwnProperty;n.exports=r},{}],3:[function(e,n,t){function r(e,n,t){n||(n=0),"undefined"==typeof t&&(t=e?e.length:0);for(var r=-1,o=t-n||0,i=Array(o<0?0:o);++r<o;)i[r]=e[n+r];return i}n.exports=r},{}],4:[function(e,n,t){n.exports={exists:"undefined"!=typeof window.performance&&window.performance.timing&&"undefined"!=typeof window.performance.timing.navigationStart}},{}],ee:[function(e,n,t){function r(){}function o(e){function n(e){return e&&e instanceof r?e:e?f(e,u,i):i()}function t(t,r,o,i){if(!d.aborted||i){e&&e(t,r,o);for(var a=n(o),u=m(t),f=u.length,c=0;c<f;c++)u[c].apply(a,r);var p=s[y[t]];return p&&p.push([b,t,r,a]),a}}function l(e,n){v[e]=m(e).concat(n)}function m(e){return v[e]||[]}function w(e){return p[e]=p[e]||o(t)}function g(e,n){c(e,function(e,t){n=n||"feature",y[t]=n,n in s||(s[n]=[])})}var v={},y={},b={on:l,emit:t,get:w,listeners:m,context:n,buffer:g,abort:a,aborted:!1};return b}function i(){return new r}function 
a(){(s.api||s.feature)&&(d.aborted=!0,s=d.backlog={})}var u="nr@context",f=e("gos"),c=e(2),s={},p={},d=n.exports=o();d.backlog=s},{}],gos:[function(e,n,t){function r(e,n,t){if(o.call(e,n))return e[n];var r=t();if(Object.defineProperty&&Object.keys)try{return Object.defineProperty(e,n,{value:r,writable:!0,enumerable:!1}),r}catch(i){}return e[n]=r,r}var o=Object.prototype.hasOwnProperty;n.exports=r},{}],handle:[function(e,n,t){function r(e,n,t,r){o.buffer([e],r),o.emit(e,n,t)}var o=e("ee").get("handle");n.exports=r,r.ee=o},{}],id:[function(e,n,t){function r(e){var n=typeof e;return!e||"object"!==n&&"function"!==n?-1:e===window?0:a(e,i,function(){return o++})}var o=1,i="nr@id",a=e("gos");n.exports=r},{}],loader:[function(e,n,t){function r(){if(!x++){var e=h.info=NREUM.info,n=d.getElementsByTagName("script")[0];if(setTimeout(s.abort,3e4),!(e&&e.licenseKey&&e.applicationID&&n))return s.abort();c(y,function(n,t){e[n]||(e[n]=t)}),f("mark",["onload",a()+h.offset],null,"api");var t=d.createElement("script");t.src="https://"+e.agent,n.parentNode.insertBefore(t,n)}}function o(){"complete"===d.readyState&&i()}function i(){f("mark",["domContent",a()+h.offset],null,"api")}function a(){return E.exists&&performance.now?Math.round(performance.now()):(u=Math.max((new Date).getTime(),u))-h.offset}var u=(new Date).getTime(),f=e("handle"),c=e(2),s=e("ee"),p=window,d=p.document,l="addEventListener",m="attachEvent",w=p.XMLHttpRequest,g=w&&w.prototype;NREUM.o={ST:setTimeout,SI:p.setImmediate,CT:clearTimeout,XHR:w,REQ:p.Request,EV:p.Event,PR:p.Promise,MO:p.MutationObserver};var v=""+location,y={beacon:"bam.nr-data.net",errorBeacon:"bam.nr-data.net",agent:"js-agent.newrelic.com/nr-1044.min.js"},b=w&&g&&g[l]&&!/CriOS/.test(navigator.userAgent),h=n.exports={offset:u,now:a,origin:v,features:{},xhrWrappable:b};e(1),d[l]?(d[l]("DOMContentLoaded",i,!1),p[l]("load",r,!1)):(d[m]("onreadystatechange",o),p[m]("onload",r)),f("mark",["firstbyte",u],null,"api");var 
x=0,E=e(4)},{}]},{},["loader"]);</script><script type="text/javascript">window.NREUM||(NREUM={});NREUM.info={"beacon":"bam.nr-data.net","queueTime":0,"licenseKey":"68b8ba3b8c","agent":"","transactionName":"NgBTN0ZTXRUDVRdeXA9KdxZaUUcPDVhMWloZBl0MQVYdERVBTUVWAAZFTUJbVhERDDFSUgIRZwpRRR0BB0I=","applicationID":"752811","errorBeacon":"bam.nr-data.net","applicationTime":10}</script><meta property="og:site_name" content="Mixcloud" /><meta property="og:locale" content="en_US" /><meta property="fb:app_id" content="49631911630" /><meta name="twitter:site" content="@mixcloud" /><meta name="twitter:app:name:iphone" content="Mixcloud" /><meta name="twitter:app:id:iphone" content="401206431" /><meta name="twitter:app:name:ipad" content="Mixcloud" /><meta name="twitter:app:id:ipad" content="401206431" /><meta name="twitter:app:name:googleplay" content="Mixcloud"/><meta name="twitter:app:id:googleplay" content="com.mixcloud.player"/><link href='//fonts.googleapis.com/css?family=Open+Sans:300,400,600,700' rel='stylesheet' type='text/css'><link href="https://www.mixcloud.com/media/css/www.8863f762b75122e1fec2f286ed53cbd9.css" type="text/css" rel="stylesheet" /></head><body><img src="https://stream9.mixcloud.com/1x1.gif" height="1" width="1" style="position: fixed; bottom: 0; right: 0; z-index: 0;"><script>
    window.trackJsStatus = function(method, goal) {
        try{
            var xhr = new XMLHttpRequest();
            xhr.open(method, "/react/track_experiment/?goal=" + goal);

            if (method === 'POST') {
                var CSRF_COOKIE_RE = /(?:^|;)\s*csrftoken=([^;\s]*)/;
                var result = CSRF_COOKIE_RE.exec(document.cookie);
                var csrftoken = result ? result[1] : null;
                xhr.setRequestHeader("X-CSRFToken", csrftoken);
            }
            
            xhr.send(null);
        } catch (e) {}
    };
</script><script src="https://www.mixcloud.com/media/js/www_manifest.78d1a5e03a752065a839m.js"></script><div id="react-root"></div><script src="https://www.mixcloud.com/media/js/www_vendor.878ff2d8befc101710b0m.js"></script><script src="https://www.mixcloud.com/media/js/www.8fac97a8707546c867d8m.js"></script><script src="https://connect.facebook.net/en_US/sdk.js" async></script><div style="position: absolute; left: -999px"><svg width="20" height="10"><defs><clipPath id="hovercardMask"><polygon points="0 10,20 10,10 0"></polygon></clipPath></defs></svg></div><input type='hidden' name='csrfmiddlewaretoken' value='aj85HuRYjnotvvR3T0LAh3r0MxhyB4dxV5g0kPKgu82Q3sFgALMHkAfDliBUHCC3' /><script>
        try {
            window.trackJsStatus('GET', 'load');
            window.trackJsStatus('POST', 'load');
        } catch (e) {}
    </script></body></html>

Looking at the requests, you'll have to POST to their https://www.mixcloud.com/graphql endpoint to get the stream url.
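
As a rough Python 3 sketch of what such a POST could look like with the standard library: the endpoint URL comes from the comment above, but the query text, the variable names, and the `build_graphql_request` helper are placeholders, because the thread does not show the real schema. The request is only constructed here, not sent.

```python
import json
import urllib.request

GRAPHQL_URL = 'https://www.mixcloud.com/graphql'


def build_graphql_request(query, variables):
    """Build (but do not send) a JSON POST request for a GraphQL endpoint."""
    payload = json.dumps({'query': query, 'variables': variables}).encode('utf-8')
    return urllib.request.Request(
        GRAPHQL_URL,
        data=payload,  # supplying a body makes urllib issue a POST
        headers={'Content-Type': 'application/json'},
    )


# Placeholder query and variables; the real ones are not given in the thread.
request = build_graphql_request('query { placeholder }', {'slug': 'example'})
print(request.get_method())
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen`. The page script above also sets an `X-CSRFToken` header on POST requests, so the real endpoint may expect one as well.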

Contributor

@ishitatsuyuki ishitatsuyuki commented Sep 5, 2017

I don't understand what you're talking about, and please stop trying to troll us. I cannot reproduce your output. All I can get is either the old AngularJS version or the new React version. Note that if you send too many requests, the site seems to challenge your browser with some obfuscated code.

I'm aware that a GraphQL API is available, but its output is also XORed, and we need a reliable way to retrieve the key.

Contributor

@kayb94 kayb94 commented Sep 5, 2017

@ishitatsuyuki To be honest, I don't exactly get what you are doing, but it sounds cool. xD Anyway, just say if I can help out with something easy to save you some effort or so.


@mixcloud-downloader mixcloud-downloader commented Sep 6, 2017

There's no reason to get personal @ishitatsuyuki. I just wanted to give you a head start. They did an A/B test, which stopped yesterday. If you feel I trolled you, feel free to ignore that piece of information.

Author

@PSlava PSlava commented Sep 12, 2017

Now youtube-dl fails to download mixcloud every time.

Contributor

@kayb94 kayb94 commented Sep 12, 2017

@ishitatsuyuki has already created a pull request, see #14132
I have verified that it still works. You just have to wait for it to be merged into youtube-dl.

@kayb94 kayb94 mentioned this issue Sep 12, 2017
5 of 5 tasks complete

@AndyRubio AndyRubio commented Sep 15, 2017

I wonder why 'the mixcloud guys' are being such dicks these days. Haven't they got better things to do?
