This repository has been archived by the owner on Sep 28, 2020. It is now read-only.

403s #19

Closed
joedean3 opened this issue Jun 21, 2013 · 32 comments

Comments

@joedean3

http://www.youtube.com/watch?v=kDWgsQhbaqU

many others

@flagbug
Owner

flagbug commented Jun 21, 2013

Works fine here

Can you please post the stack trace and the version you are using?

@joedean3
Author

I just pulled your latest version (it was 15 hours old when I updated). The older version I was running returned a 403; the new one throws an exception instead. I am going to double-check that I integrated your latest correctly, but it still downloads other videos.

I’m now calling DownloadUrlResolver with http://www.youtube.com/watch?v=k6EQAOmJrbw

The exception thrown is ‘the given key was not present in the dictionary’ and it is throwing on the first iteration of

foreach (Uri url in downloadUrls)

I think because downloadUrls is null

I’m attaching the pagesource in pagesource.html

Here is the stack trace:

YoutubeExtractor.dll!YoutubeExtractor.DownloadUrlResolver.ThrowYoutubeParseException(System.Exception innerException) Line 196 C#

YoutubeExtractor.dll!YoutubeExtractor.DownloadUrlResolver.GetDownloadUrls(string videoUrl) Line 49 + 0x8 bytes C#

@flagbug
Owner

flagbug commented Jun 21, 2013

Are you sure you are running version 0.6.2? Because 0.6.1 and 0.6.0 had exactly the issue you encountered.

@michael79

Hello,

I've been experiencing 403 issues as well for the last couple of days, first on 0.5.0, and later after integrating the changes I could from 0.6.2 (still running VS2010).

Examples of videos that trigger 403s:
http://www.youtube.com/watch?v=Km-0g7LzTYk
http://www.youtube.com/watch?v=V8e3NXxWSDI
http://www.youtube.com/watch?v=JLCBExg1ppM

Also, I believe that I've seen the exception that joedean3 is talking about. It is thrown by DownloadUrlResolver's ExtractDownloadUrls() method. It's caused by YouTube backend changes: in the videos that trigger 403s, the "sig" parameter has been renamed to "s", causing the following line to fail:

string url = string.Format("{0}&fallback_host={1}&signature={2}", queries["url"], queries["fallback_host"], queries["sig"]);

I've replaced it with the following:

string url = string.Empty;

if (queries.Keys.Contains("sig"))
{
    url = string.Format("{0}&fallback_host={1}&signature={2}", queries["url"], queries["fallback_host"], queries["sig"]);
}
else if (queries.Keys.Contains("s"))
{
    url = string.Format("{0}&fallback_host={1}&signature={2}", queries["url"], queries["fallback_host"], queries["s"]);
}
else
{
    // exception thrown here ...
}

However, while that gets rid of the exceptions, it doesn't solve the 403 issue.
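A minimal Python sketch of the same fallback, for readers who want to experiment outside the C# project. It assumes each stream-map entry is an ordinary query string; the function name `pick_signature` and the sample URL are hypothetical. Later comments in this thread suggest that a value arriving under "s" is enciphered, which would explain why simply renaming the key removes the exception but not the 403, so the sketch also reports which key was found.

```python
from urllib.parse import parse_qs


def pick_signature(stream_entry):
    """Return (value, needs_deciphering) for one stream-map entry.

    Hypothetical helper: 'sig' is assumed to be usable as-is, while 's'
    is assumed to need deciphering before it goes into the URL.
    """
    # parse_qs returns lists of values; keep the first value of each key.
    queries = {key: values[0] for key, values in parse_qs(stream_entry).items()}
    if "sig" in queries:
        return queries["sig"], False
    if "s" in queries:
        return queries["s"], True
    raise KeyError("no 'sig' or 's' parameter in stream entry")


# Example entries (made-up values, not real YouTube data).
plain = pick_signature("url=http%3A%2F%2Fexample.invalid%2Fv&sig=ABC")
ciphered = pick_signature("url=http%3A%2F%2Fexample.invalid%2Fv&s=DEF")
```

Returning the flag rather than silently substituting "s" for "sig" keeps the caller aware that an extra step is still missing.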

@flagbug
Owner

flagbug commented Jun 26, 2013

Hm, I still can't reproduce the 403 exception.

Can you please post the whole exception including the exception name, message and stacktrace?

@joedean3
Author

I just ran one test and it’s not happening anymore… At least not on the video I was testing before.

I will surely follow up with you should I get it or any other issues again.

Thanks

JOE


@michael79

On my machine, the 403s are continuing as usual. To make sure they weren't coming from my code, I downloaded the last version (0.6.2), and used the included ExampleApplication. Initially, I got parsing exceptions, but after I replaced the "sig" parameter with "s" (as specified in my previous message), the 403s came back.

URL of the video I used: http://www.youtube.com/watch?v=rgqHT-iy8kA

Exception details:

System.Net.WebException was unhandled
  Message=The remote server returned an error: (403) Forbidden.
  Source=YoutubeExtractor
  StackTrace:
       at YoutubeExtractor.VideoDownloader.<>c__DisplayClass2.<Execute>b__0(Object sender, AsyncCompletedEventArgs args) in C:\...\YoutubeExtractor\VideoDownloader.cs:line 51
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
       at System.Threading._ThreadPoolWaitCallback.PerformWaitCallbackInternal(_ThreadPoolWaitCallback tpWaitCallBack)
       at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback(Object state)
  InnerException: 

ResponseURI:

{http://r2---sn-cx1x9-ua8z.c.youtube.com/videoplayback?cp=U0hWR1JRU19IU0NONl9KS1dKOkZJNTVtakdLbVZS&expire=1372265939&ms=au&source=youtube&sver=3&upn=KpWgYbVyc6E&id=ae0a874fe8b2f240&itag=18&gcr=il&ipbits=8&ratebypass=yes&fexp=933900,904448,932402,929223,916625,921047,928201,901208,929123,929915,929906,929907,929125,925714,929917,929919,931202,912512,912515,912521,906838,904488,906840,931910,931913,932227,904830,919373,933701,904122,900816,926403,909421,912711&newshard=yes&sparams=cp,gcr,id,ip,ipbits,itag,ratebypass,source,upn,expire&mt=1372242341&mv=m&ip=79.177.3.250&key=yt1&fallback_host=tc.v4.cache5.c.youtube.com&signature=77425A3D8C402F6A2E24969448EB53726827D1BED.0AD9269310EB6543A0B4098959FC35704F79112A0}

@joedean3
Author

OK, here it is…

I was getting 403s but with your latest, it throws an exception:

Name:System.Collections.Generic.KeyNotFoundException

Message: "The given key was not present in the dictionary."

StackTrace:

at System.Collections.Generic.Dictionary`2.get_Item(TKey key)

at YoutubeExtractor.DownloadUrlResolver.d__0.MoveNext() in DownloadUrlResolver.cs:line 81

at YoutubeExtractor.DownloadUrlResolver.GetVideoInfos(IEnumerable`1 downloadUrls, String videoTitle) in DownloadUrlResolver.cs:line 94

at YoutubeExtractor.DownloadUrlResolver.GetDownloadUrls(String videoUrl) in DownloadUrlResolver.cs:line 44


@flagbug
Owner

flagbug commented Jun 26, 2013

What is the format code of the video you are downloading?

@michael79

18

@flagbug
Owner

flagbug commented Jun 26, 2013

Ha! Now I can reproduce the KeyNotFoundException, thanks!

@flagbug
Owner

flagbug commented Jun 26, 2013

OK, as it turns out, this is because the uploader has not made the video available in my country... let's see if I can throw a more descriptive exception.

Could you please give me another video URL with the format code that throws a 403 exception?

@michael79

@flagbug
Owner

flagbug commented Jun 26, 2013

And the format code? This is important, as I believe that it throws only on certain video types.

@michael79

The same one, 18. Actually, I tried a bunch of other codes from GetDownloadUrls(), and they all throw 403s: 44, 35, 43, 34, 5.

@flagbug
Owner

flagbug commented Jun 26, 2013

I hate that I can't reproduce this. In which country do you live? Maybe I can pipe my connection through a proxy in your country and see if this "solves" it.

@michael79

I'm from Israel.

@michael79

Thank you for your efforts, I must go for now, but I'll be back in the evening.

@smurz

smurz commented Jun 26, 2013

Well, it seems I got it working most of the time. This problem mostly occurs with VEVO videos.
The problem is that the 's' signature is enciphered in some way, and for these videos the direct video URLs have to be taken not from 'get_video_info' but from the 'ytplayer.config' object in the page source of the video itself.

Take a look at this project
https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/youtube.py

Here is some code of mine that seems to solve the problem for most of those videos. Love your project. I hope it helps.

.....
var pageSource = HttpHelper.DownloadString(videoUrl);
.....
var downloadUrls = ExtractDownloadUrls(source, pageSource);
......

    private static IEnumerable<Uri> ExtractDownloadUrls(string videoInfo, string pageSource)
    {
        var playerStreamMap = string.Empty;

        var playerConfig = GetInsideBraces(pageSource, ";ytplayer.config = ");
        if (playerConfig.Length != 0)
        {
            playerConfig = Regex.Unescape(playerConfig);
            var sMatch = Regex.Matches(playerConfig, "[&,]s=");

            //check for 's' signature
            if (sMatch.Count != 0)
            {
                var fmt = Regex.Match(playerConfig, "\"url_encoded_fmt_stream_map\": \".*?\"");
                if (fmt.Groups.Count != 0)
                {
                    playerStreamMap = fmt.Groups[0].Value.Split(':')[1].Replace("\"", "");
                }
            }
        }

        var urlMap = string.IsNullOrEmpty(playerStreamMap) ? HttpHelper.ParseQueryString(videoInfo)["url_encoded_fmt_stream_map"] : playerStreamMap;

        string[] splitByUrls = urlMap.Split(',');

        var url = string.Empty;
        foreach (string str in splitByUrls)
        {
            var queries = HttpHelper.ParseQueryString(str);
            if (queries.ContainsKey("url"))
            {
                if (queries.Keys.Contains("sig"))
                {
                    url = string.Format("{0}&fallback_host={1}&signature={2}", queries["url"], queries["fallback_host"], queries["sig"]);
                }
                else if (queries.Keys.Contains("s"))
                {
                    var s = DecodeSignature(queries["s"]);

                    url = string.Format("{0}&fallback_host={1}&signature={2}", queries["url"], queries["fallback_host"], s);
                }

                if (!url.Contains("ratebypass"))
                    url += "&ratebypass=yes";
            }

            //url = HttpHelper.UrlDecode(url);
            //url = HttpHelper.UrlDecode(url);

            yield return new Uri(url);
        }
    }

    private static string GetInsideBraces(string source, string searchString)
    {
        var index = source.IndexOf(searchString, StringComparison.Ordinal);
        if (index == -1) return string.Empty; // not found: the caller tests .Length, so do not return null

        var start = index + searchString.Length - 1;

        var result = "{";
        var braces = 1;
        start = source.IndexOf('{', start);

        for (var i = start + 1; i < source.Length && braces != 0; i++)
        {
            result += source[i];
            if (source[i] == '{') braces++;
            if (source[i] == '}') braces--;
        }

        return result;
    }

    private static string DecodeSignature(string s)
    {
        var split = s.Split('.');
        if (split.Length != 2 || split[0].Length != 43 || split[1].Length != 43)
            throw new ArgumentException("Can't decrypt signature: " + s);

        var a = split[0];
        var b = split[1];

        b = b.Substring(0, 8) + a[0] + b.Substring(9, 9) + b[b.Length - 4] + b.Substring(19, 20) + b[18];
        b = b.Substring(0, 40);
        a = a.Substring(a.Length - 40, 40);

        var sb = new StringBuilder();
        sb.Append((a + "." + b).Reverse().ToArray());


        var result = sb.ToString();
        return result;
    }
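The brace-matching step is the part most likely to need testing in isolation, so here is a rough Python sketch of the same idea as GetInsideBraces(). It is not a port of the project's code: the function name is hypothetical, it returns an empty string when the marker or a balanced object is not found, and, like the C# above, it does not account for braces that appear inside string literals.

```python
def get_inside_braces(source, search_string):
    """Return the first balanced {...} block that follows search_string.

    Returns "" if the marker is missing or the braces never balance.
    Braces inside quoted strings are NOT handled (same limitation as
    the C# version in this comment).
    """
    index = source.find(search_string)
    if index == -1:
        return ""
    start = source.find("{", index + len(search_string))
    if start == -1:
        return ""
    depth = 0
    for i in range(start, len(source)):
        if source[i] == "{":
            depth += 1
        elif source[i] == "}":
            depth -= 1
            if depth == 0:
                # Slice once instead of appending character by character.
                return source[start:i + 1]
    return ""


# Made-up page fragment for illustration.
page = 'var x = 1;ytplayer.config = {"args": {"a": 1}};more = {};'
config = get_inside_braces(page, ";ytplayer.config = ")
```

Slicing the source once at the end avoids the repeated string concatenation of the loop above, which matters for page sources of several hundred kilobytes.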

@michael79

smurz, where does the first parameter ( videoInfo ) in ExtractDownloadUrls() come from?

@smurz

smurz commented Jun 27, 2013

michael79, it's a source of the get_video_info page

string requestUrl = String.Format("http://www.youtube.com/get_video_info?&video_id={0}&el=detailpage&ps=default&eurl=&gl=US&hl=en", id);

string videoInfo = HttpHelper.DownloadString(requestUrl);

Though this method does not seem to work if the signature has any form other than "43 chars point 43 chars".

@michael79

[Part I/II]

With some changes, I managed to get smurz's solution to work:

First of all, your ExtractDownloadUrls() has a bug: in about 50% of the cases, playerStreamMap will have a leading space, which then causes the following line to fail:

if (queries.ContainsKey("url"))

as the key will then be " url" instead of "url" (note the space).

I've fixed this by trimming the leading & trailing spaces:

playerStreamMap = playerStreamMap.Trim();

The complete code for ExtractDownloadUrls() looks like this:

        private static IEnumerable<Uri> ExtractDownloadUrls(string videoInfo, string pageSource)
        {
            var playerStreamMap = string.Empty;

            var playerConfig = GetInsideBraces(pageSource, ";ytplayer.config = ");
            if (playerConfig.Length != 0)
            {
                playerConfig = Regex.Unescape(playerConfig);
                var sMatch = Regex.Matches(playerConfig, "[&,]s=");

                //check for 's' signature
                if (sMatch.Count != 0)
                {
                    var fmt = Regex.Match(playerConfig, "\"url_encoded_fmt_stream_map\": \".*?\"");
                    if (fmt.Groups.Count != 0)
                    {
                        playerStreamMap = fmt.Groups[0].Value.Split(':')[1].Replace("\"", "");
                    }
                }
            }

            // Trim possible leading space
            playerStreamMap = playerStreamMap.Trim();
            var urlMap = string.IsNullOrEmpty(playerStreamMap) ? HttpHelper.ParseQueryString(videoInfo)["url_encoded_fmt_stream_map"] : playerStreamMap;

            string[] splitByUrls = urlMap.Split(',');

            var url = string.Empty;
            foreach (string str in splitByUrls)
            {
                var queries = HttpHelper.ParseQueryString(str);
                if (queries.ContainsKey("url"))
                {
                    if (queries.Keys.Contains("sig"))
                    {
                        url = string.Format("{0}&fallback_host={1}&signature={2}", queries["url"], queries["fallback_host"], queries["sig"]);
                    }
                    else if (queries.Keys.Contains("s"))
                    {
                        string sParam = queries["s"];
                        string sDecoded = DecodeSignature(sParam);

                        url = string.Format("{0}&fallback_host={1}&signature={2}", queries["url"], queries["fallback_host"], sDecoded);
                    }

                    if (!url.Contains("ratebypass"))
                        url += "&ratebypass=yes";
                }

                yield return new Uri(url);
            }
        }
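The leading-space problem is easy to reproduce outside the project. The following Python sketch uses a made-up config fragment and the standard library's `parse_qs` as a stand-in for HttpHelper.ParseQueryString(), so it only illustrates the mechanism and assumes the C# parser treats keys the same way.

```python
from urllib.parse import parse_qs

# Made-up fragment in the shape matched by the regex above:
# note the space after the colon.
fragment = '"url_encoded_fmt_stream_map": "url=http%3A%2F%2Fexample.invalid%2Fv&s=ABC"'

# Same steps as the C# code: split on ':', take the second part, drop quotes.
raw = fragment.split(":")[1].replace('"', "")

untrimmed = parse_qs(raw)          # first key keeps the leading space
trimmed = parse_qs(raw.strip())    # first key is the expected "url"
```

Without the trim, the first key is " url", so a lookup for "url" fails even though the data is present.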

@michael79

[Part II/II]

Second, I changed the DecodeSignature() method, so that it works with the signatures I'm getting - which are in the form of (42 chars).(42 chars). I've taken the algorithm from a link that smurz gave here: https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/youtube.py

For everyone's convenience, here is the original code (in Python)

    def _decrypt_signature(self, s):
        """Decrypt the key"""

        if len(s) == 88:
            return s[48] + s[81:67:-1] + s[82] + s[66:62:-1] + s[85] + s[61:48:-1] + s[67] + s[47:12:-1] + s[3] + s[11:3:-1] + s[2] + s[12]
        elif len(s) == 87:
            return s[62] + s[82:62:-1] + s[83] + s[61:52:-1] + s[0] + s[51:2:-1]
        elif len(s) == 86:
            return s[2:63] + s[82] + s[64:82] + s[63]
        elif len(s) == 85:
            return s[76] + s[82:76:-1] + s[83] + s[75:60:-1] + s[0] + s[59:50:-1] + s[1] + s[49:2:-1]
        elif len(s) == 84:
            return s[83:36:-1] + s[2] + s[35:26:-1] + s[3] + s[25:3:-1] + s[26]
        elif len(s) == 83:
            return s[52] + s[81:55:-1] + s[2] + s[54:52:-1] + s[82] + s[51:36:-1] + s[55] + s[35:2:-1] + s[36]
        elif len(s) == 82:
            return s[36] + s[79:67:-1] + s[81] + s[66:40:-1] + s[33] + s[39:36:-1] + s[40] + s[35] + s[0] + s[67] + s[32:0:-1] + s[34]

        else:
            raise ExtractorError(u'Unable to decrypt signature, key length %d not supported; retrying might work' % (len(s)))

Unfortunately, I've been unable to find videos with other signature forms (I suspect these are location-dependent), so my method won't work with signatures other than (42 chars).(42 chars).

        private static string DecodeSignature(string s)
        {
            string result = string.Empty;

            switch (s.Length)
            {
                case 86:    // (42 chars).(42 chars)

                    result = s.Substring(2, 61) + s[82] + s.Substring(64, 18) + s[63];
                    break;

                /* Other signature lengths will go here */

                default:

                    throw new ArgumentException("Can't decrypt signature: " + s);
            }

            return result;
        }

If any of you are getting different signatures, and want to convert the Python code to C#, you can use the online Python compiler at http://www.compileonline.com/execute_python_online.php, to make sure that your code is identical.
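One way to run that check locally is sketched below in Python. The length-86 rule is copied from the youtube-dl snippet above; `substring` is a hypothetical helper that mimics C#'s Substring(start, length), and the sample input is an arbitrary 86-character string rather than a real signature.

```python
import string


def python_rule_86(s):
    # Copied from the youtube-dl snippet quoted above (len(s) == 86 branch).
    return s[2:63] + s[82] + s[64:82] + s[63]


def substring(s, start, length):
    # Mimics C# string.Substring(start, length).
    return s[start:start + length]


def csharp_style_rule_86(s):
    # Mirrors the C# 'case 86' line, using Substring-style arguments.
    return substring(s, 2, 61) + s[82] + substring(s, 64, 18) + s[63]


# Arbitrary 86-character test input (not a real signature).
sample = ((string.ascii_letters + string.digits) * 2)[:86]
same = python_rule_86(sample) == csharp_style_rule_86(sample)
```

The point to watch when porting is that a Python slice `s[a:b]` takes an end index, while Substring takes a length, so `s[2:63]` becomes Substring(2, 61).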

@RonAsor

RonAsor commented Jul 3, 2013

It's mostly applied to VEVO links. You don't have to worry about country-specific content, as it's widely applied to them in any region.

@flagbug
Owner

flagbug commented Jul 7, 2013

I just published a NuGet pre-release package that hopefully fixes this issue.

If everything works, I'll release a non-alpha package.

@smurz

smurz commented Jul 8, 2013

Well, the signature decrypting procedure is contained in the YouTube player itself. If you decompile it (I used ShowMyCode.com), you'll see a class called SignatureDecipher written in ActionScript. The "decipher" function produces the correct signature.

public class SignatureDecipher {

    public static var TIMESTAMP:Number = 1588;

    private static function swap_1588(_arg1:Array, _arg2:Number):Array{
        var _local3:String = _arg1[0];
        var _local4:String = _arg1[(_arg2 % _arg1.length)];
        _arg1[0] = _local4;
        _arg1[_arg2] = _local3;
        return (_arg1);
    }
    private static function reverse_1588(_arg1:Array):Array{
        _arg1.reverse();
        return (_arg1);
    }
    private static function clone_1588(_arg1:Array, _arg2:Number):Array{
        return (_arg1.slice(_arg2));
    }
    public static function decipher(_arg1:String):String{
        var _local2:Array = _arg1.split("");
        _local2 = reverse_1588(_local2);
        _local2 = clone_1588(_local2, 3);
        _local2 = swap_1588(_local2, 19);
        _local2 = reverse_1588(_local2);
        _local2 = clone_1588(_local2, 2);
        return (_local2.join(""));
    }
}
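For anyone who wants to try these steps without the Flash player, here is a rough Python transcription of the decipher function above. The function names are hypothetical, and one detail is an assumption: the ActionScript writes to index `_arg2` directly, whereas this sketch takes the index modulo the length for both the read and the write.

```python
def swap(chars, n):
    """Exchange element 0 with element n (index taken modulo the length).

    Assumption: the modulo is applied to both the read and the write,
    which differs slightly from the decompiled ActionScript.
    """
    chars = list(chars)
    j = n % len(chars)
    chars[0], chars[j] = chars[j], chars[0]
    return chars


def decipher_1588(signature):
    chars = list(signature)
    chars.reverse()          # reverse_1588
    chars = chars[3:]        # clone_1588(_, 3): drop the first 3 elements
    chars = swap(chars, 19)  # swap_1588(_, 19)
    chars.reverse()          # reverse_1588
    chars = chars[2:]        # clone_1588(_, 2): drop the first 2 elements
    return "".join(chars)


# Arbitrary input to show the shape of the transformation.
result = decipher_1588("abcdefghijklmnopqrstuvwxyz")
```

The routine always removes five characters in total, so an 86-character input yields an 81-character result.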

@flagbug
Owner

flagbug commented Jul 8, 2013

@smurz The latest commit does exactly this

@lakeba

lakeba commented Jul 9, 2013

The latest -pre package didn't solve the 403 problem:

https://www.youtube.com/watch?v=9bZkp7q19f0

@flagbug
Owner

flagbug commented Jul 9, 2013

Ok guys, I'm finally able to reproduce the 403 error without using a proxy, for this video:
http://www.youtube.com/watch?v=kn6-c223DUU

This makes everything a lot easier

@dentex

dentex commented Jul 11, 2013

Hi.
I've been able to download the video streams from http://www.youtube.com/watch?v=kn6-c223DUU with my Android App porting this function to Java:

{  
    function SignatureDecipher () {  
    }  
    static function decipher(str) {  
        var _local3 = str.split("");  
        _local3 = reverse_15888(_local3);  
        _local3 = clone_15888(_local3, 2);  
        _local3 = reverse_15888(_local3);  
        return(_local3.join(""));  
    }  
    static function clone_15888(arr, len) {  
        return(arr.slice(len));  
    }  
    static function reverse_15888(arr) {  
        arr.reverse();  
        return(arr);  
    }  
}

I found it in a .swf embedded in the YT web page, as suggested above. The (long) function ported before from gantt's script doesn't work anymore.

This solution works also for all the other "vevo" videos.

Hope this helps. bye.
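As a side note on the function above: reversing, dropping the first two elements, and reversing again is the same as dropping the last two characters. A short Python sketch (function name hypothetical) makes the equivalence easy to check:

```python
def decipher_15888(signature):
    chars = list(signature)
    chars.reverse()      # reverse_15888
    chars = chars[2:]    # clone_15888(_, 2): drop the first 2 elements
    chars.reverse()      # reverse_15888
    return "".join(chars)


# Arbitrary sample input; compare against a plain slice.
sample = "abcdefghij"
matches_slice = decipher_15888(sample) == sample[:-2]
```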

@jackun

jackun commented Sep 20, 2013

I'll tack this on here:
This seems to be for signatures of length 93.

public static function decipher(_arg1:String):String{
        var _local2:Array = _arg1.split("");
        _local2 = clone_15966(_local2, 3);
        _local2 = reverse_15966(_local2);
        _local2 = clone_15966(_local2, 1);
        _local2 = reverse_15966(_local2);
        _local2 = clone_15966(_local2, 3);
        _local2 = reverse_15966(_local2);
        _local2 = clone_15966(_local2, 3);
        _local2 = swap_15966(_local2, 59);
        _local2 = clone_15966(_local2, 2);
        return (_local2.join(""));
}

Also, the routine for length 86 might have changed to (JavaScript :P):

function decipher(str){
    var arr = str.split("");
    arr = Reverse(arr);
    arr = Swap(arr, 12);
    arr = Swap(arr, 32);
    arr = Reverse(arr);
    arr = Swap(arr, 34);
    arr = clone(arr, 3);
    arr = Swap(arr, 35);
    arr = Swap(arr, 42);
    arr = clone(arr, 2);
    return (arr.join(""));
}
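Since every variant posted in this thread is built from the same three operations, one option is to keep each variant as data and run it through a small interpreter. The Python sketch below does that for the two sequences in this comment. All names are hypothetical, and it assumes that `Swap` and `clone` here behave like the `swap_…` and `clone_…` helpers shown earlier in the thread, with the swap index taken modulo the length.

```python
def apply_ops(signature, ops):
    """Run a list of (name, argument) steps over the signature."""
    chars = list(signature)
    for name, arg in ops:
        if name == "reverse":
            chars.reverse()
        elif name == "clone":
            chars = chars[arg:]  # drop the first `arg` elements
        elif name == "swap":
            j = arg % len(chars)  # assumption: index wraps around
            chars[0], chars[j] = chars[j], chars[0]
        else:
            raise ValueError("unknown operation: " + name)
    return "".join(chars)


# Step lists transcribed from the two functions in this comment.
OPS_LEN_93 = [
    ("clone", 3), ("reverse", None), ("clone", 1), ("reverse", None),
    ("clone", 3), ("reverse", None), ("clone", 3), ("swap", 59),
    ("clone", 2),
]
OPS_LEN_86 = [
    ("reverse", None), ("swap", 12), ("swap", 32), ("reverse", None),
    ("swap", 34), ("clone", 3), ("swap", 35), ("swap", 42),
    ("clone", 2),
]
```

With this layout, a change on YouTube's side only requires editing a list rather than rewriting a function. Both sequences shorten their input to 81 characters.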

@flagbug
Owner

flagbug commented Dec 27, 2013

Closing as duplicate of #16

@flagbug flagbug closed this as completed Dec 27, 2013