ytdl.getInfo - Error: Status code: 404 #923

Closed
Monabr opened this issue May 18, 2021 · 19 comments

@Monabr

Monabr commented May 18, 2021

When I try to get info for some URLs, I get the error below. Example:

const customVideoInfo = await ytdl.getInfo("http://www.youtube.com/watch?v=3Z61pOtkY7w")

Error: Status code: 404
    at ClientRequest.<anonymous> (/workspace/node_modules/miniget/dist/index.js:211:27)
    at Object.onceWrapper (events.js:421:26)
    at ClientRequest.emit (events.js:314:20)
    at ClientRequest.EventEmitter.emit (domain.js:506:15)
    at HTTPParser.parserOnIncomingClient (_http_client.js:601:27)
    at HTTPParser.parserOnHeadersComplete (_http_common.js:122:17)
    at HTTPParser.execute (<anonymous>)
    at TLSSocket.socketOnData (_http_client.js:474:22)
    at TLSSocket.emit (events.js:314:20)
    at TLSSocket.EventEmitter.emit (domain.js:506:15) 
@WaqasIbrahim
Contributor

WaqasIbrahim commented May 18, 2021

I have come across this issue (mainly on age-restricted videos). It is a problem with the fallback info endpoint "https://www.youtube.com/get_video_info".

https://www.youtube.com/get_video_info?video_id=3Z61pOtkY7w
This results in 404.

Adding a query parameter "html5=1" solves this issue.
https://www.youtube.com/get_video_info?video_id=3Z61pOtkY7w&html5=1

Needs more testing though.

url.searchParams.set('video_id', id);
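
A minimal sketch of the query-parameter change described above, using Node's built-in URL class. The helper name buildVideoInfoUrl is hypothetical and is not taken from ytdl-core; only the endpoint and the two parameters come from this thread.

```javascript
// Hypothetical helper (not ytdl-core's own code): builds the fallback info
// URL and adds the html5=1 parameter that avoids the 404 in this thread.
function buildVideoInfoUrl(id) {
  const url = new URL('https://www.youtube.com/get_video_info');
  url.searchParams.set('video_id', id);
  url.searchParams.set('html5', '1');
  return url.toString();
}

console.log(buildVideoInfoUrl('3Z61pOtkY7w'));
```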

@Okervill

Okervill commented May 20, 2021

I'm getting the same error with a large number of music videos. These videos don't seem to be age restricted, since they can be viewed without logging in. The most recent examples I've found:
https://www.youtube.com/watch?v=C3a1sz2Bp_E
https://www.youtube.com/watch?v=SC4xMk98Pdc

Edit: When I run ytdl locally on my desktop, it runs fine (I'm using it as part of a Discord bot). When I run it from the server where I host the Discord bot (in the same country, although in a different city), it gives me that 404 error.

@gatecrasher777
Contributor

gatecrasher777 commented May 20, 2021

I rely on get_video_info and was blissfully unaware of the html5=1 fix, so I had to do a bit of research today.

Seems YT has moved that get_info functionality into their INNERTUBE API which returns just the JSON data (responseContext) without the UrlEncoded wrapper.

Getting it requires a POST request to https://www.youtube.com/youtubei/v1/player?key=YOUR_KEY

with videoId as a parameter along with some other innertube parameters (and the key) which are easy enough to pick out with any web request to youtube. It also requires a 'x-goog-visitor-id' in the header, which again you can pick out from the parameters.

This now conforms closely to their more efficient INNERTUBE API channel browse and search post requests.

I have working code that can do this now, if anyone is interested and can't figure it out.
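
For readers who want a starting point, here is a rough sketch of how such a request might be assembled. It is not the working code mentioned above. The helper name buildPlayerRequest, the JSON content type, and the shape of the body ({ context, videoId }) are assumptions. The apiKey, context and visitorId values are assumed to have been picked out of an ordinary YouTube page beforehand.

```javascript
// Hypothetical helper: assembles (but does not send) a POST to the innertube
// player endpoint. The body layout and JSON content type are assumptions.
function buildPlayerRequest(videoId, { apiKey, context, visitorId }) {
  const url = new URL('https://www.youtube.com/youtubei/v1/player');
  url.searchParams.set('key', apiKey);
  const body = JSON.stringify({ context, videoId });
  return {
    url: url.toString(),
    method: 'POST',
    headers: {
      'content-type': 'application/json',
      'content-length': Buffer.byteLength(body, 'utf8'),
      'x-goog-visitor-id': visitorId,
    },
    body,
  };
}
```

The returned object can be handed to whichever HTTP client is in use; keeping the builder separate from the network call makes it easy to inspect what would be sent.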

@rjdg14

rjdg14 commented May 21, 2021

I've been having this issue recently in FreeTube, which uses this code to retrieve its content, on some age-restricted videos and a few music videos. I'm hoping the problem will be fixed in the next few days.

@gatecrasher777
Contributor

Currently ytdl is dependent on the watch page (getWatchHTMLPage), which works in most cases.

Fallback to getWatchJSONPage fails because it now requires a POST request, rather than a GET request.

Fallback to getVideoInfoPage fails because it now requires the html5=1 fix.

@vaaski
Contributor

vaaski commented May 23, 2021

not sure how to use miniget to make a POST request, but #924 adds the mentioned html5 param to getVideoInfoPage and also changes the ID of the age restricted video test, which passes now.

@WaqasIbrahim
Contributor

Fallback to getWatchJSONPage fails because it now requires a POST request, rather than a GET request.

Can you give more details about this?

@gatecrasher777
Contributor

I wrote a bandwidth analyzer that implements the new requestCallback, and I noticed that getWatchJSONPage always fails. Notably, it is invoked in the test case (video ID B3eAMGXFw1o). Each of the 3 attempts returns only about 25 bytes, and then the pipeline moves on to getVideoInfoPage.

Checking out this video in the browser, I noticed that there is an age verification command in the INNERTUBE_API, following which, the getWatchJSONPage is invoked with a POST request. In the above case:

https://www.youtube.com/watch?v=B3eAMGXFw1o&has_verified=1&pbj=1

requires postData as follows:

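const querystring = require('querystring');

// id is the video ID; INNERTUBE_CONTEXT and XSRF_TOKEN are scraped from
// ytcfg.set() in the initial watch page.
const params = {
  command: {
    clickTrackingParams: INNERTUBE_CONTEXT.clickTracking.clickTrackingParams,
    commandMetadata: {
      webCommandMetadata: {
        url: "/watch?v=" + id + "&has_verified=1",
        webPageType: "WEB_PAGE_TYPE_UNKNOWN",
        rootVe: 83769
      }
    },
    urlEndpoint: {
      url: "/watch?v=" + id + "&has_verified=1"
    }
  },
  session_token: XSRF_TOKEN
};
// Note: Node's querystring.stringify coerces nested objects to empty strings,
// so `command` may need JSON.stringify() first if its contents must be sent.
const postData = querystring.stringify(params);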

and

'content-type':'application/x-www-form-urlencoded',
'content-length': Buffer.byteLength(postData,'utf8'),

in the headers. I have got this to work, such that the media formats are returned in the response.
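
The form-encoded POST described above could be sent with Node's built-in https module along these lines. This is a sketch only: buildFormPost and postForm are hypothetical names, and it deliberately avoids miniget, since nothing in this thread shows how miniget handles POST bodies.

```javascript
const https = require('https');

// Hypothetical helper: builds https.request options for a form-encoded POST,
// setting the two headers quoted above.
function buildFormPost(targetUrl, postData, extraHeaders = {}) {
  const { hostname, pathname, search } = new URL(targetUrl);
  return {
    hostname,
    path: pathname + search,
    method: 'POST',
    headers: {
      ...extraHeaders,
      'content-type': 'application/x-www-form-urlencoded',
      'content-length': Buffer.byteLength(postData, 'utf8'),
    },
  };
}

// Sends the POST and resolves with the response body as a string.
function postForm(targetUrl, postData, extraHeaders) {
  return new Promise((resolve, reject) => {
    const options = buildFormPost(targetUrl, postData, extraHeaders);
    const req = https.request(options, res => {
      const chunks = [];
      res.on('data', chunk => chunks.push(chunk));
      res.on('end', () => resolve(Buffer.concat(chunks).toString('utf8')));
    });
    req.on('error', reject);
    req.end(postData);
  });
}
```

Any cookie or user-agent headers the request needs would go in extraHeaders.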

I'm not sure what rootVe is, or if it ever changes or needs to change. INNERTUBE_CONTEXT.clickTracking.clickTrackingParams and XSRF_TOKEN can be scraped from the ytcfg.set() in the initial getWatchHTMLPage.
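
As an illustration of the scraping step, the sketch below pulls the object arguments out of ytcfg.set({...}) calls in a page's HTML. The helper name extractYtcfg is hypothetical, and it assumes each object argument is valid JSON, which may not hold for every page variant.

```javascript
// Hypothetical helper: finds every ytcfg.set({...}) call in the HTML and
// merges the object arguments. Assumes each argument is valid JSON.
function extractYtcfg(html) {
  const cfg = {};
  const marker = 'ytcfg.set(';
  let from = 0;
  for (;;) {
    const start = html.indexOf(marker, from);
    if (start === -1) break;
    const open = start + marker.length;
    from = open;
    if (html[open] !== '{') continue; // skip calls without an object argument
    // Walk to the matching closing brace, ignoring braces inside strings.
    let depth = 0;
    let inString = false;
    let end = -1;
    for (let i = open; i < html.length; i++) {
      const ch = html[i];
      if (inString) {
        if (ch === '\\') i++; // skip the escaped character
        else if (ch === '"') inString = false;
      } else if (ch === '"') {
        inString = true;
      } else if (ch === '{') {
        depth++;
      } else if (ch === '}' && --depth === 0) {
        end = i;
        break;
      }
    }
    if (end === -1) break; // unbalanced braces, give up
    try {
      Object.assign(cfg, JSON.parse(html.slice(open, end + 1)));
    } catch (err) {
      // Argument was not plain JSON; ignore this call.
    }
    from = end + 1;
  }
  return cfg;
}
```

With the result in hand, cfg.XSRF_TOKEN and cfg.INNERTUBE_CONTEXT.clickTracking.clickTrackingParams would be read off directly, if the page provides them.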

I can perhaps add the bandwidth analyzer to examples if there is interest. It runs the four test cases when run without an argument, or it will analyze the getInfo pipeline for any specific video. It shows each request url with uncompressed/compressed bandwidth. And whether the getInfo process was successful.

@gatecrasher777
Contributor

gatecrasher777 commented May 23, 2021

github.com/gatecrasher777/node-ytdl-core

I put getInfoBandwidth.js in example - you may need to change the require depending on where you run it. You should either add your youtube cookie in the headers on line 63, or run it without a cookie.

getWatchJSONPage and getVideoInfoPage are both working in lib/info.js.

Also added a fourth info source into the pipeline from the innertube api, called getVideoInfoInner. Probably superfluous, but it works well and uses very little bandwidth, once you have the ytcfg params/tokens.

@rjdg14

rjdg14 commented May 28, 2021

Does anyone know when the module will be officially updated to fix this issue? It looks like there are a number of working fixes (from the sound of it) but to the best of my knowledge none of them have been properly implemented into an official release yet. Currently the ability to load most age restricted videos, as well as a handful of unrestricted music videos and other videos, on software using the ytdl-core module such as FreeTube is still broken, as it has been for nearly 2 weeks.

@vaaski
Contributor

vaaski commented May 28, 2021

@rjdg14 #924 was merged and released in 4.8.1 and it fixes age-restricted videos.

@gatecrasher777
Contributor

gatecrasher777 commented May 29, 2021

The getWatchJSONPage still doesn't work. Ordinarily this would resolve the age restricted video - before getInfo() falls back onto getVideoInfoPage. Perhaps getVideoInfoPage with html5=1 doesn't resolve the age restriction problem in every case.

Also, with getWatchJSONPage being broken, there are several fruitless attempts to fetch it, which significantly slow down the pipeline. To avoid this delay, one could temporarily change the pipeline order on line 61 to

let info = await pipeline([id, options], validate, retryOptions, [
    getWatchHTMLPage,
    getVideoInfoPage, 
    getWatchJSONPage
]);
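
To show why the order matters, here is a simplified stand-in for a retrying fallback pipeline. It is not ytdl-core's actual pipeline function: runPipeline, its signature, and the fixed attempt count are assumptions made for illustration.

```javascript
// Simplified stand-in for the getInfo pipeline (not ytdl-core's real code).
// Each source is tried up to `attempts` times; the first result that passes
// validate() is returned together with a log of every call made.
async function runPipeline(sources, validate, attempts = 3) {
  const log = [];
  for (const source of sources) {
    for (let i = 0; i < attempts; i++) {
      log.push(source.name);
      try {
        const info = await source();
        if (validate(info)) return { info, log };
      } catch (err) {
        // Treat a thrown error like an invalid result and retry.
      }
    }
  }
  throw new Error('All sources failed');
}
```

In this model, if the first source cannot resolve a video and the second is broken, the broken source costs three wasted requests before the third is reached. Moving the broken source to the end removes those requests whenever the working fallback succeeds.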

I have written a fix (mentioned above) but it is quite a major change to the current code, so I'm not willing to make a PR of it before a code maintainer advises me to, or has made suggestions/changes.

@xpiREC

xpiREC commented Jun 16, 2021

It seems that Google has updated the verification procedure; the scent cookie file for logged-in users is now also taken into account, which prevents playback and leads to a 404 error.
Using cookies gives a longer time before proxy addresses are blocked.

@rjdg14

rjdg14 commented Jun 18, 2021

The issue with playback of age-restricted videos through the ytdl core seems to have started occurring again, only a couple of weeks after the fix. Would it be possible to fix the issue again?

I honestly don't quite understand why Google has been going to such lengths to make age-restricted videos near-impossible to watch unless you are logged into YouTube's website on an account that has been verified as over 18. Until late last year, their age restriction system, which was first introduced in (I think) 2008, was relatively lax.

@gatecrasher777
Contributor

gatecrasher777 commented Jun 18, 2021

The getVideoInfoPage (with html5=1) would usually return streams even to unverified users. Maybe this has changed again.

If you do supply the cookie of a logged-in user to getInfo, then getWatchHTMLPage and getWatchJSONPage should get you past the barrier and return the streams without the need to call getVideoInfoPage.

Except that currently getWatchJSONPage doesn't work. It now requires a POST request rather than a GET request.

Perhaps getting this working again will help. Also, adding &has_verified=1 to the POST request might even work for unverified users.

@rjdg14

rjdg14 commented Jun 18, 2021

I have an age verified Google account, but still dislike receiving the viewer discretion warning screens on such videos when viewed on YouTube's website, so prefer to use FreeTube which doesn't display them and which uses your core to fetch video information. I've been able to get round most in-browser nuisances using custom filters on my adblocker, but due to the way that this screen on YouTube is encoded, it doesn't seem to be possible to effectively remove the viewer discretion warning when logged in using an adblocker - you'll simply end up with a white screen instead. YouTube provides no official option to turn it off for logged in users either. I have been able to remove other nuisances, such as YouTube's "Includes Paid Promotion" overlay with my adblocker (it's called something along the lines of PaidContentOverlay).

In the meantime, I've installed this browser extension, which automatically skips such warnings on any logged-in account that does not require further age verification. It's a little slow and not perfect, since it still hangs on the warning for a second or two, but it is still better than before:

https://chrome.google.com/webstore/detail/youtube-auto-proceed-to-v/lmjcoecpdenpmdoieiiendpoohgmabmd?hl=en

When it detects that a video is age restricted, it adds "&has_verified=1&bpctr=9999999999" to the URL, which causes the affected video to load without issue provided the user is logged in and over 18. Does the ytdl core contain a reference to this string?
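
For reference, the URL rewrite described above amounts to something like the following sketch. The helper name addAgeGateParams is hypothetical; only the two parameters and their values come from the description of the extension, and whether they help ytdl-core is untested here.

```javascript
// Hypothetical helper: appends the two parameters the extension is described
// as adding to an age-restricted watch URL.
function addAgeGateParams(watchUrl) {
  const url = new URL(watchUrl);
  url.searchParams.set('has_verified', '1');
  url.searchParams.set('bpctr', '9999999999');
  return url.toString();
}
```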

@WaqasIbrahim
Contributor

Related issues:

ytdl-org/youtube-dl#29333
ytdl-org/youtube-dl#29086

Another solution is proposed here, but I'm not sure how long it will last, as YouTube is moving to the new innertube API.
ytdl-org/youtube-dl#29333 (comment)

LoneExile added a commit to LoneExile/Bot_Discord_JS that referenced this issue Jun 22, 2021
@TimeForANinja
Collaborator

🤔 might have been some early signs of #939

@TimeForANinja
Collaborator

gonna close this as fixed with #924 and #939
feel free to comment again if i'm wrong
actually, might as well just open a new issue in that case😅


8 participants