Embed playback issue: When embed playback is blocked, pytube would detect it as age restriction #1621
Here is some minimal code to reproduce the issue before the patch:
from pytube import YouTube as PyYouTube

# Video that is not allowed as embedded
yt = PyYouTube("https://www.youtube.com/watch?v=e8RcQoGY4OE")
yt.check_availability()
stream = yt.streams.first()
filename = stream.download("./")

# normal video
yt = PyYouTube("https://www.youtube.com/watch?v=wMRbSKjEtrc")
yt.check_availability()
stream = yt.streams.first()
filename = stream.download("./")

# age restricted video
yt = PyYouTube("https://www.youtube.com/watch?v=mQvteoFiMlg")
yt.check_availability()
stream = yt.streams.first()
filename = stream.download("./")
I tried this commit. It worked for some videos but not for all: I still get an age-restriction error for videos that are not restricted. Check this URL: "https://www.youtube.com/shorts/DYVUwiB6fLM". Both URLs are neither age restricted nor embedding blocked, but they still raise AgeRestrictedError.
Hey, there is also PR #1619, which seems to change the way data is obtained at the start. I haven't tried it, but you can give it a shot. My solution was for this particular case of embed blocking, but it seems there are more cases in which this happens, so it should be generalized (I don't particularly like the way I structured this solution).
I'll give #1619 a try. BTW, your commit did work for embedding-blocked videos. I tried it on some more videos that have embedding blocked, and this commit gets past the embedding block.
@pbxforce I have updated the code and now both your links work. It is also a lot faster than the other PR mentioned previously. What I added now is a mapping from the unplayability reason to the player to use to bypass that reason. This way, if you encounter new problems you can update this map (check the reasons.py map). You can extract the reason by adding a print statement for vid_info in the newly updated code.
The mapping for now looks like this:
PLAYER_FOR_REASON = {
    'Playback on other websites has been disabled by the video owner': 'ANDROID',
    'This video is not available': 'ANDROID',
    'This video is unavailable': 'ANDROID',
}
Which makes it seem redundant because all the values are ANDROID. It could come in handy if there are other reasons that are not bypassed by ANDROID, but I don't have enough experience with YouTube to tell. This could also eventually be updated to hold a list of players for each key, so that the code can try all of them until one succeeds.
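The "list of players for each key" idea could look roughly like the sketch below. This is not pytube code: `FALLBACK_CLIENTS`, `first_playable` and the `fetch_player` callable are hypothetical names, and `fetch_player` stands in for whatever performs the InnerTube player request so that the example runs without network access. The `'IOS'` entry is there purely to show a second fallback, not because it is known to help for that reason.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical mapping: unplayability reason -> clients to try, in order.
# 'IOS' is included purely for illustration of a second fallback.
FALLBACK_CLIENTS: Dict[str, List[str]] = {
    'Playback on other websites has been disabled by the video owner': ['ANDROID'],
    'This video is not available': ['ANDROID'],
    'This video is unavailable': ['ANDROID', 'IOS'],
}


def first_playable(reason: str,
                   fetch_player: Callable[[str], dict]) -> Optional[dict]:
    """Return the first player response whose status is 'OK', or None.

    fetch_player(client) is an assumed stand-in for the real player request.
    """
    for client in FALLBACK_CLIENTS.get(reason, []):
        response = fetch_player(client)
        status = response.get('playabilityStatus', {}).get('status')
        if status == 'OK':
            return response
    return None


def fake_fetch(client: str) -> dict:
    """Canned responses for demonstration only: only IOS is playable."""
    status = 'OK' if client == 'IOS' else 'UNPLAYABLE'
    return {'client': client, 'playabilityStatus': {'status': status}}
```

With the canned `fake_fetch`, a reason that lists both clients falls through ANDROID and returns the IOS response, while a reason that lists only ANDROID (or an unknown reason) yields `None`, leaving the caller to raise.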
This is good. I partly fixed it by changing client='ANDROID_EMBED' to client='ANDROID', but that did not fix it completely. Your way is much better, and getting the playabilityStatus reason is also good to know for future errors. Thank you
This PR fixes the issue for me related to #1620, where AgeRestrictedError is raised for videos that are not age restricted, including videos with embedding disabled.
@nficano / @glubsy / @tfdahlin / @RONNCC could I get a review/feedback on this? The age restriction reason should probably be moved to the map too, no? If ANDROID fixes all reasons then we probably don't even need a map. Also, it would be nice to include the reason for unplayability in the exception, wouldn't you agree?
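As a rough illustration of carrying the unplayability reason in the exception, here is a minimal sketch. `UnplayableError` and `check_playability` are hypothetical names, not existing pytube classes or functions; the sketch only assumes a player response shaped like the `playabilityStatus` dict discussed in this thread.

```python
class UnplayableError(Exception):
    """Hypothetical exception that keeps YouTube's stated reason."""

    def __init__(self, video_id: str, reason: str = ''):
        self.video_id = video_id
        self.reason = reason
        message = f'{video_id} is unplayable'
        if reason:
            message += f': {reason}'
        super().__init__(message)


def check_playability(video_id: str, player_response: dict) -> None:
    """Raise UnplayableError if the response is marked UNPLAYABLE."""
    playability = player_response.get('playabilityStatus', {})
    if playability.get('status') == 'UNPLAYABLE':
        raise UnplayableError(video_id, playability.get('reason', ''))
```

Calling code can then inspect `exc.reason` and decide for itself whether to retry with another client or report the problem to the user.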
I tested the code and it manages to get the video when embedding is blocked. But for some reason the ANDROID client limits the validity of the URL to about 30 seconds; after that, a 403 error is returned. We could use the IOS client, which has no throttling, but it returns few stream options. I think the best thing would be to make the ANDROID client work normally so it can be used as the default in innertube.py.
pytube/__main__.py
Outdated
def update_vid_info_with(self, client='ANDROID'):
    """Attempt to update the vid_info by using client passed as arg."""
    innertube = InnerTube(
        client=client,
        use_oauth=self.use_oauth,
        allow_cache=self.allow_oauth_cache
    )

    innertube_response = innertube.player(self.video_id)
    playability_status = innertube_response['playabilityStatus'].get('status', None)
    # If we still can't access the video, raise an exception
    if playability_status == 'UNPLAYABLE':
        raise exceptions.VideoUnavailable(self.video_id)
    self._vid_info = innertube_response
At first glance I would say this entire method should be part of YouTube.vid_info, no need for a separate detour like that? vid_info could take a default client parameter?
pytube/reasons.py
Outdated
PLAYER_FOR_REASON = {
    'Playback on other websites has been disabled by the video owner': 'ANDROID',
    'This video is not available': 'ANDROID',
    'This video is unavailable': 'ANDROID',
}
This module's name and this variable name are not very clear. I don't think it needs its own separate module.
This could be better documented with nested key/value pairs, i.e.
FALLBACK_CLIENTS = [
    {
        'error_message': 'Playback on other websites has been disabled by the video owner',
        'client': 'ANDROID'
    },
    ...
]
Then again I don't think this should be handled here.
Not sure what to think of this workaround. It might work, but forcing whatever client just because the first one did not work sounds like a poor approach.
If anything the "client" application should handle that, so perhaps the only thing worth doing here would be to throw the appropriate exception instead of trying to work around whatever data you get in the "library" part of the code.
aligned w/ glubsy ... slightly scared that this would break other things 🤔
would you have any insight into a generic solution across clients perhaps @piltom ? |
Alright, this approach was just a generalization of the original Would you agree then to remove the
I will try to reproduce that and see if there's something I can do |
As a matter of fact, it seems that |
That sounds much more reasonable to me, although the exception should not really care about workarounds, only the calling code should then suggest what to do in that case.
As far as I know, this has been going on since early April 2023. This PR does not aim to solve this issue.
I'll push a cleaner version later today. In the meantime, here are some playability status details for each client; note how many of them differ in both status and reason depending on the client. Also, it seems that the original age bypass does not work anymore. Format is client: (status, reason)
|
Force-pushed from f2c5179 to 7853e28.
@glubsy @RONNCC what do you think of the current solution? I force pushed a squashed commit :) The pytube constructor now takes an optional innertube client argument (vid_info is a property getter, so it cannot take arguments). The vid_info getter then checks for availability on the innertube response. There's another availability check in the code, but it seems to be used only on the watch_html... should this maybe be updated or generalized? Doing the availability check in the vid_info getter might be too soon? Where would you place it instead?
I can also add a new troubleshooting feature that analyzes the problems on each client to tell the user what the problem is. If you check my previous comment in the PR, in some cases the YouTube reasons are vague and overlap with other problems. This would be something like:
def troubleshoot_url(self):
    # go through all clients to find a good reason why the link does not work
    # maybe raise the correct exception?
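One possible shape for such a troubleshooting helper is sketched below. It is only an illustration under stated assumptions: `troubleshoot`, `playable_clients` and the injected `fetch_player` callable are hypothetical names, with `fetch_player` standing in for the real per-client player request. The report uses the same client: (status, reason) format mentioned earlier in this thread.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Report = Dict[str, Tuple[str, str]]


def troubleshoot(video_id: str,
                 clients: Iterable[str],
                 fetch_player: Callable[[str, str], dict]) -> Report:
    """Query every client and collect client -> (status, reason).

    fetch_player(client, video_id) is an assumed stand-in for the real
    player request; it must return a dict with 'playabilityStatus'.
    """
    report: Report = {}
    for client in clients:
        playability = fetch_player(client, video_id).get('playabilityStatus', {})
        report[client] = (playability.get('status', ''),
                          playability.get('reason', ''))
    return report


def playable_clients(report: Report) -> List[str]:
    """Return the clients whose status was 'OK', in report order."""
    return [client for client, (status, _) in report.items() if status == 'OK']
```

A caller could print the report for debugging, or raise a more specific exception when `playable_clients` comes back empty.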
Hello. This morning while running some test downloads I ran into an issue with the following video: https://www.youtube.com/watch?v=e8RcQoGY4OE
Pytube would detect it as Age restricted even though it is not. After some digging I found it was due to the video having embed playback blocked.
The current solution in this PR checks whether the reason for unplayability is present in the newly introduced map and, if so, looks up which player to use to work around it.
This solution could be problematic in the future for two reasons:
Here are some links for you to test: