[Rokfin] Add extractor #1534
let me know when you are done making changes |
You can go ahead. I am done. |
It's been 9 days, so I was wondering where the review process stands. Are there any outstanding issues? I would not be asking, but Rokfin is such an important site. It's a political website, and supporting it supports free speech.

The other reason Rokfin support is so important is the lack of a "Download" feature. The only way to download something from Rokfin is through tools like yt-dlp.

I know that any extractor can be released as a plugin. However, there are two problems with this. First, not everyone has the technical expertise to use plugins. Second, even I (to be fully honest) didn't know about yt-dlp plugins until recently, and I've been using youtube-dl for years. So can you really trust an average user to figure this out, given that plugins are unsupported?

In summary, I would appreciate substantive feedback on the Rokfin extractor. If everything is good, then perhaps it can be merged. Meanwhile, I will do what I can to speed the process up: let me know if there are urgent/important yt-dlp bugs.

I hope this message won't be ignored. Certain videos on Rokfin are so important that they can (literally, not metaphorically) save lives. I can provide specifics privately, if you are interested. Thank you. |
I've been testing the Rokfin extractor with the --wait-for-video option, and while I was waiting for a stream, a network glitch occurred. The 'release_timestamp' immediately became None, as did diff (line 1370 in YoutubeDL.py). So far, so good. But then line 1375 gave a warning that the video should have become available by now, which is false, and line 1376 raised

TypeError: '>' not supported between instances of 'int' and 'NoneType'

There are two ways out of this. The extractor could call _download_json(... fatal = True ...) instead of using fatal = False, as it does right now. That would silence the warning and avoid the TypeError. The problem is that this won't work well if --wait-for-network gets implemented. So, should your code handle both diff and 'live_status' being None? |
It's a simple fix. I'll push later.

diff --git a/yt_dlp/YoutubeDL.py b/yt_dlp/YoutubeDL.py
index 227098656..528d727ee 100644
--- a/yt_dlp/YoutubeDL.py
+++ b/yt_dlp/YoutubeDL.py
@@ -1375,7 +1375,7 @@ class YoutubeDL(object):
     self.report_warning('Release time of video is not known')
 elif (diff or 0) <= 0:
     self.report_warning('Video should already be available according to extracted info')
-diff = min(max(diff, min_wait or 0), max_wait or float('inf'))
+diff = min(max(diff or 0, min_wait or 0), max_wait or float('inf'))
 self.to_screen(f'[wait] Waiting for {format_dur(diff)} - Press Ctrl+C to try now')
 wait_till = time.time() + diff
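To make the effect of the one-line change in the patch above concrete, here is a minimal standalone sketch of the clamping expression. The function names `clamp_wait_unpatched` and `clamp_wait_patched` are hypothetical and exist only for illustration; they are not part of yt-dlp, and the sample values are made up.

```python
def clamp_wait_unpatched(diff, min_wait, max_wait):
    # Mirrors the expression before the patch: max() cannot compare None with an int.
    return min(max(diff, min_wait or 0), max_wait or float('inf'))


def clamp_wait_patched(diff, min_wait, max_wait):
    # Mirrors the patched expression: an unknown diff (None) is treated as 0.
    return min(max(diff or 0, min_wait or 0), max_wait or float('inf'))


try:
    clamp_wait_unpatched(None, 5, 60)
except TypeError as exc:
    print(f'unpatched raised TypeError: {exc}')

print(clamp_wait_patched(None, 5, 60))     # unknown release time: falls back to min_wait
print(clamp_wait_patched(100, 5, 60))      # long wait: capped at max_wait
print(clamp_wait_patched(10, None, None))  # no bounds configured: diff is used as-is
```

Note that the patch only guards the arithmetic; whether a warning should be shown when diff is None is the separate question raised in the following comments.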
Yes, but the lines
will still give a false warning in the situation I described. |
The warning is correct. If the extractor doesn't set |
@P-reducible Can you please rebase and squash these commits so there aren't a lot of unrelated changes in the commit history? |
@orbea Yeah, sure. Right now, I'm still testing, fixing bugs, and adding new things. Should be finished soon. |
9b09659
to
3fa1757
Compare
Suppose there is a pre-filmed non-live video. What's the difference between Also, what should the Is there a place where the difference is explained? |
For some services, there is a clear separation between the "uploaded" time and the "released" time of a video. The two fields exist to account for that. If such a distinction doesn't make sense for a site, just use
No, it has nothing to do with whatever happens on other platforms |
Ok, I got confused because common.py and README.md say different things about But what about live streams? Live streams need |
Originally there was only
For livestreams, I think it makes sense to have only |
Is this done now? If so, rebase on master and ping me |
05305e2
to
f879e84
Compare
@pukkandan As you requested ... |
the same issues exist in the other classes too
yt_dlp/extractor/rokfin.py
Outdated
downloaded_json = try_get(downloaded_json, lambda x: x.get)
created_by = try_get(downloaded_json, lambda x: x('createdBy')).get
upload_date_time = try_get(downloaded_json, lambda x: x('creationDateTime'))
channel_name = try_get(created_by, lambda x: x('name'))
content_subdict = try_get(content_subdict, lambda x: x.get)
remove all this. see below comment
yt_dlp/extractor/rokfin.py
Outdated
is_unlisted=False)

content_subdict = try_get(downloaded_json, lambda x: x['content'])
video_formats_url = try_get(content_subdict, lambda x: url_or_none(x['contentUrl']))
-video_formats_url = try_get(content_subdict, lambda x: url_or_none(x['contentUrl']))
+video_formats_url = url_or_none(content.get('contentUrl'))
The point of try_get is to avoid .get chains, not to over-complicate things unnecessarily.
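As a rough illustration of the reviewer's point, the sketch below contrasts a flat lookup, where `dict.get` is enough, with a nested lookup, where `try_get` saves a chain of `.get` calls. The `try_get` defined here is a simplified stand-in written for this example, not yt-dlp's actual implementation, and the sample data is made up.

```python
def try_get(src, getter):
    # Simplified stand-in: return getter(src), or None if the lookup fails.
    try:
        return getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None


# Made-up response data, shaped loosely like the fields discussed in this review.
data = {
    'content': {'contentUrl': 'https://example.com/video.m3u8'},
    'createdBy': {'id': 42},
}
content = data.get('content') or {}

# Flat lookup: a plain .get is the simplest option.
flat = content.get('contentUrl')

# Nested lookup: try_get avoids writing data.get('createdBy', {}).get('id').
nested = try_get(data, lambda x: x['createdBy']['id'])

# A missing branch yields None instead of raising KeyError.
missing = try_get({}, lambda x: x['createdBy']['id'])

print(flat, nested, missing)
```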
yt_dlp/extractor/rokfin.py
Outdated
'duration': try_get(content_subdict, lambda x: float_or_none(x('duration'))),
'thumbnail': try_get(content_subdict, lambda x: x('thumbnailUrl1')),
'description': try_get(content_subdict, lambda x: x('contentDescription')),
'like_count': try_get(downloaded_json, lambda x: x('likeCount')),
'dislike_count': try_get(downloaded_json, lambda x: x('dislikeCount')),
'comment_count': try_get(downloaded_json, lambda x: x('numComments')),
'availability': availability,
'creator': channel_name,
'channel_id': try_get(created_by, lambda x: x('id')),
'channel': channel_name,
'channel_url': try_get(created_by, lambda x: self._CHANNEL_BASE_URL + x('username')),
'timestamp': unified_timestamp(upload_date_time),
'tags': [str(tag) for tag in try_get(downloaded_json, lambda x: x('tags')) or []],
-'duration': try_get(content_subdict, lambda x: float_or_none(x('duration'))),
-'thumbnail': try_get(content_subdict, lambda x: x('thumbnailUrl1')),
-'description': try_get(content_subdict, lambda x: x('contentDescription')),
-'like_count': try_get(downloaded_json, lambda x: x('likeCount')),
-'dislike_count': try_get(downloaded_json, lambda x: x('dislikeCount')),
-'comment_count': try_get(downloaded_json, lambda x: x('numComments')),
-'availability': availability,
-'creator': channel_name,
-'channel_id': try_get(created_by, lambda x: x('id')),
-'channel': channel_name,
-'channel_url': try_get(created_by, lambda x: self._CHANNEL_BASE_URL + x('username')),
-'timestamp': unified_timestamp(upload_date_time),
-'tags': [str(tag) for tag in try_get(downloaded_json, lambda x: x('tags')) or []],
+'duration': content.get('duration'),
+'thumbnail': url_or_none(content.get('thumbnailUrl1')),
+'description': content.get('contentDescription'),
+'like_count': downloaded_json.get('likeCount'),
+'dislike_count': downloaded_json.get('dislikeCount'),
+'comment_count': downloaded_json.get('numComments'),
+'availability': availability,
+'creator': channel_name,
+'channel_id': try_get(downloaded_json, lambda x: x['createdBy']['id']),
+'channel': channel_name,
+'channel_url': try_get(downloaded_json, lambda x: self._CHANNEL_BASE_URL + x['createdBy']['username']),
+'timestamp': unified_timestamp(upload_date_time),
+'tags': [str(tag) for tag in downloaded_json.get('tags') or []],
Also, try_get(d, lambda x: x[a][b][c]) can be simplified to traverse_obj(d, (a, b, c)) if you want.
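The equivalence can be sketched with a tiny path-walking helper. `walk_path` below is a hypothetical, much-reduced stand-in for the idea behind `traverse_obj`; the real function in `yt_dlp.utils` supports many more features, none of which are modeled here. The sample data and key names are made up.

```python
def walk_path(obj, path):
    # Follow each key or index in `path`; return None as soon as one step fails.
    for key in path:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError):
            return None
    return obj


d = {'createdBy': {'profile': {'name': 'someone'}}}
a, b, c = 'createdBy', 'profile', 'name'

# Same result as the lambda form x[a][b][c], but the path is plain data.
found = walk_path(d, (a, b, c))

# A missing intermediate key yields None rather than an exception.
not_found = walk_path(d, (a, 'absent', c))

print(found, not_found)
```

Expressing the path as a tuple rather than a lambda keeps the lookup declarative, which is the readability gain the reviewer is pointing at.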
Is this supposed to work yet? I applied it to
sure, either way is fine by me |
I clearly said the same thing in our previous discussion (which has not "disappeared", btw). When you asked about adding an extractor arg, I asked for your reasoning on why yt-dlp "should return non-video URLs in playlists", and you failed to give any genuine reason besides a comment arguing the semantics that "it's not really a playlist" (which is relevant how?) |
@pukkandan What I said makes perfect sense. I gave an explanation, and you didn't reply. Only in retrospect did I learn that my explanation was not "genuine". |
@pukkandan Are you planning a new release? If so, when should I give you the code in time for the release? |
There are a couple more issues (unrelated to this) that I want to fix before a release. Will make one as soon as I get the time to finish those up. It is unlikely for new code to make it in. Feel free to take your time |
@pukkandan If that's the case, you may want to merge this regression fix I just committed: 'premiumPlan' applies to posts only. For streams, the field is just called 'premium'. I also added multi_video=True to playlist_result in RokfinStackIE. |
thanks @P-reducible I look forward to this feature. On my flakey internet connection, rokfin videos frequently buffer and break and are impractical to play through. It will be GREAT if yt-dlp will acquire the video for me so that it can be watched later without buffering and network issues. |
@iambumblehead, you are welcome. The speed lag affects me, as well, to some degree. Must be a Rokfin issue. |
Site search and login moved to PR #2992. |
This closes issue 1351.