KUL university website Toledo (using Kaltura) #25236
Comments
|
So I figured out how to download single videos. There is an id and some kind of password in the link for the single videos. The command in bash is |
|
Based on the extractor

    import re

    from .common import InfoExtractor
    from ..utils import extract_attributes


    class ToledoIE(InfoExtractor):
        _VALID_URL = r"https?://(?:.+?\.)?kuleuven\.be/(?:[^/]+/)*/(?:[^/]+/)*sp/(?P<id>\d+)/thumbnail.*"

        def _real_extract(self, url):
            page_id = self._match_id(url)
            webpage = self._download_webpage(url, page_id)
            entries = []
            # Collect every element whose class attribute starts with "kalturaPlayer".
            for player_element in re.findall(
                    r'(<[^>]+class="kalturaPlayer[^"]*"[^>]*>)', webpage):
                player_params = extract_attributes(player_element)
                if player_params.get('data-type') not in ('kaltura_singleArticle',):
                    self.report_warning('Unsupported player type')
                    continue
                entry_id = player_params['data-id']
                # Hand each entry over to the existing Kaltura extractor.
                entries.append(self.url_result(
                    'kaltura:2375821:' + entry_id, 'Kaltura', entry_id))
            return self.playlist_result(entries, page_id)

It seems like only the regex of the URL is not right yet. |
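One possible reason the posted `_VALID_URL` does not match: the pattern has a literal `/` directly after a group that already ends in `/`, so it can only match a URL with an empty path segment (`//`). The sketch below checks this with plain `re`. The sample URL and the relaxed pattern are assumptions for illustration only, since the real Toledo links are not shown in this thread.

```python
import re

# Pattern exactly as posted in the thread.
POSTED = r"https?://(?:.+?\.)?kuleuven\.be/(?:[^/]+/)*/(?:[^/]+/)*sp/(?P<id>\d+)/thumbnail.*"

# Hypothetical relaxation: drop the extra "/" and the second segment group.
RELAXED = r"https?://(?:.+?\.)?kuleuven\.be/(?:[^/]+/)*sp/(?P<id>\d+)/thumbnail.*"

# Made-up URL for testing; host, path and ids are placeholders.
SAMPLE_URL = "https://example.kuleuven.be/p/2375821/sp/237582100/thumbnail/entry_id/0_abcd1234"


def match_id(pattern, url):
    """Return the 'id' group if the pattern matches at the start of url, else None."""
    m = re.match(pattern, url)
    return m.group("id") if m else None


print("posted: ", match_id(POSTED, SAMPLE_URL))
print("relaxed:", match_id(RELAXED, SAMPLE_URL))
```

If the real links look different from the placeholder, the relaxed pattern may need further changes; the point is only that the `*/` sequence in the posted pattern demands a double slash.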
|
@wvhulle Any progress on this by any chance? |
|
My regex skills were not good enough to extract the info properly. I would also still have to make authentication with the university website work. In the end I resorted to the plugin 'VideoDownloadHelper'. |
Example URLs
Single video:
Playlist:
Description
The website uses account credentials that are given to every student. I cannot provide them because they are private, but the username is of the form "r(some number)" and the password is chosen by the user. I use a Netscape cookie file to authenticate.
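For anyone who wants to check that such a cookie file is readable before debugging the extractor, here is a minimal sketch using Python's standard `http.cookiejar`. The cookie name, value and file contents are placeholders, not real Toledo cookies.

```python
import http.cookiejar
import os
import tempfile


def load_netscape_cookies(path):
    """Load a Netscape-format cookies.txt file into a cookie jar."""
    jar = http.cookiejar.MozillaCookieJar()
    # Keep session and expired cookies too, so that nothing is silently dropped.
    jar.load(path, ignore_discard=True, ignore_expires=True)
    return jar


# Placeholder file: header line plus one tab-separated cookie record
# (domain, subdomain flag, path, secure, expiry, name, value).
SAMPLE = (
    "# Netscape HTTP Cookie File\n"
    ".kuleuven.be\tTRUE\t/\tTRUE\t4102444800\tsession\tplaceholder-value\n"
)

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write(SAMPLE)
    cookie_path = handle.name

try:
    jar = load_netscape_cookies(cookie_path)
    names = [cookie.name for cookie in jar]
finally:
    os.remove(cookie_path)

print(names)
```

The same kind of file can be passed to youtube-dl with `--cookies cookies.txt`.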
I have tried to write an extractor and wrote a regex to match the website URL,
r"https?://(?:.+?\.)?kuleuven\.be/.*"
but got stuck writing the actual extraction function. While reading the source code of the pages I saw that Kaltura is used, so I tried to make use of the Kaltura functions, but they do not seem to apply to the format of this page.
In one of the other issues it is mentioned that you can reference a Kaltura id directly. Where can I find this id?
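Going by the extractor posted in the comments, the entry id appears to sit in the `data-id` attribute of the player element, and the direct reference has the form `kaltura:<partner id>:<entry id>`. The sketch below pulls those ids out of a page with the standard library only. The sample HTML is made up, and the partner id is simply the one used in that comment.

```python
from html.parser import HTMLParser

# Partner id taken from the extractor posted in this thread.
PARTNER_ID = "2375821"


class KalturaPlayerFinder(HTMLParser):
    """Collect data-id values from elements whose class starts with 'kalturaPlayer'."""

    def __init__(self):
        super().__init__()
        self.entry_ids = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css_class = attrs.get("class") or ""
        if css_class.startswith("kalturaPlayer") and attrs.get("data-id"):
            self.entry_ids.append(attrs["data-id"])


def kaltura_references(html):
    """Return 'kaltura:<partner>:<entry>' strings for every player found in html."""
    finder = KalturaPlayerFinder()
    finder.feed(html)
    return ["kaltura:%s:%s" % (PARTNER_ID, entry_id) for entry_id in finder.entry_ids]


# Hypothetical page fragment; real Toledo markup may differ.
SAMPLE_HTML = (
    '<div class="kalturaPlayer" data-type="kaltura_singleArticle" '
    'data-id="0_abcd1234"></div>'
)

print(kaltura_references(SAMPLE_HTML))
```

A string of that form is what the posted extractor passes to `url_result` so that the existing Kaltura extractor handles the download.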