[vlive:playlist] Add new extractor #13613

coreynicholson · 2017-07-09T12:34:36Z

Please follow the guide below

You will be asked some questions, please read them carefully and answer honestly
Put an x into all the boxes [ ] relevant to your pull request (like that [x])
Use Preview tab to see how your pull request will actually look like

Before submitting a pull request make sure you have:

At least skimmed through adding new extractor tutorial and youtube-dl coding conventions sections
Searched the bugtracker for similar pull requests

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

I am the original author of this code and I am willing to release it under Unlicense
I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

Bug fix
Improvement
New extractor
New feature

Description of your pull request and other information

A couple more examples of vlive playlists:

http://www.vlive.tv/video/24371/playlist/24384
http://www.vlive.tv/video/20928/playlist/20933
http://www.vlive.tv/video/16181/playlist/17079

There are two tests because the webpages for the two videos differ slightly. The webpage for the video in the first test contains playlist information even if you remove the 'playlist' and playlist id segments from the URL. The webpage for the second video does not, so the playlist extractor would not work if it requested the page without those segments.

I've put this extractor before the regular video extractor in the extractors file because the playlist URLs match the regular video pattern. This way they both work.

coreynicholson · 2017-07-09T12:45:12Z

Looks like the builds are failing because multiple extractors are not allowed to match the same URL. Is there any way around this without making the regular VLive video URL pattern more restrictive?

dstftw · 2017-07-09T12:51:52Z

Override suitable.
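
One way to read this suggestion, sketched with simplified stand-ins rather than the real youtube-dl classes: the video extractor overrides its `suitable` classmethod so that it declines any URL the playlist extractor accepts. The `BaseIE` class and both regexes below are assumptions made for illustration; only the idea of overriding `suitable` comes from the review.

```python
import re


class BaseIE(object):
    """Hypothetical stand-in for an extractor base class."""
    _VALID_URL = None

    @classmethod
    def suitable(cls, url):
        # Default behaviour: handle any URL that matches the class pattern.
        return re.match(cls._VALID_URL, url) is not None


class VLivePlaylistIE(BaseIE):
    # Assumed pattern, for illustration only.
    _VALID_URL = (r'https?://(?:www\.)?vlive\.tv/video/'
                  r'(?P<video_id>[0-9]+)/playlist/(?P<id>[0-9]+)')


class VLiveIE(BaseIE):
    # Deliberately permissive: a playlist URL also matches this as a prefix.
    _VALID_URL = r'https?://(?:www\.)?vlive\.tv/video/(?P<id>[0-9]+)'

    @classmethod
    def suitable(cls, url):
        # Decline URLs claimed by the playlist extractor, so that exactly
        # one extractor matches each URL.
        if VLivePlaylistIE.suitable(url):
            return False
        return super(VLiveIE, cls).suitable(url)
```

With this arrangement the video pattern can stay loose, and the order of the two extractors in the extractors file no longer matters for matching.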

coreynicholson · 2017-07-09T17:09:29Z

Made the change. The builds no longer seem to be failing because of this extractor.

dstftw · 2017-07-09T17:33:14Z

youtube_dl/extractor/vlive.py

+    def _real_extract(self, url):
+        playlist_id = self._match_id(url)
+        video_id = self._search_regex(
+            self._VALID_URL, url, 'video id', group='video_id')


Use bare re.match.
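
A minimal sketch of what a bare `re.match` could look like here, assuming the URL pattern defines named groups `video_id` and `id` as the diff suggests. The pattern itself and the helper name `split_playlist_url` are hypothetical.

```python
import re

# Assumed pattern for illustration; the real _VALID_URL may differ.
_VALID_URL = (r'https?://(?:www\.)?vlive\.tv/video/'
              r'(?P<video_id>[0-9]+)/playlist/(?P<id>[0-9]+)')


def split_playlist_url(url):
    """Return (video_id, playlist_id) taken from a playlist URL."""
    mobj = re.match(_VALID_URL, url)
    if mobj is None:
        raise ValueError('not a playlist URL: %s' % url)
    # Both ids come from a single match, so no second regex search is needed.
    return mobj.group('video_id'), mobj.group('id')
```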

dstftw · 2017-07-09T17:36:55Z

youtube_dl/extractor/vlive.py

+            webpage, 'playlist name', fatal=False)
+
+        item_ids = self._search_regex(
+            r'\bvar\s+playlistVideoSeqs\s*=\s*\[([^\]]+)\]',


No need to escape ] in a character set.
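
As a quick check of this point: in Python's `re` syntax, a `]` placed first in a character set (directly after the `^` of a negated set) is taken literally, so `[^]]+` and `[^\]]+` behave the same. The sample string below is a made-up page fragment, not real site output.

```python
import re

ESCAPED = r'\bvar\s+playlistVideoSeqs\s*=\s*\[([^\]]+)\]'
UNESCAPED = r'\bvar\s+playlistVideoSeqs\s*=\s*\[([^]]+)\]'

# Hypothetical page fragment, for illustration only.
SAMPLE = 'var playlistVideoSeqs = [24371,24384,24390];'

# Both patterns capture the text between the square brackets.
escaped_ids = re.search(ESCAPED, SAMPLE).group(1)
unescaped_ids = re.search(UNESCAPED, SAMPLE).group(1)
```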

dstftw · 2017-07-09T17:37:08Z

youtube_dl/extractor/vlive.py

+
+        item_ids = self._search_regex(
+            r'\bvar\s+playlistVideoSeqs\s*=\s*\[([^\]]+)\]',
+            webpage, 'playlist item ids', default='')


default is senseless here.

dstftw · 2017-07-09T17:38:53Z

youtube_dl/extractor/vlive.py

+            webpage, 'playlist item ids', default='')
+
+        entries = []
+        for item_id in re.split(r'\s*,\s*', item_ids):


_parse_json instead.

dstftw · 2017-07-09T17:39:45Z

youtube_dl/extractor/vlive.py

+        webpage = self._download_webpage(
+            'http://www.vlive.tv/video/%s/playlist/%s' % (video_id, playlist_id), video_id)
+
+        playlist_name = self._html_search_regex(


--no-playlist is not respected.
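
A rough sketch of respecting the option, assuming it is exposed to the extractor as a `noplaylist` key in a params dictionary. The function name `choose_target` and the constant names are hypothetical; the URL shapes come from the diff above.

```python
# Assumed constant names; the URL shapes are taken from the diff.
VIDEO_URL_TEMPLATE = 'http://www.vlive.tv/video/%s'
PLAYLIST_URL_TEMPLATE = 'http://www.vlive.tv/video/%s/playlist/%s'


def choose_target(params, video_id, playlist_id):
    """Decide what to extract, honouring a 'noplaylist' option."""
    if params.get('noplaylist'):
        # The user asked for a single video: hand off to the video URL.
        return 'video', VIDEO_URL_TEMPLATE % video_id
    # Otherwise fetch the playlist page and extract every entry.
    return 'playlist', PLAYLIST_URL_TEMPLATE % (video_id, playlist_id)
```

In the extractor itself, the first branch would presumably delegate to the regular video extractor instead of returning a tuple.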

dstftw · 2017-07-09T17:40:54Z

youtube_dl/extractor/extractors.py

@@ -1203,6 +1203,7 @@
    VKWallPostIE,
 )
 from .vlive import (
+    VLivePlaylistIE,


Not alphabetic.

coreynicholson · 2017-07-09T18:24:32Z

Thanks for your feedback. I've pushed another commit in which I've tried to address your comments. Let me know if there's anything else you would like me to improve.

dstftw · 2017-07-09T18:25:31Z

youtube_dl/extractor/vlive.py

+            webpage, 'playlist item ids')
+
+        entries = []
+        for item_id in self._parse_json('[%s]' % item_ids, playlist_id):


You already have [] in regex, capture it and parse there.

Good point, updated.
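
The suggestion can be sketched as follows: move the capture group so it includes the square brackets, then parse the captured text directly as JSON. Here `json.loads` stands in for the extractor's `_parse_json` helper, and the function name and sample string are assumptions for illustration.

```python
import json
import re

# Hypothetical page fragment, for illustration only.
SAMPLE = 'var playlistVideoSeqs = [24371, 24384, 24390];'


def extract_item_ids(webpage):
    """Return the playlist item ids embedded in the page as a list."""
    # The group now spans the brackets, so the capture is a JSON array.
    mobj = re.search(
        r'\bvar\s+playlistVideoSeqs\s*=\s*(\[[^]]+\])', webpage)
    if mobj is None:
        raise ValueError('playlist item ids not found')
    return json.loads(mobj.group(1))
```

This avoids both the manual `re.split` on commas and the `'[%s]' %` wrapping step.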

dstftw · 2017-07-09T18:41:34Z

youtube_dl/extractor/vlive.py

+        assert video_id_match
+        video_id = compat_str(video_id_match.group('video_id'))
+
+        video_url_format = 'http://www.vlive.tv/video/%s'


Uppercase is used for constants. Also this is a template not format.

dstftw · 2017-07-09T18:46:44Z

youtube_dl/extractor/vlive.py

+            return self.url_result(
+                video_url_format % video_id,
+                ie=VLiveIE.ie_key(), video_id=video_id)
+        else:


else is superfluous.

coreynicholson · 2017-07-09T18:53:45Z

Thanks, I've changed the variable name for the constant to uppercase and removed the unnecessary else.

dstftw · 2017-07-09T19:04:33Z

youtube_dl/extractor/vlive.py

+        },
+        'playlist_mincount': 20
+    }, {
+        'url': 'http://www.vlive.tv/video/22867/playlist/22912',


Make this test only_matching and add a comment about why it even exists.

If I make it only_matching, does that mean it will only test that the URL matches the regex? If so, that defeats the point of that test. The URL in the second test points to a video that appears to be linked to its playlist somewhat differently from the one in the first test.

From my observations, if someone were to download the webpage using only the video id from the given URL, it would not link to the playlist, whereas it would with the URL in the first test. Therefore I think both tests should perform the full webpage download.

For example:
http://www.vlive.tv/video/30824 (the HTML for this page does contain the playlist)
http://www.vlive.tv/video/22867 (the HTML for this page does not)

I can add a code comment to clarify this if my assumption about only_matching is correct.

Both test the same extraction scenario. The rule is one test per extraction scenario.

Then I think it would be better to keep the second test and remove the first one entirely. I'll do that.

coreynicholson · 2017-07-09T19:34:11Z

I've pushed the change to remove a test so there's only one.

[vlive:playlist] Add new extractor

9a0458a

dstftw added the pending-fixes label Jul 9, 2017

[vlive:playlist] Prevent VLiveIE matching playlist URLs

7acd0b1

coreynicholson force-pushed the vlive_playlist branch from 3c8fdeb to 7acd0b1 Compare July 9, 2017 14:56

dstftw requested changes Jul 9, 2017

[vlive:playlist] Address PR comments

4dcd120

dstftw requested changes Jul 9, 2017

[vlive:playlist] Simplify playlist item ids extraction

500b0a9

dstftw requested changes Jul 9, 2017

[vlive:playlist] Uppercase for constants and remove unnecessary 'else'

673bfc2

dstftw requested changes Jul 9, 2017

[vlive:playlist] Remove a test

ea4317a

dstftw merged commit b71c18b into ytdl-org:master Jul 9, 2017

dstftw added a commit that referenced this pull request Sep 23, 2017

Credit @coreynicholson for vlive:playlist (#13613)

07d1344
