[Vidbit] Add new extractor (Closes #9688) #9759

TRox1972 · 2016-06-12T01:19:39Z

No description provided.

dstftw · 2016-06-12T02:59:16Z

youtube_dl/extractor/vidbit.py

+        return {
+            'id': video_id,
+            'title': self._html_search_regex(r'<h1>(.+)</h1>', webpage, 'title'),
+            'url': self._BASE_URL % self._html_search_regex(r'file:\s*["\'](.+)["\']', webpage, 'video URL'),


url should be used as base URL.

["\'](.+)["\'] is greedy: it will capture everything, including " and ', up to the last " or ' on the line.

It seems the site uses both single and double quotes. Changing it to e.g. ([^"\']) wouldn't extract the correct title if it contains one type of quote. Do you know of a good regex for this?

Capture the opening quote and match it with the closing one.
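A minimal sketch of that suggestion, comparing the greedy pattern from the diff with a backreference-based one. The sample strings are hypothetical stand-ins for the player setup on the page, and the helper name `extract` is made up for illustration; plain `re` is used here rather than the extractor's own helpers.

```python
import re

# Hypothetical page fragments; the real site markup may differ.
SAMPLES = [
    'file: "/video/it\'s-here.mp4", image: "/thumb.jpg"',
    "file: '/video/clip.mp4', image: '/thumb.jpg'",
]

# Greedy pattern: (.+) runs to the LAST quote of either kind on the line.
GREEDY = r'file:\s*["\'](.+)["\']'

# Group 1 captures the opening quote; (?!\1). consumes any character
# that is not that same quote; the final \1 requires the matching close.
BACKREF = r'file:\s*(["\'])((?:(?!\1).)+)\1'


def extract(pattern, text, group):
    """Return the requested group of the first match, or None."""
    m = re.search(pattern, text)
    return m.group(group) if m else None


for sample in SAMPLES:
    print('greedy :', extract(GREEDY, sample, 1))
    print('backref:', extract(BACKREF, sample, 2))
```

With the backreference, a value that contains the other kind of quote (such as the apostrophe in the first sample) is still captured whole, and the match stops at the first closing quote of the same kind.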

url should be used as base URL.

@dstftw I'm not sure what you mean. I've seen the same setup in other extractors.

I mean exactly that: the base part of url should be used as the base URL instead of this hardcoded value.

yan12125 · 2016-06-12T09:14:29Z

This is a custom JWPlayer. Should use JWPlatformBaseIE._parse_jwplayer_data() instead.

dstftw · 2016-06-12T09:18:20Z

Current _parse_jwplayer_data won't work with this input since the path is relative.

yan12125 · 2016-06-12T09:23:34Z

_parse_jwplayer_data can be extended to support relative URLs.

TRox1972 · 2016-06-12T15:30:43Z

So should _parse_jwplayer_data be modified to handle relative URLs, or should I use the current approach?

yan12125 · 2016-06-14T15:24:40Z

It's OK to just use the current approach. Rewriting jwplayer-related code is not a top priority.

TRox1972 · 2016-06-18T16:23:56Z

I've added a quick fix for the base URL, but it's not very pretty, so any suggestions for a better solution are appreciated :)

dstftw · 2016-06-18T16:27:35Z

Use urljoin?
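A short sketch of what urljoin does with a relative path, using the standard library's `urllib.parse.urljoin` in Python 3. The URLs below are hypothetical examples, not taken from the site, and the function name `resolve` is made up for illustration.

```python
from urllib.parse import urljoin

# Hypothetical page URL; stands in for the `url` passed to the extractor.
PAGE_URL = 'http://example.com/watch?v=abc123'


def resolve(page_url, path):
    """Resolve a possibly relative media path against the page URL."""
    return urljoin(page_url, path)


# A root-relative path replaces the page's path and query.
print(resolve(PAGE_URL, '/data/videos/abc123.mp4'))
# A path without a leading slash is resolved against the page's directory.
print(resolve(PAGE_URL, 'data/videos/abc123.mp4'))
# An absolute URL is returned unchanged.
print(resolve(PAGE_URL, 'http://cdn.example.org/abc123.mp4'))
```

Because the scheme and host come from the page URL itself, no hardcoded base URL is needed.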

TRox1972 · 2016-06-21T17:52:41Z

@dstftw Do the changes seem OK?

dstftw · 2016-06-22T17:10:09Z

youtube_dl/extractor/vidbit.py

+            'id': video_id,
+            'title': self._html_search_regex(r'<h1>(.+)</h1>', webpage, 'title'),
+            'url': compat_urlparse.urljoin(url, self._html_search_regex(r'file:\s*(["\'])((?:(?!\1).)+)\1', webpage, 'video URL', group=2)),
+            'thumbnail': compat_urlparse.urljoin(url, self._html_search_regex(r'image:\s*(["\'])((?:(?!\1).)+)\1', webpage, 'thumbnail', None, group=2)),


This will fail if no thumbnail is extracted. You claim the thumbnail is optional. Is there an example URL of such a video?
Also, og:image seems like an easier way to extract the thumbnail.
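One way to treat the thumbnail as genuinely optional can be sketched as follows, assuming the page exposes it through an og:image meta tag. The helper names `find_og_image` and `thumbnail_url`, the HTML snippets, and the URLs are all hypothetical; this uses plain `re` and `urllib.parse` rather than the extractor's own helpers.

```python
import re
from urllib.parse import urljoin


def find_og_image(webpage):
    """Return the og:image content attribute, or None when it is absent.

    Simplified on purpose: only handles `property` appearing before
    `content` inside the meta tag.
    """
    m = re.search(
        r'<meta[^>]+property=(["\'])og:image\1'
        r'[^>]+content=(["\'])(?P<url>(?:(?!\2).)+)\2',
        webpage)
    return m.group('url') if m else None


def thumbnail_url(page_url, webpage):
    """Join the thumbnail against the page URL only when one was found."""
    thumb = find_og_image(webpage)
    return urljoin(page_url, thumb) if thumb else None


WITH_THUMB = '<meta property="og:image" content="/thumbs/abc123.jpg">'
WITHOUT_THUMB = '<title>No thumbnail here</title>'

print(thumbnail_url('http://example.com/watch?v=abc123', WITH_THUMB))
print(thumbnail_url('http://example.com/watch?v=abc123', WITHOUT_THUMB))
```

Checking for a missing value before joining keeps the field empty instead of passing None on to the URL join.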

dstftw · 2016-06-22T17:13:02Z

Also wrap long lines and squash the commits.

TRox1972 · 2016-06-25T22:13:48Z

@dstftw Does this seem good?

dstftw reviewed Jun 12, 2016

dstftw added the pending-fixes label Jun 17, 2016

dstftw reviewed Jun 22, 2016

[Vidbit] Add new extractor

38bc638

TRox1972 force-pushed the vidbit branch from c685810 to 38bc638 Compare June 25, 2016 22:13

dstftw closed this in f484c5f Jun 26, 2016

dstftw removed the pending-fixes label Jun 26, 2016

TRox1972 deleted the vidbit branch June 26, 2016 10:48
