[peertube] Added extractor #16329

parth-verma · 2018-04-29T18:18:23Z

Adds the ability to download videos from peertube.touhoppai.moe thus resolving #16301

dstftw · 2018-04-29T18:35:39Z

youtube_dl/extractor/peertube.py

+    _BASE_THUMBNAIL_URL = 'https://peertube.touhoppai.moe/static/previews/%s.jpg'
+    IE_DESC = 'Peertube Videos'
+    IE_NAME = 'Peertube'
+    _VALID_URL = r'https?:\/\/peertube\.touhoppai\.moe\/videos\/watch\/(?P<id>[0-9|\-|a-z]+)'


Which line in particular are you referring to as invalid?

This exact line, namely the capture group.

dstftw · 2018-04-29T18:35:50Z

youtube_dl/extractor/peertube.py

+
+
+class PeertubeIE(InfoExtractor):
+    _BASE_VIDEO_URL = 'https://peertube.touhoppai.moe/static/webseed/%s-1080.mp4'


No hardcodes. Use API.

Which API are you talking about?

The website's API.

dstftw · 2018-04-29T18:36:42Z

youtube_dl/extractor/peertube.py

+            'ext': 'mp4',
+            'title': 'David Revoy Live Stream: Speedpainting',
+            'description': 'md5:5c09a6e3fdb5f56edce289d69fbe7567',
+            'thumbnail': 'https://peertube.touhoppai.moe/static/previews/7f3421ae-6161-4a4a-ae38-d167aec51683.jpg',


parth-verma · 2018-04-30T19:25:08Z

@dstftw please review

dstftw · 2018-04-30T19:25:39Z

youtube_dl/extractor/peertube.py

+class PeertubeIE(InfoExtractor):
+    IE_DESC = 'Peertube Videos'
+    IE_NAME = 'Peertube'
+    _VALID_URL = r'https?:\/\/peertube\.touhoppai\.moe\/videos\/watch\/(?P<id>[0-9|\-|a-z]+)'


Nothing changed.

What exactly is wrong with the capture group? It seems correct to me and runs without any errors.

The code in brackets is a set of characters to match. ORing won't work there.
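To illustrate the point about brackets, here is a minimal sketch using Python's standard `re` module. The `bogus_id` value and the second pattern are assumptions made up for the comparison; they are not taken from the pull request.

```python
import re

# Inside [...] the pipe is just another member of the set, not alternation.
char_class_with_pipe = re.compile(r'^[0-9|\-|a-z]+$')   # pattern as submitted
plain_char_class = re.compile(r'^[0-9a-z-]+$')          # same set without "|"

video_id = '7f3421ae-6161-4a4a-ae38-d167aec51683'
bogus_id = 'abc|def'  # hypothetical input containing a literal pipe

# Both patterns accept a real ID...
accepts_real_a = bool(char_class_with_pipe.match(video_id))
accepts_real_b = bool(plain_char_class.match(video_id))

# ...but only the first one also lets a literal "|" through.
accepts_bogus_a = bool(char_class_with_pipe.match(bogus_id))
accepts_bogus_b = bool(plain_char_class.match(bogus_id))
```

This is why the submitted pattern "runs without any errors": it is valid syntax, it just matches more than intended.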

dstftw · 2018-04-30T19:25:44Z

youtube_dl/extractor/extractors.py

@@ -1332,7 +1333,7 @@
    WebOfStoriesPlaylistIE,
 )
 from .weibo import (
-    WeiboIE, 
+    WeiboIE,


Remove all unrelated changes.

dstftw · 2018-04-30T19:25:59Z

youtube_dl/extractor/peertube.py

+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        url_data = compat_urlparse.urlparse(url)
+        api_url = "%s://%s/api/v1/videos/%s" % (url_data.scheme, url_data.hostname, video_id)


dstftw · 2018-04-30T19:26:42Z

youtube_dl/extractor/peertube.py

+        return {
+            'id': video_id,
+            'title': details['name'],
+            'description': details['description'],


Should not break if missing.

dstftw · 2018-04-30T19:27:26Z

youtube_dl/extractor/peertube.py

+            'id': video_id,
+            'title': details['name'],
+            'description': details['description'],
+            'url': details['files'][-1]['fileUrl'],


All formats must be extracted.

dstftw · 2018-04-30T19:27:31Z

youtube_dl/extractor/peertube.py

+            'title': details['name'],
+            'description': details['description'],
+            'url': details['files'][-1]['fileUrl'],
+            'thumbnail': url_data.scheme + '://' + url_data.hostname + details['thumbnailPath']


parth-verma · 2018-05-02T14:15:23Z

@dstftw Made the required changes. Please review.

dstftw · 2018-05-03T16:20:17Z

youtube_dl/extractor/peertube.py

+class PeertubeIE(InfoExtractor):
+    IE_DESC = 'Peertube Videos'
+    IE_NAME = 'Peertube'
+    _VALID_URL = r'https?:\/\/peertube\.touhoppai\.moe\/videos\/watch\/(?P<id>[0-9|a-z]{8}-[0-9|a-z]{4}-[0-9|a-z]{4}-[0-9|a-z]{4}-[0-9|a-z]{12})'


Do you read my messages at all?

dstftw · 2018-05-03T16:23:45Z

youtube_dl/extractor/peertube.py

+        video_id = self._match_id(url)
+        url_data = compat_urlparse.urlparse(url)
+        base_url = "%s://%s" % (url_data.scheme, url_data.hostname)
+        api_url = urljoin(urljoin(base_url, "/api/v1/videos/"), video_id)


What the hell are you doing? urljoin(url, '/api/v1/videos/%s' % video_id). All.
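The one-liner works because a reference that starts with `/` replaces the whole path of the base URL, so the page URL itself can serve as the base. A small sketch of that behaviour, using the standard library's `urllib.parse.urljoin` as a stand-in for youtube-dl's own `urljoin` helper (an assumption; the two are not identical in every edge case):

```python
from urllib.parse import urljoin  # stdlib stand-in, not youtube-dl's helper

page_url = ('https://peertube.touhoppai.moe/videos/watch/'
            '7f3421ae-6161-4a4a-ae38-d167aec51683')
video_id = '7f3421ae-6161-4a4a-ae38-d167aec51683'

# The absolute path replaces "/videos/watch/<id>"; scheme and host are
# inherited from the page URL, so no separate base URL is needed.
api_url = urljoin(page_url, '/api/v1/videos/%s' % video_id)
```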

dstftw · 2018-05-03T16:24:20Z

youtube_dl/extractor/peertube.py

+        details = self._download_json(api_url, video_id)
+        return {
+            'id': video_id,
+            'title': details.get('name'),


Read: coding conventions, mandatory fields.

dstftw · 2018-05-03T16:24:31Z

youtube_dl/extractor/peertube.py

+            'title': details.get('name'),
+            'description': details.get('description'),
+            'formats': [{'url': file_data['fileUrl'], 'filesize': file_data.get('size')} for file_data in sorted(details['files'], key=lambda x: x['size'])],
+            'thumbnail': urljoin(base_url, details['thumbnailPath'])


Read: coding conventions, optional fields.
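The two comments on mandatory and optional fields point at opposite handling: a mandatory field such as the title should fail loudly when it is absent, while optional metadata should quietly become `None`. A minimal sketch of that split follows; `build_info` and the sample dictionaries are hypothetical and only illustrate the idea.

```python
def build_info(video_id, details):
    """Map an API response dict to an info dict (illustrative sketch)."""
    return {
        'id': video_id,
        # Mandatory: plain indexing raises KeyError if the API omits it,
        # which is preferable to returning a video without a title.
        'title': details['name'],
        # Optional: .get() yields None instead of raising.
        'description': details.get('description'),
        'duration': details.get('duration'),
    }


info = build_info('abc', {'name': 'Some title'})  # hypothetical response
```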

dstftw · 2018-05-03T16:25:52Z

youtube_dl/extractor/peertube.py

+            'id': video_id,
+            'title': details.get('name'),
+            'description': details.get('description'),
+            'formats': [{'url': file_data['fileUrl'], 'filesize': file_data.get('size')} for file_data in sorted(details['files'], key=lambda x: x['size'])],


_sort_formats.

Must not break if any of these keys is missing.
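One way to read this request is to build the format list in an explicit loop that tolerates missing keys, and to leave ordering to the extractor's `_sort_formats` method. The sketch below is standalone, so the sorting call only appears as a comment; `build_formats` and the sample data are hypothetical.

```python
def build_formats(details):
    """Collect formats from an API response without assuming any key exists."""
    formats = []
    # "files" itself may be absent or None, so fall back to an empty list.
    for file_data in details.get('files') or []:
        if not isinstance(file_data, dict):
            continue
        file_url = file_data.get('fileUrl')
        if not file_url:
            continue  # a format without a URL is unusable, skip it
        formats.append({
            'url': file_url,
            'filesize': file_data.get('size'),  # optional, may be None
        })
    # Inside an extractor, self._sort_formats(formats) would be called here
    # instead of sorting by hand with sorted(..., key=...).
    return formats


sample = {'files': [{'fileUrl': 'u', 'size': 1}, {'size': 2}, {'fileUrl': 'v'}]}
formats = build_formats(sample)
```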

parth-verma · 2018-05-03T17:14:17Z

@dstftw Made the changes. Please verify.

dstftw · 2018-05-03T17:20:56Z

youtube_dl/extractor/peertube.py

+        video_id = self._match_id(url)
+        url_data = compat_urlparse.urlparse(url)
+        base_url = "%s://%s" % (url_data.scheme, url_data.hostname)
+        api_url = urljoin(base_url, "/api/v1/videos/%s" % video_id)


Are you trolling or what?

AGAIN Can you please be more descriptive!!

I've provided a clear, working piece of code that you just need to copy and paste. Instead you introduced a mess with the base URL.

I did just copy that 😐. The change with the base URL is in case the URL is received with no scheme defined.

dstftw · 2018-05-03T17:22:03Z

youtube_dl/extractor/peertube.py

+            'duration': details.get('duration'),
+            'view_count': details.get('views'),
+            'like_count': details.get('likes'),
+            'dislike_count': details.get('dislikes'),


int_or_none.
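youtube-dl ships an `int_or_none` helper in `youtube_dl.utils` for exactly this kind of numeric metadata. The function below is a simplified stand-in written for illustration, not the library's implementation, and the `details` values are made up.

```python
def int_or_none_sketch(value, default=None):
    """Simplified stand-in for youtube_dl.utils.int_or_none (assumption)."""
    if value is None:
        return default
    try:
        return int(value)
    except (TypeError, ValueError):
        # Anything that cannot be converted becomes the default
        # instead of raising in the middle of extraction.
        return default


details = {'duration': '113', 'views': 42, 'likes': 'n/a'}  # hypothetical
duration = int_or_none_sketch(details.get('duration'))
view_count = int_or_none_sketch(details.get('views'))
like_count = int_or_none_sketch(details.get('likes'))
dislike_count = int_or_none_sketch(details.get('dislikes'))
```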

dstftw · 2018-05-03T17:22:10Z

youtube_dl/extractor/peertube.py

+            'thumbnail': urljoin(base_url, details['thumbnailPath']) if 'thumbnailPath' in details else None,
+            'uploader': details.get('account', {}).get('name'),
+            'uploader_id': details.get('account', {}).get('id'),
+            'uploder_url': details.get('account', {}).get('url'),


dstftw · 2018-05-03T17:22:24Z

youtube_dl/extractor/peertube.py

+class PeertubeIE(InfoExtractor):
+    IE_DESC = 'Peertube Videos'
+    IE_NAME = 'Peertube'
+    _VALID_URL = r'(?:https?:)//peertube\.touhoppai\.moe\/videos\/watch\/(?P<id>[0-9|a-z]{8}-[0-9|a-z]{4}-[0-9|a-z]{4}-[0-9|a-z]{4}-[0-9|a-z]{12})'


Can you please be more descriptive?

Read my previous comments. I'm not going to repeat myself.

This regex is appropriate because the id is always a UUID, which has a fixed layout with hexadecimal values. So although a-z should be replaced by a-f, I believe the rest of the pattern can stay the same.

Link

You: [0-9|a-f]. Stackoverflow: [0-9a-f]. Difference? You have to learn to distinguish (...) and [...].
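For reference, a UUID-shaped capture group without the stray pipes could look like the sketch below. It is one possible pattern under the assumptions discussed in this thread (lowercase hex, 8-4-4-4-12 layout), not necessarily the one that was eventually merged; `match_id` is a hypothetical helper that mimics what the extractor's `_match_id` does.

```python
import re

# Each segment is a plain hex character class; no "|" inside the brackets.
UUID_RE = r'[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'
VALID_URL = (r'https?://peertube\.touhoppai\.moe/videos/watch/(?P<id>%s)'
             % UUID_RE)


def match_id(url):
    """Return the captured id, or None if the URL does not match."""
    m = re.match(VALID_URL, url)
    return m.group('id') if m else None


good = match_id('https://peertube.touhoppai.moe/videos/watch/'
                '7f3421ae-6161-4a4a-ae38-d167aec51683')
bad = match_id('https://peertube.touhoppai.moe/videos/watch/'
               '7f3421a|-6161-4a4a-ae38-d167aec51683')
```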

parth-verma · 2018-05-17T07:31:07Z

@h-h-h-h made the changes, check now.

[peertube] Added extractor

0340263

dstftw requested changes Apr 29, 2018

dstftw added the pending-fixes label Apr 29, 2018

Integrated peertube api for peertube

2d8fc9c

parth-verma force-pushed the peertube_extractor branch from 689c980 to 2d8fc9c on April 30, 2018, 17:01

dstftw requested changes Apr 30, 2018

parth-verma added 2 commits May 1, 2018 01:15

Used urljoin for url comprehension

b191967

Added multiple formats for videos

45c7e97

parth-verma force-pushed the peertube_extractor branch from 896568a to 45c7e97 on April 30, 2018, 19:59

parth-verma added 2 commits May 1, 2018 15:45

Fixed regex for valid url capture group

06ad3fe

made filesize field optional

6655bbc

dstftw requested changes May 3, 2018

Made requested changes

115bdf0

dstftw reviewed May 3, 2018

parth-verma added 5 commits May 3, 2018 23:24

Added none handling of integers via int_or_none.

cc7f194

Added try_get for multiple getters

4cc0397

Fixed uuid char range

6f22764

Merge branch 'master' into peertube_extractor

fdb1961

Fixed regex

2c08d27

dstftw closed this in c561b75 May 25, 2018


