[spreaker] Add new extractor #13877

Closed
wants to merge 8 commits into from
Contributor

@Tatsh Tatsh commented Aug 10, 2017

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

Extractor for the Spreaker.com podcast site. Fixes issue #13480.

_VALID_URL = r"""(?x)^
https?://
(?:www.|api.)?
spreaker.com/
Collaborator

Dots must be escaped.
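
To illustrate the point, here is a minimal sketch (not the pattern that was eventually merged) comparing the host part of the regex with and without escaped dots. The names `LOOSE` and `STRICT` and the look-alike URL are made up for this example.

```python
import re

# Unescaped: '.' matches any character, so look-alike hosts slip through.
LOOSE = re.compile(r'^https?://(?:www.|api.)?spreaker.com/')
# Escaped: '\.' matches only a literal dot.
STRICT = re.compile(r'^https?://(?:www\.|api\.)?spreaker\.com/')

bogus = 'https://wwwxspreakerzcom/episode/1'  # hypothetical look-alike host
real = 'https://www.spreaker.com/episode/1'

print(bool(LOOSE.match(bogus)))   # the loose pattern accepts the bogus host
print(bool(STRICT.match(bogus)))  # the strict pattern rejects it
print(bool(STRICT.match(real)))   # and still accepts the real one
```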

(?:
show/[a-z0-9_-]+|
user/[a-z0-9_-]+/[a-z0-9_-]|
episode/(?P<id>[0-9]+)
Collaborator

Each should be a separate extractor.
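
One way to read this request, sketched with plain `re` rather than youtube-dl's extractor classes: give each page type its own pattern instead of a single alternation. The `PATTERNS` table, the `classify` helper, and the group names are assumptions for illustration; in youtube-dl each entry would presumably become its own `InfoExtractor` subclass with its own `_VALID_URL`.

```python
import re

_BASE = r'https?://(?:www\.|api\.)?spreaker\.com/'

# One pattern per page type (hypothetical names), instead of one alternation.
PATTERNS = {
    'episode': re.compile(_BASE + r'episode/(?P<id>[0-9]+)'),
    'show': re.compile(_BASE + r'show/(?P<id>[a-z0-9_-]+)'),
    'user_episode': re.compile(
        _BASE + r'user/(?P<user>[a-z0-9_-]+)/(?P<id>[a-z0-9_-]+)'),
}


def classify(url):
    """Return (kind, id) for the first pattern that matches, else None."""
    for kind, pattern in PATTERNS.items():
        m = pattern.match(url)
        if m:
            return kind, m.group('id')
    return None
```

Keeping the patterns apart means each page type can have its own tests and its own extraction logic, which appears to be the intent of the review comment.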

},
{
'url': ('https://www.spreaker.com/user/9780658/swm-ep15-how-to-'
'market-your-music-part-2'),
Collaborator

Don't carry URLs.

]

def _spreaker_episode_data_to_info(self, data):
upload_date = data['published_at'][0:10].replace('-', '')
Collaborator

Should not break if published_at is missing.
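
A minimal sketch of a tolerant version of the line above, assuming `published_at` is a string that starts with `YYYY-MM-DD` when present. The helper name `upload_date_from` is hypothetical; inside an extractor, youtube-dl's own `unified_strdate` from `youtube_dl.utils` would be the more usual choice.

```python
def upload_date_from(data):
    """Return YYYYMMDD from data['published_at'], or None if unavailable."""
    published_at = data.get('published_at')
    # Missing key, None, or an unexpectedly short value: report "unknown".
    if not isinstance(published_at, str) or len(published_at) < 10:
        return None
    return published_at[:10].replace('-', '')
```

Using `data.get(...)` instead of `data['published_at']` avoids the `KeyError`, and the type check avoids slicing `None`.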

if not author:
author = {}
stats = data.get('stats')
view_count = like_count = comment_count = 0
Collaborator

Lack of information is denoted by None.
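
As a sketch of what this could look like: default each count to `None` and convert only values that are actually present. The `stats` key names (`plays`, `likes`, `messages`) are assumptions, not taken from the PR, and `to_int_or_none` is a stand-in for youtube-dl's `int_or_none` utility.

```python
def to_int_or_none(value):
    """Convert to int, returning None when the value is missing or invalid."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def extract_counts(data):
    # 'stats' may be absent or null; fall back to an empty dict.
    stats = data.get('stats') or {}
    return {
        # Key names below are hypothetical placeholders.
        'view_count': to_int_or_none(stats.get('plays')),
        'like_count': to_int_or_none(stats.get('likes')),
        'comment_count': to_int_or_none(stats.get('messages')),
    }
```

With `0` as the default, a consumer cannot tell "no views" from "view count unknown"; `None` keeps that distinction.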

view_count = like_count = comment_count = 0
show = data.get('show')
if not show:
show = {}
Collaborator

show = data.get('show') or {}

show = {}
else:
show_image = show.get('image')
if not show_image:
Collaborator

Reference before assignment for show_image if not show.
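
A sketch of how the `or {}` idiom suggested earlier also removes this problem: every name is bound on every path, so no `else` branch is needed. The function name `show_thumbnail` and the assumption that `image` is a dict with a `url` key are hypothetical; the actual API response shape is not shown in this thread.

```python
def show_thumbnail(data):
    """Return the show image URL, or None when any level is missing."""
    # Each lookup falls back to {} so the next .get() is always safe.
    show = data.get('show') or {}
    show_image = show.get('image') or {}  # always assigned, even if show is empty
    return show_image.get('url')          # 'url' key is an assumed field name
```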

Escape . in regexes
Make separate extractors for episode page, playlist (show), API
Support API's direct links to MP3 files
Make counts set to None in case they are not found
Handle when published_at is not present
Other fixes
@Tatsh
Contributor Author

Tatsh commented Sep 1, 2017

@dstftw This is ready to go on my end.

@peterk

peterk commented Jul 1, 2018

@Tatsh I am also looking for this. Do you plan on finishing it? Was thinking about starting a new PR.

@Tatsh
Contributor Author

Tatsh commented Jul 1, 2018

@peterk I pushed new code up with the requested changes from @dstftw. This should be good to go now.

I suggest maintaining a local branch of your own, merging master and merging the pull requests you like. This way you can use the newest features and stay up-to-date with master. Just don't use this branch for pull requests (always start with master).

@rudolphos

Is there an update on this?

@Tatsh
Contributor Author

Tatsh commented Jan 26, 2019

@dstftw Please review again since the last commit.

@Tatsh
Contributor Author

Tatsh commented Mar 20, 2019

Bump

@rudolphos

bump, still doesn't work

#13480 (comment)

@Tatsh
Contributor Author

Tatsh commented Sep 8, 2019

@rudolphos This version works fine.

@rudolphos

I'm using the latest version and it doesn't work

Input URL: https://www.spreaker.com/user/hjmadigan/51-lost-in-a-good-game --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-ci', '--write-thumbnail', '--no-check-certificate', '--sub-format', 'ass/srt/best', '--write-auto-sub', '--convert-subs', 'srt', '--write-description', '-f', 'bestvideo+bestaudio/best', '--download-archive', 'downloaded.txt', '--output', '/Videos/%(uploader)s - %(title)s (%(id)s).%(ext)s', 'https://www.spreaker.com/user/hjmadigan/51-lost-in-a-good-game', '--verbose']
[debug] Encodings: locale cp1252, fs mbcs, out cp437, pref cp1252
[debug] youtube-dl version 2019.09.12.1
[debug] Python version 3.4.4 (CPython) - Windows-10-10.0.18362
[debug] exe versions: ffmpeg N-93825-gc967128952, ffprobe N-93825-gc967128952, rtmpdump 2.4
[debug] Proxy map: {}
[generic] 51-lost-in-a-good-game: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 51-lost-in-a-good-game: Downloading webpage
[generic] 51-lost-in-a-good-game: Extracting information
[generic] 8913843fbb3c94807f9351c73d45ca5e: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 8913843fbb3c94807f9351c73d45ca5e: Downloading webpage
[generic] 8913843fbb3c94807f9351c73d45ca5e: Extracting information
ERROR: Unsupported URL: https://widget.spreaker.com/player?episode_id=19032297&autoplay=false&playlist=show&cover_image_url=https://d3wo5wojvuv7l.cloudfront.net/images.spreaker.com/original/8913843fbb3c94807f9351c73d45ca5e.jpg
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmphopugs4t\build\youtube_dl\YoutubeDL.py", line 796, in extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmphopugs4t\build\youtube_dl\extractor\common.py", line 530, in extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmphopugs4t\build\youtube_dl\extractor\generic.py", line 3355, in _real_extract
youtube_dl.utils.UnsupportedError: Unsupported URL: https://widget.spreaker.com/player?episode_id=19032297&autoplay=false&playlist=show&cover_image_url=https://d3wo5wojvuv7l.cloudfront.net/images.spreaker.com/original/8913843fbb3c94807f9351c73d45ca5e.jpg
Input URL: https://www.spreaker.com/show/psychology-of-video-games-podcast_1 --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-ci', '--write-thumbnail', '--no-check-certificate', '--sub-format', 'ass/srt/best', '--write-auto-sub', '--convert-subs', 'srt', '--write-description', '-f', 'bestvideo+bestaudio/best', '--download-archive', 'downloaded.txt', '--output', '/Videos/%(uploader)s - %(title)s (%(id)s).%(ext)s', 'https://www.spreaker.com/show/psychology-of-video-games-podcast_1', '--verbose']
[debug] Encodings: locale cp1252, fs mbcs, out cp437, pref cp1252
[debug] youtube-dl version 2019.09.12.1
[debug] Python version 3.4.4 (CPython) - Windows-10-10.0.18362
[debug] exe versions: ffmpeg N-93825-gc967128952, ffprobe N-93825-gc967128952, rtmpdump 2.4
[debug] Proxy map: {}
[generic] psychology-of-video-games-podcast_1: Requesting header
WARNING: Falling back on generic information extractor.
[generic] psychology-of-video-games-podcast_1: Downloading webpage
[generic] psychology-of-video-games-podcast_1: Extracting information
[generic] 6773d4af57b93eac576d2c1ab827bf40: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 6773d4af57b93eac576d2c1ab827bf40: Downloading webpage
[generic] 6773d4af57b93eac576d2c1ab827bf40: Extracting information
ERROR: Unsupported URL: https://widget.spreaker.com/player?show_id=3329893&autoplay=false&cover_image_url=https://d1bm3dmew779uf.cloudfront.net/cover/6773d4af57b93eac576d2c1ab827bf40.jpg
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmphopugs4t\build\youtube_dl\YoutubeDL.py", line 796, in extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmphopugs4t\build\youtube_dl\extractor\common.py", line 530, in extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmphopugs4t\build\youtube_dl\extractor\generic.py", line 3355, in _real_extract
youtube_dl.utils.UnsupportedError: Unsupported URL: https://widget.spreaker.com/player?show_id=3329893&autoplay=false&cover_image_url=https://d1bm3dmew779uf.cloudfront.net/cover/6773d4af57b93eac576d2c1ab827bf40.jpg
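
Both failures end at a `widget.spreaker.com/player` URL that no extractor claims. Purely as an illustration (this is not code from the PR or from youtube-dl), the IDs in those URLs can be read from the query string with the standard library; the helper name `widget_ids` is hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def widget_ids(url):
    """Return the episode_id/show_id found in a Spreaker widget URL, or None."""
    parts = urlparse(url)
    if parts.netloc != 'widget.spreaker.com' or parts.path != '/player':
        return None
    query = parse_qs(parts.query)
    # parse_qs returns lists of values; keep the first value of each known key.
    return {key: query[key][0]
            for key in ('episode_id', 'show_id') if key in query}
```

An extractor covering the widget host could, under this assumption, map `episode_id` or `show_id` onto the episode and show handling discussed above.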

@Tatsh Tatsh requested a review from dstftw November 28, 2019 04:18
@Tatsh
Contributor Author

Tatsh commented Nov 17, 2020

@dstftw If you want me to make changes, I'll need to re-make this PR. I asked GitHub and they cannot fix branch references.

@dstftw dstftw closed this in 686e898 Nov 25, 2020
@dstftw
Collaborator

dstftw commented Nov 25, 2020

Picked some code in 686e898. Thanks.


4 participants