[erocast] Add new extractor for erocast.me #31631

Open, wants to merge 1 commit into base: master

RonniCGN

Please follow the guide below

  • You will be asked some questions, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your pull request (like that [x])
  • Use Preview tab to see how your pull request will actually look like

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

Explanation of your pull request in arbitrary form goes here. Please make sure the description explains the purpose and effect of your pull request and is worded well enough to be understood. Provide as much context and examples as possible.

@RonniCGN RonniCGN mentioned this pull request Feb 20, 2023
@dirkf
Contributor

dirkf commented Mar 7, 2023

As the linked issue didn't use the Site Support template, please complete the template for this site, posted either here, or in the issue, or in a new issue.

@RonniCGN

This comment was marked as resolved.

@dirkf

This comment was marked as resolved.

Contributor

@dirkf dirkf left a comment

Thanks for your work!

I've made some suggestions.


# The Song data is in a script tag with the following format:
# (see https://github.com/ytdl-org/youtube-dl/issues/31203#issuecomment-1259867716)
searchPattern = r'<script>var song_data_' + str(video_id) + r' = (.+?)<\/script>'

  • relax the RE
  • when might video_id not be a string?
  • if the .+ might contain newlines, use r'(?s)... or use [\s\S]+
Suggested change
- searchPattern = r'<script>var song_data_' + str(video_id) + r' = (.+?)<\/script>'
+ searchPattern = r'<script\b[^>]*>var\s+song_data_%s\s*=\s*(.+?)</script>' % re.escape(video_id)
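
A minimal sketch of how the relaxed pattern behaves, using only the standard `re` and `json` modules. The `find_song_data` helper, the sample markup, and the id `12345` are hypothetical. The `(?s)` flag is added on the assumption that the JSON may span several lines, and stripping a trailing semicolon is likewise an assumption about the page.

```python
import json
import re


def find_song_data(webpage, video_id):
    # Hypothetical helper: the suggested pattern, plus (?s) so that '.'
    # also matches newlines inside the JSON (an assumption).
    pattern = (
        r'(?s)<script\b[^>]*>var\s+song_data_%s\s*=\s*(.+?)</script>'
        % re.escape(video_id))
    match = re.search(pattern, webpage)
    if not match:
        return None
    # Assumed clean-up: drop surrounding whitespace and a trailing ';'
    return json.loads(match.group(1).strip().rstrip(';'))


# Made-up markup: a script tag with an attribute, extra spaces,
# and JSON spread over several lines.
sample = (
    '<script type="text/javascript">var  song_data_12345 = {\n'
    '"title": "Demo"\n'
    '};</script>')

# The original, stricter pattern does not match this markup.
strict = r'<script>var song_data_12345 = (.+?)<\/script>'

print(find_song_data(sample, '12345'))
print(re.search(strict, sample))
```

The strict pattern fails here because of the `type` attribute, the double space after `var`, and the newlines, which are the cases the three bullets above are guarding against.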

jsonObject = self._parse_json(jsonString, None, fatal=False)
audio_url = jsonObject['stream_url']
title = jsonObject['title']
user_name = jsonObject['user']['name']

Optional metadata, shouldn't crash extraction:

Suggested change
- user_name = jsonObject['user']['name']
+ user_name = traverse_obj(jsonObject, ('user', 'name'), expected_type=lambda x: x.strip() or None)
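
The idea behind this suggestion can be sketched without youtube-dl's helpers: a direct `jsonObject['user']['name']` raises when a key is missing, whereas a tolerant lookup returns `None` and lets extraction continue. `get_path` and `clean_name` below are hypothetical stand-ins written for illustration (assuming Python 3), not `traverse_obj` itself.

```python
def get_path(obj, path, default=None):
    # Hypothetical stand-in: walk nested dicts, returning `default`
    # as soon as a level is missing or is not a dict.
    for key in path:
        if not isinstance(obj, dict) or key not in obj:
            return default
        obj = obj[key]
    return obj


def clean_name(value):
    # Mirrors the suggested lambda: strip whitespace, map empty to None.
    # Non-string values are treated as missing.
    if not isinstance(value, str):
        return None
    return value.strip() or None


# Made-up sample data
complete = {'title': 'Demo', 'user': {'name': '  Alice '}}
partial = {'title': 'Demo'}

print(clean_name(get_path(complete, ('user', 'name'))))  # Alice
print(clean_name(get_path(partial, ('user', 'name'))))   # None
```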

Comment on lines +38 to +41
formats = []
formats.extend(self._extract_m3u8_formats(
audio_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))

Call _sort_formats():

Suggested change
- formats = []
- formats.extend(self._extract_m3u8_formats(
- audio_url, video_id, 'mp4', 'm3u8_native',
- m3u8_id='hls', fatal=False))
+ formats = self._extract_m3u8_formats(
+ audio_url, video_id, 'mp4', 'm3u8_native',
+ m3u8_id='hls', fatal=False)
+ self._sort_formats(formats)
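
As a rough illustration of why the sort matters: youtube-dl expects the formats list ordered from worst to best. The sketch below orders made-up format dicts by total bitrate only. `sort_worst_to_best` is a hypothetical, heavily simplified stand-in, and the real `_sort_formats()` weighs many more fields.

```python
def sort_worst_to_best(formats):
    # Simplified illustration: order by total bitrate ('tbr') only,
    # treating a missing bitrate as the lowest quality.
    return sorted(formats, key=lambda f: f.get('tbr') or -1)


# Made-up format entries, in the order a playlist might list them
formats = [
    {'format_id': 'hls-320', 'tbr': 320},
    {'format_id': 'hls-unknown'},
    {'format_id': 'hls-128', 'tbr': 128},
]

ordered = sort_worst_to_best(formats)
print([f['format_id'] for f in ordered])
# ['hls-unknown', 'hls-128', 'hls-320']
```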

from __future__ import unicode_literals

from .common import InfoExtractor

Suggested change
+ from ..utils import traverse_obj

# The Song data is in a script tag with the following format:
# (see https://github.com/ytdl-org/youtube-dl/issues/31203#issuecomment-1259867716)
searchPattern = r'<script>var song_data_' + str(video_id) + r' = (.+?)<\/script>'
jsonString = self._html_search_regex(searchPattern, webpage, 'data')

Maybe make the text for the search more specific:

Suggested change
- jsonString = self._html_search_regex(searchPattern, webpage, 'data')
+ jsonString = self._html_search_regex(searchPattern, webpage, 'song data')

@@ -353,6 +353,8 @@
from .embedly import EmbedlyIE
from .engadget import EngadgetIE
from .eporner import EpornerIE
from .erocast import ErocastIE

Unwanted space

Suggested change

@Simon818

I'm sad this seemingly got stuck. I found an audio file on Erocast recently, and the site is not particularly screen-reader friendly. I know enough to get the audio file out using developer tools, but not enough to write code that automates it from a URL. By this point I'd guess the original PR author isn't working on it anymore.

@aaron-tan
Contributor

Hi all,

I've had a look at the original author's code and noticed it has no support for downloading playlists from the site. I looked into it over the last few days and have managed to implement playlist downloads from erocast.me. Shall I create a new PR for this feature, since the original author doesn't seem to be working on this anymore, as Simon818 remarked? In the new PR I can also apply the suggestions dirkf made above, so it can all be in one neat package and we can close this PR for good.

Thoughts? @Simon818 @dirkf @RonniCGN

@Simon818

@aaron-tan That's awesome. I think you should go ahead and submit it, particularly if it's already working on both playlists and individual tracks. We have no idea when the original PR author is going to respond and you have something that works today.

aaron-tan added a commit to aaron-tan/youtube-dl that referenced this pull request Aug 16, 2023
Build on code from PR ytdl-org#31631.
- Add playlist info extractor to download playlists
- Refactor code from suggestions by @dirkf
@aaron-tan aaron-tan mentioned this pull request Aug 16, 2023
11 tasks
4 participants