
[radiocloud.jp] Add new extractor #12725

Closed · wants to merge 1 commit

Conversation

@EliteTK commented Apr 12, 2017

I have

One of the following applies

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

The purpose of this pull request

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description

This is a new extractor for https://radiocloud.jp/.

This is an audio-only website; if that is not within the scope of this project, my apologies, please close the pull request.

Additionally, I can confirm this is not a website based around copyright infringement; from the little Japanese that I know, it appears to be the official website for a Japanese radio station.

The website restricts access to older recordings for users without an account; however, this restriction is enforced by nothing more than a semi-transparent div blocking clicks, so authentication was not implemented in the extractor, as it is unnecessary.

The website doesn't offer any obvious way of getting the URL for a specific recording; to get one, you need to view the RSS feed for a radio program and then use one of the URLs provided there.
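The RSS-feed workflow described above can be sketched roughly as follows. The feed layout and URLs here are hypothetical placeholders for illustration, not the real radiocloud.jp feed structure:

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal RSS payload; a real program feed would carry
# one <item> per recording, each with a per-recording link.
FEED = """<rss><channel>
  <item><link>https://radiocloud.jp/archive/prog?content_id=1</link></item>
  <item><link>https://radiocloud.jp/archive/prog?content_id=2</link></item>
</channel></rss>"""

def recording_urls(feed_xml):
    # Each <item><link> in the program's RSS feed is a URL the
    # extractor can then be pointed at directly.
    root = ET.fromstring(feed_xml)
    return [item.findtext('link') for item in root.iter('item')]
```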

The extractor currently makes no attempt to build a playlist of all recordings found on a program's archive page: there could be quite a few recordings, and I personally don't see it as a useful feature, but it could be done.

The extractor also makes no attempt to guess which recording you wish to download if the URL provided does not refer to a specific content_id.

A note about some regex:

        file_url = self._search_regex(r'var\s*source\s*=\s*"(.+?)"', webpage, 'url')

I think this regex is a bit iffy, but I couldn't think of a better way.

Thank you for your time.

Edit: removed note about .strip() - I was mistaken.

            break

        if not element:
            raise ExtractorError('Could not find details of id {}'.format(video_id))
Collaborator

{} won't work on Python 2.6.
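The issue is that Python 2.6 only supports explicitly numbered replacement fields in str.format; auto-numbered '{}' raises ValueError there. A short sketch of the two portable alternatives (the video_id value is a hypothetical example):

```python
video_id = 'example_id'  # hypothetical id for illustration

# Either give the field an explicit index, which Python 2.6 accepts...
msg_indexed = 'Could not find details of id {0}'.format(video_id)

# ...or use %-formatting, the style the youtube-dl codebase generally uses.
msg_percent = 'Could not find details of id %s' % video_id
```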

return None

element = None
for e in get_elements_by_class("contents_box", webpage):
Collaborator

Single quotes.

note='Downloading player',
errnote='Unable to download player')

file_url = self._search_regex(r'var\s*source\s*=\s*"(.+?)"', webpage, 'url')
Collaborator

Should be var\s+.
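The suggested change matters because \s* also matches zero whitespace characters, so the original pattern would match an unrelated identifier such as varsource. A small sketch (the webpage snippet is a hypothetical example, not actual radiocloud.jp markup):

```python
import re

# Hypothetical player page snippet for illustration.
webpage = 'var source = "https://example.com/audio.m4a";'

# 'var\s+source' requires at least one whitespace character between
# the keyword and the identifier, so 'varsource' no longer matches.
m = re.search(r'var\s+source\s*=\s*"(.+?)"', webpage)
```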

'ext': 'm4a',
'title': '「オープニング」 ',
'description': '「“フィギュアスケートは人生”引退会見で浅田真央さんは何を語ったのか?」スポーツライター・青嶋ひろのさんが解説!',
}
Collaborator

Test fails.

Author

Interesting. It seems that for people without an account, the page loads content from a week or two into the past; for people who are logged in, it loads content from a month into the past.

I'll try to work out if I can find a way to load user restricted content without authentication. If that fails, I'll implement authentication (but I'll make it into its own commit instead of amending the current commit like for the other fixes).

I'm not sure if there is any access to content older than a month. I'll fetch the extracted URL for a recording about to expire and see if I can still access it after it expires.

If I can't find a way to download content which the website does not give direct access to even for logged in users, what should I do for tests which have an "expiration date" so to speak?

Author

@dstftw any ideas about this?

@EliteTK (Author) commented Mar 10, 2018

@dstftw Sorry for the long wait. I've been a bit busy and, for a while, was happy with my own solution.

I've made the requested changes.

Upon further inspection, there is an "end_date" (expiry date) on each upload. I did a search across all uploads on the website (with a separate script) and found an upload which expires in 2038. When 2038 is approaching, feel free to ask for a new test :P

So far I've made do with my own scripts but it would be really neat to get this in youtube_dl so I can use this feature directly with mpv.

Labels: defunct (PR source branch is not accessible), pending-fixes