[iheartradio:podcast] Add new extractor #27037

gardenappl · 2020-11-17T11:22:21Z

Please follow the guide below

You will be asked some questions, please read them carefully and answer honestly
Put an x into all the boxes [ ] relevant to your pull request (like that [x])
Use the Preview tab to see what your pull request will actually look like

Before submitting a pull request make sure you have:

At least skimmed through adding new extractor tutorial and youtube-dl coding conventions sections
Searched the bugtracker for similar pull requests
Checked the code with flake8

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

I am the original author of this code and I am willing to release it under Unlicense
I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

Bug fix
Improvement
New extractor
New feature

Description of your pull request and other information

(continued from #26394)

Added support for podcasts from https://iheart.com/podcast. Supports individual episodes, as well as entire series of podcasts, using their internal API.

Note that IHeartRadio offers other services besides podcasts, but those seem to be only available in the US and/or Canada. That includes things like music playlists, and actual radio stations. I did not include support for them because these are not available where I live, and frankly, I really don't care about those features.

remitamine · 2021-01-01T20:03:33Z

You should split the extractor into two:

one that handles individual episodes
another that handles a whole podcast.

gardenappl · 2021-01-02T15:24:37Z

@remitamine Done, also made a few fixes.

gardenappl · 2021-01-02T15:30:57Z

Made commit message a bit shorter.

remitamine · 2021-01-02T16:52:47Z

The episode and podcast code is still mixed up. Code that is specific to podcasts should not be put in the base extractor, and the same goes for code that is specific to episodes.

gardenappl · 2021-01-03T16:31:52Z

This last commit should be good.

remitamine · 2021-01-03T16:56:56Z

youtube_dl/extractor/iheartradio.py

+        episodes = self._get_all_episodes(podcast_id, temp_user)
+        episode_ids = [episode['id'] for episode in episodes]
+
+        streams_info = self._get_streams_info(podcast_id, episode_ids,
+                                              temp_user, podcast_id)


No, you should delegate the extraction to the episode extractor.
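For illustration only, the delegation pattern suggested here can be sketched without youtube-dl installed. The two helpers below are stand-ins that mimic the dict shapes returned by `InfoExtractor.url_result` and `InfoExtractor.playlist_result`. The episode URL template and the `IHeartRadioEpisode` key are assumptions made up for this sketch, not names taken from the PR.

```python
def url_result(url, ie_key=None, video_id=None, video_title=None):
    # Stand-in for InfoExtractor.url_result: a reference that tells the
    # downloader "resolve this URL with another extractor".
    result = {'_type': 'url', 'url': url, 'ie_key': ie_key}
    if video_id is not None:
        result['id'] = video_id
    if video_title is not None:
        result['title'] = video_title
    return result


def playlist_result(entries, playlist_id=None, playlist_title=None):
    # Stand-in for InfoExtractor.playlist_result.
    result = {'_type': 'playlist', 'entries': entries}
    if playlist_id:
        result['id'] = playlist_id
    if playlist_title:
        result['title'] = playlist_title
    return result


def extract_podcast(podcast_id, podcast_title, episodes):
    # The podcast extractor only lists episodes; it never resolves streams.
    entries = []
    for episode in episodes:
        episode_id = episode.get('id')
        if episode_id is None:
            continue  # skip malformed entries instead of failing the playlist
        # Hypothetical URL template, assumed for this sketch.
        episode_url = 'https://www.iheart.com/podcast/%s/episode/%s/' % (
            podcast_id, episode_id)
        entries.append(url_result(
            episode_url, ie_key='IHeartRadioEpisode',  # hypothetical key
            video_id=str(episode_id), video_title=episode.get('title')))
    return playlist_result(entries, podcast_id, podcast_title)
```

With this shape, stream lookup lives only in the episode extractor, and the podcast extractor needs no per-episode stream request of its own.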

remitamine · 2021-01-03T16:59:41Z

youtube_dl/extractor/iheartradio.py

+    # Register anonymous user, same behavior as web app
+    def _register_temp_user(self, current_id):
+        random_device_id = compat_str(uuid.uuid4())
+        random_oauth_id = compat_str(uuid.uuid4())
+
+        register_user_values = urlencode_postdata({
+            'accessToken': 'anon',
+            'accessTokenType': 'anon',
+            'deviceId': random_device_id,
+            'deviceName': "web-desktop",
+            'host': "webapp.WW",
+            'oauthUuid': random_oauth_id,
+            'userName': 'anon' + random_oauth_id
+        })
+        return self._download_json(
+            'https://ww.api.iheart.com/api/v1/account/loginOrCreateOauthUser',
+            current_id, "Registering temporary user", data=register_user_values,
+            headers={'Accept': 'application/json, text/plain, */*',
+                     'X-hostName': 'webapp.WW'})


This can be avoided by using API v3 for all requests.

I'm not sure how I would do that, I was just looking at the network requests done by the web frontend and copying that behavior.

Also, what are we trying to avoid? Other API calls require the output from this.

We are trying to avoid additional requests. To move quickly with this PR, I will make a simpler implementation, and then you can base your modifications on it.

The extractor that relies on API v3 has been added in 9c484c0.

remitamine · 2021-01-03T17:01:13Z

youtube_dl/extractor/iheartradio.py

+)
+
+
+# To get the audio files, we have to use their internal API


There is no need to add comments that can be easily deduced from the code.

remitamine · 2021-01-03T17:01:53Z

youtube_dl/extractor/iheartradio.py

+        # Release date timestamp is in milliseconds
+        release_date = content_info.get('startDate')
+        if isinstance(release_date, Number) and release_date > 2000000000:
+            release_date /= 1000


Use the int_or_none function.
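As a rough sketch of what this suggestion amounts to: the function below is a simplified stand-in modelled on `int_or_none` from `youtube_dl.utils`, reimplemented here so the example runs on its own. It assumes, as the diff comment states, that `startDate` is always in milliseconds. The sample value is made up.

```python
def int_or_none(v, scale=1, default=None, invscale=1):
    # Simplified stand-in for youtube_dl.utils.int_or_none:
    # convert to int and rescale, or fall back to `default`.
    if v is None or v == '':
        return default
    try:
        return int(v) * invscale // scale
    except (ValueError, TypeError):
        return default


# Made-up sample payload; the real API response is not shown in this thread.
content_info = {'startDate': 1609459200000}

# Milliseconds to seconds in one call, with no isinstance check or
# magic threshold, and None when the field is missing or malformed.
timestamp = int_or_none(content_info.get('startDate'), scale=1000)
```

A missing or non-numeric field simply yields `None`, which is how youtube-dl expects optional metadata to be reported.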

remitamine · 2021-01-03T17:04:21Z

youtube_dl/extractor/iheartradio.py

+        # Remove analytics from stream URL (optional)
+        streamUrl = item_info['streamUrl']
+        streamUrl = re.sub(r'(?:www\.)?podtrac\.com/pts/redirect\.[\w]*/', '', streamUrl)
+        streamUrl = re.sub(r'chtbl\.com/track/[\w]*/', '', streamUrl)
+        streamUrl = re.sub(r'\?source=[\w]*', '', streamUrl)


This should be done separately from this PR (it's used in multiple services that are supported by youtube-dl, so it should be done in a generic way).

I couldn't find any other mentions of "podtrac" or "chtbl" in other extractors. But yeah, I guess I shouldn't be doing that in this file.
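One possible shape for such a generic helper, shown only as a sketch: the two patterns are taken from the diff above, while the function name, the tuple of patterns, and the sample URL are assumptions for illustration. No such helper is claimed to exist in youtube-dl.

```python
import re

# Tracking-redirect prefixes taken from the diff under review.
_TRACKER_PATTERNS = (
    r'(?:www\.)?podtrac\.com/pts/redirect\.\w*/',
    r'chtbl\.com/track/\w*/',
)


def strip_tracking_redirects(url):
    # Hypothetical helper: remove each known tracker prefix so that
    # only the URL of the actual media host remains.
    for pattern in _TRACKER_PATTERNS:
        url = re.sub(pattern, '', url)
    return url


# Made-up example URL that chains both trackers in front of the media host.
clean = strip_tracking_redirects(
    'https://www.podtrac.com/pts/redirect.mp3/'
    'chtbl.com/track/5899E/traffic.example.com/ep.mp3')
```

Keeping the patterns in one shared tuple would let any extractor reuse the same cleanup instead of each one carrying its own regexes.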

gardenappl · 2021-01-04T15:36:04Z

@remitamine Thanks for co-operating.
You said I could base my modifications on your commits, but your code seems perfectly fine and I don't think I can add much. The only thing I did was fix the test for the episode extractor. It was expecting the description to contain HTML tags, but that's cleaned now.

the description has no HTML tags now.

gardenappl mentioned this pull request Nov 17, 2020

[iheartradio:podcast] Add new extractor #26394

Closed

9 tasks

remitamine added the pending-fixes label Jan 1, 2021

gardenappl force-pushed the iheartradio branch from 904c312 to 00ae1f6 Compare January 2, 2021 15:30

gardenappl force-pushed the iheartradio branch from 305f463 to a1c4403 Compare January 3, 2021 16:30

remitamine requested changes Jan 3, 2021


remitamine added a commit that referenced this pull request Jan 4, 2021

[iheart] Add new extractor for iHeartRadio(#27037)

9c484c0

[iheart] Fix test, description has no HTML tags now

38e12bf

gardenappl force-pushed the iheartradio branch from f1f6ab5 to 38e12bf Compare January 4, 2021 15:33

remitamine merged commit f6ea29e into ytdl-org:master Jan 4, 2021

ThirumalaiK pushed a commit to ThirumalaiK/youtube-dl that referenced this pull request Jan 28, 2021

[iheart] Add new extractor for iHeartRadio(ytdl-org#27037)

2794bbf

ThirumalaiK pushed a commit to ThirumalaiK/youtube-dl that referenced this pull request Jan 28, 2021

[iheart] Update test description value (ytdl-org#27037)

14feea8

the description has no HTML tags now.
