[iheartradio:podcast] Add new extractor #27037

Merged: 1 commit merged into ytdl-org:master on Jan 4, 2021

Conversation

gardenappl
Contributor

Please follow the guide below

  • You will be asked some questions, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your pull request (like that [x])
  • Use the Preview tab to see what your pull request will actually look like

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

(continued from #26394)

Added support for podcasts from https://iheart.com/podcast. Supports individual episodes, as well as entire series of podcasts, using their internal API.

Note that IHeartRadio offers other services besides podcasts, but those seem to be only available in the US and/or Canada. That includes things like music playlists, and actual radio stations. I did not include support for them because these are not available where I live, and frankly, I really don't care about those features.

@remitamine
Collaborator

you should split the extractor into:

  • one that handles individual episodes
  • one that handles a whole podcast.

@gardenappl
Contributor Author

@remitamine Done, also made a few fixes.

@gardenappl
Contributor Author

Made commit message a bit shorter.

@remitamine
Collaborator

remitamine commented Jan 2, 2021

the episode and podcast code is still mixed up: code that is specific to podcasts should not be put in the base extractor, and the same goes for episode-specific code.

@gardenappl
Copy link
Contributor Author

This last commit should be good.

Comment on lines 142 to 146
episodes = self._get_all_episodes(podcast_id, temp_user)
episode_ids = [episode['id'] for episode in episodes]

streams_info = self._get_streams_info(podcast_id, episode_ids,
                                      temp_user, podcast_id)
Collaborator

no, you should delegate the extraction to the episode extractor.
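
A minimal sketch of what such delegation could look like. It does not import youtube-dl: `url_result` and `playlist_result` below are simplified stand-ins that only mimic the shape of the dictionaries youtube-dl's helpers of the same names return. The `iheartradio:` URL prefix and the `IHeartRadio` extractor key are hypothetical placeholders, not taken from this PR.

```python
def url_result(url, ie_key=None, video_id=None):
    """Simplified stand-in for youtube-dl's InfoExtractor.url_result."""
    result = {'_type': 'url', 'url': url, 'ie_key': ie_key}
    if video_id is not None:
        result['id'] = video_id
    return result


def playlist_result(entries, playlist_id=None):
    """Simplified stand-in for youtube-dl's InfoExtractor.playlist_result."""
    result = {'_type': 'playlist', 'entries': entries}
    if playlist_id is not None:
        result['id'] = playlist_id
    return result


def extract_podcast(podcast_id, episodes):
    """Build a playlist whose entries only point at the episode extractor.

    No stream information is fetched here; each entry is resolved later
    by whichever extractor handles the (hypothetical) episode URL.
    """
    entries = []
    for episode in episodes:
        episode_id = episode.get('id')
        if episode_id is None:
            continue  # skip malformed items instead of failing the whole podcast
        entries.append(url_result(
            'iheartradio:%s' % episode_id,  # hypothetical internal URL scheme
            ie_key='IHeartRadio',           # hypothetical extractor key
            video_id=str(episode_id)))
    return playlist_result(entries, podcast_id)


podcast = extract_podcast('123', [{'id': 1}, {'id': 2}, {}])
```

With this shape the podcast extractor only enumerates episode IDs, and all metadata and stream handling stays in one place.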

Comment on lines 22 to 40
    # Register anonymous user, same behavior as web app
    def _register_temp_user(self, current_id):
        random_device_id = compat_str(uuid.uuid4())
        random_oauth_id = compat_str(uuid.uuid4())

        register_user_values = urlencode_postdata({
            'accessToken': 'anon',
            'accessTokenType': 'anon',
            'deviceId': random_device_id,
            'deviceName': "web-desktop",
            'host': "webapp.WW",
            'oauthUuid': random_oauth_id,
            'userName': 'anon' + random_oauth_id
        })
        return self._download_json(
            'https://ww.api.iheart.com/api/v1/account/loginOrCreateOauthUser',
            current_id, "Registering temporary user", data=register_user_values,
            headers={'Accept': 'application/json, text/plain, */*',
                     'X-hostName': 'webapp.WW'})
Collaborator

can be avoided by using API v3 for all requests.

Contributor Author

I'm not sure how I would do that, I was just looking at the network requests done by the web frontend and copying that behavior.

Also, what are we trying to avoid? Other API calls require the output from this.

Collaborator

> I'm not sure how I would do that, I was just looking at the network requests done by the web frontend and copying that behavior.
>
> Also, what are we trying to avoid? Other API calls require the output from this.

trying to avoid additional requests. I think that, to move quickly with this PR, I would make a simpler implementation, and then you can base your modifications on it.

Collaborator

the extractor that relies on API v3 has been added in 9c484c0.

)


# To get the audio files, we have to use their internal API
Collaborator

no need to add comments that can be easily deduced from the code.

Comment on lines 82 to 85
# Release date timestamp is in milliseconds
release_date = content_info.get('startDate')
if isinstance(release_date, Number) and release_date > 2000000000:
    release_date /= 1000
Collaborator

use int_or_none function.
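
A sketch of how the suggestion might be applied. The `int_or_none` defined here is a simplified stand-in for the helper of the same name in `youtube_dl.utils`, written out so that the example runs on its own. The sample payload is hypothetical, and the sketch assumes `startDate` is always in milliseconds, as the original comment states, so it drops the `> 2000000000` guard.

```python
def int_or_none(v, scale=1, default=None):
    """Simplified stand-in for youtube_dl.utils.int_or_none.

    Returns int(v) // scale, or `default` when v is missing or not numeric.
    """
    if v is None or v == '':
        return default
    try:
        return int(v) // scale
    except (ValueError, TypeError):
        return default


# Hypothetical API payload; the timestamp is assumed to be in milliseconds.
content_info = {'startDate': 1609718400000}

# One call replaces the isinstance check and the manual division.
timestamp = int_or_none(content_info.get('startDate'), 1000)
```

A missing or malformed `startDate` then yields `None` rather than raising, which suits optional metadata fields.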

Comment on lines 87 to 91
# Remove analytics from stream URL (optional)
streamUrl = item_info['streamUrl']
streamUrl = re.sub(r'(?:www\.)?podtrac\.com/pts/redirect\.[\w]*/', '', streamUrl)
streamUrl = re.sub(r'chtbl\.com/track/[\w]*/', '', streamUrl)
streamUrl = re.sub(r'\?source=[\w]*', '', streamUrl)
Collaborator

this should be done separately from this PR (it's used in multiple services that are supported by youtube-dl, so it should be done in a generic way).

Contributor Author

@gardenappl commented Jan 3, 2021

I couldn't find any other mentions of "podtrac" or "chtbl" in other extractors. But yeah, I guess I shouldn't be doing that in this file.
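
For illustration only, a generic version of that cleanup could be a standalone helper along these lines. The function name `strip_tracking_redirects` and the example URL are hypothetical. The three patterns are the ones from the snippet under review, and the sketch assumes they are safe to apply to any stream URL.

```python
import re

# Patterns taken from the snippet under review: two analytics redirect
# prefixes and a trailing "?source=..." query string.
_TRACKER_PATTERNS = (
    r'(?:www\.)?podtrac\.com/pts/redirect\.\w*/',
    r'chtbl\.com/track/\w*/',
    r'\?source=\w*',
)


def strip_tracking_redirects(url):
    """Remove known analytics redirect prefixes from a media URL."""
    for pattern in _TRACKER_PATTERNS:
        url = re.sub(pattern, '', url)
    return url


# Hypothetical stream URL wrapped in both redirectors.
wrapped = ('https://www.podtrac.com/pts/redirect.mp3/'
           'chtbl.com/track/ABC123/traffic.example.com/ep.mp3?source=web')
clean = strip_tracking_redirects(wrapped)
```

Keeping the patterns in one tuple means any extractor could reuse the helper, and new redirectors would be added in a single place.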

@gardenappl
Contributor Author

@remitamine Thanks for co-operating.
You said I could base my modifications on your commits, but your code seems perfectly fine and I don't think I can add much. The only thing I did was fix the test for the episode extractor. It was expecting the description to contain HTML tags, but that's cleaned now.

@remitamine remitamine merged commit f6ea29e into ytdl-org:master Jan 4, 2021
ThirumalaiK pushed a commit to ThirumalaiK/youtube-dl that referenced this pull request Jan 28, 2021
ThirumalaiK pushed a commit to ThirumalaiK/youtube-dl that referenced this pull request Jan 28, 2021
the description has no HTML tags now.