[dnb] Add new extractor #18725

Closed
wants to merge 3 commits into from
@user706 user706 commented Jan 2, 2019

Please follow the guide below

  • You will be asked some questions, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your pull request (like that [x])
  • Use the Preview tab to see what your pull request will actually look like

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

New extractor for e.g. https://portal.dnb.de/audioplayer/do/show/1077188552

user706 commented Jan 2, 2019

Strange: the download only works correctly on the second attempt.

Example:

$ python3 -m youtube_dl https://portal.dnb.de/audioplayer/do/show/107679887X
[DNB] 107679887X: Downloading webpage
[download] Downloading playlist: None
[DNB] playlist None: Collected 1 video ids (downloading 1 of them)
[download] Downloading video 1 of 1
ERROR: unable to download video data: HTTP Error 404: Not Found

$ python3 -m youtube_dl https://portal.dnb.de/audioplayer/do/show/107679887X
[DNB] 107679887X: Downloading webpage
[download] Downloading playlist: None
[DNB] playlist None: Collected 1 video ids (downloading 1 of them)
[download] Downloading video 1 of 1
[download] Destination: Gavotte und Bourée aus der 'Französischen Suite Nr. 5' [Elektronische Ressource] _ Johann Sebastian Bach-107679887X.mp3
[download] 100% of 4.48MiB in 00:01
[download] Finished downloading playlist: None

Here are some URLs to test with:
https://portal.dnb.de/audioplayer/do/show/1077188552
https://portal.dnb.de/audioplayer/do/show/1077187920
https://portal.dnb.de/audioplayer/do/show/107745936X
https://portal.dnb.de/audioplayer/do/show/1077188145
https://portal.dnb.de/audioplayer/do/show/1076798888
https://portal.dnb.de/audioplayer/do/show/107679887X
https://portal.dnb.de/audioplayer/do/show/1076798861



class DNBIE(InfoExtractor):
    _VALID_URL = r'https?://(?:portal\.dnb\.de/audioplayer/do/show/|d-nb\.info/)(?P<id>\w+)[/&]?'
Collaborator

[/&]? does not make any sense at the end.
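To illustrate the point about the trailing `[/&]?`, here is a minimal sketch, assuming the pattern is applied with `re.match`, which anchors only at the start of the string. The optional class can match zero characters, so it should not change whether a URL matches or which `id` is captured. `WITH_SUFFIX`, `WITHOUT_SUFFIX` and `extract_id` are names made up for this comparison, and the third URL (with a trailing slash) is a hypothetical variant, not one from the PR.

```python
import re

# Pattern as submitted in the PR, with the trailing optional class.
WITH_SUFFIX = r'https?://(?:portal\.dnb\.de/audioplayer/do/show/|d-nb\.info/)(?P<id>\w+)[/&]?'
# Same pattern with the trailing class dropped.
WITHOUT_SUFFIX = r'https?://(?:portal\.dnb\.de/audioplayer/do/show/|d-nb\.info/)(?P<id>\w+)'

URLS = [
    'https://portal.dnb.de/audioplayer/do/show/1077188552',
    'https://portal.dnb.de/audioplayer/do/show/107679887X',
    # Hypothetical variant with a trailing slash.
    'https://portal.dnb.de/audioplayer/do/show/107679887X/',
]


def extract_id(pattern, url):
    """Return the captured id, or None when the URL does not match."""
    m = re.match(pattern, url)
    return m.group('id') if m else None


ids_with = [extract_id(WITH_SUFFIX, u) for u in URLS]
ids_without = [extract_id(WITHOUT_SUFFIX, u) for u in URLS]
# Both patterns capture the same ids for every URL.
print(ids_with == ids_without)
```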

}]

@staticmethod
def update_and_return_dic(info_dict, update_info):
Collaborator

Inline.

'id': obj.get('idn'),
'title': obj.get('title'),
'author': obj.get('author'),
'url': 'https://portal.dnb.de/' + obj.get('media_url'),
Collaborator

urljoin.
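A rough sketch of why a join helper is preferable to string concatenation here. Plain `'https://portal.dnb.de/' + obj.get('media_url')` raises `TypeError` when the key is missing and produces a double slash when the path already starts with `/`. youtube-dl ships its own `urljoin` in `youtube_dl.utils`; so that the example runs without youtube-dl installed, it uses the standard library's `urllib.parse.urljoin` behind a hypothetical wrapper, `join_or_none`. The media paths are invented, since the PR does not show real field values.

```python
from urllib.parse import urljoin

BASE_URL = 'https://portal.dnb.de/'


def join_or_none(base, path):
    """Join path onto base; return None when path is missing or not a string."""
    if not isinstance(path, str) or not path:
        return None
    return urljoin(base, path)


# Hypothetical media objects; the real field values are not shown in the PR.
with_leading_slash = {'media_url': '/audio/track.mp3'}
without_leading_slash = {'media_url': 'audio/track.mp3'}
missing = {}

for obj in (with_leading_slash, without_leading_slash, missing):
    print(join_or_none(BASE_URL, obj.get('media_url')))
```

Both path spellings resolve to the same absolute URL, and the missing field yields `None` instead of an exception.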

for obj in objs:
    thumbnail = obj.get('cover_url')
    if thumbnail:
        thumbnail = 'https://portal.dnb.de/' + thumbnail
Collaborator

urljoin.


info_dict = {
    'id': obj.get('idn'),
    'title': obj.get('title'),
Collaborator

Mandatory. Read coding conventions.
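A minimal sketch of what this comment seems to ask for, assuming the convention is that mandatory fields such as `id` and `title` should fail loudly when absent, while optional fields are read with `.get()`. `build_info` is a hypothetical helper and the sample objects are invented for illustration.

```python
def build_info(obj):
    """Build an info dict from one media object (hypothetical helper)."""
    return {
        # Mandatory fields: direct indexing raises KeyError if absent.
        'id': obj['idn'],
        'title': obj['title'],
        # Optional field: .get() yields None if absent.
        'thumbnail': obj.get('cover_url'),
    }


# Invented sample data; the real page structure is not shown in the PR.
complete = {'idn': '107679887X', 'title': 'Gavotte'}
print(build_info(complete))

try:
    build_info({'idn': '107679887X'})
except KeyError as err:
    print('missing mandatory field:', err)
```

With `obj.get('title')`, a missing title would silently become `None` and surface much later; with `obj['title']` the extraction stops at the point where the data is actually wrong.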

webpage = self._download_webpage(url, video_id)

m = re.search(r'fdnbpl.media\s*=\s*(\[.*\]);', webpage)
objs = json.loads(m.group(1))
Collaborator

_search_regex, _parse_json. Again: read coding conventions.
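The snippet under review calls `m.group(1)` without checking `m`, so a page without the expected script fails with an unhelpful `AttributeError`, and a malformed payload surfaces as a bare JSON error. The sketch below shows the checked pattern using standard-library stand-ins, so that it runs without youtube-dl. `search_regex` and `parse_json` here are hypothetical functions written for this example, not the extractor's `_search_regex` and `_parse_json`, and `WEBPAGE` is an invented excerpt. The dot in `fdnbpl\.media` is also escaped, a small deviation from the pattern in the PR.

```python
import json
import re

# Hypothetical page excerpt; the real markup is not shown in the PR.
WEBPAGE = 'fdnbpl.media = [{"idn": "107679887X", "title": "Gavotte"}];'
MEDIA_RE = r'fdnbpl\.media\s*=\s*(\[.*\]);'


def search_regex(pattern, string, name):
    """Stand-in for a checked regex search: raise a named error on no match."""
    m = re.search(pattern, string)
    if m is None:
        raise ValueError('Unable to extract %s' % name)
    return m.group(1)


def parse_json(json_string, video_id):
    """Stand-in for checked JSON parsing: report which item failed to parse."""
    try:
        return json.loads(json_string)
    except ValueError as err:
        raise ValueError('%s: Failed to parse JSON: %s' % (video_id, err))


objs = parse_json(search_regex(MEDIA_RE, WEBPAGE, 'media list'), '107679887X')
print(objs[0]['title'])
```

In the extractor itself, the same two steps would presumably go through the helper methods the reviewer names, which report failures in the project's usual way.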

Labels: defunct PR source branch is not accessible, pending-fixes