Nest Add new extractor #31274

Open
wants to merge 2 commits into
base: master


@evanzh15 evanzh15 commented Oct 2, 2022

Please follow the guide below

  • You will be asked some questions, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your pull request (like that [x])
  • Use Preview tab to see what your pull request will actually look like

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

Add extractor for NestCam video.

@dirkf dirkf linked an issue Oct 10, 2022 that may be closed by this pull request
5 tasks
Contributor

@dirkf dirkf left a comment


Thanks for your work!

It's nearly there. Have a look at the suggestions and get the test working.

r'https:\/\/video.nest.com\/clip\/(.+?)(\.|")', webpage, 'video_id', fatal=False)
title = self._html_search_meta(['og:title', 'title'], webpage, 'title')
if title == "":
title = "\"\""

In this extractor the page may have no explicit title, but yt-dl requires one, so use the standard _generic_title() method, specialised in this class, to generate one:

Suggested change
title = "\"\""
title = self._generic_title(url)

'description': '#caughtonNestCam',
}
}


Suggested change
def _generic_title(self, url):
    return 'NestCam video ' + super(NestIE, self)._generic_title(url)
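
A minimal, self-contained sketch of how such an override composes with the inherited method. `BaseExtractor` is a hypothetical stand-in for `InfoExtractor`, and its `_generic_title()` (last URL path component without its extension) is an assumption made for illustration, not the library's actual implementation.

```python
import os.path
import re
from urllib.parse import unquote, urlparse


class BaseExtractor(object):
    """Hypothetical stand-in for InfoExtractor (assumption, not the real class)."""

    def _generic_title(self, url):
        # Assumed behaviour: last path component of the URL, minus its extension.
        name = urlparse(url).path.rstrip('/').rpartition('/')[2]
        return os.path.splitext(unquote(name))[0]


class NestLikeIE(BaseExtractor):
    def _generic_title(self, url):
        # Prefix the inherited generic title so it identifies the source site.
        return 'NestCam video ' + super(NestLikeIE, self)._generic_title(url)


title = NestLikeIE()._generic_title(
    'https://video.nest.com/clip/73ddb6bd57c4485597a76e154a4429ea.mp4')
print(title)
```

Under these assumptions the generated title starts with `NestCam video ` followed by the clip ID, so it can be checked with a pattern such as `^NestCam video \w+`.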

video_id = self._search_regex(
r'https:\/\/video.nest.com\/clip\/(.+?)(\.|")', webpage, 'video_id', fatal=False)
title = self._html_search_meta(['og:title', 'title'], webpage, 'title')
if title == "":

Suggested change
if title == "":
if not title:
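
A short sketch of why the truthiness test is the safer check, assuming that a metadata lookup which finds nothing returns `None` rather than an empty string. `choose_title` is a hypothetical helper used only for this illustration.

```python
def choose_title(meta_title, fallback):
    """Return meta_title unless it is missing or empty (hypothetical helper)."""
    # `not meta_title` is true for both None and '', unlike `meta_title == ""`,
    # which is false for None and would let a missing title slip through.
    if not meta_title:
        return fallback
    return meta_title


for candidate in (None, '', 'Front door clip'):
    print(repr(candidate), candidate == '', choose_title(candidate, 'generic title'))
```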

webpage = self._download_webpage(url, video_id)
video_id = self._search_regex(
r'https:\/\/video.nest.com\/clip\/(.+?)(\.|")', webpage, 'video_id', fatal=False)
title = self._html_search_meta(['og:title', 'title'], webpage, 'title')

Prefer a tuple for a constant sequence:

Suggested change
title = self._html_search_meta(['og:title', 'title'], webpage, 'title')
title = self._html_search_meta(('og:title', 'title'), webpage, 'title')

Comment on lines +30 to +31
if "/" in ext:
ext = ext[ext.index("/") + 1:]

Use utils.mimetype2ext():

Suggested change
if "/" in ext:
ext = ext[ext.index("/") + 1:]
ext = mimetype2ext(ext) or ext
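
To illustrate why a dedicated helper is preferable to slicing after the `/`, here is a simplified sketch. `simple_mimetype2ext` and its small mapping are hypothetical stand-ins written for this example; they are not the real `mimetype2ext()`, whose table and behaviour may differ.

```python
def naive_ext(mime):
    # The original approach: keep whatever follows the first '/'.
    if '/' in mime:
        return mime[mime.index('/') + 1:]
    return mime


# Hypothetical, deliberately tiny mapping; a real helper would know more types.
_SUBTYPE_TO_EXT = {'quicktime': 'mov', 'x-flv': 'flv', 'x-matroska': 'mkv'}


def simple_mimetype2ext(mime):
    """Simplified stand-in for a MIME-type-to-extension helper (assumption)."""
    if mime is None:
        return None
    # Drop parameters such as '; codecs=...' and normalise case.
    subtype = mime.rpartition('/')[2].split(';')[0].strip().lower()
    return _SUBTYPE_TO_EXT.get(subtype, subtype)


ext = 'video/mp4; codecs="avc1.42E01E"'
print(repr(naive_ext(ext)))
# Same shape as the suggestion: fall back to the raw value if the helper gives nothing.
print(repr(simple_mimetype2ext(ext) or ext))
```

The naive slice keeps the trailing parameters, while the helper-style function returns a clean extension and can also translate subtypes that do not match their usual file extension.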

# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor

Used later:

Suggested change
from .common import InfoExtractor
from .common import InfoExtractor
from ..utils import (
ExtractorError,
mimetype2ext,
url_or_none,
)



class NestIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?video.nest\.com/clip/(?P<id>)(.mp4)?'

This will never match a useful ID!

Suggested change
_VALID_URL = r'https?://(?:www\.)?video.nest\.com/clip/(?P<id>)(.mp4)?'
_VALID_URL = r'https?://(?:www\.)?video\.nest\.com/clip/(?P<id>\w+)'
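
The difference between the two patterns can be checked with the standard `re` module alone. This sketch applies both to the test URL from this pull request; the variable names are chosen for the example.

```python
import re

ORIGINAL = r'https?://(?:www\.)?video.nest\.com/clip/(?P<id>)(.mp4)?'
SUGGESTED = r'https?://(?:www\.)?video\.nest\.com/clip/(?P<id>\w+)'
URL = 'https://video.nest.com/clip/73ddb6bd57c4485597a76e154a4429ea.mp4'

# The empty group (?P<id>) can only ever capture the empty string.
original_id = re.match(ORIGINAL, URL).group('id')
# \w+ captures the run of word characters after /clip/, stopping at the dot.
suggested_id = re.match(SUGGESTED, URL).group('id')

print(repr(original_id))
print(repr(suggested_id))
```

Both patterns match the URL, but only the suggested one yields an ID that can be used for the rest of the extraction.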

Comment on lines +23 to +24
video_id = self._search_regex(
r'https:\/\/video.nest.com\/clip\/(.+?)(\.|")', webpage, 'video_id', fatal=False)

There is no need to escape / here (that is only required in a JS /regexp/ literal), but do escape the dots, and don't overwrite video_id:

Suggested change
video_id = self._search_regex(
r'https:\/\/video.nest.com\/clip\/(.+?)(\.|")', webpage, 'video_id', fatal=False)
video_id = self._search_regex(
r'https://video\.nest\.com/clip/(.+?)(?:\.|")', webpage, 'video_id', fatal=False) or video_id

Actually, is this ever different from the value extracted from the page URL? With the correct _VALID_URL, you should have a good value for it. If you do need to do this search, use the _VALID_URL again:

Suggested change
video_id = self._search_regex(
r'https:\/\/video.nest.com\/clip\/(.+?)(\.|")', webpage, 'video_id', fatal=False)
video_id = self._search_regex(
self._VALID_URL, webpage, 'video_id', group='id', fatal=False) or video_id

Or just remove the search:

Suggested change
video_id = self._search_regex(
r'https:\/\/video.nest.com\/clip\/(.+?)(\.|")', webpage, 'video_id', fatal=False)
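
The points in this comment (unescaped dots and keeping the existing ID as a fallback) can be tried out in isolation. In this sketch `search_id` is a hypothetical helper that mimics a non-fatal search followed by `or video_id`, and the HTML snippets are invented sample input.

```python
import re

LOOSE = r'https://video.nest.com/clip/(.+?)(?:\.|")'
STRICT = r'https://video\.nest\.com/clip/(.+?)(?:\.|")'

# Invented sample input: a look-alike host and the genuine one.
lookalike = '<a href="https://videoXnest.com/clip/abc123.mp4">'
genuine = '<a href="https://video.nest.com/clip/abc123.mp4">'


def search_id(pattern, page, default=None):
    """Return the first capture group, or `default` when nothing matches."""
    m = re.search(pattern, page)
    return (m.group(1) if m else None) or default


# An unescaped dot matches any character, so the look-alike host is accepted.
print(search_id(LOOSE, lookalike))
# With escaped dots the search fails and the ID taken from the URL is kept.
print(search_id(STRICT, lookalike, default='id-from-url'))
print(search_id(STRICT, genuine, default='id-from-url'))
```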

Comment on lines +9 to +18
_TEST = {
'url': 'https://video.nest.com/clip/73ddb6bd57c4485597a76e154a4429ea.mp4',
'md5': '7ab4eb6d4c2480be1740cc014a76ee96',
'info_dict': {
'id': '73ddb6bd57c4485597a76e154a4429ea',
'ext': 'mp4',
'title': "\"\"",
'description': '#caughtonNestCam',
}
}

Prefer _TESTS in new extractors:

Suggested change
_TEST = {
'url': 'https://video.nest.com/clip/73ddb6bd57c4485597a76e154a4429ea.mp4',
'md5': '7ab4eb6d4c2480be1740cc014a76ee96',
'info_dict': {
'id': '73ddb6bd57c4485597a76e154a4429ea',
'ext': 'mp4',
'title': "\"\"",
'description': '#caughtonNestCam',
}
}
_TESTS = [{
'url': 'https://video.nest.com/clip/73ddb6bd57c4485597a76e154a4429ea.mp4',
'md5': '7ab4eb6d4c2480be1740cc014a76ee96',
'info_dict': {
'id': '73ddb6bd57c4485597a76e154a4429ea',
'ext': 'mp4',
'title': "\"\"",
'description': '#caughtonNestCam',
}
}]

'info_dict': {
'id': '73ddb6bd57c4485597a76e154a4429ea',
'ext': 'mp4',
'title': "\"\"",

To match other changes:

Suggested change
'title': "\"\"",
'title': r're:^NestCam video \w+',
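
The suggestion relies on expected values that start with `re:` being treated as regular expressions by the test harness. The following is a simplified sketch of that idea, not the harness's actual code; `matches_expected` is a hypothetical name.

```python
import re


def matches_expected(expected, got):
    """Simplified sketch of comparing a test expectation with an extracted value."""
    if isinstance(expected, str) and expected.startswith('re:'):
        # Treat the remainder as a regular expression matched from the start.
        return re.match(expected[3:], got) is not None
    # Anything else is compared literally.
    return expected == got


extracted = 'NestCam video 73ddb6bd57c4485597a76e154a4429ea'
print(matches_expected(r're:^NestCam video \w+', extracted))
print(matches_expected('""', extracted))
```

A pattern keeps the test valid for any clip ID, whereas the literal `""` expectation would no longer match once the title falls back to the generated one.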

@dirkf dirkf mentioned this pull request Dec 2, 2023
11 tasks
Development

Successfully merging this pull request may close these issues.

Add support for clips from Nest Cam site video.nest.com
2 participants