[Noovo] Add new extractor #12792

Closed
wants to merge 4 commits into from
Conversation

Contributor

fredbourni commented Apr 19, 2017

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

Adds an extractor for Noovo (noovo.ca), a French video platform, as requested in issue #12565.


api_url = self.TEMPLATE_API_URL % video_id

api_content = self._download_webpage(api_url, video_id)

dstftw Apr 19, 2017

Collaborator

Use _download_json.

fredbourni Apr 19, 2017

Author Contributor

I used _download_webpage instead because the JSON structure unfortunately differs from show to show, so the Brightcove id often sits in a different place. Running a regexp over the raw response makes the extraction more resilient.

Would you mind if I switch to _download_json as proposed but still run the regexp over its content?
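
One way to reconcile the two positions is to parse the response as JSON and then search the parsed structure for the id, wherever it sits. The sketch below is an illustration only, not the code from this pull request: the key name `brightcoveId`, both payload layouts, and the id values are assumptions.

```python
import json
import re

# Hypothetical payloads: the key name "brightcoveId", both layouts and the
# id values are assumptions for illustration, not the real Noovo API schema.
SHOW_A = '{"episode": {"brightcoveId": "1234567890001"}}'
SHOW_B = '{"data": {"contents": [{"player": {"brightcoveId": "1234567890002"}}]}}'


def find_brightcove_id(node):
    """Depth-first search for the first all-digit 'brightcoveId' value."""
    if isinstance(node, dict):
        value = node.get('brightcoveId')
        if isinstance(value, str) and re.match(r'\d+$', value):
            return value
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        # Scalars (strings, numbers, None) cannot contain the key.
        return None
    for child in children:
        found = find_brightcove_id(child)
        if found is not None:
            return found
    return None


ids = [find_brightcove_id(json.loads(text)) for text in (SHOW_A, SHOW_B)]
print(ids)
```

The same function handles both layouts because it never assumes a fixed path to the key, which is the resilience the regexp approach was after.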

        self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id
    )
else:
    raise ExtractorError('Unable to extract brightcove id from api')

dstftw Apr 19, 2017

Collaborator

Will never happen.
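
One plausible reading of this remark, assuming the id was obtained with a search helper that raises on a miss (youtube-dl's `_search_regex` does so by default when no `default` is passed): the `else` branch is then unreachable. The sketch below mimics that behaviour with a hypothetical stand-in helper, `search_regex_fatal`; the key name `brightcoveId` is also an assumption.

```python
import re


class RegexNotFoundError(Exception):
    """Stand-in for the error a fatal regex search raises on a miss."""


def search_regex_fatal(pattern, text, name):
    # Hypothetical helper: raises instead of returning None when nothing matches.
    match = re.search(pattern, text)
    if match is None:
        raise RegexNotFoundError('Unable to extract %s' % name)
    return match.group(1)


def extract_id(api_content):
    brightcove_id = search_regex_fatal(
        r'"brightcoveId"\s*:\s*"(\d+)"', api_content, 'brightcove id')
    # No else/raise is needed here: a miss has already raised above,
    # so brightcove_id is always a non-empty string at this point.
    return brightcove_id
```

With such a helper, an explicit `if brightcove_id: ... else: raise ...` only adds a branch that can never run.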

fredbourni Apr 19, 2017

Author Contributor

OK!

fredbourni added 2 commits Apr 20, 2017

Contributor Author

fredbourni commented Apr 20, 2017

Applied the requested changes and added another test whose API response has a different structure.
Thanks for reviewing again.


TEMPLATE_API_URL = 'http://api.noovo.ca/api/v1/pages/single-episode/%s'

BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/618566855001/default_default/index.html?videoId=%s'

dstftw Apr 23, 2017

Collaborator

Use consistent naming.

'timestamp': 1492019320,
'title': 'md5:2895fdc124639be0ef64ea0d06f5e493',
'upload_date': '20170412',
'uploader_id': '618566855001'

dstftw Apr 23, 2017

Collaborator

All tests that cover a similar extraction scenario should be only_matching.


class NoovoIE(InfoExtractor):
    _VALID_URL = r'https?://(?:[a-z0-9\-]+\.)?noovo\.ca/videos/(?P<id>[a-z0-9\-]+/[a-z0-9\-]+)'

dstftw Apr 23, 2017

Collaborator

No need to escape -.
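
In Python's `re` syntax a hyphen placed last (or first) in a character class is taken literally, so the backslash is redundant. The quick check below compares the pattern from the diff with an unescaped variant; the sample URLs are made up for illustration.

```python
import re

# Pattern as submitted, with the hyphen escaped.
ESCAPED = r'https?://(?:[a-z0-9\-]+\.)?noovo\.ca/videos/(?P<id>[a-z0-9\-]+/[a-z0-9\-]+)'
# Same pattern with a bare trailing hyphen in each character class.
UNESCAPED = r'https?://(?:[a-z0-9-]+\.)?noovo\.ca/videos/(?P<id>[a-z0-9-]+/[a-z0-9-]+)'

# Hypothetical sample URLs; the last one is expected not to match.
urls = [
    'http://noovo.ca/videos/example-show/example-episode',
    'https://sub-domain.noovo.ca/videos/a-b/c-d',
    'http://noovo.ca/videos/NoMatch',
]


def video_id(pattern, url):
    """Return the captured id, or None when the URL does not match."""
    match = re.match(pattern, url)
    return match.group('id') if match else None


results = [(video_id(ESCAPED, u), video_id(UNESCAPED, u)) for u in urls]
for url, pair in zip(urls, results):
    print(url, pair)
```

Both patterns agree on every sample, including the non-matching one.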

Contributor Author

fredbourni commented Apr 26, 2017

Modifications pushed, thanks again.

@dstftw dstftw closed this in b5c3953 Apr 29, 2017
dstftw added a commit that referenced this pull request May 22, 2017
khavishbhundoo added a commit to khavishbhundoo/youtube-dl that referenced this pull request Jun 14, 2017
* [cbsinteractive] fix extractor

* [cbsinteractive] update test cases

* [cbsinteractive] extract formats with `CBSIE`

* [extractor/common] Fix rtmp and rtsp formats' URLs in _extract_wowza_formats

* [vier] Extract more info

Extract the `episode_number` and `upload_date`. Also extract the real
`description`.

* [vier] Relax regexes and extract more metadata (closes ytdl-org#12539)

* [jsinterp] Add support for quoted names and indexers (closes ytdl-org#13123, closes ytdl-org#13130)

* [ChangeLog] Actualize

* release 2017.05.18

* [ChangeLog] Fix typo

* [jsinterp] Fix typo and cleanup regexes (closes ytdl-org#13134)

* [ChangeLog] Actualize

* release 2017.05.18.1

* [mitele] Update app key regex

* [hitbox] Add support for smashcast.tv (closes ytdl-org#13154)

* [njpwworld] Fix extraction (closes ytdl-org#13162)

* [toypics] Fix extraction

* [toypics] Improve and modernize

* [adobepass] Add support for Brighthouse MSO

* [toggle] Relax _VALID_URL (closes ytdl-org#13172)

* [youtube] Fix DASH manifest signature decryption (closes ytdl-org#8944)

* [youtube] Modernize

* [streamcz] Add support for subtitles

* [downloader/external] Pass -loglevel to ffmpeg downloader (closes ytdl-org#13183)

* Credit @zurfyx for atresplayer improvements (ytdl-org#12548)

* Credit @mphe for streamango (ytdl-org#12643)

* Credit @fredbourni for noovo (ytdl-org#12792)

* [ChangeLog] Actualize

* release 2017.05.23

* Credit @timendum for rai (ytdl-org#11790) and mediaset (ytdl-org#12964)

* Credit @gritstub for vevo fix (ytdl-org#12879)

* [cbsnews] fix extraction for 60 Minutes videos

* [vimeo] Fix formats' sorting (closes ytdl-org#13189)

* [postprocessor/ffmpeg] Fix metadata filename handling on Python 2

Fixes ytdl-org#13182

* [udemy] Fix extraction for outputs' format entries without URL (closes ytdl-org#13192)

* [youku] Fix extraction (closes ytdl-org#13191)

* [utils] Recognize more patterns in strip_jsonp()

Used in Youku Show pages

* [youku:show] Fix extraction

* [tudou] Merge into youku extractor (fixes ytdl-org#12214)

Also, there are no tudou playlists anymore. All playlist URLs point to youku
playlists.

* [bbc] Add support for authentication

* Revert "[youtube] Don't use the DASH manifest from 'get_video_info' if 'use_cipher_signature' is True (ytdl-org#5118)"

This reverts commit 87dc451.

* [ChangeLog] Update after the fix for ytdl-org#11381

* [ChangeLog] Actualize

* release 2017.05.26

* [cbsnews] Fix extraction (closes ytdl-org#13205)

* [youku] Extract more metadata (closes ytdl-org#10433)

* [adn] fix formats extraction

* [utils] Drop a compatibility wrapper for Python < 2.6

addinfourl.getcode is added since Python 2.6a1. As youtube-dl now
requires 2.6+, this is no longer necessary.

See python/cpython@9b0d46d

* [cbsinteractive] Relax _VALID_URL (closes ytdl-org#13213)

* [beam:vod] Add extractor

* [beam] Improve and add support for mixer.com (closes ytdl-org#13032)

* [dvtv] Parse adaptive formats as well

The old code hit an error when it attempted to parse the string
"adaptive" for video height. Actually parsing the returned playlists is
a good idea because it adds more output formats, including some
audio-only-ones.

* [dvtv] Improve and fix playlists support (closes ytdl-org#13063)

* [medialaan] Fix videos with missing videoUrl

A rough trick to get around the two different json styles medialaan seems to be using.
Fix for these example videos:
https://vtmkzoom.be/video?aid=45724
https://vtmkzoom.be/video?aid=45425

* [medialaan] PEP 8 (closes ytdl-org#12774)

* [gaskrank] Fix extraction

* [gaskrank] Improve (closes ytdl-org#12493)

* [abcnews] Add support for embed URLs

* [abcnews] Improve and remove duplicate test (closes ytdl-org#12851)

* [xhamster] Extract categories (closes ytdl-org#11728)

* [xhamster] Fix author and like/dislike count extraction

* [xhamster] Simplify (closes ytdl-org#13216)

* [youtube] Parse player_url if format URLs are encrypted or DASH MPDs are requested

Fixes ytdl-org#13211

* [ChangeLog] Actualize

* release 2017.05.29

* [README.md] Add an example for how to use .netrc on Windows

That's a Python bug: http://bugs.python.org/issue28334
Most likely it will be fixed in Python 3.7: python/cpython#123

* [README.md] Mention http_dash_segments protocol

* [packtpub] Fix authentication (closes ytdl-org#13240)

* [drbonanza] Fix extraction (closes ytdl-org#13231)

* [francetv] Relax _VALID_URL

* [1tv] Lower preference for http formats (closes ytdl-org#13246)

* [youtube] Improve chapters extraction (closes ytdl-org#13247)

* [safari] Fix typo (closes ytdl-org#13252)

* [YoutubeDL] Don't emit ANSI escape codes on Windows

* [godtv] Remove extractor (closes ytdl-org#13175)

* [pornhub:playlist] Fix extraction (closes ytdl-org#13281)

* [pornhub:uservideos] Add missing raise

* [bandcamp:weekly] Add extractor

* [bandcamp:weekly] Improve and extract more metadata (closes ytdl-org#12758)

* Credit @adamvoss for bandcamp:weekly (ytdl-org#12758)

* Credit @mikf for beam:vod (ytdl-org#13032)

* Credit @jktjkt for dvtv formats (ytdl-org#13063)

* [ChangeLog] Actualize

* release 2017.06.05

* [tvplayer] Fix extraction (closes ytdl-org#13291)

* [rtlnl] Improve _VALID_URL (closes ytdl-org#13295)

* [streamango] Make title optional

* [streamango] Skip download for test (closes ytdl-org#13292)

* [README.md] Clarify output template references (closes ytdl-org#13316)

* [README.md] Improve man page formatting

* [YoutubeDL] Sanitize more fields (ytdl-org#13313)

* [liveleak] Ensure height is int (closes ytdl-org#13313)

* [safari] Improve authentication detection (closes ytdl-org#13319)

* [sohu] Fix numeric fields

* [flickr] Ensure format id is string

* [foxgay] Ensure height is int

* [gfycat] Ensure filesize is int

* [golem] Ensure format id is string

* [jove] Ensure comment count is int

* [sexu] Ensure height is int

* [turbo] Ensure format id is string

* [extractor/common] Return unicode string from _match_id

* [extractor/generic] Ensure format id is unicode string

* [msn] Fix formats extraction

* [newgrounds] Improve formats and uploader extraction (closes ytdl-org#13346)

* [newgrounds:playlist] Add extractor (closes ytdl-org#10611)

* [utils] Improve unified_timestamp

* [newgrounds] Extract more metadata (closes ytdl-org#13232)

* [rutv] Add support for testplayer.vgtrk.com (closes ytdl-org#13347)

* [xfileshare] Modernize and pass referrer

* [xfileshare] Add support for rapidvideo (closes ytdl-org#13348)

* [compat] Introduce compat_HTMLParseError

* [utils] Handle HTMLParseError in extract_attributes (closes ytdl-org#13349)

* [xfileshare] PEP 8

* [ChangeLog] Actualize

* release 2017.06.12

* [compat] Add compat_HTMLParseError to __all__

* [corus] Add support for history.ca (closes ytdl-org#13359)

* [corus] Add support for showcase.ca