[cleanup] Misc (#8182)
Closes #7796, Closes #8028
Authored by: barsnick, sqrtNOT, gamer191, coletdjnz, Grub4K, bashonly
bashonly committed Sep 23, 2023
1 parent c2da0b5 commit 5ca095c
Showing 20 changed files with 22 additions and 34 deletions.
8 changes: 4 additions & 4 deletions CONTRIBUTING.md
@@ -217,7 +217,7 @@ After you have ensured this site is distributing its content legally, you can fo
1. Add an import in [`yt_dlp/extractor/_extractors.py`](yt_dlp/extractor/_extractors.py). Note that the class name must end with `IE`.
1. Run `python test/test_download.py TestDownload.test_YourExtractor` (note that `YourExtractor` doesn't end with `IE`). This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, the tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. You can also run all the tests in one go with `TestDownload.test_YourExtractor_all`
1. Make sure you have atleast one test for your extractor. Even if all videos covered by the extractor are expected to be inaccessible for automated testing, tests should still be added with a `skip` parameter indicating why the particular test is disabled from running.
1. Have a look at [`yt_dlp/extractor/common.py`](yt_dlp/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](yt_dlp/extractor/common.py#L91-L426). Add tests and code for as many as you want.
1. Have a look at [`yt_dlp/extractor/common.py`](yt_dlp/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](yt_dlp/extractor/common.py#L119-L440). Add tests and code for as many as you want.
1. Make sure your code follows [yt-dlp coding conventions](#yt-dlp-coding-conventions) and check the code with [flake8](https://flake8.pycqa.org/en/latest/index.html#quickstart):

$ flake8 yt_dlp/extractor/yourextractor.py
@@ -251,7 +251,7 @@ Extractors are very fragile by nature since they depend on the layout of the sou

### Mandatory and optional metafields

For extraction to work yt-dlp relies on metadata your extractor extracts and provides to yt-dlp expressed by an [information dictionary](yt_dlp/extractor/common.py#L91-L426) or simply *info dict*. Only the following meta fields in the *info dict* are considered mandatory for a successful extraction process by yt-dlp:
For extraction to work yt-dlp relies on metadata your extractor extracts and provides to yt-dlp expressed by an [information dictionary](yt_dlp/extractor/common.py#L119-L440) or simply *info dict*. Only the following meta fields in the *info dict* are considered mandatory for a successful extraction process by yt-dlp:

- `id` (media identifier)
- `title` (media title)
@@ -696,15 +696,15 @@ formats = [

### Use convenience conversion and parsing functions

Wrap all extracted numeric data into safe functions from [`yt_dlp/utils.py`](yt_dlp/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well.
Wrap all extracted numeric data into safe functions from [`yt_dlp/utils/`](yt_dlp/utils/): `int_or_none`, `float_or_none`. Use them for string to number conversions as well.

Use `url_or_none` for safe URL processing.

Use `traverse_obj` and `try_call` (superseeds `dict_get` and `try_get`) for safe metadata extraction from parsed JSON.

Use `unified_strdate` for uniform `upload_date` or any `YYYYMMDD` meta field extraction, `unified_timestamp` for uniform `timestamp` extraction, `parse_filesize` for `filesize` extraction, `parse_count` for count meta fields extraction, `parse_resolution`, `parse_duration` for `duration` extraction, `parse_age_limit` for `age_limit` extraction.

Explore [`yt_dlp/utils.py`](yt_dlp/utils.py) for more useful convenience functions.
Explore [`yt_dlp/utils/`](yt_dlp/utils/) for more useful convenience functions.

#### Examples

2 changes: 1 addition & 1 deletion README.md
@@ -1800,7 +1800,7 @@ The following extractors use this feature:
#### youtube
* `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
* `player_client`: Clients to extract video data from. The main clients are `web`, `android` and `ios` with variants `_music`, `_embedded`, `_embedscreen`, `_creator` (e.g. `web_embedded`); and `mweb` and `tv_embedded` (agegate bypass) with no variants. By default, `ios,android,web` is used, but `tv_embedded` and `creator` variants are added as required for age-gated videos. Similarly, the music variants are added for `music.youtube.com` urls. You can use `all` to use all the clients, and `default` for the default clients.
* `player_client`: Clients to extract video data from. The main clients are `web`, `android` and `ios` with variants `_music`, `_embedded`, `_embedscreen`, `_creator` (e.g. `web_embedded`); and `mweb`, `mweb_embedscreen` and `tv_embedded` (agegate bypass) with no variants. By default, `ios,android,web` is used, but `tv_embedded` and `creator` variants are added as required for age-gated videos. Similarly, the music variants are added for `music.youtube.com` urls. You can use `all` to use all the clients, and `default` for the default clients.
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details
* `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
* `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
2 changes: 1 addition & 1 deletion devscripts/make_changelog.py
@@ -260,7 +260,7 @@ class CommitRange:
AUTHOR_INDICATOR_RE = re.compile(r'Authored by:? ', re.IGNORECASE)
MESSAGE_RE = re.compile(r'''
(?:\[(?P<prefix>[^\]]+)\]\ )?
(?:(?P<sub_details>`?[^:`]+`?): )?
(?:(?P<sub_details>`?[\w.-]+`?): )?
(?P<message>.+?)
(?:\ \((?P<issues>\#\d+(?:,\ \#\d+)*)\))?
''', re.VERBOSE | re.DOTALL)
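To illustrate what the narrower `sub_details` character class changes, here is a minimal sketch that compiles the old and new patterns side by side. The `build` helper, the sample commit title and the use of `fullmatch` are assumptions made for illustration; they are not taken from `devscripts/make_changelog.py`.

```python
import re


def build(sub_details_class):
    # Hypothetical helper: same pattern as in the hunk above, with only the
    # character class used for sub_details swapped in.
    return re.compile(rf'''
        (?:\[(?P<prefix>[^\]]+)\]\ )?
        (?:(?P<sub_details>`?{sub_details_class}+`?): )?
        (?P<message>.+?)
        (?:\ \((?P<issues>\#\d+(?:,\ \#\d+)*)\))?
        ''', re.VERBOSE | re.DOTALL)


OLD_RE = build(r'[^:`]')   # pattern before this commit
NEW_RE = build(r'[\w.-]')  # pattern after this commit

# Made-up commit title whose message happens to contain a colon
title = '[ie/example] Fix thing: details (#1)'

old_match = OLD_RE.fullmatch(title)
new_match = NEW_RE.fullmatch(title)

# The old class accepts spaces, so the start of the message is taken as sub_details
print(old_match.group('sub_details'))  # Fix thing
# The new class only accepts word characters, dots and hyphens
print(new_match.group('sub_details'))  # None
print(new_match.group('message'))      # Fix thing: details
```

Under these assumptions, a multi-word phrase before a colon is no longer captured as `sub_details` and stays part of the message.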
1 change: 0 additions & 1 deletion test/test_YoutubeDL.py
@@ -631,7 +631,6 @@ def test_add_extra_info(self):
self.assertEqual(test_dict['playlist'], 'funny videos')

outtmpl_info = {
'id': '1234',
'id': '1234',
'ext': 'mp4',
'width': None,
6 changes: 3 additions & 3 deletions test/test_networking_utils.py
@@ -269,14 +269,14 @@ def test_compat_http_error_autoclose(self):
assert not response.closed

def test_incomplete_read_error(self):
error = IncompleteRead(b'test', 3, cause='test')
error = IncompleteRead(4, 3, cause='test')
assert isinstance(error, IncompleteRead)
assert repr(error) == '<IncompleteRead: 4 bytes read, 3 more expected>'
assert str(error) == error.msg == '4 bytes read, 3 more expected'
assert error.partial == b'test'
assert error.partial == 4
assert error.expected == 3
assert error.cause == 'test'

error = IncompleteRead(b'aaa')
error = IncompleteRead(3)
assert repr(error) == '<IncompleteRead: 3 bytes read>'
assert str(error) == '3 bytes read'
6 changes: 3 additions & 3 deletions yt_dlp/YoutubeDL.py
@@ -239,9 +239,9 @@ class YoutubeDL:
'selected' (check selected formats),
or None (check only if requested by extractor)
paths: Dictionary of output paths. The allowed keys are 'home'
'temp' and the keys of OUTTMPL_TYPES (in utils.py)
'temp' and the keys of OUTTMPL_TYPES (in utils/_utils.py)
outtmpl: Dictionary of templates for output names. Allowed keys
are 'default' and the keys of OUTTMPL_TYPES (in utils.py).
are 'default' and the keys of OUTTMPL_TYPES (in utils/_utils.py).
For compatibility with youtube-dl, a single string can also be used
outtmpl_na_placeholder: Placeholder for unavailable meta fields.
restrictfilenames: Do not allow "&" and spaces in file names
@@ -422,7 +422,7 @@ class YoutubeDL:
asked whether to download the video.
- Raise utils.DownloadCancelled(msg) to abort remaining
downloads when a video is rejected.
match_filter_func in utils.py is one example for this.
match_filter_func in utils/_utils.py is one example for this.
color: A Dictionary with output stream names as keys
and their respective color policy as values.
Can also just be a single color policy,
2 changes: 1 addition & 1 deletion yt_dlp/compat/urllib/__init__.py
@@ -1,7 +1,7 @@
# flake8: noqa: F405
from urllib import * # noqa: F403

del request
del request # noqa: F821
from . import request # noqa: F401

from ..compat_utils import passthrough_module
1 change: 0 additions & 1 deletion yt_dlp/extractor/abc.py
@@ -180,7 +180,6 @@ class ABCIViewIE(InfoExtractor):
_VALID_URL = r'https?://iview\.abc\.net\.au/(?:[^/]+/)*video/(?P<id>[^/?#]+)'
_GEO_COUNTRIES = ['AU']

# ABC iview programs are normally available for 14 days only.
_TESTS = [{
'url': 'https://iview.abc.net.au/show/gruen/series/11/video/LE1927H001S00',
'md5': '67715ce3c78426b11ba167d875ac6abf',
4 changes: 0 additions & 4 deletions yt_dlp/extractor/ign.py
@@ -197,10 +197,6 @@ class IGNVideoIE(IGNBaseIE):
'thumbnail': 'https://sm.ign.com/ign_me/video/h/how-hitman/how-hitman-aims-to-be-different-than-every-other-s_8z14.jpg',
'duration': 298,
'tags': 'count:13',
'display_id': '112203',
'thumbnail': 'https://sm.ign.com/ign_me/video/h/how-hitman/how-hitman-aims-to-be-different-than-every-other-s_8z14.jpg',
'duration': 298,
'tags': 'count:13',
},
'expected_warnings': ['HTTP Error 400: Bad Request'],
}, {
1 change: 0 additions & 1 deletion yt_dlp/extractor/nebula.py
@@ -127,7 +127,6 @@ class NebulaIE(NebulaBaseIE):
'channel_id': 'lindsayellis',
'uploader': 'Lindsay Ellis',
'uploader_id': 'lindsayellis',
'timestamp': 1533009600,
'uploader_url': 'https://nebula.tv/lindsayellis',
'series': 'Lindsay Ellis',
'display_id': 'that-time-disney-remade-beauty-and-the-beast',
1 change: 0 additions & 1 deletion yt_dlp/extractor/peekvids.py
@@ -146,7 +146,6 @@ class PlayVidsIE(PeekVidsBaseIE):
'uploader': 'Brazzers',
'age_limit': 18,
'view_count': int,
'age_limit': 18,
'categories': list,
'tags': list,
},
2 changes: 1 addition & 1 deletion yt_dlp/extractor/radiofrance.py
@@ -82,7 +82,7 @@ class RadioFranceBaseIE(InfoExtractor):
def _extract_data_from_webpage(self, webpage, display_id, key):
return traverse_obj(self._search_json(
r'\bconst\s+data\s*=', webpage, key, display_id,
contains_pattern=r'(\[\{.*?\}\]);', transform_source=js_to_json),
contains_pattern=r'\[\{(?s:.+)\}\]', transform_source=js_to_json),
(..., 'data', key, {dict}), get_all=False) or {}


6 changes: 3 additions & 3 deletions yt_dlp/extractor/rcs.py
@@ -239,10 +239,10 @@ class RCSEmbedsIE(RCSBaseIE):
}
}, {
'url': 'https://video.gazzanet.gazzetta.it/video-embed/gazzanet-mo05-0000260789',
'match_only': True
'only_matching': True
}, {
'url': 'https://video.gazzetta.it/video-embed/49612410-00ca-11eb-bcd8-30d4253e0140',
'match_only': True
'only_matching': True
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.iodonna.it/video-iodonna/personaggi-video/monica-bellucci-piu-del-lavoro-oggi-per-me-sono-importanti-lamicizia-e-la-famiglia/',
@@ -325,7 +325,7 @@ class RCSIE(RCSBaseIE):
}
}, {
'url': 'https://video.corriere.it/video-360/metro-copenaghen-tutta-italiana/a248a7f0-e2db-11e9-9830-af2de6b1f945',
'match_only': True
'only_matching': True
}]


1 change: 0 additions & 1 deletion yt_dlp/extractor/rokfin.py
@@ -40,7 +40,6 @@ class RokfinIE(InfoExtractor):
'channel': 'Jimmy Dore',
'channel_id': 65429,
'channel_url': 'https://rokfin.com/TheJimmyDoreShow',
'duration': 213.0,
'availability': 'public',
'live_status': 'not_live',
'dislike_count': int,
2 changes: 0 additions & 2 deletions yt_dlp/extractor/s4c.py
@@ -78,15 +78,13 @@ class S4CSeriesIE(InfoExtractor):
'info_dict': {
'id': '864982911',
'title': 'Iaith ar Daith',
'description': 'md5:e878ebf660dce89bd2ef521d7ce06397'
},
}, {
'url': 'https://www.s4c.cymru/clic/series/866852587',
'playlist_mincount': 8,
'info_dict': {
'id': '866852587',
'title': 'FFIT Cymru',
'description': 'md5:abcb3c129cb68dbb6cd304fd33b07e96'
},
}]

1 change: 0 additions & 1 deletion yt_dlp/extractor/sovietscloset.py
@@ -76,7 +76,6 @@ class SovietsClosetIE(SovietsClosetBaseIE):
'title': 'Arma 3 - Zeus Games #5',
'uploader': 'SovietWomble',
'thumbnail': r're:^https?://.*\.b-cdn\.net/c0e5e76f-3a93-40b4-bf01-12343c2eec5d/thumbnail\.jpg$',
'uploader': 'SovietWomble',
'creator': 'SovietWomble',
'release_timestamp': 1461157200,
'release_date': '20160420',
2 changes: 1 addition & 1 deletion yt_dlp/extractor/youtube.py
@@ -902,7 +902,7 @@ def extract_relative_time(relative_time_text):
e.g. 'streamed 6 days ago', '5 seconds ago (edited)', 'updated today', '8 yr ago'
"""

# XXX: this could be moved to a general function in utils.py
# XXX: this could be moved to a general function in utils/_utils.py
# The relative time text strings are roughly the same as what
# Javascript's Intl.RelativeTimeFormat function generates.
# See: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/RelativeTimeFormat
2 changes: 1 addition & 1 deletion yt_dlp/networking/__init__.py
@@ -1,4 +1,4 @@
# flake8: noqa: 401
# flake8: noqa: F401
from .common import (
HEADRequest,
PUTRequest,
2 changes: 1 addition & 1 deletion yt_dlp/networking/_urllib.py
@@ -337,7 +337,7 @@ def handle_sslerror(e: ssl.SSLError):

def handle_response_read_exceptions(e):
if isinstance(e, http.client.IncompleteRead):
raise IncompleteRead(partial=e.partial, cause=e, expected=e.expected) from e
raise IncompleteRead(partial=len(e.partial), cause=e, expected=e.expected) from e
elif isinstance(e, ssl.SSLError):
handle_sslerror(e)
elif isinstance(e, (OSError, EOFError, http.client.HTTPException, *CONTENT_DECODE_ERRORS)):
4 changes: 2 additions & 2 deletions yt_dlp/networking/exceptions.py
@@ -75,10 +75,10 @@ def __repr__(self):


class IncompleteRead(TransportError):
def __init__(self, partial, expected=None, **kwargs):
def __init__(self, partial: int, expected: int = None, **kwargs):
self.partial = partial
self.expected = expected
msg = f'{len(partial)} bytes read'
msg = f'{partial} bytes read'
if expected is not None:
msg += f', {expected} more expected'
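
The hunk above turns `partial` into a byte count instead of the bytes themselves. The following is a minimal, self-contained sketch of the resulting behaviour. The `TransportError` stand-in, the `super().__init__` call and the `__repr__` body are assumptions reconstructed from the test expectations in this commit, not copied from `yt_dlp/networking/exceptions.py`.

```python
import http.client
from typing import Optional


class TransportError(Exception):
    # Assumed stand-in for yt-dlp's TransportError; the real class may differ.
    def __init__(self, msg=None, cause=None):
        super().__init__(msg)
        self.msg = msg
        self.cause = cause


class IncompleteRead(TransportError):
    def __init__(self, partial: int, expected: Optional[int] = None, **kwargs):
        self.partial = partial    # number of bytes read so far
        self.expected = expected  # number of bytes still expected, if known
        msg = f'{partial} bytes read'
        if expected is not None:
            msg += f', {expected} more expected'
        super().__init__(msg=msg, **kwargs)  # assumption: the base class stores msg

    def __repr__(self):
        # Assumed format, chosen to match the strings asserted in the test hunk
        return f'<IncompleteRead: {self.msg}>'


# Convert a standard-library error the way the _urllib.py hunk does:
# pass the length of the partial data rather than the data itself.
stdlib_error = http.client.IncompleteRead(b'test', 3)
error = IncompleteRead(
    partial=len(stdlib_error.partial), expected=stdlib_error.expected, cause=stdlib_error)

print(repr(error))  # <IncompleteRead: 4 bytes read, 3 more expected>
```

Storing only the count keeps the partially read payload out of the exception object, while the message stays the same as before.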
