Regarding the issue that adding irregular proxies might cause some sites to fail to download #8036

Closed
9 of 10 tasks
SevenLives opened this issue Sep 6, 2023 · 5 comments · Fixed by #8046
Labels
patch-available There is patch available that should fix this issue. Someone needs to make a PR with it regression Works in youtube-dl/older yt-dlp site-bug Issue with a specific website

Comments

@SevenLives
Contributor

SevenLives commented Sep 6, 2023

DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE

  • I understand that I will be blocked if I intentionally remove or skip any mandatory* field

Checklist

  • I'm reporting a bug unrelated to a specific site
  • I've verified that I'm running yt-dlp version 2023.07.06 (update instructions) or later (specify commit)
  • I've checked that all provided URLs are playable in a browser with the same IP and same login details
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched known issues and the bugtracker for similar issues including closed ones. DO NOT post duplicates
  • I've read the guidelines for opening an issue

Provide a description that is worded well enough to be understood

When I run the program with a --proxy value that has no scheme (for example "127.0.0.1:1080", without a scheme prefix such as "http://" or "socks5://"), certain websites start to fail.

I noticed that in yt_dlp/YoutubeDL.py, at line 4085 in the function 'build_request_director', the proxies are passed through clean_proxies, so a proxy without a scheme automatically gets the http:// prefix.

    def build_request_director(self, handlers, preferences=None):
        logger = _YDLLogger(self)
        headers = self.params['http_headers'].copy()
        proxies = self.proxies.copy()
        clean_headers(headers)
        clean_proxies(proxies, headers)

        director = RequestDirector(logger=logger, verbose=self.params.get('debug_printtraffic'))
        for handler in handlers:
            director.add_handler(handler(
                logger=logger,
                headers=headers,
                cookiejar=self.cookiejar,
                proxies=proxies,
                prefer_system_certs='no-certifi' in self.params['compat_opts'],
                verify=not self.params.get('nocheckcertificate'),
                **traverse_obj(self.params, {
                    'verbose': 'debug_printtraffic',
                    'source_address': 'source_address',
                    'timeout': 'socket_timeout',
                    'legacy_ssl_support': 'legacyserverconnect',
                    'enable_file_urls': 'enable_file_urls',
                    'client_cert': {
                        'client_certificate': 'client_certificate',
                        'client_certificate_key': 'client_certificate_key',
                        'client_certificate_password': 'client_certificate_password',
                    },
                }),
            ))
        director.preferences.update(preferences or [])
        return director
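
To make the effect of that cleaning step concrete, here is a minimal sketch of the scheme-defaulting behaviour described above. The helper name normalize_proxies is hypothetical and is not yt-dlp's clean_proxies; it only illustrates the idea that a scheme-less proxy ends up with an http:// prefix.

```python
def normalize_proxies(proxies):
    """Hypothetical helper: return a copy where scheme-less proxy URLs get 'http://'."""
    cleaned = {}
    for key, url in proxies.items():
        # Assumption for this sketch: a URL without '://' has no scheme
        if url and '://' not in url:
            url = 'http://' + url.lstrip('/')
        cleaned[key] = url
    return cleaned


raw = {'all': '127.0.0.1:1080'}
cleaned = normalize_proxies(raw)
unchanged = normalize_proxies({'all': 'socks5://127.0.0.1:1080'})
```

With this sketch, the raw and cleaned proxy maps are no longer equal, which matters for the problem described next.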

However, I've found that in yt_dlp/extractor/abematv.py, add_opener passes the unprocessed proxies when getting an instance. As far as I can tell, this creates a separate instance, so the handler is added to that instance and not to the one actually used for requests, which results in an 'unknown url type' error.

def add_opener(ydl, handler):  # FIXME: Create proper API in .networking
    """Add a handler for opening URLs, like _download_webpage"""
    # https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L426
    # https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L605
    rh = ydl._request_director.handlers['Urllib']
    if 'abematv-license' in rh._SUPPORTED_URL_SCHEMES:
        return
    # ydl.proxies is passed as-is here, without going through clean_proxies
    opener = rh._get_instance(cookiejar=ydl.cookiejar, proxies=ydl.proxies)
    assert isinstance(opener, urllib.request.OpenerDirector)
    opener.add_handler(handler)
    rh._SUPPORTED_URL_SCHEMES = (*rh._SUPPORTED_URL_SCHEMES, 'abematv-license')

Changing the code as follows fixes the problem.

def add_opener(ydl, handler):  # FIXME: Create proper API in .networking
    """Add a handler for opening URLs, like _download_webpage"""
    # https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L426
    # https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L605
    rh = ydl._request_director.handlers['Urllib']
    if 'abematv-license' in rh._SUPPORTED_URL_SCHEMES:
        return
    # Clean the proxies the same way build_request_director does, so that
    # _get_instance receives the same proxies as regular requests.
    # Needs at module level: from ..utils.networking import clean_proxies
    headers = ydl.params['http_headers'].copy()
    proxies = ydl.proxies.copy()
    clean_proxies(proxies, headers)
    opener = rh._get_instance(cookiejar=ydl.cookiejar, proxies=proxies)
    assert isinstance(opener, urllib.request.OpenerDirector)
    opener.add_handler(handler)
    rh._SUPPORTED_URL_SCHEMES = (*rh._SUPPORTED_URL_SCHEMES, 'abematv-license')
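
The following toy model sketches why passing the cleaned proxies should help. It assumes the request handler keeps one opener per distinct set of arguments; ToyHandler and its get_instance method are hypothetical stand-ins written for illustration, not yt-dlp's actual classes.

```python
class ToyHandler:
    """Toy stand-in for a request handler that caches one opener per argument set."""

    def __init__(self):
        self._instances = []  # list of (kwargs, opener) pairs

    def get_instance(self, **kwargs):
        # Reuse the opener created earlier for identical arguments
        for key, opener in self._instances:
            if key == kwargs:
                return opener
        # A fresh "opener" only knows the default schemes
        opener = {'schemes': {'http', 'https'}}
        self._instances.append((kwargs, opener))
        return opener


raw = {'all': '127.0.0.1:1080'}
cleaned = {'all': 'http://127.0.0.1:1080'}

rh = ToyHandler()
sending_opener = rh.get_instance(proxies=cleaned)  # the opener requests go through

# Unpatched behaviour: the custom scheme lands on a different opener
rh.get_instance(proxies=raw)['schemes'].add('abematv-license')
registered_before = 'abematv-license' in sending_opener['schemes']

# Patched behaviour: the cleaned proxies hit the same cached opener
rh.get_instance(proxies=cleaned)['schemes'].add('abematv-license')
registered_after = 'abematv-license' in sending_opener['schemes']
```

In this model, the scheme only becomes visible to the sending opener once both lookups use identical proxy maps. When the proxy already has a scheme, the raw and cleaned maps are equal and the mismatch never appears.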

I'm not sure if other websites would have the same issue, but I hope this would be of some help.

Provide verbose output that clearly demonstrates the problem

  • Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
  • If using API, add 'verbose': True to YoutubeDL params instead
  • Copy the WHOLE output (starting with [debug] Command-line config) and insert it below

Complete Verbose Output

.\yt-dlp.exe -vU --proxy "127.0.0.1:1080" https://abema.tv/video/episode/90-1843_s55_p1
[debug] Command-line config: ['-vU', '--proxy', '127.0.0.1:1080', 'https://abema.tv/video/episode/90-1843_s55_p1']
[debug] Encodings: locale cp936, fs utf-8, pref cp936, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version nightly@2023.09.05.203540 [d3d81cc98] (win32_dir)
[debug] Python 3.8.10 (CPython AMD64 64bit) - Windows-10-10.0.22621-SP0 (OpenSSL 1.1.1k  25 Mar 2021)
[debug] exe versions: none
[debug] Optional libraries: Cryptodome-3.18.0, brotli-1.0.9, certifi-2023.07.22, mutagen-1.47.0, sqlite3-2.6.0, websockets-11.0.3
[debug] Proxy map: {'all': '127.0.0.1:1080'}
[debug] Loaded 1864 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp-nightly-builds/releases/latest
Available version: nightly@2023.09.05.203540, Current version: nightly@2023.09.05.203540
yt-dlp is up to date (nightly@2023.09.05.203540)
[AbemaTV] Extracting URL: https://abema.tv/video/episode/90-1843_s55_p1
[AbemaTV] Authorizing
[AbemaTV] 90-1843_s55_p1: Downloading webpage
[AbemaTV] 90-1843_s55_p1: Checking playability
[AbemaTV] 90-1843_s55_p1: Downloading m3u8 information
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), channels, acodec, size, br, asr, proto, vext, aext, hasaud, source, id
[debug] Default format spec: best/bestvideo+bestaudio
[info] 90-1843_s55_p1: Downloading 1 format(s): 5300
[debug] Invoking hlsnative downloader on "https://vod-abematv.akamaized.net/program/90-1843_s55_p1/1080/playlist.m3u8?aver=1"
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 758
[download] Destination: #1:新シーズン♡きみとの、ひと夏の恋の記録【しょうとあカップルデート:前編】 [90-1843_s55_p1].mp4
[debug] File locking is not supported. Proceeding without locking
[download]   0.1% of ~   2.08GiB at  586.29KiB/s ETA 01:01:45 (frag 1/758)ERROR: unable to download video data: <urlopen error unknown url type: abematv-license>
Traceback (most recent call last):
  File "yt_dlp\networking\_urllib.py", line 428, in _send
  File "urllib\request.py", line 525, in open
  File "urllib\request.py", line 547, in _open
  File "urllib\request.py", line 502, in _call_chain
  File "urllib\request.py", line 1425, in unknown_open
urllib.error.URLError: <urlopen error unknown url type: abematv-license>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "yt_dlp\YoutubeDL.py", line 3395, in process_info
  File "yt_dlp\YoutubeDL.py", line 3116, in dl
  File "yt_dlp\downloader\common.py", line 455, in download
  File "yt_dlp\downloader\hls.py", line 375, in real_download
  File "yt_dlp\downloader\fragment.py", line 526, in download_and_append_fragments
  File "yt_dlp\downloader\fragment.py", line 367, in decrypt_fragment
  File "yt_dlp\downloader\fragment.py", line 356, in _get_key
  File "yt_dlp\YoutubeDL.py", line 4059, in urlopen
  File "yt_dlp\networking\common.py", line 114, in send
  File "yt_dlp\networking\_helper.py", line 203, in wrapper
  File "yt_dlp\networking\common.py", line 325, in send
  File "yt_dlp\networking\_urllib.py", line 443, in _send
yt_dlp.networking.exceptions.TransportError: <urlopen error unknown url type: abematv-license>
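
The traceback above ends in urllib's unknown_open. As a rough illustration using only the standard library, the sketch below reproduces the same kind of error with a hypothetical 'demo' scheme and shows that it goes away once a handler for that scheme is registered on the opener that sends the request. DemoHandler and the 'demo://' URL are made up for this example and are not part of yt-dlp.

```python
import io
import urllib.error
import urllib.request


class DemoHandler(urllib.request.BaseHandler):
    """Hypothetical handler; urllib dispatches 'demo://' URLs to demo_open."""

    def demo_open(self, req):
        # Return any file-like object as the "response"
        return io.BytesIO(b'fake-key-bytes')


# An opener that only knows how to report unknown schemes
opener = urllib.request.OpenerDirector()
opener.add_handler(urllib.request.UnknownHandler())

try:
    opener.open('demo://example')
    error_message = None
except urllib.error.URLError as exc:
    error_message = str(exc)

# Once the handler is registered on this same opener, the request succeeds
opener.add_handler(DemoHandler())
body = opener.open('demo://example').read()
```

The handler has to be added to the opener that performs the request; adding it to any other opener leaves the error in place.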
@SevenLives SevenLives added bug Bug that is not site-specific triage Untriaged issue labels Sep 6, 2023
@SevenLives
Contributor Author

The latest main branch has this issue, but the code from July 6, 2023 does not, so it was probably introduced by changes to the core code.

@coletdjnz
Member

The Abema extractor needs a complete rewrite to support the new networking framework properly, so wouldn't surprise me if there are issues.

The developer who used to maintain it is no longer online/part of this project so someone else will need to pick it up.

If anyone does want to, feel free to @ me for help on the framework side of things if need be.

@SevenLives
Contributor Author

> The Abema extractor needs a complete rewrite to support the new networking framework properly, so wouldn't surprise me if there are issues.
>
> The developer who used to maintain it is no longer online/part of this project so someone else will need to pick it up.
>
> If anyone does want to, feel free to @ me for help on the framework side of things if need be.

Okay, I understand. Unfortunately, I may not be of much help because I haven't used Python for development before. I can only provide some minor suggestions.

@coletdjnz
Member

coletdjnz commented Sep 6, 2023

That said I believe your patch is the right way to fix this issue with the current hacky workaround for Abema, so feel free to open a PR with it.

@coletdjnz coletdjnz added site-bug Issue with a specific website patch-available There is patch available that should fix this issue. Someone needs to make a PR with it regression Works in youtube-dl/older yt-dlp and removed bug Bug that is not site-specific triage Untriaged issue labels Sep 6, 2023
SevenLives added a commit to SevenLives/yt-dlp that referenced this issue Sep 6, 2023
@pukkandan
Member

> I'm not sure if other websites would have the same issue, but I hope this would be of some help.

Abema is currently the only extractor interfacing with the request handler (RH) directly, so the issue will only occur here. An Abema rewrite is unlikely to happen any time soon, so your patch can be merged. Please open a PR.

SevenLives added a commit to SevenLives/yt-dlp that referenced this issue Sep 7, 2023
coletdjnz pushed a commit that referenced this issue Sep 16, 2023
Fixes #8036

Authored by: SevenLives
aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this issue Apr 21, 2024