Bug REGEX with extractor for Eurosport #7042

Closed
9 of 10 tasks
emilejanssens opened this issue May 13, 2023 · 0 comments · Fixed by #7076
Labels
site-bug Issue with a specific website triage Untriaged issue

Comments

@emilejanssens

DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE

  • I understand that I will be blocked if I intentionally remove or skip any mandatory* field

Checklist

  • I'm reporting a bug unrelated to a specific site
  • I've verified that I'm running yt-dlp version 2023.03.04 (update instructions) or later (specify commit)
  • I've checked that all provided URLs are playable in a browser with the same IP and same login details
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched known issues and the bugtracker for similar issues including closed ones. DO NOT post duplicates
  • I've read the guidelines for opening an issue

Provide a description that is worded well enough to be understood

As part of a university testing project, our unit tests showed that some valid Eurosport URLs are not accepted by the regex in yt-dlp.
The regex is defined in yt-dlp/extractor/eurosport.py:

_VALID_URL = r'https?://www\.eurosport\.com/\w+/[\w-]+/\d+/[\w-]+_(?P<id>vid\d+)'

Potential solution:

_VALID_URL = r'https?://www\.eurosport\.com/\w+/[\w-]+/[\d-]+/[\w-]+_(?P<id>vid\d+)'

Change \d+ to [\d-]+ so that the third path segment can contain a hyphen, as in the season 2022-2023 in the URL below.
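The difference between the two patterns can be checked without yt-dlp, as a minimal sketch. It assumes the pattern is applied from the start of the URL, as `re.match` does; the names `OLD_VALID_URL`, `NEW_VALID_URL`, `ISSUE_URL` and `extract_id` are hypothetical and exist only for this illustration.

```python
import re

# Pattern quoted in this issue as the current _VALID_URL.
OLD_VALID_URL = r'https?://www\.eurosport\.com/\w+/[\w-]+/\d+/[\w-]+_(?P<id>vid\d+)'
# Pattern proposed in this issue: the third path segment may contain hyphens.
NEW_VALID_URL = r'https?://www\.eurosport\.com/\w+/[\w-]+/[\d-]+/[\w-]+_(?P<id>vid\d+)'

# URL from the verbose output in this issue.
ISSUE_URL = (
    'https://www.eurosport.com/football/champions-league/2022-2023/'
    'pep-guardiola-emotionally-destroyed-after-manchester-city-win-over-'
    'bayern-munich-in-champions-league_vid1896254/video.shtml'
)


def extract_id(pattern, url):
    """Return the captured video id, or None if the URL does not match."""
    match = re.match(pattern, url)
    return match.group('id') if match else None


print(extract_id(OLD_VALID_URL, ISSUE_URL))  # None
print(extract_id(NEW_VALID_URL, ISSUE_URL))  # vid1896254
```

With the old pattern, `\d+` consumes `2022` and then requires a `/`, but the next character is `-`, so the match fails. Note that `[\d-]+` is deliberately loose and would also accept a segment made only of hyphens; a stricter form such as `\d+(?:-\d+)?` is a possible alternative, not something this issue proposes.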

Provide verbose output that clearly demonstrates the problem

  • Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
  • If using API, add 'verbose': True to YoutubeDL params instead
  • Copy the WHOLE output (starting with [debug] Command-line config) and insert it below

Complete Verbose Output

.\yt-dlp.exe -vU https://www.eurosport.com/football/champions-league/2022-2023/pep-guardiola-emotionally-destroyed-after-manchester-city-win-over-bayern-munich-in-champions-league_vid1896254/video.shtml -P .\trash\
[debug] Command-line config: ['-vU', 'https://www.eurosport.com/football/champions-league/2022-2023/pep-guardiola-emotionally-destroyed-after-manchester-city-win-over-bayern-munich-in-champions-league_vid1896254/video.shtml', '-P', '.\\trash\\']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8 
[debug] yt-dlp version stable@2023.03.04 [392389b7d] (win_exe) 
[debug] Python 3.8.10 (CPython AMD64 64bit) - Windows-10-10.0.19045-SP0 (OpenSSL 1.1.1k  25 Mar 2021) 
[debug] exe versions: none 
[debug] Optional libraries: Cryptodome-3.17, brotli-1.0.9, certifi-2022.12.07, mutagen-1.46.0, sqlite3-2.6.0, websockets-10.4 
[debug] Proxy map: {} 
[debug] Loaded 1786 extractors 
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest 
Available version: stable@2023.03.04, Current version: stable@2023.03.04 
Current Build Hash: 5590c57bd0433ed239a2deaaf92e2ad6f37fe50f53664c821575cafe106a9421 
yt-dlp is up to date (stable@2023.03.04) 
[generic] Extracting URL: https://www.eurosport.com/football/champions-league/2022-2023/pep-guardiola-emotionally-destroyed-after-manchester-city-win-over-bayern-munich-in-champions-league_vid1896254/video.shtml
[generic] video: Downloading webpage 
WARNING: [generic] Falling back on generic information extractor 
[generic] video: Extracting information 
[debug] Looking for embeds 
ERROR: Unsupported URL: https://www.eurosport.com/football/champions-league/2022-2023/pep-guardiola-emotionally-destroyed-after-manchester-city-win-over-bayern-munich-in-champions-league_vid1896254/video.shtml
Traceback (most recent call last): 
  File "yt_dlp\YoutubeDL.py", line 1518, in wrapper
  File "yt_dlp\YoutubeDL.py", line 1594, in __extract_info
  File "yt_dlp\extractor\common.py", line 694, in extract
  File "yt_dlp\extractor\generic.py", line 2510, in _real_extract
yt_dlp.utils.UnsupportedError: Unsupported URL: https://www.eurosport.com/football/champions-league/2022-2023/pep-guardiola-emotionally-destroyed-after-manchester-city-win-over-bayern-munich-in-champions-league_vid1896254/video.shtml
emilejanssens added the bug (Bug that is not site-specific) and triage (Untriaged issue) labels on May 13, 2023
pukkandan added the site-bug (Issue with a specific website) label and removed the bug (Bug that is not site-specific) label on May 13, 2023
pukkandan pushed a commit that referenced this issue May 29, 2023
Closes #7042
Authored by: HobbyistDev
aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this issue Apr 21, 2024
2 participants