xhamster.com broken site support - new video_id with letters #25789

Closed
TheRealDude2 opened this issue Jun 25, 2020 · 5 comments

TheRealDude2 (Contributor) commented Jun 25, 2020

Checklist

  • I'm reporting a broken site support
  • I've verified that I'm running youtube-dl version 2020.06.16.1
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar issues including closed ones

Verbose log

youtube-dl -v  https://de.xhamster.com/videos/skinny-girl-fucks-herself-hard-in-the-forest-xhnBJZx
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'https://de.xhamster.com/videos/skinny-girl-fucks-herself-hard-in-the-forest-xhnBJZx']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2020.06.16.1
[debug] Python version 2.7.16 (CPython) - Linux-4.19.118-v7+-armv7l-with-debian-10.4
[debug] exe versions: ffmpeg 4.1.4-1, ffprobe 4.1.4-1
[debug] Proxy map: {}
[generic] skinny-girl-fucks-herself-hard-in-the-forest-xhnBJZx: Requesting header
WARNING: Falling back on generic information extractor.
[generic] skinny-girl-fucks-herself-hard-in-the-forest-xhnBJZx: Downloading webpage
[generic] skinny-girl-fucks-herself-hard-in-the-forest-xhnBJZx: Extracting information
ERROR: Unsupported URL: https://de.xhamster.com/videos/skinny-girl-fucks-herself-hard-in-the-forest-xhnBJZx
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 2387, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2562, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory)))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2551, in _XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1659, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1523, in _raiseerror
    raise err
ParseError: syntax error: line 1, column 0
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 797, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 530, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 3382, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: https://de.xhamster.com/videos/skinny-girl-fucks-herself-hard-in-the-forest-xhnBJZx

Description

Trying to download video at this location:
https://de.xhamster.com/videos/skinny-girl-fucks-herself-hard-in-the-forest-xhnBJZx

The site uses a new URL scheme with letters instead of numbers for the video_id (xhnBJZx in the example). I think this is the problem. Letters have been used for all new videos for the past 2 days.

Tested with Python version 2.7.16 and 3.7.3.

I am not very good at regular expressions and have never made a pull request, but I was able to create a working version of the extractor xhamster.py. Still, I think it would be better if someone who knows the code made the change.
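For readers following along, here is a minimal sketch of why a digits-only ID group stops matching. The pattern is a simplified stand-in for the extractor's URL regex (an assumption: the real `_VALID_URL` in xhamster.py covers more than this), and the slug `some-title` is hypothetical. Only the ID `xhnBJZx` comes from the report.

```python
import re

# Simplified stand-in for the extractor's URL pattern (assumption: the real
# _VALID_URL in xhamster.py is broader than this).
OLD_PATTERN = re.compile(
    r'https?://(?:.+?\.)?xhamster\.com/videos/(?P<display_id>[^/]*)-(?P<id>\d+)'
)


def match_id(url):
    """Return the captured video ID, or None if the URL is not recognised."""
    m = OLD_PATTERN.match(url)
    return m.group('id') if m else None


# Hypothetical URLs; only the ID 'xhnBJZx' is taken from the report.
old_style = 'https://xhamster.com/videos/some-title-1234567'
new_style = 'https://de.xhamster.com/videos/some-title-xhnBJZx'

print(match_id(old_style))  # 1234567
print(match_id(new_style))  # None
```

When no site-specific pattern matches, youtube-dl falls back to the generic extractor, which is consistent with the "Falling back on generic information extractor" warning in the log.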

TheRealDude2 (Contributor, Author) commented Jun 27, 2020

JChris246, thanks for the pull request. I noticed a small problem: your change cuts off the video_id if it contains a capital letter.

I changed the line:
videos/(?P<display_id_2>[^/]*)-(?P<id_2>\d+)
to
videos/(?P<display_id_2>[^/]*)-(?P<id_2>\w+)

but I'm not sure whether this catches too much. Is your expression, extended with capital letters (something like [\da-zA-Z]+), the same as \w+?
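The truncation described above can be reproduced with a small sketch. It assumes the first revision of the pull request used a lowercase-only class such as `[\da-z]+` (the thread does not show the exact pattern), and the slug `some-title` is hypothetical. Because `re.match` only anchors at the start, the match succeeds but the ID group stops at the first capital letter.

```python
import re

# Assumption: a lowercase-only class stands in for the first PR revision.
LOWER_ONLY = re.compile(r'videos/(?P<display_id>[^/]*)-(?P<id>[\da-z]+)')
# The replacement proposed in the comment above.
WORD_CHARS = re.compile(r'videos/(?P<display_id>[^/]*)-(?P<id>\w+)')

# Hypothetical slug; the ID 'xhnBJZx' is taken from the report.
path = 'videos/some-title-xhnBJZx'

truncated_id = LOWER_ONLY.match(path).group('id')
full_id = WORD_CHARS.match(path).group('id')

print(truncated_id)  # xhn
print(full_id)       # xhnBJZx
```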

JChris246 (Contributor) commented Jun 28, 2020

> JChris246, thanks for the pull request. I noticed a small problem: your change cuts off the video_id if it contains a capital letter.
>
> I changed the line:
> videos/(?P<display_id_2>[^/]*)-(?P<id_2>\d+) to videos/(?P<display_id_2>[^/]*)-(?P<id_2>\w+)
>
> but I'm not sure whether this catches too much. Is your expression, extended with capital letters (something like [\da-zA-Z]+), the same as \w+?

Ah yes, my bad. I changed the regex to [\dA-z]+ to include capital letters. To answer your question: \w+ and [\da-zA-Z]+ are quite similar. The difference (to my knowledge) is that \w+ also matches _.
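A quick way to check the character classes mentioned in this thread is to test each printable ASCII character against them. The sketch below assumes Python 3 (for `re.fullmatch`), and the helper name `accepted` is made up for illustration. Two things are worth noting: the range `A-z` spans the ASCII punctuation that sits between `Z` and `a`, and in Python 3 `\w` on `str` patterns also matches non-ASCII word characters unless `re.ASCII` is set.

```python
import re
import string

# Printable ASCII letters, digits and punctuation to test against each class.
candidates = string.ascii_letters + string.digits + string.punctuation


def accepted(char_class):
    """Return the set of candidate characters matched by a one-character class."""
    pattern = re.compile(char_class)
    return {c for c in candidates if pattern.fullmatch(c)}


explicit = accepted(r'[\da-zA-Z]')
word = accepted(r'\w')
a_to_z = accepted(r'[\dA-z]')

# Within ASCII, \w adds only the underscore.
print(sorted(word - explicit))    # ['_']
# A-z also picks up the six characters between 'Z' and 'a'.
print(sorted(a_to_z - explicit))  # ['[', '\\', ']', '^', '_', '`']
```

For IDs like `xhnBJZx`, all three classes capture the full ID. They differ only in which extra characters they would tolerate.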

TheRealDude2 (Contributor, Author) commented Jun 29, 2020

> Ah yes, my bad. I changed the regex to [\dA-z]+ to include capital letters. To answer your question: \w+ and [\da-zA-Z]+ are quite similar. The difference (to my knowledge) is that \w+ also matches _.

Ah, I got it. Thanks for the change and the tip. I think I'm going to read up on regular expressions AND git.

parasiteoflife commented Jul 16, 2020

Please commit the PR; it's been 20 days.

YuukieO commented Aug 6, 2020

Still broken in youtube-dl 2020.07.28:

[debug] System config: []
[debug] User config: ['-o', '%(title)s.%(ext)s']
[debug] Custom config: []
[debug] Command-line args: ['-v', 'https://xhamster.com/videos/eliza-ibarra-helps-her-gf-for-a-scissoring-nuru-tutorial-xhmB061']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2020.07.28
[debug] Python version 3.6.10 (CPython) - Linux-5.3.18-lp152.33-default-x86_64-with-glibc2.3.4
[debug] exe versions: ffmpeg 3.4.4, ffprobe 3.4.4
[debug] Proxy map: {}
[generic] eliza-ibarra-helps-her-gf-for-a-scissoring-nuru-tutorial-xhmB061: Requesting header
WARNING: Falling back on generic information extractor.
[generic] eliza-ibarra-helps-her-gf-for-a-scissoring-nuru-tutorial-xhmB061: Downloading webpage
[generic] eliza-ibarra-helps-her-gf-for-a-scissoring-nuru-tutorial-xhmB061: Extracting information
ERROR: Unsupported URL: https://xhamster.com/videos/eliza-ibarra-helps-her-gf-for-a-scissoring-nuru-tutorial-xhmB061
Traceback (most recent call last):
  File "/usr/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 797, in extract_info
    ie_result = ie.extract(url)
  File "/usr/bin/youtube-dl/youtube_dl/extractor/common.py", line 530, in extract
    ie_result = self._real_extract(url)
  File "/usr/bin/youtube-dl/youtube_dl/extractor/generic.py", line 3382, in _real_extract
    raise UnsupportedError(url)
youtube_dl.utils.UnsupportedError: Unsupported URL: https://xhamster.com/videos/eliza-ibarra-helps-her-gf-for-a-scissoring-nuru-tutorial-xhmB061
dstftw pushed a commit that referenced this issue Aug 12, 2020