Pornhub broken #7074

Closed
Firer opened this issue Oct 6, 2015 · 6 comments

Comments

@Firer commented Oct 6, 2015

I have swapped "h1" below with "removedh1" to stop it being picked up as markdown.

youtube-dl --verbose -g http://www.pornhub.com/view_video.php?viewkey=1752516565
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'--verbose', u'-g', u'http://www.pornhub.com/view_video.php?viewkey=1752516565']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2015.09.28
[debug] Python version 2.6.6 - Linux-2.6.32-504.8.1.el6.x86_64-x86_64-with-centos-6.6-Final
[debug] exe versions: none
[debug] Proxy map: {}
ERROR: Unable to extract title; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/www/youtube-dl-master/youtube-dl/youtube_dl/YoutubeDL.py", line 660, in extract_info
    ie_result = ie.extract(url)
  File "/www/youtube-dl-master/youtube-dl/youtube_dl/extractor/common.py", line 288, in extract
    return self._real_extract(url)
  File "/www/youtube-dl-master/youtube-dl/youtube_dl/extractor/pornhub.py", line 70, in _real_extract
    video_title = self._html_search_regex(r'<h1 [^>]+>([^<]+)', webpage, 'title')
  File "/www/youtube-dl-master/youtube-dl/youtube_dl/extractor/common.py", line 591, in _html_search_regex
    res = self._search_regex(pattern, string, name, default, fatal, flags, group)
  File "/www/youtube-dl-master/youtube-dl/youtube_dl/extractor/common.py", line 582, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
RegexNotFoundError: Unable to extract title; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

@dstftw (Collaborator) commented Oct 6, 2015

Works fine for me. Post the output of youtube-dl --dump-pages http://www.pornhub.com/view_video.php?viewkey=1752516565.

@Firer (Author) commented Oct 6, 2015

It seems to be intermittent. On one of my systems, if I run the command youtube-dl -g http://www.pornhub.com/view_video.php?viewkey=1752516565 multiple times in a row some will work and some won't. I guess it may not be a youtube-dl problem.

@yan12125 (Collaborator) commented Oct 7, 2015

I've run the command youtube-dl -v -s --write-pages "http://www.pornhub.com/view_video.php?viewkey=1752516565" 100 times in a row and none failed. I guess pornhub has already fixed their servers. Do you still have the problem? If so, post the output of the command mentioned by @dstftw above.

@Firer (Author) commented Oct 7, 2015

Yes, I just ran the command python /www/youtube-dl-master/youtube-dl -g http://www.pornhub.com/view_video.php?viewkey=1752516565 10 times in a row. 4 worked and 6 gave the error.

youtube-dl --dump-pages http://www.pornhub.com/view_video.php?viewkey=1752516565
[PornHub] 1752516565: Downloading webpage
[PornHub] Dumping request to http://www.pornhub.com/view_video.php?viewkey=1752516565
PGh0bWw+PGhlYWQ+PHNjcmlwdCB0eXBlPSJ0ZXh0L2phdmFzY3JpcHQiPjwhLS0KZnVuY3Rpb24gbGVhc3RGYWN0b3IobikgewogaWYgKGlzTmFOKG4pIHx8ICFpc0Zpbml0ZShuKSkgcmV0dXJuIE5hTjsKIGlmIChuPT0wKSByZXR1cm4gMDsKIGlmIChuJTEgfHwgbipuPDIpIHJldHVybiAxOwogaWYgKG4lMj09MCkgcmV0dXJuIDI7CiBpZiAobiUzPT0wKSByZXR1cm4gMzsKIGlmIChuJTU9PTApIHJldHVybiA1OwogdmFyIG09TWF0aC5zcXJ0KG4pOwogZm9yICh2YXIgaT03O2k8PW07aSs9MzApIHsKICBpZiAobiVpPT0wKSAgICAgIHJldHVybiBpOwogIGlmIChuJShpKzQpPT0wKSAgcmV0dXJuIGkrNDsKICBpZiAobiUoaSs2KT09MCkgIHJldHVybiBpKzY7CiAgaWYgKG4lKGkrMTApPT0wKSByZXR1cm4gaSsxMDsKICBpZiAobiUoaSsxMik9PTApIHJldHVybiBpKzEyOwogIGlmIChuJShpKzE2KT09MCkgcmV0dXJuIGkrMTY7CiAgaWYgKG4lKGkrMjIpPT0wKSByZXR1cm4gaSsyMjsKICBpZiAobiUoaSsyNCk9PTApIHJldHVybiBpKzI0OwogfQogcmV0dXJuIG47Cn0KZnVuY3Rpb24gZ28oKSB7CiB2YXIgcD0xNzQ1ODM4NDI4NDMxOyB2YXIgcz0yNTUwMzg3NDQ5OyB2YXIgbjsKaWYgKChzID4+IDEyKSAmIDEpIHArPS8qCnArPSAqLzM5ODM0MDk5Ki8qCmVsc2UgcC09CiovMTU7LyogMTIwODg2MTA4KgoqL2Vsc2UgLyogMTIwODg2MTA4KgoqL3AtPS8qCnArPSAqLzQ2NzA5ODQ1KjEzOwlpZiAoKHMgPj4gMCkgJiAxKS8qCmVsc2UgcC09CiovcCs9IDYxOTYyNDUyMyozOyBlbHNlIC8qIDEyMDg4NjEwOCoKKi9wLT0vKiAxMjA4ODYxMDgqCiovMTg5MzgyNjIyOCoJMTsvKiAxMjA4ODYxMDgqCiovaWYgKChzID4+IDQpICYgMSkKcCs9LyoKKjEzOwoqLzE5OTI3OTY5Nio3Oy8qCnArPSAqL2Vsc2UgcC09CTI5MDY1OTAwNCovKgplbHNlIHAtPQoqLzU7aWYgKChzID4+IDExKSAmIDEpLyogMTIwODg2MTA4KgoqL3ArPS8qCioxMzsKKi8zNzg1MTUwMSogMTI7IGVsc2UgLyogMTIwODg2MTA4KgoqL3AtPS8qCnArPSAqLzU5MDI4Mzg2KgoxMjsvKgpwKz0gKi9pZiAoKHMgPj4gMTUpICYgMSkvKgpwKz0gKi9wKz05MDQ0MDQ4MyovKiAxMjA4ODYxMDgqCiovMTY7ZWxzZSAJcC09CTc1MzY2OTIzKi8qIDEyMDg4NjEwOCoKKi8xNjsgcC09NDk2NDkxNzQ0MjsKIG49bGVhc3RGYWN0b3IocCk7CnsgZG9jdW1lbnQuY29va2llPSJSTktFWT0iK24rIioiK3AvbisiOiIrcysiOjQxODI5MjY2MzM6MSI7CiAgZG9jdW1lbnQubG9jYXRpb24ucmVsb2FkKHRydWUpOyB9Cn0KLy8tLT48L3NjcmlwdD48L2hlYWQ+Cjxib2R5IG9ubG9hZD0iZ28oKSI+CkxvYWRpbmcgLi4uCjwvYm9keT4KPC9odG1sPgo=
ERROR: Unable to extract title; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
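
For anyone reading the dump: --dump-pages prints the page base64-encoded. Decoded, the blob above is not the video page. It is a small HTML page whose script computes a value, stores it in an RNKEY cookie, and reloads, with only "Loading ..." in the body. There is no h1 element, so the title regex from the traceback has nothing to match. A rough sketch of how one might check a dump for this case follows. The helper names and the shortened sample page are made up for illustration, and the RNKEY check is only a heuristic based on this one dump.

```python
import base64
import re

# Same pattern as pornhub.py line 70 in the traceback.
TITLE_RE = r'<h1 [^>]+>([^<]+)'


def decode_dump(dump):
    """Decode a --dump-pages blob, dropping whitespace left by terminal wrapping."""
    compact = ''.join(dump.split())
    return base64.b64decode(compact).decode('utf-8', errors='replace')


def find_title(page):
    """Return the captured title text, or None if the pattern does not match."""
    match = re.search(TITLE_RE, page)
    return match.group(1).strip() if match else None


def looks_like_challenge(page):
    """Heuristic (assumption): the interstitial sets an RNKEY cookie and reloads."""
    return 'RNKEY=' in page and 'location.reload' in page


# Hypothetical, shortened stand-in for the page in the dump (not the real blob).
sample_page = (
    '<html><head><script type="text/javascript">'
    'function go() { document.cookie="RNKEY=..."; '
    'document.location.reload(true); }'
    '</script></head><body onload="go()">Loading ...</body></html>'
)
sample_dump = base64.b64encode(sample_page.encode('utf-8')).decode('ascii')

page = decode_dump(sample_dump)
print('title:', find_title(page))
print('challenge page:', looks_like_challenge(page))
```

Pasting the real blob into decode_dump in place of sample_dump should give the full challenge page, including the leastFactor function.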

@kidol (Contributor) commented Oct 7, 2015

It's scrape-detection on PornHub's side...
Either slow down your requests (so that detection doesn't trigger), or use proxies.
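
The "slow down" option could look something like the sketch below: retry a failed extraction after a growing pause instead of hammering the site. This is only an illustration. retry_with_delay, its parameters, and the delay values are invented here, and nothing in this thread confirms what request rate avoids the detection.

```python
import time


def retry_with_delay(fetch, attempts=5, delay=10.0, sleep=time.sleep):
    """Call fetch() until it succeeds, pausing between failed attempts.

    fetch is any zero-argument callable that raises on failure.
    sleep is injectable so the pause can be stubbed out in tests.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as err:  # narrow to the extractor's error type in real use
            last_error = err
            if attempt < attempts:
                sleep(delay * attempt)  # linear back-off: delay, 2*delay, ...
    raise last_error


if __name__ == '__main__':
    # Stand-in for a real extraction call: fails twice, then succeeds.
    state = {'calls': 0}

    def flaky_fetch():
        state['calls'] += 1
        if state['calls'] < 3:
            raise RuntimeError('Unable to extract title')
        return 'video-url'

    # The pause is stubbed out so the demo finishes immediately.
    print(retry_with_delay(flaky_fetch, sleep=lambda seconds: None))
```

In real use, fetch would wrap whatever call performs the extraction. For the proxy route, youtube-dl already has a --proxy option.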

@yan12125 (Collaborator) commented Oct 7, 2015

Thanks @kidol. It's a duplicate of #5930. youtube-dl may never be able to fix it due to its complexity.

@dstftw dstftw closed this Oct 10, 2015