Instagram URL + proxy breaks the code #25464

Closed
Kikobeats opened this issue May 30, 2020 · 2 comments
Kikobeats commented May 30, 2020

Hello,

When I run youtube-dl on an Instagram video post through a proxy, extraction fails with a RegexNotFoundError ("Unable to extract video url").

error trace
youtube-dl --dump-json -f best --proxy=https://lum-customer-hl_1234-zone-twittervideo2-ip-1.2.3.4:pwd@zproxy.lum-superproxy.io:22225 --verbose https://www.instagram.com/p/B5LeHK2h4p0/
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--dump-json', u'-f', u'best', u'--proxy=https://lum-customer-hl_1234-zone-twittervideo2-ip-1.2.3.4:pwd@zproxy.lum-superproxy.io:22225', u'--verbose', u'https://www.instagram.com/p/B5LeHK2h4p0/']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2020.05.29
[debug] Python version 2.7.16 (CPython) - Darwin-19.4.0-x86_64-i386-64bit
[debug] exe versions: ffmpeg 4.2.2, ffprobe 4.2.2, rtmpdump 2.4
[debug] Proxy map: {u'http': u'https://lum-customer-hl_1234-zone-twittervideo2-ip-1.2.3.4:pwd@zproxy.lum-superproxy.io:22225', u'https': u'https://lum-customer-hl_1234-zone-twittervideo2-ip-1.2.3.4:pwd@zproxy.lum-superproxy.io:22225'}
ERROR: Unable to extract video url; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/Users/josefranciscoverdugambin/Projects/microlink/metascraper/packages/metascraper-media-provider/node_modules/youtube-dl/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 797, in extract_info
    ie_result = ie.extract(url)
  File "/Users/josefranciscoverdugambin/Projects/microlink/metascraper/packages/metascraper-media-provider/node_modules/youtube-dl/bin/youtube-dl/youtube_dl/extractor/common.py", line 530, in extract
    ie_result = self._real_extract(url)
  File "/Users/josefranciscoverdugambin/Projects/microlink/metascraper/packages/metascraper-media-provider/node_modules/youtube-dl/bin/youtube-dl/youtube_dl/extractor/instagram.py", line 195, in _real_extract
    video_url = self._og_search_video_url(webpage, secure=False)
  File "/Users/josefranciscoverdugambin/Projects/microlink/metascraper/packages/metascraper-media-provider/node_modules/youtube-dl/bin/youtube-dl/youtube_dl/extractor/common.py", line 1123, in _og_search_video_url
    return self._html_search_regex(regexes, html, name, **kargs)
  File "/Users/josefranciscoverdugambin/Projects/microlink/metascraper/packages/metascraper-media-provider/node_modules/youtube-dl/bin/youtube-dl/youtube_dl/extractor/common.py", line 1014, in _html_search_regex
    res = self._search_regex(pattern, string, name, default, fatal, flags, group)
  File "/Users/josefranciscoverdugambin/Projects/microlink/metascraper/packages/metascraper-media-provider/node_modules/youtube-dl/bin/youtube-dl/youtube_dl/extractor/common.py", line 1005, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
RegexNotFoundError: Unable to extract video url; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

I verified that the proxy works as expected using a Twitter URL.

Twitter URL with proxy working fine
youtube-dl --dump-json -f best --no-warnings --no-call-home --no-check-certificate --prefer-free-formats --youtube-skip-dash-manifest --referer=https://twitter.com/verge/status/957383241714970624 --proxy=https://lum-customer-hl_1234-zone-twittervideo2-ip-1.2.3.4:pwd@zproxy.lum-superproxy.io:22225 --verbose -- https://twitter.com/verge/status/957383241714970624
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--dump-json', u'-f', u'best', u'--no-warnings', u'--no-call-home', u'--no-check-certificate', u'--prefer-free-formats', u'--youtube-skip-dash-manifest', u'--referer=https://twitter.com/verge/status/957383241714970624', u'--proxy=https://lum-customer-hl_1234-zone-twittervideo2-ip-1.2.3.4:pwd@zproxy.lum-superproxy.io:22225', u'--verbose', u'--', u'https://twitter.com/verge/status/957383241714970624']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2020.05.29
[debug] Python version 2.7.16 (CPython) - Darwin-19.4.0-x86_64-i386-64bit
[debug] exe versions: ffmpeg 4.2.2, ffprobe 4.2.2, rtmpdump 2.4
[debug] Proxy map: {u'http': u'https://lum-customer-hl_1234-zone-twittervideo2-ip-1.2.3.4:pwd@zproxy.lum-superproxy.io:22225', u'https': u'https://lum-customer-hl_1234-zone-twittervideo2-ip-1.2.3.4:pwd@zproxy.lum-superproxy.io:22225'}
{"display_id": "957383241714970624", "extractor": "twitter", "tbr": 1280, "protocol": "https", "description": "Is it bad to blow into game cartridges? https://t.co/Y3yAimrUnP", "tags": [], "timestamp": 1517092926, "format": "http-1280 - 720x720", "formats": [{"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Referer": "https://twitter.com/verge/status/957383241714970624", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3678.1 Safari/537.36"}, "protocol": "m3u8_native", "format": "hls-320 - 240x240", "url": "https://video.twimg.com/amplify_video/943561675927519232/pl/240x240/hqThe2qwGxY4us_s.m3u8", "vcodec": "avc1.4d0015", "tbr": 320.0, "height": 240, "width": 240, "ext": "mp4", "preference": null, "fps": null, "manifest_url": "https://video.twimg.com/amplify_video/943561675927519232/pl/YNw1OIz1A5FFywhq.m3u8", "format_id": "hls-320", "acodec": "mp4a.40.2"}, {"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Referer": "https://twitter.com/verge/status/957383241714970624", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3678.1 Safari/537.36"}, "protocol": "https", "format": "http-320 - 240x240", "url": "https://video.twimg.com/amplify_video/943561675927519232/vid/240x240/mijiQdCq-p9FaO8H.mp4", "tbr": 320, "height": 240, "width": 240, "ext": "mp4", "format_id": "http-320"}, {"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Referer": "https://twitter.com/verge/status/957383241714970624", "Accept": 
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3678.1 Safari/537.36"}, "protocol": "m3u8_native", "format": "hls-832 - 480x480", "url": "https://video.twimg.com/amplify_video/943561675927519232/pl/480x480/3qIAtN3BK0tvUuQX.m3u8", "vcodec": "avc1.4d001f", "tbr": 832.0, "height": 480, "width": 480, "ext": "mp4", "preference": null, "fps": null, "manifest_url": "https://video.twimg.com/amplify_video/943561675927519232/pl/YNw1OIz1A5FFywhq.m3u8", "format_id": "hls-832", "acodec": "mp4a.40.2"}, {"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Referer": "https://twitter.com/verge/status/957383241714970624", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3678.1 Safari/537.36"}, "protocol": "https", "format": "http-832 - 480x480", "url": "https://video.twimg.com/amplify_video/943561675927519232/vid/480x480/qURzB_XtWBE-dvRa.mp4", "tbr": 832, "height": 480, "width": 480, "ext": "mp4", "format_id": "http-832"}, {"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Referer": "https://twitter.com/verge/status/957383241714970624", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3678.1 Safari/537.36"}, "protocol": "m3u8_native", "format": "hls-1280 - 720x720", "url": "https://video.twimg.com/amplify_video/943561675927519232/pl/720x720/p0lEHBKAhtm_3T9E.m3u8", "vcodec": "avc1.640020", "tbr": 1280.0, "height": 720, "width": 720, "ext": "mp4", "preference": null, "fps": null, "manifest_url": 
"https://video.twimg.com/amplify_video/943561675927519232/pl/YNw1OIz1A5FFywhq.m3u8", "format_id": "hls-1280", "acodec": "mp4a.40.2"}, {"http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Referer": "https://twitter.com/verge/status/957383241714970624", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3678.1 Safari/537.36"}, "protocol": "https", "format": "http-1280 - 720x720", "url": "https://video.twimg.com/amplify_video/943561675927519232/vid/720x720/h1uN7biCI-Fbzm9D.mp4", "tbr": 1280, "height": 720, "width": 720, "ext": "mp4", "format_id": "http-1280"}], "height": 720, "_filename": "The Verge - Is it bad to blow into game cartridges-957383241714970624.mp4", "like_count": 145, "uploader": "The Verge", "duration": 146.563, "format_id": "http-1280", "upload_date": "20180127", "id": "957383241714970624", "playlist": null, "thumbnails": [{"url": "https://pbs.twimg.com/media/DRg1OMRVwAEuwTK.jpg?name=thumb", "width": 150, "resolution": "150x150", "id": "thumb", "height": 150}, {"url": "https://pbs.twimg.com/media/DRg1OMRVwAEuwTK.jpg?name=small", "width": 680, "resolution": "680x680", "id": "small", "height": 680}, {"url": "https://pbs.twimg.com/media/DRg1OMRVwAEuwTK.jpg?name=large", "width": 1080, "resolution": "1080x1080", "id": "large", "height": 1080}, {"url": "https://pbs.twimg.com/media/DRg1OMRVwAEuwTK.jpg?name=medium", "width": 1080, "resolution": "1080x1080", "id": "medium", "height": 1080}, {"url": "https://pbs.twimg.com/media/DRg1OMRVwAEuwTK.jpg?name=orig", "width": 1080, "resolution": "1080x1080", "id": "orig", "height": 1080}], "title": "The Verge - Is it bad to blow into game cartridges?", "url": "https://video.twimg.com/amplify_video/943561675927519232/vid/720x720/h1uN7biCI-Fbzm9D.mp4", "extractor_key": "Twitter", "ext": "mp4", 
"http_headers": {"Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3678.1 Safari/537.36", "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Referer": "https://twitter.com/verge/status/957383241714970624"}, "repost_count": 48, "uploader_id": "verge", "width": 720, "comment_count": 15, "uploader_url": "https://twitter.com/verge", "webpage_url": "https://twitter.com/verge/status/957383241714970624", "requested_subtitles": null, "fulltitle": "The Verge - Is it bad to blow into game cartridges?", "age_limit": 0, "thumbnail": "https://pbs.twimg.com/media/DRg1OMRVwAEuwTK.jpg?name=orig", "webpage_url_basename": "957383241714970624", "playlist_index": null}

In fact, if I just remove the proxy flag, an Instagram URL works as expected.

Instagram without proxy working fine
youtube-dl --dump-json -f best --verbose https://www.instagram.com/p/BmYooZbhCfJ/
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--dump-json', u'-f', u'best', u'--verbose', u'https://www.instagram.com/p/BmYooZbhCfJ/']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2020.05.29
[debug] Python version 2.7.16 (CPython) - Darwin-19.4.0-x86_64-i386-64bit
[debug] exe versions: ffmpeg 4.2.2, ffprobe 4.2.2, rtmpdump 2.4
[debug] Proxy map: {}
{"display_id": "BmYooZbhCfJ", "extractor": "Instagram", "protocol": "https", "description": "\u202aModel 3 Performance testing in Alaska \u2744\ufe0f\u202c", "upload_date": "20180812", "timestamp": 1534090105, "format": "0 - 640x352", "formats": [{"protocol": "https", "format": "0 - 640x352", "url": "https://scontent-mad1-1.cdninstagram.com/v/t50.2886-16/38871629_1045788998909492_7127403467848548352_n.mp4?_nc_ht=scontent-mad1-1.cdninstagram.com&_nc_cat=108&_nc_ohc=vOzlzgkte68AX-p2fYb&oe=5ED52DB7&oh=97282b5d24dc6ea9bf5c867ce42d0bc2", "http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3706.6 Safari/537.36"}, "height": 352, "width": 640, "ext": "mp4", "format_id": "0"}], "height": 352, "_filename": "Video by teslamotors-BmYooZbhCfJ.mp4", "like_count": 105929, "uploader": "Tesla", "format_id": "0", "uploader_id": "teslamotors", "playlist": null, "thumbnails": [{"url": "https://scontent-mad1-1.cdninstagram.com/v/t51.2885-15/e15/38517607_1061334650699625_2957597926945193984_n.jpg?_nc_ht=scontent-mad1-1.cdninstagram.com&_nc_cat=109&_nc_ohc=GeZdHGpRgOAAX8P5Dmp&oh=26d81fa41a54d8602c5cdd1fdc945bbe&oe=5ED4F294", "id": "0"}], "title": "Video by teslamotors", "url": "https://scontent-mad1-1.cdninstagram.com/v/t50.2886-16/38871629_1045788998909492_7127403467848548352_n.mp4?_nc_ht=scontent-mad1-1.cdninstagram.com&_nc_cat=108&_nc_ohc=vOzlzgkte68AX-p2fYb&oe=5ED52DB7&oh=97282b5d24dc6ea9bf5c867ce42d0bc2", "extractor_key": "Instagram", "http_headers": {"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language": "en-us,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3706.6 Safari/537.36"}, "ext": "mp4", "comments": [], "id": "BmYooZbhCfJ", "width": 640, "comment_count": null, "playlist_index": null, "webpage_url": "https://www.instagram.com/p/BmYooZbhCfJ/", "requested_subtitles": null, "fulltitle": "Video by teslamotors", "thumbnail": "https://scontent-mad1-1.cdninstagram.com/v/t51.2885-15/e15/38517607_1061334650699625_2957597926945193984_n.jpg?_nc_ht=scontent-mad1-1.cdninstagram.com&_nc_cat=109&_nc_ohc=GeZdHGpRgOAAX8P5Dmp&oh=26d81fa41a54d8602c5cdd1fdc945bbe&oe=5ED4F294", "webpage_url_basename": "BmYooZbhCfJ"}

So it is the combination of an Instagram URL and the proxy flag that breaks the extraction in some way.
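
One way to narrow this down outside youtube-dl might be to fetch the post page directly through the same proxy and check whether the `og:video` meta tag that `_og_search_video_url` looks for is present at all. The following is a minimal sketch using only Python 3's standard library. The names `find_og_video`, `build_opener` and `fetch` are hypothetical helpers, and the regex is a simplified assumption, not youtube-dl's own pattern.

```python
import re
import urllib.request

# Simplified pattern for <meta property="og:video" content="...">.
# Assumption: youtube-dl's real regexes are more permissive than this one.
OG_VIDEO_RE = re.compile(
    r'<meta[^>]+property=["\']og:video["\'][^>]+content=["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def find_og_video(html):
    """Return the og:video URL found in html, or None if the tag is absent."""
    match = OG_VIDEO_RE.search(html)
    return match.group(1) if match else None


def build_opener(proxy_url=None):
    """Build a urllib opener that routes http and https through proxy_url."""
    if proxy_url is None:
        # An empty mapping disables proxies, including ones from the environment.
        handler = urllib.request.ProxyHandler({})
    else:
        handler = urllib.request.ProxyHandler(
            {"http": proxy_url, "https": proxy_url}
        )
    return urllib.request.build_opener(handler)


def fetch(url, proxy_url=None, timeout=30):
    """Fetch url (optionally via proxy_url) and return the decoded body."""
    opener = build_opener(proxy_url)
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Comparing `find_og_video(fetch(url))` with `find_og_video(fetch(url, proxy_url))` for the same post would show whether the page served through the proxy lacks the tag, which is what the traceback suggests.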

Checklist

  • I'm reporting a broken site support
  • I've verified that I'm running youtube-dl version 2020.05.29
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar issues including closed ones
Kikobeats added a commit to microlinkhq/metascraper that referenced this issue May 30, 2020
it's causing wrong behavior, need to be address ytdl-org/youtube-dl#25464
dstftw (Collaborator) commented May 31, 2020

Run with --write-pages and see what's returned by Instagram.
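
If the run does leave dump files behind, a quick summary of each can show whether Instagram returned the post page or something else, such as a login or error page. This is a rough sketch under two assumptions: that the dumps land in the current directory with a `.dump` suffix, and that a plain substring check for `og:video` is enough to tell the cases apart. `summarize_dump` and `summarize_dumps` are hypothetical helper names.

```python
import re
from pathlib import Path

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def summarize_dump(path):
    """Return (title, has_og_video) for one dumped page."""
    html = Path(path).read_text(encoding="utf-8", errors="replace")
    match = TITLE_RE.search(html)
    title = match.group(1).strip() if match else None
    # Plain substring check, not a full parse of the meta tags.
    return title, "og:video" in html


def summarize_dumps(directory="."):
    """Map each *.dump file name in directory to its summary."""
    return {
        path.name: summarize_dump(path)
        for path in sorted(Path(directory).glob("*.dump"))
    }
```

A dump whose summary has no `og:video` would be consistent with the "Unable to extract video url" error in the trace.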

dstftw closed this May 31, 2020
dstftw added the invalid label May 31, 2020
Kikobeats (Author) commented May 31, 2020

@dstftw the error is the same, and it doesn't generate any debug file.

BTW, why is the issue considered invalid?
