[BUG] Imgur 404 error but link works in browser #869
Comments
I also cannot download anything from Imgur, regardless of file type, even though the links work as intended in the browser. For me, though, the error is (for every download):
I can confirm as well that I can't download anything from Imgur either.
Yes, and navigating to the link in a browser reveals the reason:
The reason you're getting a 401 from that link is the same one I mentioned in #828: you're missing the auth headers needed to access that API link.

As for the rest of the issue, a lot of things are being removed from Imgur right now. It seems they are removed from the API first, and the direct file links sometimes keep working for a while afterwards. You can work around this for direct links with an edit to the download_factory, but I would not advise it long term: any dead link will just pick up the removed image and treat the download as successful. Also, any malformed link provided by the Reddit API can end up downloading the HTML of the 404 page, because the downloader will not see the redirect and will think it's getting the right file. That is the main reason the change to the API was made in the first place.

If you are willing to run with those caveats, or are willing to double-check every download, here is the patch. Change this:

    if re.match(r"(i\.|m\.|o\.)?imgur", sanitised_url):
        return Imgur
    elif re.match(r"(i\.|thumbs\d{1,2}\.|v\d\.)?(redgifs|gifdeliverynetwork)", sanitised_url):
        return Redgifs
    elif re.match(r"(thumbs\.|giant\.)?gfycat\.", sanitised_url):
        return Gfycat
    elif re.match(r".*/.*\.[a-zA-Z34]{3,4}(\?[\w;&=]*)?$", sanitised_url) and not DownloadFactory.is_web_resource(
        sanitised_url
    ):
        return Direct

to this:

    if re.match(r"(i\.|thumbs\d{1,2}\.|v\d\.)?(redgifs|gifdeliverynetwork)", sanitised_url):
        return Redgifs
    elif re.match(r"(thumbs\.|giant\.)?gfycat\.", sanitised_url):
        return Gfycat
    elif re.match(r".*/.*\.[a-zA-Z34]{3,4}(\?[\w;&=]*)?$", sanitised_url) and not DownloadFactory.is_web_resource(
        sanitised_url
    ):
        return Direct
    elif re.match(r"(i\.|m\.|o\.)?imgur", sanitised_url):
        return Imgur

Any gifv links will download as such with that change. If you would like them downloaded as mp4, you can insert two new lines (the if check and the replace) into downloader at line 96:

    try:
        if submission.url.endswith(".gifv"):
            submission.url = submission.url.replace(".gifv", ".mp4")
        downloader_class = DownloadFactory.pull_lever(submission.url)

These edits are provided as-is and I won't be providing additional support for them.
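To see why moving the Imgur check below the Direct check changes the outcome, here is a minimal standalone sketch of first-match dispatch. It reuses only the two regexes from the patch above; the function names `pick_downloader` and `gifv_to_mp4`, the string labels, and the scheme-less form of the example URLs are assumptions made for illustration, and the `is_web_resource` check and the Redgifs/Gfycat rules are left out. It is not bdfr's actual code.

```python
import re

# Patterns copied from the patch; everything else here is hypothetical.
IMGUR = ("Imgur", r"(i\.|m\.|o\.)?imgur")
DIRECT = ("Direct", r".*/.*\.[a-zA-Z34]{3,4}(\?[\w;&=]*)?$")


def pick_downloader(sanitised_url: str, direct_first: bool) -> str:
    """Return the label of the first rule whose pattern matches the URL."""
    rules = [DIRECT, IMGUR] if direct_first else [IMGUR, DIRECT]
    for label, pattern in rules:
        if re.match(pattern, sanitised_url):
            return label
    return "NoMatch"


def gifv_to_mp4(url: str) -> str:
    """Swap a trailing .gifv extension for .mp4, leaving other URLs alone."""
    if url.endswith(".gifv"):
        return url[: -len(".gifv")] + ".mp4"
    return url


# Original order: the Imgur rule wins, so the API-based downloader is used.
print(pick_downloader("i.imgur.com/xxxxxx.gifv", direct_first=False))
# Patched order: the file extension makes the Direct rule win first.
print(pick_downloader("i.imgur.com/xxxxxx.gifv", direct_first=True))
# Links without a file extension still fall through to the Imgur rule.
print(pick_downloader("imgur.com/a/abc123", direct_first=True))
```

Because the first matching rule wins, only links that end in a file extension are rerouted to the direct download path; everything else on Imgur still goes through the API and is still subject to the 404 behaviour described in this issue.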
Oh, I understand now. Some of the submissions were very recent, so I hadn't considered that they could already be removed.
Is there a way to figure out which files need to be double-checked?
bdfr has the --no-dupes option, which is meant to avoid downloading the same image or video twice by comparing hashes. Since the 'removed' image is the same every time, that option catches it: you get one copy of it, and bdfr skips all other posts whose content was removed by Imgur. I'm currently re-downloading my saved posts with this fix and the --no-dupes option, and the log shows "Resource hash d835884373f4d6c8f24742ceabe74946 from submission downloaded elsewhere" messages every now and then, so I'm confident it's working.
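The idea behind hash-based de-duplication can be sketched roughly as follows. This is only an illustration, not bdfr's implementation: the function name `filter_duplicates` and the in-memory byte strings are hypothetical, and the use of MD5 is an assumption based on the 32-hex-digit hash shown in the log message.

```python
import hashlib


def filter_duplicates(resources):
    """Return the names of resources whose content hash was not seen before.

    `resources` is an iterable of (name, content_bytes) pairs.
    MD5 is assumed here only because the logged hash has 32 hex digits.
    """
    seen = set()
    kept = []
    for name, content in resources:
        digest = hashlib.md5(content).hexdigest()
        if digest in seen:
            # Same bytes as an earlier download, e.g. the 'removed' image.
            continue
        seen.add(digest)
        kept.append(name)
    return kept


placeholder = b"bytes of the identical 'removed' image"
downloads = [
    ("post_a", b"real image A"),
    ("post_b", placeholder),
    ("post_c", placeholder),
    ("post_d", b"real image D"),
]
print(filter_duplicates(downloads))
```

Note that the first copy of the placeholder is still kept, which matches the behaviour described above: one 'removed' image ends up on disk and the rest are skipped.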
Plus, the images are all exactly the same (absurdly low) size, so it's easy to locate them all with a tool like find.
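If you would rather not look up the exact size first, a small script can group files by size and list the sizes that several files share. This is a rough sketch under stated assumptions: the function name `files_sharing_size` and the `min_count` parameter are hypothetical, and a shared size is only a hint that files are identical, not proof.

```python
from collections import defaultdict
from pathlib import Path


def files_sharing_size(root, min_count=2):
    """Map each file size (in bytes) to the files of that size under `root`.

    Only sizes shared by at least `min_count` files are returned.
    """
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)
    return {
        size: sorted(paths)
        for size, paths in by_size.items()
        if len(paths) >= min_count
    }


if __name__ == "__main__":
    # Scan the current directory; point this at your download folder instead.
    for size, paths in files_sharing_size(".").items():
        print(size, [str(p) for p in paths])
```

A size that shows up for many otherwise unrelated posts is a good candidate for the placeholder image; the files in that group are the ones worth double-checking.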
Description
Imgur links keep returning a 404 error even though they work in my browser. An Imgur link such as https://i.imgur.com/xxxxxx.gifv opens in my browser, while https://i.imgur.com/xxxxxx without the .gifv extension loads a 404 page. The two links that return 404 in the log I provided work fine in my browser when I use the i.imgur.com link that ends with the .gifv extension.
Command
Environment
Logs