Reddit: (Link domain support) Some links in comments are not handled correctly causing them to not be downloaded #90

Open
reasonabledoubt opened this issue Oct 6, 2017 · 6 comments

Comments

@reasonabledoubt

Expected Behavior

The link within the comment should be parsed normally and downloaded.

Actual Behavior

Examples, including full command output:

$ java -jar ripme-1.5.7.jar --url https://www.reddit.com/r/legendarylootz/comments/74ftqb/when_you_cum_so_hard_your_nips_show_it_gif/dnxzojk/
Loaded file:[redacted]/ripme-1.5.7.jar!/rip.properties
Loaded log4j.properties
Initialized ripme v1.5.7
[+] Creating directory: ./rips/reddit_post_74ftqb
Retrieving https://gfycat.com/DevotedWhimsicalBovine
    Downloading file: https://giant.gfycat.com/DevotedWhimsicalBovine.mp4 Retry #1
[!] Unable to rip URL: https://www.erome.com/a/rv4Vu2et),
[+] Saved https://giant.gfycat.com/DevotedWhimsicalBovine.mp4 as ./rips/reddit_post_74ftqb/74ftqb-DevotedWhimsicalBovine.mp4

and

$ java -jar ripme-1.5.7.jar --url https://www.reddit.com/r/AnalGW/comments/74c4i2/anal_makes_me_moan_f/dnxm2rb/
Loaded file:[redacted]/ripme-1.5.7.jar!/rip.properties
Loaded log4j.properties
Initialized ripme v1.5.7
[+] Creating directory: ./rips/reddit_post_74c4i2
[!] Unable to rip URL: https://i.imgur.com/EBT9If9.gifv
    Downloading file: https://i.imgur.com/sdaOjbk.jpg Retry #1
[+] Saved https://i.imgur.com/sdaOjbk.jpg as ./rips/reddit_post_74c4i2/74c4i2-sdaOjbk.jpg

This is on a completely new test setup with no previous downloads or configuration; everything is at default settings.

In both of these examples the parent post's content is downloaded (that's fine), but the content linked in the comment is not. When I saw the trailing paren in the first one, I figured the URL wasn't being escaped or parsed properly, so I looked for a second example that also used markdown formatting. In that second example, though, the imgur .gifv URL looks correctly formatted in the error output and still fails, so I'm not sure, and that's why I included both. The more I review this, the more I think I may have stumbled on two separate issues at the same time, but I'm not entirely sure at this point.

Also, I should probably mention that I normally use this as part of an entire user download run from a cronjob, not on a single comment. I noticed these failures in my logs and tailored my examples so they relate only to the issue and are hopefully easier to reproduce.

@rautamiekka
Contributor

As far as getting 1 MP4 and 1 JPG went it worked fine:

Downloading https://giant.gfycat.com/DevotedWhimsicalBovine.mp4
Downloaded .\rips\reddit_post_74ftqb\74ftqb-DevotedWhimsicalBovine.mp4
Rip complete, saved to G:\_DOWNLOADS_\RipMe\rips\reddit_post_74ftqb
Downloading https://i.imgur.com/sdaOjbk.jpg
Downloaded .\rips\reddit_post_74c4i2\74c4i2-sdaOjbk.jpg
Rip complete, saved to G:\_DOWNLOADS_\RipMe\rips\reddit_post_74c4i2

Window$ 7 Ultimate SP1 x64
Oracle Java SE 8 Update 144 x64
RipMe 1.5.7

@reasonabledoubt
Author

@rautamiekka, the intended downloads are https://www.erome.com/a/rv4Vu2et and https://i.imgur.com/EBT9If9.gifv which are the files linked in the reddit comments, the same ones preceded by [!] Unable to rip URL: in my code pastes.

You've downloaded https://giant.gfycat.com/DevotedWhimsicalBovine.mp4 and https://i.imgur.com/sdaOjbk.jpg which are clearly not them (they're the content of the parent reddit posts, not the comments).

@rautamiekka
Contributor

^ In which case it's a bug, possibly a 'website changed' one, not fully sure.

@cyian-1756
Collaborator

Does this happen with links to other sites, or just erome?

@reasonabledoubt
Author

@cyian-1756, I reviewed my logs and yes, it happens with lots of other sites. As of two days ago there were 71 instances where an [!] Unable to rip URL: error line ended with a close paren, and yesterday's (my first example above) made it 72. That one was the only erome link to fail that way.

Then for .gifv links, there are 211 of them following [!] Unable to rip URL:, and none have the close paren. So it looks more and more to me like two separate issues.

For the close paren, I think ripme gets the comment from reddit as JSON, which would include the raw markdown formatting. Markdown links are written as [link text](URL), and that is where I think the close paren is coming from.
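To illustrate the trailing-paren theory, here is a minimal Java sketch of one way to clean URLs pulled out of a markdown comment body. It is not ripme's actual code: the class name `MarkdownLinkSanitizer`, both method names, and the sample comment text are made up for illustration, and the whitespace-delimited URL regex is an assumption about how the links might be found. The idea is to strip trailing punctuation and any closing paren that has no matching opening paren inside the URL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ripme.
class MarkdownLinkSanitizer {
    // Assumption: candidate URLs run from "http(s)://" to the next whitespace.
    private static final Pattern URL_PATTERN = Pattern.compile("https?://\\S+");

    // Counts occurrences of ch in s[0, end).
    private static int count(String s, char ch, int end) {
        int n = 0;
        for (int i = 0; i < end; i++) {
            if (s.charAt(i) == ch) {
                n++;
            }
        }
        return n;
    }

    // Removes trailing punctuation and unbalanced ')' left over from
    // markdown such as "[link text](URL),". Balanced parens are kept.
    static String stripTrailingJunk(String url) {
        int end = url.length();
        while (end > 0) {
            char c = url.charAt(end - 1);
            boolean punctuation = c == ',' || c == '.' || c == ';'
                    || c == ':' || c == '!' || c == '?';
            boolean unbalancedParen = c == ')'
                    && count(url, '(', end) < count(url, ')', end);
            if (punctuation || unbalancedParen) {
                end--;
            } else {
                break;
            }
        }
        return url.substring(0, end);
    }

    // Returns every cleaned URL found in a comment body, in order.
    static List<String> extractUrls(String body) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL_PATTERN.matcher(body);
        while (m.find()) {
            urls.add(stripTrailingJunk(m.group()));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Made-up comment text; only the URL comes from the log above.
        String body = "Full album [here](https://www.erome.com/a/rv4Vu2et), enjoy";
        System.out.println(extractUrls(body));
    }
}
```

This sketch does not handle two markdown links written back to back with no whitespace between them; a real fix would more likely parse the `[text](URL)` form directly.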

For imgur's gifv, I'd imagine it's because it doesn't match the "Direct link to image" pattern in RipUtils.java.
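If that guess is right, one possible workaround is to rewrite the .gifv URL into a direct media URL before it reaches the direct-link check. The sketch below is only that: `GifvRewriter` and `toDirectVideoUrl` are hypothetical names, not anything in RipUtils.java, and it assumes imgur serves an .mp4 at the same path as the .gifv page, which would need to be verified.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ripme.
class GifvRewriter {
    // Matches i.imgur.com URLs that end in ".gifv"; group 1 is everything
    // before the extension.
    private static final Pattern IMGUR_GIFV = Pattern.compile(
            "^(https?://i\\.imgur\\.com/[A-Za-z0-9]+)\\.gifv$",
            Pattern.CASE_INSENSITIVE);

    // Assumption: the video behind a .gifv page is available as .mp4
    // at the same path. Any other URL is returned unchanged.
    static String toDirectVideoUrl(String url) {
        Matcher m = IMGUR_GIFV.matcher(url);
        return m.matches() ? m.group(1) + ".mp4" : url;
    }

    public static void main(String[] args) {
        System.out.println(toDirectVideoUrl("https://i.imgur.com/EBT9If9.gifv"));
        System.out.println(toDirectVideoUrl("https://i.imgur.com/sdaOjbk.jpg"));
    }
}
```

Rewriting the URL up front keeps the existing direct-link matching untouched; the alternative would be to widen that pattern to accept .gifv, which on its own may just save an HTML page instead of the video.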

@bvcyt

bvcyt commented Oct 18, 2017

@metaprime changed the title from "Reddit: Some links in comments are not handled correctly causing them to not be downloaded" to "Reddit: (Link domain support) Some links in comments are not handled correctly causing them to not be downloaded" on Nov 14, 2017