Ability to Disable Downloading Images Older Than Last Rip #18

Closed
Davrial opened this issue Jul 10, 2017 · 8 comments

Davrial commented Jul 10, 2017

I have downloaded a lot of albums at this point, and I frequently use the "Re-Rip Checked" option to download new content in those albums. The problem is, if there are any images in the albums that I have deleted (mostly filler images, thumbnail versions, etc.), they all get re-downloaded when I re-rip.

So my suggestion (slash question as to whether it's even possible) is what the title says: would it be possible to add a check/toggle option to disable downloading files older than the last rip, specifically when re-ripping albums? That might even make re-ripping faster, since it would skip images older than the set date/time and so wouldn't have to check whether each file has already been downloaded.

@rautamiekka (Contributor)

Unless someone has a better idea, that would require a file containing the links to previously downloaded files. For example, https://github.com/rg3/youtube-dl/ can use a file that lists the video IDs of already-fetched videos after you download them, which is far easier to implement and use than doing the same for a general picture ripper.

The JSON format I can imagine right now that would enable matching multiple items from a single address:

{
    "ORIGINAL LINK SUCH AS FOLDER OR ALBUM": {
        "ITEM IN FOLDER OR ALBUM": [
            "DOWNLOAD LINK USED",
            ...
        ], ...
    }, ...
}

For example:

{
    "http://rautamiekka.deviantart.com/gallery/?catpath=/": {
        "http://rautamiekka.deviantart.com/art/Princess-Luna-BIG-COLLAB-colored-0-672156533": [
            "http://img09.deviantart.net/ed89/i/2016/335/4/2/_sleeping_on_the_job___finnish_by_rautamiekka-daq4v9l.png"
        ]
    }, ...
}

That allows very fine-grained matching of already-downloaded content (by folder/album address, item address within the folder/album, the download link used, and the filename used), assuming the same format works for everything.

Otherwise we'll need ripper-specific formats built into each ripper, plus a unified API to use them.
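
A minimal sketch of how such a history file could be consulted before a download, assuming an org.json-style JSON library is available; the library choice, the RipHistory class name, and its methods are all hypothetical and not part of ripme:

import org.json.JSONArray;
import org.json.JSONObject;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper that answers "was this download link already fetched?"
// against the layout proposed above:
// { album URL: { item URL: [ download link, ... ], ... }, ... }
public class RipHistory {
    private final JSONObject history;

    public RipHistory(Path historyFile) throws IOException {
        history = Files.exists(historyFile)
                ? new JSONObject(new String(Files.readAllBytes(historyFile), StandardCharsets.UTF_8))
                : new JSONObject();
    }

    // True if downloadLink is already recorded anywhere under albumUrl.
    public boolean alreadyDownloaded(String albumUrl, String downloadLink) {
        JSONObject album = history.optJSONObject(albumUrl);
        if (album == null) {
            return false;                       // album never ripped before
        }
        for (String itemUrl : album.keySet()) { // each item in the folder/album
            JSONArray links = album.getJSONArray(itemUrl);
            for (int i = 0; i < links.length(); i++) {
                if (downloadLink.equals(links.getString(i))) {
                    return true;                // already downloaded, skip it
                }
            }
        }
        return false;
    }
}

A ripper could then call alreadyDownloaded(...) for each candidate link during a re-rip and skip the HTTP request entirely when it returns true.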

ghost commented Jul 11, 2017

Having this addition to the app would drastically reduce download times.

@rautamiekka (Contributor)

^ Re-download times, mind you.

Davrial (Author) commented Jul 12, 2017

^ *Re-RIP times, mind you. It would be explicitly preventing re-downloads.

Also, as for your overspecialization of folder/file details, it wouldn't need to be nearly that complicated. You would just have the ripper record the last time it ran a rip and then compare that against the metadata on the site, which shows each item's age/timestamp. It would be extra easy with sites like Tumblr, where all images are stored under a string name, because then the ripper could just be told to only rip images with a newer string than the most recent image currently saved/ripped for each specific URL/folder.
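
As a rough sketch of that timestamp idea (all names here are hypothetical, not ripme's actual code): persist the time of the last rip per album URL, and have each ripper skip any item whose site-reported timestamp is older.

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.util.Properties;

// Hypothetical per-album "last rip time" store backed by a properties file.
public class LastRipTimes {
    private final Properties times = new Properties();
    private final Path file;

    public LastRipTimes(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                times.load(in);
            }
        }
    }

    // Should this item be skipped because it predates the last rip of albumUrl?
    public boolean shouldSkip(String albumUrl, Instant itemTimestamp) {
        String last = times.getProperty(albumUrl);
        return last != null && itemTimestamp.isBefore(Instant.parse(last));
    }

    // Record that albumUrl was ripped just now, then persist the store.
    public void markRipped(String albumUrl) throws IOException {
        times.setProperty(albumUrl, Instant.now().toString());
        try (OutputStream out = Files.newOutputStream(file)) {
            times.store(out, "last rip time per album URL");
        }
    }
}

The catch is that each ripper still needs a reliable timestamp (or a monotonically increasing ID, as in the Tumblr filename case) from the site, which is exactly the ripper-specific part mentioned above.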

@metaprime (Contributor)

@rautamiekka That's exactly the solution I imagined. This request has been opened and discussed multiple times in the past, and at least one of those issues is still open (and probably aggregates links to the duplicates as well).

ghost commented Jul 15, 2017

I support this enhancement as well. I opened an issue for deduplication a week or two ago, but it got closed since "ripme doesn't have deduplication by design".

metaprime (Contributor) commented Aug 12, 2017

@ANonCoder123 This is a distinct issue from deduplication. This is about not downloading URLs that have already been downloaded. It would mostly solve the deduplication problem as well (except for duplicates downloaded from different URLs).

@cyian-1756 (Collaborator)

This was added; to use it, click the "Remember URL history" button in the config menu.
