[Feature Request] Output Dir, and check if file is not already there before extracting. #10

Closed
oursours opened this issue Mar 31, 2013 · 30 comments

Comments

@oursours

Hi,

First of all, thanks for the great work! :)

This script is really nice and works perfectly for me.

I'm just wondering if there is a way to specify an output dir? (I read the docs carefully, but it seems not...) It would be really nice if we could, and if the script also checked whether the file is already extracted in the output directory.

Regards.

@arfoll
Owner

arfoll commented Apr 1, 2013

Correct, we don't support that. Since it's been asked for quite a lot, we probably should. The reason I haven't done it is that I can't find a way that suits me. The only way I can think of is to extract to a randomly generated folder in /tmp, clean the results there, and then copy them to the output dir. I guess we could get the file list from the rar file and then check whether those files exist in the output dir.

  • Should we check the md5sums before assuming the output dir contains the files? Should we overwrite them or just spit out a warning?
  • Should we check that /tmp has enough space before extracting? Since /tmp is usually tmpfs on my machines (RAM donations are welcome!), running out would be annoying...
  • Also, this will be a little slower than in-place extraction (or a lot slower if you don't use tmpfs, or have a lot of swap and little RAM). Are you OK with that? We could skip the cleaning when using an output dir...

What do you think?

@oursours
Author

oursours commented Apr 2, 2013

Hey! Thanks for your quick reply!

What kind of RAM do you need? I actually have some I'm not using... I'd be happy to send it to you if it's the kind you need...

Regarding the script:

- A friend of mine actually wrote a small script for my needs. You can find it here:

http://pastebin.com/UWCNLrqb

As you can see, he just checks the names of the files inside the archive before extracting it, and he writes the name of each extracted file to a txt file to avoid extracting it multiple times.

I guess you could implement that in unrarall with options for the input and output dirs, which would give:

unrarall -input /my/rar/files -output /my/unrared/files

That would just be an option added to unrarall... I guess you could do the same with a tmp dir option if lots of people asked for it!

To answer your questions:

- I guess we shouldn't check the md5sums beforehand... I guess most people use this kind of script for automation, so overwriting or spitting out a warning is not a good idea either.
- I would actually create a tmp dir in the output dir, assuming that if the user is extracting there, it's because it is supposed to have enough space. So the order would be: 1. check whether the files already exist; 2. create a tmp dir; 3. unpack the files, with or without all the wonderful options of unrarall; 4. move the files to the output dir; 5. delete the tmp dir.
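
A minimal sketch of that five-step order, assuming bash. The function name `extract_via_staging`, its arguments, and the `.unrarall-tmp` prefix are hypothetical and are not taken from unrarall; the extractor is passed in as a command so the same flow could wrap unrar or 7zip.

```bash
#!/usr/bin/env bash
# Sketch only: not unrarall's actual code. All names here are hypothetical.
#
# usage: extract_via_staging OUTPUT_DIR EXPECTED_FILE EXTRACT_CMD [ARGS...]
# EXTRACT_CMD is run with the staging directory (with a trailing slash)
# appended as its last argument.
extract_via_staging() {
    local outdir=$1 expected=$2
    shift 2

    # 1. Skip if the file already exists in the output directory.
    if [ -e "$outdir/$expected" ]; then
        echo "skip: $expected already present in $outdir" >&2
        return 0
    fi

    # 2. Create a temporary directory inside the output directory, so the
    #    extraction uses the same filesystem (and free space) as the target.
    mkdir -p "$outdir" || return 1
    local staging
    staging=$(mktemp -d "$outdir/.unrarall-tmp.XXXXXX") || return 1

    # 3. Unpack into the staging directory.
    if ! "$@" "$staging/"; then
        rm -rf "$staging"
        return 1
    fi

    # 4. Move the results into place (this glob does not match dotfiles).
    local status=0
    mv "$staging"/* "$outdir"/ || status=1

    # 5. Delete the staging directory.
    rm -rf "$staging"
    return "$status"
}
```

With unrar this might be called as `extract_via_staging ~/unrared movie.mkv unrar x -inul archive.rar`, where `movie.mkv` stands for a file name read from the archive listing.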

Sorry if my explanations are not good; English is not my native language.

Regards.

@arfoll
Owner

arfoll commented Apr 2, 2013

Using a tmp dir in the output dir makes much more sense than going through /tmp.

I don't like the txt file idea; I'd rather just check dynamically. unrar & 7zip seem very fast at spitting the information out anyway.
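
A rough sketch of such a dynamic check, assuming bash. `all_present` is a hypothetical helper, not part of unrarall. It reads one file name per line on stdin, which could come from `unrar lb archive.rar` (the bare listing), and succeeds only if every name exists under the output dir.

```bash
#!/usr/bin/env bash
# Sketch only: hypothetical helper, not unrarall's actual code.
#
# usage: unrar lb archive.rar | all_present OUTPUT_DIR
# Returns 0 if every listed name exists under OUTPUT_DIR, 1 otherwise.
# An empty listing also returns 0, so callers may want to treat that case
# separately.
all_present() {
    local outdir=$1 name
    while IFS= read -r name; do
        [ -n "$name" ] || continue            # ignore blank lines
        [ -e "$outdir/$name" ] || return 1    # first missing name fails
    done
    return 0
}
```

This only compares names, so it cannot tell a complete file from a truncated one left behind by an interrupted run.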

Anyway, I'll have a bash at it when I get a minute; I'm travelling for work atm. Or you can give it a go if you like ;-)

English isn't my native language either ;-)

Don't worry about the RAM. Since I moved to tmpfs and removed swap (I mostly use SSDs for non-data stuff), I've had a few surprises doing things I used to do in /tmp, especially on machines with 1GB...

@delcypher
Collaborator

@arfoll When did we start doing MD5 sums? I thought cksfv was just checking CRC32?

I like the sound of the output dir option. Would the intention be to recreate the directory structure found in the root (where unrarall starts its search) inside the output directory? For example, let's say we have

~/mydownloads/Black.Dynamite/blk.dyn.rar
~/mydownloads/Superbad/superbad.rar
...

and we run

$ unrarall --clean=all ~/mydownloads/ ~/outputdir/

do we get... (this makes more sense to me)

~/outputdir/Black.Dynamite/blk.dyn.mkv
~/outputdir/Superbad/superbad.mkv

or do we get...

~/outputdir/blk.dyn.mkv
~/outputdir/superbad.mkv

?
We'd also have to think about what the clean-up hooks do. The rar hook needs to be applied to the source directory but the other hooks need to be applied to the output directory.
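
The two layouts in question can be sketched as two small path helpers, assuming bash. `mirrored_dest` and `flat_dest` are hypothetical names for illustration and say nothing about how unrarall computes paths.

```bash
#!/usr/bin/env bash
# Sketch only: hypothetical helpers, not unrarall's actual code.
#
# usage: mirrored_dest ROOT OUTPUT_DIR ARCHIVE
# Prints the directory to extract into when the layout found under ROOT is
# recreated inside OUTPUT_DIR.
mirrored_dest() {
    local root=${1%/} outdir=${2%/} archive=$3
    local rel=${archive#"$root"/}    # archive path relative to ROOT
    local reldir
    reldir=$(dirname -- "$rel")
    if [ "$reldir" = "." ]; then
        printf '%s\n' "$outdir"      # archive sits directly in ROOT
    else
        printf '%s\n' "$outdir/$reldir"
    fi
}

# usage: flat_dest ROOT OUTPUT_DIR ARCHIVE
# Prints OUTPUT_DIR itself: everything lands in one directory.
flat_dest() {
    printf '%s\n' "${2%/}"
}
```

For `~/mydownloads/Black.Dynamite/blk.dyn.rar`, the first helper gives `~/outputdir/Black.Dynamite` and the second gives `~/outputdir`.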

@oursours Checking if the file is already extracted could be unreliable. For example, if unrarall was killed during extraction, files with the correct name could be left behind but their contents would not be correct. If someone then ran unrarall again and we simply checked that the correct filename exists, we would skip extraction even though the file(s) were not correctly extracted previously. Do we think that is acceptable? I suppose we could show a warning when we skip over files.

@arfoll
Owner

arfoll commented Apr 2, 2013

@delcypher I think I explained badly: the md5sums would be there exactly to fix the problem you raised with oursours... sfv files just check the integrity of the archive files, not of the files they contain. By checking the md5sum of the files written against the files in the archive (not entirely sure how), we could fix the issue.

Recreating the top level dir would be nicer, I agree. As for the cleanup hooks, I guess we need source & output cleanups, but I suggest we don't run the source cleanups in this case, as I'm guessing the reason to do this is mostly to keep the source as it came.
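
One possible way to do that comparison, sketched in bash and not taken from unrarall: stream the archived file to stdout and compare its md5 with that of the file on disk. `same_md5` is a hypothetical helper. It assumes `md5sum` from coreutils is available and takes the streaming command as arguments; with unrar that could be `unrar p -inul archive.rar name`, which prints the file to stdout without messages.

```bash
#!/usr/bin/env bash
# Sketch only: hypothetical helper, not unrarall's actual code.
#
# usage: same_md5 FILE CMD [ARGS...]
# Returns 0 if md5(FILE) equals the md5 of CMD's stdout, 1 if they differ,
# and 2 if FILE is missing or CMD fails.
same_md5() {
    local file=$1
    shift
    [ -f "$file" ] || return 2

    local on_disk reference
    on_disk=$(md5sum < "$file" | cut -d' ' -f1)
    # pipefail is set only inside the command substitution's subshell, so a
    # failing CMD is reported instead of hashing its empty output.
    reference=$(set -o pipefail; "$@" | md5sum | cut -d' ' -f1) || return 2

    [ "$on_disk" = "$reference" ]
}
```

This decompresses the file a second time, so it costs far more than a name check.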

@strider2112

Has anyone ever come back to this?

I would definitely like to see an output folder. That way I can run for example:
unrarall --clean=rar ~/Downloads ~/Downloads/Extracted

twice (back-to-back) and cover all the rar-in-rars as well, and then have a somewhat organized "Extracted" folder for easy rectifying later.

As per delcypher's comment about whether you get "outputdir/Superbad/Superbad.mkv" or "outputdir/Superbad.mkv", isn't that the difference the --full-path switch makes? Or does --full-path do this:
unrarall finds SB.rar
outputdir/SB/Superbad.mkv

Again, I would really, really, want an output directory capability.

@delcypher
Collaborator

@strider2112 --full-path just passes x to the unrar command which means the full path inside the archive file is used. My point was whether you want to recreate the directory structure that the rar file was found in (by unrarall) inside the output directory or whether you would want all the found rar files to be extracted into the same output directory (this might result in a mess).
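
To illustrate the distinction, a tiny bash sketch; `unrar_command` and its flag argument are hypothetical and not unrarall's code. unrar's `x` command extracts with the paths stored inside the archive, while `e` extracts every file into one directory and ignores those paths. That `e` is used when --full-path is absent is an assumption here.

```bash
#!/usr/bin/env bash
# Sketch only: hypothetical helper, not unrarall's actual code.
#
# usage: unrar_command FULL_PATH   (FULL_PATH is 1 or 0)
# Prints the unrar command letter to use.
unrar_command() {
    local full_path=$1
    if [ "$full_path" = "1" ]; then
        echo x    # keep the directory structure stored inside the archive
    else
        echo e    # flatten: ignore paths stored inside the archive
    fi
}
```

Either way this only concerns paths inside the archive. Where the archive was found on disk is a separate question.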

If you're happy to have everything extracted into the same output directory then this should be easy to implement.

@strider2112

Hmmm. It already makes a bit of a mess of my Downloads folder. This would just help automate the cleaning process.

As far as I know, every extraction program offers some sort of "destination folder" option.

As for the original question, I would rather not recreate the folder that the rar was in. For me this would end up with:
/Downloads/Downloads/Big Bang Theory.mkv
Downloads/Downloads/Super bad.mkv

Etc., because every rar I have is in "Downloads". If that makes sense, this option would nastify things.

@delcypher
Collaborator

@strider2112 I've implemented this feature (--output flag) in a feature_outputdir branch. Could you please test it?

@arfoll Could you check this doesn't break anything for you?

@strider2112

I would be happy to test it. I just gotta get it onto my server, this will take a couple hours (I don't have remote access right now)

Just to clarify, with the new --output flag, is the proper syntax like so?
unrarall --clean=all ~/Downloads/ --output ~/Downloads/Extracted
or like this
unrarall --clean=all --output=~/Downloads/Extracted ~/Downloads
?

@delcypher
Collaborator

The options always come before the mandatory directory argument so

$ unrarall --clean=all --output ~/Downloads/Extracted ~/Downloads

is correct.

The output flag doesn't use the = like --clean= does. Maybe I should fix that?
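
A minimal sketch of a parser with those two conventions, assuming bash. It is not unrarall's real option handling: `parse_args` and the variable names are hypothetical, and the comma-separated form of the `--clean=` list is an assumption.

```bash
#!/usr/bin/env bash
# Sketch only: hypothetical parser, not unrarall's actual code.
#
# usage: parse_args [--clean=HOOK[,HOOK...]] [--output DIR] SEARCH_DIR
# Sets CLEAN_HOOKS (array), OUTPUT_DIR and SEARCH_DIR; returns 1 on bad input.
parse_args() {
    CLEAN_HOOKS=()
    OUTPUT_DIR=""
    SEARCH_DIR=""
    while [ "$#" -gt 0 ]; do
        case $1 in
            --clean=*)
                # The value is attached with "="; split it on commas.
                IFS=, read -r -a CLEAN_HOOKS <<< "${1#--clean=}"
                ;;
            --output)
                # The value is the next word, with no "=".
                [ "$#" -ge 2 ] || { echo "--output needs a directory" >&2; return 1; }
                OUTPUT_DIR=$2
                shift
                ;;
            -*)
                echo "unknown option: $1" >&2
                return 1
                ;;
            *)
                # The mandatory directory must be the last argument.
                [ "$#" -eq 1 ] || { echo "options must come before the directory" >&2; return 1; }
                SEARCH_DIR=$1
                ;;
        esac
        shift
    done
    [ -n "$SEARCH_DIR" ] || { echo "missing directory argument" >&2; return 1; }
}
```

Under these rules `--clean=all --output ~/Downloads/Extracted ~/Downloads` is accepted, and putting `--output` after the directory is rejected.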

@strider2112

It doesn't matter for right now, I just got the code onto my server (I got access to my ssh tunnel), I am going to test it now, I just need a rar file...
EDIT: And it appears to work exactly as expected, I'm not sure about cleaning as I just quickly rar'd nohup.out and moved the rar to my client where I ran the script. It functioned as expected, decrypting "Test.rar" and moving "nohup.out" to my Extracted/ folder.

Although it makes more sense to use the = sign like in --clean=, I actually prefer it not to. But for the sake of anyone else using the code, it might make more sense.

I will test it soon on a more complicated rar. A set of 6 parts, encrypted, and probably each filled with 20-30 rar parts. We'll see how she does...

@delcypher
Collaborator

@strider2112 Thanks for testing. I think I might keep the = because --clean= takes a list of strings whereas --output just takes one. @arfoll what do you think?

@arfoll
Owner

arfoll commented Nov 14, 2014

@delcypher I agree with you 100%.

I think the last missing 'feature' is for a cleanup hook to wipe the empty dir if clean=all made the dir with rar files empty.

@strider2112

@delcypher Do you mean keep it without the =? That would work best for me. The reason is selfish, it is because I use my mobile phone most of the time, and to access the "=" sign, I need to go through a couple menus.

Either way, I'm happy with the functionality. Good news boys! I tested the script with two rar sets:
One rar set was a set of 6 rars
One rar set was a single rar
Both of them had multiple rars inside
The script successfully cleaned and moved them to where they needed to be! Seems like it worked.

@delcypher
Collaborator

@strider2112 I meant keep --output as it is (without the =). I've now merged this functionality into the master branch.

@delcypher
Collaborator

@arfoll I'm not sure if that's a desirable feature.

If you do

$ unrarall --clean=all --output ~/mystuff/ .

you might not want the current working directory removed

Similarly if you do something like

$ unrarall --clean=all --output ~/mystuff/ ~/Downloads

you probably don't want your Downloads folder removed.

Or do you mean remove empty directories inside the specified directory? For the ~/Downloads/ example that would remove empty directories inside ~/Downloads but not ~/Downloads itself.

@strider2112

I don't think housecleaning should be your responsibility. The --clean feature is great because for me it automates the removal process for .rar files and .nfo files, etc. But to have it go too deep into removing directories is kind of scary.

What if I have my Extracted folder sub-directoried so that I have a few "buffer" directories like this:
~/Downloads/Extracted/Movies
~/Downloads/Extracted/Music

I wouldn't want the script to delete Movies/ and Music/ every time.
Deleting the CWD is bad practice I think, what if a glitch or error in the coding ignores the fact that the CWD should be empty... often I have the scripts set to run from
$:~

@arfoll
Owner

arfoll commented Nov 14, 2014

@strider2112 housecleaning is exactly what unrarall is for ;-)

@delcypher so if you have the dir structure:
~/Downloads/Monkeys.The.Rats/001.rar
and I run unrarall with an output dir, I'd expect to see Monkeys.The.Rats/ disappear. Now, if there is another file in there that is not cleaned up by a rule, then it wouldn't remove the dir.

Anyways, just a thought, it seems to me like something logical that most people would want in this case. But doesn't hinder your patch going to master. It works ok for me.

@strider2112

@delcypher so is there a new Master copy of the script that is ready for me to download? I will be sure to test it for bugs.

I'm new to GitHub, is there any way that I can set up so that it emails me if a new release is done? (ie. if you do decide to add further cleaning functionality).

@arfoll I didn't know that you originally wrote unrarall for housecleaning purposes. My biggest reason to use it is that it is quick, easy, and can handle all my archives. But now that I think about it - housecleaning makes perfect sense. I'm sure whatever gets done will be logical.

@strider2112

Quick question regarding functionality of this script, if I were to modify this:
declare -x UNRARALL_OUTPUT_DIR=""
to read
declare -x UNRARALL_OUTPUT_DIR="~/Downloads/Extracted/"
would I not need to declare --output every time?
Would I still need to use the --output switch, just without the path behind it?

I'm kind of new to bash programming, but this makes sense to me.

@delcypher
Collaborator

> @delcypher so is there a new Master copy of the script that is ready for me to download? I will be sure to test it for bugs.

I've pushed the code into the master branch of this repository. It's mostly the same as what I had in the feature_outputdir branch, except that I cleaned up the --help output.

> I'm new to GitHub, is there any way that I can set up so that it emails me if a new release is done? (ie. if you do decide to add further cleaning functionality).

I think "watching" the repository will give you this information, but I think you will get notified about everything that happens in this repository. Take a look at the GitHub docs on this.

> Quick question regarding functionality of this script, if I were to modify this:
> declare -x UNRARALL_OUTPUT_DIR=""
> to read
> declare -x UNRARALL_OUTPUT_DIR="~/Downloads/Extracted/"
> would I not need to declare --output every time?

That is correct.
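
One caveat with that assignment, shown as a small bash sketch: the shell does not expand `~` inside double quotes, so the variable would hold a literal `~`. Whether unrarall expands it later is not stated in this thread, so writing `$HOME` is the safer form. `tilde_demo` is a hypothetical name used only for this illustration.

```bash
#!/usr/bin/env bash
# Sketch only: demonstrates shell quoting, not unrarall's actual code.
#
# Prints the value each form of the assignment produces, one per line.
tilde_demo() {
    local quoted="~/Downloads/Extracted/"        # tilde stays a literal "~"
    local expanded="$HOME/Downloads/Extracted/"  # $HOME expands inside quotes
    printf '%s\n%s\n' "$quoted" "$expanded"
}
```

So `declare -x UNRARALL_OUTPUT_DIR="$HOME/Downloads/Extracted/"` avoids depending on any later tilde handling.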

@delcypher
Collaborator

@arfoll It looks like writing a clean up hook to remove empty directories that doesn't delete the root is trivially easy with find. Do you want the hook to apply to the output directory or the directory that contains the rar file(s), or both?
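
A sketch of what such a find call could look like, assuming GNU or BSD find (`-mindepth`, `-empty` and `-delete` are extensions beyond POSIX). `remove_empty_subdirs` is a hypothetical name and this is not necessarily the hook that ended up in unrarall.

```bash
#!/usr/bin/env bash
# Sketch only: hypothetical helper, not unrarall's actual code.
#
# usage: remove_empty_subdirs ROOT
# Deletes empty directories below ROOT but never ROOT itself.
remove_empty_subdirs() {
    local root=$1
    # -mindepth 1 keeps ROOT out of the match set.
    # -depth visits children before parents, so a directory that only
    # contained empty directories is empty by the time it is tested.
    find "$root" -mindepth 1 -depth -type d -empty -delete
}
```

A directory that still holds any file, for example one that no cleanup rule removed, is left alone.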

@arfoll
Owner

arfoll commented Nov 14, 2014

@delcypher trivially easy? Wow sounds like it should be done STRAIGHT AWAY!

only the directory that contained the rar files. The output dir should not
be empty since cleanup will only run on success. Can you have empty rar
files?

On 14 November 2014 22:29, Dan Liew wrote:

> @arfoll It looks like writing a clean up hook to remove empty directories that doesn't delete the root is trivially easy with find. Do you want the hook to apply to the output directory or the directory that contains the rar file(s), or both?

@delcypher
Collaborator

> trivially easy? Wow sounds like it should be done STRAIGHT AWAY!

I'll give it a go.

> only the directory that contained the rar files. The output dir should not
> be empty since cleanup will only run on success. Can you have empty rar
> files?

facepalm you're right. Empty rar files sound like an edge case not worth caring about.

@delcypher
Collaborator

@arfoll Implemented in f54552f. Please try it out. It seemed to work okay on my simple test.

@delcypher
Collaborator

@arfoll If all's well we should probably bump the version number and make a new release.

@xieem

xieem commented May 1, 2015

@delcypher so did this get implemented? If yes, how? Like the previously mentioned command, $ unrarall --clean=all --output ~/mystuff/ ~/Downloads ?

@delcypher
Collaborator

@xieem I believe it was implemented. Read unrarall --help.

@strider2112

@xieem the output directory was implemented and is in the master file. It works just like how you're asking.

As for this advanced cleanup they last talked about, I'm not sure. I never downloaded an update since then.

@arfoll arfoll closed this as completed Apr 17, 2022