Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Files unrelated to job get removed when post-processing Cleanup List runs #2577

Open
albino1 opened this issue May 31, 2023 · 11 comments
Open

Comments

@albino1
Copy link

albino1 commented May 31, 2023

I wasn't sure the best way to title this, but essentially I noticed that if you have any files in your final download folder or any subfolder that are on your Cleanup List then they get removed when the job finishes running, regardless of whether or not they're related to the actual job. This might be a bug, or it might be just be a consequence of having a Cleanup List at all, but it seems to me that it shouldn't be removing files unrelated to the download.

Quick reproducible example:

  • I set my Cleanup List in SABnzbd to exe, png, nfo, nzb
  • I created a subfolder in my download directory called MyApplication
  • Inside I placed some fake files named fake.exe, fake.png, fake.nfo, fake.nzb, fake.ext (.ext for control)
  • I downloaded a random TV show NZB with a category set and sort turned on (this seems important, it doesn't happen with Default and no sorting)

Result: All of the cleanup list files are gone from the MyApplication subfolder. In fact, if there's only cleanup files in the folder then the whole folder is removed.

Initial structure:

D:\downloads\sab\
D:\downloads\sab\MyApplication\
D:\downloads\sab\MyApplication\fake.exe
D:\downloads\sab\MyApplication\fake.png
D:\downloads\sab\MyApplication\fake.nfo
D:\downloads\sab\MyApplication\fake.nzb
D:\downloads\sab\MyApplication\fake.ext

After download and cleanup completes:

D:\downloads\sab\
D:\downloads\sab\Title.S01E01.720p.WEB-DL.AAC2.0.h264-Group.mkv
D:\downloads\sab\MyApplication\
D:\downloads\sab\MyApplication\fake.ext

Some logging:

2023-05-31 15:52:53,073::DEBUG::[filesystem:919] [sabnzbd\filesystem.py.cleanup_empty_directories] Removing dir \\?\D:\downloads\sab\Title.S01E01.720p.WEB-DL.AAC2.0.h264-Group
2023-05-31 15:52:53,074::DEBUG::[deobfuscate_filenames:88] No additional par2 files found to process
2023-05-31 15:52:53,076::INFO::[postproc:1080] Removing unwanted file \\?\D:\downloads\sab\MyApplication\fake.exe
2023-05-31 15:52:53,076::DEBUG::[filesystem:912] [sabnzbd\postproc.py.cleanup_list] Deleting file \\?\D:\downloads\sab\MyApplication\fake.exe
2023-05-31 15:52:53,077::INFO::[postproc:1080] Removing unwanted file \\?\D:\downloads\sab\MyApplication\fake.nfo
2023-05-31 15:52:53,077::DEBUG::[filesystem:912] [sabnzbd\postproc.py.cleanup_list] Deleting file \\?\D:\downloads\sab\MyApplication\fake.nfo
2023-05-31 15:52:53,077::INFO::[postproc:1080] Removing unwanted file \\?\D:\downloads\sab\MyApplication\fake.nzb
2023-05-31 15:52:53,078::DEBUG::[filesystem:912] [sabnzbd\postproc.py.cleanup_list] Deleting file \\?\D:\downloads\sab\MyApplication\fake.nzb
2023-05-31 15:52:53,078::INFO::[postproc:1080] Removing unwanted file \\?\D:\downloads\sab\MyApplication\fake.png
2023-05-31 15:52:53,079::DEBUG::[filesystem:912] [sabnzbd\postproc.py.cleanup_list] Deleting file \\?\D:\downloads\sab\MyApplication\fake.png
2023-05-31 15:52:53,333::INFO::[nzbstuff:1836] [sabnzbd\postproc.py.process_job] Purging data for job Title.S01E01.720p.WEB-DL.AAC2.0.h264-Group (delete_all_data=True)
2023-05-31 15:52:53,333::DEBUG::[articlecache:152] Purging 4 articles from the cache/disk
2023-05-31 15:52:53,333::DEBUG::[filesystem:945] Removing dir recursively \\?\D:\sab-incomplete\Title.S01E01.720p.WEB-DL.AAC2.0.h264-Group
2023-05-31 15:52:53,335::INFO::[notifier:123] Sending notification: Download Completed - Title.S01E01.720p.WEB-DL.AAC2.0.h264-Group (type=complete, job_cat=tv)

Other potentially relevant info:

SABnzb version 4.0.0 [5e42e25]
Windows 10 Pro
Category is set to TV with the Sort String set to %dn.%ext

@thezoggy
Copy link
Contributor

thezoggy commented Jun 1, 2023

certain things that sab does only affects things that are tracked but yes the cleanup just runs post to cleanup anything left over in the completed dir ( https://github.com/sabnzbd/sabnzbd/blob/develop/sabnzbd/postproc.py#L1071). it doesnt really care what was tracked (as things could be created post extraction/rename/repaired/etc). and empty dir do get removed during this process as well. its a bit blunt but its also kinda the point of it, you said you dont want x. it will get rid of x.

@albino1
Copy link
Author

albino1 commented Jun 1, 2023

I understand that's how it works currently, it's not smart enough to do it any better, I'm just not sure that's how it should work. Especially not with per-category cleanup coming soon. Download an app and then download a movie and you lose your app.

I'm not brave enough to test this, but imagine somebody made their final directory C:\ root, would SAB then wipe all of the exe files from the Program Files and Windows folders? Even something as simple as the default Windows Downloads or Desktop folders in C:\Users could have unintended consequences. Seems like something that should be prevented if possible as nobody is going to expect unrelated files in their Downloads folder to be arbitrarily deleted.

@thezoggy
Copy link
Contributor

thezoggy commented Jun 1, 2023

sab just does what the user tells it.
there seems to be a trend lately of people wanting us to dumb down sab where we act like apple where we put up so many guards and restrict what the user can do with it at the expense of flexibility and functionality for the user.

we have quite a few guards in sab, but its hard to do certain things like restrict what paths are safe to use. because of different use cases.. different os. people running sab on one os but then volume mounts to others.. or running things in containers/virtual fs.

like disabling cleanup feature if user decides to do * on a cat to prevent jobs, which means dump them in the completed folder.. that might sound good but then that breaks the use case of people that actually want that and know what they are doing.
if we start going down the path of what if scenario, we might as well remove scripts all together. or push it on the user like nzbget does where you have to solve everything with your own script (no i dont think we should, people would shoot themselves in the foot easier, and thats one of the big perks of sab is not having to resort to doing everything via a script).

I can understand that pitch if your worried exposed sab instance doesnt have auth (or really bad auth) and your running it as root on a production system. but my argument there would be to just run in docker to sandbox it, have auth, use reverse proxy, or dont even expose it externally.

now about cleanup, having it be recursive or not to limit scope is double edge. there would be people that would be unhappy no matter what we do. but generically most people want recursive because they dont have bad paths and the downloads are segmented, let alone this is happening from within the job folder.

the feature is there, we dont force it on anyone.
for your dooms day scenario, someone would have to turn off job folder creation. and set completed folder to something a bit abnormal to store downloads.. so more than one misconfiguration.

we can only protect people so much before we reduce functionality. i dont see any easy wins we can do without a negative impact towards functionality/usability. if you are worried about abuse, secure your setup. we do offer up the ability to lock down sab config via a .ini only option https://sabnzbd.org/wiki/configuration/4.0/special#toc2

@albino1
Copy link
Author

albino1 commented Jun 1, 2023

I was being admittedly hyperbolic, but all I'm really saying is:

Download app > Download movie > App is now gone

Download picture > Download TV show > Picture is now gone

This seems incorrect and unintended. I get what you're saying about recursive being double edged, but there's no reason to have the cleanup script working in folders or with files that aren't part of the job.

@thezoggy
Copy link
Contributor

thezoggy commented Jun 1, 2023

I was being admittedly hyperbolic, but all I'm really saying is:

Download app > Download movie > App is now gone

Download picture > Download TV show > Picture is now gone

This seems incorrect and unintended. I get what you're saying about recursive being double edged, but there's no reason to have the cleanup script working in folders or with files that aren't part of the job.

but the cleanup process happens from within the job. why im saying you would have to go out of your way to downloading one job, then update the cat to point to that job folder to then cause the issue

@thezoggy
Copy link
Contributor

thezoggy commented Jun 1, 2023

lets say i set my completed folder to C:\ and then my tv cat is set to TV, so now job lands in C:\TV and when it comes time to do the postprocessing and cleans up the file.. now its in C:\TV\_UNPACK_<job> so again not really an issue as i was saying you have to go beyond out of your way to do your doomsday scenario

2023-05-31 08:39:47,356::INFO::[postproc:1083] Removing unwanted file \\?\C:\TV\_UNPACK_Series.1080p.WEB.H264-GRP\dummy.txt
2023-05-31 08:39:47,356::DEBUG::[filesystem:913] [sabnzbd\postproc.py.cleanup_list] Deleting file \\?\C:\TV\_UNPACK_Series.1080p.WEB.H264-GRP\dummy.txt

@Safihre
Copy link
Member

Safihre commented Jun 1, 2023

We keep a list of all files that belong to a job. However, when using Sorting this list is emptied due to limitations in the current implementation.

@albino1
Copy link
Author

albino1 commented Jun 1, 2023

lets say i set my completed folder to C:\ and then my tv cat is set to TV, so now job lands in C:\TV and when it comes time to do the postprocessing and cleans up the file.. now its in C:\TV\_UNPACK_<job> so again not really an issue as i was saying you have to go beyond out of your way to do your doomsday scenario

I see what you're saying. Maybe I'm just unique in that all of my sorting on my main PC just goes to a blanket downloads folder, I don't have individual TV/Movie/whatever subfolders for sorting on it. On my media server Sonarr and Radarr are handling all that, so SAB sorting isn't even involved.

I guess I previously assumed that post-processing the cleanup list happened before the move to its final destination. I would think it would make more sense to clean up the job while it's in the temp folder before it gets moved. Obviously the files on cleanup lists are generally very tiny, so the real world impact of moving them across drives and network shares just to immediately delete them is minimal, but it seems inefficient and can clearly lead to problems with unrelated files being deleted.

Anyway, it sounds like @Safihre is saying this isn't really feasible without a lot of work, so feel free to close this since it probably affects so few people.

@thezoggy
Copy link
Contributor

thezoggy commented Jun 2, 2023

Its hard to cleanup in incomplete when were not unpacking in it...
Pro and con of io overhead if you have different drives for incomplete + complete folders.

Some setups, it would be optimal to unpack to incomplete and do the cleanup and THEN move whats left.
However, this generally would be viewed as inferior to it is now. As you need more disk space in incomplete, hitting same disk even more (more wear/slowdown). Then also have to worry about clobbering source files. So yes while suboptimal if you were cleaning up extensions that happen to be large files and you had to take the hit of waiting for those to be moved.. but generally things are in rarset and you had to take the hit of them being extracted anyways (we cant cleanup pre-extraction)

@albino1
Copy link
Author

albino1 commented Jun 2, 2023

Fair enough, I'm gonna go ahead and close this, especially since it sounds like the upcoming per-category cleanup wouldn't be any different than the current situation so there's no improvements to be made there either. Thanks for all the feedback.

@albino1 albino1 closed this as completed Jun 2, 2023
@Safihre
Copy link
Member

Safihre commented Jun 2, 2023

I'll reopen this because I really want to fix that the Sorter clears out the file tracking. If that's fixed, this can more easily be managed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants