Move from temporary storage by file, not by torrent #13013

Open
UnitedMarsupials opened this issue Jun 13, 2020 · 38 comments

Comments

@UnitedMarsupials

UnitedMarsupials commented Jun 13, 2020

qBittorrent version and Operating System

v4.2.5 on FreeBSD-11.3/amd64

If on linux, libtorrent-rasterbar and Qt version

What is the problem

I store most of the downloaded files on ZFS -- many terabytes on a redundant array. However, for temporary storage of torrents still in flight, I use a different disk -- single, fast, and small.

This ability to use a different location for transient files is a nice feature of qBittorrent and I love it. However, the downloaded files are only moved to the final location, when the entire torrent is downloaded.

What is the expected behavior

Fully downloaded files should be moved to the final location right away, without waiting for the rest of the torrent to finish.

Steps to reproduce

Enable the "Keep unfinished torrents in ..." feature and try downloading a multi-file torrent. When one of the files finishes downloading, observe, that it waits in the temporary location until the entire torrent is done -- and only then are the multiple files moved to the permanent location.

Extra info(if any)

This is important, when the temporary space is on a small filesystem, unable to hold the entire torrent.

It also makes it inconvenient to use the already-downloaded files.

@glassez
Member

glassez commented Jun 15, 2020

I like this idea, at least on the surface, if you do not go into the possible difficulties that may arise during its implementation. I think I even expressed it once in a related topic. But in order to undertake its implementation, you need general approval. We should see here the opinions of other users and contributors, preferably with all the pros and cons of both options, because we will have to choose only one of them, since it is too expensive to support both simultaneously.
@Chocobo1, @thalieht, @FranciscoPombal, @sledgehammer999?

@Chocobo1
Member

I like this idea, at least on the surface, if you do not go into the possible difficulties that may arise during its implementation.

I think it makes sense, although I haven't considered the drawbacks (if there are any).

@thalieht
Contributor

👍 from me. Can't think of any cons.

@FranciscoPombal
Member

FranciscoPombal commented Jun 15, 2020

Well, I don't think it's a good idea.

  1. There are probably many users whose workflow depends on the files only moving to the "Completed" directory once finished. Any scripts written with this assumption (which is the standard for this type of feature) will break.

  2. There are enough bug reports open with moving/external storage/temp folder as is, changing something that already works is too risky. This new mechanism seems ripe for tons of edge cases and corrupted fastresumes in case the torrent and/or some or all of its files are renamed in the client mid-download, or if the client crashes, or if files are unselected, etc etc etc.

  3. The "atomic unit" this feature works with is "the torrent", not "each file/folder", so "moving each file as completed" is unexpected. This feature provides a certain workflow in its current state. Turns out in OP's case, their environment cannot fully support this workflow in some cases (when files are too big for their "cache drive"). But IMO the solution here is "get a bigger drive", rather than "change the behavior of a standard feature to a more complicated and non-standard one".

Seems like OP has the following choices, either of which is preferable to overhauling an entire feature to adopt non-standard behaviors:

  • Get a bigger "cache drive".
  • Simply take the performance hit when downloading torrents that don't fit in the cache drive.

@UnitedMarsupials
Author

UnitedMarsupials commented Jun 15, 2020

Any scripts written with this assumption (which is the standard for this type of feature) will break.

Only if they are written to wait for the appearance of one of the files in the permanent location. Which would be an invalid test anyway, because there'd always be a delay before first such appearance and the last file being written to the final destination.

A better script would instead wait for the disappearance of all files in the temporary location. Scripts implemented to do this would continue to work.

Finally, instead of implementing such directory-watching (via kqueue, inotify, or whatever), such scripts should simply expect to be explicitly triggered by qBittorrent itself -- as configured via the "Execute external program" feature.

This obvious and easiest approach -- which is also the only one you've ever promised will work -- would be unaffected by the proposed change at all.
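To make the "wait for the temporary location to empty" check concrete, here is a minimal Python sketch. It assumes a layout that the thread does not specify: each torrent's in-flight files sit in a subfolder of the temporary directory named after the torrent. The names `torrent_left_temp`, `temp_root` and `torrent_name` are hypothetical and are not part of qBittorrent.

```python
from pathlib import Path


def torrent_left_temp(temp_root: Path, torrent_name: str) -> bool:
    """Return True when no regular files of the torrent remain under temp_root.

    Assumption (not from the thread): the torrent's in-flight files live in
    temp_root/<torrent_name>/. Empty leftover directories are ignored.
    """
    torrent_dir = temp_root / torrent_name
    if not torrent_dir.exists():
        return True
    # Any regular file still present means the move has not finished.
    return not any(p.is_file() for p in torrent_dir.rglob("*"))
```

A check like this gives the same answer whether files leave the temporary location one at a time or all at once, which is the property argued for above. A script launched through the "Execute external program" setting would not need to poll at all.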

Get a bigger "cache drive"

When downloading multiple large multi-file torrents in parallel, the "bigger" may approach the size of the final destination -- defeating the purpose.

Simply take the performance hit when downloading torrents that don't fit in the cache drive

For one, that's not always immediately obvious -- a long-dormant earlier torrent or two, may suddenly "wake-up" and fill up the cache filesystem. The proposal would make this far less likely to happen.

Secondly, the decision, whether or not to use the temporary filesystem is currently global -- and cannot be specified for a particular torrent, when adding one, can it?

And finally, the other problem with the current implementation is the inherent wait before the download results can be properly used. For example, if I'm downloading an OS installation media, I'd like to be able to start burning the first CD before the other images arrive. Similar argument can be made for entertainment episodes and audio tracks.

@FranciscoPombal
Member

@UnitedMarsupials

Fair points about how scripts can be better written overall, but my point is that this would be an extremely jarring change that would break stuff for other people for the sole purpose of solving your problem, which IMO requires a different solution.

When downloading multiple large multi-file torrents in parallel, the "bigger" may approach the size of the final destination -- defeating the purpose.

Couldn't it just be the case that your whole system is under spec'd for your needs then? This sounds to me like "my car can only do 100 km/h on the motorway and I'd like to get from A to B quicker, so please make all motorways giant moving walkways for cars to get me the extra 20 km/h, instead of just upgrading my car's powertrain".

For one, that's not always immediately obvious -- a long-dormant earlier torrent or two, may suddenly "wake-up" and fill up the cache filesystem. The proposal would make this far less likely to happen.

So you're saying that the drive you're using for temporary torrent storage (which I was referring to as "cache drive" in the context of moving finished downloads) is also pulling double-duty as an actual cache drive for your other filesystem's reads/writes? I believe you can see how this strengthens my point that maybe you're making your system bite off more than it can chew, relative to the use case you envision for it.

Secondly, the decision, whether or not to use the temporary filesystem is currently global -- and cannot be specified for a particular torrent, when adding one, can it?

Don't add too many at a time, or adjust queuing settings. Try opening 50 YT videos at the same time on a 10 Mbps connection. Do you expect them to be watchable without waiting for buffering? No, because the 10 Mbps connection is simply not sufficient for that. It is the bottleneck. You have 2 options: a) don't open 50 YT videos at a time b) get a bigger pipe if you really need that workflow and a) is not an option.

And finally, the other problem with the current implementation is the inherent wait before the download results can be properly used. For example, if I'm downloading an OS installation media, I'd like to be able to start burning the first CD before the other images arrive. Similar argument can be made for entertainment episodes and audio tracks.

The files are still accessible and open-able in their original locations before the full download is finished.

@UnitedMarsupials
Author

UnitedMarsupials commented Jun 15, 2020

[Apologies for noise, too easy to close a ticket accidentally :(]

my point is that this would be an extremely jarring change that would break stuff for other people

My argument was, that the only legitimate use by these hypothetical other people will remain unaffected. For the proposed change to break a script, that script has to be relying on behavior discovered by observation, not something ever promised by qBittorrent to survive even a minor version-change.

The only legitimate (or "supported") way to invoke a script after downloading a torrent, is by registering it with the program. Scripts registered this way will not be affected by the change.

Couldn't it just be the case that your whole system is under spec'd for your needs then?

I believe, the proposed change would bring useful improvement -- without a downside -- regardless of my particular setup.

Don't add too many at a time

That's good advice, but what's "too many"? If they are all idle, can I add another -- maybe, it will get in, or, maybe, the others will suddenly awaken (as a seeder goes online) and fill up my cache?

The files are still accessible and open-able in their original locations before the full download is finished.

Though that's true, they may disappear from that location at any moment. Depending on the method, with which they are being accessed when unlinked, the access will either be interrupted shortly afterwards, or the diskspace will not be reclaimed until the file descriptor is closed by the accessing program. Neither thing is good...
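The second case, where disk space is not reclaimed while a descriptor stays open, can be reproduced with a small Python sketch. It assumes a POSIX filesystem such as the FreeBSD or Linux setups discussed here; on Windows, unlinking an open file normally fails instead. The function name `read_after_unlink` is hypothetical.

```python
import os
import tempfile


def read_after_unlink(payload: bytes) -> bytes:
    """Write payload to a temp file, unlink it while it is open, then read it back.

    POSIX-only illustration: the directory entry disappears immediately, but
    the data blocks stay allocated until the last open descriptor is closed.
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w+b") as handle:
        handle.write(payload)
        handle.flush()
        os.unlink(path)                  # the name is gone from the directory
        assert not os.path.exists(path)
        handle.seek(0)
        return handle.read()             # the contents are still readable
```

This is the situation a burning or playback program would be in if the client moved a file to another filesystem mid-read: the reader keeps working, and the temporary filesystem only regains the space once that reader closes the file.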

@glassez
Member

glassez commented Jun 16, 2020

The main disadvantage of the currently used approach (from my perspective) is the handling of partially downloaded torrents.
E.g., a user adds a torrent and deselects some files from it. How should it be handled? (AFAIK, the current handling isn't very consistent/complete.) Should it be moved to the "complete" folder after all selected files are downloaded? If yes, then how is it expected to behave if the user selects additional files?

Secondly, the decision, whether or not to use the temporary filesystem is currently global -- and cannot be specified for a particular torrent, when adding one, can it?

Allowing per-torrent "incomplete" folder settings is on my to-do list.

@FranciscoPombal
Member

FranciscoPombal commented Jun 17, 2020

@UnitedMarsupials

I concede the point about the scripts.

That's good advice, but what's "too many"? If they are all idle, can I add another -- maybe, it will get in, or, maybe, the others will suddenly awaken (as a seeder goes online) and fill up my cache?

Depends, only you can be the judge of that according to your system's capabilities and the sizes of torrents at play. If your cache drive is 128 GiB, and it's already pulling double-duty doing other things, one 125 GiB torrent is probably already too many. Again, consider whether your system's characteristics might be inadequate for the workflow you want it to perform.

Though that's true, they may disappear from that location at any moment. Depending on the method, with which they are being accessed when unlinked, the access will either be interrupted shortly afterwards, or the diskspace will not be reclaimed until the file descriptor is closed by the accessing program. Neither thing is good...

Your point was that:

And finally, the other problem with the current implementation is the inherent wait before the download results can be properly used. For example, if I'm downloading an OS installation media, I'd like to be able to start burning the first CD before the other images arrive. Similar argument can be made for entertainment episodes and audio tracks.

With the current implementation, you know that all files belonging to a torrent are on location A before the whole torrent finishes, B after it does (which is easy to see/monitor). If files are moved piece-meal as they are finished as you suggest, you would have even fewer guarantees that you will be able to preview any of them properly during the download, unless they are already finished.

I believe, the proposed change would bring useful improvement -- without a downside -- regardless of my particular setup.

Sorry, but no. Refer to what I posted earlier:

  1. There are enough bug reports open with moving/external storage/temp folder as is, changing something that already works is too risky. This new mechanism seems ripe for tons of edge cases and corrupted fastresumes in case the torrent and/or some or all of its files are renamed in the client mid-download, or if the client crashes, or if files are unselected, etc etc etc.

@glassez

The main disadvantage of currently used approach (from my perspective) is [handling of] partially downloaded torrents.

This proposed change would also be subject to that problem. Best to fix/improve what we already have IMO.

How should it be handled? (AFAIK, the current handling isn't very consistent/complete.) Should it be moved to the "complete" folder after all selected files are downloaded?

Yes.

If yes, then how is it expected to behave if the user selects additional files?

Moved back to "incomplete" until the newly selected files are downloaded, then moved to "complete". If this is not possible, keeping it in "complete" and just downloading the new files is also OK. Might be tricky to handle cases where files move between drives/filesystems.

@UnitedMarsupials
Author

UnitedMarsupials commented Jun 17, 2020

I concede the point about the scripts.

Thank you. Now, about the other half of the argument:

There are enough bug reports open with moving/external storage/temp folder as is, changing something that already works is too risky.

I am not following your bug reports, but this statement in itself seems self-contradicting: can something, against which "there are enough bug reports open" even be considered "already working"?

And, if you really apply this principle, can you ever implement a substantial change, to anything? Where is that thin corridor, where there are few enough known problems with something allowing you to touch it, but still sufficiently many for the "if it works, don't touch it" to be inapplicable?

With the current implementation you know that all files belonging to a torrent are on location A before the whole torrent finishes, B after it does (which is easy to see/monitor).

If using an already completed file from the location A (temporary) -- assuming this location is even accessible to the apps/devices I'd like to access the files from -- the files could disappear from there unpredictably at any point. If it were, for example, a video -- accessed from a set-top box via SMB -- I'd have to go to a different SMB-share (mapped to the permanent location) to find it. And the player would then start the video from the beginning instead of resuming.

For another example, if it is a DVD-image, the burning software will keep the file-descriptor open, which means, the space at location A will not be reclaimed until the burning finishes -- also a drawback.

If files are moved piece-meal as they are finished as you suggest, you would have even fewer guarantees that you will be able to preview any of them properly during the download

Of course, there is such a guarantee! A single file, once fully downloaded -- would be moved to location B (permanent) and stay there for me to access at will.

unless they are already finished

Trying to access a file before it is fully downloaded would be ill-advised anyway, because torrent downloads aren't always sequential, are they?..

@glassez
Member

glassez commented Jun 17, 2020

The main disadvantage of currently used approach (from my perspective) is [handling of] partially downloaded torrents.

This proposed change would also be subject to that problem.

Please take the trouble to argue this point. After all, it looks the opposite. Each file stays in "incomplete" folder until it is finished, then it is moved to another folder and stays there, this works no matter how many files are selected initially.
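The per-file behavior described here (a file stays in "incomplete" until it is finished, then moves to "complete" and stays there) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it is not how qBittorrent or libtorrent implement storage moves, it ignores pieces shared between files, resume data and renames, and the names `move_finished_files`, `incomplete`, `complete` and `finished` are hypothetical.

```python
import shutil
from pathlib import Path
from typing import List


def move_finished_files(incomplete: Path, complete: Path, finished: List[str]) -> List[str]:
    """Move each finished file (given as a relative path) from incomplete/ to complete/.

    Files that are no longer in the incomplete folder are skipped, so calling
    this again after more files finish never touches files already moved.
    """
    moved = []
    for rel in finished:
        src = incomplete / rel
        if not src.is_file():
            continue  # already moved earlier, or not downloaded at all
        dst = complete / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
        moved.append(rel)
    return moved
```

Because each call only acts on files still present in the incomplete folder, selecting additional files later just means more calls; nothing already in "complete" has to move back.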

Moved back to "incomplete" until the newly selected files are downloaded, then moved to "complete".

This "move back" seems illogical to me... but continuing to download the torrent in the "complete" folder isn't generally better in this case either.

There are enough bug reports open with moving/external storage/temp folder as is, changing something that already works is too risky.

Isn't moving all files at once more unsafe than moving them one by one?

And most importantly, what is the main purpose of this feature? Just keep the unfinished torrent somewhere else so it doesn't get in your eyes?
Let's try to look at it from scratch, away from all those big words (like "standard") that @FranciscoPombal likes to use (it looks just like blindly "copy-pasted" behavior for me).

@coolio2013

Trying to access a file before it is fully downloaded would be ill-advised anyway, because torrent downloads aren't always sequential, are they?..

Torrents are not organized by files, but by pieces. Pieces are downloaded or seeded, not files.

I could imagine a new layer (or abstract structure) for files, keeping track of pieces on 2 different locations (1 piece can belong to 2 files, as 1 piece can basically contain the end of file a AND the beginning of file b).
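The piece/file overlap mentioned here can be worked through with a small Python sketch. It assumes the usual BitTorrent v1 layout, in which a torrent's files are concatenated into one byte stream that is cut into fixed-size pieces; the helper names and the example sizes are hypothetical.

```python
from typing import Set


def pieces_for_file(file_offset: int, file_size: int, piece_length: int) -> range:
    """Indices of the pieces that overlap a file starting at file_offset in the stream."""
    if file_size == 0:
        return range(0)
    first = file_offset // piece_length
    last = (file_offset + file_size - 1) // piece_length
    return range(first, last + 1)


def file_is_complete(file_offset: int, file_size: int, piece_length: int,
                     have: Set[int]) -> bool:
    """A file is complete only when every piece overlapping it has been downloaded."""
    return all(i in have for i in pieces_for_file(file_offset, file_size, piece_length))


# File a: bytes 0..149, file b: bytes 150..249, piece length 100.
# Piece 1 holds the end of a AND the beginning of b.
pieces_a = list(pieces_for_file(0, 150, 100))    # [0, 1]
pieces_b = list(pieces_for_file(150, 100, 100))  # [1, 2]
```

With pieces 0 and 1 downloaded, file a is complete and b is not, yet piece 1 still contains bytes that belong to b. Any per-file move would have to account for that shared piece across two locations.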

The purpose is to save space on quick storage (SSDs) for leeching, while keeping already leeched parts on inexpensive slow storage (HDD, NAS).

This would generate a hell of a lot of work. A lot of things could break, think about re-checking, breaking scripts, forcing, saving fastresume, manual movement to different location, tagging, manual renaming files/folders, unwanted files, etc. Most (if not all) of these low-level things are done by libtorrent, just triggered by qbt (libtorrent already is handling files on this level).
imho it is really risky.

@FranciscoPombal
Member

FranciscoPombal commented Jun 17, 2020

@UnitedMarsupials

I am not following your bug reports, but this statement in itself seems self-contradicting: can something, against which "there are enough bug reports open" even be considered "already working"?

And, if you really apply this principle, can you ever implement a substantial change, to anything? Where is that thin corridor, where there are few enough known problems with something allowing you to touch it, but still sufficiently many for the "if it works, don't touch it" to be inapplicable?

I meant it in the sense that "it mostly works, but there are edge cases that need fixing, needs polish", which is not contradictory. Apologies if that wasn't clear. It's a system that works. It was not implemented perfectly from the start, but IMO it's more worth it to try to fix the bugs than to replace it with something more complex and with dubious added benefit that will surely have at least as many edge cases that won't be perfectly handled from the start.

If using an already completed file from the location A (temporary) -- assuming this location is even accessible to the apps/devices I'd like to access the files from -- the files could disappear from there unpredictably at any point
...

No, they disappear all at the same time - when the torrent finishes completely. Sure it is kind of "unpredictable" when the whole torrent disappears, but files being moved piece-meal is certainly even more unpredictable. The nature of BitTorrent is such that download progress among files in a torrent is random (there are exceptions). So, given a torrent with files A B C D, they will disappear in random order (and at random rates, in the general case!) to the completed folder with your proposed change. With the current system, you just have to keep an eye on the ETA, and know immediately where everything is.

Trying to access a file before it is fully downloaded would be ill-advised anyway, because torrent downloads aren't always sequential, are they?..

A lot of file formats are made with the "opening them before they are fully downloaded" in mind. For example, mkv files. The preview feature in qBittorrent was extremely requested for a reason.

@glassez

Please take the trouble to argue this point. After all, it looks the opposite. Each file stays in "incomplete" folder until it is finished, then it is moved to another folder and stays there, this works no matter how many files are selected initially.

I was referring to:

E.g., a user adds a torrent and deselects some files from it. How should it be handled?

Doesn't this question apply to both the current system and the proposed one?

This "move back" seems illogical to me... but continuing to download the torrent in the "complete" folder isn't generally better in this case either.

yeah, the problem with "moving back" is that it can potentially be very expensive (when big files are at play).

Isn't moving all files at once more unsafe than moving them one by one?

"Unsafe" in what sense? Moving files one by one is more complex in terms of tracking state no? That is my concern here.

And most importantly, what is the main purpose of this feature? Just keep the unfinished torrent somewhere else so it doesn't get in your eyes?

No, the way I see it, there is a dual purpose: to keep the unfinished torrents in a "staging area" that is separate from other more organized directory hierarchies so as to not clutter them, and/or to keep unfinished torrents in a faster (while being sufficiently large) storage medium.

Let's try to look at it from scratch

I already "looked at it from scratch", and neither you nor @UnitedMarsupials addressed my main point:

The "atomic unit" this feature works with is "the torrent", not "each file/folder [within the torrent]", so "moving each file as completed" is unexpected.

Every program that I have seen (and IMO, I have observed quite a significant sample) that has a "move after complete" feature and that works at a level where the "atomic unit" is a set of files/folders only moves files belonging to the set once all are ready. As an example, take 7-zip on Windows. The atomic unit is "an archive" or "a set of files to be extracted from it". If you extract files by dragging from the GUI, first, all (of the selected files) are extracted to a temp folder. They are only moved to their final target destination after the whole extraction is complete, not as it completes for each file.

away from all those big words (like "standard") that @FranciscoPombal likes to use (it looks just like blindly "copy-pasted" behavior for me).

Please don't be unnecessarily confrontational, tone down the snark. I don't "throw around big words just because". If this is a reference to when I mentioned that using build as the folder name for out of source CMake builds was the de-facto convention/standard, just to respond in kind this time: I was 100% right, get over it. If this is the turn this conversation is going to take, I'll just stop participating in it. Personally, I only use this feature in very simple cases that are currently not buggy, and would most likely still work fine even if this new implementation were pretty buggy, so my stake in this isn't that high. I'm just trying to represent and "give a voice" to the users that are suffering from the currently buggy edge cases.

@coolio2013

I could imagine a new layer (or abstract structure) for files, keeping track of pieces on 2 different locations (1 piece can belong to 2 files, as 1 piece can basically contain the end of file a AND the beginning of file b).

Yes, handling the piece/file boundary disparity is a potential source of complexity.

This would generate a hell of a lot of work. A lot of things could break, think about re-checking, breaking scripts, forcing, saving fastresume, manual movement to different location, tagging, manual renaming files/folders, unwanted files, etc. Most (if not all) of these low-level things are done by libtorrent, just triggered by qbt (libtorrent already is handling files on this level).
imho it is really risky.

Not sure exactly which of these could present a problem in practice, but it could be all of them and even a lot more that were not mentioned. This is for certain:

imho it is really risky.

best to fix what we already have.

@Dnkhatri

Unless the op sets all the torrents to download file by file, it still won't solve his problem, as he would have finished files once 90% of the torrent is done anyway. Whether this gets implemented or not, he is still better off getting a larger cache drive. Even an hdd cache drive would do for his torrents: unless he has an unusual 10gb internet connection, a normal hdd would still be faster than his download speeds.

@UnitedMarsupials
Author

@FranciscoPombal

If using an already completed file from the location A (temporary) -- assuming this location is even accessible to the apps/devices I'd like to access the files from -- the files could disappear from there unpredictably at any point

No, they disappear all at the same time - when the torrent finishes completely

Yes, at the same, but unpredictable time.

files being moved piece-meal is certainly even more unpredictable

On the contrary -- a fully downloaded file, once in the permanent location (which will, under this proposal, happen very shortly after it is downloaded), can be found in that permanent location and used there for ever.

With the current implementation, if I, impatiently, begin using the file from the temporary location, my usage is likely to be interrupted -- unpredictably.

@Dnkhatri

Unless the op sets all the torrents to download file by file it still won't solve his problem

There is no guaranteed solution to the problem of over-committed resource -- to obtain such a guarantee, one simply mustn't overcommit at all. My proposal simply makes actually running out of space (far) less likely, making the overcommits less risky.

unless he has an unusual 10gb internet connection

My concern -- and the reason I use a cache drive for torrents -- is not speed, but the fact that my permanent storage is on ZFS, which can store things suboptimally if they are written at random, rather than sequentially. Here is one thread discussing it.

That said, shoving the completed files, rather than entire torrents is superior regardless of my particular setup.

My hardware configuration is just as irrelevant as the number of bugs outstanding.

@coolio2013

I could imagine a new layer (or abstract structure) for files [...]

Gentlemen, please, let's not get sidetracked -- this has already generated a lot more discussion, than I expected, when filing this change-request.

@thalieht
Contributor

The only question is if this proposal worked 100% bug free would you prefer it over the current one?
All these implementation details are relevant only to devs, what do you care?

This would generate a hell of a lot of work. A lot of things could break, think about re-checking, breaking scripts, forcing, saving fastresume, manual movement to different location, tagging, manual renaming files/folders, unwanted files

If i'm not mistaken this change will lay the foundations for #439 for which all this stuff would have to be done anyway.
As for the breaking scripts argument, it sounds to me like the one with the horseshoe makers and cars.

Note that i have no stake in this because i don't use this feature, just saying what seems logical to me.

@glassez
Member

glassez commented Jun 18, 2020

I was referring to:

E.g., a user adds a torrent and deselects some files from it. How should it be handled?

Doesn't this question apply to both the current system and the proposed one?

The "per-file" system isn't affected by this problem. Isn't that clear to you? As I already said, in this case "incomplete" files are located in the "incomplete" folder and "complete" ones in "complete". So when you select previously unselected files, they will be downloaded into the "incomplete" folder and then moved to "complete", without touching the already complete files.

@glassez
Member

glassez commented Jun 18, 2020

I already "looked at it from scratch",

Then where does your "atomic unit" come from?

to keep the unfinished torrents in a "staging area" that is separate from other more organized directory hierarchies so as to not clutter them

If it's a valid purpose (sorry, seems like yet another portion of bloat to me) then the "atomic unit" is the entire torrent.

to keep unfinished torrents in a faster (while being sufficiently large) storage medium.

It isn't required to have the entire torrent as the "atomic unit" for this purpose.

@FranciscoPombal
Member

FranciscoPombal commented Jun 18, 2020

@UnitedMarsupials

Yes, at the same, but unpredictable time.

On the contrary -- a fully downloaded file, once in the permanent location (which will, under this proposal, happen very shortly after it is downloaded), can be found in that permanent location and used there for ever.

With the current implementation, if I, impatiently, begin using the file from the temporary location, my usage is likely to be interrupted -- unpredictably.

Ok, if you only ever access already-finished files during the download, then yes, there is no unpredictability in your case. Still, I maintain that this is too much of a niche feature to justify overhauling this feature.

There is no guaranteed solution to the problem of over-committed resource -- to obtain such a guarantee, one simply mustn't overcommit at all. My proposal simply makes actually running out of space (far) less likely, making the overcommits less risky.

Don't overcommit, problem solved. You talk about this change making it "far less likely for one to run out of space", as if this were a general problem that affects many people (it's an implicit claim the way I'm reading, but let me know if implying this wasn't your intention). You're literally the only one with this problem, because you want to carelessly overcommit.

My concern -- and the reason I use a cache drive for torrents -- is not speed, but the fact that my permanent storage is on ZFS, which can store things suboptimally if they are written at random (...)

Then won't you be worse off with this change? Consider torrents with many small files. Since they will be moved piece-meal to your ZFS storage, this can be functionally random writing. If you only move the whole torrent at once at the end, the write is sequential, right?

@thalieht

If i'm not mistaken this change will lay the foundations for #439 for which all this stuff would have to be done anyway.

Maybe I'm wrong, but it seems to me that you can have one without the other, i.e., #439 does not explicitly depend on something introduced by implementing this one. Can it share code common with this one? Maybe, but it's not a necessity.

@glassez

I'm not sure if we have the same definition of a "completed" torrent, and we should decide on that first. Does a torrent with only a few files selected (but each of these few files fully completed) count as "complete", or does it have to have all files selected and complete to count as "complete"?

Then where does your "atomic unit" come from?

In a torrent client, a torrent is an "atomic unit" of operations that claim to operate on torrents. So if a feature is called "when complete, move torrent to...", the torrent should be moved at once only when it is complete, not each file piece-meal. Otherwise a much more accurate description for the feature would be: "move a torrent's files as they are completed to...".

to keep the unfinished torrents in a "staging area" that is separate from other more organized directory hierarchies so as to not clutter them
If it's a valid purpose (sorry, seems like yet another portion of bloat to me) then the "atomic unit" is the entire torrent.

Sorry, I don't understand this. "Yet more bloat"? I was just explaining my interpretation of the purpose of a feature that already exists. The text you quoted is in no way proposing the addition of new stuff, bloat or not.

It isn't required to have entire torrent as "atomic unit" for this purpose.

Please elaborate. Yes, it's not impossible to move files piece-meal, but no one is saying that it is, just that it is not a good idea.

@UnitedMarsupials
Author

Ok, if you only ever access already-finished files during the download

Of course! Accessing the files still in flight is pointless as there is no telling, which parts of the file have arrived already.

Still, I maintain that this is too much of a niche feature

What? The ability to watch the already-downloaded part of a series, while the rest are being downloaded? That's not "niche"...

Don't overcommit, problem solved.

Overcommitting is a perfectly legitimate technique for bolstering resource-utilization, and has a long history in computing and other industries. Cities, for example, don't build hospitals to accommodate all residents -- only a certain share of them, who are likely to get sick at the same time.

There is nothing wrong with the concept itself -- as long as the likelihoods are well-estimated. My proposal seeks to lower the likelihood of filling the temporary filesystem up, even if it will not eliminate the risk of it happening completely.

Consider torrents with many small files. Since they will be moved piece-meal to your ZFS storage

No, as long as each file, however small, is fed to ZFS sequentially -- rather than with blocks out of order as often happens with torrents -- things will be fine.

@coolio2013

My concern -- and the reason I use a cache drive for torrents -- is not speed, but the fact that my permanent storage is on ZFS, which can store things suboptimally if they are written at random, rather than sequentially. Here is one thread discussing it.

You'd be better off with adding an L2ARC or SLOG to your ZFS machine, rather than asking devs to rewrite half of the entire code of qbt.
As a sidenote: personally, I have some big torrents on ZFS, on an encrypted pool, with no issues (it is faster than my other QNAP-based NAS). Storing payloads on ZFS would add fragmentation to the pool, but name me a filesystem that does not fragment files under heavy usage with lots of files being deleted and added.

@coolio2013

I could imagine a new layer (or abstract structure) for files [...]

Gentlemen, please, let's not get sidetracked -- this has already generated a lot more discussion, than I expected, when filing this change-request.

By "I could imagine" I meant: in order to add that feature, I could imagine... (because a new layer or structure would probably be required to implement your request). Sorry if that wasn't clear.

@UnitedMarsupials
Author

You'd be better off with adding an L2ARC or SLOG to your ZFS machine

I have that, thank you very much. But that's irrelevant to the writing of files out of order.

rather than asking devs to rewrite half of the entire code of qbt.

You seem to be overdramatic -- are you really claiming, the temporary filesystem feature comprises 50% (or even 10%) of the qbt codebase?

a new layer or structure would probably be required to implement your request

Weird. You're already renaming the fully downloaded files from .!qB to their rightful extensions as soon as they are done -- without waiting for the rest of the torrent. Is it really so dramatic a change to start moving them to the permanent storage at that same point?
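As a rough illustration of what "moving at that same point" could look like, here is a minimal Python sketch. It is not qBittorrent's code: `on_file_completed`, `temp_root` and `final_root` are hypothetical names, and the sketch deliberately ignores resume data, pieces that straddle two files, and the other concerns raised in this thread. It assumes the incomplete-file suffix is `.!qB`.

```python
import shutil
from pathlib import Path

INCOMPLETE_SUFFIX = ".!qB"  # assumed suffix for files still in flight


def on_file_completed(partial_path: Path, temp_root: Path, final_root: Path) -> Path:
    """Hypothetical hook: strip the suffix and move one finished file.

    The file keeps its path relative to temp_root, so the directory
    layout under final_root mirrors the layout of the torrent.
    """
    rel = partial_path.relative_to(temp_root)
    if rel.name.endswith(INCOMPLETE_SUFFIX):
        rel = rel.with_name(rel.name[: -len(INCOMPLETE_SUFFIX)])
    dest = final_root / rel
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Across filesystems shutil.move copies the file and then deletes the source.
    shutil.move(str(partial_path), str(dest))
    return dest
```

When the two roots are on different filesystems, `shutil.move` falls back to a copy followed by a delete, so the destination receives the file front to back rather than in download order.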

@FranciscoPombal
Member

FranciscoPombal commented Jun 18, 2020

@coolio2013

rather than asking devs to rewrite half of the entire code of qbt.

To be fair, this is of course an exaggeration.

@UnitedMarsupials

What? The ability to watch the already-downloaded part of a series, while the rest are being downloaded? That's not "niche"...

Series downloads in random order in the general case. If it is automatically downloading sequentially, it means it's downloading fast enough for the short wait time to not be a concern. People (including me) have been doing this (opening files from incomplete torrents) for a long time. But that's not what I was referring to. What's niche is the need for files to be moved piece-meal.

No, as long as each file, however small, is fed to ZFS sequentially -- rather than with blocks out of order as often happens with torrents -- things will be fine.

Ok, I can't argue with this since I don't know enough about ZFS, but I guess your claim here is that fragmentation on ZFS doesn't matter, as long as it doesn't happen between file contents - is that correct?

Weird. You're already renaming the fully downloaded files from .!qB to their rightful extensions as soon as they are done -- without waiting for the rest of the torrent. Is it really so dramatic a change to start moving them to the permanent storage at that same point?

The key difference is that that feature is explicitly advertised as working at the file level, not the torrent level (Append .!qB extension to incomplete files). In fact, its purpose is precisely to enable you to tell complete and incomplete files apart when looking at them in the filesystem during a torrent's download.

Overcommitting is a perfectly legitimate technique for bolstering resource-utilization, and has a long history in computing and other industries. Cities, for example, don't build hospitals to accommodate all residents -- only a certain share of them, who are likely to get sick at the same time.

There is nothing wrong with the concept itself -- as long as the likelihoods are well-estimated. My proposal seeks to lower the likelihood of filling the temporary filesystem up, even if it will not eliminate the risk of it happening completely.

You describe the "good" overcommitting. What you're actually doing is not that. You are completely overwhelming your resources. If you not only overcommit memory, but also overwhelm your swap, your system becomes unresponsive (OOM killer may or may not be able to act in reasonable time, forcing you to do a hard reboot). In the case of city hospitals, "overcommit" is possible because there are mechanisms put in place to deal with totally full hospitals (emergency field hospitals built by the armed forces, for example), which is a more efficient setup than building a hospital for literally everyone (there are no resources for that, or to maintain it). In your case, your cache drive is all you have. Overcommitting that in its entirety is also overwhelming your system. You've got nothing to fall back on, so don't do it.

@UnitedMarsupials
Author

What's niche is the need for files to be moved piece-meal.

It is not a "need" -- it is an improvement. And you're not even arguing, that it is not an improvement. Your stance is that it is not a sufficient improvement to justify the costs of implementing it.

To that end you're trying to minimize and trivialize the benefits, while exaggerating the cost...

The key difference being that feature is explicitly advertised as working at the file level not torrent level (Append .!qB extension to incomplete files).

How does such advertising -- or lack thereof -- affect the complexity of implementation? Because @coolio2013 was referring to this (perceived) complexity -- not anything else...

You describe the "good" overcommitting.

That was to counter your attempt to reject overcommitting completely. The "Don't overcommit, problem solved" part, in particular.

You are completely overwhelming your resources.

No, actually, I do not. And I am not even asking, from what you've drawn this conclusion, because it is irrelevant. Using the smaller units -- such as files, rather than entire torrents -- is an improvement. Objectively -- regardless of any flaws in my configuration.

You've got nothing to fall back on, so don't do it.

I can just delete a torrent -- and its partially-downloaded files -- problem solved. But it would still be an improvement, if this were required less often. Not guaranteed to never happen, just be less likely to happen, see? And you agree, that the change would make it less likely to happen -- at least, you haven't disputed it -- so what's left of your opposition?

Frankly, after 23 comments, I see you grasping at straws. I do not know, how this project's internal governance works, but at this point the discussion should be closed and the decision made based on what's already been said.

Thank you.

@FranciscoPombal
Member

@UnitedMarsupials

It is not a "need" -- it is an improvement. And you're not even arguing, that it is not an improvement. Your stance is that it is not a sufficient improvement to justify the costs of implementing it.

Hmm, yes? At the end of the day, even if nothing else is wrong with a proposal, if it does not pass a cost/benefit analysis, it is not implemented - if the cost is too high, functionally, it is not considered an improvement.

To that end you're trying to minimize and trivialize the benefits, while exaggerating the cost...

That's an opinion. I know very well the cost of messing with something that involves changes to .fastresumes, the torrent/file move system/state machine, and how often that has led to weird/nasty bugs that then manifest as a ton of bug reports.

The key difference being that feature is explicitly advertised as working at the file level not torrent level (Append .!qB extension to incomplete files).

How does such advertising -- or lack thereof -- affect the complexity of implementation? Because @coolio2013 was referring to this (perceived) complexity -- not anything else...

The "advertising" affects the UX expectations. The complexity of implementation is another separate argument against the proposal.

That was to counter your attempt to reject overcommitting completely. The "Don't overcommit, problem solved" part, in particular.

And then I countered your attempt at making it seem like I was blindly dismissing overcommitment as a strategy. I had to make it clear that from your description of your setup, you were doing the "bad" kind while preaching the "good".

You are completely overwhelming your resources.

No, actually, I do not. And I am not even asking, from what you've drawn this conclusion, because it is irrelevant. Using the smaller units -- such as files, rather than entire torrents -- is an improvement. Objectively -- regardless of any flaws in my configuration.

From what you described of your setup above, this was my conclusion. The proposal is "objectively an improvement" in the context of a broken workflow. In the same way that changing a loading screen text from "press any button to continue" to "press any button to continue except the power button" is "objectively a (UX, in this case) improvement" for a user who would attempt to press the power button when reading the former variant.

You've got nothing to fall back on, so don't do it.

I can just delete a torrent -- and its partially-downloaded files -- problem solved. But it would still be an improvement, if this were required less often. Not guaranteed to never happen, just be less likely to happen, see? And you agree, that the change would make it less likely to happen -- at least, you haven't disputed it -- so what's left of your opposition?

Again,

  1. At the end of the day, even if nothing else is wrong with a proposal, if it does not pass a cost/benefit analysis, it is not implemented - if the cost is too high, functionally, it is not considered an improvement.

  2. Users will be confused with the files being moved piece-meal, because it is not what is expected when you ask for a torrent to be moved upon completion.

In my opinion, it is not worth it to do such an overhaul of a system that by and large already works in the expected way, and just needs some bug hunting, just to please 1 or 2 users who like to carelessly overcommit.

Frankly, after 23 comments, I see you grasping at straws. I do not know, how this project's internal governance works, but at this point the discussion should be closed and the decision made based on what's already been said.

Thank you.

Would you like to speak with the manager?

Frankly, I no longer care. At the end of the day, I'm not going to block anything, so if others want this, it will be implemented. Like I said, my use of this feature is limited to the point I probably will not be affected by any serious bug either way. Have fun dealing with bug reports when something inevitably breaks in the fastresumes, files get accidentally deleted or not moved, etc, and then explaining that this is happening because the system fundamentally changed, when the effort could have been spent fixing existing bugs. Oh, and even if it's bug-free, all the confused users who won't expect files to be moved piece-meal, and will complain that qBittorrent is either not downloading some files or deleting them. IMO there are far higher priority items for the project.

@glassez
Member

glassez commented Jun 23, 2020

So if a feature is called "when complete, move torrent to...", the torrent should be moved at once only when it is complete, not each file piece-meal. Otherwise, a much more accurate description for the feature would be: "move a torrent's files as they are completed to...".

When I suggested considering it from scratch, I meant the point when there is no corresponding feature yet, but only some problems that need to be solved by implementing one. So it could be "Move completed files to..." if that satisfied the given conditions better.

I'm not sure if we have the same definition of a "completed" torrent, and we should decide on that first. Does a torrent with only a few files selected (but each of these few files fully completed) count as "complete", or does it have to have all files selected and complete to count as "complete"?

This is a really good question (although not related to this topic, but rather to the current "incomplete folder" implementation) that is worth a separate issue. Do you mind creating it?
IMO, if the "atomic unit" is the torrent, then only a fully downloaded torrent should be considered "complete".

@glassez
Member

glassez commented Jun 24, 2020

Related discussion is #7164.
I'm starting to implement some improvements that were approved there.
Since the "file-based" approach was disapproved there as well, I will refrain from taking any steps in this direction for the time being.

@UnitedMarsupials
Author

Since the "file-based" approach was disapproved there as well

My take from this ticket is that simple majority would've preferred file-based approach -- and the sole remaining objection is "it is too difficult to do, while the gain is too small".

Perhaps, some people have greater weight in the project's governance, but the main dissenter -- @FranciscoPombal -- promised to "not block anything".

Is there some other manager to talk to?

@glassez
Member

glassez commented Jun 24, 2020

@sledgehammer999 is the maintainer of the project. And it is pointless to take on such a feature without his explicit approval.

@FranciscoPombal
Member

@UnitedMarsupials

My take from this ticket is that simple majority would've preferred file-based approach -- and the sole remaining objection is "it is too difficult to do, while the gain is too small".

More like "it has the potential to break an entire existing feature set for the very small benefit of literally only one person with an arguably flawed workflow who asked for it.". Reworking a feature this way has multiple short- and long-term costs. If you want to commit to implement it, solving bugs that arise, and making sure it works fine with the existing features, you can start by submitting a PR. Otherwise, valuable time of other developers/contributors is wasted, and further stretched thin by having to solve new issues related to this new change, instead of being invested in fixing important existing bugs. This project already has too few contributors.

The only comment agreeing with you so far was "yeah, why not" (#13013 (comment)), which is not a good reason.
@glassez's comments and main concern are, as far as I understand them, about the system in general and gracefully defining and handling "complete/incomplete" states. They are not necessarily agreeing with you either - they are just thinking about whether or not this can be a good general solution (or at least parts of it) to the general problem.

Perhaps, some people have greater weight in the project's governance, but the main dissenter -- @FranciscoPombal -- promised to "not block anything".

There isn't really any established governance system AFAIK - the maintainer's decision is final; beyond that, there are 2 members who wield the most decision-making power, by virtue of being the only ones besides the maintainer with write access to the repo.

I'm starting to implement some improvements that were approved there.
Since the "file-based" approach was disapproved there as well, I will refrain from taking any steps in this direction for the time being.

@glassez from my part, you are free to experiment with any solutions - don't feel restricted by my opinion of this right now. If you come up with a good solution that happens to use this suggestion, I can only accept it.

This is a really good question (although not related to this topic, but rather to the current "incomplete folder" implementation) that is worth a separate issue. Do you mind creating it?

#13065

@UnitedMarsupials
Author

The only comment agreeing with you so far was "yeah, why not"

This is obviously not true. In addition to @thalieht's thumb-up, which you acknowledged, @glassez favored the idea, and so did @Chocobo1...

@FranciscoPombal
Member

@UnitedMarsupials

The only comment agreeing with you so far was "yeah, why not"

This is obviously not true. In addition to @thalieht's thumb-up, which you acknowledged, @glassez favored the idea, and so did @Chocobo1...

Did you read those comments carefully? I wouldn't consider this appropriate endorsement (emphasis mine):

I think it make sense although I haven't considered about the drawback (if there are any).

I like this idea, at least on the surface, if you do not go into the possible difficulties that may arise during its implementation. (...)

The latter quote is a fine example of this:

@glassez's comments and main concern are, as far as I understand them, about the system in general and gracefully defining and handling "complete/incomplete" states. They are not necessarily agreeing with you either - they are just thinking about whether or not this can be a good general solution (or at least parts of it) to the general problem.

@UnitedMarsupials
Author

I wouldn't consider this appropriate endorsement

Of course, you wouldn't. All confirm, what I already stated -- that the idea itself is good, but, maybe, not good enough to justify difficulties of the implementation.

Your attempt to rephrase the above as "very small benefit of literally only one person with an arguably flawed workflow who asked for it" is as ridiculous as it is dishonest -- it was not among your two original objections, you've moved the goal posts since then.

Obviously, you will not be participating in implementing it, nor will I be -- so I don't need to convince you.

Hopefully, someone else will pick it up. I will, once again, try to unsubscribe from this thread now -- if, being the originator, I cannot unsubscribe, I may be tempted to comment again -- but I intend to resist the temptation.

@FranciscoPombal
Member

FranciscoPombal commented Jun 24, 2020

@UnitedMarsupials

All confirm, what I already stated -- that the idea itself is good, but, maybe, not good enough to justify difficulties of the implementation.

Seems we're reaching an agreement. I would still note that I would not consider the idea intrinsically "good". IMO the only value of this idea resides in whatever extent it can be leveraged (fully or partially) to help with #7164 and #13065. It's only intrinsically good for your unique use case.

Your attempt to rephrase the above as "very small benefit of literally only one person with an arguably flawed workflow who asked for it" is as ridiculous as it is dishonest -- it was not among your two original objections, you've moved the goal posts since then.

Lol. Moved the goal posts? You're quoting a paraphrasing of the objections I expressed originally in my very first comment on this issue: #13013 (comment).

@UnitedMarsupials
Author

UnitedMarsupials commented Jun 24, 2020

Sigh, unsubscribing is not working.

It's only intrinsically good for your unique use case.

How do you know, that my case is unique? That no one else -- or very few people -- would prefer for the fully-downloaded files to not eat up the space of the temporary filesystem beyond what's actually needed, and to be able to start using the fully-downloaded files (from the final destination) as soon as possible?

At least three people here have concurred, that this would be nice -- even if there remain reservations about the ease of implementing it. "Unique use case"? Laughing out loud!

Moved the goal posts?

Yes. Your original objection was, well, objective: "There are enough bug reports open with moving/external storage/temp folder as is, changing something that already works is too risky."

You're now trying to make it about me and my "arguably flawed use-case". That's not paraphrasing, that's a complete change -- the goal posts haven't just moved, you put them into the showers!

@FranciscoPombal
Member

How do you know, that my case is unique? That no one else -- or very few people -- would prefer for the fully-downloaded files to not eat up the space of the temporary filesystem beyond what's actually needed, and to be able to start using the fully-downloaded files (from the final destination) as soon as possible?

It's true that you might not be the only one. But qBittorrent has been around for quite a while, and this is the first time this has been requested. And so far, no one else is saying "me too, this is exactly what I've been looking for!".

At least three people here have concurred, that this would be nice -- even if there remain reservations about the ease of implementing it. "Unique use case"? Laughing out loud!

Again, I don't consider "yeah why not" sufficient to count as agreeing to proceed with the change, or that it really is a good change.

You're now trying to make it about me and my "arguably flawed use-case". That's not paraphrasing, that's a complete change -- the goal posts haven't just moved, you put them into the showers!

Stop accusing me of "moving goalposts". Read #13013 (comment) again. From the start, I clearly expressed the sentiment that your proposal is a solution looking for a problem, due to you "using it wrong".

@UnitedMarsupials
Author

[Apologies for the delay - I thought I posted this, and just found it still waiting in my browser]

And so far, no one else is saying "me too, this is exactly what I've been looking for!".

How many of the application's users even have a GitHub account?..

I don't consider "yeah why not" sufficient to count as agreeing to proceed with the change

What would you consider to be that? A board of directors' vote? A CEO's order? "Yeah why not" is just what one can hope for -- until someone comes up with an actual implementation (a pull request), at their leisure.

Meanwhile, although you vowed to "not block" any such development, you're doing your rhetorical best to discourage it.

Stop accusing me of "moving goalposts". Read #13013 (comment) again.

Maybe, you should read it again. You listed two objections:

  1. There are probably many users whose workflow depends on the files only moving to the "Completed" directory once finished. Any scripts written with this assumption (which is the standard for this type of feature) will break.
  2. There are enough bug reports open with moving/external storage/temp folder as is, changing something that already works is too risky. This new mechanism seems rife for tons of edge cases and corrupted fastresumes in case the torrent and/or some or all of its files are renamed in the client mid-download, or if the client crashes, or if files are unselected, etc etc etc.

The first one you quickly conceded, and the second does not fault my "unique" install at all. When I called this argument self-contradicting (someone less kind could also call it FUD), you switched onto criticizing my setup, and my "unique" use-case. That's when the metaphorical goal posts were moved...

But, maybe, they weren't -- maybe, you simply forgot to put the number (3.) in front of your original suggestion -- where you advised me to get a bigger drive...

That suggestion is still wrong -- and irrelevant -- even if I get a 10x-bigger cache-drive, it would still be better for completed files to be moved to the final destination soonest. The current implementation is using the cache sub-optimally -- regardless of how big it is. This is an objective fact independent of the flaws of people discussing it and their computer-configurations.

It is a fact because the sole advantage of the cache -- compared to a permanent location -- is the more efficient/faster handling of random writes in general and of gaps in files in particular, a very common occurrence in torrent downloads. That's the primary (only?) reason for anyone to use the cache filesystem for torrents at all.

But, once a file is completely downloaded, that advantage is gone -- to continue to use the cache filesystem for it even for a second is wasteful. Wasteful because cache is more expensive -- by the very nature of the concept -- otherwise the user would've simply used the space dedicated to cache as permanent. Wasteful for everyone, who uses the feature -- not just me -- even if some can afford it better than others.

This takes care of the 3rd of your 2 original objections... But, clearly, you're not going to be the developer implementing this anyway. So, why not unsubscribe from the ticket until someone else comes up with an actual PR?

@FranciscoPombal
Member

@UnitedMarsupials

How many of the application's users even have a GitHub account?..

More than you'd think. A lot of people create GH accounts just to post here. I've seen a lot of users whose first issue ever posted is posted here. Plus sometimes people come over from other communication channels (such as the forums) and echo thoughts and ideas discussed there.

What would you consider to be that? A board of directors' vote? A CEO's order? "Yeah why not" is just what one can hope for -- until someone comes up with an actual implementation (a pull request), at their leisure.

Meanwhile, although you vowed to "not block" any such development, you're doing your rhetorical best to discourage it.

You seem to think "not block" means something more than it really means. Yes, I'm trying to discourage the core team from working on this unless there is a solid agreement and reasoning. In the meantime, anyone else can take their time to come up with an actual implementation (a pull request), at their leisure, like you said. But ultimately, no matter what I say, anyone can do whatever they want, core team or not.

...
you switched onto criticizing my setup, and my "unique" use-case. That's when the metaphorical goal posts were moved...

Again, I didn't "switch". That point is right there in the first post.

But, maybe, they weren't -- maybe, you simply forgot to put the number (3.) in front of your original suggestion -- where you advised me to get a bigger drive...

I edited my post accordingly to reflect this (added a 3.). Though in the future you should probably read more into the content and not the form when interpreting posts. I'm all for proper presentation, but come on. It's a bit obtuse to claim you misunderstood my point this whole time just because of a minor formatting inconsistency.

That suggestion is still wrong -- and irrelevant -- even if I get a 10x-bigger cache-drive, it would still be better for completed files to be moved to the final destination soonest. The current implementation is using the cache sub-optimally -- regardless of how big it is. This is an objective fact independent of the flaws of people discussing it and their computer-configurations.

It is a fact because the sole advantage of the cache -- compared to a permanent location -- is the more efficient/faster handling of random writes in general and of gaps in files in particular, a very common occurrence in torrent downloads. That's the primary (only?) reason for anyone to use the cache filesystem for torrents at all.

But, once a file is completely downloaded, that advantage is gone -- to continue to use the cache filesystem for it even for a second is wasteful. Wasteful because cache is more expensive -- by the very nature of the concept -- otherwise the user would've simply used the space dedicated to cache as permanent. Wasteful for everyone, who uses the feature -- not just me -- even if some can afford it better than others.

In qBittorrent, the primary purpose of downloading torrents to storage medium A (the "cache") and then moving them to storage medium B (assuming A is faster than B) when completed is to get better performance when downloading.
Moving each file as it is completed would effectively probabilistically increase the number of torrents you could have active in the cache at the same time without filling it completely, but you'd still need to be careful about overcommit, and all this at the cost of violating some important assumptions currently in place in the storage move subsystem (namely the "atomicity" of the torrent), thus increasing complexity. I'm mostly concerned about the ensuing difficulty in correctly saving/handling paths scattered across disks in fastresumes, especially in corner cases where users use removable storage and the like.

The best case scenario for your proposal is something like allowing you to add 1 TiB torrents made up of 10 MiB files to a mere 100 GiB cache, where not many files are "in flight" at a time. But you can't really predict that each file will be completed quickly enough. In the worst case, all the files only complete at the very end, meaning in the meantime they fill up your cache completely. Much the same as currently, you could add 6x 20 GiB torrents hoping that at least 1 would finish and get moved out of the cache fast enough for the total space occupied to never exceed 100 GiB over the course of downloading all of them, but you can't be sure. Your proposal simply skews the odds of this happening in the user's favor. To me, the trade-off in complexity is not worth it for the core team to focus on this, and that's the message I'm trying to convey.
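The best and worst cases above can be checked with a toy model. This is only a sketch under simplifying assumptions: sizes are in abstract units (the 1 TiB / 10 MiB example scaled down to 100 files of 10 units), each step writes one unit to one file, and a finished file leaves the cache instantly. `peak_cache_usage` is a hypothetical helper, not anything from qBittorrent.

```python
def peak_cache_usage(file_sizes, write_order, move_per_file):
    """Return the largest number of units held in the cache at any moment.

    file_sizes:    size of each file, in abstract units.
    write_order:   sequence of file indices; each entry writes one unit to that file.
    move_per_file: if True, a file leaves the cache once its last unit arrives;
                   if False, nothing leaves until the whole torrent is done.
    """
    written = [0] * len(file_sizes)
    in_cache = 0
    peak = 0
    for idx in write_order:
        written[idx] += 1
        in_cache += 1
        peak = max(peak, in_cache)
        if move_per_file and written[idx] == file_sizes[idx]:
            in_cache -= file_sizes[idx]  # the finished file is moved out
    return peak


sizes = [10] * 100  # 100 files of 10 units each, 1000 units in total

# Best case: files arrive strictly one after another.
sequential = [i for i in range(100) for _ in range(10)]
# Worst case: round-robin, so every file completes only in the final pass.
round_robin = [i for _ in range(10) for i in range(100)]

best = peak_cache_usage(sizes, sequential, move_per_file=True)    # 10
worst = peak_cache_usage(sizes, round_robin, move_per_file=True)  # 901
whole = peak_cache_usage(sizes, sequential, move_per_file=False)  # 1000
```

In this model, per-file moves keep the peak at a single file's size in the best case, but in the round-robin case the peak is 901 of 1000 units, barely better than moving the whole torrent at the end.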

But, clearly, you're not going to be the developer implementing this anyway.

Yes, anyone else is more than welcome at giving it a shot.

So, why not unsubscribe from the ticket until someone else comes up with an actual PR?

I'll speak when I have things to say. If you don't want conversation, why don't you unsubscribe until someone else comes up with an actual PR? (Actually, I don't think you can do that since you're the OP, but you can always ignore/stop posting.)

7 participants