-
-
Notifications
You must be signed in to change notification settings - Fork 759
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Constantly writing to IndexedDB when a task is running is bad for SSDs #2355
Comments
Thank you very much for opening up this issue! I am currently a bit overwhelmed by the many requests that arrive each week, so please forgive me, if I fail to respond personally. I am still very likely to at least skim read your request and I'll probably try to fix all (real) bugs if possible and I will likely review every single PR being made (please, give me a heads up if you intent to do so) and I will try to work on popular requests (please upvote via thumbs up on the original issue) whenever possible, but trying to respond to every single issue over the last years has been kind of draining and I need to adjust my approach for this project to remain fun for me and to make any progress with actually coding new stuff. Thanks for your understanding! |
If you want to store your data and have it available even when crashes occur, you need to write it. I don't see any way around this. One could introduce a throttleling mechanism, but that would raise a ton of complicated problems. |
Thank you @johannesjo but I'm sorry, that's definitely not normal. It literally consumes ~25% of "active time" on my HDD, constantly, for hours, as long as task is running. As I said in the issue, here is the sensible way to save data:
No other program ever writes to disk every second for hours. |
Just to make the point very clear, the app is saving 1.1GB every hour if I let it running, even with "automatic backups" disabled. I have been trying to find a way to fix this. I'm not familiar with ngrx but the line |
I'm surprised no one noticed before, this is a huge issue, the I/O is way too intense, this will cause many problems to a lot of users down the line. I would add that the other way to persist change of running tasks while being sparse with I/O is actually to save in IndexedDB the start datetime the last time the task ran. This way, you don't need to actively store any data over time: implicitly, as long as there is a start datetime, it means the task is still running. The app can even be closed or crash, the state won't change and will be resumed next time the app is open. Then, when you stop the task, Super Productivity only needs to check current time (end datetime) minus start datetime to count the length of time and add to the previous task duration (if any) - and add to the parent project etc to update progress bars etc. So it's just a matter of inversing the logic IMHO, and the app could be very much more sparse both in I/O and battery consumption. |
Thank you @lrq3000. But at this point I'm willing to bare any other inconvenience to fix this issue. I'm busy with work and I rely on super productivity and I have a loooot of tasks which makes every second save bigger in size (10GB on my HDD in one day worth of tracked work). I'm literally obliged to continue writing this much daily because I can't switch right now and there are no good alternatives, so if anyone makes any temporary fix pleaaaaaase share the commit with me (I can build from source). |
This issue has not received any updates in 90 days. Please comment, if this still relevant! |
very relevant 😞 |
@johannesjo Do you need any help with the logic? I'm not a typescript coder, but I have coded lots of timetracking systems, maybe I can help? |
@johannesjo I think it's just a matter of changing TRACKING_INTERVAL to be a configurable option instead of a constant? It's there something i'm not seeing? If so I can make the changes here and submit a pull request. |
@hugaleno Ideally, there should be a variable "session_start_time" that is persisted and holds the current session start time in the task object and a variable "elapsed_time" that is not persisted and gets recalculated every second to update the UI only. |
@RationalFragile I agree that is not the ideal solution but it can save a lot of writings to disk. |
No! If the app crashes, you'll reopen it (when you'll want to stop the task and see that the app has crashed) and the task will still be remembering when it started and therefore you won't lose any progress! You really only want to save the task when the session starts and when the session ends.
You can't fix that unless you follow the method I'm suggesting (actually, suggest by lrq3000 above and I validated it by making a quick C# app that I'm using while waiting for this to be fixed).. Because, the app needs to change and save the cumulative time spent on parent project/task (for subtasks) for example; again, because it's not saving the start time in the task. By contrast, if it only saves the start time and then keeps recalculating all the affected elapsed times, it won't save anything during the refreshes... Hope that makes sense. And I'm just trying to elaborate on why (from a logical perspective) we should only save data that is not transient, the elapsed time is a transient property (look at how Stopwatch is implemented in C# for example, would it make sense to update it every millisecond (busy counting) or simply keep the start value and calculate the elapsed time when requested?) but anyway, if there is an easy fix with just a few lines, of course, might as well try it! |
@RationalFragile I got the point, but I see some things to take into account. 1 - Today the current task is a in memory information, so if the app closes unexpectedly , when it reopen there will be no running task. Just listing what I remember now to do not forget. |
@hugaleno But thank you for listing these things, it's a good thing and it'll make misunderstandings less likely. (Tho, if I'm the only one complaining about this, then don't do it please, it'll take you time and effort and I already coded some basic time tracking for myself to get me by. If someone fixes this, I might reconsider SuperProductivity, but I might just add more features in my app instead... Again, sorry, not trying to sound rude, just trying to save you from wasting time if I'm the only one wanting this! Thank you 🙏) |
@RationalFragile Actually I think that this is important to be resolved. That's a lot of I/O and battery usage |
I think you understood very well how to code this @RationalFragile , but to clarify for others: this kind of feature would require only one attribute per task, and one global atttribute:
Note also that there is also another global attribute that already exists: Now, the UI can display a realtime update of transient elapsed timetracking time by simply calculating When the user stops timetracking, then we can again calculate one last time And voilà, with just 4 variables (2 stored globals, 1 stored per task, 1 not stored but only used at runtime - transient_elapsed_time) we get a zero I/O timetracking feature, and on top we get for free the ability to continue tracking even when the app is closed/killed, or even synchronize timetracking between devices (just store the globals in the database file that can be synced, and other devices can timetrack the same task simultaneously - you can start timetracking on your smartphone and continue on your computer!). Sounds like a great feature, no? ;-) |
So it should be simple to implement on top of the current codebase, and the codebase is very clean (awesome work by Johannes!), it's just because I suck at Typescript that I cannot do it on my own :-/ |
PS: also the changes would be retrocompatible: only the way new data is generated is changed, and the end result is the same as before, the only change is that time can then also be tracked while the app is closed, but this behavior can be disabled with a boolean option (if this option is enabled, then when the app is opened, if the global attributes |
Wow! Thank you everybody! This one of the many reasons why I love open source so much! Strangers on the internet collaborating on a complex problem. Wonderful! :) Thank you @lrq3000 ! To summarize with my own words: The plan is to use a separate model to do the time tracking stuff and only "flush" it to the task model whenever tracking is stopped, right? Some things to consider:
On a side note: In general it would be lovely to have more granular writes, so that when a task is updated only that task would be written. But since this might be huge undertaking, I am not sure if this is realistic. |
Taking into account everything was said and looking through the code here are my thoughts:
I don't think we have to change anything except for the part that it saves to the disk after every tracking interval(1 second). I think one approach we could use is: We can add a new action that will trigger the That leave us with the following problem: when the app crashes or closes unexpectedly we could lose the tracked time. To address this we will persist the When the app starts, we look at this track information:
The main benefit I see in this approach is: 1 - We don't have to reinvent the wheel and implement everything Johannes already did to do the same thing in a different way |
Sorry it got super busy at work for me, I forgot this discussion tbh!
I'm not sure about the model part, I don't know the internal architecture of the software, but yes, my suggestion is mostly about making flushing lazy until we need to commit on-disk, but not just that: we also need to make the timetracking variables persistent to ensure we are robust against crashes (and potentially synchronize across devices, more about that below).
Yes no problem, just add the computed interval to those too when timetracking is getting flushed.
I'm not sure either, but after some more thoughts since your question, I think it would be easier/less prone to conflicts to synchronize timetracking state rather than force flushing, because if timetracking state is not synchronized, then two different timetrackings can be started on two different devices, and then you get a nasty conflict when trying to synchronize, where you either lose one tracking flush or the other. To avoid this issue, I can't see any other solution than to synchronize timetracking state, and if we do that, then it's unnecessary to force flush anyway. @hugaleno Thank you very much for having a more precise look in the sourcecode! Your plan sounds awesome! Given what you found, I agree it's unnecessary to rewrite logic that is already there, if just making a few fields persistent and making flush lazy is sufficient to fix this issue then awesome :D BTW could you please point to which file and functions you found these (if you remember)? |
@lrq3000 Just made a merge request for a initial review with all the changes I made so far, including some debug. |
This issue has not received any updates in 90 days. Please comment, if this still relevant! |
This is still relevant |
Very relevant! I just learned about the programming concept of holding data in a buffer and writing it at once, less often in a JS performance workshop today..and I instantly thought of this specific issue from almost a year ago. Thanks @ThePrimeagen ;) |
was this solved or just closed? i stumbled upon this randomly, and superproductivity is on my 2nd top disk-writes after firefox where i mostly stream videos. |
If someone has a good suggestion on how to solve this, I am all ears. I am not sure if there is a way around potential dataloss e.g. when the app or system crashes before writing, if we implement some kind of caching. But Ia m no expert on the subject, so input is welcome. |
Jeg har samme problemet. Et det bara med snap? |
Your Environment
Expected Behavior
It should persist changes of course but not write them to disk every second. Something like once per minute when a task is running and on stop/start is more reasonable.
Current Behavior
It constantly writes to files like "userdatadir\IndexedDB\file__0.indexeddb.blob\1\00" but only when a task is running. And because tasks run for hours and hours, process explorer is showing that superproductivity wrote more than 3 times what my chrome writes. (And I noticed my SSD lifespan shrunk from a projected 13 years to 10 years in less than 6 months, and superproductivity is the program that writes the most to disk, after "System" in Process Explorer. I moved it to my HDD but still, I think this should be changed.)
Steps to Reproduce (for bugs)
Can you reproduce this reliably?
Yes
Console Output
Error Log (Desktop only)
EDIT:
This can be clearly seen in the electron devtools, when a task is running, you get multiple "[Persistence] Save to DB" events every second.
The text was updated successfully, but these errors were encountered: