-
-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comparing timestamps can lead to data inconsistencies in soft deleted records #8496
Comments
Comparing timestamps for soft deleted records has made the sequelize v4 migration a real pain. I believe it was a mistake to introduce this. I hope this can be changed back in v5, until then we may have to run a fork. The main issue for us is that soft deleted records cannot be excluded from an index. Under v3 we have the condition We encountered this problem attempting to migrate to v4, my initial concerns raised here #7884 The original change #5880 was aimed at solving a semantic problem. I don’t think the consequences of this change were properly thought through. Supporting future deletions of records seems like an afterthought - functionality that in my opinion belongs in the application layer, not the ORM. The issues we have had trying to migrate and the problems outlined by @gabegorelick outweigh the benefit to the few who actually want the feature of future deletions. I do hope this can be addressed. |
If you make |
Thanks for suggesting the above @gabegorelick. I’ve had a look at the This raises the question, why do the time comparison at all? Setting a default value should not have the unintended consequence of negating the scheduled deletion “feature”. It again leads me to believe that it was, at least in part, an afterthought added on to a fix for a "semantic problem". I’m not sure if want to use this as a workaround for my scenario of soft deletion free indexes, as this has the potential to change because the behaviour is confusing. I hope you don’t mind me bringing up these points here, I don’t want to derail the discussion from all your initial points which are spot on. |
I agree with @crispkiwi, my team has also recently run into issues with the new default // set this after model definition because `sequelize.sync`'s `CREATE TABLE`
// statement does not work when this is set in the model definition
MyModel.attributes.deletedAt.defaultValue = { [Sequelize.Op.eq]: null } However, this breaks |
Hey, upfront I'd like to say that, our whole application is running off of Sequelize so I really appreciate this product. That said, this major 4 release has been a nightmare for us. This turned out to be the main issue that we were having. We've had to stop our migration for the current time until we can come up with a lasting solution to this issue as we not going to go any further until we can keep the old behaviour. If we were to create "Proper Indexes" as I was told to by @sushantdhiman via the Slack channel the size of our indexes would become massive and it still runs noticeably slower than it did before. The fact is that we DID have proper indexes before it's just that they indexed out anything that was not We are also looking into Subclassing the Thanks in advance, |
We are trying to move to Sequelize 4 also, and running into this same issue. I similarly believe this was a mistake to force upon everyone by default. We also have indices that use ´deletedAt is not null`, and I would really prefer to keep those. We have no interest in some soft-delete in the future, which I think is not a real use case for the paranoid system. Could this not be made an option to enable/disable at least? |
I have added a PR to make restoring the old behavior an option: #9007 |
This issue is now fixed in v5 branch 014f2f5 |
Now we have a problem when migrating from v4 to v5, what we should do to query like |
Why do you want to perform this query? To me, that query would evaluate to true in every normal circumstance (I can only see it being false if a record was "deleted in the future"). |
Not OP, but if they are asking how to emulate v4's future deletion feature, then that's a valid question. Assuming that's what you want to do, I've had success monkey patching |
@gabegorelick What is this v4's future deletion feature? |
@papb In Sequelize v4, if deletedAt was set to a timestamp in the future, then a record would not be considered as deleted. This was useful to schedule records for deletion. But as pointed out in this issue, it can lead to consistency issues. Thus, it was removed in v5 in favor of considering a record as deleted if deletedAt is set at all. |
@gabegorelick @papb what about provide an option to make deleteAt query more flexible, such as
|
After all, it is irresponsible to add and then remove a speciality |
I've written about this before a few times but I will summarize my concerns again here. Apologies for eschewing the issue template and for the length of this post, but I wanted to use this as a place for summarizing all the issues around Sequelize's practice of comparing timestamps for soft deletes.
Use of client time instead of DB time
Before #8485, Sequelize would soft delete records using client's time, but query for them using database's time (by "client's time" I mean client of the database, e.g. an app server, ad hoc script, etc). This lead to inconsistencies that have been well documented; namely if client and DB clocks aren't in sync, you can potentially see a soft deleted record.
#8485 normalized this so that both delete and query operations use the client's time. This is good in that you no longer have to worry about time zone differences or clock drift between the DB and the client.
But the safer choice would arguably be to use the DB's time rather than the client's. This is because a typical app will have a number of clients (load balanced app instances, ad hoc scripts running on dev machines, etc), all of which need to keep their clocks in sync with each other, compared to only a handful of DB instances (and often only 1).
This problem still exists when using DB time if you have multiple database instances in a cluster, but typically the number of DB instances will be far less than the number of clients. Additionally, DB instances tend to be pretty homogeneous and under tight control, so keeping their clocks in sync would tend to be much easier than keeping client clocks in sync.
That being said, I still think it's still dangerous to rely on DB clocks being in sync, especially when you have multiple DB instances and non-monotonic time sources.
Non-monotonic time source
Javascript's
Date
object is not monotonic, i.e. in the presence of NTP, leap seconds, users manually adjusting the system time, etc, it can tick backwards. When that happens, your previously deleted records are now scheduled to be deleted in the future.For single node deployments, this can potentially be fixed by using something like
process.hrtime
as a time source. But clock drift between nodes is still a problem.Clock drift between nodes
Even if we use a monotonic time source, keeping multiple clocks in sync is really hard. Google resorts to atomic clocks to do it! The linked article mentions that NTP, which I can imagine is more typical than atomic clocks for the majority of Sequelize's users, will typically give you an upper bound for clock offsets at "somewhere between 100ms and 250ms."
Prior art
I haven't researched this too much, but a cursory search reveals that alternatives like ActiveRecord's ActsAsParanoid do not compare timestamps and instead compare to a sentinel value. I'd be curious to see if they're consciously not supporting future deletes or have just never implemented it. Regardless, if Sequelize's behavior in this space really is unique, that seems like more evidence that it's not a safe practice. Especially when it's for a feature (scheduled deletes) that few may use.
Conclusion
I think scheduled soft deletes weaken the consistency of Sequelize. It is very possible to see a soft deleted record 250ms or more after it was soft deleted. Sequelize should consider changing the default behavior to something safer.
The text was updated successfully, but these errors were encountered: