Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent behavior for paranoid records since #5897 (see issue #7204) #7290

Closed

Conversation

Wsiegenthaler
Copy link
Contributor

This is a fix for 2 issues introduced by #5880/#5897 and documented in #7204:

  1. Paranoid records remain visible for several hours after they've been deleted. How long the records take to delete seems to depend on how far the server is from UTC. The solution is to always use database time when populating and later evaluating the deletedAt field.
  2. Deleted records can be queried in the same second. The paranoid clause should use 'greater than' instead of 'greater than or equal to' when evaluating the deletedAt field.

Given that this issue fundamentally affects the ability to delete paranoid records, I'd advocate for getting this into v4.

Relevant tests have been updated and now pass, and this change has been added to the 'future' section of the changelog. Please let me know if there's anything else I can do to expedite this!

@codecov-io
Copy link

Codecov Report

Merging #7290 into master will decrease coverage by -18.58%.
The diff coverage is 100%.


Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 09e79b2...1353275. Read the comment docs.

Copy link
Contributor

@sushantdhiman sushantdhiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its a good addition to soft delete system, It should really be querying CURRENT_TIMESTAMP. This wasn't a problem for me because mostly my database and app servers are running on UTC. This will help users using fragmented timezones for app and db server.

I have a few comments that needs to be addressed

}).spread(function(users) {
expect(users[0].username).to.equal('Peter');
expect(users[1].username).to.equal('Paul');

expect(moment(new Date(users[0].deletedAt)).utc().format('YYYY-MM-DD h:mm')).to.equal(this.date);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests can still work, current timestamp will be returning UTC date, which can be compared with this.date defined above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why, but when I run this test the locally instantiated date is set to 1970-01-01 12:00 which is nowhere near the CURRENT_TIMESTAMP of the database, thereby causing the test to fail. Perhaps this is part of the test suite?

Regardless, this seems to highlight the problems that can arise when database and system clocks for whatever reason aren't synced. We could still compare the deletedAt field assigned by the database with the locally instantiated one, but seeing as the server time is no longer used to populate the database I'm not sure it buys us anything.

@@ -1810,7 +1810,7 @@ describe(Support.getTestDialectTeaser('Instance'), function() {
return this.ParanoidUser.create({ username: 'fnord' }).then(function() {
return self.ParanoidUser.findAll().then(function(users) {
return users[0].destroy().then(function() {
expect(users[0].deletedAt.getMonth).to.exist;
expect(users[0].deletedAt.val).to.be.equal('CURRENT_TIMESTAMP');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be populated with actual date object, which is returned by CURRENT_TIMESTAMP ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since updates only return the number of affected rows, we would need to issue a subsequent SELECT to retrieve the actual timestamp assigned by the database.

@Wsiegenthaler
Copy link
Contributor Author

Wsiegenthaler commented Mar 4, 2017

Hello Sushant,
Thanks for taking a look! I've responded to both of your comments though I'm not very pleased with the answers. It seems that treating deletedAt as a timestamp rather than a mere boolean (à la #5897) will have a caveat either way we go.

The caveat as things currently exist (caveat probably being an understatement), is that unless both your server and database are hosted in UTC, deletions may take effect several hours after expected.

On the other hand, with the introduction of this PR, instances returned by destroy() will not contain a valid deletedAt timestamp but instead the CURRENT_TIMESTAMP literal used for the UPDATE. This doesn't feel great either, but in my opinion is far preferable to discrepancies between server, database, and UTC time causing inconsistent deletions.

Of course a third option is to revert back to treating deletedAt as a boolean. This obviates the need for consistent timestamps, but we lose the ability to schedule records for future deletion. Speaking of which, by using CURRENT_TIMESTAMP as this PR does, setting a future deletedAt is impossible using the standard destroy() call anyway. In the interest of getting 4.0 out the door it might be best to pull #5897 until all the details are thought through.

I've only been using Sequelize a short while so I'll be relying on you guys to suggest the most preferred solution. I do think we have to do something to address the current issue before 4.0 however - otherwise users will almost surely find their deleted records suddenly coming back from the dead.

@sushantdhiman
Copy link
Contributor

Thanks @Wsiegenthaler, I did some brainstorming about concerns raised by this PR. Although some of these concerns are valid but we can have timestamp backed soft delete, without much change.

First, I think change $gte => $gt is acceptable, I have no problem with that. Its the right way to do this.

Second, Sequelize consider your database server is running in UTC (Your application server can run in any timezone). Working of our timezone system as explained by @janmeier 2 years ago #854 (comment)

Main concern you raise in #7204 was The server time may be inconsistent with the database time. You mentioned your app server is running in west coast (I assume -08:00 UTC). You just need to set timezone option to correct setting. Because Utils.now generate a timestamp which is backed by timezone mentioned in sequelize.config.timezone

With correct timezone setting, the offset you were noticing should be fixed. Sequelize was considering your app was running in UTC the whole time.

As I tested with Asia/Kolkata (+ 05:30), you can see it correctly set the timezone

Executing (default): INSERT INTO "Ms" ("id","test","createdAt","updatedAt") VALUES (DEFAULT,34,'2017-03-05 11:08:39.444 +05:30','2017-03-05 11:08:39.444 +05:30') RETURNING *;
{ id: 1,
  test: 34,
  updatedAt: 2017-03-05T05:38:39.444Z,
  createdAt: 2017-03-05T05:38:39.444Z,
  deletedAt: null }
Executing (default): UPDATE "Ms" SET "deletedAt"='2017-03-05 11:08:39.466 +05:30' WHERE "id" = 1 AND "deletedAt" IS NULL
{ id: 1,
  test: 34,
  updatedAt: 2017-03-05T05:38:39.444Z,
  createdAt: 2017-03-05T05:38:39.444Z,
  deletedAt: 2017-03-05T05:38:39.466Z } // Correct UTC based timestamp
Executing (default): SELECT "id", "test", "createdAt", "updatedAt", "deletedAt" FROM "Ms" AS "M" WHERE (("M"."deletedAt" > CURRENT_TIMESTAMP OR "M"."deletedAt" IS NULL) AND "M"."id" = 1);
null

@Wsiegenthaler
Copy link
Contributor Author

I just ran some tests and you're absolutely right - it appears the crux is to always keep your database in UTC. I did just this in my dev environment and haven't seen any problems, but are there any pitfalls I should be aware of when changing the timezone in production?

I'll be closing this PR and opening a new one with only the $gte change to the paranoid clause. I think the UTC requirement really ought to be in the docs front and center to prevent others from having the same issues. Let me know if I can help with that.

Thanks again for taking the time to work through this!

@gabegorelick
Copy link
Contributor

Apologies for coming late to this, but issues like #8202 have me diving into the consistency of Sequelize in this area.

  1. Reads that occur less than a millisecond after a delete (i.e. less than the timestamp resolution) will still have this issue. That's potentially what's happening in Same second query for paranoid tables returning soft deleted records #8202. The only solution I see for this is to drop the deleted_at > CURRENT_TIMESTAMP clause when querying and only rely on deleted_at IS NULL.

  2. More serious, it seems to me that even if the server and DB are both operating in UTC, the server's clock could be ahead of the DB's. My understanding (and please correct me if I'm wrong) is that soft deletes use the server's current time, and not the DB's. That seems problematic if it's coupled with using the DB's time when querying since records you intended to immediately delete won't actually be deleted until the DB's clock catches up. The solution to this would be to use CURRENT_TIMESTAMP when deleting if no custom delete_at is passed.

Let me know if there's a better place to continue this discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants