Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds support for filtering out deleted events from queries #191

Merged
merged 10 commits into from Aug 21, 2018

Conversation

@renatocaval
Copy link
Member

renatocaval commented Jun 1, 2018

replaces #162

(description copied from #162 for context)

Originally from #104 initiated by @morozov1.

This PR introduces filtering of deleted events. To better understand why this is needed, we must understand the historical reasons for this and make some guesses about its evolution.

What follows is an assumption, I don't have all the knowledge of how the plugin architecture evolved. Therefore will try to outline here all the information I could collect and the assumptions I'm making.

Delete Journal Events

I guess that everybody agrees that an Event Sourcing application should not delete events. However, sometimes this may be needed (free up space) or required by law (currently flawed because of snapshoting and highest seq num issue, read further). So, this feature makes sense but has to be used with caution.

Physical or Logical Delete

Events can be physically or logically deleted. Logical deletion seems at first a very odd feature because it doesn't help at all in freeing up space nor does it fulfill legal regulations.

As far I can understand, the only reasons for logical deletions are:

  1. quarantine - events are deleted and after some period, and if nothing gets broken, events are effectively deleted by a DB admin.
  2. plugin spec requires that the highest sequence number ever used by a persistent actor is not lost. How this is achieved is an implementation detail. In the case of akka-persistence-jdbc, this is done by performing a logical delete on the latest event. The event is kept because of the highest seq num requirement, but it's invisible to the outside.

Logical Deletes and Backward "awkward" Compatibility

The previous behavior was logical deletion with events being delivered on query side. This is very odd. Why one would perform a logical delete on the write-side and still consume the events on the read-side? If the events are still needed, one should not "delete" them.

This PR preserves this behavior. Logical deletes are enable by default and events are delivered when querying.

Physical Deletes

Physical deletes are now possible (see #161, thanks @dmi3zkm). However, as mentioned above, the last event is not physically delete because we need to be able to trace the highest seq num per persistent actor. Because of the logical deletion of latest event is an internal trick, we should treat it as a real delete, therefore they are not delivered on queries.

@renatocaval renatocaval requested a review from WellingR Jun 1, 2018
#
# This property affects jdbc-journal.logicalDelete and jdbc-read-journal.includeLogicallyDeleted.
#
logicalDeletion.enable = true

This comment has been minimized.

Copy link
@patriknw

patriknw Jun 1, 2018

Member

I don't think logical deletes are supported by Akka Persistence any more (since long time back) so I find all "logical" stuff here confusing. I believe the plugin is using it for keeping track of the highest sequence number if all events are deleted, but that is an implementation detail and such events should never be shown in query results.

This comment has been minimized.

Copy link
@renatocaval

renatocaval Jun 1, 2018

Author Member

That's the conclusion we had as well. So, we want to move to hard deletes and preserve the last one as logically deleted event because of the sequence nr.

The problem is that we should keep it backwards compatible, even if the previous behaviour doesn't make much sense. At least for me, it doesn't make sense at all. For more context see discussion #162.

What do you suggest instead:

  1. should we make hard delete the default case and have the config to allow users to have the old behaviour?
  2. or should we simply drop the concept of logical deletes? Deleted is deleted.

This comment has been minimized.

Copy link
@patriknw

patriknw Jun 1, 2018

Member

I think it's a bug to return such a deleted event from queries.
Makes no sense to see one deleted event but not others.

This comment has been minimized.

Copy link
@renatocaval

renatocaval Jun 1, 2018

Author Member

That was not the behaviour, the behaviour was to logically delete them and still deliver on the query. All of them.

But yeah, maybe we should consider that all as a bug because it makes no sense at all.

This comment has been minimized.

Copy link
@patriknw

patriknw Jun 1, 2018

Member

ok, I thought it was some special case for the last event (sorry I haven't looked much at code here)

I agree that all logical deletion stuff should be removed in next "breaking change" release.

@WellingR

This comment has been minimized.

Copy link
Member

WellingR commented Jun 1, 2018

I am in favour of dropping the logical deletes in version 4.0.0

else
sql"""SELECT DISTINCT "#$persistenceId"
FROM #$theTableName
WHERE rownum <= $max AND deleted = false""".as[String]

This comment has been minimized.

Copy link
@WellingR

WellingR Jun 6, 2018

Member

The deleted column can be renamed in the config, so you need to apply the same trick here as with the persistenceId

Copy link
Member

WellingR left a comment

These changes look good (except for the small comments I had).

In case we decide to drop the logical deletion in the 4.0.0 version then we should probably include a deprecation warning somehow.

Places where we could log this warning

  • Whenever the plugin is started (e.g. when it loads the config)
  • Whenever a query is executed which could potentially return deleted events
  • Whenever an actual delete is executed.

I guess the first of these options would be the best, since the startup logging is the logging that would be noticed first.

WHERE "#$tags" LIKE $theTag
AND "#$ordering" > $theOffset
AND "#$ordering" <= $maxOffset
AND deleted = false

This comment has been minimized.

Copy link
@WellingR

WellingR Jun 6, 2018

Member

Same as my other comment, you should be able to use #$deleted here

@@ -41,6 +41,7 @@ docker {
}

jdbc-journal {
logicalDelete = ${akka-persistence-jdbc.logicalDeletion.enable}

This comment has been minimized.

Copy link
@WellingR

WellingR Jun 6, 2018

Member

I don't think it is actually needed to include this, since this is the same as in the reference.conf (the same holds for includeLogicallyDeleted below)

This comment has been minimized.

Copy link
@renatocaval

renatocaval Jun 7, 2018

Author Member

Actually we need it, but this is needed only because of how the config is being loaded in the tests. The var replacement was just not working as expected, so I have to re-add it here.

Basically, the var replacement happens too early. First, it loads reference.conf and it resolves the var to true, then it loads the test config and merges with the resolved reference, but at that point, there is nothing else to replace.

I tested with a real application and it works. We can configure akka-persistence-jdbc.logicalDeletion.enable and it propagates to the two others as expected.

@akka akka deleted a comment Jun 7, 2018
@renatocaval

This comment has been minimized.

Copy link
Member Author

renatocaval commented Jun 7, 2018

@WellingR, I fixed the tests. It should pass now.

@akka akka deleted a comment Jun 7, 2018
// delete all three events and wait for confirmations
(actor1 ? DeleteCmd(1)).futureValue shouldBe "deleted-1"
(actor1 ? DeleteCmd(2)).futureValue shouldBe "deleted-2"
(actor1 ? DeleteCmd(3)).futureValue shouldBe "deleted-3"

This comment has been minimized.

Copy link
@renatocaval

renatocaval Jun 7, 2018

Author Member

I improve the tests for the hard delete.

We must delete all events and check that nothing is returned. As such, we prove that the last event, the one that stays as logically deleted, is not returned from the query.

We don't need to test that there is still one logically deleted event in the journal because the test-kit captures that already. That's how we discovered that we need to keep the last one.

@renatocaval

This comment has been minimized.

Copy link
Member Author

renatocaval commented Jul 11, 2018

@WellingR somehow I haven't seen your comments on adding deprecated warnings. Good idea, I will add it.

@akka akka deleted a comment Jul 11, 2018
@WellingR

This comment has been minimized.

Copy link
Member

WellingR commented Jul 11, 2018

Looks good. All that remains is the deprecation warning

@renatocaval renatocaval force-pushed the renatocaval:logically-deleted-events branch from 2663a1c to 1fe9854 Aug 8, 2018
@akka akka deleted a comment Aug 8, 2018
// we should not introduce another dependency here.
// Therefore, we make sure we only log a warning for logical deletes once
lazy val logWarnAboutLogicalDeletionDeprecation = {
logger.warn("Logical deletion of events is deprecated and will be removed in version 4.0.0")

This comment has been minimized.

Copy link
@WellingR

WellingR Aug 9, 2018

Member

"will be removed in akka-persistence-jdbc version 4.0.0"

Also it might be a good idea to include the name of the config value the user needs to change to disable this.

@akka akka deleted a comment Aug 9, 2018
// Therefore, we make sure we only log a warning for logical deletes once
lazy val logWarnAboutLogicalDeletionDeprecation = {
logger.warn(
"Logical deletion of events is deprecated and will be removed in akka-persistende-jdbc version 4.0.0." +

This comment has been minimized.

Copy link
@WellingR

WellingR Aug 9, 2018

Member

Add a space at the end of the string. Other wise the concatentation will be something like ".. version 4.0.0.To disable ..."

@akka akka deleted a comment Aug 20, 2018
@WellingR

This comment has been minimized.

Copy link
Member

WellingR commented Aug 20, 2018

I want to merge this, however I see one last issue, the oracle test seems to be broken

[info] - should not return deleted events when using CurrentEventsByTag *** FAILED *** (270 milliseconds)
[info] java.lang.AssertionError: assertion failed: expected OnComplete, found OnError(java.sql.SQLSyntaxErrorException: ORA-00904: "FALSE": invalid identifier

[info] *** 1 TEST FAILED ***
[error] Failed tests:
[error] akka.persistence.jdbc.query.OracleHardDeleteQueryTest

@renatocaval

This comment has been minimized.

Copy link
Member Author

renatocaval commented Aug 20, 2018

@WellingR oh dam!
I can't test it locally because it never passes on my local docker. I don't know why.

I have the impression it passed before though, but that's probably just an impression. :-)

I will check it tomorrow.

@akka akka deleted a comment Aug 21, 2018
@renatocaval

This comment has been minimized.

Copy link
Member Author

renatocaval commented Aug 21, 2018

@WellingR fixed and as an extra managed to fix my oracle docker!

@WellingR WellingR merged commit c0a6bad into akka:master Aug 21, 2018
2 checks passed
2 checks passed
Codacy/PR Quality Review Up to standards. A positive pull request.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@WellingR WellingR added this to the 3.5.0 milestone Sep 7, 2018
evgenyzic added a commit to evgenyzic/akka-persistence-jdbc that referenced this pull request Dec 16, 2019
* added support for filtering out delete events
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.