Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deleting local snapshots using criteria should take into account the timestamp #18112

Closed
ktoso opened this issue Aug 3, 2015 · 10 comments
Closed
Assignees
Labels
bug help wanted Issues that the core team will likely not have time to work on t:persistence
Milestone

Comments

@ktoso
Copy link
Member

ktoso commented Aug 3, 2015

Discovered by Stephen McDonald in the akka-user thread: https://groups.google.com/forum/#!searchin/akka-user/localsnapshotstore$20ignores$20timestamps/akka-user/0CpgVgi0PE8/6gYW5v6zjToJ

I came across some surprising behaviour with the LocalSnapshotStore - I was trying to implement what I imagine would be a common pattern: remove older snapshots each time a new snapshot is created. I did this with deleteSnapshots(SnapshotSelectionCriteria(meta.sequenceNr, meta.timestamp - 1)), only to find it would delete all snapshots for the given persistence ID and sequenceNr, regardless of timestamp given.

After staring at my own code for a very long time, I looked at the source for LocalSnapshotStore and found that it seems to ignore timestamps altogether:

https://github.com/akka/akka/blob/master/akka-persistence/src/main/scala/akka/persistence/snapshot/local/LocalSnapshotStore.scala#L161

I initially thought this to be a bug, but looking at the history it appears to have been intentionally removed:

63baaf1

I was able to fix this with a minor change in the attached diff. Is there any interest in this fix? Or am I naive to some history around this?

I guess if it's the intended behaviour, I can easily enough subclass it and override the relevant bits to take timestamps into account - does that sound reasonable?

Attached diff:

diff --git a/akka-persistence/src/main/scala/akka/persistence/snapshot/local/LocalSnapshotStore.scala b/akka-persistence/src/main/scala/akka/persistence/snapshot/local/LocalSnapshotStore.scala
index 4dc1ff6..b513816 100644
--- a/akka-persistence/src/main/scala/akka/persistence/snapshot/local/LocalSnapshotStore.scala
+++ b/akka-persistence/src/main/scala/akka/persistence/snapshot/local/LocalSnapshotStore.scala
@@ -25,6 +25,7 @@ import scala.util._
  * Local filesystem backed snapshot store.
  */
 private[persistence] class LocalSnapshotStore extends SnapshotStore with ActorLogging {
+
   private val FilenamePattern = """^snapshot-(.+)-(\d+)-(\d+)""".r

   import akka.util.Helpers._
@@ -82,7 +83,7 @@ private[persistence] class LocalSnapshotStore extends SnapshotStore with ActorLo

   private def snapshotFiles(metadata: SnapshotMetadata): immutable.Seq[File] = {
     // pick all files for this persistenceId and sequenceNr, old journals could have created multiple entries with appended timestamps
-    snapshotDir.listFiles(new SnapshotSeqNrFilenameFilter(metadata.persistenceId, metadata.sequenceNr)).toVector
+    snapshotDir.listFiles(new SnapshotSeqNrFilenameFilter(metadata.persistenceId, metadata.sequenceNr, metadata.timestamp)).toVector
   }

   @scala.annotation.tailrec
@@ -156,13 +157,14 @@ private[persistence] class LocalSnapshotStore extends SnapshotStore with ActorLo
     }
   }

-  private final class SnapshotSeqNrFilenameFilter(persistenceId: String, sequenceNr: Long) extends FilenameFilter {
-    private final def matches(pid: String, snr: String): Boolean =
-      pid.equals(URLEncoder.encode(persistenceId)) && Try(snr.toLong == sequenceNr).getOrElse(false)
+  private final class SnapshotSeqNrFilenameFilter(persistenceId: String, sequenceNr: Long, timestamp: Long) extends FilenameFilter {
+    private final def matches(pid: String, snr: String, tms: String): Boolean = {
+      pid.equals(URLEncoder.encode(persistenceId)) && Try(snr.toLong == sequenceNr).getOrElse(false) && Try(tms.toLong <= timestamp).getOrElse(false)
+    }

     def accept(dir: File, name: String): Boolean =
       name match {
-        case FilenamePattern(pid, snr, tms) ⇒ matches(pid, snr)
+        case FilenamePattern(pid, snr, tms) ⇒ matches(pid, snr, tms)
         case _                              ⇒ false
       }
@ktoso
Copy link
Member Author

ktoso commented Aug 3, 2015

// cc @stephenmcd

@ktoso ktoso added this to the 2.4-M3 milestone Aug 3, 2015
@ktoso ktoso added the help wanted Issues that the core team will likely not have time to work on label Aug 3, 2015
@stephenmcd
Copy link
Contributor

Thanks Konrad. Any comment on the patch? Would you like me to set up a PR?

@stephenmcd
Copy link
Contributor

Ignore my previous message here, I read this before the mailing list thread.

I'll set up a PR.

@ktoso
Copy link
Member Author

ktoso commented Aug 4, 2015

Thanks! :-)

@ktoso ktoso added the 3 - in progress Someone is working on this ticket label Aug 4, 2015
@patriknw patriknw self-assigned this Aug 13, 2015
@patriknw
Copy link
Member

I think this issue is invalid. The SnapshotSeqNrFilenameFilter that was pointed out to ignore the timestamp is only used when deleting one single snapshot by its sequence number. User api does not include a timestamp.

I can't see anything wrong with the deleteSnapshots with a criteria. That one takes both seqNr and timestamp into account.

@patriknw patriknw modified the milestones: invalid, 2.4-M3 Aug 13, 2015
@patriknw patriknw removed the 3 - in progress Someone is working on this ticket label Aug 13, 2015
@stephenmcd
Copy link
Contributor

Hi @patriknw, thanks for looking at this.

I can't see anything wrong with the deleteSnapshots with a criteria. That one takes both seqNr and timestamp into account.

On the surface it looks that way, but I think there's still a problem - deleteSnapshots correctly builds a list of files matching the given sequence number and timestamp, but it then calls deleteAsync for each of these, which only takes into account the sequence number, just as you described. So in the case of multiple snapshots existing with the same sequence number, but different timestamps, they'll all be deleted - so yes, while the timestamp is initially taken into account, it ends up being ignored eventually.

Also as far as I've seen with LocalSnapshotStore, sequence numbers are never incremented, so this problem is bound to occur when trying to delete all snapshots other than the latest for a given persistence ID.

Given that, do you think my patch is the correct fix, or something else? For my project I've copied LocalSnapshotStore locally and used the above patch to avoid any data loss, and it's working correctly.

Thanks again, and sorry if I've overlooked something.

@patriknw
Copy link
Member

Thanks for clarifying. Now it makes sense.

@patriknw patriknw reopened this Aug 14, 2015
@patriknw patriknw modified the milestones: 2.4-M3, invalid Aug 14, 2015
@patriknw patriknw added the 3 - in progress Someone is working on this ticket label Aug 14, 2015
@stephenmcd
Copy link
Contributor

No problem at all.

patriknw added a commit that referenced this issue Aug 14, 2015
It was actually used for finding the right metadata, but the local store
deleted all files with matching seqNr.
Note that we use 0L as undefined value for the timestamp when deleting
single snapshot (and therefore it makes sense to delete all in that case)
patriknw added a commit that referenced this issue Aug 14, 2015
…mp-patriknw

=per #18112 Use timestamp in deleteSnapshots
@patriknw patriknw removed the 3 - in progress Someone is working on this ticket label Aug 14, 2015
@stephenmcd
Copy link
Contributor

Just confirming the fix in 0cd94e2 resolves the issue for me - thanks a lot @patriknw!

@patriknw
Copy link
Member

thanks for reporting and confirming

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug help wanted Issues that the core team will likely not have time to work on t:persistence
Projects
None yet
Development

No branches or pull requests

3 participants