Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DT.AzureStorage: add randomization to history blob names to avoid races #891
DT.AzureStorage: add randomization to history blob names to avoid races #891
Changes from 5 commits
49dcce5
8920c49
9e575cc
dad88f8
f9efbd0
40ad7f3
5c1742d
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can we call this method
DeleteBlobAsync
? I don't see any logic here related to orphaned data.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TrackingStoreBase
already implementsITrackingStore
so you can remove this.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason for us not to use a GUID in this case? I would suspect that may be more "unique" than a random number, but perhaps that's incorrect.
Also - do we need to be concerned about creating orphaned blobs as a result of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have it in front of me, but it would be good to check our purge logic to see if it might end up missing orphaned blobs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A GUID of 128 bit is indeed more random than the 32bit used here. But for this particular purpose I think 32bit is more than enough - split brain does not happen frequently enough to make collisions likely. I didn't want to waste so much space in the blob name.
The purge operation seems to wipe out the entire blob directory belonging to the instance id, so no problem there.
One thing I am not sure about is the case where a history is replaced (e.g. after ContinueAsNew). There should be some code somewhere that removes blobs for the old history (even without the proposed randomization) but I haven't quite spotted it yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at the current code some more. It simply does not remove any old blobs when updating histories - so those blobs may accumulate until purging the instance. However, importantly, in some select cases, it can overwrite old blobs with a new blob of the same name. Tor example, when calling continue as new and the new blob happens to have the same event type, and the same large field, and be in the same row of the history, then the blob is overwritten.
This latter case is the important one for entities, since they often have large inputs (that is where the entity state is stored) so that state can be overwritten every time the entity does continue-as-new, which is all the time.
This does mean this PR currently has a problem with entities (and eternal orchestrators) since the blobs will accumulate forever. It does not seem very easy to fix since the TrackingStore implementation does not know whether the old history contained any large blobs that need to be removed. We can either issue a query, or track that information somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the issues were addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to allocate a new
Random()
each time or can we use a staticRandom
instance?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little tricky because the
Random
class is unfortunately not thread-safe, so I would have to do something complicated or ugly. I think it is better to keep it simple to read and maintain, and the performance is probably fine considering that this is only called when we need a large blob, which is by itself going to use significant CPU and memory (so this little allocation is a tiny fraction of the cost).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
trackingStoreContext
might be a better word since "data" is heavily overloaded in this context (it would almost read as if this "data" is part of what we're updating).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's update this overload to call your new overload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.