Add support for VersionStamps #72

KrzysFR · 2018-04-25T12:14:20Z

Add initial support for VersionStamps in the newest API

Add VersionStamp struct
Add support to Tuple Encoding
Implement fdb_transaction_get_versionstamp
Implement VersionStampedKey and VersionStampedValue atomic mutations
Add unit tests
XML Comments
Samples / Tutorials

Open questions: see discussion at https://forums.foundationdb.org/t/implementing-versionstamps-in-bindings/250

Should the same struct support both 80-bit and 96-bit VersionStamps?
Should the same struct support both "Incomplete" and "Complete" VersionStamps?
How to expose a nicer API when using retry loops (WriteAsync(...), ReadWriteAsync(...))?
What should be the textual representation of a VersionStamp (for ToString() and DebuggerDisplay)? Currently it is "@VERSION-ORDER" / "@VERSION-ORDER#USER" for complete stamps, and "@?" / "@?#USER" for incomplete stamps.
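To make the last question concrete, here is a minimal sketch of the textual format described above. The class and method names are hypothetical, and printing the numbers in decimal is an assumption (the description does not say whether VERSION and ORDER are shown in decimal or hex).

```csharp
using System.Globalization;

// Hypothetical helper for illustration only; not part of the binding.
public static class StampText
{
    // Produces "@VERSION-ORDER" or "@VERSION-ORDER#USER" for complete stamps,
    // and "@?" or "@?#USER" for incomplete stamps.
    // Assumption: numbers are rendered in decimal.
    public static string Format(ulong version, ushort order, ushort? userVersion, bool incomplete)
    {
        string head = incomplete
            ? "@?"
            : string.Format(CultureInfo.InvariantCulture, "@{0}-{1}", version, order);

        return userVersion.HasValue
            ? head + "#" + userVersion.Value.ToString(CultureInfo.InvariantCulture)
            : head;
    }
}
```

An 80-bit stamp passes `null` for the user version, so the `#USER` suffix only appears for 96-bit stamps.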

Example usage:

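readonly struct VersionStamp
{
    readonly ulong TransactionVersion;
    readonly ushort TransactionOrder;
    readonly ushort UserVersion; // or 0 if 80-bit

    bool HasUserVersion { get; } // false: 80-bit, true: 96-bit
    bool IsIncomplete { get; }

    // factories for incomplete stamps (used in the examples below)
    static VersionStamp Incomplete(); // 80-bit
    static VersionStamp Incomplete(int userVersion); // 96-bit

    Slice ToSlice();
    void WriteTo(Slice dest);
    void WriteTo(ref SliceWriter writer);

    // static, since they are called on the type itself
    static VersionStamp Parse(Slice packed);
    static bool TryParse(Slice packed, out VersionStamp stamp);
}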

VersionStamp stamp1 = VersionStamp.Incomplete(); // 80-bit, no user version
VersionStamp stamp2 = VersionStamp.Incomplete(42); // 96-bit, user version 42

Slice key1 = stamp1.ToSlice(); // => 10 bytes
Slice key2 = stamp2.ToSlice(); // => 12 bytes

Slice packedStamp = ....;
VersionStamp stamp3 = VersionStamp.Parse(packedStamp);
if (!VersionStamp.TryParse(packedStamp, out var stamp4))
{
    throw new Exception("....");
}

Some conventions:

Incomplete VersionStamps always have the highest 8 bits of their Transaction Version set to 1. Deserializers use this to distinguish them from complete stamps. The default incomplete stamp has all bits set to 1.
Transactions can create random tokens, but they still need to set the highest 8 bits to respect the point above.
The UserVersion field of 80-bit stamps will always be 0, but should not be accessed (right now I'm not throwing, maybe it should?).
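As an illustration of these conventions, the sketch below packs a stamp into 10 or 12 big-endian bytes and checks the "highest 8 bits set" marker. `StampLayout` and its members are hypothetical names, not the binding's actual VersionStamp implementation; only the byte layout (8-byte transaction version, 2-byte order, optional 2-byte user version) and the marker convention are taken from the text.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper for illustration only; not the binding's VersionStamp type.
public static class StampLayout
{
    // Packs a stamp as big-endian bytes:
    // 8 bytes (transaction version) + 2 bytes (order) [+ 2 bytes (user version)].
    public static byte[] Pack(ulong version, ushort order, ushort? userVersion = null)
    {
        var buffer = new byte[userVersion.HasValue ? 12 : 10];
        BinaryPrimitives.WriteUInt64BigEndian(buffer.AsSpan(0, 8), version);
        BinaryPrimitives.WriteUInt16BigEndian(buffer.AsSpan(8, 2), order);
        if (userVersion.HasValue)
        {
            BinaryPrimitives.WriteUInt16BigEndian(buffer.AsSpan(10, 2), userVersion.Value);
        }
        return buffer;
    }

    // Convention from this PR: incomplete stamps have the highest 8 bits
    // of the transaction version set to 1, i.e. the first byte is 0xFF.
    public static bool LooksIncomplete(ReadOnlySpan<byte> packed)
    {
        if (packed.Length != 10 && packed.Length != 12)
        {
            throw new ArgumentException("A packed stamp must be 10 or 12 bytes long.", nameof(packed));
        }
        return packed[0] == 0xFF;
    }
}
```

With this layout, the default incomplete stamp (all bits set) packs to ten `0xFF` bytes, and any stamp whose version is below 2^56 is read back as complete.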

KrzysFR · 2018-04-25T12:34:21Z

Several issues with the way Versionstamps are implemented in other bindings:

Java and Python only expose the larger 96-bit Versionstamp. If a user version is not specified, then a value of 0 is assumed. So their Tuple Layers only handle 12-byte stamps with prefix 0x33.
They represent the incomplete stamp as all-FFs, though internally they seem to have a boolean to flag incomplete vs complete instances.
Tuple support is achieved with a custom method packWithStamp() that has to track the offset where the stamp is located. There are overloads that deal with subspace prefixes (the offset needs to be adjusted).

This makes it a bit difficult to add support for VersionStamps to the existing ecosystem of Key and Value encoders (via TypeSystem and the various dynamic and typed subspaces). If we go the same route, we would need to change everything to output the extra "offset" field that tracks potential stamps. This would also not be compatible with other non-tuple based encoding schemes (binary, protobuf, hand-rolled).

An idea would be to represent incomplete versionstamps using a custom byte sequence, which is recognized and used to look up the offset at the last minute before performing the SetVersionStampedKey/Value mutation (via "IndexOf(...)").

Pros:

Compatible with all existing encoding schemes, and all APIs are untouched.
Only VersionStamp-based atomic mutations need to look for the token, but other methods (Set, Get, GetKey, ...) could add a failsafe and throw if they see it in a regular key (most probably a bug).
Would work for both 80-bit and 96-bit stamps.

Cons:

The byte sequence could be used elsewhere in the key by chance. So all-FF or all-00 is probably not a good idea.
Would diverge from Java/Python binding API.

One way to prevent the placeholder sequence from conflicting with some other part of the key would be to use a random token per transaction, and expose the incomplete stamp factory methods on the ITransaction itself. All methods would throw if they see the token twice, and the token would change on the next retry. The probability that the next random token also conflicts in the same key would be low.

await db.WriteAsync((tr) =>
{
    tr.SetVersionStampedKey(
        location.Keys.Encode("Foo", tr.VersionStamp()), 
        Slice.FromString("Hello World")
    );
    tr.SetVersionStampedKey(
        location.Keys.Encode("Bar", tr.VersionStamp(42)),
        Slice.FromString("Hello World")
    );
}, ct);

The call tr.VersionStamp() would return an 80-bit stamp with a random token that would change for each transaction (but be constant during the transaction lifetime). The call tr.VersionStamp(42) would in the same way return a 96-bit stamp with user version 42.

Pros:

The whole process of generating random tokens and checking them is hidden away from the normal user
Stamps produced by different concurrent transactions, or by multiple retries of the same transaction WILL be different.

Cons:

Need to have an instance of the transaction to create a stamp.
If incomplete stamps can be anything, Tuple Encoding cannot distinguish between a complete or incomplete stamp!

Possible solution for the last point: if we can ensure that a Transaction Version generated by the database CANNOT have the highest bit set to 1 (i.e. version numbers can never reach 2^63), then we could use this bit as a marker. All random incomplete stamps would have this bit set, and all complete stamps would have this bit unset.
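A minimal sketch of the random-token idea with the highest-bit marker, assuming 10-byte (80-bit) tokens. `PlaceholderTokens` and its methods are hypothetical names for illustration; the real logic would live on the transaction and may differ.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical sketch; none of these names come from the binding.
public static class PlaceholderTokens
{
    // Creates a random 80-bit placeholder whose highest bit is set,
    // so that it cannot be mistaken for a stamp produced by the database.
    public static byte[] NewToken()
    {
        var token = new byte[10];
        RandomNumberGenerator.Fill(token);
        token[0] |= 0x80; // marker: highest bit of the transaction version
        return token;
    }

    // True when the packed stamp carries the "incomplete" marker bit.
    public static bool IsPlaceholder(ReadOnlySpan<byte> stamp) => (stamp[0] & 0x80) != 0;

    // Returns the offset of the token inside the key.
    // Throws if the token is missing, or if it appears more than once.
    public static int FindOffset(ReadOnlySpan<byte> key, ReadOnlySpan<byte> token)
    {
        int first = key.IndexOf(token);
        if (first < 0)
        {
            throw new InvalidOperationException("The key does not contain the placeholder stamp.");
        }
        if (key.Slice(first + 1).IndexOf(token) >= 0)
        {
            throw new InvalidOperationException("The placeholder stamp appears more than once in the key.");
        }
        return first;
    }
}
```

The second search starts one byte after the first match, so overlapping occurrences are also rejected. Because the token changes on each retry, a key that trips the "more than once" check by accident is unlikely to do so again on the next attempt.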

KrzysFR · 2018-04-26T12:35:43Z

Another issue: the call to tr.GetVersionStampAsync() must be made before committing the transaction, but it will only complete after the transaction has committed. This creates a lot of problems with the current API.

(var result, var stampTask) = await db.ReadWriteAsync((tr) =>
{
    // read/set some keys
    tr.SetVersionStampedKey(location.Keys.Encode("Hello", VersionStamp.Incomplete()), Slice.FromString("World!"));

    // if we want to know the stamp, we have to start the task here
    var task = tr.GetVersionStampAsync();
    // but it won't complete until we commit ourselves!!
    //BUGBUG: calling 'task.Result' or 'await task' here would DEADLOCK!

    return (...., task); // <-- it is weird to have to ship a Task<VersionStamp> as part of the result!
}, ct);

var stamp = await stampTask; // need an additional await after the fact! :(

At the moment, the only solution is to return that Task<VersionStamp> alongside the result, and let the caller of the retry loop await it and deal with it.

The core issue is that the layers that must know the actual stamp value used, have to execute code AFTER the transaction has committed. When composed with retry loops that control the lifetime of the transaction, it means that code inside the lambda must be able to schedule more code to execute outside the scope of the lambda!

We cannot do much about this, because the low level binding API is designed like this.

We have three choices:

Don't do anything about it at the binding level, and let the user deal with it. This may scare away users or produce horrible code.
Have some specialized overloads of WriteAsync/ReadWriteAsync that would return the resolved VersionStamp along with the result?
Create a new pattern for retry loops that adds another "onSuccess" handler, which is called after the transaction commits and has access to transaction details such as the commit version and the stamps generated?

Choice 2 does not solve the issue in all cases.

Choice 3 splits the code in two, and may also lead to a bad-practice pattern: Business Logic code or Layers that need to do this will need to have access to the database instance and call ReadWriteAsync themselves, which makes them not composable with others.

For example, if inside a single HTTP request to an MVC Controller I need to do 2 or 3 operations (using different layers), and at least one of them wants to handle the transaction lifetime itself, then they cannot share the same transaction. This will probably lead to multiple transactions called sequentially, which will 1) introduce more latency, and 2) break ACID guarantees if the second or third transaction fails.
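To show what choice 3 could look like, here is a self-contained sketch that does not use the binding at all. `FakeTransaction` and `RetryLoop` are hypothetical stand-ins: the fake transaction only models the fact that the stamp task completes at commit time, the stamp is reduced to a ulong, and the retry logic itself is omitted.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for a transaction: the stamp task only completes on commit.
public sealed class FakeTransaction
{
    private readonly TaskCompletionSource<ulong> _stamp =
        new TaskCompletionSource<ulong>(TaskCreationOptions.RunContinuationsAsynchronously);

    // Must be called before commit; the returned task completes after commit.
    public Task<ulong> GetVersionStampAsync() => _stamp.Task;

    // Simulates a successful commit that resolves the stamp.
    public Task CommitAsync(ulong committedVersion)
    {
        _stamp.TrySetResult(committedVersion);
        return Task.CompletedTask;
    }
}

public static class RetryLoop
{
    // Runs the handler inside the transaction, commits, then calls onSuccess
    // with the handler's state and the resolved stamp. Retries are omitted.
    public static async Task<TResult> ReadWriteAsync<TState, TResult>(
        Func<FakeTransaction, TState> handler,
        Func<TState, ulong, TResult> onSuccess)
    {
        var tr = new FakeTransaction();
        Task<ulong> stampTask = tr.GetVersionStampAsync(); // requested before commit
        TState state = handler(tr);
        await tr.CommitAsync(committedVersion: 1234);       // fixed value for the demo
        ulong stamp = await stampTask;                      // safe: commit has completed
        return onSuccess(state, stamp);
    }
}
```

The caller never sees the pending task: the loop requests it before commit and awaits it afterwards, so code inside the handler cannot deadlock on it. The composability concern above still applies, because whoever supplies `onSuccess` must be the one calling the retry loop.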

KrzysFR · 2018-04-27T07:57:42Z

Message Queue Sample: [TODO: not complete]

public class FdbMessageBus
{
    public ITypedKeySubspace<string, VersionStamp> Subspace { get; }

    public FdbMessageBus(IKeySubspace location)
    {
        this.Subspace = location.UsingEncoder<string, VersionStamp>();
    }

    public void PostMessage(IFdbTransaction tr, string queueId, Slice message)
    {
        // key is (queueId, stamp); the stamp is filled in when the transaction commits
        tr.SetVersionStampedKey(
            this.Subspace.Keys[queueId, tr.CreateVersionStamp(0)],
            message
        );
    }

    public void PostMessages(IFdbTransaction tr, string queueId, IEnumerable<Slice> messages)
    {
        // the user version keeps messages posted in the same transaction ordered
        int idx = 0;
        foreach(var msg in messages)
        {
            tr.SetVersionStampedKey(
                this.Subspace.Keys[queueId, tr.CreateVersionStamp(idx++)],
                msg
            );
        }
    }

    //TODO: consuming messages

}

KrzysFR · 2018-04-27T12:49:21Z

The current state of the PR allows basic usage of VersionStamps:

The VersionStamp struct can model 80-bit and 96-bit stamps
It is supported by the Tuple encoder natively
SetVersionStampedKey(..) and SetVersionStampedValue(...) are implemented
Transactions can generate incomplete stamps that are randomized on each retry, via one of the CreateVersionStamp(...) overloads
The actual stamp used by the transaction can be obtained via GetVersionStampAsync(..) but this is currently a bit ugly (the task must be obtained before calling commit, but must not be awaited until the commit completes)

I'm going to address the last point in a future PR; at least we can start playing with versionstamps!

KrzysFR added 4 commits April 25, 2018 14:08

Add first prototype of VersionStamp struct

2b6737c

- Handle both 80-bit and 96-bit sizes - Use internal flag to distinguish between both sizes, and incomplete/complete

Add basic support for VersionStamps to Tuple Encoder

8192bb7

- Support both 80-bit and 96-bit variants - BUGBUG: cannot recognize complete/incomplete stamps yet when parsing.

Add initial support for fdb_transaction_get_versionstamp and add SetStampedKey() / SetStampedValue() mutations

2ccce08

Re-add unit test on the Tuple Encoding that were dropped before + add tests for VersionStamps

1cb6819

KrzysFR added the api (Issues or changes related to the client API) and layer:tuple (Tuple Encoding Layer) labels Apr 25, 2018

Merge branch 'master' into dev/versionstamps

cc3193f

KrzysFR added 2 commits April 26, 2018 20:31

Fix unpacking of VersionStamps in tuple (when using the boxed API)

5051b06

Add the logic on transactions to generate custom placeholder stamps (using a random token)

6ae022a

- Each transaction generate a random token (and on each retry). - tr.CreateVersionStamp() can be used to get a stamp specific to this transaction

KrzysFR mentioned this pull request Apr 27, 2018

OSS-ing custom layers on top of FDB .NET Client (a nuget package to refer to) #77

Closed

KrzysFR added 3 commits April 27, 2018 14:31

SetVersionStampedKey will automatically detect the location of the versionstamp present in the key, and add the 16-bit offset suffix.

cecf0e4

Add overload SetVersionStampedKey(...) where the caller can manually specify the offset of the versionstamp

da593a6

Fix parsing of incomplete VersionStamps, using the highest bit to distinguish between cases

7919499

KrzysFR merged commit 06f8c42 into master Apr 27, 2018

KrzysFR deleted the dev/versionstamps branch October 25, 2018 19:36
