What types should we use to represent Keys and Values? #71

Open
KrzysFR opened this issue Apr 23, 2018 · 0 comments
Labels
api Issues or changes related to the client API question

KrzysFR commented Apr 23, 2018

Most of the current API uses the Slice struct to represent "a sequence of bytes". It is there to provide a better user experience than a plain old byte[], which tells the API consumer, in not so subtle words, to "deal with it yourself". That's not very nice...

The issue is that Slice is currently part of the FoundationDB.Client assembly, but it is useful beyond FoundationDB: it can be used, in conjunction with the Tuple Encoding and TypeSystem, to make binary keys and values more user-friendly. It can also be used with other database systems, or with anything else that deals with sequences of bytes.

An application that wants to use this type would need to depend on FoundationDB.Client, which is OK if it needs it anyway, but weird otherwise.

There are also the new Span<T>/Memory<T> types that will land in the BCL "soon". They solve a lot of the allocation/copy problems that have plagued the BCL since the start, and they also bridge the gap between native and managed memory. They should be widely accepted by most of the BCL APIs in the short term.

It would be great if the .NET binding played well with them, because there will probably be a large ecosystem of new APIs and helpers that make working with bytes easier (formatting, parsing, ...).

It would be ideal if the .NET binding accepted Span<byte>/Memory<byte> as arguments.

BUT, there are potential issues with this:

  • Not all application code wants to (or can) deal with Spans just yet.
  • Some applications already use a specific type (in our case Slice), and it would be weird to have to cast to/from spans all the time.
  • Span is stack-only, so it cannot be passed as an argument to async methods. You have to use Memory for that.
  • But for non-async boilerplate code, Span would be way faster than Memory, so we would end up with a mix of Span and Memory depending on the API...
  • "Keys" in FoundationDB are not just sequences of bytes: they have some characteristics of their own, like text formatting that knows about Tuple Encoding, JSON, GUIDs, etc...
  • There are also "Values", which are also sequences of bytes but are not logically the same as keys. It would be nice to have a different type to distinguish keys from values (and help prevent bugs: when everything is a Slice, it is easy to mix them up).
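
To make the Span/Memory split concrete, here is a minimal sketch of what a mixed API could look like. The class and method names are made up for illustration and are not part of the binding; the only point is that the synchronous path takes a ReadOnlySpan<byte> while the async path has to take a ReadOnlyMemory<byte> and only touches `.Span` inside synchronous code.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of FoundationDB.Client.
public static class SpanVsMemoryDemo
{
    // Synchronous code can take a span directly: no allocation, no copy.
    public static int SumBytes(ReadOnlySpan<byte> key)
    {
        int sum = 0;
        foreach (byte b in key) sum += b;
        return sum;
    }

    // An async method cannot declare a ReadOnlySpan<byte> parameter,
    // so the async overload takes ReadOnlyMemory<byte> instead.
    public static async Task<int> SumBytesAsync(ReadOnlyMemory<byte> key)
    {
        await Task.Yield(); // stand-in for real asynchronous work
        // The span is only used as a temporary, after the await has completed.
        return SumBytes(key.Span);
    }

    public static void Main()
    {
        byte[] raw = { 1, 2, 3 };
        // byte[] converts implicitly to both ReadOnlySpan<byte> and ReadOnlyMemory<byte>.
        Console.WriteLine(SumBytes(raw));
        Console.WriteLine(SumBytesAsync(raw).GetAwaiter().GetResult());
    }
}
```

Both calls compute the same sum; the difference is only in which type each signature is allowed to expose.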

Ideally, we could introduce an interface to represent keys and values, something like IKey and IValue but:

  • the names are kind of generic. Maybe IFdbKey and IFdbValue, but then again the concept is more general than just FoundationDB.
  • interfaces will mean a lot of boxing where we could have simple readonly structs.
  • you cannot implement == / != or any other operators on interfaces
  • there are a lot of issues when mixing interfaces with generic overloads (I remember my lesson!)
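
As a rough illustration of the boxing concern, here is a small sketch. IKey and ByteKey are hypothetical names invented for this example. Passing a struct through an interface-typed parameter boxes it, while a generic method constrained to the interface avoids the box at the cost of a more complicated signature.

```csharp
using System;

// Hypothetical types, for illustration only.
public interface IKey
{
    int Length { get; }
}

public readonly struct ByteKey : IKey
{
    private readonly byte[] _bytes;

    public ByteKey(byte[] bytes) => _bytes = bytes;

    // default(ByteKey) has a null array, so treat it as empty.
    public int Length => _bytes?.Length ?? 0;
}

public static class BoxingDemo
{
    // Interface-typed parameter: a struct argument is boxed (heap-allocated) on each call.
    public static int LengthViaInterface(IKey key) => key.Length;

    // Generic parameter constrained to the interface: the struct is passed by value, no boxing.
    public static int LengthViaGeneric<TKey>(TKey key) where TKey : struct, IKey => key.Length;

    public static void Main()
    {
        var key = new ByteKey(new byte[] { 1, 2, 3 });
        Console.WriteLine(LengthViaInterface(key)); // 3
        Console.WriteLine(LengthViaGeneric(key));   // 3
    }
}
```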

Maybe structs like FdbKey and FdbValue would be better:

  • the name is still Fdb-specific, but at least we can add some behavior to justify the prefix
  • they don't allocate, and the JIT can usually make the overhead evaporate at runtime (especially since the perf upgrades of .NET Core 2.1)
  • you can implement all the typical ==, !=, <, >, ... operators on structs
  • you can add instance and extension methods to add behavior
  • we can add all the implicit/explicit casts to interop with byte[] / Span / string / Guid / ITuple, etc.
  • anyone implementing a MySuperAwesomeKeyType can also add implicit casts to/from these structs, and it will be transparent to the users (Span/Memory already work like this)
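
A minimal sketch of what such a struct might look like, under some loud assumptions: this FdbKey does not exist in the binding, it arbitrarily wraps a ReadOnlyMemory<byte>, and the hash function and the explicit string cast (UTF-8) are placeholder choices made only to keep the example complete.

```csharp
using System;
using System.Text;

// Hypothetical sketch, not the binding's actual API.
public readonly struct FdbKey : IEquatable<FdbKey>, IComparable<FdbKey>
{
    // Assumption: the key wraps a ReadOnlyMemory<byte>.
    private readonly ReadOnlyMemory<byte> _bytes;

    public FdbKey(ReadOnlyMemory<byte> bytes) => _bytes = bytes;

    public ReadOnlySpan<byte> Span => _bytes.Span;

    public bool Equals(FdbKey other) => Span.SequenceEqual(other.Span);

    public override bool Equals(object obj) => obj is FdbKey other && Equals(other);

    public override int GetHashCode()
    {
        // Placeholder FNV-1a style hash over the bytes.
        unchecked
        {
            int hash = (int)2166136261;
            foreach (byte b in Span) hash = (hash ^ b) * 16777619;
            return hash;
        }
    }

    // Lexicographic comparison of the raw bytes.
    public int CompareTo(FdbKey other) => Span.SequenceCompareTo(other.Span);

    public static bool operator ==(FdbKey a, FdbKey b) => a.Equals(b);
    public static bool operator !=(FdbKey a, FdbKey b) => !a.Equals(b);
    public static bool operator <(FdbKey a, FdbKey b) => a.CompareTo(b) < 0;
    public static bool operator >(FdbKey a, FdbKey b) => a.CompareTo(b) > 0;

    // Casts to interop with existing byte-oriented types.
    public static implicit operator FdbKey(byte[] bytes) => new FdbKey(bytes);
    public static implicit operator FdbKey(ReadOnlyMemory<byte> bytes) => new FdbKey(bytes);
    public static implicit operator ReadOnlyMemory<byte>(FdbKey key) => key._bytes;
    public static explicit operator FdbKey(string text) => new FdbKey(Encoding.UTF8.GetBytes(text));
}

public static class FdbKeyDemo
{
    public static void Main()
    {
        FdbKey a = new byte[] { 0x01, 0x02 };
        FdbKey b = (FdbKey)"\u0001\u0003"; // UTF-8 bytes: 01 03
        Console.WriteLine(a < b);  // True
        Console.WriteLine(a == b); // False
    }
}
```

A FdbValue struct could follow the same pattern without the ordering operators, which would make it a compile-time error to pass a value where a key is expected.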

But there are some unresolved issues:

  • You cannot add a cast operator to a third-party type that you don't control (so if you are using someone else's MySuperAwesomeKeyType, you cannot implicitly cast from it to FdbKey or FdbValue).
  • You cannot derive from a struct or override its members, so there is no way to have specialized key or value types.
  • Structs can become very fat if you are not careful, and passing them around by value can become a perf overhead (vs the 8 bytes of a ref type pointer).

Also: what would the FdbKey type wrap?

  • A byte[]? Too many memory allocations and copies.
  • A ReadOnlySpan<byte>? It cannot be a field/prop of a regular struct, and cannot be passed to async methods.
  • A ReadOnlyMemory<byte>? It can be passed around, but is slower than spans.
  • A plain old object? But how do we get access to the bytes?
  • Some magical ICanHazBytesBurger interface with multiple methods to access or serialize the bytes? We are back to interfaces, which are allocating/boxing traps.
  • Use CRTPs and do some funky things with generics (see this talk if you want to be both utterly amazed and lose all hope in humanity at the same time). I'm not the one who wants to explain to newcomers what the deal is with this weird API signature!

How deep does the rabbit hole go? 🐰 ➰ 💫
