TokuMX Descriptor for Put Multiple

John Esmet edited this page Jul 24, 2013 · 3 revisions

Overview

  • Put multiple refers to the ydb put/del/update multiple API, which generates and inserts keys for both primary and secondary indexes in one call. It currently requires that all information needed to generate keys be available during both recovery and normal operation. This is generally accomplished by storing 'schema' information in the descriptor.
  • Put multiple is required for hot indexing (correctness) and the loader (performance - we wish to build all indexes at once, not one by one).
  • Storing schema information in the descriptor proved challenging (and messy, according to Zardosht) in MySQL / the handlerton.
    • In MongoDB, we inherited key generation code that seems to work fine. One way to get this to work with put multiple is to design a descriptor that can be serialized to disk and interpreted on each generate-row-for-put call (as in MySQL).
    • We may or may not want to make API modifications to get this to work in MongoDB.
  • The high level function for generating keys is IndexSpec::KeyGenerator::getKeys().
  • In the rest of this wiki, we will explore the data and performance requirements for key generation and then propose a solution.

Data requirements

  • The object for which keys will be generated (this will come from src_val in the callback).
  • Index "type". Currently, hashed is the only special type ( _spec._indexType ). Generating hashed keys needs only the single field name that is hashed and whether the index is sparse.
  • modifiable vector of field name strings ( _spec._fieldNames )
    • also the number of fields ( _spec._nFields = keyPattern.nFields() )
    • Only used to copy-construct a new vector from _spec._fieldNames and pass it to getKeys()
    • Currently generated by iterating the key pattern and storing each field name.
    • Could serialize this as an array of c-style strings and modify the code to use that instead of a vector.
  • modifiable vector of BSONElements that are "fixed" (key elements whose values are already known):
    • Copy-constructed from _spec._fixed, which is initialized once to all empty BSONElements
    • So _spec._fixed is probably not necessary to serialize.
  • Whether the index is sparse or not ( _spec._sparse )
    • This is easy enough to store, since it never changes value.
  • The format of the null/undefined element/key.
    • It's not clear why these aren't statically defined by the BSONObj/Element classes.
    • We can probably statically define them and reference them globally.

Performance requirements

  • The current implementation
    • Copy constructs two vectors of size nFields using existing vectors.
    • Uses pre-computed values of null/undefined element/obj.
    • Uses pre-computed nFields and sparse bits.
  • If we can match or beat the above using a serialized descriptor, we should be in good shape.

Proposed descriptor format

  • Layout, in order: 4 byte ordering, 1 byte version, 1 byte index type, 1 byte for sparse, 1 byte for clustering, 4 byte nFields, array of 4 byte offsets (relative to the end of this array), c-string array of field names.
  • The ordering is needed for comparisons. It should be first for performance.
  • Generating the vector of field names is fast. We declare a vector of size nFields (single malloc) and then initialize each c-string pointer appropriately using the offset array and c-string array in the descriptor.
  • Generating the vector of empty BSONElements is slower than copy constructing an existing vector, but it's straightforward and should be fast enough.
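
The layout above can be sketched in code. This is a minimal illustration, not the TokuMX implementation: DescriptorFields, serializeDescriptor, and fieldNamesFromDescriptor are hypothetical names, the index type values are assumed, and integers are written in host byte order (a real on-disk format would need to fix the endianness).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical in-memory form of the fields in the proposed descriptor.
struct DescriptorFields {
    uint32_t ordering;    // opaque ordering bits; first so comparisons find them quickly
    uint8_t version;
    uint8_t indexType;    // assumed encoding, e.g. 0 = default, 1 = hashed
    uint8_t sparse;
    uint8_t clustering;
    std::vector<std::string> fieldNames;
};

static void appendU32(std::vector<char> &buf, uint32_t v) {
    char bytes[sizeof v];
    std::memcpy(bytes, &v, sizeof v);  // host byte order (assumption)
    buf.insert(buf.end(), bytes, bytes + sizeof v);
}

static uint32_t readU32(const char *p) {
    uint32_t v;
    std::memcpy(&v, p, sizeof v);  // memcpy avoids unaligned reads
    return v;
}

// Writes: ordering, version, type, sparse, clustering, nFields,
// offset array, then the NUL-terminated field names.
std::vector<char> serializeDescriptor(const DescriptorFields &d) {
    std::vector<char> buf;
    appendU32(buf, d.ordering);
    buf.push_back(static_cast<char>(d.version));
    buf.push_back(static_cast<char>(d.indexType));
    buf.push_back(static_cast<char>(d.sparse));
    buf.push_back(static_cast<char>(d.clustering));
    appendU32(buf, static_cast<uint32_t>(d.fieldNames.size()));
    uint32_t offset = 0;  // relative to the end of the offset array
    for (const std::string &name : d.fieldNames) {
        appendU32(buf, offset);
        offset += static_cast<uint32_t>(name.size() + 1);
    }
    for (const std::string &name : d.fieldNames) {
        buf.insert(buf.end(), name.c_str(), name.c_str() + name.size() + 1);
    }
    return buf;
}

// Builds the vector of field names with one allocation; each entry
// points into the descriptor buffer, so no string data is copied.
std::vector<const char *> fieldNamesFromDescriptor(const char *data) {
    const std::size_t kHeaderSize = 4 + 1 + 1 + 1 + 1;
    const uint32_t nFields = readU32(data + kHeaderSize);
    const char *offsets = data + kHeaderSize + 4;
    const char *strings = offsets + 4 * static_cast<std::size_t>(nFields);
    std::vector<const char *> names(nFields);
    for (uint32_t i = 0; i < nFields; i++) {
        names[i] = strings + readU32(offsets + 4 * static_cast<std::size_t>(i));
    }
    return names;
}
```

The returned pointers are only valid while the descriptor buffer is alive, which should hold for the duration of a single generate-row-for-put callback.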

Action items

  • Refactor descriptor code in storage/env.cpp to use a custom descriptor instead of assuming it is just an ordering struct.
  • Refactor IndexSpec::KeyGenerator::getKeys() to call a static version which takes all of the information it needs as parameters instead of using class members. The static version will be used by the generate rows for put function.
  • Add generate row for put/del etc to the storage layer.
  • Introduce a Descriptor class that holds the above information and knows how to (1) compare keys and (2) generate keys. The storage layer will delegate to this class for key comparisons and key generation.
  • Modify the NamespaceDetails::insert/delete/update indexes code to call put multiple with an array of DB *s (one from each IndexDetails) instead of calling insert/delete/updatePair.
  • Later: Hot indexing, the bulk loader.
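
As a rough picture of the Descriptor class in the action items, the sketch below shows one object answering both key comparison and key generation. Everything here is a stand-in chosen for illustration: ToyDoc and ToyKey replace BSON objects, an empty string replaces the null element, bit i of the ordering is assumed to mean "field i is descending", and the sparse rule (skip a document when none of the indexed fields are present) is an assumption about the intended behavior. Unlike the real getKeys(), it never produces more than one key per document.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Stand-ins for BSON types, for illustration only.
typedef std::map<std::string, std::string> ToyDoc;  // field name -> value
typedef std::vector<std::string> ToyKey;            // one value per indexed field

class ToyDescriptor {
public:
    ToyDescriptor(uint32_t orderingBits, bool sparse,
                  const std::vector<std::string> &fieldNames)
        : _orderingBits(orderingBits), _sparse(sparse), _fieldNames(fieldNames) {}

    // Compares field by field; a set ordering bit flips that field's result.
    int compareKeys(const ToyKey &a, const ToyKey &b) const {
        for (std::size_t i = 0; i < _fieldNames.size(); i++) {
            int c = a[i].compare(b[i]);
            if (c != 0) {
                c = c < 0 ? -1 : 1;
                const bool descending = ((_orderingBits >> i) & 1u) != 0;
                return descending ? -c : c;
            }
        }
        return 0;
    }

    // Builds the key for a document; missing fields become the null stand-in.
    std::vector<ToyKey> generateKeys(const ToyDoc &doc) const {
        ToyKey key;
        bool anyPresent = false;
        for (const std::string &name : _fieldNames) {
            ToyDoc::const_iterator it = doc.find(name);
            if (it == doc.end()) {
                key.push_back("");  // stand-in for the null element
            } else {
                key.push_back(it->second);
                anyPresent = true;
            }
        }
        std::vector<ToyKey> keys;
        if (anyPresent || !_sparse) {
            keys.push_back(key);  // sparse indexes skip fully-missing documents
        }
        return keys;
    }

private:
    uint32_t _orderingBits;
    bool _sparse;
    std::vector<std::string> _fieldNames;
};
```

With this shape, the storage layer's comparison callback and its generate-row-for-put callback would both only need to reconstruct the descriptor object from the bytes stored with the dictionary and call one method on it.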