Support for `mdb_set_compare` and `mdb_set_dupsort`. #12

bmatsuo · 2015-10-29T00:33:26Z

I think there is potential value here. But there needs to be some research and, in general, the C API needs to change for applications to take full advantage of these functions.

Background

The LMDB C API lets an application specify custom functions used to sort keys and values in a database. And this feature is not supported in lmdb-go despite potential benefits.

It is common for applications to store data in LMDB with fixed-width integer keys. Currently these keys must be encoded as big-endian byte slices and are compared as c-strings when scanned.

It may also be desirable to store data using exotic keys or exploit data structure when sorting duplicates. These cases are extreme but should be possible (and useful) with LMDB. A combination of the LMDB API and cgo prevent such things from being practical today. But were circumstances to change an lmdb-go API that could eventually support this functionality would be ideal.

Proposal

Define a new Cmp type that is an opaque handle for a comparison function. And, when arbitrary comparison functions can be provided, two functions for initializing and deinitializing a Cmp respectively will be be defined.

type Cmp uintptr

// These functions would be defined once the LMDB C API makes it possible use
// arbitrary Go for sorting.
func CompareFunc(func (a, b []byte) int) Cmp
func DestroyCompareFunc(Cmp)

With the Cmp type two new transaction methods can be added for supplying alternate comparison functions.

func (txn *Txn) SetCmp(dbi DBI, cmp Cmp) int)

func (txn *Txn) SetCmpDup(dbi DBI, cmp Cmp) int)

To provide some utility before the LMDB C API changes several C-based sorting functions can be defined.

const (
    CmpBE64 Cmp = 0
    CmpBE32 Cmp = 1
    CmpBE16 Cmp = 2
)

These Go constants will have corresponding C functions defined in the lmdbgo C library.

int lmdbgo_cmp_func_sortbe64(const MDB_val *a, const MDB_val *b);
int lmdbgo_cmp_func_sortbe32(const MDB_val *a, const MDB_val *b);
int lmdbgo_cmp_func_sortbe16(const MDB_val *a, const MDB_val *b);

Examples

Examples of the various facets covered in this proposal are given below. In each case the comparison function is set immediately when the database handle is open, which itself immediately follows the call to Env.Open. This is the only calling pattern that will be supported.

Arbitrary Cmp Functions

In this example an arbitrary comparison function is used to create a Cmp value that is then used to set the key comparison function for the database.

// ...
err = env.Open("/path/to/db", 0, 0644)
if err != nil {
    panic(err)
}

var dbi lmdb.DBI
dbcmp := lmdb.CompareFunc(doCompareCapnp)
defer lmdb.DestroyCompareFunc(dbcmp) 
err = env.Update(func(txn *lmd.Txn) (err error) {
    dbi, err = txn.OpenDBI("mydb", lmdb.Create)
    if err != nil {
        return err
    }
    err = txn.SetCmp(dbcmp)
    return err
})

// ...

Many to Many uint mapping

This example demonstrates a common real world case of database foreign keys. In this case 64-bit database IDs are mapped to one-another using a builtin comparison function to optimize seek performance.

// ...
err = env.Open("/path/to/db", 0, 0644)
if err != nil {
    panic(err)
}

var dbi lmdb.DBI
err = env.Update(func(txn *lmd.Txn) (err error) {
    dbi, err = txn.OpenDBI("mydb", lmdb.Create|lmdb.DupSort|lmdb.DupFixed)
    if err != nil {
        return err
    }
    err = txn.SetCmp(lmdb.CmpBE64)
    if err != nil {
        return err
    }
    err = txn.SetCmpDup(lmdb.CmpBE64)
    return err
})

// ...

Consequences

Existing programs will continue to function the same as they due. Their binary size may grow slightly.

Programs using builtin big-endian comparison functions will function the same semantically, but should have increased performance. Having a known length and encoding allows fast decoding to an integer format that can be quickly compared. Using a big-endian integer encoding is fortuitous because the Env.Copy* methods work as expected.

Risks

Big-endian integer encodings may see no performance increase when using builtin comparison functions.

The C API and cgo may never support hooking up arbitrary Go functions as comparison functions. Or the cgo bridge required may be too slow for practical use.

Alternatives

The MDB_INTEGERKEY and MDB_INTEGERDUP DBI flags perform very similar tasks for integer valued keys/data. The integer flags may even outperform custom functions. However their introduction into the lmdb-go API has not been finalized. The flags also give up specificity about key encoding (big-/little-endian) for performance which will have benefits and drawbacks.

Non-opaque values such as capnproto buffers cannot typically be used as keys in a database. Instead the values must either be stored in a separate database with opaque keys or appended to an opaque fixed-width prefix before stored as a key.

The text was updated successfully, but these errors were encountered:

bmatsuo · 2015-10-30T02:31:15Z

What does a comparison function do when assumptions about key structure are not true? A panic() cannot be allowed to escape a C callback (undefined behavior?).

Some logging needs to happen though logs cannot be automatically associated with an env or dbi.
The data must be sorted somehow. In which direction does bad data move? Up or down? When both values are bad?

bmatsuo · 2015-10-31T23:29:21Z

Bringing up context values in the LMDB mailing list wasn't very productive. It seems unlikely for a context argument to be added.

Other experimentation in #11 has led me to believe that simple integer based keys like "big-endian 64-bit unsigned integers" are unlikely to be fruitful in terms of performance gains.

One option that does seem plausible (though a burden on the user) is to expose a fairly raw API that accepts a C function pointer as an argument. Such a function is actually impossible to call using a Go function because the const keyword is not supported and matching the function signature is not possible without a static C proxy function.

type CmpFunc C.MDB_cmp_func
func (txn *Txn) SetCmp(dbi DBI, cfn *CmpFunc) error
func (txn *Txn) SetCmpDup(dbi DBI, cfn *CmpFunc) error

I have a branch investigating this, bmatsuo/compare-funcs. Using the above API it also explores the penalty involved in delegating a Go func to perform comparison. The experiment exp/cmd/lmdb_cmp explores this and has seen a 3x-6x performance penalty for Go-based comparison. It also indicated that dynamic dispatch is not significant Go comparison functions, the penalty seems to be primarily in the C->Go switch.

bmatsuo · 2015-10-31T23:33:20Z

With the experience I have now I am going to close this issue. I will continue to investigate things in the avenue of my branch, bmatsuo/compare-funcs. Exposing the low-level API may be effective. But I will probably wait at least until when the cgo proposal from #10 takes effect (Go 1.6) and the cgo dynamic check has been implemented.

ghost · 2016-03-23T14:55:24Z

@bmatsuo Execute me,
Would you please tell me how is it going on?
I have to compare keys before insert them into dbs.
Is there a better way now?

bmatsuo · 2016-03-23T18:24:43Z

I haven't revisited this since go1.6 has been released. I still plan on doing it. But I just haven't gotten around to it.

But from the description of your problem I'm not sure if these APIs are what you need. Can you elaborate on your use case?

The use case for the mdb_set_compare and mdb_set_dupsort functions are when you have special comparison requirements. For example, keys in the database are created by concatenating two strings of variable length. A special comparison function is required to sort keys of that nature "correctly".

If you are trying to compare a value to the existing value under the same key before updating (e.g. so new data can be merged into the existing record) then these functions are not what you need. In that case you just issue a Txn.Get with the key, check the value, and then issue the Txn.Put if necessary to store the updated value. In such cases it is probably best to enable Txn.RawRead = true to avoid unnecessary copies during Txn.Get. But this is getting pretty unrelated to the actual issue here.

Let me know if I misunderstood something.

ghost · 2016-03-24T04:54:49Z

Thank you @bmatsuo, I want to impl a simple queue, which is similar with kafka, but lightweight.

The meta db used to record procuder and consumers

Topic1 (db)

Key	Value
Conumser_%s	nextOffset
Consumer_%s	nextOffset
Consumer_%s	nextOffset
Producer	lastOffset
dataDBFileIndex	maxoffset
dataDBFileIndex + 1	maxoffset
dataDBFileIndex + 2	maxoffset
dataDBFileIndex + 3	maxoffset
... ...	... ...
dataDBFileIndex + n	maxoffset

The data db used to record real data.

data file1 (db)

Key	Value
offset	data(bin)
offset + 1	data(bin)
offset + 2	data(bin)
offset + 3	data(bin)
offset + 4	data(bin)
... ...	... ...
offset + n	data(bin)

The Consumer_%s is string，dataDBFileIndex is uint32, and offset is uint64

So my compare func looks like:

int descCmp(const MDB_val *a, const MDB_val *b) {
    /* DESC order in size */
    if (a->mv_size > b->mv_size) return -1;
    if (a->mv_size < b->mv_size) return 1;

    switch (a->mv_size)
    {
        case sizeof(uint32_t) :
            return mdbIntCmp<uint32_t>(a, b);
        case sizeof(uint64_t) :
            return mdbIntCmp<uint64_t>(a, b);
        default:
            return memcmp(a->mv_data, b->mv_data, a->mv_size);
    }
}

template<typename INT_TYPE> int mdbIntCmp(const MDB_val *a, const MDB_val *b) {
    INT_TYPE ia = *(INT_TYPE*)a->mv_data;
    INT_TYPE ib = *(INT_TYPE*)b->mv_data;
    return ia < ib ? -1 : ia > ib ? 1 : 0;
}

I'm not familiar with lmdb, without cmp code，can I impl that?
Thank you for your help!

bmatsuo · 2016-03-24T20:40:54Z

Ok. I have a better idea now. And, Yes. It seems that you would need the functions discussed in this issue if your data needs to be ordered this way (by key size, then by key content).

For your example above specifically, it is also problematic because it is clearly using the MDB_INTEGERKEY/MDB_INTEGERDUP flag(s). And those flags are not currently supported in lmdb-go (see #11). Though unless you have another reason to use those flags equivalent ordering is achieved by encoding unsigned integers as big-endian values using the "encoding/binary" package.

I am going to reopen this issue to track support for mdb_set_compare and mdb_set_dupsort. If you look at my branch, bmatsuo/compare-funcs, and the example program I have that sets the comparison function. It should be possible for you to get started with custom comparison functions.

https://github.com/bmatsuo/lmdb-go/blob/bmatsuo/compare-funcs/exp/cmd/lmdb_cmp/main.go

Because I wrote the example program to gauge performance it allows the user to declare that comparison functions it uses be defined in C, in Go, or additionally in Go with some form of dynamic dispatch.

ghost · 2016-03-25T00:14:23Z

@bmatsuo Get it! 👍

bmatsuo · 2016-03-25T00:24:56Z

@zwb-ict looking again your tables above (Topic1 & data file 1). It seems like you could potentially get by splitting the Topic1 db into two dbs TopicMeta1, which contains string keys "Consumer_*" and "Producer", and TopicIndex1, which contains uint32 keys (dataDBFileIndex).

If you changed your data model in that way (using 3 dbs per topic instead of two) it seems to me like you wouldn't need custom comparison functions at all. Would that work for you?

I'm still interested in getting real support for these functions. But you are going to take a performance hit by using them. If it is possible to easily restructure your schema to avoid them I would highly recommend you do.

edit: Below is an illustration

TopicMeta1 (db)

Key (string)	Value
Conumser_%s	nextOffset
Consumer_%s	nextOffset
Consumer_%s	nextOffset
Producer	lastOffset

TopicIndex1 (db)

Key (uint32)	Value
dataDBFileIndex	maxoffset
dataDBFileIndex + 1	maxoffset
dataDBFileIndex + 2	maxoffset
dataDBFileIndex + 3	maxoffset
... ...	... ...
dataDBFileIndex + n	maxoffset

data file1 (db)

Key (uint64)	Value
offset	data(bin)
offset + 1	data(bin)
offset + 2	data(bin)
offset + 3	data(bin)
offset + 4	data(bin)
... ...	... ...
offset + n	data(bin)

ghost · 2016-03-25T00:29:58Z

@bmatsuo change schema seems to be a better idea, I'll change to that. Thank you!

Define IntegerKey and IntegerDup flags

bmatsuo changed the title ~~Support for mdb_set_compare and mdb_set_duport.~~ Support for mdb_set_compare and mdb_set_dupsort. Oct 29, 2015

bmatsuo mentioned this issue Oct 29, 2015

lmdb: Unsupported features #1

Open

6 tasks

bmatsuo closed this as completed Oct 31, 2015

bmatsuo mentioned this issue Mar 30, 2016

lmdb: limited support for mdb_set_compare and mdb_set_dupsort #61

Open

wojas added a commit to wojas/lmdb-go that referenced this issue Sep 14, 2023

Merge pull request bmatsuo#12 from wojas/define-integerkey-flags

41b4826

Define IntegerKey and IntegerDup flags

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for `mdb_set_compare` and `mdb_set_dupsort`. #12

Support for `mdb_set_compare` and `mdb_set_dupsort`. #12

bmatsuo commented Oct 29, 2015

bmatsuo commented Oct 30, 2015

bmatsuo commented Oct 31, 2015

bmatsuo commented Oct 31, 2015

ghost commented Mar 23, 2016

bmatsuo commented Mar 23, 2016

ghost commented Mar 24, 2016

bmatsuo commented Mar 24, 2016

ghost commented Mar 25, 2016

bmatsuo commented Mar 25, 2016

ghost commented Mar 25, 2016

Support for mdb_set_compare and mdb_set_dupsort. #12

Support for mdb_set_compare and mdb_set_dupsort. #12

Comments

bmatsuo commented Oct 29, 2015

Background

Proposal

Examples

Arbitrary Cmp Functions

Many to Many uint mapping

Consequences

Risks

Alternatives

bmatsuo commented Oct 30, 2015

bmatsuo commented Oct 31, 2015

bmatsuo commented Oct 31, 2015

ghost commented Mar 23, 2016

bmatsuo commented Mar 23, 2016

ghost commented Mar 24, 2016

bmatsuo commented Mar 24, 2016

ghost commented Mar 25, 2016

bmatsuo commented Mar 25, 2016

ghost commented Mar 25, 2016

Support for `mdb_set_compare` and `mdb_set_dupsort`. #12

Support for `mdb_set_compare` and `mdb_set_dupsort`. #12