Random store failures without any error reported #3

Closed
sadew7 opened this issue Apr 9, 2014 · 9 comments · Fixed by #7

@sadew7

sadew7 commented Apr 9, 2014

Sometimes vedis fails to store a key. vedis_kv_store returns VEDIS_OK, but the subsequent vedis_kv_fetch returns VEDIS_NOTFOUND for the same key.

It doesn't occur with the in-memory store. Also, I can't reproduce it without vedis_kv_delete calls.

Here is the test case. Usually it fails within 1 minute with "Unable to fetch: -6".

#include "vedis.h"
#include <stdlib.h>
#include <time.h>
#include <iostream>

// generate random bytes of variable (important!) length
char *randomBytes(int* len)
{
    *len = rand() % 4 + 1;
    char *res = new char[*len];

    for (int i = 0; i < *len; i++)
    {
        res[i] = rand() % 256;
    }

    return res;
}

int main()
{
    vedis_lib_init();
    vedis *store;
    int res = vedis_open(&store, "vedis.db");
    if (res != VEDIS_OK)
    {
        std::cout << "Error opening vedis:" << res << std::endl;
        return 0;
    }

    srand(time(NULL));

    char buffer[255];
    vedis_int64 inout;

    int block = 0;
    int count = 0;

    while(1)
    {
        int keyLen;
        int valLen;
        char *key = randomBytes(&keyLen);
        char *val = randomBytes(&valLen);

        // insert random key
        res = vedis_kv_store(store, key, keyLen, val, valLen);
        if (res != VEDIS_OK)
        {
            std::cout << "Unable to store: " << res << std::endl;
            return 0;
        }

        // verify it was stored correctly
        inout = 255;
        res = vedis_kv_fetch(store, key, keyLen, &buffer[0], &inout);
        if (res != VEDIS_OK)
        {
            std::cout << "Unable to fetch: " << res << std::endl;
            return 0;
        }

        // generate new random key and delete if it exists
        delete []key;
        key = randomBytes(&keyLen);

        inout = 255;
        res = vedis_kv_fetch(store, key, keyLen, &buffer[0], &inout);
        if (res == VEDIS_OK)
        {
            res = vedis_kv_delete(store, key, keyLen);
            if (res != VEDIS_OK)
            {
                std::cout << "Unable to delete: " << res << std::endl;
                return 0;
            }
        }

        delete []key;
        delete []val;

        count++;
        if (count >= 10000)
        {
            block++;
            count = 0;
            std::cout << block << " 10K iterations" << std::endl;
        }
    }
    return 0;
}
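
Because the test case seeds with time(NULL), every run exercises a different key sequence, so a failing run is hard to replay. Below is a minimal sketch of the same store/fetch/delete loop driven by a fixed seed, assuming a hypothetical `KvStore` interface. `KvStore`, `MapStore`, `LossyStore` and `runRoundTrip` are illustrative names, not part of the vedis API; a real harness would wrap vedis_kv_store, vedis_kv_fetch and vedis_kv_delete behind the interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <random>
#include <string>

// Hypothetical minimal KV interface (an assumption for this sketch).
struct KvStore {
    virtual ~KvStore() = default;
    virtual bool store(const std::string& key, const std::string& val) = 0;
    virtual bool fetch(const std::string& key, std::string* out) = 0;
    virtual bool erase(const std::string& key) = 0;
};

// Reference store backed by std::map; stands in for the engine under test.
struct MapStore : KvStore {
    std::map<std::string, std::string> data;
    bool store(const std::string& k, const std::string& v) override {
        data[k] = v;
        return true;
    }
    bool fetch(const std::string& k, std::string* out) override {
        auto it = data.find(k);
        if (it == data.end()) return false;
        *out = it->second;
        return true;
    }
    bool erase(const std::string& k) override { return data.erase(k) > 0; }
};

// Deliberately broken store: the dropAt-th store() reports success
// but loses the key, imitating the symptom described in this issue.
struct LossyStore : MapStore {
    long calls = 0;
    long dropAt;
    explicit LossyStore(long n) : dropAt(n) {}
    bool store(const std::string& k, const std::string& v) override {
        if (++calls == dropAt) {
            data.erase(k);
            return true;
        }
        return MapStore::store(k, v);
    }
};

// Random byte string of length 1..4, like randomBytes() in the test case.
std::string randomBytes(std::mt19937& rng) {
    std::uniform_int_distribution<int> len(1, 4), byte(0, 255);
    std::string s(static_cast<std::size_t>(len(rng)), '\0');
    for (char& c : s) c = static_cast<char>(byte(rng));
    return s;
}

// Runs the store/fetch/delete loop with a fixed seed. Returns the index of
// the first iteration whose stored value cannot be read back, or -1 if none.
long runRoundTrip(KvStore& kv, std::uint32_t seed, long iterations) {
    assert(iterations >= 0);
    std::mt19937 rng(seed);
    for (long i = 0; i < iterations; ++i) {
        std::string key = randomBytes(rng);
        std::string val = randomBytes(rng);
        std::string got;
        if (!kv.store(key, val) || !kv.fetch(key, &got) || got != val) return i;
        std::string victim = randomBytes(rng);
        if (kv.fetch(victim, &got)) kv.erase(victim);  // delete only if present
    }
    return -1;
}
```

Calling `runRoundTrip(store, seed, n)` with the same seed on the same platform replays the same key sequence, so a failing iteration index can be revisited under a debugger.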
@symisc
Owner

symisc commented Apr 9, 2014

Yes, very short keys (1 to 6 bytes) are known to be buggy when using the disk datastore.

Try the same test using larger keys (8 bytes or more).

Alternatively, switch to vedis_exec() instead of the raw datastore API; it handles short keys well.

@sadew7
Author

sadew7 commented Apr 9, 2014

Updating the code to use much longer keys (min 28 bytes) doesn't change anything.

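#include <assert.h>
#include <string.h>

// generate a key with a fixed 28-byte prefix followed by 1 to 4 random bytes
char *randomBytes(int* len)
{
        *len = rand() % 4 + 1 + 28;
        char *res = new char[*len];

        memset(res, 1, 28);
        for (int i = 28; i < *len; i++)
        {
                res[i] = rand() % 256;
        }

        assert(*len > 28);

        return res;
}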

@symisc
Owner

symisc commented Apr 9, 2014

How about vedis_exec() instead of the raw datastore API (vedis_kv_fetch(), vedis_kv_store(), etc.)?

@sadew7
Author

sadew7 commented Apr 9, 2014

Just tested with vedis_exec() and vedis_exec_result().
Results are the same.

@symisc
Owner

symisc commented Apr 9, 2014

OK, we've experienced the same problem with UnQLite, which is another product we develop. It was quickly fixed about 8 months ago, but since the vedis disk KV layer is based on UnQLite, I think this obscure bug is manifesting here. Anyway, I have to investigate this more deeply.

@sadew7
Author

sadew7 commented Apr 9, 2014

Just tried UnQLite.
Used unqlite_kv_store(), unqlite_kv_fetch() and unqlite_kv_delete().

Issue is reproducible there too.
Unfortunately.

@symisc
Owner

symisc commented Apr 9, 2014

Yes, apparently this shitty bug is manifesting here. You can help by requesting the UnQLite source code from devel@symisc.net and investigating the problem with me.

@Yuras
Contributor

Yuras commented Jun 23, 2014

@symisc I need your help here.

I tracked the issue down to hot-dirty page handling. Specifically, pager_write_hot_dirty_pages doesn't check nRef before calling pager_release_page. As a result, we unload the page while still retaining a pointer to it. Later the same memory gets reused, and the pointer now refers to the wrong page.

So, if I guard pager_release_page with an nRef < 1 check, everything seems to work. But that doesn't look correct to me, because the page actually has the PAGE_DONT_MAKE_HOT flag set, so it should not have been added to the hot-dirty list at all. (vedisKvIoPageDontMakeHot is called on a page that is already hot-dirty.)
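
To make the failure mode concrete, here is a minimal toy model of a pager that releases hot-dirty pages with and without a reference-count guard. It is a sketch under stated assumptions, not the actual vedis or UnQLite pager: `Page`, `ToyPager` and their methods are hypothetical names, and only the `nRef` field name is borrowed from the description above. It illustrates the guard that was tried, not necessarily the fix that was eventually merged.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical toy page; only the nRef name mirrors the discussion above.
struct Page {
    unsigned long pgno = 0;
    int nRef = 0;        // references currently held by callers
    bool dirty = false;
};

class ToyPager {
public:
    // Load (or find) a page and take a reference on it.
    Page* acquire(unsigned long pgno) {
        std::unique_ptr<Page>& slot = pages_[pgno];
        if (!slot) {
            slot.reset(new Page());
            slot->pgno = pgno;
        }
        ++slot->nRef;
        return slot.get();
    }

    // Drop a reference previously taken with acquire().
    void unref(Page* p) {
        assert(p->nRef > 0);
        --p->nRef;
    }

    // Queue a page for the next hot-dirty flush.
    void markHotDirty(Page* p) {
        p->dirty = true;
        hot_.push_back(p->pgno);
    }

    // Flush the hot-dirty list and release the flushed pages.
    // guard == false releases every page, even one a caller still references,
    // which leaves that caller with a dangling pointer.
    // guard == true keeps pages with nRef > 0 loaded.
    // Returns the number of pages released.
    std::size_t writeHotDirtyPages(bool guard) {
        std::size_t released = 0;
        for (unsigned long pgno : hot_) {
            auto it = pages_.find(pgno);
            if (it == pages_.end()) continue;              // already released
            it->second->dirty = false;                     // pretend it was written out
            if (guard && it->second->nRef > 0) continue;   // still in use: keep it
            pages_.erase(it);                              // frees the Page object
            ++released;
        }
        hot_.clear();
        return released;
    }

    bool isLoaded(unsigned long pgno) const { return pages_.count(pgno) != 0; }

private:
    std::unordered_map<unsigned long, std::unique_ptr<Page>> pages_;
    std::vector<unsigned long> hot_;
};
```

With the guard off, a caller that still holds the `Page*` returned by `acquire()` is left pointing at freed memory after the flush, which matches the "pointer now refers to the wrong page" symptom once that memory is reused.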

Could you please clarify what "hot-dirty" means? What pages can be hot-dirty and what pages can't?

@coleifer

> OK, we've experienced the same problem with unqlite which is another product we develop but it was quickly fixed about 8 months ago and since vedis disk kv layer is based on unqlite I think that this obscure bug is manifesting here. Anyway, I have to investigate this deeply.

The problem with UnQLite still seems to be present, at least in v1.1.6.
