Multi writer threads may result in low performance #7

Open
ShuaiJunlan opened this issue Sep 27, 2018 · 2 comments
Labels
question Need more discussion

Comments

ShuaiJunlan commented Sep 27, 2018

In the HaloDBInternal class, the boolean put(byte[] key, byte[] value) method holds a lock for the entire call, which may result in low performance when multiple threads are writing.

    boolean put(byte[] key, byte[] value) throws IOException, HaloDBException {
        if (key.length > Byte.MAX_VALUE) {
            throw new HaloDBException("key length cannot exceed " + Byte.MAX_VALUE);
        }

        //TODO: more fine-grained locking is possible. 
        writeLock.lock();
        try {
            Record record = new Record(key, value);
            record.setSequenceNumber(getNextSequenceNumber());
            record.setVersion(Versions.CURRENT_DATA_FILE_VERSION);
            RecordMetaDataForCache entry = writeRecordToFile(record);
            markPreviousVersionAsStale(key);

            //TODO: implement getAndSet and use the return value for
            //TODO: markPreviousVersionAsStale method.
            return inMemoryIndex.put(key, entry);
        } finally {
            writeLock.unlock();
        }
    }

ahasani commented Oct 1, 2018

CMIIW, but HaloDB itself should never have, and has no need for, a multi-threaded writer. IMHO, if I were to implement a multi-threaded-writer DB based on HaloDB, I would embed multiple HaloDB instances in a server that manages those instances with some sort of writer queue for a connection pool that load-balances all writes. Reads are no problem, though. HaloDB is perfect as a sort of "volume" storage; a multi-volume store server is what you're after, mate.

Cheers
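A minimal sketch of the "several instances behind a writer queue" idea, for illustration only. `KeyValueStore` is a hypothetical stand-in for one embedded store instance and is not HaloDB's API; `ShardedWriter` and `shardFor` are likewise made-up names. Each instance gets its own single-thread executor as its write queue, so callers can write concurrently while every instance only ever sees one writer thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for one embedded store instance; NOT HaloDB's API. */
interface KeyValueStore {
    void put(byte[] key, byte[] value) throws Exception;
}

/** Routes each key to a fixed shard; each shard has exactly one writer thread. */
final class ShardedWriter implements AutoCloseable {
    private final List<KeyValueStore> shards;
    private final List<ExecutorService> writers = new ArrayList<>();

    ShardedWriter(List<KeyValueStore> shards) {
        this.shards = new ArrayList<>(shards);
        for (int i = 0; i < this.shards.size(); i++) {
            // A single-thread executor acts as the per-shard write queue.
            writers.add(Executors.newSingleThreadExecutor());
        }
    }

    /** The same key bytes always map to the same shard index. */
    int shardFor(byte[] key) {
        return Math.floorMod(Arrays.hashCode(key), shards.size());
    }

    /** Safe to call from many threads; the write runs on the shard's own thread. */
    CompletableFuture<Void> put(byte[] key, byte[] value) {
        int i = shardFor(key);
        KeyValueStore store = shards.get(i);
        return CompletableFuture.runAsync(() -> {
            try {
                store.put(key, value);
            } catch (Exception e) {
                // Surface store failures through the returned future.
                throw new CompletionException(e);
            }
        }, writers.get(i));
    }

    /** Stops accepting writes; already queued writes still run. */
    @Override
    public void close() {
        for (ExecutorService w : writers) {
            w.shutdown();
        }
    }
}
```

Routing by key hash (rather than round-robin) is a design choice in this sketch: all writes for a given key land in the same instance, so reads know where to look.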

Collaborator

amannaly commented Oct 2, 2018

HaloDB is currently being used in a distributed database that only does single threaded writes to HaloDB. Each database box also has multiple instances of HaloDB running.

Clients do concurrent writes to the database, but those writes go to Kafka, which acts as a distributed WAL for the database. Each box in the cluster reads from Kafka and then writes to a particular HaloDB instance.

Since all writes to HaloDB are single-threaded, I haven't spent any time optimizing the performance of concurrent writes. As I mentioned in the TODO comment in the put method, we can optimize concurrent writes by using more fine-grained locking.

@ShuaiJunlan Do you currently have a use case that requires multi-threaded writes and for which the performance is poor?
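
For readers wondering what "more fine-grained locking" can look like in general, here is a generic lock-striping sketch. It is not HaloDB code and not a proposed patch: `StripedLocks`, `stripeFor`, and `withKeyLock` are hypothetical names. It also assumes that the protected work is independent per key; in the put method quoted above, the sequence number and the file append are shared across keys and would still need their own coordination.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

/** Generic lock striping: keys hash to one of N locks instead of one global lock. */
final class StripedLocks {
    private final ReentrantLock[] stripes;

    StripedLocks(int stripeCount) {
        stripes = new ReentrantLock[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new ReentrantLock();
        }
    }

    /** The same key bytes always map to the same stripe. */
    int stripeFor(byte[] key) {
        return Math.floorMod(Arrays.hashCode(key), stripes.length);
    }

    /** Runs the action while holding only the lock for this key's stripe. */
    void withKeyLock(byte[] key, Runnable action) {
        ReentrantLock lock = stripes[stripeFor(key)];
        lock.lock();
        try {
            action.run();
        } finally {
            lock.unlock();
        }
    }
}
```

Writers whose keys fall on different stripes do not block each other, while writers on the same key (or on colliding stripes) are still serialized.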

amannaly added the question Need more discussion label Oct 2, 2018