Improve FileInfoBackingCache #1284

Closed
sijie wants to merge 2 commits

Conversation

sijie
Member

@sijie sijie commented Mar 21, 2018

Descriptions of the changes in this PR:

A couple of issues were noticed in FileInfoBackingCache:

  1. There is a race condition in loadFileInfo between the get-check and the put. If concurrent loads happen, another FileInfo can be inserted into the map after the get-check, which leaves a FileInfo with an incorrect reference count.

  2. FileLoader performs an I/O operation, and it does so under a giant write lock.

  3. assert is not recommended, since assertions are typically disabled at production runtime.

Changes

  • Check whether the FileInfo already exists after acquiring the write lock and before the put
  • Move all I/O operations out of the write lock
  • Release the newly created FileInfo if a concurrent put wins the race
  • Remove the usage of assert

Besides that, switch to ConcurrentLongHashMap to avoid boxing and unboxing of ledger ids. A rough sketch of the resulting loadFileInfo flow is shown below.
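
A minimal sketch of the intended loadFileInfo flow (not the exact patch; tryRetain()/release() are approximations of the reference-counting helpers on CachedFileInfo). The file lookup runs outside any lock, while the existence re-check and the put run under the write lock:

CachedFileInfo loadFileInfo(long ledgerId, byte[] masterKey) throws IOException {
    lock.readLock().lock();
    try {
        CachedFileInfo fi = fileInfos.get(ledgerId);
        if (fi != null && fi.tryRetain()) {
            return fi; // fast path: already cached and retained
        }
    } finally {
        lock.readLock().unlock();
    }

    // the I/O (directory lookup) happens outside the lock
    File backingFile = fileLoader.load(ledgerId, masterKey != null);
    CachedFileInfo newFi = new CachedFileInfo(ledgerId, backingFile, masterKey);

    lock.writeLock().lock();
    try {
        // re-check: another thread may have inserted a FileInfo while we were loading
        CachedFileInfo existing = fileInfos.get(ledgerId);
        if (existing != null && existing.tryRetain()) {
            newFi.release(); // release the FileInfo that lost the race
            return existing;
        }
        fileInfos.put(ledgerId, newFi);
        newFi.tryRetain();
        return newFi;
    } finally {
        lock.writeLock().unlock();
    }
}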

Related Issues:

#913 #832

@sijie sijie added this to the 4.7.0 milestone Mar 21, 2018
@sijie sijie self-assigned this Mar 21, 2018
@sijie
Member Author

sijie commented Mar 22, 2018

retest this please

Contributor

@ivankelly ivankelly left a comment

This change introduces a race. (see comment)

Could you give a sequence of events for how the other race occurs?

Has the locking been shown to be a problem in benchmarking?
We can get rid of it in other ways, such as:

  • computeIfAbsent() // lock is at section level
  • Make CachedFileInfo load and close. Presence of a CachedFileInfo in fileInfos should work as a lock. Only load after a successful put, only remove after marking dead and flushing.

@@ -92,8 +107,20 @@ private void releaseFileInfo(long ledgerId, CachedFileInfo fileInfo) {
}

void closeAllWithoutFlushing() throws IOException {
for (Map.Entry<Long, CachedFileInfo> entry : fileInfos.entrySet()) {
entry.getValue().close(false);
try {
Contributor

I don't see the point of this change. I'm pretty sure a foreach loop was used in the first place to avoid something like this.

Member Author

there is no entrySet in ConcurrentLongHashMap

Contributor

ya, saw that later. Better to use UncheckedIOException in any case. Maybe even change FileInfo close to throw it.

Member Author

I don't think we should change the signature to throw an unchecked IO exception. Checked exceptions are handled well along the whole path, but unchecked exceptions are not.

The logic here is just to get around the fact that this interface doesn't allow throwing a checked exception, so the checked exception is wrapped and rethrown. It doesn't really matter whether the wrapper is an unchecked execution exception or an unchecked IO exception.

Contributor

It doesn't really matter whether the wrapper is an unchecked execution exception or an unchecked IO exception.

The difference is the instanceof. With UncheckedIOException:

try {
    // blah
} catch (UncheckedIOException uio) {
    throw uio.getCause(); // returns an IOException
}

Member Author

okay will change that.

Member Author

done. changed to use UncheckedIOException
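
For reference, a generic illustration of the wrap/rethrow idiom discussed here (a sketch using a plain java.util.Map for clarity, since the real map is ConcurrentLongHashMap and its iteration API differs):

void closeAllWithoutFlushing() throws IOException {
    try {
        fileInfos.forEach((ledgerId, fileInfo) -> {
            try {
                // close without flushing; the lambda cannot throw a checked IOException
                fileInfo.close(false);
            } catch (IOException ioe) {
                throw new UncheckedIOException(ioe);
            }
        });
    } catch (UncheckedIOException uioe) {
        // unwrap and rethrow the original checked IOException
        throw uioe.getCause();
    }
}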

}
} finally {
lock.readLock().unlock();
}

File backingFile = fileLoader.load(ledgerId, masterKey != null);
CachedFileInfo newFi = new CachedFileInfo(ledgerId, backingFile, masterKey);
Contributor

I put the load inside the lock for a reason. This change introduces a race.

  | ThreadA                                | ThreadB                                |
  | ====================================== | ====================================== |
  | calls loadFileInfo                     |                                        |
  | - checks fileInfos under read lock     |                                        |
  | - loads fileinfo for file              |                                        |
  |                                        | calls loadFileInfo                     |
  |                                        | - checks fileInfos under read lock     |
  |                                        | - loads fileinfo for file              |
  |                                        | - puts into fileInfos under write lock |
  |                                        |                                        |
  |                                        | // does something else, maybe fencing  |
  |                                        |                                        |
  |                                        | calls releaseFileInfo                  |
  |                                        | - flushes out fencing flag             |
  |                                        | - removes from fileInfo                |
  | - puts into fileInfos under write lock |                                        |

Member Author

I am not sure the race condition you describe exists.

1) This method is only called by the guava cache when loading the file info, and the guava cache guarantees that a pending load for a given key is awaited. So ThreadA and ThreadB cannot both be loading on one guava cache: **one has to be loading from the write fi cache, and the other has to be loading from the read fi cache.**

2) The file info loader is basically doing a file lookup and constructing the file info object. It doesn't touch any persistent state (e.g. it doesn't create the file). So if the file already exists on disk, the file info objects created by ThreadA and ThreadB are identical, and there is no race condition. The only concern is when the file doesn't exist on disk and ThreadA picks one directory while ThreadB picks another.

Now let's talk about "ThreadA picks one directory while ThreadB picks another directory".

In this case neither ThreadA nor ThreadB can find the file on disk, which means it is a first-time creation. A read request can only attempt to load a file info after a write request has created one:
https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/HandleFactoryImpl.java#L61

So this case won't actually happen.

Hope this clarifies.

Contributor

The race exists according to the contract provided by FileInfoBackingCache. I'm pretty sure it can happen with the cache loaders also, though that should be irrelevant to the discussion: whether a container is safe shouldn't depend on how it is used.

  | Read Fi CacheLoader                    | Write Fi CacheLoader                      |
  | ====================================== | ======================================    |
  | calls loadFileInfo                     |                                           |
  | - checks fileInfos under read lock     |                                           |
  | - loads fileinfo for file              |                                           |
  |                                        | calls loadFileInfo                        |
  |                                        | - checks fileInfos under read lock        |
  |                                        | - loads fileinfo for file                 |
  |                                        | - puts into fileInfos under write lock    |
  |                                        |                                           |
  |                                        | // does something else, maybe fencing     |
  |                                        |                                           |
  |                                        | calls releaseFileInfo                     |
  |                                        | - flushes out fencing flag                |
  |                                        | - removes from fileInfo                   |
  | - puts into fileInfos under write lock |                                           |
  |                                        | calls loadFileInfo                        |
  |                                        | - checks fileInfos under read lock        |
  |                                        | - finds fileInfo loaded from other thread |
  |                                        | - fileInfo does not appear to be fenced   |
  |                                        |                                           |

the file info objects created by ThreadA and ThreadB are identical.

At the time of creation yes. But one thread can modify the state and persist it back, and the other thread would be unaware of these state changes.

Member Author

The race exists according to the contract provided by FileInfoBackingCache. I'm pretty sure it can happen with the cache loaders also, though that should be irrelevant to the discussion: whether a container is safe shouldn't depend on how it is used.

This is not a general-purpose class; it is a class for the file info cache, so it is fair to take how FileInfo and the file info cache work into account. Your example is exactly the kind of example showing this class is tied to FileInfo and the file info cache.

In your new example you say "fileInfo does not appear to be fenced", but how is that possible? The fence bit is lazily loaded via FileInfo#checkOpen. At the point the Read Fi CacheLoader puts the file info into the cache, both the write and the read fi cache loaders use the same file info object, so they will see the exact same fence bit.

But one thread can modify the state and persist it back, and the other thread would be unaware of these state changes.

How is that possible? I only move the construction and the file lookup out of the write lock; this doesn't change the original behavior. The get and put are still under the write lock, and the read and write fi caches will reference the same FileInfo when it already exists.

The only concern would come from the file lookup being outside the lock, and I explained in my previous comment why that isn't a problem.

Contributor

ah, I had it in my head that the loader actually read the file, but it should have been obvious it didn't.

This change works fine then. Why not get rid of the lock completely though?

@sijie
Member Author

sijie commented Mar 22, 2018

Could you give a sequence of events for how the other race occurs?

I am using your diagram, applied to the existing logic:

  | Read Fi CacheLoader                    | Write Fi CacheLoader                   |
  | ====================================== | ====================================== |
  | calls loadFileInfo                     |                                        |
  | - checks fileInfos under read lock     |                                        |
  |                                        |                                        |
  |                                        | calls loadFileInfo                     |
  |                                        | - checks fileInfos under read lock     |
  |                                        | - creates a file info and puts it      |
  |                                        |   into fileInfos under write lock      |
  | - creates another file info and puts   |                                        |
  |   it into fileInfos under write lock   |                                        |

The Read Fi CacheLoader will overwrite the file info created by the Write Fi CacheLoader.

Has the locking been shown to be a problem in benchmarking?

It is a giant write lock, so every thread blocks waiting for the file lookup (which is done by the file loader).

Contributor

@ivankelly ivankelly left a comment

The Read Fi CacheLoader will overwrite the file info created by the Write Fi CacheLoader.

Ah yes. At minimum we should be checking again for existence inside the write lock.

It is a giant write lock, so every thread blocks waiting for the file lookup (which is done by the file loader).

But again, my question is whether this has been observed to be a problem?

Anyhow, if we want to get rid of the locking, we should get rid of the big lock completely, which shouldn't be too hard with the concurrent map.

Consider the following, where the lock is at the ledger granularity

CachedFileInfo loadFileInfo(long ledgerId, byte[] masterKey) throws IOException {
    while (true) {
        // presence of the CachedFileInfo in the map acts as the per-ledger lock
        CachedFileInfo fi = fileInfos.computeIfAbsent(
                ledgerId, (id) -> new CachedFileInfo(id));
        synchronized (fi) {
            if (!fi.initialized()) {
                // load lazily, under the per-ledger monitor rather than a global lock
                File backingFile = fileLoader.load(ledgerId, masterKey != null);
                fi.initialize(backingFile, masterKey);
            }
        }
        if (fi.tryRetain()) {
            return fi;
        }
        // tryRetain failed: the entry was closed concurrently, so retry
    }
}


@sijie
Member Author

sijie commented Mar 23, 2018

Ah yes. At minimum we should be checking again for existence inside the write lock.

This is one of the changes I made here: checking for existence again inside the write lock.

But again, my question is whether this has been observed to be a problem?

Not in a benchmark yet. I hit the race condition, and then I noticed that the file lookup is done under a giant lock, which I would like to remove, because we have been working hard in SortedLedgerStorage to avoid doing any I/O under a lock. FileInfoBackingCache unfortunately added this behavior back, so I have a strong concern about it and want to move the I/O operation out. However, I don't want to change the locking behavior for getting a FileInfo, for the reasons explained below.

if we want to get rid of the locking, we should get rid of the big lock completely, which shouldn't be too hard with the concurrent map.

A lot of race conditions can arise when competing to get the FileInfo, so it is much safer for the existence check and the put to happen under the write lock. They are just memory operations, which are fine to do under the lock.

However, finding the directory for a ledger is an I/O operation; it is expensive and less predictable than a memory operation, so it should not happen under the lock.

So the current approach is the most comfortable and safest approach I can think of. However, if you feel strongly about this, feel free to pursue your approach.

@ivankelly
Contributor

I'm fine with the current approach, but it seems pointless to have locks around a concurrent hashmap. I've made a PR into your branch with a lockless approach.

sijie#5

@sijie
Member Author

sijie commented Mar 23, 2018

@ivankelly

it seems pointless to have locks around a concurrent hashmap

Well, it depends on how you look at it. A concurrent hashmap is a segment-based implementation, so it has better parallelism and less contention on access; in particular, most of the time the map here is only protected by a read lock and is only blocked when the map is going to be modified.

I've made a PR into your branch with a lockless approach.

I have thought about a lockless approach, but I didn't do it. A lockless approach moves the mutation out of the protection of the write lock, which can potentially expose it to race conditions we haven't thought of. This area has been very tricky, based on the PRs and discussions we have had around it, so it is too risky a change for me. That's also the reason I explained above, and the reason I don't feel comfortable with your lockless change.

My immediate concern is only "don't do I/O under a lock". I am much more comfortable keeping the read/write lock until there is evidence showing that the read/write lock is a problem.

@ivankelly
Contributor

Sure. I'm fine with the current approach. If we were using stock ConcurrentHashMap it'd be less risky, but since the hashmap is our code too, it's better to err on the safe side.

I'll push the rename of FileLoader in a different PR.

@sijie
Member Author

sijie commented Mar 23, 2018

If we were using stock ConcurrentHashMap it'd be less risky, but since the hashmap is our code too, it's better to err on the safe side.

ConcurrentLongHashMap is already everywhere in the code base :)

@ivankelly
Contributor

retest this please

@sijie
Member Author

sijie commented Mar 23, 2018

retest this please

@sijie sijie closed this in 1bd2c0e Mar 24, 2018
@sijie sijie deleted the improve_fileinfo_backing_cache branch July 16, 2018 02:42