Enable clients to scan offline tables using ScanServers#6156
Enable clients to scan offline tables using ScanServers#6156dlmarion merged 14 commits intoapache:2.1from
Conversation
During a normal client scan the TabletLocator resolves tablets (key extent and location) for a given search range. The location is necessary for the client to be able to create a connection with a tablet server to perform the scan, but the location is not needed when the client is using scan servers. The TabletLocator does not resolve tablets for offline tables. This change introduces the OfflineTabletLocatorImpl that performs this resolution (range -> key extents) and does not provide any location information. This change also modifies the client to allow scans on offline tables when using scan servers and uses the new OfflineTabletLocatorImpl in that code path.
|
This is marked as draft as the more complex text in the new IT class is having an issue with running out of memory. These changes are functional in the smaller scale test, so I likely have a scaling issue and maybe some bugs to work out. |
core/src/main/java/org/apache/accumulo/core/clientImpl/OfflineTabletLocatorImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/clientImpl/OfflineTabletLocatorImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/clientImpl/OfflineTabletLocatorImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/clientImpl/OfflineTabletLocatorImpl.java
Show resolved
Hide resolved
| if (getConsistencyLevel() == ConsistencyLevel.IMMEDIATE) { | ||
| try { | ||
| String tableName = context.getTableName(tableId); | ||
| context.requireNotOffline(tableId, tableName); |
There was a problem hiding this comment.
If this can do ZK operation per creation of a scanner iterator it could cause problems. Not sure what the impl of this method does.
There was a problem hiding this comment.
It's using ZooCache, not hitting ZK directly.
Took this out of draft as I have the IT working ( I fixed the known issues) and I reworked the cache to hopefully provide better memory management at larger scales. |
core/src/main/java/org/apache/accumulo/core/clientImpl/OfflineTabletLocatorImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/clientImpl/OfflineTabletLocatorImpl.java
Outdated
Show resolved
Hide resolved
| } | ||
| } | ||
|
|
||
| private KeyExtent findOrLoadExtent(KeyExtent start) { |
There was a problem hiding this comment.
Could simplify the locking by making extents use a ConcurrentSkip list. Then would not need the read lock. Could have a single lock only for the case of doing updates, the code that currently gets a write lock. Also the eviction handler could directly remove from the extents map if it were a concurrently skip list w/o any locking.
There was a problem hiding this comment.
I attempted this locally and it doesn't work the way that we want it to. When the cache becomes full Caffeine may start evicting newly inserted entries which ends up causing the IT to fail.
core/src/main/java/org/apache/accumulo/core/clientImpl/OfflineTabletLocatorImpl.java
Outdated
Show resolved
Hide resolved
| Integer.parseInt(ClientProperty.OFFLINE_LOCATOR_CACHE_SIZE.getValue(clientProperties)); | ||
| prefetch = Integer | ||
| .parseInt(ClientProperty.OFFLINE_LOCATOR_CACHE_PREFETCH.getValue(clientProperties)); | ||
| cache = Caffeine.newBuilder().expireAfterAccess(cacheDuration).initialCapacity(maxCacheSize) |
There was a problem hiding this comment.
The cache could have a weigher. The could be useful for the case where tablets splits can widely vary in size.
core/src/main/java/org/apache/accumulo/core/clientImpl/OfflineTabletLocatorImpl.java
Outdated
Show resolved
Hide resolved
|
In 40cafc0 I made some changes to handle the max cache size manually instead of letting Caffeine handle this case. The Caffeine cache is being used to handle the expiration of KeyExtents from the |
| // cache so that they are not immediately evicted. | ||
| if (cacheCount.get() + prefetch + 1 >= maxCacheSize) { | ||
| int evictionSize = prefetch * 2; | ||
| Set<KeyExtent> candidates = new HashSet<>(evictionPolicy.coldest(evictionSize).keySet()); |
There was a problem hiding this comment.
evictionPolicy.coldest() works when setting the Caffeine maximumSize option. Another approach would be to not set the maximumSize on the Cache at all, and use cache.policy().expireAfterAccess().orElseThrow().youngest(int) instead. I'm not sure, but this may be functionally equivalent but might allow us to remove some code trying to deal with Caffeine removing objects prematurely based on size.
There was a problem hiding this comment.
On second thought, with maximumSize option set, then Caffeine will evict based on a modified LRU algorithm, which is likely what we want. We don't want to blindly evict the youngest entries. In this contrived test the youngest entries are the least recently used, but that's not going to be the case in a long-lived process.
| } finally { | ||
| lock.readLock().unlock(); | ||
| } | ||
| lock.writeLock().lock(); |
There was a problem hiding this comment.
The try{}finally{ unlock()} should immediately follow the lock to ensure its unlocked even if there are exceptions.
There was a problem hiding this comment.
I'm not sure I'm following your comment. The code is doing:
lock.readLock().lock();
try {
..
} finally {
lock.readLock().unlock();
}
lock.writeLock().lock();
...
There was a problem hiding this comment.
There are many lines of code, that could throw an exception, between acquiring the write lock and the try/finally that releases the lock.
There was a problem hiding this comment.
Ah, I see what you are referring to now.
core/src/main/java/org/apache/accumulo/core/clientImpl/OfflineTabletLocatorImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/clientImpl/OfflineTabletLocatorImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/clientImpl/TabletLocator.java
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/conf/ClientProperty.java
Outdated
Show resolved
Hide resolved
|
Build with sunny profile passed locally for me |
During a normal client scan the TabletLocator resolves tablets (key extent and location) for a given search range. The location is necessary for the client to be able to create a connection with a tablet server to perform the scan, but the location is not needed when the client is using scan servers. The TabletLocator does not resolve tablets for offline tables.
This change introduces the OfflineTabletLocatorImpl that performs this resolution (range -> key extents) and does not provide any location information. This change also modifies the client to allow scans on offline tables when using scan servers and uses the new OfflineTabletLocatorImpl in that code path.