Enable clients to scan offline tables using ScanServers#6156

Merged
dlmarion merged 14 commits into apache:2.1 from dlmarion:sserver-offline-tables
Mar 4, 2026
Conversation

@dlmarion (Contributor)

During a normal client scan the TabletLocator resolves tablets (key extent and location) for a given search range. The location is necessary for the client to be able to create a connection with a tablet server to perform the scan, but the location is not needed when the client is using scan servers. The TabletLocator does not resolve tablets for offline tables.

This change introduces the OfflineTabletLocatorImpl that performs this resolution (range -> key extents) and does not provide any location information. This change also modifies the client to allow scans on offline tables when using scan servers and uses the new OfflineTabletLocatorImpl in that code path.
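The range-to-extent resolution described above can be sketched roughly as follows. This is a hypothetical, simplified stand-in (the class and method names are invented here): the real OfflineTabletLocatorImpl reads split points from tablet metadata, while this sketch takes them as input. An extent is the row range (prevEndRow, endRow], with a null end row on the last tablet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch of range -> key-extent resolution for an offline table.
public class OfflineExtentResolver {
  // endRow -> prevEndRow for every bounded tablet
  private final TreeMap<String, String> extents = new TreeMap<>();
  private final String lastPrevEndRow; // prevEndRow of the final, unbounded tablet

  public OfflineExtentResolver(List<String> splitPoints) {
    String prev = null;
    for (String split : splitPoints) {
      extents.put(split, prev);
      prev = split;
    }
    lastPrevEndRow = prev;
  }

  // Returns "prevEndRow,endRow" for each extent overlapping [startRow, endRow].
  public List<String> lookup(String startRow, String endRow) {
    List<String> result = new ArrayList<>();
    for (var e : extents.tailMap(startRow, true).entrySet()) {
      result.add(e.getValue() + "," + e.getKey());
      if (e.getKey().compareTo(endRow) >= 0) {
        return result; // this extent covers the end of the search range
      }
    }
    // The range extends past the last split point: add the unbounded tablet.
    result.add(lastPrevEndRow + ",null");
    return result;
  }

  public static void main(String[] args) {
    OfflineExtentResolver r = new OfflineExtentResolver(List.of("g", "m", "t"));
    System.out.println(r.lookup("h", "n")); // extents (g,m] and (m,t]
  }
}
```

Note that no location is produced anywhere in the lookup, which is the point of the change: scan servers only need the key extents.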

@dlmarion (Contributor, Author)

This is marked as draft as the more complex test in the new IT class is running out of memory. These changes are functional in the smaller-scale test, so I likely have a scaling issue and maybe some bugs to work out.

@dlmarion dlmarion self-assigned this Feb 25, 2026
@dlmarion dlmarion added this to the 2.1.5 milestone Feb 25, 2026
if (getConsistencyLevel() == ConsistencyLevel.IMMEDIATE) {
  try {
    String tableName = context.getTableName(tableId);
    context.requireNotOffline(tableId, tableName);
Contributor:

If this can do a ZK operation per creation of a scanner iterator, it could cause problems. Not sure what the impl of this method does.

Contributor Author:

It's using ZooCache, not hitting ZK directly.
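A rough illustration of the ZooCache idea (the class below is invented for illustration and is not Accumulo's ZooCache): the first read of a ZooKeeper path pays the round trip, and repeated reads, such as one per scanner creation, are served from memory.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of a ZooCache-style read-through cache: look up a
// ZooKeeper node once, then serve repeated reads from memory.
public class ZkReadCache {
  private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
  private final Function<String, byte[]> zkRead; // stands in for the real ZK round trip

  public ZkReadCache(Function<String, byte[]> zkRead) {
    this.zkRead = zkRead;
  }

  public byte[] get(String zPath) {
    // Only the first lookup for a path pays the ZooKeeper cost.
    return cache.computeIfAbsent(zPath, zkRead);
  }
}
```

The real ZooCache also invalidates cached entries when the underlying node changes, which this sketch omits.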

@dlmarion dlmarion marked this pull request as ready for review February 26, 2026 19:17
@dlmarion (Contributor, Author)

> This is marked as draft as the more complex test in the new IT class is running out of memory. These changes are functional in the smaller-scale test, so I likely have a scaling issue and maybe some bugs to work out.

Took this out of draft as I have the IT working (I fixed the known issues) and I reworked the cache to hopefully provide better memory management at larger scales.

}
}

private KeyExtent findOrLoadExtent(KeyExtent start) {
Contributor:

Could simplify the locking by making extents a ConcurrentSkipListSet; then the read lock would not be needed. Could have a single lock only for the case of doing updates, i.e. the code that currently takes the write lock. Also, the eviction handler could directly remove from the extents map, without any locking, if it were a concurrent skip list.
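The suggestion could look roughly like the following sketch (names invented here; this is not the PR's code): a ConcurrentSkipListMap keyed by end row gives lock-free lookups via ceilingEntry, a single lock covers updates, and eviction can remove entries directly.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Sketch of the reviewer's suggestion: extents in a concurrent skip-list map
// keyed by end row, so lookups need no read lock and the eviction handler can
// remove entries without locking. The empty string stands in for a null
// prevEndRow, since ConcurrentSkipListMap rejects null values.
public class ExtentSkipList {
  private final ConcurrentSkipListMap<String, String> extents =
      new ConcurrentSkipListMap<>(); // endRow -> prevEndRow
  private final Object updateLock = new Object(); // only needed for updates

  public void put(String endRow, String prevEndRow) {
    synchronized (updateLock) {
      extents.put(endRow, prevEndRow);
    }
  }

  // Lock-free read: the extent containing `row` is the one whose end row is
  // the smallest key >= row.
  public String extentFor(String row) {
    Map.Entry<String, String> e = extents.ceilingEntry(row);
    return e == null ? null : e.getValue() + "," + e.getKey();
  }

  // Eviction can remove directly, with no locking at all.
  public void evict(String endRow) {
    extents.remove(endRow);
  }
}
```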

Contributor Author:

I attempted this locally and it doesn't work the way that we want it to. When the cache becomes full, Caffeine may start evicting newly inserted entries, which ends up causing the IT to fail.

Integer.parseInt(ClientProperty.OFFLINE_LOCATOR_CACHE_SIZE.getValue(clientProperties));
prefetch = Integer
.parseInt(ClientProperty.OFFLINE_LOCATOR_CACHE_PREFETCH.getValue(clientProperties));
cache = Caffeine.newBuilder().expireAfterAccess(cacheDuration).initialCapacity(maxCacheSize)
Contributor:

The cache could have a weigher. That could be useful for the case where tablet splits can vary widely in size.
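For illustration, a pure-Java sketch of what a weigher buys. Caffeine itself would be configured with maximumWeight() and weigher(); the stand-in class below is invented here and mimics LRU eviction with an access-ordered LinkedHashMap.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the weigher idea: bound the cache by accumulated weight (here,
// key+value length) instead of entry count, which matters when extents vary
// widely in size. Not the PR's code.
public class WeightBoundedCache {
  private final long maxWeight;
  private long currentWeight = 0;
  private final LinkedHashMap<String, String> map =
      new LinkedHashMap<>(16, 0.75f, true); // access order: LRU first

  public WeightBoundedCache(long maxWeight) {
    this.maxWeight = maxWeight;
  }

  private static long weigh(String key, String value) {
    return key.length() + value.length();
  }

  public void put(String key, String value) {
    String old = map.put(key, value);
    if (old != null) {
      currentWeight -= weigh(key, old);
    }
    currentWeight += weigh(key, value);
    // Evict least-recently-used entries until back under the weight bound,
    // but never the entry that was just inserted.
    var it = map.entrySet().iterator();
    while (currentWeight > maxWeight && it.hasNext()) {
      Map.Entry<String, String> e = it.next();
      if (e.getKey().equals(key)) {
        continue;
      }
      currentWeight -= weigh(e.getKey(), e.getValue());
      it.remove();
    }
  }

  public String get(String key) {
    return map.get(key);
  }

  public int size() {
    return map.size();
  }
}
```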

@dlmarion (Contributor, Author)

dlmarion commented Mar 2, 2026

In 40cafc0 I made some changes to handle the max cache size manually instead of letting Caffeine handle it. The Caffeine cache is being used to expire KeyExtents from the extents set based on last access time. The issue with Caffeine when nearing the max size is that it will select new entries in the cache for eviction. We don't want this behavior because we are pre-fetching KeyExtents from the metadata table for performance reasons. The new logic asks Caffeine for a set of the coldest entries, then removes the ones that fall before the KeyExtent that we are searching for.

// cache so that they are not immediately evicted.
if (cacheCount.get() + prefetch + 1 >= maxCacheSize) {
int evictionSize = prefetch * 2;
Set<KeyExtent> candidates = new HashSet<>(evictionPolicy.coldest(evictionSize).keySet());
Contributor Author:

evictionPolicy.coldest() works when setting the Caffeine maximumSize option. Another approach would be to not set maximumSize on the cache at all and use cache.policy().expireAfterAccess().orElseThrow().youngest(int) instead. I'm not sure, but this may be functionally equivalent and might allow us to remove some of the code that deals with Caffeine removing objects prematurely based on size.

Contributor Author:

On second thought, with the maximumSize option set, Caffeine will evict based on a modified LRU algorithm, which is likely what we want. We don't want to blindly evict the youngest entries. In this contrived test the youngest entries are the least recently used, but that won't be the case in a long-lived process.
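The manual scheme discussed in this thread can be sketched as follows (a stand-in, with an access-ordered LinkedHashMap playing the role of Caffeine's Policy.Eviction#coldest; none of this is the PR's code): before a prefetch would push the cache past its maximum size, take a batch of the coldest entries and evict only those sorting before the row being searched for, so freshly prefetched extents are not immediately thrown away.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of prefetch-friendly manual eviction. The front of an
// access-ordered map stands in for Caffeine's "coldest" entries.
public class PrefetchFriendlyEviction {
  private final LinkedHashMap<String, String> cache =
      new LinkedHashMap<>(16, 0.75f, true); // access order: coldest first

  public void put(String endRow, String metadata) {
    cache.put(endRow, metadata);
  }

  public boolean contains(String endRow) {
    return cache.containsKey(endRow);
  }

  // Evicts up to 2*prefetch cold entries whose end row sorts before
  // searchRow, returning the evicted keys.
  public List<String> makeRoom(int prefetch, String searchRow) {
    List<String> evicted = new ArrayList<>();
    Iterator<Map.Entry<String, String>> it = cache.entrySet().iterator();
    while (it.hasNext() && evicted.size() < 2 * prefetch) {
      Map.Entry<String, String> e = it.next();
      if (e.getKey().compareTo(searchRow) < 0) {
        evicted.add(e.getKey());
        it.remove();
      }
    }
    return evicted;
  }
}
```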

} finally {
lock.readLock().unlock();
}
lock.writeLock().lock();
Contributor:

The try {} finally { unlock() } should immediately follow the lock() to ensure it's unlocked even if there are exceptions.

Contributor Author:

I'm not sure I'm following your comment. The code is doing:

lock.readLock().lock();
try {
  ...
} finally {
  lock.readLock().unlock();
}
lock.writeLock().lock();
...

Contributor:

There are many lines of code that could throw an exception between acquiring the write lock and the try/finally that releases the lock.
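The idiom being asked for, as a minimal sketch:

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// The pattern the reviewer is describing: acquire the lock, then enter try{}
// on the very next line, so an exception thrown anywhere in the critical
// section still releases the lock in finally{}.
public class LockIdiom {
  private final ReadWriteLock lock = new ReentrantReadWriteLock();
  private int value = 0;

  public void update(int delta) {
    lock.writeLock().lock();
    try { // no statements between lock() and try
      value += delta; // an exception here would still unlock below
    } finally {
      lock.writeLock().unlock();
    }
  }

  public int read() {
    lock.readLock().lock();
    try {
      return value;
    } finally {
      lock.readLock().unlock();
    }
  }
}
```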

Contributor Author:

Ah, I see what you are referring to now.

@dlmarion (Contributor, Author)

dlmarion commented Mar 3, 2026

Build with the sunny profile passed locally for me.

@dlmarion dlmarion merged commit 34823ac into apache:2.1 Mar 4, 2026
7 of 8 checks passed
@dlmarion dlmarion deleted the sserver-offline-tables branch March 4, 2026 12:18