KNOX-2134 - Add caching to ZookeeperRemoteAliasService #205
pzampino wants to merge 1 commit into apache:master from
Conversation
risdenk left a comment:
I left some initial comments. I need to take a closer look in an IDE.
    zkAlias.encrypt(expectedPasswordDev).getBytes(StandardCharsets.UTF_8));

    try {
        Thread.sleep(1000);
Why do we need to do this? Why 1 second? What are we sleeping for?
It takes some time for the znode changes to propagate to the listener. Back when I had done the original remote registry work, I had used much shorter delay intervals in the tests, but then the tests would intermittently fail. The delay in those tests was bumped to 1s, so I've done the same here, to avoid the same intermittent failures.
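One alternative to a fixed one-second sleep in tests is a bounded poll that returns as soon as the listener has observed the change. This is only a sketch of that idea, not code from the PR; the `BooleanSupplier` condition stands in for whatever the test actually checks (e.g. that the cache reflects the znode change):

```java
import java.util.function.BooleanSupplier;

public class TestAwait {
    // Polls the condition until it holds or the timeout elapses, instead of a
    // single fixed Thread.sleep(1000) that either wastes time or still races.
    public static boolean await(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // condition never held within the timeout
            }
            try {
                Thread.sleep(25); // short poll interval
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag, give up waiting
                return false;
            }
        }
        return true;
    }
}
```

With a generous timeout (say 5s) the test still fails loudly when propagation never happens, but the common case finishes in well under a second.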
    try {
        Thread.sleep(1000);
    } catch (InterruptedException e) {
        //
Reset the interrupted thread status. I fixed a bunch of these in https://issues.apache.org/jira/browse/KNOX-2130, though I couldn't figure out how to make the build fail on them, sadly :(
    Thread.currentThread().interrupt();
Well, this is test-only code, and nothing further up the stack is checking the interrupted status in this case. So I'm not convinced there is much value in setting the interrupted status here.
    try {
        Thread.sleep(1000);
    } catch (InterruptedException e) {
        //
Same as above - Thread.currentThread().interrupt();
Also, again: why 1 second?
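The idiom being suggested can be wrapped in a small helper so the tests stay readable. This is a sketch under the assumption that the sleeps stay in place; `pause` is a hypothetical name, not something in the PR:

```java
public class SleepExample {
    // Sleeps for the given duration; on interruption, restores the interrupt
    // flag (as suggested above) so callers further up the stack can observe it,
    // and reports whether the full sleep completed.
    public static boolean pause(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // reset the interrupted status
            return false;
        }
    }
}
```

The key point is that `Thread.sleep` clears the interrupt flag when it throws, so swallowing the exception silently erases the interruption; re-interrupting keeps the signal visible.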
        }
    } else {
        // Add it to the local cache only
        Map<String, String> clusterAliases = aliasCache.computeIfAbsent(clusterName, m -> new ConcurrentHashMap<>());
These two lines are used a few times. It might help to extract them into an "add to alias cache" method.
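A possible shape for that helper, sketched outside the real class (the `cacheAlias`/`getCachedAlias` names are hypothetical; `aliasCache` mirrors the PR's cluster-to-aliases map):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AliasCacheExample {
    // Mirrors the PR's cache: cluster name -> (alias name -> value)
    private final Map<String, Map<String, String>> aliasCache = new ConcurrentHashMap<>();

    // The two repeated lines, factored into one place as suggested
    public void cacheAlias(String clusterName, String alias, String value) {
        Map<String, String> clusterAliases =
            aliasCache.computeIfAbsent(clusterName, m -> new ConcurrentHashMap<>());
        clusterAliases.put(alias, value);
    }

    // Null-safe lookup companion, so callers don't repeat the containsKey dance
    public String getCachedAlias(String clusterName, String alias) {
        Map<String, String> clusterAliases = aliasCache.get(clusterName);
        return (clusterAliases != null) ? clusterAliases.get(alias) : null;
    }
}
```

`computeIfAbsent` on a `ConcurrentHashMap` is atomic, so two threads caching aliases for the same new cluster cannot clobber each other's inner map.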
    if (aliasCache.containsKey(paths[0])) {
        aliasCache.get(paths[0]).remove(paths[1]);
    }
This could be another helper method for the alias cache: remove from aliasCache. Also, I'm not sure whether there is a possible race condition here between the containsKey check and the remove.
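One way to narrow that check-then-act gap is a single `get` followed by a null-safe `remove` on the inner map, which is itself atomic on a `ConcurrentHashMap`. A sketch under the assumption that inner maps are never removed from `aliasCache` (the helper name is hypothetical):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AliasCacheRemoval {
    private final Map<String, Map<String, String>> aliasCache = new ConcurrentHashMap<>();

    public void cacheAlias(String clusterName, String alias, String value) {
        aliasCache.computeIfAbsent(clusterName, m -> new ConcurrentHashMap<>())
                  .put(alias, value);
    }

    // Single get instead of containsKey + get: no window in which another
    // thread's removal of the cluster entry could cause an NPE here.
    public void removeCachedAlias(String clusterName, String alias) {
        Map<String, String> clusterAliases = aliasCache.get(clusterName);
        if (clusterAliases != null) {
            clusterAliases.remove(alias); // atomic on ConcurrentHashMap
        }
    }
}
```

If whole cluster entries can also be evicted concurrently, `aliasCache.computeIfPresent(clusterName, ...)` would make the outer step atomic as well.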
    // Try the local cache first
    if (aliasCache.containsKey(clusterName)) {
        String value = aliasCache.get(clusterName).get(alias);
        if (value != null) {
            password = value.toCharArray();
        }
    }

    // If it wasn't found in the local cache, check the remote registry
    if (password == null) {
        /* try to get it from remote registry */
        if (remoteClient != null) {
            checkPathsExist(remoteClient);
            String encrypted = null;

            if (remoteClient.entryExists(buildAliasEntryName(clusterName, alias))) {
                encrypted = remoteClient
                    .getEntryData(buildAliasEntryName(clusterName, alias));
            }

            /* Generate a new password */
            if (encrypted == null) {
                if (generate) {
                    generateAliasForCluster(clusterName, alias);
                    password = getPasswordFromAliasForCluster(clusterName, alias);
                }
            } else {
                try {
                    password = decrypt(encrypted).toCharArray();
                } catch (final Exception e) {
                    throw new AliasServiceException(e);
                }
            }
        }
    }
I need to look at this in more detail - probably in an IDE. Just noting to myself to look closer.
    // First, check the alias cache
    if (aliasCache.containsKey(clusterName)) {
        remoteAliases.addAll(aliasCache.get(clusterName).keySet());
    } else { // If it's not available locally, then check ZK
        /* If we have remote registry configured, query it */
        if (remoteClient != null) {
            remoteAliases = remoteClient.listChildEntries(buildClusterEntryName(clusterName));
        }
    }
I don't think I follow this logic. In my head, the cache should be up to date; we should never just return remote data without updating the cache.
If the cache isn't up to date for whatever reason, we should update the cache and then return that data. We'd have to be careful about a race condition here, though.
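The miss-then-populate step can be made atomic per cluster with `computeIfAbsent`, so two concurrent callers can't both load from ZooKeeper and overwrite each other. A minimal sketch of the idea, not the PR's code; the `remoteLoader` function is a hypothetical stand-in for the ZooKeeper query:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class AliasListExample {
    private final Map<String, Map<String, String>> aliasCache = new ConcurrentHashMap<>();
    // Hypothetical loader standing in for the remote registry query
    private final Function<String, Map<String, String>> remoteLoader;

    public AliasListExample(Function<String, Map<String, String>> remoteLoader) {
        this.remoteLoader = remoteLoader;
    }

    // On a miss, loads from the remote source, populates the cache, and
    // returns the cached data; computeIfAbsent makes this atomic per key.
    public Set<String> listAliases(String clusterName) {
        Map<String, String> clusterAliases = aliasCache.computeIfAbsent(
            clusterName,
            name -> new ConcurrentHashMap<>(remoteLoader.apply(name)));
        return clusterAliases.keySet();
    }
}
```

The trade-off is that the loader runs while holding the bin lock, so a slow ZooKeeper call briefly blocks other cache operations on that cluster; for infrequent misses that is usually acceptable.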
In light of the related discussion, closing this PR to work on the more complete solution.
What changes were proposed in this pull request?
Added a cache to ZookeeperRemoteAliasService to improve the performance of alias lookups.
How was this patch tested?
Added a test to ZookeeperRemoteAliasServiceTest.