Skip to content

KNOX-2134 - Add caching to ZookeeperRemoteAliasService#205

Closed
pzampino wants to merge 1 commit intoapache:masterfrom
pzampino:master
Closed

KNOX-2134 - Add caching to ZookeeperRemoteAliasService#205
pzampino wants to merge 1 commit intoapache:masterfrom
pzampino:master

Conversation

@pzampino
Copy link
Contributor

What changes were proposed in this pull request?

Added a cache to ZookeeperRemoteAliasService to improve the performance of alias lookups.

How was this patch tested?

Added test to ZookeeperRemoteAliasServiceTest.

Copy link
Contributor

@risdenk risdenk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some initial comments. I need to take a closer look in an IDE.

zkAlias.encrypt(expectedPasswordDev).getBytes(StandardCharsets.UTF_8));

try {
Thread.sleep(1000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to do this? Why 1 second? What are we sleeping for?

Copy link
Contributor Author

@pzampino pzampino Nov 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It takes some time for the znode changes to propagate to the listener. Back when I had done the original remote registry work, I had used much shorter delay intervals in the tests, but then the tests would intermittently fail. The delay in those tests was bumped to 1s, so I've done the same here, to avoid the same intermittent failures.

try {
Thread.sleep(1000);
} catch (InterruptedException e) {
//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reset the interrupted thread status. Fixed a bunch of that here - https://issues.apache.org/jira/browse/KNOX-2130 couldn't figure out how to fail the build sadly :(

Thread.currentThread().interrupt();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this is test-only code, and no one of importance is checking the interrupted status further up the stack in this case. So, I'm not convinced there is much value in setting the interrupted status here.

try {
Thread.sleep(1000);
} catch (InterruptedException e) {
//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above - Thread.currentThread().interrupt();

Also again why 1 second?

}
} else {
// Add it to the local cache only
Map<String, String> clusterAliases = aliasCache.computeIfAbsent(clusterName, m -> new ConcurrentHashMap<>());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 2 lines are used a few times. Might help to make it a "add to alias cache" method.

Comment on lines +460 to +462
if (aliasCache.containsKey(paths[0])) {
aliasCache.get(paths[0]).remove(paths[1]);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be another helper method for alias cache. Remove from aliasCache. Not sure if there is a possible race condition here between the check and remove.

Comment on lines -236 to +281
/* try to get it from remote registry */
if (remoteClient != null) {
checkPathsExist(remoteClient);
String encrypted = null;

if(remoteClient.entryExists(buildAliasEntryName(clusterName, alias))) {
encrypted = remoteClient
.getEntryData(buildAliasEntryName(clusterName, alias));
// Try the local cache first
if (aliasCache.containsKey(clusterName)) {
String value = aliasCache.get(clusterName).get(alias);
if (value != null) {
password = value.toCharArray();
}
}

/* Generate a new password */
if (encrypted == null) {
// If it wasn't found in the local cache, check the remote registry
if (password == null) {
/* try to get it from remote registry */
if (remoteClient != null) {
checkPathsExist(remoteClient);
String encrypted = null;

/* Generate a new password */
if (generate) {
generateAliasForCluster(clusterName, alias);
password = getPasswordFromAliasForCluster(clusterName, alias);
if (remoteClient.entryExists(buildAliasEntryName(clusterName, alias))) {
encrypted = remoteClient
.getEntryData(buildAliasEntryName(clusterName, alias));
}

} else {
try {
password = decrypt(encrypted).toCharArray();
} catch (final Exception e) {
throw new AliasServiceException(e);
/* Generate a new password */
if (encrypted == null) {

/* Generate a new password */
if (generate) {
generateAliasForCluster(clusterName, alias);
password = getPasswordFromAliasForCluster(clusterName, alias);
}

} else {
try {
password = decrypt(encrypted).toCharArray();
} catch (final Exception e) {
throw new AliasServiceException(e);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to look at this in more detail - probably in an IDE. Just noting to myself to look closer.

Comment on lines +174 to +181
// First, check the alias cache
if (aliasCache.containsKey(clusterName)) {
remoteAliases.addAll(aliasCache.get(clusterName).keySet());
} else { // If it's not available locally, then check ZK
/* If we have remote registry configured, query it */
if (remoteClient != null) {
remoteAliases = remoteClient.listChildEntries(buildClusterEntryName(clusterName));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I follow this logic. In my head, the cache should be up to date. We should never just straight return remove data without updating the cache.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the cache isn't up to date for whatever reason, we should update the cache and return that data. Now have to be careful about a race condition here.

@pzampino
Copy link
Contributor Author

pzampino commented Dec 3, 2019

In light of the related discussion, closing this PR to work on the more complete solution.

@pzampino pzampino closed this Dec 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments