New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AMBARI-23941] Updating serviceLevelParams and clusterLevelParams in server-side metadata holder both when adding and removing a service/property #1366
Conversation
…erver-side metadata holder both when adding and removing a service/property
Refer to this link for build results (access rights to CI server needed): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good except a typo.
@@ -87,7 +87,16 @@ public boolean updateServiceLevelParams(SortedMap<String, MetadataServiceInfo> u | |||
break; | |||
} | |||
} | |||
|
|||
for (String key : serviceLevelParams.keySet()) { | |||
if (!update.containsKey(key) || !update.get(key).equals(update.get(key))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a typo in this line: update.get(key)
is compared to itself (update.get(key)
) instead of serviceLevelParams.get(key)
.
Further, I think this comparison can be removed: values for keys present in both maps are already compared in the first loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm...this is weird..I fixed this issue long before I created the PR (it also broker unit tests). Let me check my stash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
break; | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is almost identical to updateServiceLevelParams
. (The duplication is not introduced in this change, but is highlighted by the fact that the same problem needs to be fixed in two places.) Can you please extract a method that takes 2 maps and updates one of them from the other? Then please delegate to that method from these two. Please make sure to keep the version that uses null
-safe comparison.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea; I'll extract it to a util method (maybe an existing/new class; will check this out)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@jonathan-hurley , this seems like it may be familiar to you. If so, then maybe there is a common solution to your issues. |
@rlevas No, my problem is separate from this one. This is simply keeping the agent and server caches in sync. The issue I'm having is that when adding or removing services/components, a massive event needs to be fired fill with data - and it doesn't happen automatically. |
@jonathan-hurley , Thanks for commenting so quickly. |
Looks good, but can you check if we need to guarantee thread safety somehow in the update method or do we always call this on a single thread? |
@zeroflag I'll take care of proper locking being applied (even if it was modified on 1 thread it does not harm). |
…dating service/cluster level parameters
@adoroszlai @zeroflag @rlevas @mpapirkovskyy please review again; thx! |
Refer to this link for build results (access rights to CI server needed): |
@smolnar82 Thanks for the changes, they look good with respect to what was requested. However, on second thought, I have some concern about how this entire change would work with "partial" updates. For example, the following method passes service-level params only for a single service. With your change, wouldn't this kind of update get rid of all other existing services from the metadata? Lines 5652 to 5655 in e9374c9
Similarly, there are cases where empty map in update means "no changes" in that part of the metadata: Lines 5620 to 5623 in e9374c9
Lines 5634 to 5637 in e9374c9
Wouldn't these updates clear service- or cluster-level params? |
@adoroszlai In the meantime I'd really appreciate if @mpapirkovskyy would review this change and let me know if this is the approach what we should follow. Thanks in advance! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also tend to think that agent change to not delete params may be better/easier for now, so it will be in sync with server.
I don't think some logic in agent scripts can rely on presence/absence of param rather than its value.
@aonishuk any thoughts?
private Set<String> statusCommandsToRun = new HashSet<>(); | ||
private SortedMap<String, MetadataServiceInfo> serviceLevelParams = new TreeMap<>(); | ||
private SortedMap<String, String> clusterLevelParams = new TreeMap<>(); | ||
private final static Lock LOCK = new ReentrantLock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for static lock?
There are multiple instances of this object and all of them are going use single lock instance.
It looks like theres better place for lock, and again not for static one:
org.apache.ambari.server.agent.stomp.MetadataHolder#handleUpdate
Another possibility is to use ConcurrentSkipListMap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to have the lock a simple one (i.e. not static). However I'd keep it here to make sure we lock only the update piece and do not spend more time on locking where it is not necessary.
final boolean changed = !Objects.equals(mapToBeUpdated, updatedMap); | ||
|
||
if (changed) { | ||
mapToBeUpdated.clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @adoroszlai that this is going to break partial metadata updates.
So we should either always send full metadata which will require additional modifications.
Or change logic here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll fix it soon
….e. service or cluster level params should not be touched) and for the case when service level params are updated for a single service only (in case of cluter level params we always use full metadata)
@adoroszlai @mpapirkovskyy @zeroflag @rlevas please review my changes; thanks in advance! I re-executed the integration test I described above and re-run the unit tests in
|
Refer to this link for build results (access rights to CI server needed): |
@adoroszlai @mpapirkovskyy @zeroflag @rlevas please review my changes; thanks in advance! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, except minor nit about locking.
public boolean updateServiceLevelParams(SortedMap<String, MetadataServiceInfo> update, boolean fullMetadataInUpdatedMap) { | ||
if (update != null) { | ||
if (this.serviceLevelParams == null) { | ||
this.serviceLevelParams = new TreeMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these map creations should also be guarded by lock
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right (this was a last minute movement at certainly not thought over); it's fixed now.
Refer to this link for build results (access rights to CI server needed): |
retest this please |
Refer to this link for build results (access rights to CI server needed): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mpapirkovskyy the solution to do this on agent sounds hacky to me, should be done by server by all logic.
However it's possible, but I think it would be better the server and agent data are in synch to avoid other issues caused by this hack.
retest this please |
@smolnar82 I think the fix for broken unit test needs to be merged from |
Refer to this link for build results (access rights to CI server needed): |
@adoroszlai if that was the case I should not have even seen the error since I did not merge anything from origin/trunk during my patches to avoid force pushing |
What changes were proposed in this pull request?
When a service has been removed from the cluster its
serviceLevelParams
stayed in server side's metadata holder but removed from the agent side cache. Next time when someone wanted to re-install the same service there were no metadata change event triggered so that the agent side code failed due to missingserviceLevelParams
for that service (the same is true forclusterLevelParams
metadata).How was this patch tested?
Added unit test code to cover these cases; latest JUnit test results in
ambari-server
:In addition to uni testing the following integration test steps were executed:
ambari-server.jar
with mine after modifying and building the code