AMBARI-25719. Fail to enable kerberos due to NullPointerException and HostNotFoundException #3354

Closed
wants to merge 6 commits into from

@eubnara eubnara commented Aug 26, 2022

I missed one log statement on a thread, so I have added it.
This PR follows #3344 (comment).

… HostNotFoundException

- log exceptions raised on threads and propagate them instead of silently ignoring them
@brahmareddybattula
Contributor

@eubnara, is there any chance of keeping the test report?

- } catch (NumberFormatException|AmbariException ignored) {}
+ } catch (NumberFormatException|AmbariException e) {
+   LOG.error("Exception on sendAgentCommand", e);
+   throw new RuntimeException(e);
Contributor

This would be an incompatible change, right? The exception was ignored before, and ActionScheduler throws only AmbariException.

Contributor Author

Could I change RuntimeException to AmbariException?
If I don't rethrow the exception, it hangs on Test Kerberos Client or Distribute Keytabs, and you have to wait until it times out.
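As general background for the hang described above: when work runs on another thread and the task swallows its own exception, the caller has nothing to observe and can only wait. The following is a minimal sketch in plain Java, not Ambari code; `SwallowedExceptionDemo` and `observedFailure` are hypothetical names, and the sketch does not claim to reproduce how Ambari's stages actually wait.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; none of these names come from Ambari.
final class SwallowedExceptionDemo {

  /**
   * Runs a task that always fails internally. Returns the failure message
   * the caller can observe, or null if the task appeared to succeed.
   */
  static String observedFailure(boolean swallow) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Future<?> result = pool.submit(() -> {
        try {
          Long.valueOf((String) null); // throws NumberFormatException
        } catch (NumberFormatException e) {
          if (!swallow) {
            // Propagate so the caller can see that the task failed.
            throw new IllegalStateException("command failed", e);
          }
          // swallow == true: the failure disappears here.
        }
      });
      result.get(); // rethrows a propagated failure as ExecutionException
      return null;
    } catch (ExecutionException e) {
      return e.getCause().getMessage();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```

Calling `SwallowedExceptionDemo.observedFailure(true)` returns `null` because the caller never learns of the failure, while `observedFailure(false)` returns the propagated message.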

Contributor

Let's throw AmbariException.

Contributor Author

@eubnara eubnara Aug 30, 2022

I think a RuntimeException or AmbariRuntimeException will be handled in ActionScheduler#run (https://github.com/apache/ambari/blob/branch-2.7/ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionScheduler.java#L352-L354).

So I just changed RuntimeException to AmbariRuntimeException.
Using AmbariException instead would require many changes.
Wouldn't it be better to use AmbariRuntimeException rather than AmbariException?
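One plausible reason a checked exception would need many changes, assuming AmbariException is checked and the catch block sits inside a lambda (as the stack trace further down this thread suggests): a lambda passed to `forEach` cannot let a checked exception escape. The sketch below uses hypothetical stand-in classes, `CheckedFailure` and `UncheckedFailure`, rather than the real Ambari types.

```java
import java.util.List;

// Hypothetical stand-ins; these are NOT the Ambari classes.
final class CheckedInLambdaDemo {

  /** Plays the role of a checked exception such as AmbariException. */
  static class CheckedFailure extends Exception {
    CheckedFailure(String message) {
      super(message);
    }
  }

  /** Plays the role of an unchecked wrapper such as AmbariRuntimeException. */
  static class UncheckedFailure extends RuntimeException {
    UncheckedFailure(Throwable cause) {
      super(cause);
    }
  }

  /** Fails with a checked exception for an empty command. */
  static void send(String command) throws CheckedFailure {
    if (command.isEmpty()) {
      throw new CheckedFailure("empty command");
    }
  }

  static void sendAll(List<String> commands) {
    // Consumer#accept declares no checked exceptions, so the lambda body
    // cannot let CheckedFailure escape; it has to catch and wrap it.
    commands.forEach(command -> {
      try {
        send(command);
      } catch (CheckedFailure e) {
        throw new UncheckedFailure(e);
      }
    });
  }
}
```

Throwing `CheckedFailure` directly from the lambda would not compile, so every enclosing lambda and caller signature would have to change; the unchecked wrapper avoids that.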

Contributor Author

There is also another line that uses AmbariRuntimeException when a step fails in the middle of a stage:
https://github.com/apache/ambari/blob/branch-2.7/ambari-server/src/main/java/org/apache/ambari/server/events/publishers/AgentCommandsPublisher.java#L178

Contributor

OK, I also think AmbariRuntimeException should be fine.

- } catch (NumberFormatException|AmbariException ignored) {}
+ } catch (NumberFormatException|AmbariException e) {
+   LOG.error("Exception on sendAgentCommand", e);
+   throw new AmbariRuntimeException(e);
Contributor Author

@eubnara eubnara Sep 1, 2022

Oh, please wait.
When installing a new cluster, a NumberFormatException is raised on the line clusterId = Long.valueOf(((ExecutionCommand)ac).getClusterId()); because the new cluster has no cluster id yet.
It hangs and cannot proceed to the next stages.
I'll fix it; please wait.

2022-09-02 07:55:34,472 ERROR [agent-command-publisher-1] AgentCommandsPublisher:115 - Exception on sendAgentCommand
java.lang.NumberFormatException: null
        at java.lang.Long.parseLong(Long.java:552)
        at java.lang.Long.valueOf(Long.java:803)
        at org.apache.ambari.server.events.publishers.AgentCommandsPublisher.lambda$null$0(AgentCommandsPublisher.java:110)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
        at com.google.common.collect.CollectSpliterators$1.lambda$forEachRemaining$1(CollectSpliterators.java:116)
        at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
        at com.google.common.collect.CollectSpliterators$1.forEachRemaining(CollectSpliterators.java:116)
        at com.google.common.collect.CollectSpliterators$1FlatMapSpliterator.lambda$forEachRemaining$1(CollectSpliterators.java:247)
        at java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1723)
        at com.google.common.collect.CollectSpliterators$1FlatMapSpliterator.forEachRemaining(CollectSpliterators.java:247)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
        at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:290)
        at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:401)
        at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:734)
        at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:159)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:173)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
        at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
        at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:650)
        at org.apache.ambari.server.events.publishers.AgentCommandsPublisher.lambda$sendAgentCommand$1(AgentCommandsPublisher.java:103)
        at java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1386)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
        at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
        at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
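
The trace shows `Long.valueOf` failing because it was handed a null string. As an illustration only, here is a minimal sketch of a guard that treats a missing id as absent instead of throwing; `ClusterIdParsing` and `parseClusterId` are hypothetical names, and this is not the fix adopted in the PR.

```java
import java.util.OptionalLong;

// Hypothetical helper, not part of Ambari.
final class ClusterIdParsing {

  /** Returns the parsed id, or empty when the id is missing or not numeric. */
  static OptionalLong parseClusterId(String rawClusterId) {
    if (rawClusterId == null) {
      // For example, a cluster that has not been assigned an id yet.
      return OptionalLong.empty();
    }
    try {
      return OptionalLong.of(Long.parseLong(rawClusterId.trim()));
    } catch (NumberFormatException e) {
      return OptionalLong.empty();
    }
  }
}
```

A caller can then branch on `isPresent()` rather than relying on a catch block to absorb the null case.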

Contributor Author

@brahmareddybattula
Hmm... Would it be better not to rethrow and just log the exception?

Contributor

Yes, just logging should be fine; we only need to avoid the hang, right?
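
The log-without-rethrow approach agreed on here can be sketched as a loop that records each failure and carries on with the remaining items. This is a hedged illustration, not the PR's code: `LogAndContinueDemo` and `parseAll` are hypothetical, and it uses `java.util.logging` only to stay self-contained (the diff above uses a `LOG.error` call instead).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration; not Ambari code.
final class LogAndContinueDemo {
  private static final Logger LOG =
      Logger.getLogger(LogAndContinueDemo.class.getName());

  /** Parses every raw id, logging failures instead of aborting the loop. */
  static List<Long> parseAll(List<String> rawIds) {
    List<Long> parsed = new ArrayList<>();
    for (String raw : rawIds) {
      try {
        parsed.add(Long.valueOf(raw));
      } catch (NumberFormatException e) {
        // The failure stays visible in the log, but later items still run.
        LOG.log(Level.SEVERE, "Exception on sendAgentCommand", e);
      }
    }
    return parsed;
  }
}
```

The trade-off is the one discussed in this thread: the failure is recorded, but nothing upstream is told to stop.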

@eubnara
Contributor Author

eubnara commented Sep 3, 2022

I have created a new PR to clean up commit logs. => #3361
