[FLINK-9356] Improve error message for when queryable state not ready / reachable#6028
[FLINK-9356] Improve error message for when queryable state not ready / reachable#6028yanghua wants to merge 2 commits intoapache:masterfrom
Conversation
|
cc @tillrohrmann , here is a local recovery rocksdb full test error . |
|
hi @kl0u , there is another PR about queryable state, would you like to have a look? thanks~ |
kl0u
left a comment
There was a problem hiding this comment.
Can you comment on this @florianschmidt1994 ?
| return FutureUtils.completedExceptionally(new UnknownLocationException("Could not contact the state location oracle to retrieve the state location.")); | ||
| return FutureUtils.completedExceptionally( | ||
| new UnknownLocationException("Could not contact the state location oracle to retrieve the state location for state=" | ||
| + queryableStateName + " of job=" + jobId + ", the caused reason maybe the state is not ready or there is no job exists.")); |
There was a problem hiding this comment.
This message is pretty verbose and it contains grammatical errors. If it were to write this, it should be something like:
Could not contact the state location oracle to retrieve the location for state=QSName of job= JobID. The reason can be that the state is not ready or that that does not exist.
But before putting this in, it would be helpful if @florianschmidt1994 commented on what he thinks, as he is the one that opened the issue.
There was a problem hiding this comment.
I think this is a very good step in the right direction! What I've been wondering is
-
Maybe we could look into whether or not we could distinguish the two cases and have separate error messages (this would require some more effort and I don't know if this is easily doable)
-
Do we need to expose the term "state location oracle" here? As an outstanding person I am not familiar with it and I think it might more more of a confusion than helpful to someone encountering that error message, seeing as this is not a really exceptional case but more of a setup / timing thing
There was a problem hiding this comment.
@florianschmidt1994 thanks for your opinion
for 1 : here the thrown error message is based on
final KvStateLocationOracle kvStateLocationOracle = proxy.getKvStateLocationOracle(jobId);
if kvStateLocationOracle is null will throw the exception, but just depends this variable, we could not give a explicit reason, I think giving the possibility is good, the explicit reason should been given in the implementation of method getKvStateLocationOracle or others (but not belongs this issue).
for 2 : accept your idea, we cold remove the oracle keyword
|
cc @kl0u |
|
LGTM, so +1 and I will merge later. |
What is the purpose of the change
This pull request improve error message for when queryable state not ready / reachable
Brief change log
Verifying this change
This change is a trivial rework / code cleanup without any test coverage.
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (yes / no)Documentation