
[FLINK-9356] Improve error message for when queryable state not ready / reachable#6028

Closed
yanghua wants to merge 2 commits into apache:master from yanghua:FLINK-9356


@yanghua
Contributor

@yanghua yanghua commented May 17, 2018

What is the purpose of the change

This pull request improves the error message shown when queryable state is not ready or not reachable.

Brief change log

  • Improve the error message shown when queryable state is not ready or not reachable

Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@yanghua
Contributor Author

yanghua commented May 17, 2018

cc @tillrohrmann, here is a local recovery RocksDB full test error.

@yanghua
Contributor Author

yanghua commented May 17, 2018

Hi @kl0u, here is another PR about queryable state. Would you like to have a look? Thanks!

Contributor

@kl0u kl0u left a comment


Can you comment on this @florianschmidt1994 ?

- return FutureUtils.completedExceptionally(new UnknownLocationException("Could not contact the state location oracle to retrieve the state location."));
+ return FutureUtils.completedExceptionally(
+     new UnknownLocationException("Could not contact the state location oracle to retrieve the state location for state="
+         + queryableStateName + " of job=" + jobId + ", the caused reason maybe the state is not ready or there is no job exists."));
Contributor


This message is pretty verbose and it contains grammatical errors. If I were to write this, it would be something like:

Could not contact the state location oracle to retrieve the location for state=QSName of job=JobID. The reason can be that the state is not ready or that the job does not exist.

But before putting this in, it would be helpful if @florianschmidt1994 commented on what he thinks, as he is the one that opened the issue.

Contributor

@florianschmidt1994 florianschmidt1994 May 22, 2018


I think this is a very good step in the right direction! What I've been wondering is

  1. Maybe we could look into whether or not we could distinguish the two cases and have separate error messages (this would require some more effort and I don't know if this is easily doable)

  2. Do we need to expose the term "state location oracle" here? As an outsider I am not familiar with it, and I think it might be more confusing than helpful to someone encountering that error message, since this is not a really exceptional case but more of a setup / timing issue
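
A rough sketch of what the first point could look like, assuming the lookup could tell the two cases apart (the thread does not establish that it can). The class `SeparateMessagesSketch`, the registry map, and the message wording are all hypothetical and are not part of Flink:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: none of these names exist in Flink.
class SeparateMessagesSketch {

    /**
     * Returns a case-specific error message, or an empty Optional if the state can be located.
     * The map (job ID -> names of queryable states registered so far) is an assumed data source.
     */
    static Optional<String> describeFailure(
            Map<String, Set<String>> registeredStatesByJob, String jobId, String stateName) {
        Set<String> states = registeredStatesByJob.get(jobId);
        if (states == null) {
            // Case 1: nothing is known about the job at all.
            return Optional.of("Job " + jobId + " does not exist.");
        }
        if (!states.contains(stateName)) {
            // Case 2: the job is known, but the state has not been registered (yet).
            return Optional.of("State " + stateName + " of job " + jobId
                    + " is not registered yet; it may not be ready.");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, Set<String>> registry = new HashMap<>();
        registry.put("job-1", Collections.singleton("counts"));

        System.out.println(describeFailure(registry, "job-2", "counts").orElse("ok"));
        System.out.println(describeFailure(registry, "job-1", "sums").orElse("ok"));
        System.out.println(describeFailure(registry, "job-1", "counts").orElse("ok"));
    }
}
```

Whether the real lookup has enough information to make this distinction is exactly the open question raised in point 1.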

Contributor Author


@florianschmidt1994 thanks for your opinion.
For 1: the error thrown here is based on

final KvStateLocationOracle kvStateLocationOracle = proxy.getKvStateLocationOracle(jobId);

If kvStateLocationOracle is null, the exception is thrown. Based on this variable alone we cannot give an explicit reason, so I think listing the possible reasons is good. The explicit reason should be given in the implementation of getKvStateLocationOracle or elsewhere (but that does not belong to this issue).

For 2: I accept your idea, we could remove the "oracle" keyword.
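
For illustration, the null-check pattern described here can be sketched with the plain JDK. This is a minimal sketch under stated assumptions, not the code in this PR: `UnknownLocationException` is a local stand-in for the Flink class, `findOracle` is a hypothetical replacement for `proxy.getKvStateLocationOracle(jobId)`, `completedExceptionally` mimics the Flink helper with `CompletableFuture`, and the message wording is only an example that avoids the term "oracle".

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: all names below are stand-ins, not Flink classes.
class LocationLookupSketch {

    /** Local stand-in for Flink's UnknownLocationException so the sketch compiles on its own. */
    static class UnknownLocationException extends Exception {
        UnknownLocationException(String message) {
            super(message);
        }
    }

    /** Assumed lookup: returns null when nothing is registered for the job. */
    static String findOracle(Map<String, String> oraclesByJob, String jobId) {
        return oraclesByJob.get(jobId);
    }

    /** Plain-JDK way to return an already-failed future. */
    static <T> CompletableFuture<T> completedExceptionally(Throwable cause) {
        CompletableFuture<T> future = new CompletableFuture<>();
        future.completeExceptionally(cause);
        return future;
    }

    static CompletableFuture<String> getStateLocation(
            Map<String, String> oraclesByJob, String jobId, String queryableStateName) {
        String oracle = findOracle(oraclesByJob, jobId);
        if (oracle == null) {
            // Only "null" is known here, so the message can list possible causes but not pick one.
            return completedExceptionally(new UnknownLocationException(
                    "Could not retrieve the location of state=" + queryableStateName
                            + " of job=" + jobId
                            + ". Possible reasons: the state is not ready or the job does not exist."));
        }
        return CompletableFuture.completedFuture(oracle + "/" + queryableStateName);
    }

    public static void main(String[] args) {
        Map<String, String> oracles = Collections.singletonMap("job-1", "oracle-1");
        System.out.println(getStateLocation(oracles, "job-1", "counts").join());
        // Print the message of the failed lookup instead of rethrowing it.
        getStateLocation(oracles, "job-2", "counts")
                .exceptionally(t -> {
                    System.out.println(t.getMessage());
                    return null;
                });
    }
}
```

The failed future carries both the state name and the job ID, which is the information the original one-line message lacked.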

@yanghua
Contributor Author

yanghua commented May 23, 2018

cc @kl0u

@yanghua
Contributor Author

yanghua commented May 29, 2018

cc @kl0u @zentol

@kl0u
Contributor

kl0u commented May 29, 2018

LGTM, so +1 and I will merge later.
Thanks for the work @yanghua and for the review @florianschmidt1994.


4 participants