[FLINK-18598][python][docs] Add documentation on how to wait for the job execution to finish when using asynchronous APIs

This closes #13295.
shuiqiangchen authored and dianfu committed Sep 5, 2020
1 parent 6b9cdd4 commit 7da74dc
Showing 2 changed files with 39 additions and 0 deletions.
20 changes: 20 additions & 0 deletions docs/dev/python/faq.md
@@ -97,3 +97,23 @@ table_env.add_python_file('myDir')
def my_udf():
    from utils import my_util
{% endhighlight %}

## Wait for jobs to finish when executing jobs in a mini cluster

When executing jobs in a mini cluster (e.g. when executing jobs in an IDE) and using asynchronous APIs in the job
(e.g. TableEnvironment.execute_sql and StatementSet.execute in the Python Table API, or StreamExecutionEnvironment.execute_async
in the Python DataStream API), remember to explicitly wait for the job execution to finish.
Otherwise the program exits before the job execution finishes and you may not be able to find the execution results.
The following example shows how to do that:

{% highlight python %}
# Execute a SQL / Table API query asynchronously, then block until the job finishes
t_result = table_env.execute_sql(...)
t_result.get_job_client().get_job_execution_result().result()

# Execute a DataStream job asynchronously, then block until the job finishes
job_client = stream_execution_env.execute_async('My DataStream Job')
job_client.get_job_execution_result().result()
{% endhighlight %}

<strong>Note:</strong> There is no need to wait for the job execution to finish when executing jobs in a remote cluster, so remember to remove this waiting code before submitting jobs to a remote cluster.
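As an alternative to deleting the waiting code by hand, the wait could be gated behind a flag. The following is only a minimal sketch and not part of the PyFlink API: `wait_if_local`, `run_locally`, `FakeFuture` and `FakeJobClient` are hypothetical names introduced for illustration, and the helper assumes only that the job client exposes `get_job_execution_result().result()` as in the example above. The stub classes stand in for a real job client so the sketch runs without a Flink installation.

```python
def wait_if_local(job_client, run_locally):
    """Block until the job finishes, but only for local (mini cluster) runs.

    Hypothetical helper: `job_client` is any object exposing
    get_job_execution_result().result(). Returns the execution result,
    or None when no wait was performed.
    """
    if not run_locally or job_client is None:
        # Remote submission (or no client available): do not block.
        return None
    return job_client.get_job_execution_result().result()


class FakeFuture:
    """Stand-in for the future returned by get_job_execution_result()."""

    def __init__(self, value):
        self._value = value

    def result(self):
        return self._value


class FakeJobClient:
    """Stand-in for a job client; a real one would come from the async API."""

    def get_job_execution_result(self):
        return FakeFuture("finished")


local_result = wait_if_local(FakeJobClient(), run_locally=True)
remote_result = wait_if_local(FakeJobClient(), run_locally=False)
```

With a real job, the client would be passed in place of `FakeJobClient()`, and `run_locally` would be set by whatever mechanism the program already uses to distinguish IDE runs from remote submissions.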
19 changes: 19 additions & 0 deletions docs/dev/python/faq.zh.md
@@ -96,3 +96,22 @@ table_env.add_python_file('myDir')
def my_udf():
    from utils import my_util
{% endhighlight %}

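## Explicitly wait for the job to finish when executing jobs in a mini cluster

When executing jobs in a mini cluster (for example, when running a job in an IDE) and the job uses APIs such as
TableEnvironment.execute_sql and StatementSet.execute in the Python Table API, or StreamExecutionEnvironment.execute_async in the Python DataStream API,
remember to explicitly wait for the job to finish, because these APIs are asynchronous. Otherwise the program exits before the submitted job finishes, and the results of the submitted job cannot be observed.
The following example shows how to explicitly wait for the job to finish:

{% highlight python %}
# Execute a SQL / Table API job asynchronously
t_result = table_env.execute_sql(...)
t_result.get_job_client().get_job_execution_result().result()

# Execute a DataStream job asynchronously
job_client = stream_execution_env.execute_async('My DataStream Job')
job_client.get_job_execution_result().result()
{% endhighlight %}

<strong>Note:</strong> There is no need to explicitly wait for the job to finish when submitting it to a remote cluster, so remember to remove this waiting logic before submitting jobs to a remote cluster.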
