docs(flow): clean up (#5255)

jina-ai · Oct 10, 2022 · bcf17c3
1 parent 573f607
Showing 8 changed files with 271 additions and 293 deletions.
diff --git a/docs/fundamentals/flow/add-executors.md b/docs/fundamentals/flow/add-executors.md
diff --git a/docs/fundamentals/flow/create-flow.md b/docs/fundamentals/flow/create-flow.md
@@ -1,8 +1,8 @@
 (flow)=
-# Basic
+# Basics
 
 
-{class}`~jina.Flow` defines how your Executors are connected together and how your data *flows* through them.
+A {class}`~jina.Flow` defines how your Executors are connected together and how your data *flows* through them.
 
 
 ## Create
@@ -32,14 +32,14 @@ An empty Flow contains only {ref}`the Gateway<flow>`.
 :scale: 70%
 ```
 
-For production, it is recommended to define the Flows with YAML. This is because YAML files are independent of Python logic code and easy to maintain.
+For production, you should define your Flows with YAML. This is because YAML files are independent of the Python logic code and easier to maintain.
 
 
 ### Conversion between Python and YAML
 
-Python Flow definition can be easily converted to/from YAML definition.
+A Python Flow definition can be easily converted to/from a YAML definition.
 
-To load a Flow from a YAML file, use the {meth}`~jina.Flow.load_config`:
+To load a Flow from a YAML file, use {meth}`~jina.Flow.load_config`:
 
 ```python
 from jina import Flow
@@ -61,12 +61,12 @@ f.save_config('flow.yml')
 
 When a {class}`~jina.Flow` starts, all its {ref}`added Executors <flow-add-executors>` will start as well, making it possible to {ref}`reach the service through its API <access-flow-api>`.
 
-There are three ways to start a Flow. Depending on the use case, you can start a Flow either in Python, or from a YAML file, or from the terminal.
+There are three ways to start a Flow: in Python, from a YAML file, or from the terminal.
 
 - Generally in Python: use Flow as a context manager in Python.
-- As an entrypoint from terminal: use Jina CLI and a Flow YAML.
+- As an entrypoint from the terminal: use {ref}`Jina CLI <cli>` and a Flow YAML file.
 - As an entrypoint from Python code: use Flow as a context manager inside `if __name__ == '__main__'`
-- No context manager: manually call {meth}`~jina.Flow.start`  and {meth}`~jina.Flow.close`.
+- No context manager: manually call {meth}`~jina.Flow.start` and {meth}`~jina.Flow.close`.
 
 
 ````{tab} General in Python
@@ -119,14 +119,14 @@ A successful start of a Flow looks like this:
 :scale: 70%
 ```
 
-Your addresses and entrypoints can be found in the output. When enabling more features such as monitoring, HTTP gateway, TLS encryption, this display will also expand to contain more information.
+Your addresses and entrypoints can be found in the output. When you enable more features such as monitoring, HTTP gateway, or TLS encryption, this display expands to contain more information.
 
 
 ### Set multiprocessing `spawn` 
 
-Some cornet cases require to force `spawn` start method for multiprocessing, e.g. if you encounter "Cannot re-initialize CUDA in forked subprocess". 
+Some corner cases require forcing the `spawn` start method for multiprocessing, for example if you encounter "Cannot re-initialize CUDA in forked subprocess".
 
-You may try `JINA_MP_START_METHOD=spawn` before starting the Python script to enable this.
+You can use `JINA_MP_START_METHOD=spawn` before starting the Python script to enable this.
 
 ```bash
 JINA_MP_START_METHOD=spawn python app.py
@@ -139,8 +139,7 @@ There's no need to set this for Windows, as it only supports spawn method for mu
 ## Serve forever
 
 In most scenarios, a Flow should remain reachable for prolonged periods of time.
-This can be achieved by `jina flow --uses flow.yml` from terminal.
-
+This can be achieved by running `jina flow --uses flow.yml` from the terminal.
 
 Or if you are serving a Flow from Python:
 
@@ -153,7 +152,7 @@ with f:
     f.block()
 ```
 
-The `.block()` method blocks the execution of the current thread or process, which enables external clients to access the Flow.
+The `.block()` method blocks the execution of the current thread or process, enabling external clients to access the Flow.
 
 In this case, the Flow can be stopped by interrupting the thread or process. 
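The blocking behavior described above can be illustrated without Jina at all. The following is a minimal standard-library sketch, not Jina's implementation: the hypothetical `serve` function stands in for a blocked Flow, and a `threading.Event` stands in for the stop signal.

```python
import threading


def serve(stop_event: threading.Event, log: list) -> None:
    """Hypothetical stand-in for a blocked Flow: wait until told to stop."""
    log.append("serving")
    stop_event.wait()  # blocks this thread until the event is set
    log.append("stopped")


stop = threading.Event()
log = []
worker = threading.Thread(target=serve, args=(stop, log))
worker.start()

stop.set()  # unblock the worker from the main thread
worker.join(timeout=5)
print(log)
```

Setting the event from outside is a cleaner way to stop a long-running service than interrupting the thread or process, because the blocked code gets a chance to shut down in order.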
 
@@ -186,7 +185,7 @@ e.set()  # set event and stop (unblock) the Flow
 
 ## Visualize
 
-A {class}`~jina.Flow` has a built-in `.plot()` function which can be used to visualize a `Flow`:
+A {class}`~jina.Flow` has a built-in `.plot()` function which can be used to visualize the `Flow`:
 ```python
 from jina import Flow
 
@@ -210,13 +209,13 @@ f.plot('flow-2.svg')
 :width: 70%
 ```
 
-One can also do it in the terminal via:
+You can also do it in the terminal:
 
 ```bash
 jina export flowchart flow.yml flow.svg 
 ```
 
-One can also visualize a remote Flow by passing the URL to `jina export flowchart`.
+You can also visualize a remote Flow by passing the URL to `jina export flowchart`.
 
 ## Export
 
@@ -230,7 +229,7 @@ f = Flow().add()
 f.to_docker_compose_yaml()
 ```
 
-One can also do it in the terminal via:
+You can also do it in the terminal:
 
 ```shell
 jina export docker-compose flow.yml docker-compose.yml 
@@ -250,16 +249,16 @@ f = Flow().add()
 f.to_kubernetes_yaml('flow_k8s_configuration')
 ```
 
-One can also do it in the terminal via:
+You can also do it in the terminal:
 
 ```shell
 jina export kubernetes flow.yml ./my-k8s 
 ```
 
-This will generate the necessary Kubernetes configuration files for all the {class}`~jina.Executor`s of the Flow.
+This generates the Kubernetes configuration files for all the {class}`~jina.Executor`s in the Flow.
 The generated folder can be used directly with `kubectl` to deploy the Flow to an existing Kubernetes cluster.
 
-For an advance utilisation of Kubernetes with jina please refer to this {ref}`How to <kubernetes>` 
+For advanced usage of Kubernetes with Jina, please refer to {ref}`How to <kubernetes>`.
 
 
 ```{tip}
@@ -270,7 +269,7 @@ If you do not wish to rebuild the image, set the environment variable `JINA_HUB_
 
 ```{admonition} See also
 :class: seealso
-For more in-depth guides on Flow deployment, take a look at our how-tos for {ref}`Docker compose <docker-compose>` and
+For more in-depth guides on Flow deployment, check our how-tos for {ref}`Docker compose <docker-compose>` and
 {ref}`Kubernetes <kubernetes>`.
 ```
 
diff --git a/docs/fundamentals/flow/health-check.md b/docs/fundamentals/flow/health-check.md
@@ -1,23 +1,23 @@
 # Readiness & health check
 A Jina {class}`~jina.Flow` consists of {ref}`a Gateway and Executors<architecture-overview>`,
-each of which have to be healthy before the Flow is ready to receive requests.
+all of which have to be healthy before the Flow is ready to receive requests.
 
 A Flow is marked as "ready" when all its Executors and its Gateway are fully loaded and ready.
 
 Each Executor provides a health check in the form of a [standardized gRPC endpoint](https://github.com/grpc/grpc/blob/master/doc/health-checking.md) that exposes this information to the outside world.
-This means that health checks can automatically be performed by Jina itself as well as external tools like Docker Compose, Kubernetes service meshes, or load balancers.
+This means health checks can be automatically performed by Jina itself, as well as external tools like Docker Compose, Kubernetes service meshes, or load balancers.
 
 
-## Readiness of a Flow
+## Flow Readiness
 
-In most cases, it is most useful to check if an entire Flow is ready to accept requests.
+In most cases, it is useful to check if an entire Flow is ready to accept requests.
 To enable this readiness check, the Jina Gateway can aggregate health check information from all services and provide
 a readiness check endpoint for the complete Flow.
 
 
 <!-- start flow-ready -->
 
-{class}`~jina.Client` offer a convenient API to query these readiness endpoints. You can call {meth}`~jina.clients.mixin.HealthCheckMixin.is_flow_ready` or {meth}`~jina.Flow.is_flow_ready`, it will return `True` if the Flow is ready, and `False` when it is not.
+{class}`~jina.Client` offers an API to query these readiness endpoints. You can call {meth}`~jina.clients.mixin.HealthCheckMixin.is_flow_ready` or {meth}`~jina.Flow.is_flow_ready`. It returns `True` if the Flow is ready, and `False` if it is not.
 
 ````{tab} via Flow
 ```python
@@ -115,7 +115,7 @@ WARNI… JINA@92986 message lost 100% (3/3)
 
 ### Flow status using third-party clients
 
-You can check the status of a Flow using any gRPC/HTTP/Websocket client, not just Jina's Client implementation.
+You can check the status of a Flow using any gRPC/HTTP/WebSockets client, not just Jina's Client implementation.
 
 To see how this works, first instantiate the Flow with its corresponding protocol and block it for serving:
 
@@ -149,7 +149,7 @@ DEBUG  Flow@19059 2 Deployments (i.e. 2 Pods) are running in this Flow
 
 #### Using gRPC
 
-When using grpc, you can use [grpcurl](https://github.com/fullstorydev/grpcurl) to hit the Gateway's gRPC service that is responsible for reporting the Flow status.
+When using gRPC, use [grpcurl](https://github.com/fullstorydev/grpcurl) to access the Gateway's gRPC service that is responsible for reporting the Flow status.
 
 ```shell
 docker pull fullstorydev/grpcurl:latest
@@ -166,7 +166,7 @@ You can simulate an Executor going offline by killing its process.
 kill -9 $EXECUTOR_PID # in this case we can see in the logs that it is 19059
 ```
 
-Then by doing the same check, you will see that it returns an error:
+Then by doing the same check, you can see that it returns an error:
 
 ```shell
 docker run --network='host' fullstorydev/grpcurl -plaintext 127.0.0.1:12345 jina.JinaGatewayDryRunRPC/dry_run
@@ -209,40 +209,39 @@ docker run --network='host' fullstorydev/grpcurl -plaintext 127.0.0.1:12345 jina
 ````
 
 
-#### Using HTTP or Websocket
+#### Using HTTP or WebSockets
 
-When using HTTP or Websocket as the Gateway protocol, you can use curl to target the `/dry_run` endpoint and get the status of the Flow.
+When using HTTP or WebSockets as the Gateway protocol, use curl to target the `/dry_run` endpoint and get the status of the Flow.
 
 
 ```shell
 curl http://localhost:12345/dry_run
 ```
-The error-free output below signifies a correctly running Flow:
+Error-free output signifies a correctly running Flow:
 ```json
 {"code":0,"description":"","exception":null}
 ```
 
-You can simulate an Executor going offline by killing its process.
+You can simulate an Executor going offline by killing its process:
 
 ```shell script
 kill -9 $EXECUTOR_PID # in this case we can see in the logs that it is 19059
 ```
 
-Then by doing the same check, you will see that the call returns an error:
+Then by doing the same check, you can see that the call returns an error:
 
 ```json
 {"code":1,"description":"failed to connect to all addresses |Gateway: Communication error with deployment executor0 at address(es) {'0.0.0.0:12346'}. Head or worker(s) may be down.","exception":{"name":"InternalNetworkError","args":["failed to connect to all addresses |Gateway: Communication error with deployment executor0 at address(es) {'0.0.0.0:12346'}. Head or worker(s) may be down."],"stacks":["Traceback (most recent call last):\n","  File \"/home/joan/jina/jina/jina/serve/networking.py\", line 726, in task_wrapper\n    timeout=timeout,\n","  File \"/home/joan/jina/jina/jina/serve/networking.py\", line 241, in send_requests\n    await call_result,\n","  File \"/home/joan/.local/lib/python3.7/site-packages/grpc/aio/_call.py\", line 291, in __await__\n    self._cython_call._status)\n","grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNAVAILABLE\n\tdetails = \"failed to connect to all addresses\"\n\tdebug_error_string = \"{\"created\":\"@1654074272.702044542\",\"description\":\"Failed to pick subchannel\",\"file\":\"src/core/ext/filters/client_channel/client_channel.cc\",\"file_line\":3134,\"referenced_errors\":[{\"created\":\"@1654074272.702043378\",\"description\":\"failed to connect to all addresses\",\"file\":\"src/core/lib/transport/error_utils.cc\",\"file_line\":163,\"grpc_status\":14}]}\"\n>\n","\nDuring handling of the above exception, another exception occurred:\n\n","Traceback (most recent call last):\n","  File \"/home/joan/jina/jina/jina/serve/runtimes/gateway/http/app.py\", line 142, in _flow_health\n    data_type=DataInputType.DOCUMENT,\n","  File \"/home/joan/jina/jina/jina/serve/runtimes/gateway/http/app.py\", line 399, in _get_singleton_result\n    async for k in streamer.stream(request_iterator=request_iterator):\n","  File \"/home/joan/jina/jina/jina/serve/stream/__init__.py\", line 78, in stream\n    async for response in async_iter:\n","  File 
\"/home/joan/jina/jina/jina/serve/stream/__init__.py\", line 154, in _stream_requests\n    response = self._result_handler(future.result())\n","  File \"/home/joan/jina/jina/jina/serve/runtimes/gateway/request_handling.py\", line 148, in _process_results_at_end_gateway\n    partial_responses = await asyncio.gather(*tasks)\n","  File \"/home/joan/jina/jina/jina/serve/runtimes/gateway/graph/topology_graph.py\", line 128, in _wait_previous_and_send\n    self._handle_internalnetworkerror(err)\n","  File \"/home/joan/jina/jina/jina/serve/runtimes/gateway/graph/topology_graph.py\", line 70, in _handle_internalnetworkerror\n    raise err\n","  File \"/home/joan/jina/jina/jina/serve/runtimes/gateway/graph/topology_graph.py\", line 125, in _wait_previous_and_send\n    timeout=self._timeout_send,\n","  File \"/home/joan/jina/jina/jina/serve/networking.py\", line 734, in task_wrapper\n    num_retries=num_retries,\n","  File \"/home/joan/jina/jina/jina/serve/networking.py\", line 697, in _handle_aiorpcerror\n    details=e.details(),\n","jina.excepts.InternalNetworkError: failed to connect to all addresses |Gateway: Communication error with deployment executor0 at address(es) {'0.0.0.0:12346'}. Head or worker(s) may be down.\n"],"executor":""}}
 ```
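The two `/dry_run` responses above can also be told apart programmatically. This is a minimal sketch using only the standard library; the helper name `flow_is_ready` is hypothetical, and the rule "code 0 and no exception means ready" is an assumption inferred from the two sample responses, not a documented contract. The failing sample is shortened from the output shown above.

```python
import json


def flow_is_ready(dry_run_body: str) -> bool:
    """Return True when a `/dry_run` response body reports no error."""
    status = json.loads(dry_run_body)
    # Assumption: code 0 with a null exception signals a ready Flow,
    # as in the error-free sample response.
    return status.get("code") == 0 and status.get("exception") is None


# Sample bodies: the first is the error-free response, the second is a
# shortened version of the error response.
healthy = '{"code":0,"description":"","exception":null}'
failing = (
    '{"code":1,"description":"failed to connect to all addresses",'
    '"exception":{"name":"InternalNetworkError"}}'
)

print(flow_is_ready(healthy))
print(flow_is_ready(failing))
```

In practice the body would come from an HTTP client pointed at `http://localhost:12345/dry_run`, as in the curl call above.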
 
 (health-check-microservices)=
-## Health check of an Executor
+## Executor health check
 
-In addition to a performing a readiness check for the entire Flow, it is also possible to check every individual Executor in said Flow,
-by utilizing a [standardized gRPC health check endpoint](https://github.com/grpc/grpc/blob/master/doc/health-checking.md).
+You can check every individual Executor in a Flow by using a [standard gRPC health check endpoint](https://github.com/grpc/grpc/blob/master/doc/health-checking.md).
 In most cases this is not necessary, since such checks are performed by Jina, a Kubernetes service mesh or a load balancer under the hood.
-Nevertheless, it is possible to perform these checks as a user.
+Nevertheless, you can perform these checks yourself.
 
-When performing these checks, you can expect on of the following `ServingStatus` responses:
+When performing these checks, you can expect one of the following `ServingStatus` responses:
 - **`UNKNOWN` (0)**: The health of the Executor could not be determined
 - **`SERVING` (1)**: The Executor is healthy and ready to receive requests
 - **`NOT_SERVING` (2)**: The Executor is *not* healthy and *not* ready to receive requests
@@ -264,7 +263,7 @@ with f:
     f.block()
 ```
 
-On another terminal, you can use [grpcurl](https://github.com/fullstorydev/grpcurl) to send RPC requests to your services.
+In another terminal, you can use [grpcurl](https://github.com/fullstorydev/grpcurl) to send gRPC requests to your services.
 
 ```shell
 docker pull fullstorydev/grpcurl:latest
@@ -278,18 +277,18 @@ docker run --network='host' fullstorydev/grpcurl -plaintext 127.0.0.1:12346 grpc
 ```
 
 (health-check-gateway)=
-## Health check of the Gateway
+## Gateway health check
 
-Just like each individual Executor, the Gateway also exposes a health check endpoint.
+Just like each individual Executor, the Gateway also exposes a health check endpoint.
 
-In contrast to Executors however, a Gateway can use gRPC, HTTP, or Websocket, and the health check endpoint changes accordingly.
+In contrast to Executors, however, a Gateway can use gRPC, HTTP, or WebSockets, and the health check endpoint changes accordingly.
 
 
 #### Gateway health check with gRPC
 
-When using gRPC as the protocol to communicate with the Gateway, the Gateway uses the exact same mechanism as Executors to expose its health status: It exposes the [ standard gRPC health check](https://github.com/grpc/grpc/blob/master/doc/health-checking.md) to the outside world.
+When using gRPC as the protocol to communicate with the Gateway, the Gateway uses the exact same mechanism as Executors to expose its health status: It exposes the [standard gRPC health check](https://github.com/grpc/grpc/blob/master/doc/health-checking.md) to the outside world.
 
-With the same Flow as described before, you can use the same way to check the Gateway status:
+With the same Flow as before, you can check the Gateway status in the same way:
 
 ```bash
 docker run --network='host' fullstorydev/grpcurl -plaintext 127.0.0.1:12345 grpc.health.v1.Health/Check
@@ -302,18 +301,18 @@ docker run --network='host' fullstorydev/grpcurl -plaintext 127.0.0.1:12345 grpc
 ```
 
 
-#### Gateway health check with HTTP or Websocket
+#### Gateway health check with HTTP or WebSockets
 
 ````{admonition} Caution
 :class: caution
-For Gateways running with HTTP or Websocket, the gRPC health check response codes outlined {ref}`above <health-check-microservices>` do not apply.
+For Gateways running with HTTP or WebSockets, the gRPC health check response codes outlined {ref}`above <health-check-microservices>` do not apply.
 
 Instead, an error-free response signifies healthiness.
 ````
 
-When using HTTP or Websocket as the protocol for the Gateway, it exposes the endpoint `'/'` that one can query to check the status.
+When using HTTP or WebSockets as the protocol for the Gateway, you can query the endpoint `'/'` to check the status.
 
-First, crate a Flow with HTTP or Websocket protocol:
+First, create a Flow with the HTTP or WebSockets protocol:
 
 ```python
 from jina import Flow
@@ -322,21 +321,21 @@ f = Flow(protocol='http', port=12345).add()
 with f:
     f.block()
 ```
-Then, you can query the "empty" endpoint:
+Then query the "empty" endpoint:
 ```bash
 curl http://localhost:12345
 ```
 
-And you will get a valid empty response indicating the Gateway's ability to serve.
+You get a valid empty response indicating the Gateway's ability to serve:
 ```json
 {}
 ```
 
-## Use jina ping to do health checks
+## Use jina ping for health checks
 
-Once a Flow is running, you can use `jina ping` CLI  {ref}`CLI <../api/jina_cli>` to run readiness check of the complete Flow or of individual Executors or Gateway.
+Once a Flow is running, you can use the `jina ping` {ref}`CLI <../api/jina_cli>` to run a readiness check of the complete Flow or of individual Executors or the Gateway.
 
-Let's start a Flow in the terminal by executing the following python code:
+Start a Flow in Python:
 
 ```python
 from jina import Flow
@@ -345,32 +344,32 @@ with Flow(protocol='grpc', port=12345).add(port=12346) as f:
     f.block()
 ```
 
-We can check the readiness of the Flow:
+Check the readiness of the Flow:
 
 ```bash
 jina ping flow grpc://localhost:12345
 ```
 
-Also we can check the readiness of an Executor:
+You can also check the readiness of an Executor:
 
 ```bash
 jina ping executor localhost:12346
 ```
 
-or the readiness of the Gateway service:
+...or the readiness of the Gateway service:
 
 ```bash
 jina ping gateway  grpc://localhost:12345
 ```
 
-When these commands succeed, you will see something like:
+When these commands succeed, you should see something like:
 
 ```text
 INFO   JINA@28600 readiness check succeeded 1 times!!! 
 ```
 
-```admonition Use it in Kubernetes
+```{admonition} Use in Kubernetes
 :class: note
-This CLI exits with code 1 when the readiness check is not successful, which makes it a good choice to be used as readinessProbe for Executor and Gateway when
+The CLI exits with code 1 when the readiness check is not successful, which makes it a good choice for a `readinessProbe` for Executors and the Gateway when
 deployed in Kubernetes.
 ```
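Because the check result is carried by the exit code, a script can use `jina ping` the same way a probe would. The sketch below uses only the standard library; the helper name `check_passes` is hypothetical, and treating exit code 0 as success is an assumption (the note above only states that failure exits with code 1). Two Python one-liners stand in for a passing and a failing check so the example runs without a Flow.

```python
import subprocess
import sys
from typing import Sequence


def check_passes(command: Sequence[str], timeout: float = 30.0) -> bool:
    """Run a readiness command and report whether it exited with code 0."""
    try:
        result = subprocess.run(command, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # A missing executable or a hung check counts as "not ready".
        return False
    return result.returncode == 0


# The command from the example above; it needs a running Flow.
PING_FLOW = ["jina", "ping", "flow", "grpc://localhost:12345"]

# Stand-ins that exit with 0 and 1, mimicking a passing and a failing check.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

print(check_passes(passing))
print(check_passes(failing))
```

With a Flow running, `check_passes(PING_FLOW)` performs the real check.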