iter8-tools · kalantar · Nov 2, 2023 · Oct 31, 2023 · Oct 31, 2023 · Nov 1, 2023
diff --git a/docs/getting-started/first-abn.md b/docs/getting-started/first-abn.md
@@ -26,10 +26,10 @@ This tutorial describes how to do A/B testing of a backend component using the [
 
 A simple sample two-tier application using the Iter8 SDK is provided. Note that only the frontend component uses the Iter8 SDK. Deploy both the frontend and backend components of this application as described in each tab:
 
-=== "frontend"
+=== "Frontend"
     Install the frontend component using an implementation in the language of your choice:
 
-    === "node"
+    === "Node"
         ```shell
         kubectl create deployment frontend --image=iter8/abn-sample-frontend-node:0.17.3
         kubectl expose deployment frontend --name=frontend --port=8090
@@ -43,7 +43,7 @@ A simple sample two-tier application using the Iter8 SDK is provided. Note that
 
     The frontend component is implemented to call `Lookup()` before each call to the backend component. The frontend component uses the returned version number to route the request to the recommended version of the backend component.
 
-=== "backend"
+=== "Backend"
     Release an initial version of the backend named `backend`:
 
     ```shell
@@ -66,9 +66,15 @@ In one shell, port-forward requests to the frontend component:
     ```
 In another shell, run a script to generate load from multiple users:
     ```shell
-    curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.17.3/samples/abn-sample/generate_load.sh | sh -s --
+    curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.18.3/samples/abn-sample/generate_load.sh | sh -s --
     ```
 
+The load generator and sample frontend application output the backend that handled each recommendation. With just one version deployed, all requests are handled by `backend`. In the output you will see something like:
+
+```
+Recommendation: {"Id":19,"Name":"sample","Source":"backend-74ff88c76d-nb87j"}
+```
+
 ## Deploy candidate
 
 A candidate version of the *backend* component can be deployed simply by adding a second version to the list of versions:
@@ -91,6 +97,12 @@ EOF
 While the candidate version is deploying, `Lookup()` will return only the version index number `0`; that is, the first, or primary, version of the model.
 Once the candidate version is ready, `Lookup()` will return both `0` and `1`, the indices of both versions, so that requests can be distributed across both versions.
 
+Once both backend versions are responding to requests, the output of the load generator will include recommendations from the candidate version. In this example, you should see something like:
+
+```
+Recommendation: {"Id":19,"Name":"sample","Source":"backend-candidate-1-56cb7cd5cf-bkrjv"}
+```
+
 ## Compare versions using Grafana
 
 Inspect the metrics using Grafana. If Grafana is deployed to your cluster, port-forward requests as follows:
@@ -132,6 +144,12 @@ EOF
 
 Calls to `Lookup()` will now recommend that all traffic be sent to the new primary version `backend` (currently serving the promoted version of the code).
 
+The output of the load generator will again show just `backend`:
+
+```
+Recommendation: {"Id":19,"Name":"sample","Source":"backend-74ff88c76d-nb87j"}
+```
+
 ## Cleanup
 
 Delete the sample application:
@@ -144,3 +162,15 @@ helm delete backend
 Uninstall the Iter8 controller:
 
 --8<-- "docs/getting-started/uninstall.md"
+
+If you installed Grafana, you can delete it as follows:
+
+```shell
+kubectl delete svc/grafana deploy/grafana
+```
+
+***
+
+Congratulations! :tada: You completed your first A/B test with Iter8.
+
+***
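The `Recommendation:` lines added to this tutorial can be summarized rather than read one by one. The following is a minimal sketch, not part of the Iter8 samples: the helper name `tally_sources` is hypothetical, and it assumes the `Source` value is a pod name of the form `<deployment>-<replicaset-hash>-<pod-hash>`, as in the sample output above.

```shell
# tally_sources (hypothetical helper): read load generator output on stdin
# and print one "<deployment> <count>" line per backend Deployment.
tally_sources() {
  # Keep only the value of the "Source" field from each Recommendation line.
  sed -n 's/.*"Source":"\([^"]*\)".*/\1/p' |
    # Assumption: pod names end in "-<replicaset-hash>-<pod-hash>"; strip it.
    sed 's/-[a-z0-9]*-[a-z0-9]*$//' |
    sort | uniq -c | awk '{ print $2, $1 }'
}
```

If the load generator output was saved to a file (say, a hypothetical `load.log`), `tally_sources < load.log` would show whether only `backend` or both `backend` and `backend-candidate-1` are handling requests.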
diff --git a/docs/getting-started/first-performance.md b/docs/getting-started/first-performance.md
@@ -80,7 +80,7 @@ The Iter8 dashboard will look like the following:
 ![`http` Iter8 dashboard](../user-guide/tasks/images/httpdashboard.png)
 
 ## View logs
-Logs are useful for debugging.
+Logs are useful for debugging. To see the test logs:
 
 ```shell
 kubectl logs -l iter8.tools/test=httpbin-test
@@ -102,6 +102,12 @@ kubectl delete deploy/httpbin
 
 --8<-- "docs/getting-started/uninstall.md"
 
+If you installed Grafana, you can delete it as follows:
+
+```shell
+kubectl delete svc/grafana deploy/grafana
+```
+
 ***
 
 Congratulations! :tada: You completed your first performance test with Iter8.

diff --git a/docs/getting-started/first-release.md b/docs/getting-started/first-release.md
@@ -63,7 +63,7 @@ You can also send requests from a pod within the cluster:
 
 1. Create a `sleep` pod in the cluster from which requests can be made:
 ```shell
-curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.17.3/samples/kserve-serving/sleep.sh | sh -
+curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.18.4/samples/kserve-serving/sleep.sh | sh -
 ```
 
 2. Exec into the sleep pod:
@@ -76,7 +76,7 @@ kubectl exec --stdin --tty "$(kubectl get pod --sort-by={metadata.creationTimest
 curl httpbin.default -s -D - | grep -e '^HTTP' -e app-version
 ```
 
-The output includes the success of the request (the HTTP return code) and the version of the application that responded (the `app-version` response header). For example:
+The output includes the success of the request (the HTTP return code) and the version of the application that responded (in the `app-version` response header). In this example:
 
 ```
 HTTP/1.1 200 OK
@@ -123,7 +123,15 @@ When the second version is deployed and ready, the Iter8 controller automaticall
 
 ### Verify routing
 
-You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. Requests will now be handled equally by both versions.
+You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. Requests will now be handled equally by both versions. Output will be something like:
+
+```
+HTTP/1.1 200 OK
+app-version: httpbin-0
+...
+HTTP/1.1 200 OK
+app-version: httpbin-1
+```
 
 ## Modify weights (optional)
 
@@ -177,7 +185,12 @@ Once the (reconfigured) primary version ready, the Iter8 controller will automat
 
 ### Verify routing
 
-You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. They will all be handled by the primary version.
+You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. They will all be handled by the primary version. Output will be something like:
+
+```
+HTTP/1.1 200 OK
+app-version: httpbin-0
+```
 
 ## Cleanup
 
@@ -187,6 +200,18 @@ Delete the application and its routing configuration:
 helm delete httpbin
 ```
 
+If you used the `sleep` pod to generate load, remove it:
+
+```shell
+kubectl delete deploy sleep
+```
+
 Uninstall Iter8 controller:
 
 --8<-- "docs/getting-started/uninstall.md"
+
+***
+
+Congratulations! :tada: You completed your first blue-green rollout with Iter8.
+
+***
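The sample outputs added under "Verify routing" show one response per version. To check the split over many requests, the `app-version` headers can be counted. This is a sketch under stated assumptions: the helper name `tally_versions` is hypothetical, and it only assumes that each response prints an `app-version: <name>` header line as shown in the tutorial.

```shell
# tally_versions (hypothetical helper): read HTTP response headers on stdin
# and print one "<version> <count>" line per app-version value seen.
tally_versions() {
  # HTTP header lines end in CRLF; drop the carriage returns first.
  tr -d '\r' |
    awk -F': *' '
      tolower($1) == "app-version" { count[$2]++ }
      END { for (v in count) print v, count[v] }' |
    sort
}

# Possible usage from the sleep pod, reusing the tutorial's curl command:
#   i=0; while [ "$i" -lt 20 ]; do curl httpbin.default -s -D -; i=$((i+1)); done | tally_versions
```

With both versions deployed and equal weights, the two counts should be roughly equal. After promotion, only `httpbin-0` should appear.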
diff --git a/docs/roadmap.md b/docs/roadmap.md
@@ -9,11 +9,8 @@ hide:
 
 1. Stabilizing Iter8 APIs for CNCF sandboxing
 2. Autoscaling the metrics service
-3. Install infrastructure components such as Istio
-4. Install ML components such as KServe and KServe ModelMesh
-5. Extend routing templates to include application management
-6. Support multi-cluster installs
-7. Open Data Hub tier 1 project
-8. Metrics & evaluation for foundation model/LLM-based apps
-9. Hyperparameter tuning for foundation model/LLM-based inference pipelines
-10. Data/concept drift detection for ML models
+3. Support multi-cluster installs
+4. Open Data Hub tier 1 project
+5. Metrics & evaluation for foundation model/LLM-based apps
+6. Hyperparameter tuning for foundation model/LLM-based inference pipelines
+7. Data/concept drift detection for ML models
diff --git a/docs/tutorials/integrations/kserve-mm/abn.md b/docs/tutorials/integrations/kserve-mm/abn.md
@@ -61,6 +61,12 @@ application:
 EOF
 ```
 
+Wait for the backend model to be ready:
+
+```shell
+kubectl wait --for condition=ready isvc/backend-0 --timeout=600s
+```
+
 ## Generate load
 
 In one shell, port-forward requests to the frontend component:
@@ -70,9 +76,15 @@ In one shell, port-forward requests to the frontend component:
 
 In another shell, run a script to generate load from multiple users:
     ```shell
-    curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.17.3/samples/abn-sample/generate_load.sh | sh -s --
+    curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.18.3/samples/abn-sample/generate_load.sh | sh -s --
     ```
 
+The load generator and sample frontend application output the backend that handled each recommendation. With just one version deployed, all requests are handled by `backend-0`. In the output you will see something like:
+
+```
+Recommendation: backend-0__isvc-3642375d03
+```
+
 ## Deploy candidate
 
 A candidate version of the model can be deployed simply by adding a second version to the list of versions:
@@ -105,6 +117,12 @@ EOF
 Until the candidate version is ready, calls to `Lookup()` will return only the version index number `0`; that is, the first, or primary, version of the model.
 Once the candidate version is ready, `Lookup()` will return both `0` and `1`, the indices of both versions, so that requests can be distributed across both versions.
 
+Once both backend versions are responding to requests, the output of the load generator will include recommendations from the candidate version. In this example, you should see something like:
+
+```
+Recommendation: backend-1__isvc-3642375d03
+```
+
 ## Compare versions using Grafana
 
 Inspect the metrics using Grafana. If Grafana is deployed to your cluster, port-forward requests as follows:
@@ -155,6 +173,12 @@ EOF
 
 Calls to `Lookup()` will now recommend that all traffic be sent to the new primary version `backend-0` (currently serving the promoted version of the code).
 
+The output of the load generator will again show just `backend-0`:
+
+```
+Recommendation: backend-0__isvc-3642375d03
+```
+
 ## Cleanup
 
 Delete the backend:
@@ -171,4 +195,10 @@ kubectl delete deploy/frontend svc/frontend
 
 Uninstall Iter8 controller:
 
---8<-- "docs/getting-started/uninstall.md"
+--8<-- "docs/getting-started/uninstall.md"
+
+If you installed Grafana, you can delete it as follows:
+
+```shell
+kubectl delete svc/grafana deploy/grafana
+```
diff --git a/docs/tutorials/integrations/kserve-mm/blue-green.md b/docs/tutorials/integrations/kserve-mm/blue-green.md
@@ -50,6 +50,12 @@ application:
 EOF
 ```
 
+Wait for the backend model to be ready:
+
+```shell
+kubectl wait --for condition=ready isvc/wisdom-0 --timeout=600s
+```
+
 ??? note "What happens?"
     - Because `environment` is set to `kserve-modelmesh-istio`,  an `InferenceService` object is created.
     - The namespace `default` is inherited from the Helm release namespace since it is not specified in the version or in `application.metadata`.
@@ -90,33 +96,12 @@ cat grpc_input.json \
 | grep -e app-version
 ```
 
-The output includes the version of the application that responded (the `app-version` response header). For example:
+The output includes the version of the application that responded (in the `app-version` response header). In this example:
 
 ```
 app-version: wisdom-0
 ```
 
-??? note "To send requests from outside the cluster"
-    To configure the release for traffic from outside the cluster, a suitable Istio `Gateway` is required. For example, this [sample gateway](https://raw.githubusercontent.com/kalantar/docs/release/samples/iter8-sample-gateway.yaml). When using the Iter8 `release` chart, set the `gateway` field to the name of your `Gateway`. Finally, to send traffic:
-
-    (a) In a separate terminal, port-forward the ingress gateway:
-    ```shell
-    kubectl -n istio-system port-forward svc/istio-ingressgateway 8080:80
-    ```
-    (b) Download the proto file and sample input:
-    ```shell
-    curl -sO https://raw.githubusercontent.com/iter8-tools/docs/v0.17.3/samples/modelmesh-serving/kserve.proto
-    curl -sO https://raw.githubusercontent.com/iter8-tools/docs/v0.17.3/samples/modelmesh-serving/grpc_input.json
-    ```
-    \(c) Send requests using the `Host` header:
-    ```shell
-    cat grpc_input.json | \
-    grpcurl -vv -plaintext -proto kserve.proto -d @ \
-    -authority wisdom.modelmesh-serving \
-    localhost:8080 inference.GRPCInferenceService.ModelInfer \
-    | grep -e app-version
-    ```
-
 ## Deploy candidate
 
 A candidate version of the model can be deployed simply by adding a second version to the list of versions comprising the application:
@@ -151,7 +136,13 @@ When the candidate version is ready, the Iter8 controller will Iter8 will automa
 
 ### Verify Routing
 
-You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. Requests will be handled equally by both versions.
+You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. Requests will be handled equally by both versions. Output will be something like:
+
+```
+app-version: wisdom-0
+...
+app-version: wisdom-1
+```
 
 ## Modify weights (optional)
 
@@ -186,7 +177,7 @@ Iter8 automatically reconfigures the routing to distribute traffic between the v
 
 ### Verify Routing
 
-You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. 70 percent of requests will now be handled by the candidate version; the remaining 30 percent by the primary version.
+You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. 70 percent of requests will now be handled by the candidate version (`wisdom-1`); the remaining 30 percent by the primary version (`wisdom-0`).
 
 ## Promote candidate
 
@@ -216,7 +207,11 @@ Once the (reconfigured) primary `InferenceService` ready, the Iter8 controller w
 
 ### Verify Routing
 
-You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. They will all be handled by the primary version.
+You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. They will all be handled by the primary version. Output will be something like:
+
+```
+app-version: wisdom-0
+```
 
 ## Cleanup
 
@@ -226,6 +221,12 @@ Delete the models are their routing:
 helm delete wisdom
 ```
 
+If you used the `sleep` pod to generate load, remove it:
+
+```shell
+kubectl delete deploy sleep
+```
+
 Uninstall Iter8 controller:
 
 --8<-- "docs/getting-started/uninstall.md"
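The "Modify weights" step states that 70 percent of requests go to `wisdom-1` and 30 percent to `wisdom-0`. One way to check this against observed traffic is to turn the `app-version` lines into percentages. This is a minimal sketch: the helper name `version_percentages` is hypothetical, and it assumes the output of many requests has been collected and contains `app-version: <name>` lines as in the examples above.

```shell
# version_percentages (hypothetical helper): read response output on stdin
# and print the share of requests handled by each app-version, in percent.
version_percentages() {
  tr -d '\r' |
    awk -F': *' '
      tolower($1) == "app-version" { count[$2]++; total++ }
      END {
        # With no matching lines the loop body never runs, so no division by zero.
        for (v in count) printf "%s %.0f%%\n", v, 100 * count[v] / total
      }' |
    sort
}
```

With a small number of requests the observed shares will only approximate the configured weights, so a larger sample gives a more meaningful comparison.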