added FaultInjection test
VeerMuchandi committed Nov 14, 2017
1 parent b4088dd commit 385d44e
Showing 8 changed files with 139 additions and 6 deletions.
17 changes: 11 additions & 6 deletions CanaryContentBasedRouting.md
@@ -8,7 +8,7 @@ Test the application in the browser. "Reviews" output is random each time you ac
Default traffic from all users to reviews version v1


```
-$ oc create -f create -f samples/bookinfo/kube/route-rule-all-v1.yaml
+$ oc create -f samples/bookinfo/kube/route-rule-all-v1.yaml
```


Look at the routerules that are created
@@ -22,7 +22,7 @@ ratings-default RouteRule.v1alpha2.config.istio.io
reviews-default RouteRule.v1alpha2.config.istio.io
```


-Understand the the `reviews-default` routerule. Note the traffic goes to version reviews v1 by default. Test in browser and you'll see no stars now.
+Understand the `reviews-default` routerule. Note the traffic goes to version reviews v1 by default.


```
$ oc get routerule reviews-default -o yaml
@@ -45,7 +45,12 @@ spec:
  route:
  - labels:
      version: v1
```

**Test** in the browser and you'll see no stars now because reviews version 1 does not call ratings service.


Let's now assume that we are adding Reviews service v2 as a canary and let us say we want to allow a specific user to use this service. Reviews v2 calls ratings service but displays the ratings as **black stars**.


Redirect specific user to version v2 based on the content in the cookie


@@ -91,8 +96,8 @@ spec:
      version: v2
```
-Per this newly added rule, if the cookie has `user=jason`, traffic goes to reviews v2 (black stars) otherwise default applies.
+Per this newly added rule, if the cookie has `user=jason`, traffic goes to reviews v2; otherwise the default applies.


-Test by Signing in as user 'jason' in the browser. Use any password. You'll only see black starts i.e, reviews v2. If you sign out you will see no stars!!
+Test by signing in as user 'jason' in the browser. Use any password. You'll only see **black stars**, i.e., reviews v2. If you sign out you will see **no stars**!! Try it a few times and have fun :)


-**Summary:** Assume you created a new version of reviews service v2 and you want to test it as specific user before releasing it or making it generally available. You introduce it as a canary to specific users to test.
+**Summary:** Assume you created a new version of the reviews service, v2, and you want to test it as a specific user before releasing it or making it generally available. You introduce it as a canary to a specific user to test.
127 changes: 127 additions & 0 deletions FaultInjection.md
@@ -0,0 +1,127 @@
# Fault Injection
Fault injection is a technique in which we intentionally introduce a fault condition into a system and observe its behavior.

In this example, we add a delay to the ratings microservice to mimic network latency. We then observe the overall behavior of the system to check whether it still responds or whether the delay causes other failures.

### Pre-requisites

This lab follows the [canary test](./CanaryContentBasedRouting.md). If you haven't run the canary test, first create the default routing rules listed there.

Also sign out if you are still signed in as user "jason".


### Exercise

Test the application in the browser and observe the response times in Zipkin. To find the URL for Zipkin, run:

```
$ oc get route zipkin -n istio-system
NAME      HOST/PORT                                 PATH      SERVICES   PORT      TERMINATION   WILDCARD
zipkin    zipkin-istio-system.192.168.64.7.nip.io             zipkin     http                    None
```
In my case the Zipkin URL is `zipkin-istio-system.192.168.64.7.nip.io`. Open it and find the trace for your latest request. You will notice that the response time for the service is a few milliseconds. Click a trace to expand it and you will see the time consumed by the individual services, as shown here:

![ZipkinTrace](./images/zipkin1.jpeg)
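As a complement to the Zipkin view, the end-to-end response time can also be sampled from the command line. The helper below is only a sketch and not part of the lab: `time_request` is a hypothetical name, and the URL you pass must be your own Bookinfo product page address, which this document does not show. It relies on curl's `-w '%{time_total}'` write-out variable and can optionally send a `user=jason` cookie, assuming the application forwards that cookie the way the route rules expect.

```shell
# Hypothetical helper: print the total response time (in seconds) of one request.
#   $1 = URL of the product page (an assumption; substitute your own route)
#   $2 = optional cookie string, for example "user=jason"
time_request() {
  url=$1
  cookie=${2:-}
  if [ -n "$cookie" ]; then
    # -s: silent, -o /dev/null: discard the body, -w: print only the timing
    curl -s -o /dev/null -w '%{time_total}\n' -H "Cookie: $cookie" "$url"
  else
    curl -s -o /dev/null -w '%{time_total}\n' "$url"
  fi
}
```

Calling `time_request <your-productpage-url>` a few times, with and without `user=jason` as the second argument, gives a rough baseline to compare against once a delay is injected.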

Now log in as user "jason", use the application, and measure again. Click the latest trace to expand it. The trace shows that the request goes from the reviews service to the ratings service, and that the ratings service takes just a few milliseconds to respond.

![ZipkinTraceWithRatings](./images/zipkin2.jpeg)

Now let's introduce a delay on the ratings service specifically for user "jason". This rule introduces a fixed delay of 7 seconds on the ratings service for any traffic coming from jason. If you look at the rule, you will see that the delay is introduced using `httpFault`.


```
$ oc create -f samples/bookinfo/kube/route-rule-ratings-test-delay.yaml
routerule "ratings-test-delay" created
$ oc get routerule ratings-test-delay -o yaml
apiVersion: config.istio.io/v1alpha2
kind: RouteRule
metadata:
  clusterName: ""
  creationTimestamp: 2017-10-27T21:41:04Z
  deletionGracePeriodSeconds: null
  deletionTimestamp: null
  name: ratings-test-delay
  namespace: bookinfo
  resourceVersion: "7746"
  selfLink: /apis/config.istio.io/v1alpha2/namespaces/bookinfo/routerules/ratings-test-delay
  uid: 886157b9-bb5f-11e7-9c32-1ad90b5af171
spec:
  destination:
    name: ratings
  httpFault:
    delay:
      fixedDelay: 7s
      percent: 100
  match:
    request:
      headers:
        cookie:
          regex: ^(.*?;)?(user=jason)(;.*)?$
  precedence: 2
  route:
  - labels:
      version: v1
```
```
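To see which cookie values the `regex` in this rule accepts, the pattern can be tried locally. This is a sketch under stated assumptions: `cookie_matches_jason` is a hypothetical helper, and it uses `grep -E`, which has no lazy quantifiers, so the rule's `.*?` is written as `.*`. For a yes/no match the two forms accept the same strings. How the proxy itself evaluates the header is not modelled here.

```shell
# Return 0 if a Cookie header value matches the rule's pattern, 1 otherwise.
# Rule pattern:  ^(.*?;)?(user=jason)(;.*)?$
# ERE form used: ^(.*;)?(user=jason)(;.*)?$   (same set of accepted strings)
cookie_matches_jason() {
  printf '%s\n' "$1" | grep -Eq '^(.*;)?(user=jason)(;.*)?$'
}

# Try a few sample values and report the result for each one.
for c in 'user=jason' 'a=b;user=jason;c=d' 'a=b; user=jason' 'user=jasonx'; do
  if cookie_matches_jason "$c"; then
    echo "match:    $c"
  else
    echo "no match: $c"
  fi
done
```

The first two samples match and the last two do not. The third sample shows that this pattern, as written, does not tolerate a space after the semicolon, so whether a multi-cookie header matches depends on how the cookies are separated when they reach the proxy.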


Now try accessing the application as user "jason". The reviews part of the application fails with the error "**Error Fetching Product Reviews**" as below:
![FaultIntroduced](./images/FaultWith10SDelay.jpeg)

Check the Zipkin traces again. The latest trace is shown in red to indicate a failure.

![ZipkinTraceWithRatings](./images/zipkin3.jpeg)
![ZipkinTraceWithRatings](./images/zipkin4.jpeg)

The detailed trace shows that the ratings service responded in ~7 seconds, but the reviews call failed after ~3 seconds and was then retried. During the retry, ratings again responded after ~7 seconds and the reviews call failed again. This happens because the timeout between the productpage and reviews services (3s + 1 retry = 6s total) is shorter than the timeout between the reviews and ratings services (10s).
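The timeout arithmetic in this explanation can be sketched as a small model. It is illustrative only: `productpage_outcome` is a hypothetical name, the numbers come from the text (3s per try, 1 retry), and the model ignores every source of latency other than the injected delay.

```shell
# Illustrative model only. All times are in milliseconds.
#   $1 = injected delay on the ratings service
#   $2 = per-try timeout on the productpage -> reviews call (3000 in the text)
#   $3 = number of retries after the first attempt (1 in the text)
productpage_outcome() {
  delay_ms=$1
  per_try_ms=$2
  retries=$3
  if [ "$delay_ms" -lt "$per_try_ms" ]; then
    # The delayed answer arrives before the caller gives up.
    echo "ok after about ${delay_ms}ms"
  else
    # Every attempt times out, so the caller waits per_try * (1 + retries).
    echo "error after about $(( per_try_ms * (1 + retries) ))ms"
  fi
}

productpage_outcome 7000 3000 1   # the 7s delay injected by the rule
productpage_outcome 2800 3000 1   # a delay just under the per-try timeout
```

With a 7000ms delay the model reports an error after about 6000ms, which corresponds to the two failed attempts seen in the trace. A delay below 3000ms stays inside the per-try timeout and the call succeeds.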

Sign out from "jason" and test as a default user. The calls should go through with no errors.


**Edit the delay**

Let's now change the delay on the ratings service to 2.8 seconds and see how the system behaves.

```
$ oc edit routerule ratings-test-delay
```

Find this section in the editor:
```
spec:
  destination:
    name: ratings
  httpFault:
    delay:
      fixedDelay: 7s
      percent: 100
```

Change it to:

```
spec:
  destination:
    name: ratings
  httpFault:
    delay:
      fixedDelay: 2.8s
      percent: 100
```
and save.
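If you prefer not to use the interactive editor, the same change can be scripted. The following is a sketch, not the lab's method: `set_fixed_delay` and `update_ratings_delay` are hypothetical helpers, and the wrapper assumes you are logged in with `oc`, are in the project that holds the rule, and that `oc replace` accepts the rule exactly as `oc get -o yaml` returned it.

```shell
# Filter: rewrite the value of any "fixedDelay:" line read from stdin,
# keeping the original indentation.
#   $1 = new delay, for example 2.8s
set_fixed_delay() {
  sed "s/^\([[:space:]]*fixedDelay:\).*/\1 $1/"
}

# Hypothetical wrapper: fetch the rule, rewrite the delay, replace the rule.
#   $1 = new delay, for example 2.8s
update_ratings_delay() {
  oc get routerule ratings-test-delay -o yaml \
    | set_fixed_delay "$1" \
    | oc replace -f -
}
```

With these defined, `update_ratings_delay 2.8s` would apply the same edit in one step. The filter only touches the `fixedDelay` line, so `percent` and the match conditions stay as they are.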

Test again, signing in as user "jason". The black star ratings (reviews v2) should be back, but you will notice a slight wait of just under 3 seconds. Also look at the Zipkin traces: the ratings service takes ~2.8 seconds and the rest of the calls go through with no errors.


**Clean up content-based route rules**

$ oc delete routerule ratings-test-delay
routerule "ratings-test-delay" deleted

$ oc delete routerule reviews-test-v2
routerule "reviews-test-v2" deleted

### Summary
In this lab, we learnt to inject a fault by mimicking network latency for a specific user, and tested how the overall system behaves.
Binary file added images/FaultWith10SDelay.jpeg
Binary file added images/zipkin1.jpeg
Binary file added images/zipkin2.jpeg
Binary file added images/zipkin3.jpeg
Binary file added images/zipkin4.jpeg
1 change: 1 addition & 0 deletions readme.md
@@ -17,4 +17,5 @@ For testing, you will run the same steps in [Istio Documentation](https://istio.
instead of running `istioctl create -f samples/bookinfo/kube/route-rule-all-v1.yaml`, you will run `oc create -f samples/bookinfo/kube/route-rule-all-v1.yaml`


* Testing Canary, Content based routing [Istio Docs](https://istio.io/docs/tasks/traffic-management/request-routing.html#content-based-routing) [openshift commands](./CanaryContentBasedRouting.md)
* Fault Injection with Network Latency [Istio Docs](https://istio.io/docs/tasks/traffic-management/fault-injection.html) [openshift commands](./FaultInjection.md)
* More tests to be added
