Chaosengine was not clear after the experiment was over #1398

badashanren · 2020-04-02T03:28:26Z

What happened:
The chaosengine was not cleaned up after the experiment was over, even though spec.jobCleanPolicy is "delete".
When I ran the experiment again, the experiment pod did not start up. I had to remove the chaosengine manually. Is this normal?
What you expected to happen:
The chaosengine is cleaned up after the experiment is over, so that I can run the experiment multiple times.
How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

ksatchit · 2020-04-02T05:18:07Z

Thanks for the feedback @badashanren! Currently, we need to remove & recreate the engine to start the experiment again (some users are working around this by setting up cron jobs to remove/create the chaosengine). We are solving this in a couple of stages:

Stage-1: We will be able to keep a "completed" chaosengine and patch it to re-trigger experiments (ref: "Add logic for Graceful Termination, when chaos engine is deleted" #1360 (comment)). This should be available by 1.3 (15th of this month).
Stage-2: Provide a scheduling capability to rerun the experiments multiple times at a desired interval (ref: "Better way to do "scheduled chaos"" #1223). This should be available by 1.4, in all probability.
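
For reference, the remove-and-recreate workaround mentioned above can be scripted and then run from a cron job. The following is only a minimal sketch: it assumes `kubectl` is on the PATH and already pointed at the right cluster, and the engine name, namespace, and manifest path used in the usage note are hypothetical placeholders, not values from this issue.

```python
import subprocess
from typing import List


def recreate_engine_commands(name: str, namespace: str, manifest: str) -> List[List[str]]:
    """Return the kubectl invocations that remove and then recreate a chaosengine."""
    return [
        # Remove the finished engine; --ignore-not-found keeps a first run from failing.
        ["kubectl", "delete", "chaosengine", name, "-n", namespace, "--ignore-not-found"],
        # Recreate the engine from its manifest, which starts the experiment again.
        ["kubectl", "apply", "-f", manifest, "-n", namespace],
    ]


def rerun_experiment(name: str, namespace: str, manifest: str) -> None:
    """Run the commands in order and stop at the first failure."""
    for cmd in recreate_engine_commands(name, namespace, manifest):
        subprocess.run(cmd, check=True)
```

Calling `rerun_experiment("nginx-chaos", "default", "engine.yaml")` (placeholder values) from a scheduled job would approximate the cron-based workaround. Building the commands separately from running them makes it easy to inspect what will be executed.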

badashanren · 2020-04-02T07:03:43Z

Got it~
BTW, I want to use another chaos library (such as chaosblade) to inject chaos. Any tips or docs would be appreciated.

ksatchit · 2020-04-03T04:32:59Z

Sure @badashanren! That is an interesting use case. Litmus has been designed so that it can reuse other chaos tools/chaos logic, provided they are containerized. Typically, the workflow involves consuming a config that the respective tool needs, with the litmus chaos-runner creating a job with the said chaos container/library. For example, we have been using pumba in that model for some of the network experiments. Another example of integration is with chaostoolkit, where not just the chaos-injection stage but an entire experiment is written using that tool & is orchestrated by Litmus.

We don't have a formal document describing how it is done yet, but will add it & share it for review ASAP.

Having said that, I am interested in learning more about your use case for chaos in general, where chaosblade fits in, etc., so that I can better suggest/document things.

badashanren · 2020-04-03T07:30:34Z

First, I like Litmus. I think of it not as a chaos tool but as a chaos platform. It is scalable, which is very important: many chaos tools have their own advantages, and Litmus can quickly be made compatible with other tools.
But without formal docs, this is very difficult for us.

ksatchit · 2020-04-03T07:44:57Z

That's great to hear! The doc is coming your way in a few hours :)

ksatchit · 2020-04-03T08:59:05Z

@badashanren, here is some info you may find useful.

- The formal docs are here: Litmus Docs
- A simplistic/basic developer guide to construct experiments in ansible is here: Developer Guide
  Note: This provides a simple scaffolding tool for ansible-based experiments, but we certainly support others (python/golang etc.) as well. The dev guides for those will be added soon and are tracked via "Dockerize the experiment scaffold tool to avoid managing dependencies" #1259
- The issue tracking the docs for integration w/ other tools (has a summary) is here: "Provide the steps to integrate litmus with other chaos tools" #1205. This is also being tracked for 1.3 (Apr 15th). I hope to provide a first-cut version of this today EoD so that you can take a look!
- A long-pending style guide which may elaborate on patterns/approaches is tracked here: "Create a styleGuide for naming conventions used in the LItmusChaos Project" #764. This might make it to patch releases on 1.3 (mostly ~Apr 30 to 1st week of May)

I will be talking about 2 types of integrations in the upcoming doc that resolves #1205:

1. The experiment is already available via a different tool and you would only like to orchestrate it via Litmus. Ex: chaostoolkit integration w/ Litmus.
- A sample experiment logic is here: https://github.com/litmuschaos/test-tools/blob/master/chaostoolkit/data/k8_wrapper.py. Some associated enhancements are being placed in "Added support for Reporting - control via ENV flag" test-tools#121
- The ChaosExperiment/ChaosEngine CR which litmus will execute for this is here: https://github.com/litmuschaos/chaos-charts/tree/staging/charts/chaostoolkit/k8-pod-delete
2. The experiment needs to be constructed with user-defined entry (pre-chaos) & exit (post-chaos) checks based on the scenario/application in picture. However, the chaos itself is actually injected via a separate tool. Ex: OpenEBS experiments reusing a Pumba-based job to do network delays. In this case, the environment variables LIB & LIB_IMAGE in the chaosexperiment/chaosengine CR are set to the desired tool, & a chaoslib wrapper is available in litmus which will use that tool to perform chaos.
- Example of such an experiment: https://github.com/litmuschaos/litmus/tree/master/experiments/openebs/openebs-target-network-loss
- The corresponding ChaosExperiment/ChaosEngine CRs are here: https://github.com/litmuschaos/chaos-charts/tree/staging/charts/openebs/openebs-target-network-loss
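
To make the second type a little more concrete, here is a rough sketch of how the LIB & LIB_IMAGE overrides might be assembled for one experiment entry of a ChaosEngine. It assumes the overrides live under `spec.components.env` of the experiment entry as name/value pairs, which should be checked against the linked CRs; the function name and the image reference in the usage note are hypothetical.

```python
from typing import Dict, List


def experiment_with_lib(experiment: str, lib: str, lib_image: str) -> Dict:
    """Build one entry for the engine's experiments list with LIB overrides."""
    env: List[Dict[str, str]] = [
        # Name of the external chaos tool the chaoslib wrapper should use.
        {"name": "LIB", "value": lib},
        # Container image that ships that tool (placeholder in the usage note).
        {"name": "LIB_IMAGE", "value": lib_image},
    ]
    # Assumed layout: env overrides nested under spec.components.env.
    return {"name": experiment, "spec": {"components": {"env": env}}}
```

For example, `experiment_with_lib("openebs-target-network-loss", "pumba", "example.com/chaos/pumba:latest")` returns a dict that could be serialized into the engine manifest. The image reference here is a placeholder, not the image the charts actually use.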

badashanren · 2020-04-03T09:12:06Z

Thanks very much, I will study it carefully.

ksatchit · 2020-04-17T04:46:01Z

@badashanren, just to circle back on the 1.3 items related to the topics discussed in this issue and its comments:

- You can now re-apply a chaosengine resource or patch it w/o having to delete/recreate it ("Add logic for Graceful Termination, when chaos engine is deleted" #1360). You can find the FAQ/steps for it here: https://docs.litmuschaos.io/docs/faq-general/#how-to-restart-chaosengine-after-graceful-completion
- Integration w/ other tools is explained via: https://docs.google.com/presentation/d/1OGuSisuory7jE-LvrDyC6J9x1wdom2Bowc0Fso6F4Ps/edit
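
As an illustration of the patch-based restart, the sketch below builds a JSON merge patch and the matching `kubectl patch` invocation. It assumes the restart is triggered by setting `spec.engineState` to `active`; please confirm the exact field and value against the FAQ linked above. The engine name and namespace in the usage note are hypothetical.

```python
import json
from typing import List


def restart_patch() -> str:
    """Return a JSON merge patch that marks the engine as active again.

    Assumption: the field is spec.engineState and the value is "active".
    """
    return json.dumps({"spec": {"engineState": "active"}})


def patch_command(name: str, namespace: str) -> List[str]:
    """Return the kubectl invocation that applies the patch to a chaosengine."""
    return [
        "kubectl", "patch", "chaosengine", name,
        "-n", namespace,
        "--type", "merge",
        "--patch", restart_patch(),
    ]
```

Passing the result of `patch_command("nginx-chaos", "default")` (placeholder values) to `subprocess.run(..., check=True)` would apply the patch without deleting the engine first.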

Upcoming:

- As noted, 1.3.1 will contain the style guide and some more improvements to the developer docs (including an example for use w/ chaosblade).
- 1.4 will have the dockerization of the scaffold tool to avoid dependencies.

ksatchit · 2020-05-16T06:11:22Z

The ChaosScheduler has been made available (as an alpha feature) in release 1.4. This should meet the requirement of being able to run the experiment repeatedly by creating/removing the chaosengine.

Refer to https://docs.litmuschaos.io/docs/scheduling/ & https://docs.litmuschaos.io/docs/chaosschedule/

ksatchit added the kind/feature, project/community (Issues raised by community members), and area/chaos-scheduler (Regarding the chaos scheduler) labels Apr 2, 2020

ksatchit added this to the 1.4 milestone Apr 2, 2020

ksatchit mentioned this issue Apr 17, 2020 in "Provide the steps to integrate litmus with other chaos tools" #1205 (Closed)

ksatchit closed this as completed May 16, 2020
