Conversation

@warrenvw (Contributor) commented Mar 1, 2019

Resolves #4.

Benefit from the Docker-friendly single-sensor-per-container mode (added in st2 v2.9) as a way of Sensor Partitioning: distribute the computing load across many pods and rely on K8s failover/reschedule mechanisms, instead of running everything on a single instance of st2sensorcontainer.

Remaining work:

  • Documentation in README.md
  • Documentation updates in the st2docs repo after the doc changes are approved in this repo.

@warrenvw warrenvw self-assigned this Mar 1, 2019
@warrenvw warrenvw requested a review from arm4b March 1, 2019 06:43
@warrenvw (Contributor, Author) commented Mar 1, 2019

I'm thinking about releasing a new chart that includes this PR and #50. Need to determine the appVersion (2.10dev or 3.0dev?) and the chart version (v0.10.0?).

@arm4b (Member) commented Mar 1, 2019

I'm good with releasing it in one Helm chart version.

We need to make sure the PRs are merged separately and tested in isolation, especially considering: #50 (comment)

appVersion remains the same, as we're relying on dev and there is no 2.10dev. See https://github.com/StackStorm/st2/blob/master/st2common/st2common/__init__.py

image: "{{ template "imageRepository" . }}/st2actionrunner{{ template "enterpriseSuffix" . }}:{{ .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
image: "{{ template "imageRepository" $ }}/st2actionrunner{{ template "enterpriseSuffix" $ }}:{{ $.Chart.AppVersion }}"
imagePullPolicy: {{ $.Values.image.pullPolicy }}
@arm4b (Member) Mar 1, 2019

Curious, what's the trick with `.` vs `$`?
I don't see this widely used in Helm examples or other charts. Does it fix anything? What's the difference?

@warrenvw (Contributor, Author) Mar 1, 2019

Ref: https://helm.sh/docs/chart_template_guide/#variables

> The range function will “range over” (iterate through) a list. But now something interesting happens. Just like `with` sets the scope of `.`, so does a `range` operator.
>
> However, there is one variable that is always global - `$` - this variable will always point to the root context. This can be very useful when you are looping in a range and need to know the chart’s release name.

The scope of the range block is .Values.st2.packs.sensors. So any of the values outside that scope need to be referred to with $. I'm open to any other "cleaner" solution.
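
To illustrate, a minimal sketch of the scoping (the sensor entries shown are hypothetical):

```yaml
# Hypothetical values:
#   st2:
#     packs:
#       sensors:
#         - name: sensor1
#         - name: sensor2
{{- range .Values.st2.packs.sensors }}
# Inside the range block `.` is the current sensors[] item,
# while `$` still points to the root context.
name: {{ $.Release.Name }}-st2sensorcontainer-{{ .name }}
{{- end }}
```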

@arm4b (Member)

Makes sense, thanks for the explanation!

@arm4b (Member) left a comment

Overall nice work!

There's a big ask to improve the documentation for the single-sensor-per-container mode.
Additionally, I've made several comments requesting discussion/alternatives on some parts of the code.

imagePullPolicy: {{ $.Values.image.pullPolicy }}
# TODO: Add liveness/readiness probes (#3)
#livenessProbe:
#readinessProbe:
@arm4b (Member)

Please expose liveness & readiness probes in Helm values as an option.

This is something that'll be needed in the context of the st2cicd work you're doing, where some sensors may open a port/HTTP endpoint.

@warrenvw (Contributor, Author)

Linking issue #3, as this PR helps resolve some of that issue.
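
For illustration, a sketch of what exposing the probes in values could look like (the sensor entry and port are hypothetical; the probe fields would be passed through to the Deployment spec):

```yaml
st2:
  packs:
    sensors:
      - name: http-sensor              # hypothetical sensor entry
        ref: examples.http_sensor      # hypothetical pack.sensor reference
        livenessProbe:                 # standard K8s probe, copied into the container spec
          tcpSocket:
            port: 9754
          initialDelaySeconds: 10
        readinessProbe:
          tcpSocket:
            port: 9754
```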

values.yaml Outdated
# To partition sensors with one sensor per node, override st2.packs.sensors.
# NOTE: Do not modify this file.
- name: aio
settings:
@arm4b (Member)

Does it make sense to move the child resources one level up and omit the settings sub-category altogether?

@warrenvw (Contributor, Author)

I'll experiment with this; I don't see any reason why not. I'd done it this way because one of the examples I'd seen grouped variables under settings, which didn't seem like a bad idea at the time.
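
For illustration, the flattened structure would look like this (keys shown are hypothetical):

```yaml
# Before: child resources nested under a settings sub-category
sensors:
  - name: aio
    settings:
      ref: examples.sample_sensor

# After: child resources moved one level up
sensors:
  - name: aio
    ref: examples.sample_sensor
```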

values.yaml Outdated
# Default "all-in-one" sensorcontainer that runs all sensors.
# To partition sensors with one sensor per node, override st2.packs.sensors.
# NOTE: Do not modify this file.
- name: aio
@arm4b (Member)

Just a question here to start a discussion:

Any other naming variations apart from aio? It brings back some bad memories from StackStorm's past, which had an "AIO" installer. I'm wondering if there is any chance to keep st2sensorcontainer named as before (with no suffix) when single-sensor-per-container mode is not enabled.

As you worked on this, you probably had some ideas or alternatives in mind?

@warrenvw (Contributor, Author)

Not a bad idea... I'll look into keeping the previous name (without a suffix). On my first pass the code was simpler when .name was used for all cases, but I'll see if I can come up with an elegant solution that does what you're asking.

```diff
 kind: Deployment
 metadata:
-  name: {{ .Release.Name }}-st2sensorcontainer{{ template "enterpriseSuffix" . }}
+  name: {{ $.Release.Name }}-st2sensorcontainer-{{ .name }}{{ template "enterpriseSuffix" $ }}
```
@arm4b (Member)

Just thinking out loud: it may only be a matter of time before the name exceeds the allowed character limit.
But we have no solution here except cutting/trimming it. A task for the future.

@warrenvw (Contributor, Author) Mar 1, 2019

The current pod names are about 75 characters long including the resource prefix /api/v1/pods. The maximum length of a Kubernetes resource name is 253 characters. Unless I'm missing something obvious, I do not think we'll exceed any limits.

https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

@arm4b (Member) Mar 4, 2019

See:
https://github.com/helm/charts/blob/d712e0cfd7d1426b6adb59473530fcc84bacbaf2/stable/rabbitmq-ha/templates/_helpers.tpl#L11-L12

helm/helm#2022

There are a bunch of other issues reported everywhere in K8s/Helm related to char limits.
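
For reference, a sketch of that kind of truncation helper (the helper name and calling convention are hypothetical, modeled on the rabbitmq-ha helper linked above; 63 characters is the DNS label limit such helpers usually target):

```yaml
{{/* templates/_helpers.tpl -- hypothetical helper */}}
{{- define "sensorFullname" -}}
{{- printf "%s-st2sensorcontainer-%s" .root.Release.Name .sensor.name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
```

It would be invoked with an explicit context, e.g. `name: {{ template "sensorFullname" (dict "root" $ "sensor" .) }}`.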

# It is possible to run st2sensorcontainer in HA mode by running one process on each compute instance.
# Each sensor node needs to be provided with proper partition information to share work with other sensor
# nodes so that the same sensor does not run on different nodes.
sensors:
@arm4b (Member)

Documentation/examples and references to it should definitely be improved.

Take a look at README.md. Additionally, we need a section there describing a full example, why this is needed, and the defaults.

@warrenvw (Contributor, Author)

👍 I definitely need to add documentation on configuring st2 sensor partitioning. I'll use some examples to help clarify.
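
For instance, a sketch of a partitioned-sensors override (sensor names and refs are hypothetical):

```yaml
st2:
  packs:
    sensors:
      - name: github
        ref: github.GithubRepositorySensor   # hypothetical pack.sensor reference
      - name: newrelic
        ref: newrelic.NewRelicHookSensor     # hypothetical
```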

@arm4b (Member) commented Mar 1, 2019

BTW, did you figure out why the K8s jobs didn't trigger (as part of the previous iteration)?

@arm4b (Member) commented Mar 1, 2019

TODO: once it's merged (including documentation) we'll need an st2docs sync-up.

@warrenvw (Contributor, Author) commented Mar 1, 2019

I would imagine the jobs didn't trigger due to some issue with Helm. I don't know the precise root cause at this point.

See https://github.com/stackstorm/discussions/issues/318#issuecomment-468190021 for more details. I didn't check this earlier because I had no reason to think Helm would be so poorly behaved. I'd always installed the Helm chart from within the chart workspace; I'd never had to install from a packaged chart just to get a correct install.

@warrenvw warrenvw requested a review from arm4b March 7, 2019 22:33
README.md Outdated
distributes the computing load between many pods and relies on K8s failover/reschedule mechanisms,
instead of running everything on a single instance of st2sensorcontainer. To partition the sensors,
create a yaml file containing `st2.packs.sensors`, and at a minimum, the `name` and `ref` elements.
You can also specify a `livenessProbe` and `readinessProbe` that Kubernetes will use to check
@arm4b (Member) Mar 8, 2019

This part about the probes, resources, and the rest could be omitted to keep the documentation more balanced.

README.md Outdated
```

Pass the name of this file to `helm install` using the `-f <file>` option. Add additional sensors to
the `sensors:` list following the same format as above.
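
To illustrate that option (the release and file names are hypothetical; Helm 2 syntax of the era):

```bash
helm install --name st2 -f sensors-override.yaml ./stackstorm-ha
```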
@arm4b (Member)

I think it's also worth mentioning that the particular sensor should be shipped as part of the Dockerized packs.

@arm4b (Member) left a comment

Looks good 👍

Great feature to have!

@warrenvw warrenvw merged commit d843026 into master Mar 8, 2019
@warrenvw warrenvw deleted the feat/partitionsensors branch March 8, 2019 22:22
@cognifloyd cognifloyd removed the RFR label Jul 2, 2021
Successfully merging this pull request may close these issues:

Partitioning StackStorm Sensors