Conversation

@warrenvw (Contributor) commented Mar 1, 2019

Resolves #4.

Benefit from the Docker-friendly single-sensor-per-container mode (added in st2 v2.9) as a way of Sensor Partitioning: distribute the computing load across many pods and rely on K8s failover/reschedule mechanisms, instead of running everything on a single instance of st2sensorcontainer.

Remaining work:

  • Documentation in README.md
  • Documentation updates in the st2docs repo after the doc changes are approved in this repo.

@warrenvw warrenvw self-assigned this Mar 1, 2019
@warrenvw warrenvw requested a review from arm4b March 1, 2019 06:43
@warrenvw (Contributor, Author) commented Mar 1, 2019

I'm thinking about releasing a new chart that includes this PR and #50. Need to determine the appVersion (2.10dev or 3.0dev?) and the chart version (v0.10.0?).

@arm4b (Member) commented Mar 1, 2019

I'm good with releasing it in one Helm chart version.

We need to make sure the PRs are merged separately and tested in isolation, especially considering: #50 (comment)

appVersion remains the same, as we're relying on dev and there is no 2.10dev. See https://github.com/StackStorm/st2/blob/master/st2common/st2common/__init__.py

image: "{{ template "imageRepository" . }}/st2actionrunner{{ template "enterpriseSuffix" . }}:{{ .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
image: "{{ template "imageRepository" $ }}/st2actionrunner{{ template "enterpriseSuffix" $ }}:{{ $.Chart.AppVersion }}"
imagePullPolicy: {{ $.Values.image.pullPolicy }}
@arm4b (Member) Mar 1, 2019

Curious, what's the trick with `.` vs `$`?
I don't see this widely used in Helm examples or other charts. Does it fix anything? What's the difference?

@warrenvw (Contributor, Author) Mar 1, 2019

Ref: https://helm.sh/docs/chart_template_guide/#variables

> The range function will “range over” (iterate through) a list. But now something interesting happens. Just like `with` sets the scope of `.`, so does a `range` operator.
>
> However, there is one variable that is always global - `$` - this variable will always point to the root context. This can be very useful when you are looping in a range and need to know the chart’s release name.

The scope of the range block is .Values.st2.packs.sensors. So any of the values outside that scope need to be referred to with $. I'm open to any other "cleaner" solution.
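
To illustrate, a minimal sketch of the scoping (the sensor entries shown are hypothetical):

```yaml
# Hypothetical values:
#   st2:
#     packs:
#       sensors:
#         - name: sensor1
#         - name: sensor2
{{- range .Values.st2.packs.sensors }}
# Inside the range block `.` is the current sensors[] item,
# while `$` still points to the root context.
name: {{ $.Release.Name }}-st2sensorcontainer-{{ .name }}
{{- end }}
```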

@arm4b (Member)

Makes sense, thanks for the explanation!

@arm4b (Member) left a comment

Overall nice work!

There's a big ask to improve the documentation for the single-sensor-per-container mode.
Additionally, I've made several comments requesting discussion/alternatives on some parts of the code.

imagePullPolicy: {{ $.Values.image.pullPolicy }}
# TODO: Add liveness/readiness probes (#3)
#livenessProbe:
#readinessProbe:
@arm4b (Member)

Please expose liveness & readiness probes in Helm values as an option.

This is something that'll be needed in the context of the st2cicd work you're doing, where some sensors may open a port/HTTP endpoint.

@warrenvw (Contributor, Author)

Linking issue #3, as this PR helps resolve some of that issue.
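
For illustration, a sketch of what exposing the probes in values could look like (the sensor entry and port are hypothetical; the probe fields would be passed through to the Deployment spec):

```yaml
st2:
  packs:
    sensors:
      - name: http-sensor              # hypothetical sensor entry
        ref: examples.http_sensor      # hypothetical pack.sensor reference
        livenessProbe:                 # standard K8s probe, copied into the container spec
          tcpSocket:
            port: 9754
          initialDelaySeconds: 10
        readinessProbe:
          tcpSocket:
            port: 9754
```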

values.yaml Outdated
# To partition sensors with one sensor per node, override st2.packs.sensors.
# NOTE: Do not modify this file.
- name: aio
settings:
@arm4b (Member)

Does it make sense to move the child resources one level up and omit the settings sub-category altogether?

@warrenvw (Contributor, Author)

I'll experiment with this; I don't see any reason why not. I'd done it this way because one of the examples I'd seen grouped variables under settings, which didn't seem like a bad idea at the time.
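
For illustration, the flattened structure would look like this (keys shown are hypothetical):

```yaml
# Before: child resources nested under a settings sub-category
sensors:
  - name: aio
    settings:
      ref: examples.sample_sensor

# After: child resources moved one level up
sensors:
  - name: aio
    ref: examples.sample_sensor
```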

values.yaml Outdated
# Default "all-in-one" sensorcontainer that runs all sensors.
# To partition sensors with one sensor per node, override st2.packs.sensors.
# NOTE: Do not modify this file.
- name: aio
@arm4b (Member)

Just a question here to start a discussion:

Any other naming variations apart from aio? It brings back some bad memories from StackStorm's past, which had an "AIO" installer. I'm wondering if there is any chance to keep st2sensorcontainer named as before (with no suffix) when single-sensor-per-container mode is not enabled.

As you worked on this, you probably had some ideas or alternatives in mind?

@warrenvw (Contributor, Author)

Not a bad idea... I'll look into keeping the previous name (without a suffix). On my first pass the code was simpler when .name was used for all cases, but I'll see if I can come up with an elegant solution that does what you're asking.

```diff
 kind: Deployment
 metadata:
-  name: {{ .Release.Name }}-st2sensorcontainer{{ template "enterpriseSuffix" . }}
+  name: {{ $.Release.Name }}-st2sensorcontainer-{{ .name }}{{ template "enterpriseSuffix" $ }}
```
@arm4b (Member)

Just thinking out loud: it may only be a matter of time before the name exceeds the allowed character limit.
But we have no solution here except cutting/trimming it. A task for the future.

@warrenvw (Contributor, Author) Mar 1, 2019

The current pod names are about 75 characters long including the resource prefix /api/v1/pods. The maximum length of a Kubernetes resource name is 253 characters. Unless I'm missing something obvious, I do not think we'll exceed any limits.

https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

@arm4b (Member) Mar 4, 2019

See:
https://github.com/helm/charts/blob/d712e0cfd7d1426b6adb59473530fcc84bacbaf2/stable/rabbitmq-ha/templates/_helpers.tpl#L11-L12

helm/helm#2022

There are a bunch of other issues reported everywhere in K8s/Helm related to char limits.
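
For reference, a sketch of that kind of truncation helper (the helper name and calling convention are hypothetical, modeled on the rabbitmq-ha helper linked above; 63 characters is the DNS label limit such helpers usually target):

```yaml
{{/* templates/_helpers.tpl -- hypothetical helper */}}
{{- define "sensorFullname" -}}
{{- printf "%s-st2sensorcontainer-%s" .root.Release.Name .sensor.name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
```

It would be invoked with an explicit context, e.g. `name: {{ template "sensorFullname" (dict "root" $ "sensor" .) }}`.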

# It is possible to run st2sensorcontainer in HA mode by running one process on each compute instance.
# Each sensor node needs to be provided with proper partition information to share work with other sensor
# nodes so that the same sensor does not run on different nodes.
sensors:
@arm4b (Member)

Documentation/examples and references to it should definitely be improved.

Take a look at README.md. Additionally, we need a section there describing a full example, why this is needed, and the defaults.

@warrenvw (Contributor, Author)

👍 I definitely need to add documentation on configuring st2 sensor partitioning. I'll use some examples to help clarify.
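
For instance, a sketch of a partitioned-sensors override (sensor names and refs are hypothetical):

```yaml
st2:
  packs:
    sensors:
      - name: github
        ref: github.GithubRepositorySensor   # hypothetical pack.sensor reference
      - name: newrelic
        ref: newrelic.NewRelicHookSensor     # hypothetical
```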

@arm4b (Member) commented Mar 1, 2019

BTW, did you figure out why the K8s jobs didn't trigger (as part of the previous iteration)?

@arm4b (Member) commented Mar 1, 2019

TODO: once it's merged (including documentation) we'll need an st2docs sync-up.

@warrenvw (Contributor, Author) commented Mar 1, 2019

I would imagine the jobs didn't trigger due to some issue with Helm. I don't know the precise root cause at this point.

See https://github.com/stackstorm/discussions/issues/318#issuecomment-468190021 for more details. I didn't check this earlier because I had no reason to think Helm would be so poorly behaved. I'd always installed the Helm chart from within the chart workspace; I'd never had to install from a packaged chart just to get a correct install.

@warrenvw warrenvw requested a review from arm4b March 7, 2019 22:33
README.md Outdated
distributes the computing load between many pods and relies on K8s failover/reschedule mechanisms,
instead of running everything on a single instance of st2sensorcontainer. To partition the sensors,
create a yaml file containing `st2.packs.sensors`, and at a minimum, the `name` and `ref` elements.
You can also specify a `livenessProbe` and `readinessProbe` that Kubernetes will use to check
@arm4b (Member) Mar 8, 2019

This part about the probes, resources, and the rest could be omitted to keep the documentation more balanced.

README.md Outdated
```

Pass the name of this file to `helm install` using the `-f <file>` option. Add additional sensors to
the `sensors:` list following the same format as above.
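
To illustrate that option (the release and file names are hypothetical; Helm 2 syntax of the era):

```bash
helm install --name st2 -f sensors-override.yaml ./stackstorm-ha
```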
@arm4b (Member)

I think it's also worth mentioning that the particular sensor should be shipped as part of the Dockerized packs.

@arm4b (Member) left a comment

Looks good 👍

Great feature to have!

@warrenvw warrenvw merged commit d843026 into master Mar 8, 2019
@warrenvw warrenvw deleted the feat/partitionsensors branch March 8, 2019 22:22
@cognifloyd cognifloyd removed the RFR label Jul 2, 2021
Successfully merging this pull request may close these issues:

Partitioning StackStorm Sensors