Write a basic Dotmesh Operator that replicates our current DaemonSet setup #344
Comments
Per #343 (comment), we don't have a StatefulSet template per se, but we do have a design for an operator which will set up PVCs, node labels, etc. appropriately!
As part of this issue, we need a testing strategy. We are going to develop simple … Then we can write dind tests for the dotmesh operator creating pods which consume PVCs from a … Later, we'll be able to simulate killing a dotmesh node and having the PV fail over and "reattach" to a different dind.
* 'master' of https://github.com/dotmesh-io/dotmesh: #344: a failing test which isn't run yet in CI, tee hee.
We now have dind flexvolume and dynamic provisioners, which work according to this test. Next steps, as I see them:
LOCAL
PV PER NODE
POOL OF DOTMESHES
Note that some of the above plan spans different GitHub issues!
See https://kubernetes.io/blog/2018/01/introducing-client-go-version-6
We now have a thing we can compile and run in a test cluster locally, which prints out messages when nodes come/go/change, and the start of a code structure that will run The Algorithm whenever something interesting happens.
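The shape described above (log each node event, then re-run the algorithm) can be sketched with nothing but the Go standard library. This is only an illustration, not the operator's actual code: `NodeEvent`, `EventKind`, `applyEvent` and `runLoop` are hypothetical names, and the channel stands in for whatever watch mechanism the real operator uses to hear about nodes.

```go
package main

import "fmt"

// EventKind is a hypothetical label for what happened to a node.
type EventKind string

const (
	NodeAdded   EventKind = "added"
	NodeChanged EventKind = "changed"
	NodeRemoved EventKind = "removed"
)

// NodeEvent is a hypothetical stand-in for a notification from a watch on
// the cluster's nodes.
type NodeEvent struct {
	Kind EventKind
	Node string
}

// applyEvent updates the set of known nodes and returns a log message.
func applyEvent(nodes map[string]bool, ev NodeEvent) string {
	switch ev.Kind {
	case NodeRemoved:
		delete(nodes, ev.Node)
	default:
		// Added and changed nodes are both recorded as present.
		nodes[ev.Node] = true
	}
	return fmt.Sprintf("node %s %s", ev.Node, ev.Kind)
}

// runLoop consumes events until the channel is closed. After every event it
// prints a message and calls reconcile with the current node set. It returns
// the messages it printed.
func runLoop(events <-chan NodeEvent, reconcile func(nodes map[string]bool)) []string {
	nodes := map[string]bool{}
	var log []string
	for ev := range events {
		msg := applyEvent(nodes, ev)
		fmt.Println(msg)
		log = append(log, msg)
		reconcile(nodes)
	}
	return log
}

func main() {
	events := make(chan NodeEvent, 3)
	events <- NodeEvent{NodeAdded, "node-a"}
	events <- NodeEvent{NodeAdded, "node-b"}
	events <- NodeEvent{NodeRemoved, "node-a"}
	close(events)

	runLoop(events, func(nodes map[string]bool) {
		fmt.Printf("reconciling with %d known node(s)\n", len(nodes))
	})
}
```

Running the reconcile step after every event keeps the sketch simple; a real controller would more likely coalesce bursts of events before doing any work.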
…ve no pod bound to them! Downside: the pods crash and burn on startup. But I think that should be just a matter of tweaking the pod template.
… fix stuff, and improved the template
…rt new dotmesh pods while old ones are dying.
…DM namespace already), make node labelling two-stage.
…od deployment, so GC works correctly (and kubectl drain?)
…works. `kubectl drain` fails if the node has pods controlled by operators on it. This makes it intermittent already because sometimes the etcd pod is on that node, and downright failsome with the dotmesh operator in play.
…p), and they're not referenced from the docs any more.
This is in production, so I'm calling it done.
* master:
  NFC: More logging
  dotscience#3 make subdot roots writeable by all, for containers which run as non-root
  FIX: Missed space :-( Testing stuff in CI is tedious.
  FIX: Missed the `-c` option to the `dm dot delete...`
  FIX: Typo...
  #17: Pull the right image, use a dedicated config, and test `dm dot delete` on the remote
  NFC: Test adding sleep to ensure replication.
  #17: Avoid echoing the API key, and run the smoke tests on Linux (it's easier for me to debug them there)
  #17: Made the smoke test push to a remote cluster (if credentials are passed into SMOKE_TEST_REMOTE and SMOKE_TEST_APIKEY).
  NFC: Fix logging on error messages
  #352: Attempt to reduce flakiness by checking replication status on both nodes in a cluster
  NFC: Comments concerning pod health checking
  NFC: Re-enable flaky test for debugging
  #344: We no longer need the GKE yamls (that's handled in the ConfigMap), and they're not referenced from the docs any more.
  NFC: fix typo sneaked into yaml
  NFC: Comment out test until we can work out how to fix it
This is part of epic #385.
With #343 done, we can write a simple Dotmesh Operator that runs Dotmesh on every node in the cluster.
This can become the canonical way of running DM in k8s once documented!
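To make "runs Dotmesh on every node" concrete, here is a rough sketch of the planning step such an operator might perform: compare the nodes that exist with the pods that exist, and work out which nodes still need a pod and which pods are stale. It is an assumption-laden illustration rather than the project's algorithm; `plan`, the pod names and the node names are all hypothetical, and PVCs and node labels are left out entirely.

```go
package main

import (
	"fmt"
	"sort"
)

// plan is a hypothetical planning function. nodes lists the nodes currently
// in the cluster; podNode maps each existing pod name to the node it runs on.
// It returns the nodes that need a new pod and the pods that should be
// removed (pods on vanished nodes, or surplus pods on an already covered node).
func plan(nodes []string, podNode map[string]string) (create []string, remove []string) {
	live := make(map[string]bool, len(nodes))
	for _, n := range nodes {
		live[n] = true
	}

	// Visit pods in name order so the result is deterministic.
	pods := make([]string, 0, len(podNode))
	for p := range podNode {
		pods = append(pods, p)
	}
	sort.Strings(pods)

	covered := map[string]bool{}
	for _, p := range pods {
		node := podNode[p]
		if !live[node] || covered[node] {
			remove = append(remove, p)
			continue
		}
		covered[node] = true
	}

	// Every live node without a pod needs one.
	for n := range live {
		if !covered[n] {
			create = append(create, n)
		}
	}
	sort.Strings(create)
	return create, remove
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	pods := map[string]string{
		"dotmesh-1": "node-a",
		"dotmesh-2": "node-gone", // this node has left the cluster
	}
	create, remove := plan(nodes, pods)
	fmt.Println("create pods on:", create)
	fmt.Println("remove pods:", remove)
}
```

Keeping the plan a pure function of (nodes, pods) means it can be unit-tested without a cluster, with the dind tests left to cover the real Kubernetes behaviour.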