
Remove redundant Instance.Status.AggregatedStatus field #1474

Merged · 2 commits merged into master from ad/rm-aggregated-status on Apr 30, 2020

Conversation

@zen-dog (Contributor) commented on Apr 17, 2020

Summary:
#1442 introduced the new `Instance.Spec.PlanExecution.Status` field, which duplicates part of the `Instance.Status.PlanStatus` information. With that, plan execution status is now saved three times: in `planExecution`, `aggregatedStatus`, and the corresponding `planStatus` fields, e.g.:

```yaml
apiVersion: kudo.dev/v1beta1
kind: Instance
spec:
  planExecution:
    planName: deploy
    status: IN_PROGRESS       # <-- one
status:
  aggregatedStatus:
    activePlanName: deploy
    status: IN_PROGRESS       # <-- two
  planStatus:
    deploy:
      name: deploy
      phases:
      - name: deploy
        status: IN_PROGRESS   # <-- three
        steps:
        - name: app
          status: IN_PROGRESS
      status: IN_PROGRESS
```

`aggregatedStatus` completely duplicates `planExecution`, while `planStatus` provides extended information on individual phases and steps. This PR removes the `aggregatedStatus` field in favor of `planExecution`.

The only caveat is that we previously kept the `aggregatedStatus.status` value even **after** the plan was finished, to signal how the _last_ plan execution ended. However, `kudo plan status` already calculates and displays this information from the corresponding `planStatus`. Additionally, having `aggregatedStatus.activePlanName` populated when no plan is active is somewhat broken to begin with.
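
For illustration, here is a sketch of the same example after this change; it assumes nothing else in the status changes and simply drops `aggregatedStatus`:

```yaml
apiVersion: kudo.dev/v1beta1
kind: Instance
spec:
  planExecution:
    planName: deploy
    status: IN_PROGRESS        # single remaining copy of the currently running plan's state
status:
  planStatus:                  # detailed per-plan history with phases and steps
    deploy:
      name: deploy
      phases:
      - name: deploy
        status: IN_PROGRESS
        steps:
        - name: app
          status: IN_PROGRESS
      status: IN_PROGRESS
```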

Signed-off-by: Aleksey Dukhovniy <alex.dukhovniy@googlemail.com>

@kensipe (Member) commented on Apr 17, 2020

```
FAIL: kudo/harness (0.01s)
        --- FAIL: kudo/harness/zookeeper-upgrade-test (304.78s)
        --- FAIL: kudo/harness/kafka-upgrade-test (305.15s)
```

@kensipe (Member) commented on Apr 17, 2020

The tests that are failing are in the operators repo

zk: https://github.com/kudobuilder/operators/tree/master/repository/zookeeper/tests/zookeeper-upgrade-test
kafka: https://github.com/kudobuilder/operators/tree/master/repository/kafka/tests/kafka-upgrade-test

These test failures can be ignored here... but we will need a PR in the operators repo to resolve them.

@kensipe (Member) commented on Apr 17, 2020

some follow-up thoughts...

  1. I do like `aggregatedStatus`, or one var that indicates that there is an active plan... I don't see `aggregatedStatus` as a duplicate of status... it is "aggregated". It is only a duplicate now because we only have one plan running at a time... which I expect to change sometime.
  2. I don't like status updates in `spec:`... it should be in `status`.

@runyontr (Member) commented

The paradigm in the Kubernetes world is that properties set by the user should reside in the spec and properties set by the controller should reside in the status. As a result, if we're going to remove one of them, it should be the spec's state. This is kind of outlined in this doc: https://book-v1.book.kubebuilder.io/basics/status_subresource.html.
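
As a rough illustration of that convention (a generic sketch with hypothetical names, not KUDO's actual CRD manifest), enabling the status subresource on a CRD gives the controller a separate `/status` endpoint for its writes, while users keep editing `.spec`:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: instances.example.dev          # hypothetical group and name
spec:
  group: example.dev
  names:
    kind: Instance
    plural: instances
  scope: Namespaced
  versions:
  - name: v1beta1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:                        # user-owned desired state
            type: object
            x-kubernetes-preserve-unknown-fields: true
          status:                      # controller-owned observed state
            type: object
            x-kubernetes-preserve-unknown-fields: true
    subresources:
      status: {}                       # controller updates go through the /status endpoint
```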

@zen-dog (Contributor, Author) commented on Apr 17, 2020

> I don't like status updates in `spec:`... it should be in `status`

I don't like it either, but #1442 describes why we needed to duplicate this information 😞 It's either that or merging the Status resource back into the `Instance`, which is arguably a bigger evil.

@zen-dog (Contributor, Author) commented on Apr 17, 2020

> The paradigm in the Kubernetes world is that properties set by the user should reside in the spec and properties set by the controller should reside in the status.

I know, but sadly, the Spec field is needed for the instance admission controller. The #1442 description sheds some light on the problem. Otherwise, there is no way to guarantee consistency around plan execution. We can discuss it offline in detail if you like.

@runyontr (Member) commented

It seems that merging Instance and PlanExecution into a single controller caused this confusion. The Instance controller should define the desired spec of the PlanExecution (a rough, hypothetical sketch follows the list below), i.e. what PE should be running based on the Instance changes. The PE controller would then be responsible for updating the status of the PE as it progresses. Now that those are merged, the Instance controller has multiple specs to keep correctly represented in the cluster:

  1. The Instance is running the right PE
  2. The PE is progressing as intended.
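
For illustration only, here is a hypothetical sketch of the split design described above, with PlanExecution as its own resource; none of these fields or names come from current KUDO, they just show where the spec/status responsibilities would land:

```yaml
apiVersion: kudo.dev/v1beta1
kind: PlanExecution                # hypothetical split resource, not part of current KUDO
metadata:
  name: my-instance-deploy         # hypothetical name
spec:
  planName: deploy                 # the Instance controller would create this object...
  instanceRef:
    name: my-instance              # ...and point it back at the owning Instance
status:
  status: IN_PROGRESS              # the PE controller would own and update this status
```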

@zen-dog (Contributor, Author) commented on Apr 17, 2020

Yes, we tried splitting the PE and I controllers/resources, but that has fundamentally the same consistency issues.

In the end, it boils down to the fact that the KUDO API is edge-based (running plans), and it is not ok to miss/ignore/allow invalid edges. It is ok for the Deployment controller to reconcile multiple updates to the number of replicas by picking one, many, or just the last update. But if the KUDO IC did the same, you might run multiple plans at the same time (or miss plans completely). That is ok if the only plan is deploy, but not ok if you have backup and restore, where running both concurrently ruins your data.

To keep it consistent, you need a webhook. As for keeping both fields in the same resource: it's the only way a webhook can decline invalid updates, since k8s does not support transactional updates of multiple resources. Split this into two, e.g. PE and I (or just a subresource), and you have the same consistency problems.
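
To make the webhook point concrete, here is a rough sketch (with hypothetical names; not KUDO's actual configuration or handler) of how such a webhook would be registered to intercept Instance updates, so the handler behind it could compare `spec.planExecution` against the existing plan status and decline an update that would start a second plan while one is still running:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: instance-plan-validation            # hypothetical name
webhooks:
- name: instance-admission.example.dev      # hypothetical; KUDO's real webhook may differ
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Fail                       # reject on error rather than risk concurrent plans
  clientConfig:
    service:
      name: kudo-webhook-service            # hypothetical service name
      namespace: kudo-system
      path: /validate-instance
  rules:
  - apiGroups: ["kudo.dev"]
    apiVersions: ["v1beta1"]
    operations: ["CREATE", "UPDATE"]
    resources: ["instances"]
```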

To be clear, I'm not necessarily advocating for edge-based APIs. But there is a class of problems (mostly one-off commands like running a plan, restarting a node, etc.) which are hard to express in a level-based world. This gives KUDO an edge (!) over existing solutions.

P.S. More details on why an edge-based API is impossible to keep consistent in a pure controller paradigm: https://hackmd.io/@RYL9RmUQQMKZ7EnyqC1UVg/HkTQpRiN8#Manually-Triggered-Plans

@ANeumann82 added the release/breaking-change label (This PR contains breaking changes and is marked in the release notes) on Apr 20, 2020
@zen-dog merged commit e9cfc2b into master on Apr 30, 2020
@zen-dog deleted the ad/rm-aggregated-status branch on Apr 30, 2020