KEP 70 - Add/remove Stages to/from Project #70

Closed
wants to merge 4 commits into from


johannes-b
Member

@johannes-b johannes-b commented Feb 7, 2022

Add/remove Stages to/from Project

Success Criteria: Keptn supports adding and removing stages for existing projects.

Short abstract

In Keptn, a project stage (or just stage) defines a logical space that has a dedicated purpose for an application in a continuous delivery process. Typically, a project has multiple stages that are ordered by the maturity level of the application. For instance, a project can consist of a deployment, hardening, and production stage, where the maturity level grows from left to right. While a deployment stage is used for feature development and initial testing, an application in production is bullet-proofed and serves production load.

Due to modern Cloud-native development approaches, a static set of stages is not sufficient anymore. For instance, it should be possible to spin up a stage for chaos-testing an application and to remove it after the test has been conducted. Another use case may focus on rolling out a multi/hybrid-Cloud strategy, which requires adding new stages for additional execution planes that are running on another Cloud provider.

Currently, it is not possible to add a new stage after the initial project setup. This KEP addresses this missing capability of adding/removing a stage on demand.

Depends on

Why

Target audience / Pain points / Related discussions

Pain points:

  • After creating a project in Keptn, it is not possible to add or remove a stage. The only workaround is re-creating the project with an updated shipyard.

Use-case drivers:

  • keptn#4693 - Add stages after initial project creation: This allows adapting the project to changed requirements in the software (application) development lifecycle.
  • It should be possible to spin up a temporary test stage, e.g. for functional/unit tests, as part of continuous integration (CI). This would shift Keptn closer to the developer and establish it already as an integral part of CI.

Related discussions / PR

What

  • As mentioned above, stages are ordered from left to right creating a child-parent relationship. The following example shows a setup with four stages. While the stages dev, hardening, and production are in sequential order, the stage production-remote is parallel to production and has the hardening stage as parent.
    image

  • Currently, the linking of stages to create a child-parent relationship is based on sequences and their triggeredOn configuration. Based on the above example, a shipyard configures the relationship from hardening to dev as follows:

    apiVersion: "spec.keptn.sh/0.2.0"
    kind: "Shipyard"
    metadata:
      name: "shipyard-sockshop"
    spec:
      stages:
        - name: "dev"
          sequences:
            - name: "delivery"
              tasks:
                - name: "deployment"
    
        - name: "hardening"
          sequences:
            - name: "delivery"
              triggeredOn:
                - event: "dev.delivery.finished"
          [...]
    

To configure a child-parent-relationship between stages, both stages need to have the same sequence (delivery). Besides, the child stage has to have a triggeredOn set on the event of the parent stage (dev.delivery.finished).
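The linking rule above can be sketched in a few lines of Python. This is only an illustration, not Keptn code: the shipyard is modeled as a plain dict that mirrors the YAML, and the helper name `parent_stages` is hypothetical. It assumes events follow the `<stage>.<sequence>.finished` pattern shown in the example.

```python
# Sketch only: the shipyard is a plain dict mirroring the YAML example above.
SHIPYARD = {
    "spec": {
        "stages": [
            {"name": "dev", "sequences": [{"name": "delivery"}]},
            {
                "name": "hardening",
                "sequences": [
                    {
                        "name": "delivery",
                        "triggeredOn": [{"event": "dev.delivery.finished"}],
                    }
                ],
            },
        ]
    }
}


def parent_stages(shipyard):
    """Map each stage name to the set of stages whose events trigger it.

    Hypothetical helper; assumes events look like "<stage>.<sequence>.finished".
    """
    parents = {}
    for stage in shipyard["spec"]["stages"]:
        found = set()
        for sequence in stage.get("sequences", []):
            for trigger in sequence.get("triggeredOn", []):
                # The text before the first dot is the triggering (parent) stage.
                parent, _, _ = trigger["event"].partition(".")
                found.add(parent)
        parents[stage["name"]] = found
    return parents


relationships = parent_stages(SHIPYARD)
```

Here `relationships` maps `hardening` to `{"dev"}`, while `dev` has no parent.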

What it is?

⭐ Use Case: As a user, I can add a new stage to a project.

This use case comes in two ways:

  1. Add a new stage to extend the stage chain. For example, introduce a new stage chaos in-between dev and hardening for a dedicated testing purpose.
  2. Add a new stage quality-assurance that is parallel to another one.

Outcome:
image

Both ways of adding a stage are supported by adjusting the shipyard of that project accordingly:

apiVersion: "spec.keptn.sh/0.2.0"
kind: "Shipyard"
metadata:
  name: "shipyard-sockshop"
spec:
  stages:
    - name: "dev"
      sequences:
        - name: "delivery"
          tasks:
            - name: "deployment"

    - name: "chaos"
      sequences:
        - name: "delivery"
          triggeredOn:
            - event: "dev.delivery.finished"
          tasks:
            - name: "deployment"

    - name: "hardening"
      sequences:
        - name: "delivery"
          triggeredOn:
            - event: "chaos.delivery.finished"
            
    - name: "quality-assurance"
      sequences:
        - name: "delivery"
          triggeredOn:
            - event: "chaos.delivery.finished"
      [...]
  • After updating the shipyard, a keptn update project or API call on PUT /project/{sockshop} adds the new stages:
keptn update project sockshop --shipyard=new_shipyard.yaml
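How the update endpoint would detect the change is not specified in this KEP. One plausible approach is to compare the stage names of the stored shipyard with those of the uploaded one. The sketch below assumes shipyards are available as plain dicts; `stage_names` and `diff_stages` are hypothetical names, not part of the Keptn API.

```python
def stage_names(shipyard):
    """Return the stage names of a shipyard dict in their declared order."""
    return [stage["name"] for stage in shipyard["spec"]["stages"]]


def diff_stages(current, updated):
    """Return (added, removed) stage names between two shipyards.

    Hypothetical helper: a stage is identified by its name only.
    """
    before, after = stage_names(current), stage_names(updated)
    added = [name for name in after if name not in before]
    removed = [name for name in before if name not in after]
    return added, removed


# Stored shipyard (before) and uploaded shipyard (after), reduced to stage names.
CURRENT = {"spec": {"stages": [{"name": "dev"}, {"name": "hardening"}]}}
UPDATED = {
    "spec": {
        "stages": [
            {"name": "dev"},
            {"name": "chaos"},
            {"name": "hardening"},
            {"name": "quality-assurance"},
        ]
    }
}

added, removed = diff_stages(CURRENT, UPDATED)
```

The same comparison covers removal: swapping the two arguments reports `chaos` and `quality-assurance` as removed.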

image

⭐ Use Case: As a user, I can remove a stage from a project.

  • To remove a stage from a project, the user has to modify the shipyard of that project accordingly:
  • After updating the shipyard, a keptn update project or API call on PUT /project/{sockshop} removes the stages:
keptn update project sockshop --shipyard=new_shipyard.yaml
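Removing a stage from the middle of a chain can leave child stages whose triggeredOn still points at the removed stage. The KEP does not say how this should be handled; the following sketch merely shows how such dangling references could be detected before an update is accepted. The helper name `dangling_triggers` and the dict-based shipyard are assumptions for illustration.

```python
def dangling_triggers(shipyard):
    """Return (stage, event) pairs whose triggering stage is not defined.

    Hypothetical validation; assumes events look like "<stage>.<sequence>.finished".
    """
    names = {stage["name"] for stage in shipyard["spec"]["stages"]}
    problems = []
    for stage in shipyard["spec"]["stages"]:
        for sequence in stage.get("sequences", []):
            for trigger in sequence.get("triggeredOn", []):
                parent = trigger["event"].split(".", 1)[0]
                if parent not in names:
                    problems.append((stage["name"], trigger["event"]))
    return problems


def delivery(event=None):
    """Build a delivery sequence, optionally triggered by the given event."""
    sequence = {"name": "delivery"}
    if event is not None:
        sequence["triggeredOn"] = [{"event": event}]
    return sequence


# "chaos" was removed, but "hardening" still waits for its event.
BROKEN = {
    "spec": {
        "stages": [
            {"name": "dev", "sequences": [delivery()]},
            {"name": "hardening", "sequences": [delivery("chaos.delivery.finished")]},
        ]
    }
}
# Same shipyard with "hardening" re-linked to "dev".
FIXED = {
    "spec": {
        "stages": [
            {"name": "dev", "sequences": [delivery()]},
            {"name": "hardening", "sequences": [delivery("dev.delivery.finished")]},
        ]
    }
}

broken_report = dangling_triggers(BROKEN)
fixed_report = dangling_triggers(FIXED)
```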

image

What it is? Summary:

  • Implementation of the PUT /project/ endpoint allowing a user to add/remove a stage by updating the project.
  • CLI / Bridge leverage this endpoint to add/remove stages:
    • keptn update project sockshop --shipyard=new_shipyard.yaml
    • In the Bridge, it should be possible to update the shipyard in the Settings > Project page:
      image

What it is not?

  • Implementation of the API endpoints for the stage entity:
    • API: POST /project/{project}/stage/{stage} > Creates a stage for the project.
    • API: DELETE /project/{project}/stage/{stage} > Deletes stage from project and updates the shipyard accordingly.

Open Discussions

  • to-be-listed

Signed-off-by: Johannes <johannes.braeuer@dynatrace.com>
@johannes-b johannes-b changed the title Add/remove Stages to/from Project KEP 70 - Add/remove Stages to/from Project Feb 7, 2022
Signed-off-by: Johannes <johannes.braeuer@dynatrace.com>
@johannes-b johannes-b marked this pull request as ready for review February 8, 2022 08:56
@johannes-b johannes-b added enhancement New feature or request roadmap-candidate Potential candidate for inclusion into the Keptn roadmap labels Feb 8, 2022
@agrimmer
Member

An alternative approach is to allow the editing of stages via the shipyard. This gives the user all possibilities to express how the sequences in the stages are interlinked, which is not possible with follows.

Signed-off-by: Johannes <johannes.braeuer@dynatrace.com>
@oleg-nenashev oleg-nenashev added roadmap This initiative is a part of the Keptn roadmap and removed roadmap-candidate Potential candidate for inclusion into the Keptn roadmap labels Mar 15, 2022
@agrimmer
Member

agrimmer commented Mar 31, 2022

When discussing this KEP with the team, we determined a list of open questions which should be answered in this KEP:

  1. What should happen with events of a deleted stage? Should they be deleted? For example, what should happen if I delete a stage and then create a new one with the same name? Do we want to preserve the events?
  2. Do we need to change the used Git model into a folder-based structure in order to allow dynamically adding/removing stages?
  3. When creating a new stage, do we want to reuse configs from other/parent stages? Or do we want an empty config?

@johannes-b
Member Author

johannes-b commented Apr 11, 2022

@agrimmer, excellent questions and many thanks for bringing those up. I will answer them in the order of (1), (3), and (2):


(1) What should happen with events of a deleted stage? Should they be deleted? For example, what should happen if I delete a stage and then create a new one with the same name? Do we want to preserve the events?

Similar to the approach of deleting a service, events need to stay when deleting a stage. This is required to display historical sequence executions in Bridge.


(3) When creating a new stage, do we want to reuse configs from other/parent stages? Or do we want an empty config?

No reuse of config. In a first iteration of this enhancement proposal, we will go with an empty config. But the folders and the metadata.yaml for the services must be created. In other words, these are the same steps that are executed when doing a keptn create service.
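A minimal sketch of such an empty stage scaffold could look as follows. The directory layout (`<stage>/<service>/metadata.yaml`) and the helper name `scaffold_stage` are assumptions made for illustration, and the metadata file is deliberately left empty because its content is not described here.

```python
import tempfile
from pathlib import Path


def scaffold_stage(root, stage, services):
    """Create an empty folder plus a metadata.yaml placeholder per service.

    Hypothetical layout: <root>/<stage>/<service>/metadata.yaml
    """
    created = []
    for service in services:
        service_dir = Path(root) / stage / service
        service_dir.mkdir(parents=True, exist_ok=True)
        metadata = service_dir / "metadata.yaml"
        # The file content is intentionally left empty in this sketch.
        metadata.touch(exist_ok=True)
        created.append(metadata)
    return created


with tempfile.TemporaryDirectory() as root:
    paths = scaffold_stage(root, "quality", ["carts", "payment"])
    relative = sorted(path.relative_to(root).as_posix() for path in paths)
    all_exist = all(path.is_file() for path in paths)
```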


(2) Do we need to change the used Git model into a folder-based structure in order to allow dynamically adding/removing stages?

This is a rather challenging question that required some research. Links to recent blog posts can be found here:

In order to derive a decision here, @agrimmer, @thisthat, @markuslackner and I (@johannes-b) have created a comparison of the branch-based and folder-based approaches:

A) Branch-based structure: "as designed" and used
B) Folder-based structure: implemented in resource-service, but not rolled-out

Git usage:

  A) Branch-based:

    git checkout/pull master < ?
    git checkout -b quality
    < add files / folders >
    git commit -s -m "Added new stage quality"
    git push

  B) Folder-based:

    git checkout master
    mkdir quality
    < add files / folders >
    git commit -s -m "Added new stage quality"
    git push

Advantages / Disadvantages in general:

  A) Branch-based:

    Advantages of this approach in general:
    ✔️ Security concern: It is much easier to secure individual branches in a Git repository instead of folders in a single branch

    Disadvantages of this approach in general:
    ❌ Pull requests and merges between different branches are problematic > Promotion is never a simple Git merge
    ❌ People are tempted to include environment-specific code and create configuration drift.
    ❌ Large number of environments > maintenance of all environments (and their branches) gets quickly out of hand
    ❌ The branch-per-environment model goes against the existing Kubernetes ecosystem (e.g., Helm / Kustomize)**

  B) Folder-based:

    Advantages of this approach in general:
    ✔️ The order of commits on the repo is irrelevant
    ✔️ By only copying files around, you only take exactly what you need and nothing else (no configuration drift)
    ✔️ No need to use Git cherry-picks or any other advanced git method to promote releases
    ✔️ Free to make any change from any environment to either an upstream or downstream environment (without any constraints about the correct “order” of environments) → to support hot-fix strategies
    ✔️ File diff operations to understand what is different between environments in all directions

    Disadvantages of this approach in general:
    ❌ Security concern: Git means are needed to implement security (manual approval, PRs, CODEOWNER file, etc.)

Kustomize: Kustomize has a base/overlay concept based on the folder structure. Example:

    ├── base
    │   ├── deployment.yaml
    │   ├── kustomization.yaml
    │   └── service.yaml
    └── overlays
        ├── dev
        │   ├── kustomization.yaml
        │   └── patch.yaml
        ├── prod
        │   ├── kustomization.yaml
        │   └── patch.yaml
        └── staging
            ├── kustomization.yaml
            └── patch.yaml

Helm: Helm allows modeling this using a base Helm chart with different versions of a values.yaml. Example:

    chart/
      [...chart files here..]
    common/
      values-common.yml
    variants/
      prod/
        values-prod.yml
      non-prod/
        values-non-prod.yml
      [...other variants…]
    envs/
      prod-eu/
        values-env-default.yaml
        values-replicas.yaml
        values-version.yaml
        values-settings.yaml
      [..other environments…]

Advantages / Disadvantages in the context of Keptn:

  A) Branch-based:
    ❌ Performance implications (multiple git pulls, i.e., one for each stage)
    ❌ Only one sequence at a time can be processed
    ❌ Impossible to retrigger old sequences, because we lose the history once a stage is deleted
    ✔️ Well tested

  B) Folder-based:
    ✔️ Performance gain / no read lock (one git pull)
       * For write operations, a lock is still needed.
    ✔️ Complete history if the same stage is added/deleted multiple times
    ✔️ Git commit id = keptn context → is shared across the sequence execution (simplifies retriggering of a sequence)
    ✔️ Easier to go through the stages (e.g., get the resources of a service - used by the Bridge)

Is there a blocker for this KEP?

  A) Branch-based: Which branch will be the origin of the new branch (is it master/main)?
    ⚠️ Cannot be HEAD of main/master because then the stage would e.g. contain the shipyard.yaml. Besides, it can't be any other branch because then we would reuse its git commit history.

Helps us to get to a point where we can connect to any repo (i.e., could also be a source-code repository of an application):

  A) Branch-based: ❌ No, since we modify the repository by adding/deleting branches and which branch should be used as the origin?

  B) Folder-based: ✔️ Yes, as the only requirement is a .keptn folder with the sub-folders for the stages. Given read permissions on that folder, Keptn could modify it by simple commits.

    Theoretically, we could then bind a repo not only to a project but also to services as we just need the upstream repo + branch assuming we find a .keptn folder.

Helps us to get to a point where we don't need a Git repo anymore (i.e., could make use of an S3 bucket):

  B) Folder-based: ✔️ Yes, we could go even one step further and eliminate the dependency on Git because Keptn then just needs a URL to a bucket (e.g. S3) where a .keptn folder is available. From there, the configuration is consumed (downloaded from S3).

Conclusion:

  A) Branch-based: Adding/Removing stages is based on the current approach
  B) Folder-based: Adding/Removing stages is based on the new approach
    → Implies that we need to migrate repositories
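Two of the folder-based advantages listed in the comparison, file diffs between environments and promotion by copying files, can be illustrated with the Python standard library. This is a sketch under assumptions: each stage is a plain directory under one root, and `compare_stages` and `promote` are hypothetical helpers, not part of the resource-service.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def compare_stages(root, left, right):
    """Report which files differ between two stage folders (top level only)."""
    filecmp.clear_cache()  # avoid stale results after files were overwritten
    listing = filecmp.dircmp(str(Path(root) / left), str(Path(root) / right))
    # dircmp compares shallowly by default, so re-check common files byte by byte.
    _, mismatch, errors = filecmp.cmpfiles(
        listing.left, listing.right, listing.common_files, shallow=False
    )
    return {
        "only_left": sorted(listing.left_only),
        "only_right": sorted(listing.right_only),
        "changed": sorted(mismatch + errors),
    }


def promote(root, source, target, filename):
    """Promote one file by copying it from the source to the target stage."""
    shutil.copy2(Path(root) / source / filename, Path(root) / target / filename)


with tempfile.TemporaryDirectory() as root:
    dev, hardening = Path(root, "dev"), Path(root, "hardening")
    dev.mkdir()
    hardening.mkdir()
    (dev / "values.yaml").write_text("image: carts:0.2.0\n")
    (dev / "slo.yaml").write_text("objectives: []\n")
    (hardening / "values.yaml").write_text("image: carts:0.1.0\n")

    before = compare_stages(root, "dev", "hardening")
    promote(root, "dev", "hardening", "values.yaml")
    after = compare_stages(root, "dev", "hardening")
```

Only `values.yaml` is taken over; `slo.yaml` stays a dev-only file, which matches the "take exactly what you need" argument.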

⭐ Based on the above comparison of the folder-based and branch-based approach, the requirement of switching to a folder-based structure of storing configuration in the Git repository has been derived. Please see the advantages that speak in favor of the folder-based and the disadvantages (as well as limitations) that speak against the branch-based approach.

Since this is a drastic change in the way Keptn uses the Git repository, further discussion is needed before making this change. Please post your thoughts and comments here.

@thschue
Contributor

thschue commented Apr 12, 2022

Basically, I like the approach of flattening the repository structure and eventually deprecating the upstream repo in the future.

In my opinion, a major case is not handled at the moment which led to the development of the promotion service and which might be prevented when switching the repository structure. This consists of two parts, the storage of the artefacts and the retrieval of them when deploying them.

Storing artefacts
As far as I know, a user is not able to commit a bunch of changes in one transaction which specifies their version. The Keptn API only accepts single files, which might not be practicable in many cases. Therefore, the current approach was to commit everything to a branch in the upstream and tag this with the service name and version number (only as a label, as the version is not reflected in the event schema of keptn) to be able to retrieve exactly this version at deployment time.

Retrieving artefacts
The second part of the solution was a service (promotion-service) which inspected the version number (and service name), fetched exactly the configuration for a specific service-version combination and stored this in the “stage-branch” to be consumed by the keptn services. Using this approach, we were able to update one service to a specific version and to do some things as rollbacks and selective approvals (specify a specific version of a service to approve for deployment in the next stage).

What’s the problem with the new approach?
The storing artefacts part might be the same as before, as this has been handled in a branch which was not managed by keptn. Nevertheless, as the main branch of the upstream repo will be utilised a bit more, the creation of this branch might be harder. The same for tagging on this branch, this will work as intended.

As far as I could get it, you plan to retrieve the artefacts based on commit ids in the future, and this might be the main part which will break this approach.

Some questions:

  • How can I ensure that the correct artefacts for a specific version of a service get deployed?
  • When taking the promotion-service out of the game, how could I specify which version (!= commit id) of a service I want to deploy?
  • How could I update a bunch of configurations to keptn as one bundle/in one transaction and mark them with a specific version number?

As I wrote in the introduction, I totally support your approach, and I would appreciate every effort to remove the need to write to the upstream repository from a “user” perspective and to deprecate the promotion service. But currently, I think the cases above are unhandled and might be avoided in the worst case (as I’m currently not really sure how artefacts will be fetched then).

@johannes-b
Member Author

johannes-b commented Apr 14, 2022

Hello @thschue,

many thanks for the great feedback and the discussion we had! 🙌

In a meeting, Thomas and I drew a picture to iterate over the questions raised in his previous comment.

  • It addresses the main challenge of: How does a Keptn user know which version (!= Git commit id) to deploy?.
  • Why is this actually a problem? Because it could be the case that a user has no access to the Git repo storing the artifacts and hence the Git commit is unknown.

Please consider the derived drawing that is explained below:

Versioning in Keptn drawio

  • Starting in the top left corner, we see a CI pipeline that is building, testing artifacts and finally updating configuration to deploy. While in Keptn 0.14 this configuration is uploaded using keptn add-resource or manually committed to the upstream repo, uploading it via a /config endpoint that knows a service, version and config parameter would make it much easier.
  • The config parameter supports two formats:
    1. A ZIP archive following a folder structure that stores the config files as explained below.
    2. A URL that points to a ./keptn folder with the same structure. This ./keptn folder could be part of another repository, e.g., a source code repository that is used for service development.
  • The resource-service processes the API request and stores the configuration in Git. Basically, it extracts or downloads the config and pushes it to the Git repository. The folder structure could look as follows, but this needs to be clarified:
    ./base
    ./services
      |- carts
      |- payment
    ./stages
      |- dev
      |- hardening
      |- production
    
  • Besides, the resource-service tags the commit using the service name and version: svc-version. It is important to have the service name as part of the tag since the repo is managing multiple services that eventually have the same version at some point in time. However, with the service name it becomes a unique key.
  • (Future - just an idea for now): Another variant of the resource-service could use an S3 bucket as the backend with an implementation of the versioning approach for that kind of storage. This could be based on the file name.
  • Now, let's assume a Keptn user (or CI) wants to trigger a deployment for carts in v0.2.0. How does the person know that this version is stored with the Git commit #85a3f2c? This is a very challenging task, or actually not possible when the repo is not accessible. The only meta-data that is known to the person is the version, i.e., v0.2.0. Consequently, it is the only information that can be provided to Keptn at this point in time.
  • When executing the delivery sequence, this version number v0.2.0 could be mapped to the Git commit id #85a3f2c. At the end, this is a technical detail.
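The mapping from a version to a commit described in the list above can be sketched as a small lookup keyed by the `svc-version` tag. This is an in-memory illustration, not a Git client: the class `VersionIndex`, its methods, and the `payment` commit id are hypothetical; only the tag format and the carts example come from the discussion.

```python
def tag_for(service, version):
    """Build the unique tag from service name and version (svc-version)."""
    return f"{service}-{version}"


class VersionIndex:
    """In-memory stand-in for Git tags pointing at commits (illustration only)."""

    def __init__(self):
        self._commits = {}

    def record(self, service, version, commit_id):
        """Remember which commit holds the config for a service version."""
        self._commits[tag_for(service, version)] = commit_id

    def resolve(self, service, version):
        """Return the commit id for a service version, or raise LookupError."""
        tag = tag_for(service, version)
        if tag not in self._commits:
            raise LookupError(f"no configuration stored for {tag}")
        return self._commits[tag]


index = VersionIndex()
index.record("carts", "v0.2.0", "85a3f2c")
# Hypothetical commit id: another service may reuse the same version number.
index.record("payment", "v0.2.0", "1b9d4e0")

carts_commit = index.resolve("carts", "v0.2.0")
```

Because the service name is part of the key, `carts` and `payment` can both be at v0.2.0 without colliding.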

Conclusion and status quo

TL;DR:

  • The Keptn API hides how config is stored in Keptn - for now, Git as the backend will stay, but this needs to be challenged.
  • The Git repo will become an artifact store that is managed by Keptn only! No changes are allowed.
  • Versioning of artifacts is needed to allow a user to trigger a delivery without knowing the Git commit Id.
  • Lastly, uploading config is a bulk operation that takes a ZIP or URL to a ./keptn folder.
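The ZIP variant of the bulk upload could be handled roughly as sketched below. The endpoint itself is not modeled; `build_archive` and `unpack_config` are hypothetical helpers, and the file names are made up for the example. The explicit path check is an assumption about what a service accepting uploads would want, namely to reject entries that point outside the target folder.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def build_archive(files):
    """Pack a {path: content} mapping into ZIP bytes (stands in for the client)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


def unpack_config(payload, target):
    """Extract an uploaded ZIP into target and return the extracted file names."""
    target = Path(target).resolve()
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        for member in archive.namelist():
            destination = (target / member).resolve()
            # Reject entries escaping the target folder instead of rewriting them.
            if destination != target and target not in destination.parents:
                raise ValueError(f"unsafe path in archive: {member}")
        archive.extractall(target)
        return sorted(name for name in archive.namelist() if not name.endswith("/"))


payload = build_archive(
    {
        "base/slo.yaml": "objectives: []\n",
        "stages/dev/values.yaml": "replicas: 1\n",
    }
)

with tempfile.TemporaryDirectory() as root:
    written = unpack_config(payload, root)
    dev_values = (Path(root) / "stages" / "dev" / "values.yaml").read_text()
```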

Action Items

  • What does the folder structure look like? The above drawing shows a proposal, but not a final agreement.
  • Which kind of config is stored? Just deployment/test config or Keptn config as well?

Already known tickets:

@thschue
Contributor

thschue commented Apr 14, 2022

The behaviour would have an impact on #67 and the gitops-prototype itself as illustrated here:

image

As the artefacts could be pushed in one batch, there would be no more need to manipulate the upstream repository. Therefore, it would be possible to use a generic GitOps toolkit (such as the ArgoCD or flux controllers) for the manipulation of the keptn resources. The delivery of the artefacts could be done via the keptn CLI (push), or with a simple controller (pull), which fetches the configuration and delivers it to the keptn API.

The promotion service, which currently does the version handling, could also be removed completely if keptn itself is able to provide artifacts for a specific version of a service in a stage. The drawing also shows that the keptn API becomes the center of all operations again, and there is no need for a user to access the upstream anymore.

@agardnerIT

I see this listed as a disadvantage of branch based but don't see a corresponding advantage in the folder column: Only one sequence at a time can be processed. Would a folder based approach allow multiple sequences within a project to run concurrently and thus solve #79 ?

@thisthat
Member

Hey @agardnerIT, I would say that the folder-based approach definitely moves us closer to solving #79.

@thisthat
Member

Thank you for the write-up @johannes-b and @thschue. Could you please provide more details on what type of configuration you envision to be stored in the .keptn folder? Only relevant files for the deployment or also keptn specific configuration (e.g., webhooks)?
Furthermore, in the folder structure example I see the stages/services definition:

./base
./services
  |- carts
  |- payment
./stages
  |- dev
  |- hardening
  |- production

Would this replace the shipyard.yaml file?

@johannes-b
Member Author

Hey @agardnerIT

thanks for highlighting: Only one sequence at a time can be processed, which has to be slightly adapted to: Only one resource/config request can be processed at a time.

The reasoning behind that disadvantage:

This limitation falls back to the problem that the branch-based approach requires a lock on the Git repo to derive the correct Git commit id. Unfortunately, we do not know the Git commit id upfront due to diverged branches that can exist in the current branch-based approach. Since the only way of deriving the Git commit is by locking the repo for a moment, this lock blocks other requests.
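The effect of that lock can be shown with a toy model. The class below is an in-memory stand-in for a branch-based repository, not a real Git client, and the commit ids are made up. It only demonstrates that switching the shared working copy to a branch and reading its HEAD must happen as one locked step, so concurrent requests are served one after another.

```python
import threading


class BranchRepo:
    """Toy model of a repository with one shared working copy (illustration only)."""

    def __init__(self, heads):
        self._heads = dict(heads)  # branch name -> commit id at HEAD
        self._checked_out = None  # branch the shared working copy points at
        self._lock = threading.Lock()

    def head_commit(self, stage):
        """Switch to the stage branch and read its HEAD as one locked step."""
        with self._lock:
            # Stands in for `git checkout <stage>` followed by reading HEAD.
            self._checked_out = stage
            return self._heads[self._checked_out]


# Hypothetical commit ids for two stage branches.
repo = BranchRepo({"dev": "a1b2c3d", "hardening": "e4f5a6b"})
results = {}


def worker(stage):
    """Simulate one resource/config request for a stage."""
    results[stage] = repo.head_commit(stage)


threads = [threading.Thread(target=worker, args=(name,)) for name in ("dev", "hardening")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Without the lock, a second request could switch the working copy between the checkout and the read of the first one.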

I agree with @thisthat that going with the folder-based approach brings us closer to supporting the use case mentioned here: #79, since it eliminates the risk of reading configuration from another sequence (which is the main motivation for queuing sequences right now).


Hey @thisthat

I agree that it needs clarification whether Keptn specific config is managed this way too. Currently, much focus is on deployment specific config. I added your thought to the above list of Action Items. Hence, we need to follow up on that one.

@thschue
Contributor

thschue commented Apr 15, 2022

Hello @thisthat!

We mostly thought about the configurations held in files in the repository at the moment (like helm-charts, sli.yaml, slo.yaml, job.config). The Webhook Configuration is more a keptn configuration that should be held in a Custom Resource or managed via the bridge in the long term (but in the beginning, it can stay in the Artifact store).

We also discussed the replacement of the shipyard, which should be the ultimate goal, in my opinion. But we also agreed that this should not happen within the scope of this enhancement proposal. We should keep the internal keptn structure like Projects, Stages, Sequences in the Database (or as CRDs in Kubernetes in the Future). I see no advantage in getting the project structure from the Artifact Store (aka Upstream Repository), and I also think this change could bring us a lot of flexibility.

For example, our new structure would allow the definition of simple configurations in all stages (base directory), and a customer can override specific configurations (like SLIs) per stage. If we add a new stage and there are no overrides, it's possible to deploy this service without adding any Artifacts per stage (but yes, the stage needs to be created).
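The base-plus-override idea can be sketched as a simple lookup: prefer a stage-specific file and fall back to the base directory. The directory names `base` and `stages/<stage>` follow the proposed (not yet agreed) folder structure, and `resolve_config` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def resolve_config(root, stage, filename):
    """Return the stage override of a file if present, else the base version."""
    override = Path(root) / "stages" / stage / filename
    if override.is_file():
        return override
    base = Path(root) / "base" / filename
    if base.is_file():
        return base
    raise FileNotFoundError(f"{filename} not found for stage {stage}")


with tempfile.TemporaryDirectory() as root:
    # Shared default plus one production-specific override.
    (Path(root) / "base").mkdir()
    (Path(root) / "base" / "slo.yaml").write_text("pass: 90%\n")
    (Path(root) / "stages" / "production").mkdir(parents=True)
    (Path(root) / "stages" / "production" / "slo.yaml").write_text("pass: 99%\n")

    # "dev" has no folder at all and still resolves to the base file.
    dev_path = resolve_config(root, "dev", "slo.yaml").relative_to(root).as_posix()
    prod_path = resolve_config(root, "production", "slo.yaml").relative_to(root).as_posix()
```

A newly added stage without any folder of its own therefore still gets a usable configuration.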

Adding and removing stages would be very easy in this case, as we would not have to take care of the Artifact store. The database would be the source of truth, and the Artifact store could contain specific information for this stage, but it wouldn't have to.

Finally, the Artifact Store (whether we're using git/s3/local filesystem as backend) should only contain artifacts that the other keptn services might consume. It should be possible to configure Keptn via Custom Resources in the GitOps/Declarative approach or the CLI/Bridge in an imperative way (or applying the custom resources via the cli).

@johannes-b
Member Author

johannes-b commented Apr 21, 2022

In the developer meeting on the 21st of April, we agreed on splitting this KEP since it already covers three aspects:

  • (1) Initial enhancement request: Add/remove Stages to/from Project > KEP-70
  • (2) Derived pre-requisite: Switch from branch to folder-based Git usage > KEP-81
  • (3) Derived follow-up: Version awareness in Keptn > KEP-82

I will proceed with moving (2) and (3) into a dedicated KEP. For the sake of comprehensibility, the started discussions for (2) and (3) will stay in this KEP.

------------          ------------          ------------
|          |          |          |          |          |
|  KEP-81  |--------->|  KEP-70  |--------->|  KEP-82  |
|          |          |          |          |          |
------------          ------------          ------------

Signed-off-by: Johannes <johannes.braeuer@dynatrace.com>
@thschue
Contributor

thschue commented Nov 14, 2022

This issue is mainly obsolete with the Lifecycle Controller, and I am no longer convinced that it makes sense to add this functionality to Keptn itself. Therefore, I propose to close this enhancement proposal (won't do).

@johannes-b, @thisthat: What do you think about this?

@thschue thschue added the question Further information is requested label Nov 14, 2022
@johannes-b
Member Author

You could argue that this is still a missing capability. However, the Keptn lifecycle-toolkit is more flexible in this regard, and investments there are more future-proof. I'm fine with closing this KEP.
@thisthat how about you?

@thisthat
Member

I think the Keptn Lifecycle Toolkit (KLT) is a better place for this enhancement since it targets the delivery of applications. I am also in favor of closing it with won't do and re-shaping this enhancement with the KLT in mind.

@johannes-b
Member Author

Closed in favor of the Keptn Lifecycle Toolkit (KLT).

Labels
enhancement New feature or request question Further information is requested roadmap This initiative is a part of the Keptn roadmap wontdo

6 participants