
API refactor #37

Open
hunter opened this issue May 11, 2016 · 17 comments

hunter commented May 11, 2016

Now that we have an initial prototype built, there are a few areas of the API that can be improved.

As mentioned in #15, ThirdPartyResource may be a better approach to integrating with the Kubernetes API. We get the benefits of Kube API auth along with the use of etcd - https://github.com/kubernetes/kubernetes/blob/release-1.2/docs/design/extending-api.md
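
For reference, registering the pipeline resource type via TPR would look roughly like this (a sketch based on the design doc above; the name pipeline.kontinuous.io and the version are placeholders, not settled decisions):

# registers a Pipeline third-party resource with the Kubernetes API
# (the name and description here are illustrative only)
apiVersion: extensions/v1beta1
kind: ThirdPartyResource
metadata:
  name: pipeline.kontinuous.io
description: A kontinuous CI/CD pipeline
versions:
  - name: v1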

(Currently a stub... updates to come)

hunter commented May 12, 2016

Updated with some info on ThirdPartyResources

hunter added the ready label May 25, 2016
@darkcrux

This would probably tie in with the Controller Refactor. It would also remove the need for us to run our own etcd instance.

hunter commented May 25, 2016

My current understanding of ThirdPartyResources is that only the resource itself is stored in etcd; the rest of the associated data has to be stored elsewhere (so, I assume, our own etcd).

@darkcrux

Yes. The thinking is that pipelines, stages, and build info would be stored as ThirdPartyResources, and it's up to the controllers to monitor the API for changes to start builds, send notifications, etc.

hunter commented May 25, 2016

Ah yeah, I hadn't considered storing more than pipelines in there.

One thing we might need to check is whether TPR is enabled in GKE.

@darkcrux

I don't think it's enabled in our latest installation.

@darkcrux

A few things I think we need to improve on the API:

  1. Create separate models for the API and the controllers. Right now the API, the controller, and the datastore all share the same models, so it's pretty complex. We need to simplify them.
  2. Break down the models from one big structure (a pipeline contains builds, which contain stages) into several entries in etcd. This would map well to TPR: the idea is that when TPR is available, we can just run kubectl get pipelines, kubectl get builds, kubectl get stages, etc. It also makes querying easier on the API side (see the key-layout sketch after this list).
  3. Remove references to the datastore and SCM from the API. This is tech debt; these should be on the controller side. It would also make it easier to refactor the controller afterwards.
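
For illustration, splitting the one big structure into separate etcd entries could be keyed along these lines (a sketch only; the concrete layout is spelled out in the spec comment further down):

# hypothetical per-resource etcd keys, one entry per pipeline/build/stage
/kontinuous/pipelines/{pipeline-uuid}
/kontinuous/builds/{pipeline-uuid}/{build-uuid}
/kontinuous/stages/{pipeline-uuid}/{build-uuid}/{stage-uuid}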

hunter commented May 27, 2016

How will users be managed in this API? Per-user pipelines (with different auth flows for different services)? Can we use the K8s user (which would probably fit the TPR model)?

hunter commented May 27, 2016

Would it make sense to use gRPC, which seems to be the trend for newer K8s Go projects (Helm, etcd)?

@darkcrux

> Break down the models from one big structure (a pipeline contains builds, which contain stages) into several entries in etcd. This would map well to TPR: the idea is that when TPR is available, we can just run kubectl get pipelines, kubectl get builds, kubectl get stages, etc. It also makes querying easier on the API side.

This relates to the controllers too. I think it would be better for the API to have separate data structures for the pipelines, builds, and stages. The API just creates/updates them; a separate controller then watches for changes, runs the builds, and calls the API to update the status.

When TPR is available, it'd be easier to just say kubectl get {pipelines,builds,stages}, etc. Not sure if kubectl create -f ... would work too.

> How will users be managed in this API? Per-user pipelines (with different auth flows for different services)? Can we use the K8s user (which would probably fit the TPR model)?

We could probably use the K8s user, or if not, make the user into another TPR? Something like k5s-users?
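
If we did model users as a TPR, an instance might look something like this (purely hypothetical; the kind, fields, and secret names are placeholders, mirroring the spec format proposed further down):

# hypothetical user resource; none of these fields are settled
kind: User
api: extension/v1
metadata:
  name: jdoe
  namespace: default
spec:
  scm-credentials:                # per-service auth secrets (assumption)
    - github: jdoe-github-creds
    - bitbucket: jdoe-bitbucket-creds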

@darkcrux

Still thinking about the users.

Should pipelines be linked to users? E.g., can 2 users accessing the same repo have separate pipelines? That would mean a repo can have several hooks, each pertaining to a pipeline.

Our current implementation doesn't have a concept of a user. All the pipelines shown are created by logged-in users (who have admin access to the repo), but they are shared across all users (even those without repo access). Everyone can run the build.

hunter commented May 27, 2016

> This relates to the controllers too. I think it would be better for the API to have separate data structures for the pipelines, builds, and stages. The API just creates/updates them; a separate controller then watches for changes, runs the builds, and calls the API to update the status.

I'm wondering whether builds are something that should live as a resource. A build is generated as part of running a pipeline, but it's never edited after it finishes. Would it be better to keep builds in their own data store (etcd or an object store)?

hunter commented May 27, 2016

> Should pipelines be linked to users? E.g., can 2 users accessing the same repo have separate pipelines? That would mean a repo can have several hooks, each pertaining to a pipeline.

Yes, I was considering that too. As a dev, can I run the same pipeline as another user to do my own testing? I would think yes.

@darkcrux

> I'm wondering whether builds are something that should live as a resource. A build is generated as part of running a pipeline, but it's never edited after it finishes. Would it be better to keep builds in their own data store (etcd or an object store)?

The builds do get edited for a short time, when a stage (or all stages) finishes, and the same goes for the stages when their job completes. I was thinking of it as similar to a Job: while it's running it spins up a pod that shows up in kubectl get pods, and afterwards the pod is hidden unless we use the --show-all flag. A build is never edited after it's done, but it still needs to be queried afterwards, or removed if no longer needed.

@darkcrux

The idea of having the builds as a resource is that the API can just edit the build's definition, marking a stage as READY for the controller to pick up and start running. It works either way, though, whether as a TPR or as entries in a different backend.

hunter commented May 27, 2016

I hadn't considered it that way, but it makes sense if there are edits going on.

darkcrux commented Jun 3, 2016

A general idea of how the new spec will look, as well as updates to the way the pipeline resources are stored in etcd (making it closer to K8s TPR).

pipeline spec

kind: Pipeline
api: extension/v1
metadata:
  name: new-pipeline
  namespace: default
  uuid: {generated}
  labels: {}
spec:
  notifs:
    - slack:
        secret: slack-creds
    - email:
        secret: email-creds
  triggers:
    - github:
        events:
          - push
          - pr
    - webhook: {}
    - quay: {}
  sources:
    - name: github-src
      scm:
        repo: github:darkcrux/obvious
        secret: repo-creds
    ...
  vars:
    test_var: testing
  stages:
    - pull:
        from: github-src
        to: src/github/
    - command:
    ...
  output:
    publish:
      to: quay.io
      secrets: quay-creds
    deploy:

note: stages can override vars and notifs.
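
As an example, a stage-level override might look like this (the exact nesting is a guess; only the override behavior itself comes from the note above):

stages:
  - command:
      vars:
        test_var: overridden          # shadows the pipeline-level test_var
      notifs:
        - slack:
            secret: other-slack-creds # placeholder secret name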

todo:

  • scm branches/tags <-- how to best describe them?
  • output.deploy <-- how to best describe it?

pipelines

Prefix: /kontinuous/pipelines/{pipeline-uuid}/{json-data}

JSON data:

key            type    description
-------------  ------  -----------------------------------------------------------
uuid           string  unique id for the pipeline (needed by the frontend)
name           string  the pipeline's friendly name
created        int     unix nano timestamp of when the pipeline was created
spec           object  YAML representation of the spec (base64 encoded)
spec_src       string  if not empty, expects a file in a source to update the spec
current_build  int     id of the current build; -1 if no builds yet
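
A concrete entry might look like this (all values are fabricated for illustration; the spec value is the base64 of the YAML spec, truncated here):

{
  "uuid": "9f8e7d6c-0000-0000-0000-000000000001",
  "name": "new-pipeline",
  "created": 1464307200000000000,
  "spec": "a2luZDogUGlwZWxpbmUK",
  "spec_src": "",
  "current_build": -1
}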

pipeline - uuid map

key: /kontinuous/pipeline-uuid/{pipeline-name}

value: the uuid for the pipeline name

note: the API should use the friendly name of the pipeline, but internally we should use the uuid. This entry is used to map the pipeline name to its uuid.
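
For example (uuid fabricated to match the sample pipeline entry above):

/kontinuous/pipeline-uuid/new-pipeline -> 9f8e7d6c-0000-0000-0000-000000000001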

builds

Prefix: /kontinuous/builds/{pipeline-uuid}/{build-uuid}/{json-data}

key                    type    description
---------------------  ------  ------------------------------------------------------------
uuid                   string  unique id for the build (needed by the frontend)
pipeline-uuid          string  unique id of the parent pipeline
number                 int     the build number
status                 string  status of the build; can be success, fail, waiting, pending
created                int     unix nano timestamp of when the build was created
started                int     unix nano timestamp of when the build started
finished               int     unix nano timestamp of when the build completed
current_stage_uuid     string  uuid of the current stage
spec                   object  the spec used for this build
sources.scm.type       string  the type of scm source (github, gitlab, bitbucket, generic)
sources.scm.clone_url  string  the clone url for the git source
sources.scm.branch     string  current branch to use for the build
sources.scm.commit     string  commit hash of the build
sources.scm.author     string  author of the build
sources.etc            ???     other source metadata?
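
A sample build entry (values fabricated; in particular, representing the dotted sources.scm.* keys as a nested JSON object is an assumption):

{
  "uuid": "b1c2d3e4-0000-0000-0000-000000000007",
  "pipeline-uuid": "9f8e7d6c-0000-0000-0000-000000000001",
  "number": 7,
  "status": "pending",
  "created": 1464307500000000000,
  "started": 0,
  "finished": 0,
  "current_stage_uuid": "",
  "spec": {},
  "sources": {
    "scm": {
      "type": "github",
      "clone_url": "https://github.com/darkcrux/obvious.git",
      "branch": "master",
      "commit": "0a1b2c3d",
      "author": "darkcrux"
    }
  }
}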

build - uuid map

key: /kontinuous/build-uuid/{pipeline-uuid}.{build-num}

value: the build uuid for the given build number

note: the API should use the build number, but internally we should use the uuid. This entry is used to map the build number to its uuid.

stages

Prefix: /kontinuous/stages/{pipeline-uuid}/{build-uuid}/{stage-uuid}/{json-data}

key            type    description
-------------  ------  ---------------------------------------------------------------------
uuid           string  unique id for the stage
pipeline-uuid  string  unique id of the parent pipeline
build-uuid     string  unique id of the parent build
status         string  pending, success, fail, waiting, skip, skipped
created        int     unix nano timestamp of when the stage was created
started        int     unix nano timestamp of when the stage was started
finished       int     unix nano timestamp of when the stage was completed
resumed        int     unix nano timestamp of when the stage was resumed (from waiting status)
skipped        int     unix nano timestamp of when the stage was skipped
spec           object  the stage spec with the templates already processed
log_path       string  path in minio to find the logs
artifact_path  string  path in minio to find the artifacts
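
A sample stage entry (values fabricated; the minio paths in particular are invented placeholders):

{
  "uuid": "51a6e9c0-0000-0000-0000-000000000002",
  "pipeline-uuid": "9f8e7d6c-0000-0000-0000-000000000001",
  "build-uuid": "b1c2d3e4-0000-0000-0000-000000000007",
  "status": "success",
  "created": 1464307500000000000,
  "started": 1464307510000000000,
  "finished": 1464307600000000000,
  "resumed": 0,
  "skipped": 0,
  "spec": {},
  "log_path": "kontinuous/9f8e7d6c/b1c2d3e4/51a6e9c0/logs",
  "artifact_path": "kontinuous/9f8e7d6c/b1c2d3e4/51a6e9c0/artifacts"
}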

stage - uuid map

key: /kontinuous/stage-uuid/{pipeline-uuid}.{build-uuid}.{stage-num}

value: the stage uuid for the given stage number

darkcrux added in progress and removed ready labels Jun 8, 2016
darkcrux self-assigned this Jun 8, 2016