
Conversation

@deads2k (Contributor) commented Feb 5, 2021

builds on #1614

This adds manifests to deploy and run sippy. This is similar to what I have in my private toy cluster, but I need to do a little bit of alignment and work out ingresses instead of routes.

/assign @wojtek-t

/hold

explicit hold until I'm able to test these locally.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/access Define who has access to what via IAM bindings, role bindings, policy, etc. wg/k8s-infra labels Feb 5, 2021
@k8s-ci-robot k8s-ci-robot requested review from dims and thockin February 5, 2021 19:05
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 5, 2021
- k8s-infra-rbac-triageparty-release@kubernetes.io
- k8s-infra-rbac-slack-infra@kubernetes.io
- k8s-infra-rbac-node-perf-dash@kubernetes.io
- k8s-infra-rbac-wg-reliability@kubernetes.io
Member:
nit: This is already merged - please rebase

wg-reliability has tools used to aid in transparency of stability metrics.

[sippy](https://github.com/openshift/sippy) summarizes multiple TestGrid dashboards, with data slicing across related jobs, job runs, and tests. Visit (placeholder for eventual URL) to see the kube instance.
Member:
We should probably request public IP and DNS now (given we already have support for running it).

I'm wondering if we want "sippy.k8s.io" or something more generic?

@@ -0,0 +1,17 @@
apiVersion: v1
kind: Service
Member:
We also want Ingress and Certificate too, as e.g. here:
https://github.com/kubernetes/k8s.io/tree/master/perf-dash.k8s.io
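
For reference, a sketch of roughly what that could look like, assuming placeholder names and the sippy.k8s.io host discussed elsewhere in this thread (the perf-dash.k8s.io directory linked here is the authoritative pattern, including the Certificate piece):

```yaml
# Hypothetical sketch only: name, namespace, host, and backend port are
# placeholders; the TLS/Certificate object would mirror perf-dash.k8s.io.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: sippy
  namespace: sippy
spec:
  rules:
  - host: sippy.k8s.io
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: sippy   # placeholder Service name
            port:
              number: 80  # placeholder port
```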

- --local-data
- /data
- --dashboard
- kube-master=sig-release-master-blocking,sig-release-master-informing=
Member:
Do we want both? Or maybe we want to start just with blocking?

Or actually, I guess it's probably moot anyway, because we probably want to create our own dashboard for this purpose...
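
If we did start with blocking only, the flag would presumably shrink to the following (a sketch assuming the same `name=dashboard1,dashboard2=` argument format visible in the diff above):

```yaml
# Sketch: blocking-only variant of the --dashboard argument shown above.
- --dashboard
- kube-master=sig-release-master-blocking=
```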

terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
command:
- /tmp/src/scripts/fetchdata-kube.sh
Member:
What is this? I didn't see that anywhere.
I think you can simply run:
./sippy --fetch-data /usr/local/google/home/wojtekt/sippy_data --dashboard=kube-master=sig-release-master-blocking=

Or I guess you might want this to run periodically already.
Do we want a container? Or do we want a CronJob that will be doing this?

Member:
Ahh - I guess with a CronJob, we may have a problem with mounting the same volume from different pods...

So we probably want this to be a container within a pod.
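
A container-within-the-pod layout along those lines might look like this sketch (image and claim names are placeholders; the args, script path, and mount come from the diff in this PR):

```yaml
# Hypothetical sketch: the fetch script runs as a second container in the
# same pod, sharing one volume with the server, so no cross-pod mount issue.
spec:
  containers:
  - name: sippy
    image: sippy-image            # placeholder
    args: ["--local-data", "/data"]
    volumeMounts:
    - mountPath: /data
      name: data
  - name: fetchdata
    image: sippy-image            # placeholder
    command: ["/tmp/src/scripts/fetchdata-kube.sh"]
    volumeMounts:
    - mountPath: /data
      name: data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: sippy-data       # placeholder
```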

- mountPath: /data
name: data
restartPolicy: Always
terminationGracePeriodSeconds: 30
Member:
Isn't 30 default?

it is

- ReadWriteOnce
resources:
requests:
storage: 900M
Member:
I guess we want to set some StorageClass, right?

Base automatically changed from master to main February 9, 2021 00:35
@deads2k deads2k changed the title [wip] add manifests for initial deployment of sippy add manifests for initial deployment of sippy Feb 22, 2021
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 22, 2021
@spiffxp (Contributor) commented Feb 25, 2021

/retest

@spiffxp (Contributor) left a comment:

I would rather see these manifests in a folder called sippy or sippy.k8s.io, in a sippy namespace to match.

If we're going down the path of assigning group ownership to apps deployed in cluster (we should! accountability is great!) I would rather see a sig: foo label, given that wg's don't own code.

I would suggest sig: architecture or sig: release to start... though long term it should be sig: testing when we have bandwidth.

@spiffxp (Contributor) commented Feb 25, 2021

/hold

@wojtek-t (Member) left a comment:

This looks fine to me (modulo missing storage class).

resources:
requests:
storage: 900M
# storageClassName: standard
Member:

@spiffxp - what storage classes do we have in the community cluster?

Member:

@wojtek-t we have 2 storage classes in the cluster:

kubectl get sc
NAME                 PROVISIONER            AGE
ssd                  kubernetes.io/gce-pd   334d
standard (default)   kubernetes.io/gce-pd   551d

Member:

Thanks!

@deads2k - standard should be good enough for us, though I would visibly increase the size, say to 10GB or so.
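
Put together, the suggested PVC change would be roughly (a sketch; the claim name is a placeholder):

```yaml
# Sketch of the suggested change: an explicit storageClassName and a
# larger request than the original 900M.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sippy-data  # placeholder name
  namespace: sippy
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: standard  # the default class per the kubectl get sc output above
  resources:
    requests:
      storage: 10Gi
```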

Member:

@deads2k - friendly ping

@spiffxp (Contributor) left a comment:

/approve
/lgtm
/hold
Remove /hold when ready to deploy

Thanks for changing the namespace. I ended up setting it up via #1844, so you'll likely need to rebase.


- email-id: k8s-infra-rbac-wg-reliability@kubernetes.io
name: k8s-infra-rbac-wg-reliability
- email-id: k8s-infra-rbac-sippy@kubernetes.io
Contributor:

You'll probably need to rebase, I did this in #1844

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Mar 30, 2021
@k8s-ci-robot k8s-ci-robot removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Mar 31, 2021
@spiffxp (Contributor) left a comment:

/lgtm

@@ -0,0 +1,17 @@
apiVersion: networking.k8s.io/v1
Contributor:

I'm not sure if aaa is at a version that supports this

Member:

good catch - the cluster is still on 1.16 (it will be upgraded to 1.17 soon, but ingress/v1 is only in 1.19), so we should still use v1beta1...
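
Concretely, the v1beta1 spelling differs from v1 in the backend fields; a sketch with placeholder names:

```yaml
# v1beta1 Ingress has no pathType and uses a flat serviceName/servicePort
# backend instead of v1's nested service.name/service.port.number block.
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: sippy  # placeholder
spec:
  rules:
  - host: sippy.k8s.io
    http:
      paths:
      - path: /
        backend:
          serviceName: sippy  # placeholder
          servicePort: 80
```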

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 31, 2021
@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, spiffxp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 1, 2021
@wojtek-t (Member) commented Apr 1, 2021

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 1, 2021
@deads2k (Contributor, Author) commented Apr 1, 2021

/hold

We will merge on Monday, when everyone is back in the office and ready for potential excitement.

@deads2k (Contributor, Author) commented Apr 8, 2021

/hold cancel

here we go!

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 8, 2021
@k8s-ci-robot k8s-ci-robot merged commit ee7f23e into kubernetes:main Apr 8, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Apr 8, 2021
@sbueringer (Member):
@deads2k for even more excitement. Is there a specific issue where I can follow the implementation?

@deads2k (Contributor, Author) commented Apr 8, 2021

> @deads2k for even more excitement. Is there a specific issue where I can follow the implementation?

sippy lives here: https://github.com/openshift/sippy/

@sbueringer (Member):
Ah sorry that was misleading. I meant more like the deployment of sippy for the kubernetes repos. I saw the DNS name sippy.k8s.io, but it's not available yet. So I assumed there might be an issue somewhere (which I can subscribe to) to stay up-to-date about the deployment implementation.

@deads2k (Contributor, Author) commented Apr 8, 2021

> Ah sorry that was misleading. I meant more like the deployment of sippy for the kubernetes repos. I saw the DNS name sippy.k8s.io, but it's not available yet. So I assumed there might be an issue somewhere (which I can subscribe to) to stay up-to-date about the deployment implementation.

As it happens, I was just starting to look now and I haven't worked out where my resources went.

deads@cloudshell:~$ kubectl --context=prod-aaa -nsippy get pod
No resources found in sippy namespace.
deads@cloudshell:~$ kubectl --context=prod-aaa -nsippy get deployment
No resources found in sippy namespace.
deads@cloudshell:~$ kubectl --context=prod-aaa -nsippy get all
Error from server (Forbidden): replicationcontrollers is forbidden: User "deads@redhat.com" cannot list resource "replicationcontrollers" in API group "" in the namespace "sippy": requires one of ["container.replicationControllers.list"] permission(s).
Error from server (Forbidden): daemonsets.apps is forbidden: User "deads@redhat.com" cannot list resource "daemonsets" in API group "apps" in the namespace "sippy": requires one of ["container.daemonSets.list"] permission(s).
deads@cloudshell:~$ kubectl --context=prod-aaa -nsippy get ingress
No resources found in sippy namespace.

@sbueringer (Member) commented Apr 8, 2021

Ah got it, thx :). I'll just follow the update here then.

@ameukam (Member) commented Apr 8, 2021

@deads2k 👋🏾, resource deployment is still a manual action in aaa. You will need to connect and deploy them yourself.

@spiffxp (Contributor) commented Apr 9, 2021

Part of #1900
