
turn down greenhouse #24247

Open
BenTheElder opened this issue Nov 4, 2021 · 15 comments
Assignees
Labels
  • kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt.
  • lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.
  • sig/testing Categorizes an issue or PR as relevant to SIG Testing.

Comments

@BenTheElder
Member

Part of kubernetes/enhancements#2420

test-infra has used RBE instead for some time now. Greenhouse was used for Kubernetes builds, but no supported branches use it anymore. Greenhouse is a pretty large deployment for test-infra (a dedicated 32-core VM and 3 TB pd-ssd per build cluster), which is hard to justify without Kubernetes using it.

Additionally, it doesn't appear to be properly auto-deployed anymore, and it isn't actively developed or needed by the project at large. We should just turn it down; its time has passed.

Any remaining projects that happen to be using it should automatically fall back to bazel without caching, and if they find that they still need a cache they can either spin up a more reasonably sized deployment with SIG K8s Infra (greenhouse is well documented, and we'll leave the sources in place for now) or use some alternate mechanism (e.g. RBE).

/assign

@BenTheElder BenTheElder added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Nov 4, 2021
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Nov 4, 2021
@BenTheElder BenTheElder added the sig/testing Categorizes an issue or PR as relevant to SIG Testing. label Nov 4, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Nov 4, 2021
@BenTheElder
Member Author

For k8s-prow-builds (the google.com build cluster) I can't find any current mechanism auto-managing anything other than the greenhouse application image, so to test turning down the instance I'm going to manually delete the Service object so that the instance is soft-turned-down. If all is well, we can fully turn it down later.

kubectl get -oyaml svc bazel-cache
apiVersion: v1
kind: Service
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"run":"bazel-cache"},"name":"bazel-cache","namespace":"default"},"spec":{"ports":[{"port":8080,"protocol":"TCP"}],"selector":{"app":"greenhouse"}}}
  creationTimestamp: "2018-02-06T01:25:58Z"
  labels:
    run: bazel-cache
  name: bazel-cache
  namespace: default
  resourceVersion: "322348273"
  selfLink: /api/v1/namespaces/default/services/bazel-cache
  uid: af6fe93d-0adc-11e8-accd-42010a80009c
spec:
  clusterIP: 10.63.246.72
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: greenhouse
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
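For reference, the routing the Service sets up can be read straight out of the `last-applied-configuration` annotation above; a quick stdlib-only sketch, embedding that JSON verbatim:

```python
import json

# last-applied-configuration annotation from the Service manifest above
last_applied = '{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"run":"bazel-cache"},"name":"bazel-cache","namespace":"default"},"spec":{"ports":[{"port":8080,"protocol":"TCP"}],"selector":{"app":"greenhouse"}}}'

svc = json.loads(last_applied)
# Deleting this Service severs bazel-cache:8080 -> pods labeled app=greenhouse,
# which is exactly the "soft turn-down" described above.
print(svc["metadata"]["name"], svc["spec"]["ports"][0]["port"], svc["spec"]["selector"])
```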

For k8s.io the configs live here: https://github.com/kubernetes/k8s.io/tree/e18c7c6377b78a5b2935bea598e4b75497b2f89c/infra/gcp/terraform/k8s-infra-prow-build/prow-build/resources/default

cc @spiffxp

@ameukam
Member

ameukam commented Nov 23, 2021

/cc

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 21, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 23, 2022
@BenTheElder
Member Author

/remove-lifecycle rotten
/assign @ameukam

@BenTheElder BenTheElder removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Mar 23, 2022
@ameukam
Member

ameukam commented Mar 23, 2022

I still see some cache.

/data/kubernetes/kubernetes,b7fa00de3a7c5dec1bb841e81f144ae9 # du -d 1 -h
2.6T    ./cas
555.1M  ./ac
2.6T    .

It's possible it's a stale cache. I'll manually clear the cache and see what happens.

@BenTheElder
Member Author

Greenhouse is an LRU cache so it will always be ~full unless it has zero usage since setup.

With a full cache we could check how recently entries were accessed.
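For illustration, the LRU behavior described here (a bounded cache where reads refresh recency and inserts evict the least-recently-used entry, so the cache stays ~full under any steady usage) can be sketched with Python's `OrderedDict`. This is a toy count-based model, not greenhouse's actual byte-budgeted on-disk implementation:

```python
from collections import OrderedDict

class LRUCache:
    """Bounded cache: reads refresh recency; inserts evict the oldest entry."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes most recently used
cache.put("c", 3)  # evicts "b", the least recently used
```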

@ameukam
Member

ameukam commented Mar 23, 2022

Greenhouse is an LRU cache so it will always be ~full unless it has zero usage since setup.

With a full cache we could check how recently entries were accessed.

Good idea!

/data/kubernetes/kubernetes,b7fa00de3a7c5dec1bb841e81f144ae9/cas # ls -lhur | tail -20
-rw-------    1 root     root         129 Mar  3 15:08 af72d6d45af49bc12c8f24988cda5bbcab607eb1acd0ee0a16017de8329de1c9
-rw-------    1 root     root      163.6M Mar  3 15:08 91f1cd8a5ec9355c6131100df2613216874926fe016331cc275e63e9d8badc47
-rw-------    1 root     root          41 Mar  3 15:08 6f9f6f97af118c28ca02aee095cfb9f5c0642990b8bc0a5a66207043f80436bb
-rw-------    1 root     root      154.5M Mar  3 15:08 54296156a8a8f5489f82dc67e220bd1414e81aeb24c51b2fbb557d0490568a73
-rw-------    1 root     root          33 Mar  3 15:08 2d5f8cdd41a08fb0227bef2feb8afdbf5c3221d2162740deddac689099411c6a
-rw-------    1 root     root      953.1M Mar  3 15:08 ff8c981b8dea03e4619cee094b5a515c949f166d5583c28dfc450330aaa242d5
-rw-------    1 root     root          33 Mar  3 15:08 ad6e165a5b018bb213d3e40d5059797922ffc119398e6eb0c8412d55dc9c0da9
-rw-------    1 root     root         129 Mar  3 15:08 a5b42259f179a25a0588cf0f157e8b69f911d746d0fbdc1e2b0419a0c5c67533
-rw-------    1 root     root          41 Mar  3 15:08 70238e0aaa38087e5f3ff632e6c3a761acd50227a0a745296d1e3282f0975860
-rw-------    1 root     root      318.9M Mar  3 15:08 59f24f968b5d6e141ff700f97ec99a4958e89a25d3ac96a66fd2d16f8138eab2
-rw-------    1 root     root          41 Mar  3 15:08 c2befdb860da789db1e8c73c0e209a70d3c9e3861e9e037dd11cc5c9dfe7c782
-rw-------    1 root     root          33 Mar  3 15:08 15c45ff20039d7024b9a7bec6f6fa0cf14981fc1643c2ff0e995f7b8271ac4a7
-rw-------    1 root     root         129 Mar  3 15:08 0b22393597be5fed56bdf53584bb177cc7eb91f9ca397ce47727c5d7b287b4b3
-rw-------    1 root     root        1.1G Mar  3 15:08 827675094d35b9b5d06bda7e8e75c1aa3a5f83b84f4b12a5bfdf149d1cba9606
-rw-------    1 root     root      453.9M Mar  3 15:08 b09b0db9eec6067069704f8606ee16488f142c2fa7cb8ceba7b0edd8054dad56
-rw-------    1 root     root           0 Mar  3 15:09 e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
-rw-------    1 root     root          33 Mar  3 15:09 e2dc208d675b1645afef7fcae438debb5d12cefb5152e01efa0c118eb1d201e2
-rw-------    1 root     root         129 Mar  3 15:09 be629e40e1a2fe0c8a22d060cb0177f53523a6958c7c366935a52658b4b2a923
-rw-------    1 root     root          41 Mar  3 15:09 483ab076e9a4a36a4184a1c8a6ec82e0bc9453b9a17e6b4dc96af9589320c9ba
-rw-------    1 root     root       60.1K Mar 23 20:58 7fffc697b4402e1f9c9cbdcc8c0cfae028c0fc2fdf7cc63eedbaaed57b075e6e
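The `ls -lhur | tail` check above sorts entries by access time. The same least-recently-accessed scan can be sketched with `os.stat`; the directory argument would be the cache's `cas` directory, and note that `st_atime` is only meaningful if the volume isn't mounted `noatime`:

```python
import os

def least_recently_accessed(directory: str, limit: int = 20):
    """Return (atime, path) pairs for the `limit` least-recently-accessed files."""
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((os.stat(path).st_atime, path))
    entries.sort()  # oldest access time first
    return entries[:limit]
```

In spirit this is `ls -lur | head`: if even the oldest entries were accessed recently, the cache is still in active use.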

@ameukam
Member

ameukam commented Apr 19, 2022

ameukam added a commit to ameukam/test-infra that referenced this issue Apr 19, 2022
Related:
  - kubernetes#24247

Disable greenhouse as a bazel cache

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
@BenTheElder
Member Author

We can use GCS as a bazel cache https://docs.bazel.build/versions/main/remote-caching.html#google-cloud-storage

We don't want to do that because garbage collection is an open question.
Greenhouse basically exists so we can do LRU eviction with a bounded size.
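To make the garbage-collection point concrete: the bounded-size LRU that greenhouse provides (and that a plain GCS bucket doesn't, without extra lifecycle tooling) amounts to evicting least-recently-accessed blobs until the store fits a byte budget. A simplified single-directory sketch, not greenhouse's actual code:

```python
import os

def evict_to_budget(directory: str, max_bytes: int) -> list:
    """Delete least-recently-accessed files until total size fits max_bytes.

    Flat-directory sketch of bounded-size LRU garbage collection; returns
    the paths that were evicted, oldest access first.
    """
    files, total = [], 0
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            st = os.stat(path)
            files.append((st.st_atime, st.st_size, path))
            total += st.st_size
    files.sort()  # least recently accessed first
    evicted = []
    for _, size, path in files:
        if total <= max_bytes:
            break
        os.remove(path)
        total -= size
        evicted.append(path)
    return evicted
```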

ameukam added a commit to ameukam/test-infra that referenced this issue May 10, 2022
Related: kubernetes#24247

Followup of: kubernetes#26021

Stop using the bazel cache for job execution.

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
ameukam added a commit to ameukam/test-infra that referenced this issue May 11, 2022
Related: kubernetes#24247

Followup of: kubernetes#26021

Stop using the bazel cache for job execution.

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
ameukam added a commit to ameukam/test-infra that referenced this issue May 12, 2022
Related: kubernetes#24247

Followup of: kubernetes#26021

Stop using the bazel cache for job execution.

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
ameukam added a commit to ameukam/test-infra that referenced this issue May 12, 2022
Related: kubernetes#24247

Followup of: kubernetes#26021

Stop using the bazel cache for job execution.

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
ameukam added a commit to ameukam/test-infra that referenced this issue May 16, 2022
Related: kubernetes#24247

Followup of: kubernetes#26021

Stop using the bazel cache for job execution.

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
ameukam added a commit to ameukam/test-infra that referenced this issue May 16, 2022
Related: kubernetes#24247

Followup of: kubernetes#26021

Stop using the bazel cache for job execution.

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
ameukam added a commit to ameukam/test-infra that referenced this issue May 17, 2022
Related: kubernetes#24247

Followup of: kubernetes#26021

Stop using the bazel cache for job execution.

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
ameukam added a commit to ameukam/test-infra that referenced this issue May 17, 2022
Related: kubernetes#24247

Followup of: kubernetes#26021

Stop using the bazel cache for job execution.

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
k8s-ci-robot pushed a commit that referenced this issue May 18, 2022
* csi: stop using the bazel remote cache

Related: #24247

Followup of: #26021

Stop using the bazel cache for job execution.

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>

* Drop usage of bash function 'use_bazel'

Stop usage of Bazel for the different tests

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>

* Manually add prowjob pull-kubernetes-csi-csi-proxy-integration

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 18, 2022
@BenTheElder
Member Author

I think we're ready to do this?
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 26, 2022
ameukam added a commit to ameukam/k8s.io that referenced this issue Aug 8, 2022
Part of:
  - kubernetes/test-infra#24247

Remove the k8s Service for greenhouse. This is done before we shut down the
workload.

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
ameukam added a commit to ameukam/k8s.io that referenced this issue Sep 20, 2022
Related:
  - kubernetes/test-infra#24247

Greenhouse has been disabled on the community infrastructure for more
than 30 days. kubernetes#4059
No issue or incident has been reported so far.

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
ameukam added a commit to ameukam/test-infra that referenced this issue Sep 23, 2022
Part of:
  - kubernetes#24247

Follow-up of:
  - kubernetes/k8s.io#4245

Stop scraping metrics from the greenhouse instance deployed in
k8s-infra.

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
ameukam added a commit to ameukam/k8s.io that referenced this issue Sep 30, 2022
Part of:
  - kubernetes/test-infra#24247

Remove nodepool dedicated to greenhouse. k8s deployments and services
have been removed.

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 24, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 23, 2022
@ameukam
Member

ameukam commented Nov 24, 2022

/remove-lifecycle rotten
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Nov 24, 2022