
question: Could you explain how you treat cluster data? #139

Closed
1 task done
juan-vg opened this issue Mar 29, 2023 · 5 comments
Labels
question Further information is requested

Comments

juan-vg commented Mar 29, 2023

Checklist:

  • I've searched for similar issues and couldn't find anything matching

Subject of the issue

I would like to know how my private cluster data is treated. I would not like to disclose any personal info to GPT/OpenAI, and I couldn't find any information about this topic in the README.

Is the collected data anonymized before it is sent to GPT? Which data is collected? Will my private and sensitive clusters be kept safe, and how? Is this tool GDPR compliant? Etc.

AlexsJones commented Mar 29, 2023

Thank you, this is an important question. Let me answer it.

k8sgpt analyze

This connects to the Kubernetes API server and only looks at Conditions and Status messages on objects.
There is no AI used in this step at all.

k8sgpt analyze --explain

This will still forward an aggregated parcel of information which may contain personally identifiable information, e.g. "error back off container/alexsjones:latest is failing!".

If you have any doubts or worries, or if you do not trust OpenAI, I would not recommend using --explain.
Although everything is sent over TLS, I cannot speak to their data retention policy beyond the document I found here.

As for k8sgpt itself, we store nothing aside from ~/.k8sgpt.yaml, which only caches the results of --explain on your local machine if you choose to use it.

I hope this helps.

AlexsJones added the question label Mar 29, 2023
juan-vg commented Mar 29, 2023

I see. Thank you for clarifying it!

Regarding the OpenAI data retention policy, it occurs to me that some kind of data anonymization could be applied before sending the data to them. It would be very useful to implement this, or to allow a plugin to do so. From my point of view it's not about trusting OpenAI or not, but about preventing them from using that sensitive data in their future training.

AlexsJones commented:
I think this could be a game changer. I know that Google has a "Data Loss Prevention API" in GCP.
I wonder if we could start simply with regex and build it out into a module?

juan-vg commented Mar 31, 2023

I've been playing with ChatGPT, providing some outputs from k8sgpt and asking for a Go script that automatically detects sensitive data and transforms it into random characters while preserving the same word/sentence structure. I based the detection on word entropy, but I realize that's not the way to go (it works better for detecting random character strings, as typically used in passwords and tokens). Could you provide examples of each message k8sgpt could output, so we can feed ChatGPT better training data for the detection?

AlexsJones commented:
Just to round out this issue, we will add a task to clarify this in the documentation.

As for examples, here are some, though none of my workloads directly expose PII in their error strings:

0 argocd/argocd-application-controller-0(StatefulSet/argocd-application-controller)                                   
- Error: back-off 5m0s restarting failed container=argocd-application-controller-instrumentation pod=argocd-application-controller-0_argocd(1748b1c2-28b2-40d9-b58e-9906220fbb0d)


1 default/deathstar-5b559d699b-smvrn(Deployment/deathstar)
- Error: Back-off pulling image "docker.io/cilium/starwaraaes"


2 foo/alpha-6c85f869-d7tbq(Deployment/alpha)
- Error: Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: container init was OOM-killed (memory limit too low?): unknown


3 observability/loki-read-0(StatefulSet/loki-read)
- Error: 0/5 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 1 node(s) had volume node affinity conflict, 3 node(s) were unschedulable. preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling..


4 observability/loki-read-1(StatefulSet/loki-read)
- Error: 0/5 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 1 node(s) had volume node affinity conflict, 3 node(s) were unschedulable. preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling..


5 observability/loki-write-0(StatefulSet/loki-write)
- Error: 0/5 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 1 node(s) had volume node affinity conflict, 3 node(s) were unschedulable. preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling..


6 observability/loki-write-1(StatefulSet/loki-write)
- Error: 0/5 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 1 node(s) had volume node affinity conflict, 3 node(s) were unschedulable. preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling..


7 observability/prometheus-observability-kube-prometh-prometheus-0(prometheus-observability-kube-prometh-prometheus)
- Error: 0/5 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 1 node(s) had volume node affinity conflict, 3 node(s) were unschedulable. preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling..


8 argocd/argocd-applicationset-controller-5c5496c549-7ppk2(Deployment/argocd-applicationset-controller)
- Error: back-off 5m0s restarting failed container=argocd-applicationset-controller-instrumentation pod=argocd-applicationset-controller-5c5496c549-7ppk2_argocd(993f24eb-f345-4199-a0c1-8cced885716d)


9 argocd/argocd-dex-server-5fcdf867b7-mv496(Deployment/argocd-dex-server)
- Error: back-off 5m0s restarting failed container=dex-instrumentation pod=argocd-dex-server-5fcdf867b7-mv496_argocd(32e8f49e-88f0-4e54-8408-ea3162292f7d)


10 argocd/argocd-repo-server-5785865bd8-pgtc7(Deployment/argocd-repo-server)
- Error: back-off 5m0s restarting failed container=argocd-repo-server-instrumentation pod=argocd-repo-server-5785865bd8-pgtc7_argocd(66078850-2252-4230-97aa-c22fe98dff8d)


11 argocd/argocd-server-59677d6f74-n2p5m(Deployment/argocd-server)
- Error: back-off 5m0s restarting failed container=argocd-server-instrumentation pod=argocd-server-59677d6f74-n2p5m_argocd(060722da-5f30-41f9-9895-8c375abe2034)


12 default/deathstar(deathstar)
- Error: Service has not ready endpoints, pods: [Pod/deathstar-5b559d699b-smvrn], expected 1


13 argocd/argocd-applicationset-controller(argocd-applicationset-controller)
- Error: Service has not ready endpoints, pods: [Pod/argocd-applicationset-controller-5c5496c549-7ppk2], expected 1


14 argocd/argocd-server(argocd-server)
- Error: Service has not ready endpoints, pods: [Pod/argocd-server-59677d6f74-n2p5m], expected 1


15 cats/cats(cats)
- Error: Service has no endpoints, expected label app.kubernetes.io/instance=cats
- Error: Service has no endpoints, expected label app.kubernetes.io/name=cats


16 argocd/argocd-server-metrics(argocd-server-metrics)
- Error: Service has not ready endpoints, pods: [Pod/argocd-server-59677d6f74-n2p5m], expected 1


17 khole-system/khole-controller-manager-metrics-service(khole-controller-manager-metrics-service)
- Error: Service has no endpoints, expected label control-plane=controller-manager


18 metallb-system/metallb-webhook-service(metallb-webhook-service)
- Error: Service has no endpoints, expected label app.kubernetes.io/component=controller
- Error: Service has no endpoints, expected label app.kubernetes.io/instance=metallb
- Error: Service has no endpoints, expected label app.kubernetes.io/name=metallb


19 observability/observability-kube-prometh-prometheus(observability-kube-prometh-prometheus)
- Error: Service has no endpoints, expected label app.kubernetes.io/name=prometheus
- Error: Service has no endpoints, expected label prometheus=observability-kube-prometh-prometheus


20 observability/prometheus-operated(prometheus-operated)
- Error: Service has no endpoints, expected label app.kubernetes.io/name=prometheus


21 argocd/argocd-dex-server(argocd-dex-server)
- Error: Service has not ready endpoints, pods: [Pod/argocd-dex-server-5fcdf867b7-mv496], expected 1


22 argocd/argocd-metrics(argocd-metrics)
- Error: Service has not ready endpoints, pods: [Pod/argocd-application-controller-0], expected 1


23 argocd/argocd-repo-server(argocd-repo-server)
- Error: Service has not ready endpoints, pods: [Pod/argocd-repo-server-5785865bd8-pgtc7], expected 1

fyuan1316 pushed a commit to fyuan1316/k8sgpt that referenced this issue Jun 26, 2023
Signed-off-by: Aris Boutselis <arisboutselis08@gmail.com>
Co-authored-by: Aris Boutselis <arisboutselis08@gmail.com>