New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Get logs for all pipelines, even if they're not running #9749
Get logs for all pipelines, even if they're not running #9749
Conversation
298d6b5
to
d70c737
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #9749 +/- ##
==========================================
+ Coverage 59.10% 59.16% +0.05%
==========================================
Files 582 582
Lines 69880 69904 +24
==========================================
+ Hits 41304 41358 +54
+ Misses 27983 27964 -19
+ Partials 593 582 -11 ☔ View full report in Codecov by Sentry. |
func (s *debugServer) listApps(ctx context.Context) (_ []*debug.App, retErr error) { | ||
// list apps returns a list of running apps, and a list of apps which may possibly exist. The | ||
// intent is to use the first result for things like "kubectl describe" and the second for getting | ||
// loki logs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be refactored? Might it be clearer to merge podApps
(which would be the old listApps
and pipelineApps
(which would generate all apps for all known pipelines), rather than in GetDumpV2Template
listing pipelines, passing that into listApps
, then in listApps
iterating over running pods, deleting a pipeline pod’s associated pod from the list of pipelines, then creating an app for each remaining pipeline?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should really be implemented in some completely different way; running pods should only affect logs, loki should be queried to enumerate all logs that loki can get, etc. But this is a release blocker because everyone is using autoscaling now and all the debug dumps that people are generating are useless, so I'm trying to keep it as minimal as possible for the backport.
d70c737
to
c78845c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The current algorithm for determining which pods to get logs for involves looking at running pods. This is not correct for pipelines; a pipeline could be autoscaled to 0. This PR adjusts the algorithm to add in any pipelines that don't currently have pods to the lokiLogs request. For example: ``` $ pachctl debug template ... describes: - name: console - name: default/edges pipeline: name: edges project: default - name: etcd - name: pachd - name: pachyderm-kube-event-tail - name: pachyderm-proxy - name: pg-bouncer logs: - name: console - name: default/edges pipeline: name: edges project: default - name: etcd - name: pachd - name: pachyderm-kube-event-tail - name: pachyderm-proxy - name: pg-bouncer lokiLogs: - name: console - name: default/edges pipeline: name: edges project: default - name: default/montage pipeline: name: montage project: default - name: etcd - name: pachd - name: pachyderm-kube-event-tail - name: pachyderm-proxy - name: pg-bouncer ... ``` Montage has autoscaling and isn't currently up; it now appears in lokiLogs, but not logs or describes.
… (#9750) The current algorithm for determining which pods to get logs for involves looking at running pods. This is not correct for pipelines; a pipeline could be autoscaled to 0. This PR adjusts the algorithm to add in any pipelines that don't currently have pods to the lokiLogs request. For example: ``` $ pachctl debug template ... describes: - name: console - name: default/edges pipeline: name: edges project: default - name: etcd - name: pachd - name: pachyderm-kube-event-tail - name: pachyderm-proxy - name: pg-bouncer logs: - name: console - name: default/edges pipeline: name: edges project: default - name: etcd - name: pachd - name: pachyderm-kube-event-tail - name: pachyderm-proxy - name: pg-bouncer lokiLogs: - name: console - name: default/edges pipeline: name: edges project: default - name: default/montage pipeline: name: montage project: default - name: etcd - name: pachd - name: pachyderm-kube-event-tail - name: pachyderm-proxy - name: pg-bouncer ... ``` Montage has autoscaling and isn't currently up; it now appears in lokiLogs, but not logs or describes.
The current algorithm for determining which pods to get logs for involves looking at running pods. This is not correct for pipelines; a pipeline could be autoscaled to 0. This PR adjusts the algorithm to add in any pipelines that don't currently have pods to the lokiLogs request. For example: ``` $ pachctl debug template ... describes: - name: console - name: default/edges pipeline: name: edges project: default - name: etcd - name: pachd - name: pachyderm-kube-event-tail - name: pachyderm-proxy - name: pg-bouncer logs: - name: console - name: default/edges pipeline: name: edges project: default - name: etcd - name: pachd - name: pachyderm-kube-event-tail - name: pachyderm-proxy - name: pg-bouncer lokiLogs: - name: console - name: default/edges pipeline: name: edges project: default - name: default/montage pipeline: name: montage project: default - name: etcd - name: pachd - name: pachyderm-kube-event-tail - name: pachyderm-proxy - name: pg-bouncer ... ``` Montage has autoscaling and isn't currently up; it now appears in lokiLogs, but not logs or describes.
The current algorithm for determining which pods to get logs for involves looking at running pods. This is not correct for pipelines; a pipeline could be autoscaled to 0. This PR adjusts the algorithm to add in any pipelines that don't currently have pods to the lokiLogs request. For example:
Montage has autoscaling and isn't currently up; it now appears in lokiLogs, but not logs or describes.