[release 4.6] Bug 1953582: GatherClusterOperators and GatherClusterOperatorsPodAndEvents (#413)

* Bug 1951034: Split up the GatherClusterOperators into smaller parts (#397)

* Split operators gather

This splits the cluster operators gather into two gathers, one for operator resources and the other for unhealthy operators.

* Adds precommit target

This adds the precommit target to the makefile, so we can run it to test the stashed changes before actually committing.

* Move CompactedEvent

This moves the structure that was previously defined on operators to operators_unhealthy.

* Change operators name

This changes the previous operator_unhealty to operator_pods_and_events.

* Update gathered documentation

This updates the gatherer documentation for operators and operators_pods_and_events.

* Skip Test_UnhealtyOperators_FetchPodContainerLog

This skips the Test_UnhealtyOperators_FetchPodContainerLog test and leaves a note in its place.
rluders committed Jun 11, 2021
1 parent 85d0184 commit 69f2168
Showing 8 changed files with 977 additions and 251 deletions.
15 changes: 15 additions & 0 deletions docs/gathered-data.md
@@ -117,10 +117,25 @@ Location in archive: config/oauth/
See: docs/insights-archive-sample/config/oauth


## ClusterOperatorPodsAndEvents

Collects the degraded Pods that belong to degraded ClusterOperators or that live in a ClusterOperator's namespace, to gather:

- Pod definitions
- Previous and current Pod Container logs (when available)
- Namespace Events

* Location of pods in archive: config/pod/
* Location of events in archive: events/
* Id in config: operators_pods_and_events
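
The selection step this gatherer depends on can be sketched as follows. This is only an illustration under stated assumptions, not the code from this commit: `condition` and `clusterOperator` are hypothetical stand-ins for the OpenShift config types, and the sketch assumes an operator counts as degraded when it reports a condition of type `Degraded` with status `True`.

```go
package main

import "fmt"

// condition and clusterOperator are hypothetical, simplified stand-ins
// for the real OpenShift API types; only the fields used here are kept.
type condition struct {
	Type   string
	Status string
}

type clusterOperator struct {
	Name       string
	Conditions []condition
}

// isDegraded reports whether the operator has a Degraded condition set to True.
// Assumption: this is the rule used to decide which operators need extra data.
func isDegraded(op clusterOperator) bool {
	for _, c := range op.Conditions {
		if c.Type == "Degraded" && c.Status == "True" {
			return true
		}
	}
	return false
}

// degradedOperators returns the names of all degraded operators, in input order.
func degradedOperators(ops []clusterOperator) []string {
	var names []string
	for _, op := range ops {
		if isDegraded(op) {
			names = append(names, op.Name)
		}
	}
	return names
}

func main() {
	ops := []clusterOperator{
		{Name: "dns", Conditions: []condition{{Type: "Degraded", Status: "False"}}},
		{Name: "ingress", Conditions: []condition{{Type: "Degraded", Status: "True"}}},
	}
	fmt.Println(degradedOperators(ops))
}
```

Only the operators returned by such a filter would then have their Pods, container logs and namespace Events collected.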


## ClusterOperators

collects all ClusterOperators and their resource.
It finds unhealthy Pods for unhealthy operators
GatherClusterOperators collects all the ClusterOperators definitions and their resources.

The Kubernetes api https://github.com/openshift/client-go/blob/master/config/clientset/versioned/typed/config/v1/clusteroperator.go#L62
Response see https://docs.openshift.com/container-platform/4.3/rest_api/index.html#clusteroperatorlist-v1config-openshift-io
1 change: 1 addition & 0 deletions pkg/gather/clusterconfig/0_gatherer.go
@@ -37,6 +37,7 @@ func (g *Gatherer) Gather(ctx context.Context, recorder record.Interface) error
	GatherPodDisruptionBudgets(g),
	GatherMostRecentMetrics(g),
	GatherClusterOperators(g),
	GatherClusterOperatorPodsAndEvents(g),
	GatherContainerImages(g),
	GatherNodes(g),
	GatherConfigMaps(g),
61 changes: 0 additions & 61 deletions pkg/gather/clusterconfig/0_utils.go
@@ -15,7 +15,6 @@ import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/serializer"
	"k8s.io/apimachinery/pkg/util/json"
	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
	kubescheme "k8s.io/client-go/kubernetes/scheme"

@@ -168,32 +167,6 @@ func (a Anonymizer) GetExtension() string {
	return "json"
}

// CompactedEvent holds one Namespace Event
type CompactedEvent struct {
	Namespace     string    `json:"namespace"`
	LastTimestamp time.Time `json:"lastTimestamp"`
	Reason        string    `json:"reason"`
	Message       string    `json:"message"`
}

// CompactedEventList is a collection of events
type CompactedEventList struct {
	Items []CompactedEvent `json:"items"`
}

// EventAnonymizer implements serialization of Events with anonymization
type EventAnonymizer struct{ *CompactedEventList }

// Marshal serializes Events with anonymization
func (a EventAnonymizer) Marshal(_ context.Context) ([]byte, error) {
	return json.Marshal(a.CompactedEventList)
}

// GetExtension returns extension for anonymized event objects
func (a EventAnonymizer) GetExtension() string {
	return "json"
}

func anonymizeURLCSV(s string) string {
	strs := strings.Split(s, ",")
	outSlice := anonymizeURLSlice(strs)
@@ -243,40 +216,6 @@ func anonymizePod(pod *corev1.Pod) *corev1.Pod {
	return pod
}

func isHealthyPod(pod *corev1.Pod, now time.Time) bool {
	// pending pods may be unable to schedule or start due to failures, and the info they provide in status is important
	// for identifying why scheduling has not happened
	if pod.Status.Phase == corev1.PodPending {
		if now.Sub(pod.CreationTimestamp.Time) > 2*time.Minute {
			return false
		}
	}
	// pods that have containers that have terminated with non-zero exit codes are considered failure
	for _, status := range pod.Status.InitContainerStatuses {
		if status.LastTerminationState.Terminated != nil && status.LastTerminationState.Terminated.ExitCode != 0 {
			return false
		}
		if status.State.Terminated != nil && status.State.Terminated.ExitCode != 0 {
			return false
		}
		if status.RestartCount > 0 {
			return false
		}
	}
	for _, status := range pod.Status.ContainerStatuses {
		if status.LastTerminationState.Terminated != nil && status.LastTerminationState.Terminated.ExitCode != 0 {
			return false
		}
		if status.State.Terminated != nil && status.State.Terminated.ExitCode != 0 {
			return false
		}
		if status.RestartCount > 0 {
			return false
		}
	}
	return true
}

// MinimalNodeInfo contains the most essential information about a node
type MinimalNodeInfo struct {
	ProviderID string `json:"providerID"`
