Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test pods logs collected - Automate BZ1838973 #134

Merged

Conversation

psimovec
Copy link
Member

@psimovec psimovec commented Jul 7, 2020

This is my first contribution in golang, any suggestions are welcome!

automates https://bugzilla.redhat.com/show_bug.cgi?id=1838973

Changes to previous code:
upgraded checkPodsLogs to accept regex instead of a string / added possibility to check only new logs... I don't know if the solution I have done is alright, I just did not want to change old function's parameters..

Added:
ExecCmd - execute bash command on a pod
TestPodLogsCollected test case

(In recent PR #124 , waiting time for logs was reduced from 30 mins to 5 mins... But I wait for a fresh archive, and it takes 6 minutes. Is it okay to set it to 10 minutes?) solved

(I will squash commits after review if everything is ok)

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jul 7, 2020
@openshift-ci-robot
Copy link
Contributor

Hi @psimovec. Thanks for your PR.

I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 7, 2020
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 9, 2020
@psimovec psimovec changed the title [WIP] Test pods logs collected - Automate BZ1838973 [RFR] Test pods logs collected - Automate BZ1838973 Jul 9, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 9, 2020
@psimovec psimovec changed the title [RFR] Test pods logs collected - Automate BZ1838973 Test pods logs collected - Automate BZ1838973 Jul 9, 2020
@@ -79,6 +84,16 @@ func clusterOperatorInsights() *configv1.ClusterOperator {
return operator
}


func clusterOperatorMonitoring() *configv1.ClusterOperator {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to merge it with clusterOperatorInsights and create one function that can check any operator

Copy link
Member Author

@psimovec psimovec Jul 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have extracted function that can check any operator, it looks cleaner, but kept the functions, so the strings of operator names don't have to be used in code, but keeping the functions might be useless (for now I at least don't touch other tests)

Extracted the function, didn't delete clusterOperatorInsights yet, it will be done in follow-up cleanup PR

@@ -169,19 +184,24 @@ func restartInsightsOperator(t *testing.T) {
t.Log(errPod)
}

func checkPodsLogs(t *testing.T, kubeClient *kubernetes.Clientset, message string) {
func _checkPodsLogs(t *testing.T, kubeClient *kubernetes.Clientset, message string, newLogs bool) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how underscore function name is used in go, @martinkunc ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, go devs are a bit stricter about naming. Please use camelCase for private and PascalCase for public methods. Underscores should be only in file names, not in variables. See here: https://golang.org/doc/effective_go.html#mixed-caps

Copy link
Member Author

@psimovec psimovec Jul 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By this naming, it looks like most of the main_test.go functions should also be renamed - most functions are in camelCase but are used as public ones.
Edit: I will do this in follow-up PR with changing panic functions to t.Fatal

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. And regarding the public functions. Public in Go means that they are accessible from other packages, which is also making a contract with other packages that they will keep working, which is why we tend not to use them if we don't have to.
In case of tests, I think we are in same packge, so we don't need to expose functions for someone else, except for Test... functions, executed by test runner.

newPods, err := kubeClient.CoreV1().Pods("openshift-insights").List(metav1.ListOptions{})
if err != nil {
t.Fatal(err.Error())
}

tajm := metav1.NewTime(time.Now())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tajm - is it supposed to be time? :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh :D the last variable that made my life happier, forgot to rename it. It was time, but it's already taken, unfortunately :D

@martinkunc
Copy link
Contributor

/ok-to-test

@openshift-ci-robot openshift-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 10, 2020
// get info about insights cluster operator
operator, err := configClient.ClusterOperators().Get("monitoring", metav1.GetOptions{})
if err != nil {
panic(err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think generally its better not to use panic in tests. would it be possible to add t *testing.T and use t.Fatal(err) instead ? The reason is to distinguish valid test "failure" from panic when something really unexepected happens.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be also done in follow-up cleanup PR, so I don't touch other test functions in this PR

@@ -236,6 +256,100 @@ func forceUpdateSecret(ns string, secretName string, secret *v1.Secret) error {
return nil
}

func checkPodsLogs(t *testing.T, kubeClient *kubernetes.Clientset, message string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go doesn't have optional parameters, but we are using Functional Options pattern, see here: https://stackoverflow.com/a/26326418
I guess it would be usefull in this case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice pattern! I cheated it a little bit with ...bool for one optional parameter only, I think it is way simpler than using options struct with one bool value, which gets set up by function. If you don't like this, I can change it to Functional Options pattern

// wait for fresh report
checkNewPodsLogs(t, kubeClient, "Writing \\d+ records to")
insightsPod := findPod(t, kubeClient, "openshift-insights", "insights-operator")
hasLatestArchiveLogs := "tar tf $(ls -dtr /var/lib/insights-operator/* | tail -1)|grep -c \"\\.log$\""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, but complicated :) Good work.
I am thinking, what if we would try to be more specific with matching just for log our our degraded operator ? Also if you use golang backtick multiline string literal, you woudlnt need to escape, something like:

hasLatestArchiveLogs := `
tar tf $(ls -dtr /var/lib/insights-operator/* | tail -1)|grep -c "^config/pod/cluster-monitoring-operator/logs/.*\.log$"
`

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to know about multi-line string, thanks!
And the specific matching makes more sense.

return nil
}

func PodsLogsExist(t *testing.T, kubeClient *kubernetes.Clientset) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I was not exacly sure what this method does just by its name. Pod logs is quite specific, but its not related actually. What about calling it something like ArchiveContainsPodLogs, or anything else :)
Coudln't you also please take out code to degrate in separate function, maybe degradeMonitoring, also this would be both neede to be reverted in the end. For that I would call it alrady from TestPodLogsCollected.
You could use return defer function, something like:

func action() func() {
	fmt.Println("doing action")
	return func() {
		fmt.Println("reverting action")
	}
}

func main() {
	defer action()()

	fmt.Println("main")
}

@martinkunc
Copy link
Contributor

Ahh, also it looks e2e tests failed on the build machine. I am guessing that you have used a package, which is not yet in vendor folder in github. When you bring a new package, you need to commit the content of vendor, go.mod and go.sum as well. Please be Please run

make vendor

to keep it clean.

@psimovec psimovec changed the title Test pods logs collected - Automate BZ1838973 [WIP] Test pods logs collected - Automate BZ1838973 Jul 14, 2020
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 14, 2020
@psimovec psimovec changed the title [WIP] Test pods logs collected - Automate BZ1838973 Test pods logs collected - Automate BZ1838973 Jul 14, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 14, 2020
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 15, 2020
@martinkunc
Copy link
Contributor

/test e2e-aws

@martinkunc
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 15, 2020
added logs check

check only fresh logs

little refactor

the test now reverts its changes

new packages, content of vendor

force update when changing report time interval to speed it up

delete pod to update settings

optional parameter

deleted useless comment
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jul 15, 2020
@martinkunc
Copy link
Contributor

/lgtm
/approve

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 15, 2020
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: martinkunc, psimovec

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 1c80079 into openshift:master Jul 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants