Unschedule logs_from_installation_system #16503

Merged
merged 1 commit into from Mar 2, 2023

Conversation

Member

@ge0r ge0r commented Mar 1, 2023

Due to poo#122608, test module logs_from_installation_system occasionally fails. This is because the exit code of the first command we run in the serial console is sometimes not captured, so the command times out.
To avoid this, we unschedule logs_from_installation_system where this issue occurs.

@ge0r ge0r added the qe-yam label Mar 1, 2023
@ge0r ge0r added the WIP Work in progress label Mar 1, 2023
Contributor

jknphy commented Mar 1, 2023

This YAML seems to be used in 7 test suites (https://gitlab.suse.de/coolgw/wegao-test/-/blob/master/JobGroups/migration_regression.yaml), and we don't want to stop collecting logs in all of them.
Btw, one failure mentioned in the progress ticket is not in the regression group.

Member Author

ge0r commented Mar 1, 2023

> This YAML seems to be used in 7 test suites (https://gitlab.suse.de/coolgw/wegao-test/-/blob/master/JobGroups/migration_regression.yaml), and we don't want to stop collecting logs in all of them. Btw, one failure mentioned in the progress ticket is not in the regression group.

I think it fails (potentially) on all of the 12-SPx flavor ones, right?
Also, the failure not addressed here is left out because it is related to an MR; I have now added it to the original ticket.

Contributor

jknphy commented Mar 1, 2023

> This YAML seems to be used in 7 test suites (https://gitlab.suse.de/coolgw/wegao-test/-/blob/master/JobGroups/migration_regression.yaml), and we don't want to stop collecting logs in all of them. Btw, one failure mentioned in the progress ticket is not in the regression group.
>
> I think it fails (potentially) on all of the 12-SPx flavor ones, right? Also, the failure not addressed here is left out because it is related to an MR; I have now added it to the original ticket.

The three failing jobs are in these flavors:

Migration-from-SLE12-SPx
Regression-on-Migration-from-SLE12-SPx
Migration-from-SLE12-SPx-Milestone

But I don't think it is a good idea to remove log collection for test suites that are not failing. It is pretty handy when a new bug appears and we need to file it quickly without retrieving the logs manually. IMO acting here on 'potentially' might not be beneficial. Even a new setting, as a last option, could do...

Have you checked NOLOGS? It looks like it is what we need.

Member Author

ge0r commented Mar 1, 2023

> This YAML seems to be used in 7 test suites (https://gitlab.suse.de/coolgw/wegao-test/-/blob/master/JobGroups/migration_regression.yaml), and we don't want to stop collecting logs in all of them. Btw, one failure mentioned in the progress ticket is not in the regression group.
>
> I think it fails (potentially) on all of the 12-SPx flavor ones, right? Also, the failure not addressed here is left out because it is related to an MR; I have now added it to the original ticket.
>
> The three failing jobs are in these flavors:
>
> Migration-from-SLE12-SPx
> Regression-on-Migration-from-SLE12-SPx
> Migration-from-SLE12-SPx-Milestone
>
> But I don't think it is a good idea to remove log collection for test suites that are not failing. It is pretty handy when a new bug appears and we need to file it quickly without retrieving the logs manually. IMO acting here on 'potentially' might not be beneficial. Even a new setting, as a last option, could do...
>
> Have you checked NOLOGS? It looks like it is what we need.

The jobs in Migration-from-SLE12-SPx and Migration-from-SLE12-SPx-Milestone are not affected by this change. They are related to this MR.
The jobs in Regression-on-Migration-from-SLE12-SPx are addressed by this PR.
The problem is not whether we collect logs or not: we have to avoid moving to the serial console during that 12-SPx install, because the first command we type there has a high probability of timing out when its return code is not captured.
There is no way to know beforehand whether a potentially unstable testsuite will fail, so in my opinion the safest approach is to unschedule the module there.
I am not unscheduling it from any jobs that do not suffer from this issue.

Member Author

ge0r commented Mar 1, 2023

It could be that I just do not understand your proposal. What do you suggest exactly?

Contributor

jknphy commented Mar 2, 2023

Thanks for the explanation. I merged the MR; it is not a REMOTE CONTROLLER scenario, but there is a comment, so OK.
For regression, my doubt is that in this file I can find things like this:

s390x:
    sle-15-SP5-Regression-on-Migration-from-SLE12-SPx-s390x:
    - offline_sles12sp4_ltss_media_sdk-asmm-contm-lgm-tcm-wsm-pcm_all_full:
        ...
        YAML_SCHEDULE: 'schedule/migration/s390x_regression_test_offline.yaml'
    - offline_sles12sp4_ltss_pscc_sdk-asmm-contm-lgm-tcm-wsm-pcm_all_full:
        ...
        YAML_SCHEDULE: 'schedule/migration/s390x_regression_test_offline.yaml'

So with this change we will unschedule logs from more than one test suite, or am I missing something?

Contributor

jknphy commented Mar 2, 2023

Btw, now that we are focusing on test modules (ruling out entering it, thanks to your investigation), there was a simple way to do it for this one and the other 2 merged:
EXCLUDE_MODULES. That setting should work both in main.pm and with a YAML schedule.
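
A minimal sketch of how that might look in a job group definition, reusing the test suite name and schedule path from the example above. The `settings:` key and the value format (module name without path or `.pm` suffix) are assumptions here, not something confirmed in this thread:

```yaml
# Hypothetical job group fragment, not the actual change from any MR.
s390x:
    sle-15-SP5-Regression-on-Migration-from-SLE12-SPx-s390x:
    - offline_sles12sp4_ltss_media_sdk-asmm-contm-lgm-tcm-wsm-pcm_all_full:
        settings:
            # Assumption: comma-separated module names to skip at schedule time
            EXCLUDE_MODULES: 'logs_from_installation_system'
            YAML_SCHEDULE: 'schedule/migration/s390x_regression_test_offline.yaml'
```

Done this way, the shared schedule file stays untouched and only the test suites that carry the setting skip the module.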

Member Author

ge0r commented Mar 2, 2023

> Btw, now that we are focusing on test modules (ruling out entering it, thanks to your investigation), there was a simple way to do it for this one and the other 2 merged: EXCLUDE_MODULES. That setting should work both in main.pm and with a YAML schedule.

True, I could have used EXCLUDE_MODULES for the MR, I did not think of it.

Member Author

ge0r commented Mar 2, 2023

> Thanks for the explanation. I merged the MR; it is not a REMOTE CONTROLLER scenario, but there is a comment, so OK. For regression, my doubt is that in this file I can find things like this:
>
> s390x:
>     sle-15-SP5-Regression-on-Migration-from-SLE12-SPx-s390x:
>     - offline_sles12sp4_ltss_media_sdk-asmm-contm-lgm-tcm-wsm-pcm_all_full:
>         ...
>         YAML_SCHEDULE: 'schedule/migration/s390x_regression_test_offline.yaml'
>     - offline_sles12sp4_ltss_pscc_sdk-asmm-contm-lgm-tcm-wsm-pcm_all_full:
>         ...
>         YAML_SCHEDULE: 'schedule/migration/s390x_regression_test_offline.yaml'
>
> So with this change we will unschedule logs from more than one test suite, or am I missing something?

The point of this change is to prevent all s390x testsuites with flavor Regression-on-Migration-from-SLE12-SPx from running the logs_from_installation_system module, since (as far as I understand) they will fail. I do not target specific jobs here. If you think an approach where I target individual testsuites is preferable, I can do that via an MR instead of what I am doing here.

Contributor

jknphy commented Mar 2, 2023

> Regression-on-Migration-from-SLE12-SPx

It is perfectly fine; I missed the info that it was failing in more than one test suite for that flavor. Thx.

@jknphy jknphy merged commit 90d76e7 into os-autoinst:master Mar 2, 2023