Enable running install_ltp and ltp_run in the same job #8329

Merged (2 commits) on Oct 31, 2019

Contributor

@mdoucha mdoucha commented Aug 30, 2019

This is a convenience patch for LTP testcase debugging. When both INSTALL_LTP and LTP_COMMAND_FILE are specified in the same job, run the testcase immediately after install_ltp and VM reboot.

Currently, running an LTP testcase from fresh sources requires two separate jobs: one to install LTP from Git and publish the HDD image, and a second to actually run the testcase. This is perfectly fine for regression tests, but it's too cumbersome for quick one-off debug runs on special hardware.

Contributor

pevik commented Aug 30, 2019

@mdoucha thanks for handling this. FYI, my unfinished PR #7812

Contributor

pevik commented Aug 30, 2019

I haven't tested my PR well, but I expect chicken & egg problems, because main_ltp.pm needs the runtest files, which are only available after installation. If it works (i.e. it doesn't have this problem), it'd be a better solution for IPMI. It'd be good to check whether /opt/ltp exists and install LTP only when it does not.
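The "install only when /opt/ltp is missing" idea above can be sketched roughly as follows. This is only an illustration in Python, not code from the test distribution; the function name `needs_ltp_install` is hypothetical, and the only detail taken from the discussion is the `/opt/ltp` path.

```python
from pathlib import Path


def needs_ltp_install(ltp_root: str = "/opt/ltp") -> bool:
    """Return True when no LTP installation directory exists at ltp_root.

    Hypothetical helper: the real check would run inside the system under
    test, not on the machine driving the job.
    """
    return not Path(ltp_root).is_dir()


if __name__ == "__main__":
    # Install only when the directory is absent.
    if needs_ltp_install():
        print("LTP not found, installation would be scheduled")
    else:
        print("LTP already present, installation could be skipped")
```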

Contributor

pevik commented Aug 30, 2019

@mdoucha changed the title from "Enable running ltp_install and ltp_run in the same job" to "[WIP] Enable running ltp_install and ltp_run in the same job" on Aug 30, 2019
Contributor Author

mdoucha commented Aug 30, 2019

I haven't tested my PR well, but I expect chicken & egg problems, because main_ltp.pm needs the runtest files, which are only available after installation. If it works (i.e. it doesn't have this problem), it'd be a better solution for IPMI. It'd be good to check whether /opt/ltp exists and install LTP only when it does not.

Hmm, I didn't notice that loadtest_from_runtest_file depends on extra files uploaded by install_ltp (I did a few debug runs the hard way so the runtest file was already uploaded when I started writing this patch). I'll have to dig deeper and try to delay the runtest file loading until after install_ltp finishes (if possible).

FYI: https://progress.opensuse.org/issues/53948

Unfortunately, I don't have access to this issue (error 403).

@mdoucha changed the title from "[WIP] Enable running ltp_install and ltp_run in the same job" to "[WIP] Enable running install_ltp and ltp_run in the same job" on Aug 30, 2019
@mdoucha changed the title from "[WIP] Enable running install_ltp and ltp_run in the same job" to "Enable running install_ltp and ltp_run in the same job" on Sep 3, 2019
Contributor Author

mdoucha commented Sep 3, 2019

I've solved the chicken & egg problem by scheduling testcases at the end of install_ltp if both INSTALL_LTP and LTP_COMMAND_FILE are set. But this feature now depends on os-autoinst/openQA#2302

If you run install and tests in the same job on OpenQA without the above PR applied, the testcases will execute just fine but the results will not show up in the web UI.
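The scheduling decision described in this comment can be summarized in a small sketch. It is written in Python purely for illustration and is not the actual Perl implementation; the function name and the returned labels are hypothetical, and only the variable names INSTALL_LTP and LTP_COMMAND_FILE come from the discussion.

```python
from typing import Mapping, Optional


def runtest_load_point(settings: Mapping[str, str]) -> Optional[str]:
    """Decide when the runtest file can be turned into scheduled testcases.

    Returns one of three hypothetical labels:
      "after_install" - both variables set: the runtest file exists only once
                        installation has finished, so tests are scheduled at
                        the end of the install step.
      "at_job_start"  - only LTP_COMMAND_FILE set: LTP is assumed to be
                        installed already, so tests can be scheduled up front.
      None            - nothing to run.
    """
    install = bool(settings.get("INSTALL_LTP"))
    run = bool(settings.get("LTP_COMMAND_FILE"))
    if install and run:
        return "after_install"
    if run:
        return "at_job_start"
    return None


if __name__ == "__main__":
    # The values "repo" and "math" are assumed example settings.
    print(runtest_load_point({"INSTALL_LTP": "repo", "LTP_COMMAND_FILE": "math"}))
    print(runtest_load_point({"LTP_COMMAND_FILE": "math"}))
```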

Contributor

pevik commented Sep 4, 2019

@mdoucha nice :).
verification run: http://quasar.suse.cz/tests/3635

And yes, ajax or page redirection would be great when autotest::write_test_order(); returns something.
@Martchus is there any way to do it?

Contributor

pevik commented Sep 4, 2019

@mdoucha please keep [WIP] or something in the subject so nobody merges it before openQA functionality in os-autoinst/openQA#2302 is merged and deployed on both osd and o3.

Contributor

@pevik pevik left a comment

Elegant solution, it just needs to wait for the os-autoinst/openQA#2302 deployment.

Contributor

pevik commented Sep 4, 2019

@richiejp FYI

@mdoucha changed the title from "Enable running install_ltp and ltp_run in the same job" to "[WIP] Enable running install_ltp and ltp_run in the same job" on Sep 4, 2019
Contributor Author

mdoucha commented Sep 5, 2019

One more dependency: os-autoinst/os-autoinst#1208

autotest::loadtest() will now call write_test_order() for you.

Contributor Author

mdoucha commented Sep 26, 2019

There were some conflicting changes in install_ltp so I've rebased to latest master commit, fixed a bug where loadtest_from_runtest_file() was looking for freshly uploaded runtest files in the wrong directory and consolidated all fixes into a single commit.

Contributor

pevik commented Sep 27, 2019

OK, we're waiting for os-autoinst/openQA#2302 and os-autoinst/os-autoinst#1208 to be released and deployed :).

@mdoucha changed the title from "[WIP] Enable running install_ltp and ltp_run in the same job" to "Enable running install_ltp and ltp_run in the same job" on Oct 2, 2019
Contributor Author

mdoucha commented Oct 2, 2019

Dependencies have been deployed, this PR is ready to be merged.

Member

baierjan commented Oct 3, 2019

Verification runs maybe?

Contributor

pevik commented Oct 3, 2019

I did one back when the PR was created, but I agree a new one wouldn't hurt.

Contributor Author

mdoucha commented Oct 3, 2019

Verification runs maybe?

Now that the dependencies have been deployed, I can finally create one: https://openqa.suse.de/tests/3435241 (install_ltp + ltp_math)

Member

baierjan commented Oct 3, 2019

Verification runs maybe?

Now that the dependencies have been deployed, I can finally create one: https://openqa.suse.de/tests/3435241 (install_ltp + ltp_math)

I see only install_ltp, is that correct?

Contributor Author

mdoucha commented Oct 3, 2019

I see only install_ltp, is that correct?

No, that's not correct. It appears that os-autoinst wasn't deployed together with openQA...

Member

okurz commented Oct 3, 2019

That last statement is incorrect. As can be seen from https://openqa.suse.de/tests/3435241/file/autoinst-log.txt, the version of os-autoinst running is "4.5.1569914332.5870aebd", corresponding to https://github.com/os-autoinst/os-autoinst/commits/5870aebd, which includes os-autoinst/os-autoinst#1208. So the only logical explanation I have at this time is that the feature does not work as expected for other reasons.
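For readers who want to repeat this check, here is a minimal Python sketch of splitting such a version string into its parts. It assumes the layout is `<major>.<minor>.<build number>.<short commit hash>` and that the third field is a Unix timestamp; neither is confirmed in this thread beyond the commit hash matching. The names `AutoinstVersion` and `parse_autoinst_version` are hypothetical.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class AutoinstVersion(NamedTuple):
    base: str       # e.g. "4.5"
    timestamp: int  # third field, assumed to be seconds since the epoch
    commit: str     # short Git commit hash


def parse_autoinst_version(version: str) -> AutoinstVersion:
    """Split a version string of the assumed four-field form."""
    major, minor, stamp, commit = version.split(".")
    return AutoinstVersion(f"{major}.{minor}", int(stamp), commit)


if __name__ == "__main__":
    v = parse_autoinst_version("4.5.1569914332.5870aebd")
    print(v.commit)
    # Only meaningful if the timestamp assumption holds.
    print(datetime.fromtimestamp(v.timestamp, tz=timezone.utc).isoformat())
```

The commit field can then be compared against the commit list of the repository to see whether a given change is included.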

Contributor Author

mdoucha commented Oct 4, 2019

that last statement is incorrect.

*Facepalm* You're right. Now I know where the problem is. I've created the verification run from an install job and didn't remove the PUBLISH_HDD_1/PUBLISH_PFLASH_VARS variables. These somehow block the usual result upload loop if the test is too short (which it is here) and OpenQA won't notice that test_order.json has changed.

This verification run will work: https://openqa.suse.de/tests/3439736

Contributor Author

mdoucha commented Oct 4, 2019

This verification run will work: https://openqa.suse.de/tests/3439736

...or at least it would have worked if I set the right boot image... One more try: https://openqa.suse.de/tests/3439742

Member

okurz commented Oct 5, 2019

I've created the verification run from an install job and didn't remove the PUBLISH_HDD_1/PUBLISH_PFLASH_VARS variables. These somehow block the usual result upload loop if the test is too short (which it is here) and OpenQA won't notice that test_order.json has changed.

This actually sounds like the known issue https://progress.opensuse.org/issues/39845, which is being worked on.

Contributor

pevik commented Oct 30, 2019

@mdoucha @okurz @cfconrad @perlpunk Dependency os-autoinst/openQA#2327 (poo#39845) has been deployed. Although it is not displayed while the test is running (poo#58826), we might want to merge this beforehand. Or do we want to wait until poo#39845 is solved?

Anyway, verification runs for some LTP jobs (cloned jobs with added INSTALL_LTP=repo):
http://quasar.suse.cz/tests/3770
http://quasar.suse.cz/tests/3769
http://quasar.suse.cz/tests/3771

Contributor Author

mdoucha commented Oct 30, 2019

@mdoucha @okurz @cfconrad @perlpunk Dependency os-autoinst/openQA#2327 (poo#39845) has been deployed. Although it is not displayed while the test is running (poo#58826), we might want to merge this beforehand. Or do we want to wait until poo#39845 is solved?

The real problem is that the dependencies are not deployed. Period. I can see the expected log entries only in jobs executed on openqaworker2, openqaworker10, all ppc64le workers and all aarch64 workers.

I've started 2 more verification runs.
This job will finish with LTP tests missing from schedule: https://openqa.suse.de/tests/3541697
This job will have complete schedule because it's forced to run on openqaworker2: https://openqa.suse.de/tests/3541705

Contributor

pevik commented Oct 30, 2019

Good point, let's wait.

Contributor Author

mdoucha commented Oct 30, 2019

And the results are in - exactly as predicted in my previous comment. So my statement from almost a month ago is correct after all:

No, that's not correct. It appears that os-autoinst wasn't deployed together with openQA...

os-autoinst on openqaworker6 may be up to date but it's apparently talking to OpenQA API which is more than a month old. That's why the test schedule doesn't get updated - because the code that is supposed to reload test_order.json and write the new test schedule into database still hasn't been deployed there.

Member

okurz commented Oct 31, 2019

So what exactly are you waiting for? Or who should do what? As a side-note: you can check the version of os-autoinst in the log file of every job and the version of the openQA webui in the webui itself, including the changelog

Contributor

pevik commented Oct 31, 2019

@okurz IMHO os-autoinst being installed on workers (and maybe openQA as well).

Contributor

pevik commented Oct 31, 2019

As a side-note: you can check the version of os-autoinst in the log file of every job and the version of the openQA webui in the webui itself, including the changelog

I know that, but it'd be nice to have an internal site showing the os-autoinst and openQA* package versions of all workers. That'd help a lot to see the actual deployment.

Contributor Author

mdoucha commented Oct 31, 2019

So what exactly are you waiting for? Or who should do what? As a side-note: you can check the version of os-autoinst in the log file of every job and the version of the openQA webui in the webui itself, including the changelog

The OpenQA webUI version is irrelevant because the worker is not talking to it in the first place. The os-autoinst process on the worker is talking to OpenQA API on the worker.

Contributor Author

mdoucha commented Oct 31, 2019

So what exactly are you waiting for? Or who should do what?

To answer your question, you need to update the openQA-common package on the following workers:

  • openqaw1
  • openqaw2 (if you plan to bring openqaw1 & 2 back online sometime in the future)
  • openqaworker3
  • openqaworker5
  • openqaworker6
  • openqaworker7
  • openqaworker8
  • openqaworker9
  • openqaworker13
  • grenache-1

Then we can merge this.

Member

okurz commented Oct 31, 2019

@okurz IMHO os-autoinst being installed on workers (and maybe openQA as well).

@pevik yes, but this is done already. We have a recent version installed – unless you can show me a machine where we really missed this.

As a side-note: you can check the version of os-autoinst in the log file of every job and the version of the openQA webui in the webui itself, including the changelog

I know that, but it'd be nice to have an internal site showing the os-autoinst and openQA* package versions of all workers. That'd help a lot to see the actual deployment.

Well, on top of the log files that show the versions, you can also check https://openqa.opensuse.org/admin/workers (or the corresponding page on every other server), which shows the "os-autoinst API version" and "Websocket Api version"; these should be the relevant numbers. It can happen that we forget to update these numbers in pull requests, but anyone can propose this with a simple code change. For o3 there is an automatic nightly deployment, so you can assume that by default the current git master is also running on the o3 webUI and workers.

So what exactly are you waiting for? Or who should do what?

To answer your question, you need to update the openQA-common package on the following workers:

* openqaw1

[…]

Then we can merge this.

Let's discuss these OSD specifics internally.

@okurz okurz merged commit dbab0b1 into os-autoinst:master Oct 31, 2019
Member

okurz commented Oct 31, 2019

I have merged now after all services have been restarted to ensure the new code is used.

@mdoucha mdoucha deleted the ltp_quickrun branch October 31, 2019 15:26
@baierjan baierjan removed the notready label Nov 1, 2019