Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run a basic test scenario on o3 as part of CI checks #18182

Merged
merged 1 commit into from
Jan 10, 2024

Conversation

Martchus
Copy link
Contributor

@Martchus Martchus commented Nov 22, 2023

@Martchus
Copy link
Contributor Author

I'm not sure whether we can see anything here just yet because this runs on:pull_request_target and not on:pull_request (as explained in the comment that is part of the diff itself). So I'm afraid we cannot really fully test whether this will work before merging. I'll try to schedule a scenario manually.

@Martchus
Copy link
Contributor Author

Martchus commented Nov 22, 2023

After a few adjustments I was able to conduct a successful test run using the senario definitions and commands in this PR: https://openqa.opensuse.org/tests/3748177

I basically invoked (which is what this GitHub action will do):

build=$(curl -s ${OPENQA_HOST:-https://openqa.opensuse.org}/group_overview/${OPENQA_GROUP_ID:-1}.json | jq -r '([ .build_results[] | select(.tag.description=="published") | select(.version=="Tumbleweed") | .build ] | sort | reverse)[]' | head -n1)
openqa-cli schedule \
  --monitor \
  --host "${OPENQA_HOST:-https://openqa.opensuse.org}/" \
  --param-file SCENARIO_DEFINITIONS_YAML=scenario-definitions.yaml \
  DISTRI=openQA VERSION=Tumbleweed FLAVOR=dev ARCH=x86_64 \
  HDD_1=opensuse-Tumbleweed-x86_64-$build-textmode@64bit.qcow2 \
  UEFI_PFLASH_VARS=opensuse-Tumbleweed-x86_64-$build-textmode@64bit-uefi-vars.qcow2 \
  BUILD="Martchus/os-autoinst-distri-opensuse.git#fcbbfae1d069ef031b59710dd4d1fef7ef98702e" _GROUP_ID="0" \
  CASEDIR="https://github.com/Martchus/os-autoinst-distri-opensuse.git#fcbbfae1d069ef031b59710dd4d1fef7ef98702e"

The test run used the test code from Git:

[2023-11-22T11:35:25.459032Z] [debug] [pid:26556] Fetching 'fcbbfae1d069ef031b59710dd4d1fef7ef98702e' from origin manually

Needles are still coming from the default location.

Copy link
Member

@foursixnine foursixnine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Martchus/@okurz security concerns aside, this is looking naisss (I wonder though, why the check is not showing up, I guess it was never ran??)

Partayyyyyy

.github/workflows/openqa.yml Outdated Show resolved Hide resolved
.github/workflows/openqa.yml Show resolved Hide resolved
scenario-definitions.yaml Show resolved Hide resolved
.github/workflows/openqa.yml Outdated Show resolved Hide resolved
scenario-definitions.yaml Show resolved Hide resolved
scenario-definitions.yaml Outdated Show resolved Hide resolved
.github/workflows/openqa.yml Outdated Show resolved Hide resolved
@Martchus
Copy link
Contributor Author

@foursixnine

I wonder though, why the check is not showing up, I guess it was never ran??

As explained

… this runs on:pull_request_target and not on:pull_request (as explained in the comment that is part of the diff itself). So I'm afraid we cannot really fully test whether this will work before merging.

@foursixnine
Copy link
Member

foursixnine commented Nov 24, 2023

@foursixnine

I wonder though, why the check is not showing up, I guess it was never ran??

As explained

… this runs on:pull_request_target and not on:pull_request (as explained in the comment that is part of the diff itself). So I'm afraid we cannot really fully test whether this will work before merging.

Lets use then a separate branch, the same way we have it now for Trufflehog, on a target branch that isn't master, or address the items in #18182 (comment) (which if I understood @okurz correctly, are still planned to be addressed)

https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/f99df70fc4702425fc55668a06d45bc639bf5056/.github/workflows/trufflehog.yml#L4C19-L4C19

otherwise, this would (at least at the moment) add extra time to pass all PR checks in its current state.

@okurz
Copy link
Member

okurz commented Nov 24, 2023

@foursixnine I would suggest to just accept the changes as is as we have the same running fine for multiple months already in os-autoinst-distri-openQA as well as os-autoinst-distri-example. I would only plan the other points after we collect some feedback from production and see which points users really care about

@foursixnine
Copy link
Member

foursixnine commented Nov 24, 2023

@foursixnine I would suggest to just accept the changes as is as we have the same running fine for multiple months already in os-autoinst-distri-openQA as well as os-autoinst-distri-example.

I would accept the changes if the points I have mentioned have been addressed

I would only plan the other points after we collect some feedback from production and see which points users really care about

If you decide to enable it on a separate target, I can ask directly to QE-Core to help provide feedback, in terms of accepting the changes, as they are now, I'd like to collect some approvals from PO's and also from @DimStar77 and @lkocman, as they will be the ones who will be impacted, even if the priority of these PRs are set to low.

But that doesn't stop you from planning the other points, as they directly correlate with my comment above

I understand you wanted to have a timeboxed activity, and there the path of least resistance is moving it to its own branch, but I'm not willing to allow to spam o3 multiple times a day, even if the test is short-lived, they will be a lot and we know how bad it could get, taking from the learning experience (extrapolate data here) of Maintenance Incidents being scheduled on OSD.

and the average of 14 minutes of wait here: https://github.com/os-autoinst/os-autoinst-distri-openQA/actions/workflows/openqa.yml

Will only grow, when you look - https://github.com/os-autoinst/os-autoinst-distri-opensuse/actions/workflows/ci.yml?query=is%3Acompleted + verification runs scheduled in different places

So my advice is already stated #18182 (comment)

But I won't block a merge if others (as mentioned above) say it will be fine for them.

Another idea is to use one or two worker instances to do all the work, following up on this other comment.

@okurz
Copy link
Member

okurz commented Nov 26, 2023

@foursixnine I would suggest to just accept the changes as is as we have the same running fine for multiple months already in os-autoinst-distri-openQA as well as os-autoinst-distri-example.

I would accept the changes if the points I have mentioned have been addressed

The points you mention also include a smart lookup of which test code changed and constructing a dynamic schedule. That IMHO is hard and would take considerable effort to the point of we would not follow up with that right now from the side of the tools team. I would like to bring in a minimal proof of concept that additionally has (limited) value and then encourage anyone also working actively with this repo to come up with improvement ideas.

I would only plan the other points after we collect some feedback from production and see which points users really care about

If you decide to enable it on a separate target, I can ask directly to QE-Core to help provide feedback, in terms of accepting the changes, as they are now, I'd like to collect some approvals from PO's and also from @DimStar77 and @lkocman, as they will be the ones who will be impacted, even if the priority of these PRs are set to low.

What do you mean with "separate target"?

I understand you wanted to have a timeboxed activity, and there the path of least resistance is moving it to its own branch, but I'm not willing to allow to spam o3 multiple times a day, even if the test is short-lived, they will be a lot and we know how bad it could get, taking from the learning experience (extrapolate data here) of Maintenance Incidents being scheduled on OSD.

My expectation is that there would be no significant load increase on o3 given that we have a much higher capacity nowadays.

But I won't block a merge if others (as mentioned above) say it will be fine for them.

Another idea is to use one or two worker instances to do all the work, following up on this other comment.

Instead we could just set a higher prio value

@Martchus
Copy link
Contributor Author

I'm not sure what the suggestion to do this according to https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/f99df70fc4702425fc55668a06d45bc639bf5056/.github/workflows/trufflehog.yml#L4C19-L4C19 means. I would leave that PR for now as-is because I'm not sure what to improve. We might also close the related ticket https://progress.opensuse.org/issues/150992 as resolved soon as it was just a spike solution ticket anyways.

@foursixnine
Copy link
Member

@foursixnine

I wonder though, why the check is not showing up, I guess it was never ran??

As explained

… this runs on:pull_request_target and not on:pull_request (as explained in the comment that is part of the diff itself). So I'm afraid we cannot really fully test whether this will work before merging.

Lets use then a separate branch, the same way we have it now for Trufflehog, on a target branch that isn't master, or address the items in #18182 (comment) (which if I understood @okurz correctly, are still planned to be addressed)

https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/f99df70fc4702425fc55668a06d45bc639bf5056/.github/workflows/trufflehog.yml#L4C19-L4C19

otherwise, this would (at least at the moment) add extra time to pass all PR checks in its current state.

@okurz this is the comment about the separate branch, the follow ups can be done there, before disrupting test developers cc @Martchus

@Martchus
Copy link
Contributor Author

What follow-ups are we planning to do?

@foursixnine
Copy link
Member

What follow-ups are we planning to do?

I would still prefer if the changes are applied in the same way as truffle-hog is, but I won't block a merge

@okurz
Copy link
Member

okurz commented Dec 15, 2023

@okurz this is the comment about the separate branch, the follow ups can be done there, before disrupting test developers cc @Martchus

I don't understand the idea behind a separate branch. The idea is to provide CI checks on pull requests so that actual openQA tests are conducted for every pull request same as we already have in https://github.com/os-autoinst/os-autoinst-distri-openQA/

@foursixnine
Copy link
Member

foursixnine commented Dec 15, 2023

@okurz this is the comment about the separate branch, the follow ups can be done there, before disrupting test developers cc @Martchus

I don't understand the idea behind a separate branch. The idea is to provide CI checks on pull requests so that actual openQA tests are conducted for every pull request same as we already have in https://github.com/os-autoinst/os-autoinst-distri-openQA/

I would still prefer if the changes are applied in the same way as truffle-hog is, but I won't block a merge

In fact, I would appreciate if the tools team could work on a branch that isn't master for this case, but as I mentioned yesterday to you, I won't block a merge of this anymore.

@foursixnine
Copy link
Member

@okurz this is the comment about the separate branch, the follow ups can be done there, before disrupting test developers cc @Martchus

I don't understand the idea behind a separate branch.

The Idea behind a separate branch, is so that reviewers aren't stuck waiting until a CI job that doesn't have anything to do with their code changes (unless by out of luck, happens to be what this PR's schedule is).

The idea is to provide CI checks on pull requests so that actual openQA tests are conducted for every pull request same as we already have in https://github.com/os-autoinst/os-autoinst-distri-openQA/

If you know it works there already, then take the steps to move the ball further. You already did the "Hello World", go ahead and implement the thing well enough from the beginning, instead of rewritting the hello world again, this isn't rocket science.

PS: Before you ask again: "What does this mean for this PR?", find your answer below:

@okurz this is the comment about the separate branch, the follow ups can be done there, before disrupting test developers cc @Martchus

I don't understand the idea behind a separate branch. The idea is to provide CI checks on pull requests so that actual openQA tests are conducted for every pull request same as we already have in https://github.com/os-autoinst/os-autoinst-distri-openQA/

I would still prefer if the changes are applied in the same way as truffle-hog is, but I won't block a merge

In fact, I would appreciate if the tools team could work on a branch that isn't master for this case, but as I mentioned yesterday to you, I won't block a merge of this anymore.

@kalikiana kalikiana dismissed foursixnine’s stale review January 8, 2024 11:34

Clarified in comments

@kalikiana kalikiana merged commit eadeb4d into os-autoinst:master Jan 10, 2024
8 checks passed
@Martchus Martchus deleted the ci branch January 10, 2024 12:07
@Martchus
Copy link
Contributor Author

Just for the record: It looks like this check works, e.g. "Run a basic openQA test / trigger_and_monitor_openqa (pull_request_target)" passes on #18422 and the logs look like it actually conducted the test run (https://github.com/os-autoinst/os-autoinst-distri-opensuse/actions/runs/7477357160/job/20349838534?pr=18422, https://openqa.opensuse.org/tests/3861047).

@foursixnine
Copy link
Member

Just for the record: It looks like this check works, e.g. "Run a basic openQA test / trigger_and_monitor_openqa (pull_request_target)" passes on #18422 and the logs look like it actually conducted the test run (https://github.com/os-autoinst/os-autoinst-distri-opensuse/actions/runs/7477357160/job/20349838534?pr=18422, https://openqa.opensuse.org/tests/3861047).

But you already knew that from the experience with the openQA in openQA repo, this pr only enables that for this one.

@okurz
Copy link
Member

okurz commented Jan 12, 2024

But you already knew that from the experience with the openQA in openQA repo, this pr only enables that for this one.

Correct. And now we would know if any other future PR break the code so much that systems don't even boot anymore. And additionally in every pull request reviewers can ask to extend those tests accordingly where fitting

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants