test-configs.yaml: Enable additional preempt-rt tests #2397

Closed
wants to merge 4 commits into from

igaw
Contributor

@igaw igaw commented Feb 19, 2024

While cyclictest is the classic preempt-rt test, cyclicdeadline and the rtla tests are also interesting to run.

Since these are all smoke tests, there is no point in running them for long, so reduce the runtime per test to one minute. This should keep the total preempt-rt runtime roughly the same as before.

@jan-kiszka

@patersonc @aliceinwire, can CIP help out on this?

@broonie
Member

broonie commented Feb 19, 2024

I've created a PR to add @igaw to the list of people whose PRs get automatically included in staging tests: kernelci/kernelci-deploy#132

@igaw
Contributor Author

igaw commented Feb 20, 2024

Figured out how to resolve all the dependencies to run make test locally.

Also, I think it would make sense to figure out whether we could run a bunch of different workloads. hackbench mostly stresses the scheduler; in my lab, for example, I run stress-ng as the workload. But first I am trying to figure out the basics of how to configure kernelci.

@igaw
Contributor Author

igaw commented Feb 20, 2024

Another thing I would like to look into is an artifact storage service like

https://archive.validation.linaro.org/

so that the rt-tests data can be analyzed after the run in more detail.

@nuclearcat nuclearcat added the staging-skip Don't test automatically on staging.kernelci.org label Feb 20, 2024
@nuclearcat
Member

Please let me know when the PR is ready for testing on staging, as it will require some adjustments for the staging patches.

@igaw igaw force-pushed the preempt-rt branch 2 times, most recently from fde5e66 to 1ecc7a2 Compare February 20, 2024 09:36
@igaw
Contributor Author

igaw commented Feb 20, 2024

@nuclearcat I am ready. I fixed all reports from make test and it should at least pass this level of checks.

@igaw igaw marked this pull request as ready for review February 20, 2024 09:37
@nuclearcat
Member

Thanks, I will do a staging run in a few hours today (I need to test a few more pending PRs as well).

@nuclearcat nuclearcat removed the staging-skip Don't test automatically on staging.kernelci.org label Feb 21, 2024
@nuclearcat
Member

I am sorry for the delay. The last time I tried to run this on staging, it seemed that something specific to the CIP instance was missing there. (I tried just by adding one of the trees to the monitor.)
@aliceinwire do you have experience with running CIP-specific tests?
I might try again this week.

@igaw
Contributor Author

igaw commented Mar 13, 2024

FWIW, this is not just for the CIP test plan; it should be enabled for the rt-stable test plan too.

Do you have some logs I can look into?

@nuclearcat
Member

One of the errors I spotted:


Traceback (most recent call last):
  File "/usr/local/bin/kci_test", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kernelci/scripts/kci_test.py", line 409, in main
    status = opts.command(configs, opts)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kernelci/scripts/kci_test.py", line 346, in __call__
    job = runtime.generate(
          ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kernelci/legacy/lava/__init__.py", line 84, in generate
    template = jinja2_env.get_template(short_template_file)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/jinja2/environment.py", line 1010, in get_template
    return self._load_template(name, globals)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/jinja2/environment.py", line 969, in _load_template
    template = self.loader.load(self, name, self.make_globals(globals))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/jinja2/loaders.py", line 125, in load
    source, filename, uptodate = self.get_source(environment, name)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/jinja2/loaders.py", line 204, in get_source
    raise TemplateNotFound(template)
jinja2.exceptions.TemplateNotFound: preempt-rt-rtla-timerlat/generic-uboot-tftp-ramdisk-preempt-rt-rtla-timerlat-template.jinja2
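
The template name in this error has the shape `<plan>/<prefix>-<plan>-template.jinja2`. A minimal sketch of a pre-flight check for such files follows, assuming that naming convention holds for every test plan. The convention is inferred only from the template paths quoted in this thread, and `expected_template` and `missing_templates` are hypothetical helpers, not part of kernelci.

```python
from pathlib import Path

# Assumption: the prefix seen in the traceback; other boot methods would differ.
DEFAULT_PREFIX = "generic-uboot-tftp-ramdisk"


def expected_template(plan, prefix=DEFAULT_PREFIX):
    """Build the relative template name in the shape seen in the traceback."""
    return f"{plan}/{prefix}-{plan}-template.jinja2"


def missing_templates(template_root, plans, prefix=DEFAULT_PREFIX):
    """Return the expected template names that do not exist under template_root."""
    root = Path(template_root)
    names = (expected_template(plan, prefix) for plan in plans)
    return [name for name in names if not (root / name).is_file()]
```

Running `missing_templates` over the list of enabled plans before generating jobs would list absent templates up front instead of failing on the first one.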

@igaw
Contributor Author

igaw commented Mar 13, 2024

Okay, I'll look into it.

@igaw
Contributor Author

igaw commented Mar 20, 2024

I'm feeling a bit stupid. make check complains all over the place. I really thought I had tested my submission. Well, I'm cleaning up the mess now, restructuring some more while I am at it, and making the preempt-rt tests a bit more flexible to configure. This will be useful in the future.

Use a template for preempt-rt cyclictest. This is a preparation step to
enable more tests from the rt-test suite.

Signed-off-by: Daniel Wagner <wagi@monom.org>
@igaw
Contributor Author

igaw commented Mar 20, 2024

Changes:

  • rebased
  • fixed all the warnings reported by 'make check'
  • made it possible to provide different parameters for each of the rt-tests

Because some of the rt-tests are CPU/board dependent, the last point is kind of important. I haven't looked into the different board configurations available; I just assumed that the current defaults work fine, that is, that every board tested has at least 2 CPUs.

@broonie
Member

broonie commented Mar 21, 2024

We do currently schedule the preempt-rt tests on Beaglebone Black which is single processor, mainly just because there's lots of boards available in my lab so we can easily schedule tests there. I think it'd be fine to just not use that board so long as we've got something else providing the coverage, or to set up a nosmp variant which we select to run on single processor systems.

@igaw
Contributor Author

igaw commented Mar 21, 2024

I think it makes sense to keep the Beaglebone Black in the loop. I am also using this board locally for testing, which is good for reproducing results.

I haven't figured out yet how per-board-type configurations are expressed. I'll stare at the code a bit and try to figure out what is there or how to do it.

@igaw
Contributor Author

igaw commented Mar 21, 2024

Alternatively, I could look into extending the test suite so it automatically selects all available CPUs. In the end we are talking about cyclictest, cyclicdeadline and signaltest. The rest of the tests don't have any additional requirements. I think this makes more sense.
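
To illustrate the idea of selecting the thread count from the running system, here is a rough sketch. `available_cpus` and `pick_thread_count` are hypothetical names, and how the resulting number would be passed on to the individual rt-tests is deliberately left out.

```python
import os


def available_cpus():
    """Number of CPUs this process may run on (falls back to the total count)."""
    try:
        return len(os.sched_getaffinity(0))  # not available on every platform
    except AttributeError:
        return os.cpu_count() or 1


def pick_thread_count(requested=None, available=None):
    """Use all available CPUs, or clamp a requested count to what the board has."""
    if available is None:
        available = available_cpus()
    if requested is None:
        return available
    return max(1, min(requested, available))
```

With the current hard-coded value of two threads, `pick_thread_count(2, available=1)` gives 1 on a single-processor board, while `pick_thread_count(available=4)` gives 4.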

@broonie
Member

broonie commented Mar 21, 2024

There's not really per board configurations at the minute as far as I'm aware, modulo the tests reading things from the running system and self configuring.

@broonie
Member

broonie commented Mar 21, 2024

Support for automatic detection is probably going to be the easiest thing to get working here, yeah.

@igaw
Contributor Author

igaw commented Mar 21, 2024

BTW, I haven't changed the defaults in this PR. cyclictest was and is still hard coded to use two threads.

Thus we don't have to wait for my PR for test-definition to land first. I'll update the kernelci configuration as soon as my test-definition is available in kernelci.

@nuclearcat
Member

After trying to generate tests:

+ kci_test generate --install-path=/data/workspace/bot.staging.kernelci.org/test-runner/workspace/test-runner/artifacts --runtime-json=/data/workspace/bot.staging.kernelci.org/test-runner/workspace/test-runner/artifacts/lab-broonie.json --storage=http://storage.staging.kernelci.org/ --runtime-config=lab-broonie --user=kernel-ci --runtime-token=**** --output=/data/workspace/bot.staging.kernelci.org/test-runner/workspace/test-runner/jobs/lab-broonie --callback-id=kernel-ci-callback-staging --callback-url=https://api.staging.kernelci.org/
Traceback (most recent call last):
  File "/usr/local/bin/kci_test", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kernelci/scripts/kci_test.py", line 409, in main
    status = opts.command(configs, opts)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kernelci/scripts/kci_test.py", line 346, in __call__
    job = runtime.generate(
          ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kernelci/legacy/lava/__init__.py", line 85, in generate
    data = template.render(params)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/jinja2/environment.py", line 1301, in render
    self.environment.handle_exception()
  File "/usr/local/lib/python3.11/site-packages/jinja2/environment.py", line 936, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "/etc/kernelci/lava/preempt-rt/generic-uboot-tftp-ramdisk-preempt-rt-template.jinja2", line 1, in top-level template code
    {% extends 'boot/generic-uboot-tftp-ramdisk-boot-template.jinja2' %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/etc/kernelci/lava/boot/generic-uboot-tftp-ramdisk-boot-template.jinja2", line 1, in top-level template code
    {% extends 'base/kernel-ci-base-tftp-deploy.jinja2' %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/etc/kernelci/lava/base/kernel-ci-base-tftp-deploy.jinja2", line 1, in top-level template code
    {% extends 'base/kernel-ci-base.jinja2' %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/etc/kernelci/lava/base/kernel-ci-base.jinja2", line 79, in top-level template code
    {% block actions %}{% endblock %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/etc/kernelci/lava/preempt-rt/generic-uboot-tftp-ramdisk-preempt-rt-template.jinja2", line 5, in block 'actions'
    {% include 'preempt-rt/preempt-rt.jinja2' %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/jinja2/environment.py", line 936, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "/etc/kernelci/lava/preempt-rt/preempt-rt.jinja2", line 22, in template
    path: automated/linux/{{ tst_group | tst_cmd }}/{{ tst_cmd }}.yaml
^^^^^^^^^^^^^^^^^^^^^^^
jinja2.exceptions.TemplateAssertionError: No filter named 'tst_cmd'.
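
In Jinja2, the name after `|` inside an expression is looked up as a filter, so `{{ tst_group | tst_cmd }}` asks for a filter called `tst_cmd` rather than joining two variables. What the template was meant to express is not stated in this thread. As a rough, stdlib-only sketch (a regex scan, not a real Jinja2 parse, so it can be fooled by `|` inside string literals), the names used in filter position can be listed like this; `filter_names` and `unknown_filters` are hypothetical helpers.

```python
import re

# Match the inside of {{ ... }} expressions and the identifier after each pipe.
EXPRESSION = re.compile(r"\{\{(.*?)\}\}", re.S)
FILTER_NAME = re.compile(r"\|\s*([A-Za-z_][A-Za-z0-9_]*)")


def filter_names(source):
    """Return every name used in filter position, in order of appearance."""
    names = []
    for expression in EXPRESSION.findall(source):
        names.extend(FILTER_NAME.findall(expression))
    return names


def unknown_filters(source, known):
    """Return the filter names in source that are not in the known set."""
    return sorted(set(filter_names(source)) - set(known))


line = "path: automated/linux/{{ tst_group | tst_cmd }}/{{ tst_cmd }}.yaml"
print(unknown_filters(line, known={"default", "lower"}))
```

The `known` set here is only an example; a real check would take the filter names from the Jinja2 environment in use.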

musamaanjum pushed a commit to musamaanjum/kernelci-core that referenced this pull request Jul 5, 2024
Add a preempt-rt template to be used by the new KernelCI. Add
preempt-rt.jinja from Daniel's GitHub PR [1].

Allow to configure all parameters because these are board specific,
e.g. how many CPUs are available. Thus, it doesn't make sense to hard
code them. Furthermore, the various rt-tests (cyclictest, cyclicdeadline
and pmqtest etc) do have some common parameters but also unique ones.

[1] kernelci#2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
musamaanjum pushed a commit to musamaanjum/kernelci-pipeline that referenced this pull request Jul 5, 2024
Enable the first preempt-rt test, cyclictest in new KernelCI. Enable it
on all platforms.

Since these are all smoke test there is no point in running them too
long. Thus reduce the runtime per test to one minute. This should keep
the total preempt-rt runtime roughly in the same time frame.

The changes have been ported from Daniel's PR [1].

[1] kernelci/kernelci-core#2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
musamaanjum pushed a commit to musamaanjum/kernelci-pipeline that referenced this pull request Jul 5, 2024
Add jobs definition of all the rt-tests. Enable cyclicdeadline and rtla
tests to run on all targets.

The changes have been ported from Daniel's PR [1].

[1] kernelci/kernelci-core#2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
@nuclearcat
Member

As referenced in this PR, we are creating -RT testing support in the new system, called Maestro. It is probably complete by now; feel free to open a PR or an issue in the kernelci-pipeline repository if you want to send any updates.
The legacy system is not being updated, so I am closing this PR.

@nuclearcat nuclearcat closed this Jul 16, 2024
@igaw
Contributor Author

igaw commented Jul 16, 2024

Is the new dashboard already online, to see if the RT builds/tests are running?

@igaw
Contributor Author

igaw commented Jul 16, 2024

Never mind, found the docs for it: https://docs.kernelci.org/maestro/api/early-access/

@nuclearcat
Member

We also have internal tools to view results:
Staging: https://staging.kernelci.org:9000/viewer
Production: https://kernelci-api.westus3.cloudapp.azure.com/viewer
I will also try to merge the remaining -rt related patches today, if possible. Meanwhile, I will reopen the PR to keep the discussion going.

@nuclearcat nuclearcat reopened this Jul 16, 2024
github-merge-queue bot pushed a commit to kernelci/kernelci-pipeline that referenced this pull request Jul 16, 2024
Enable the first preempt-rt test, cyclictest in new KernelCI. Enable it
on all platforms.

Since these are all smoke test there is no point in running them too
long. Thus reduce the runtime per test to one minute. This should keep
the total preempt-rt runtime roughly in the same time frame.

The changes have been ported from Daniel's PR [1].

[1] kernelci/kernelci-core#2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
github-merge-queue bot pushed a commit to kernelci/kernelci-pipeline that referenced this pull request Jul 16, 2024
Add jobs definition of all the rt-tests. Enable cyclicdeadline and rtla
tests to run on all targets.

The changes have been ported from Daniel's PR [1].

[1] kernelci/kernelci-core#2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
github-merge-queue bot pushed a commit that referenced this pull request Jul 16, 2024
Add a preempt-rt template to be used by the new KernelCI. Add
preempt-rt.jinja from Daniel's GitHub PR [1].

Allow to configure all parameters because these are board specific,
e.g. how many CPUs are available. Thus, it doesn't make sense to hard
code them. Furthermore, the various rt-tests (cyclictest, cyclicdeadline
and pmqtest etc) do have some common parameters but also unique ones.

[1] #2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
@padovan
Contributor

padovan commented Jul 16, 2024

@igaw you can also see the results on the WIP grafana dashboard from Helen Koike: https://grafana.kernelci.org/d/OKXc44EIz/wip3a-koike?orgId=1&var-origin=maestro&var-tree=stable-rt&var-branch=All&var-test_path_regex=kernelci_baseline&var-platform=%25&var-config=%25&var-datasource=default

nuclearcat added a commit to nuclearcat/kernelci-pipeline that referenced this pull request Jul 24, 2024
* src/scheduler: store error message when job fails with "submit_error"

It is helpful for debugging to catch error message when
scheduler fails to submit job to runtime.
Store the error message to `data.error_msg` field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: Set minimum kernel version for DT kselftest to 6.7

The test was introduced upstream in version 6.7, so no point in trying
to run it on earlier versions.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* configs/: Update volteer device

Update volteer devices according to lab availability

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary templates: detailed output for active/inactive regressions

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: new presets for active regressions

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: update CHANGELOG

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* data: chmod -R 777 ./data/output to avoid permission error

Avoid errors like

PermissionError: [Errno 13] Permission denied: '/home/kernelci/data/output/stable-rc-boot.html'

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: move code to _get_logs

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: use ThreadPoolExecutor to fetch logs

Fetching logs is the bottleneck of the script. Fetch them in parallel
with ThreadPoolExecutor.

Signed-off-by: Helen Koike <helen.koike@collabora.com>
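
A minimal sketch of the parallel-fetch pattern described in the entry above, assuming a caller-supplied `fetch` callable. `fetch_all` is a hypothetical name and the actual result_summary code is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_workers=8):
    """Call fetch(url) for every URL concurrently and return {url: result}."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        return dict(zip(urls, pool.map(fetch, urls)))
```

Threads suit this case because fetching logs is I/O bound; an exception raised by `fetch` surfaces when its result is consumed.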

* result_summary: fix result presets

stable-rc-build-failures and stable-rc-boot-failures weren't querying
specifically for test failures.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src/regression_tracker: rework regression detection

Take into account "active" and "inactive" regressions when creating them
and when processing new passed or failed nodes.

When a node passes, it checks if it "inactivates" an existing "active"
regression. When a node fails, it checks if it needs to create a new
regression or update an existing "active" one.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src/regression_tracker: link failed nodes to active regressions

When a failed node generates a regression, or when it's a re-run of a
run that generated a still active regression, link the node to the
regression id.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: support for date ranges for creation and update

New command line options to let the user specify date ranges for node
creation and last update: --created-from, --created-to,
--last-updated-from, --last-updated-to

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: support for date ranges for creation and last update

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: support for extra query parameters in cmdline

New command line option: --query-params to specify a set of extra query
parameters to complete or override preset parameters.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: html markup in some preset titles

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: update and move to docs folder

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: move parameter loading and processing to 'setup'

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: refactor and split into two classes (single, run)

Split the ResultSummary class into a base class and two child classes:
ResultSummarySingle and ResultSummaryLoop (only a stub at this point).

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: WIP initial implementation of the "loop" command

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: huge refactoring

Implement "summary" (single-shot) and "monitor" (loop) modes based on
preset parameters instead of on the command-line main command.

Split the logic into multiple files, move all monitor-specific and
summary-specific code to independent files, common code in a separate
file.

Full of kludges, I don't like how this is looking so far, might consider
reimplementing it without any dependencies on pipeline code.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: fix markup and indentation

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: new generic templates for monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: examples for "monitor" and "summary" modes

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: summary and monitor modes

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: fix generic regression report

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: summary: fix last_updated option handling

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: embed css stylesheet in html files

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* regression_tracker: [trivial] make regression active by default

Fixup for commit fcb29501663d78920bcd129bd57c36b9af624bc4
If the "result" field is ever made non-optional in the models we can
probably remove this.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* regression_tracker: [trivial] set default empty node sequence

Fixup for commit fcb29501663d78920bcd129bd57c36b9af624bc4
If the "node_sequence" field is ever made non-optional in the models we
can probably remove this.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: add cmdline option --output-dir

Introduce a new command-line option: --output-dir, and rename the old
--output to --output-file.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: command-line options change

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: jobs-chromeos: remove meaningless Tast tests

Several Tast tests can only fail in the context of KernelCI:
* `video.PlatformDecoding.v4l2_state*_vp9_0_svc` do not actually exist,
  causing the whole test job to fail
* `platform.DLCService*` and `platform.Memd` rely on features only
  present in the downstream Chrom{e,ium}OS kernel (see b/247467814 and
  b/244479619 for those having access to Google's issue tracker)
* `kernel.ConfigVerify.chromeos` relies on downstream-only config
  options such as `CONFIG_SECURITY_CHROMIUMOS` and other similar ones,
  and therefore can only fail when testing upstream kernels

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: scheduler-chromeos: don't execute non-working Tast tests

Currently, HEVC-related tests are known to either fail or be skipped as
ChromeOS doesn't yet handle hardware decoding of HEVC media. This is
expected to be fixed at some point though, so we're keeping the job
definitions and only remove the corresponding scheduler entries in order
to reinstate those jobs when relevant.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: exclude Tast tests known to always fail

Several decoder tests always fail on all platforms where they're
executed, adding only noise to otherwise useful test results. Disable
those for improving the quality of the results.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: chromeos: add special case for pre-6.7 qcom codec tests

On Qualcomm-based ChromeBooks (`trogdor` being the only model in
Collabora's lab), we noticed systematic failures of all
`vp9_*_frm_resize` and `vp9_*_sub8x8_sf` tests when using a kernel up to
6.6. With 6.7 and above, all of those tests (except one) now pass. It
therefore makes sense to exclude those on pre-6.7 kernels so we don't
report known failures and get rid of some noise.

This involves "duplicating" affected test jobs (although I did my best
to minimize that) and setting rules so only the working variant is
executed, based on the version of the kernel being tested.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* lava_callback: Compress the log files to save storage space

As cloud storage space and egress have high costs, it is
better to compress potentially large files.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* tests: Add basic yaml validation

Add a YAML load step to catch YAML issues earlier

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: drop stoneyridge/pineview naming in platforms anchors

The "stoneyridge" and "pineview" naming used in the Chromebook platform
anchors refers to ChromiumOS specific config fragments, but doesn't
necessarily match the actual platform of all the devices listed.
Use more generic names to distinguish amd and intel Chromebooks.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: rename test job anchors that use chromeos specific configs

Rename test job anchors that use chromeos specific kernel configurations
to include the 'chromeos' infix.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: add baseline tests

Enable the baseline tests on all the supported Chromebooks with their
default kernel configuration.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: drop stoneyridge/pineview naming in job defs

The "stoneyridge" and "pineview" naming used in some Chromebook job
definitions refers to ChromiumOS specific config fragments, but
doesn't necessarily match the actual platforms targeted by the jobs.
Replace all occurrences with more generic intel/amd naming.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: drop chromeos infix from baseline jobs

Keeping different job names for tests targeting different kernel configs
might cause too much duplication. Drop the 'chromeos' infix from the job
name for the tests using the chromeos config fragment. Users will be
able to filter the results using the data.defconfig/data.config_full
fields anyway.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary: post-process results for summary and monitor modes

Split the post-processing of nodes to a common function that can be used
for both summary and monitor modes. Currently, post-processing involves
only the collection of logs.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: update and fix presets and templates

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* doc/result-summary-CHANGELOG: update

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config/pipeline.yaml: enable 'BayLibre' lab

Add lab configuration for BayLibre.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add `lab-baylibre` runtime

Add runtime argument `lab-baylibre` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to BayLibre.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-x86-baylibre` job

Add job configuration `baseline-x86-baylibre` for BayLibre.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-armel-baylibre` job

Add job configuration `baseline-armel-baylibre` for BayLibre.
Add scheduler entry and platform config as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline: enable `android` tree and build configs

Monitor linux `android` tree. Add build configs for `android-mainline`
branch.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/pipeline.yaml: add kbuild definitions for android-mainline

Add kbuild jobs to compile the kernel for android-mainline branch

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/pipeline.yaml: add entries to schedule to build android-mainline

Add entries to `scheduler:` section to run the builds for
android-mainline.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: fix node filter in monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* kernelci.toml: set `checkout` node timeout to `180 min`

The currently set `60 min` timeout is not enough, as some
`kbuild` jobs and their sub-tests take around 2 hrs to
complete after being submitted to the runtime.

Here is an example from staging. See the information
for a `checkout` and its child nodes:

| id                       | name                | created                    | updated                    | timeout                    |
|--------------------------|---------------------|----------------------------|----------------------------|----------------------------|
| 661c9d59b60b785eb9fc42b0 | checkout            | 2024-04-15T03:22:01.317000 | 2024-04-15T03:51:03.870000 | 2024-04-15T04:22:01.284000 |
| 661c9d97b60b785eb9fc42b4 | kbuild-gcc-10-arm64 | 2024-04-15T03:23:03.399000 | 2024-04-15T03:50:15.031000 | 2024-04-15T09:23:03.399000 |
| 661ca3f7b60b785eb9fc4ead | baseline-arm64      | 2024-04-15T03:50:15.304000 | 2024-04-15T05:09:45.247000 | 2024-04-15T09:50:15.304000 |

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
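
The arithmetic behind the table above can be worked through directly from its timestamps. This sketch assumes the `updated` column of `baseline-arm64` marks when that job finished.

```python
from datetime import datetime, timedelta

# Timestamps copied from the table above.
checkout_created = datetime.fromisoformat("2024-04-15T03:22:01.317000")
checkout_timeout = datetime.fromisoformat("2024-04-15T04:22:01.284000")
baseline_updated = datetime.fromisoformat("2024-04-15T05:09:45.247000")

# Time from checkout creation until the last child node was updated.
elapsed = baseline_updated - checkout_created
# How far past the checkout timeout that last update happened.
overrun = baseline_updated - checkout_timeout

print(elapsed)  # 1:47:43.930000
print(overrun)  # 0:47:43.963000
print(elapsed < timedelta(minutes=180))  # True
```

The chain took about 1 h 48 min, roughly 48 minutes longer than the 60 min limit and within the proposed 180 min.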

* result_summary: add email report capabilities for monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: plain text single report templates

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: chromeos: add baseline-nfs tests

Enable the baseline-nfs tests on all the supported Chromebooks, with
both the default and the chromeos kernel configurations.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src/timeout: set `checkout` result

For `TIMEOUT` mode, set the `checkout` node result to `fail` if its
state is `running`, as that means the code checkout was still in
progress when the node timed out. Set it to `pass` for any other
state.
Set the `checkout` node result to `pass` if the mode is `DONE`, as
that means `checkout` has been in the `available` or `closing` state
and so completed the source code checkout successfully.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
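
The rule in the entry above can be written as a small decision function. This is only a sketch of the rule as worded in that commit message: `checkout_result` is a hypothetical name, and the mode and state values are taken as plain strings.

```python
def checkout_result(mode, state):
    """Map the timeout-service mode and the node state to a checkout result."""
    if mode == "TIMEOUT":
        # Still running at timeout means the checkout itself never finished.
        return "fail" if state == "running" else "pass"
    if mode == "DONE":
        return "pass"
    raise ValueError(f"unexpected mode: {mode!r}")
```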

* regression_tracker: bugfix, failed test with no prior runs

Handle the case of a failed test run when it's the first occurrence of
that test case. Consider it "not a regression" for now, since we're
defining a regression as a "breaking point" between a success and a
failure.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: platforms-chromeos: fix dalboz device type

Due to a copy/paste mishap, the device type for
`asus-CM1400CXA-dalboz` had a trailing `_chromeos`, leading LAVA to fail
finding the correct device type, and no job from the new system running
on this platform.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromes: run Tast tests only on 5.4+

Current ChromeOS images have `ext4` filesystems using options not
present in 4.19. Therefore tests cannot run on kernels that old, and
this leads to false positives in corrupt device identification, so we
should only run those tests on 5.4 and later kernels.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: platforms-chromes: drop non-existent platform

`hp-x360-12b-ca0500na-n4000-octopus` isn't a device type available in
Collabora's LAVA lab, so let's drop its definition.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: exclude android tree from kbuild jobs

Only Android-specific kbuild jobs should run for this tree, let's not
overload our system with unneeded builds.

Take this opportunity to limit mediatek kbuilds to 6.1+ as that's the
earliest version that has upstream support for at least one of our
devices.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src/timeout: a bug fix in `_submit_lapsed_nodes`

Fix a glitch in the code related to setting `checkout`
node result.

Fixes: 361fc0d ("src/timeout: set `checkout` result")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* pipeline.yaml: Update early access FQDN

We are moving k8s from eastus to westus3 as it is cheaper

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/tarball: fix `_kdir` in `update_repo`

Fix the below error:
```
kernelci-pipeline-tarball |   File "/home/kernelci/./pipeline/tarball.py", line 79, in _update_repo
kernelci-pipeline-tarball |     kernelci.shell_cmd(f"rm -rf {self._kdir}")
kernelci-pipeline-tarball |                                  ^^^^^^^^^^
kernelci-pipeline-tarball | AttributeError: 'Tarball' object has no attribute '_kdir'
```

Fixes: 0a2fe9c ("src/patchset.py: Implement Patchset service")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: fix method to get child nodes recursively

`TimeoutService._get_child_nodes_recursive` is used to get
pending child nodes recursively for closing and timed-out
nodes. It overwrites the result while being called recursively.
Fix the method to make it work properly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
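
A minimal sketch of the kind of bug described above, assuming a hypothetical `find_children` callable in place of the real API query and ignoring the filtering on pending state. The point is that each recursive call extends one accumulated list rather than rebinding it.

```python
from typing import Callable, Dict, List


def get_child_nodes_recursive(
    node_id: str,
    find_children: Callable[[str], List[Dict]],
) -> List[Dict]:
    """Collect all descendants of node_id, depth first."""
    collected: List[Dict] = []
    for child in find_children(node_id):
        collected.append(child)
        # Extend the accumulated list instead of overwriting it, so nodes
        # gathered at deeper levels are not lost when the call returns.
        collected.extend(get_child_nodes_recursive(child["id"], find_children))
    return collected


# Hypothetical in-memory tree standing in for API queries.
TREE = {
    "checkout": [{"id": "kbuild"}],
    "kbuild": [{"id": "baseline"}, {"id": "tast"}],
}
descendants = get_child_nodes_recursive("checkout", lambda nid: TREE.get(nid, []))
ids = [node["id"] for node in descendants]
```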

* config: pipeline: rename "armel" arch to "arm"

`armel` has various meanings depending on the system: for ChromeOS, it
is ARMv7, while in Debian it's ARMv{5T,6}. Moreover, this project is
*Kernel*CI and the kernel uses `arm` for all 32-bits ARM devices. In
order to avoid confusion (including those wondering what the heck does
`armel` mean), let's rename `armel` to `arm`.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: use per-system arch property where relevant

With the new `*arch` fields present in the platform configurations, we
don't have to hardcode the architecture strings in some specific cases.
Let's adapt the config files so we use `{cros,deb,k}arch` wherever it
makes sense.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src/timeout: set timed-out `checkout` result

Set the result of a timed-out `checkout` node to `incomplete`
if it is still in `running` state, as this denotes that the node
timed out while checkout was still going on.
Also, set error-related information, i.e. `error_code`
and `error_msg`.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/tarball: update checkout node when update repo fails

The tarball service updates the source code repo and creates a tarball.
If the repo update fails even on the second attempt,
it means the source code checkout has failed.
Hence, update the `checkout` node with state `done` and
result `fail`. Also, set appropriate error information
in the `data` field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: enable collabora-next tree and build config

Monitor the collabora-next tree. Add build config for the for-kernelci
branch.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: enable acpi kselftest on collabora-next tree

Run the ACPI kselftest on the for-kernelci branch of the collabora-next
tree.

See: https://lore.kernel.org/linux-kselftest/20240308144933.337107-1-laura.nao@collabora.com/T/#t

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary: restore missing split_query_params function

Restore this function that was accidentally removed during the last
refactoring.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* lava_callback: Don't upload empty files to Azure

There is no use for a lot of empty files on Azure;
they only complicate cleanup.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: unify preset and output names

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: update preset for aferraris

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: new presets for laura.nao

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: fixes and new presets for nfraprado

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: fix arch query parameters

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* k8s: Lot of deployment tested fixes

Fixes in yaml files for k8s production deployment.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result-summary presets: Fix build failure and regression monitors

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* result_summary: added debug traces to the monitor

Show detailed info of the node filterings in real time.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: fix corner case bug when no logs are found

Cover rare case where neither the node nor any of its parents up to the
checkout node have any log artifacts.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: refine stable-rc presets

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: add regression info to test reports

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: escape log snippets

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src: lava_callback: add device ID to node data

It can be useful to know the exact device on which a job ran, without
having to open the LAVA job page. This is done by querying the device ID
from the callback data and appending it to the node data.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: lava_callback: upload raw callback data as well

Debugging callback issues is complex due to the raw data not being saved
after processing. This change ensures we save the callback data as a
JSON file in order to ease development.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* DONOTMERGE lava_callback: add debug statements

Why the heck doesn't this just work???

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* result_summary_templates: fix error 'node' is undefined

The object is named test and not node, so s/node/test

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/runtime/kunit: set architecture info

Set architecture field for `kunit` test
nodes.
If no `arch` argument is supplied, kunit takes
`um` (User Mode Linux) as architecture to run
tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: count running child jobs of build nodes

Add a method to count running jobs of `kbuild`
nodes i.e. jobs being submitted after successful
builds. For example, `baseline` or `tast` jobs.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: handle closing `checkout` node differently

Usually, `checkout` should transition to the `done` state
when all its child nodes are completed.
For a closing `checkout`, take into account the
running child jobs of build nodes before transitioning
its state to `done`. Otherwise, `checkout` will be
set to the `done` state even if some child jobs are still
running.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: handle holdoff reached `checkout` node differently

Usually, an available `checkout` whose holdoff has been
reached should transition to the `done` state only when
all its child nodes are completed.
For such a `checkout` node, take into account the
running child jobs of build nodes before transitioning
its state to `done`. Otherwise, `checkout` will be
set to the `done` state even if some child jobs are
still running.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* Revert "DONOTMERGE lava_callback: add debug statements"

This reverts commit 5ed8218d99840373bbba5830b1976813b52bf4b1.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* Create dependabot.yml

* result_summary_templates: make generic-test-failures generic to all
results

The generic-test-failures templates can be used to show general results
by just replacing the name "failures" with "results". This makes them
easier to reuse by communities that want presets listing all test
results, so:

	s/generic-test-failures/generic-test-results

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result-summary.yaml: add preset to list android build tests

Since we now build android, add a preset to allow result-summary.yaml to
list all build results from Android tree.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* tarball: Implement checkout for specific commit

We often need a specific commit rather than the tip of the tree (ToT); implement this.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* jobs-chromeos.yaml: Disable module compression for every kernel version

Commit d4bbe942098b ("kbuild: remove CONFIG_MODULE_COMPRESS"),
introduced in kernel v5.13, substituted CONFIG_MODULE_COMPRESS=n for
CONFIG_MODULE_COMPRESS_NONE=y as the way to disable module compression.
Since module compression causes "Invalid ELF header magic: != ELF"
errors during boot on the ChromeOS base config, add the missing config
to disable module compression on kernels > v5.13 as well.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* src: lava_callback: reduce callback data size

The callback data is quite large, especially as it includes the full log
which we already upload separately. By dropping it and compressing the
whole file with `gzip` we can avoid wasting too much storage space.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: lava_callback: don't leak secret token

The callback data contains the secret token's value, which shouldn't be
leaked. Ensure we drop it from the uploaded data.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
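
A minimal sketch of the two callback-data changes above (dropping the log and the secret token, then compressing with `gzip`). The key names `log` and `token` and the function name are assumptions made for illustration; the real callback payload may use different fields.

```python
import gzip
import json


def pack_callback_data(callback: dict, drop_keys=("log", "token")) -> bytes:
    """Return gzip-compressed JSON without the listed top-level keys."""
    # Hypothetical key names: drop the bulky log and the secret token.
    cleaned = {k: v for k, v in callback.items() if k not in drop_keys}
    return gzip.compress(json.dumps(cleaned).encode("utf-8"))


packed = pack_callback_data({"id": 42, "log": "x" * 1000, "token": "s3cret"})
restored = json.loads(gzip.decompress(packed))
```

Decompressing the stored file gives back the callback data without the dropped fields, which is what makes it usable for later debugging.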

* config: platforms-chromeos: use new cros-flash image

This ensures we use the new version of the `install-modules` script.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: regression_tracker: add the "device" field to regression data

This can be helpful. We're not using it as a search param though, as we
don't want to narrow down the search that much, using the platform only
is better.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: result_summary_templates: report device used for job

This information is now available, and it can be useful to know the
affected device without having to look at the LAVA job details.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* kubernetes: Update deployment recipe

Update list of labs and add KCI_INSTANCE variable.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava-callback: Limit threads of lava-callback

Due to the inrush of LAVA callbacks and slow Azure Files
processing, we need to make sure we don't spawn too many
threads.
Also add a hard memory limit of 1 GB.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: add presets for fluster test

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Make template generic for all v4l2 tests
- Rebase on main

* result_summary presets: make the name of fluster test generic

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: enable first fluster test for mt8195-cherry-tomato-r2

Enable first fluster test, AV1-TEST-VECTORS for mt8195-cherry-tomato-r2.
Run the test on mainline and next until more trees are added.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Create generic v4l2-decoder-conformance-job and use anchors from it
- Update the rootfs address
- Move anchor to _anchor
- Update with nitpicks

* config: jobs-chromeos: Add kernelci tree for testing purpose

Remove this commit before merging.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: chromeos: Enable cpufreq kselftest

Enable cpufreq kselftest on all the trees and branches.

Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>

* result_summary presets: fix preset for kselftest-dt failures monitor

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: new presets for kselftest-cpufreq

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: mt8195-cherry-tomato-r2: enable all fluster tests for all branches

Add all the trees and branches on which the tests will be run. Enable
all the tests for tomato.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- The build config cannot be added yet. Just list the trees, it will only use
  the branches configured in build_configs:
  - mainline will use master
  - next will use master
  - collabora-chromeos-kernel will use for-kernelci
  - media will use master and fixes
- Remove kernelci tree as it was added just for testing purpose

* config: mt8183-kukui-jacuzzi-juniper-sku16: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

jacuzzi

* config: mt8186-corsola-steelix-sku131072: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: mt8192-asurada-spherion-r0: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Don't specify the platforms manually as they are already mentioned in
  test-job-arm64-mediatek

* config: sc7180-trogdor-kingoftown/lazor-limozeen: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Use test-job-arm64-qualcomm instead and create separate jobs for
  qualcomm devices
- Don't specify platforms manually as they are already mentioned in
  test-job-arm64-qualcomm

* build(deps): bump uwsgi from 2.0.21 to 2.0.22 in /docker/lava-callback

Bumps [uwsgi](https://uwsgi-docs.readthedocs.io/en/latest/) from 2.0.21 to 2.0.22.

---
updated-dependencies:
- dependency-name: uwsgi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* pipeline.yaml: Add stable-rc build variants

Add more build variants for stable-rc tree to match legacy system.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary: add error classification

Classify errors according to patterns in the logs

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary presets: add collabora-chromeos-kernel and media trees for fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: Use media-stage instead of media-tree

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config/pipeline: enable android branches from legacy

Enable all android branches from the legacy system

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* trigger: Add exclude/include tree list for trigger

We need to restrict the list of kernels built on staging, so add an
option that allows it. It is also useful for excluding staging kernels
from the production kernel list.

On staging we need to run kernels only from the "kernelci" tree, and
sometimes from another one, for example "mediatek".
The option looks like:

--trees kernelci,mediatek
or
--trees kernelci

On production we need to exclude the trees kernelci and buggytree:
--trees !kernelci,buggytree
or just kernelci:
--trees !kernelci

The purpose of this option is to cut costs: our build capacity is
limited, and right now staging and production both compile a very large
set of kernels.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
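
One way to read the `--trees` examples above is that a leading `!` turns the whole comma-separated list into an exclude list. The sketch below implements that reading; the function name is hypothetical and the actual trigger code may parse the option differently.

```python
from typing import Optional


def tree_allowed(tree: str, trees_option: Optional[str]) -> bool:
    """Decide whether a tree passes the --trees filter."""
    if not trees_option:
        return True  # no filter given: every tree is allowed
    # Assumption: a leading "!" negates the whole list.
    exclude = trees_option.startswith("!")
    names = {n.strip() for n in trees_option.lstrip("!").split(",") if n.strip()}
    return (tree not in names) if exclude else (tree in names)
```

Under this reading, `tree_allowed("mediatek", "kernelci,mediatek")` is true on staging, and `tree_allowed("buggytree", "!kernelci,buggytree")` is false on production.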

* config: platforms-chromeos: use CrOS R124 files

ChromeBooks were upgraded with a new image based on ChromiumOS R124, so
we must use those files now.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: drop non-existent Tast tests

Those were removed between R120 and R124 and therefore cause test
failures with the new images.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* result_summary presets: fix acpi kselftest presets

We're interested in catching regressions and failures in both the
kselftest-acpi test suites and its test cases. Match the nodes by group
in the presets accordingly.
Fix template used by the failure monitor preset.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src: update return values of `APIHelper.receive_event_node`

`APIHelper.receive_event_node` method is used to receive
node data from PubSub event. The method has been updated
to return `is_hierarchy` flag as well which represents
events related to node hierarchy.
Update pipeline services using the method accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: refine presets for v4l2-decoder-conformance

Modify the regression preset to monitor regressions on both the
v4l2-decoder-conformance test suites and its test cases, by matching the
nodes by group instead of by name.
Also, change the failure preset to monitor for all errors caused by
runtime errors.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary presets: add summary presets for v4l2-decoder-conformance

Add summary presets to fetch regressions and failures on
v4l2-decoder-conformance tests. Two of the presets are the same used by
the monitor; add one additional preset to fetch all the failures on
both the test suites and their test cases.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* lava_callback.py: Remove error_code/error_msg on lava-callback

Sometimes, due to congestion, a node might be set to timeout, but
the result might then arrive late, and we need to use it properly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: fix dt kselftest presets

Fix the dt kselftest preset, just like was done for the acpi one, as the
current preset doesn't match the actual results we're interested in.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* doc/connecting-lab: refine documentation

Refine documentation for connecting LAVA labs
and submitting jobs to the lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* lava_callback: Sometimes we get totally invalid log file uploaded

Most likely the problem lies in Flask threading, and callbacks
are possibly getting mixed up. This commit attempts to introduce
several countermeasures against that.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* doc: add `_index.md` page

Add index documentation page.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc: add `pipeline-details` page

Move `pipeline-details` documentation from the API
repository to this repo to make it close to the source.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc/connecting-lab: adjust `weight` property

Change `weight` property of existing doc page to
accommodate with transition of pipeline related docs
to pipeline repo.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc: add `developer-documentation` page

Add developer manual documentation.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add lab config for Qualcomm

Add an entry to `runtimes` section for Qualcomm
lab configurations.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-x86` job for qualcomm

Add job configuration `baseline-x86-qualcomm` for
running baseline job in Qualcomm LAVA lab.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add lab-qualcomm runtime

Add runtime argument `lab-qualcomm` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to Qualcomm LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-arm64` job for qualcomm

Add job configuration `baseline-arm64-qualcomm` for
running baseline job for `arm64` in Qualcomm LAVA lab.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* pipeline.yaml: Update RISC-V configs

1) rv32 defconfig doesn't exist, so remove it
2) nommu_k210_defconfig has modules disabled

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava_callback.py: Sanitize lava log data

As we use this data in reports, let's remove all
non-printable characters as they confuse Grafana, browsers and others.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
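
As an illustration of this kind of sanitizing, the sketch below drops non-printable characters while keeping newlines and tabs. Keeping those two characters and the function name are assumptions; the actual filter in `lava_callback.py` may differ.

```python
def sanitize_log(text: str) -> str:
    """Drop non-printable characters, keeping newlines and tabs."""
    # str.isprintable() is False for control characters such as NUL or ESC.
    return "".join(ch for ch in text if ch.isprintable() or ch in "\n\t")
```

Note that only the control character itself is removed: the escape byte of an ANSI colour sequence disappears, but the printable characters that followed it remain.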

* config/runtime/kunit.jinja2: fix result map

Fix result map for skipped tests. Initially, API
didn't have `skip` available node result in the schema.
That's why it was mapped to `None` result. But now API
has `skip` result to denote skipped tests.
Fix the result mapping accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: jobs-chromeos: Add lab-setup fragment

Add the lab-setup fragment to the chromebook builds, which contains the
architecture independent kernel configs needed to run tests on the
platform. Notably this disables IP autoconfig by the kernel.

The result of this change is that the 12-second boot delay and the
consequent deferred probe pending warnings will no longer happen on any
platform, in particular on mt8186-corsola-steelix-sku131072, where they
were still happening due to a different network adapter being used.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* lava_callback: bump up slightly threads number

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: enable watchdog reset test on Chromebooks

Add a basic test to verify watchdog reset functionality. Enable the
test on all ARM64 and AMD x86_64 Chromebooks. For Intel
Chromebooks, enable the test only on octopus, as ACPI PM Timer on the
other devices has been disabled in coreboot.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src/send_kcidb: use schema version 4.3

Test status `MISS` was added to KCIDB in schema
v4.2 and supported by the latest version i.e. v4.3.
Hence, use the latest version for submission as
API may send a few tests with "MISS" status.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* send_kcidb: re-structure code for parsing checkout node

Move code for parsing checkout node to a separate
method.
Add `valid` field to parsed checkout node. It denotes
if source code was successfully checked out.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: print more information on invalid data

Print details for invalid revision data for the
sake of debugging.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: optimize `kcidb` import

Remove redundant `kcidb` import and adjust
kcidb Client call accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: remove keys with `None` values

KCIDB doesn't allow `None` as field value.
Remove all optional fields with `None` value
to make it valid data for submitting to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
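
A minimal sketch of stripping `None` values before submission. Recursing into nested dictionaries is an assumption, as are the function name and sample field names; required fields are not checked here.

```python
def drop_none_fields(record: dict) -> dict:
    """Return a copy of record without keys whose value is None."""
    cleaned = {}
    for key, value in record.items():
        if value is None:
            continue
        if isinstance(value, dict):
            # Assumption: nested objects are cleaned the same way.
            value = drop_none_fields(value)
        cleaned[key] = value
    return cleaned
```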

* config: add `kcidb_test_suite` property

Every KernelCI test will be mapped to a unified
test suite for KCIDB data submission.
Add `kcidb_test_suite` property to test job
definitions in YAML configuration files.
The added property will store the mapped
KCIDB test suite name.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: parse and submit node test and build data

Listen to all the node events with node state
`done` or `available` and submit the node to KCIDB.
Parse node received from the event and create KCIDB
schema compatible object based on type of the node
i.e. checkout, build or test.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: set `log_excerpt` for builds and tests

Fetch logs from compressed log file(*.log.gz) URL
and send last 16*1024 characters for setting `log_excerpt`
field for build and test nodes as it is the max allowed
length of the KCIDB field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
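
The excerpt logic can be sketched as follows, assuming the compressed log has already been downloaded into memory. The function name is hypothetical; only the 16*1024 limit comes from the description above.

```python
import gzip

MAX_EXCERPT = 16 * 1024  # maximum length of the KCIDB `log_excerpt` field


def log_excerpt_from_gzip(compressed: bytes, limit: int = MAX_EXCERPT) -> str:
    """Decompress a .log.gz payload and keep only its last `limit` characters."""
    text = gzip.decompress(compressed).decode("utf-8", errors="replace")
    return text[-limit:]


sample = gzip.compress(("a" * 20000 + "END").encode("utf-8"))
excerpt = log_excerpt_from_gzip(sample)
```

Taking the tail rather than the head keeps the end of the log, which is usually where a failure shows up.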

* config/jobs-chromes: add kcidb test suite property for watchdog test

Add KCIDB test suite mapping for `watchdog_reset` test.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* lava_callback.py: disable log removal from callback data

We need it for investigations if we have any critical data
loss during log sanitizing.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/send_kcidb: add error info to build nodes

Add error metadata fields such as `error_code` and
`error_msg` to `misc` field for build nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: add watchdog-reset presets for mainline/next

Add monitor and summary presets to track the results from the watchdog
reset test on the mainline and next trees.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* pipeline.yaml: Fix fluster rootfs URL

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/send_kcidb: get error metadata for failed/incomplete tests

Tweak condition to get error metadata for test nodes.
It should get error info for incomplete nodes as well
and not just failed nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: send tests only if KCIDB test mapping exists

All test suite definitions must have `kcidb_test_suite`
property i.e. KCIDB test suite mapping.
Only send tests for which the mapping is found.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* tests/validate_yaml: add validation for KCIDB mapping

To submit KernelCI-generated data to KCIDB, every job
definition must have a mapping set through the
`kcidb_test_suite` property.
Add validation to ensure all the jobs have a mapping
present to avoid missing data submission.
This check is to notify test authors trying to enable tests
in maestro to include the required property for the mapping
in their definition.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add qcs6490-rb3gen2 boot test

Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>

* config: chromeos: Enable kselftest-dt on Qualcomm platforms

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* pipeline.yaml: Add one um build for android trees

As requested by the Android team, it would be good to check UM builds
for breakages as well.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: use `kind=job` for test suites

As part of re-structuring the test hierarchy, a `Job` model
has been introduced for test suite/job nodes.
It uses node kind `job`.
Update test configurations in `pipeline.yaml` and
`jobs-chromeos.yaml` to use `kind=job` to
generate job nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: provide `kind` value for child tests

When submitting a test hierarchy, child nodes by default
inherit the `kind` value from the parent node.
As we are re-structuring the test hierarchy, test suite/job nodes
will have `kind=job` while their child test nodes will have
`kind=test`. Provide the `kind` field explicitly in the test result
hierarchy to preserve a kind value different from that of the parent
node.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: fix `NameError`

Fix the below error in `_submit` method:
```
Traceback (most recent call last):
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 287, in main
    job.submit(results)
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 138, in submit
    self._submit(result)
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 265, in _submit
    return node
NameError: name 'node' is not defined
```

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: evaluate job node result

Evaluate job node result from child node results if a
`null` result is received from the test result parser.
For example nodes such as `fortify`:
https://staging.kernelci.org:9000/viewer?node_id=6670ab43d0b7694b399897c4

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
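
The commit message does not spell out how child results are combined, so the sketch below is only one plausible rule, stated as an assumption: any failing child makes the job fail, otherwise any passing child makes it pass, and a job whose children were all skipped is reported as skipped.

```python
from typing import Iterable, Optional


def evaluate_job_result(
    parsed: Optional[str],
    child_results: Iterable[Optional[str]],
) -> Optional[str]:
    """Fall back to child results only when the parser returned no result."""
    if parsed is not None:
        return parsed
    results = [r for r in child_results if r is not None]
    if not results:
        return None
    # Assumed precedence: fail > pass > skip.
    if "fail" in results:
        return "fail"
    if "pass" in results:
        return "pass"
    return "skip" if all(r == "skip" for r in results) else None
```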

* src/send_kcidb: fix parsing of KUnit log file

Handle both compressed(gzip) and plain text log files
for getting log excerpt.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
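
One common way to accept both kinds of log file is to check for the gzip magic bytes before decoding. This is a sketch of that technique, not necessarily how `send_kcidb` tells the two apart; the function name is hypothetical.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_log(raw: bytes) -> str:
    """Return log text from either a gzip-compressed or a plain-text payload."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return raw.decode("utf-8", errors="replace")
```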

* src/send_kcidb: HTTP exception handling for log excerpt

Add HTTP exception handling for getting
log excerpt data.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: platforms-chromeos: Add serial delay for some Mediatek platforms

Add test_character_delay to the Spherion, Tomato and Steelix platforms
to workaround the fact that they're sometimes unable to process serial
input fast enough, resulting in mangled commands and consequently flaky
test results, as described in
https://github.com/kernelci/kernelci-project/issues/366.

The right place to do this change would be in the device-type template
as described in LAVA's documentation [1]. This overriding in KernelCI is
meant only as a temporary workaround to verify whether this fixes the
issue. If it does, then we'll do it in LAVA upstream instead.

[1] https://docs.lavasoftware.org/lava/debugging.html#differences-in-input-speeds
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config: chromeos: Enable error-logs kselftest for MediaTek Chromebooks

Run the error-logs kselftest on MediaTek Chromebooks. This test is
currently under review upstream [1] so, in the meantime, it has been
added to the collabora-next tree so it can prove its value by helping to
detect issues upstream.

[1] https://lore.kernel.org/all/20240423-dev-err-log-selftest-v1-0-690c1741d68b@collabora.com

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config/pipeline.yaml: enable CIP lab

Add configuration for LAVA CIP lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add baseline-x86 test for CIP

Add `baseline-x86-cip` test to be submitted to CIP
LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add `lab-cip` runtime

Add runtime argument `lab-cip` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to CIP LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: enable `job` node submission to KCIDB

Parse newly added job node and its child tests
for KCIDB submission.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: don't submit `setup` test suite nodes

`setup` test suite has been introduced to store test results
for environment setup checks before running actual test suite.
KCIDB doesn't require `setup` test suite result as long as
main test job result is submitted.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: add a check before sending data

Check if parsed data is available before
sending revision data to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: fix logs

Fix log statement about submitting node to KCIDB
as we are not sending all the nodes we receive
event for to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: handle skipped tests

Do not retrieve artifacts or metadata from the parent
node for skipped tests, as in practice only the kernel
revision, test runtime and platform will be
available for skipped tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary/utils: ignore failures on log retrieval

Make the script continue running if there was an error fetching a test
log.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* doc/developer-documentation: add docs for enabling new tests

Add developer documentation for enabling new tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* Fix links after docs page migration

Documentation has been migrated to the "docs.*" subdomain.

Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>

* pipeline.yaml: Add kcidebug fragment

Add useful low-overhead debug option to kernel,
and test on most x86 boards we have available,
with minimal baseline tests.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* configs: update gcc-10 to gcc-12

As we upgrade compiler images, we need to update the gcc version.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* regression_tracker: workaround: match node paths programmatically

Don't use 'path' as an api search parameter. The use of lists as query
parameters (path is a list) is undefined. Instead, do the filtering in
code.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
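
The in-code filtering described in this commit could look roughly like the sketch below. It assumes nodes are plain dicts whose `path` is a list of names; `filter_nodes_by_path` and the example node names are hypothetical and not taken from regression_tracker.

```python
def filter_nodes_by_path(nodes, path):
    """Return only the nodes whose 'path' list equals the wanted path.

    Hypothetical helper: the nodes are fetched without a 'path' query
    parameter and the comparison is done here, in code.
    """
    wanted = list(path)
    return [node for node in nodes if node.get("path") == wanted]


# Illustrative data only; the ids and names are made up.
nodes = [
    {"id": "a", "path": ["checkout", "kbuild-gcc-12-x86", "baseline-x86"]},
    {"id": "b", "path": ["checkout", "kbuild-gcc-12-arm64"]},
]
matching = filter_nodes_by_path(nodes, ["checkout", "kbuild-gcc-12-arm64"])
```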

* config: remove qemu jobs from lab-qualcomm

QEMU jobs use container pulled from hub.docker.com. After the lab move
pulling from this registry is no longer possible at Qualcomm. This patch
disables QEMU jobs from Qualcomm lab.

Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>

* validate_yaml.py: Improve pipeline validation

Add validation that every scheduler entry has a matching job entry
(this is the critical check) and that every job entry has at least
one entry in the scheduler.
Fix one entry detected by this validation.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
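
A minimal sketch of the two checks, assuming the parsed config has a `jobs` mapping and a `scheduler` list whose entries carry a `job` key. That layout and the function name `validate_scheduler_jobs` are assumptions for illustration, not the actual validate_yaml.py code.

```python
def validate_scheduler_jobs(config):
    """Cross-check scheduler entries against job definitions.

    Returns (missing_jobs, unscheduled_jobs):
    - missing_jobs: jobs referenced by the scheduler but not defined
    - unscheduled_jobs: jobs defined but never referenced by the scheduler
    """
    jobs = config.get("jobs") or {}
    scheduled = {
        entry["job"]
        for entry in config.get("scheduler") or []
        if "job" in entry
    }
    missing_jobs = sorted(scheduled - set(jobs))
    unscheduled_jobs = sorted(set(jobs) - scheduled)
    return missing_jobs, unscheduled_jobs


# Illustrative config; the job names are made up.
config = {
    "jobs": {"baseline-x86": {}, "kbuild-gcc-12-x86": {}},
    "scheduler": [{"job": "baseline-x86"}, {"job": "baseline-arm64"}],
}
missing_jobs, unscheduled_jobs = validate_scheduler_jobs(config)
```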

* pipeline.yaml: Add broonie(Mark Brown) trees to pipeline

It is time to enable even more trees.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Add additional verification for duplicate keys

We might have redefined the same keys in different yaml files;
this tool will ensure consistency of these entries.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
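
One way such a duplicate-key check could work is sketched below. It operates on already-parsed files, and the section names (`jobs`, `platforms`), the file names and the helper `find_duplicate_keys` are assumptions for illustration only.

```python
from collections import defaultdict


def find_duplicate_keys(parsed_files, sections=("jobs", "platforms")):
    """Map 'section.key' to the files defining it, for keys defined twice or more.

    parsed_files: {filename: dict loaded from that YAML file}
    """
    seen = defaultdict(list)
    for filename, data in parsed_files.items():
        for section in sections:
            for key in (data.get(section) or {}):
                seen[f"{section}.{key}"].append(filename)
    return {name: files for name, files in seen.items() if len(files) > 1}


# Illustrative input; file and job names are made up.
parsed_files = {
    "pipeline.yaml": {"jobs": {"baseline-x86": {}, "kbuild-gcc-12-x86": {}}},
    "jobs-chromeos.yaml": {"jobs": {"baseline-x86": {}}, "platforms": {"p1": {}}},
}
duplicates = find_duplicate_keys(parsed_files)
```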

* validate_yaml.py: Remove path separator

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Rename variable to schedules

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config/kernelci.toml: update KCIDB origin name

As we agreed to refer to the new KernelCI API & Pipeline as
"maestro", use the new name while submitting data
to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: update KCI result mapping with KCIDB status

Update evaluation of KCIDB status from KCI result.

Create 2 categories for error codes:
1. When pre-check tests completed but actual test suite
couldn't run - this will have `MISS` status
2. When pre-check tests completed, actual test suite could
run but somehow couldn't complete - this will have `ERROR` status

Some LAVA error codes, such as `Cancelled` and `Test`, can occur
at any point of execution.
Such error codes are listed under the most relevant category
based on analysis of available results.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: fix presets for v4l2-decoder-conformance

Following recent updates to data representation on KernelCI nodes,
the top-level nodes for tests now have their kind set to 'job' instead
of 'test'. Update the presets for v4l2-decoder-conformance tests
accordingly.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary presets: fix output file name in kselftest-acpi preset

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: enable dmabuf-heaps, exec and iommu kselftest suites

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Add kcidb_test_suite

* config: result-summary: add generic rule to monitor failures and regression

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: Add rt-stable builds

Copy rt-stable builds from legacy KernelCI.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Major changes to move to new way of writing kbuild jobs

* config: pipeline: Add v6.6-rt branch for builds

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: result-summary: add rt-stable kbuilds presets

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: chromeos: Add 'nfs' suffix to KCIDB suite name for baseline-nfs

The baseline test is currently run with both ramdisk and nfs rootfs. To
distinguish baseline-nfs tests in KCIDB, add an 'nfs' suffix to the KCIDB
test suite name.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* aks: Add kubernetes kcidb deployment

We need a file that will manage deployment of the kcidb bridge
in the kubernetes production deployment.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* kubernetes: Adjust trigger k8s options

Ignore the kernelci tree on production, as it is a special
"staging"-only tree, and read the whole /config directory, not just the default
pipeline.yaml.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* regression_tracker: bugfix: catch empty search condition

Fix _get_last_matching_node(), after the previous change there was an
unhandled scenario where nodes may be empty but the function wouldn't
return None immediately.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: pipeline: correct the kind of kselftest suites to job

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* scheduler-chromeos.yaml: Temporarily disable non-essential tast tests

As per discussion, temporarily disable tast tests which are unlikely
to be reviewed.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* k8s/aks: Update deployment files

1) Update memory limit, as working with linux sources might require 3 GB of RAM.
2) Update config file path
3) Add callback environment variable
4) Update image reference to fresh one

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: enable android builds with gcc-12 for all architectures

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: enable android builds with clang-17 for all architectures

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: remove build_variants from android build_configs

build_variants is the legacy way to specify the different variants. We
have moved to the newer way of specifying variants. Hence remove
build_variants from the android build_configs.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add android15-6.6-lts branch for build as well

The android15-6.6-lts has been included recently in legacy KernelCI:
https://github.com/kernelci/kernelci-core/pull/2597

Add the same in newer KernelCI.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add blocklist for riscv older kernels for android builds

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: update KCIDB test suite mapping for baseline

Use `boot` as KCIDB test suite mapping for all
baseline tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* callback_url: Update config and README

As we are moving the callback URL to an environment variable,
update config and README accordingly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: enable android baseline (boot) testing for arm and arm64 in only allmodconfig

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* scheduler.py: If event has jobfilter, inject it into the node data

When someone generates an artificial event with a jobfilter, this is
likely a maintainer trying to repeat a job. Treat this accordingly,
and inject the job filter into the job node, so we will run only the tests
the maintainer wants.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava_callback: migrate to fastapi

It will be easier to maintain the API and the Pipeline, as
both will be powered by the FastAPI framework.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: Update fluster rootfs URL

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: pipeline: fix defconfigs in fragments

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* kbuild.jinja2: support defconfig as list or str

As required in https://github.com/kernelci/kernelci-core/pull/2608
defconfig might be one of two types (list or str). Support both in jinja2.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
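
The same "list or str" handling, written in Python rather than Jinja2 for illustration. `normalize_defconfig` is a hypothetical helper, and the fragment name in the example is made up; the template itself may handle the two types differently.

```python
def normalize_defconfig(defconfig):
    """Return defconfig as a list, whether it was given as a str or a list."""
    if isinstance(defconfig, str):
        return [defconfig]
    return list(defconfig)


# Both forms end up as a list that a template can iterate over.
single = normalize_defconfig("defconfig")
multiple = normalize_defconfig(["defconfig", "example.config"])
```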

* config: pipeline: add kbuilds of lee-mfd with default defconfigs

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: enable baseline testing for mfd for one board of each arch

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: fix platform sections for Qualcomm and Android schedules

Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>

* k8s: Update deployment to uvicorn, as we use fastapi now

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: Unblock android runs on lava-collabora

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* pipeline: Enable preempt-rt cyclictest test

Enable the first preempt-rt test, cyclictest in new KernelCI. Enable it
on all platforms.

Since these are all smoke tests there is no point in running them too
long. Thus reduce the runtime per test to one minute. This should keep
the total preempt-rt runtime roughly in the same time frame.

The changes have been ported from Daniel's PR [1].

[1] https://github.com/kernelci/kernelci-core/pull/2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* pipeline: add all the test jobs for all rt-test

Add job definitions for all the rt-tests. Enable cyclicdeadline and rtla
tests to run on all targets.

The changes have been ported from Daniel's PR [1].

[1] https://github.com/kernelci/kernelci-core/pull/2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add template and test properties for preempt_rt jobs

Add template, job and kcidb_test_suite properties for all preempt-rt jobs

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: rename preempt-rt to rt-tests which is correct name of tests

The legacy system used the name preempt-rt for these tests, but the
repository is named rt-tests. We must use the same name so that results
can be merged with execution results
coming from other CIs in KCIDB.

Suggested-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add the correct nfsroot for rt-tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: Remove android's deprecated branches

It has been confirmed with Todd that we should remove the deprecated
branches. Hence remove those branches.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: run baseline on non-allmodconfig

allmodconfig generates a very large kernel image. It cannot be booted
on the arm64 and arm targets, as tftp errors out because the size is too large.
Reduce the kernel image size by using the default defconfig. The same
defconfigs have been booting for other trees.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* doc: developer-documentation: Update documentation by adding more details

- Reorganize some things
- Specify how to write different variants by removing old syntax
- Give two separate templates for kbuild and test
- Try to put more details for new contributors

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes since v1:
- Fix type
- Apply suggestions from code review

* doc/developer-documentation: fix a glitch in enabling new tree section

Fix a minor bug in YAML block formatting.

Fixes: f5f57de ("doc: developer-documentation: Update documentation by adding more details")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc/developer-documentation: update a section title

Rename a section from "Enabling a new Kernel tree" to
"Enabling new KernelCI trees, builds, and tests" as it explains
enabling tests as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: use the new `tree:branch` format for rules

For cases where we want a single branch to be allowed for a given tree,
we can now use the `tree:branch` format in rules. Convert existing rules
accordingly.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: pipeline: fix improper use of "filters" attribute

The `filters` param was used in the legacy system but has been replaced
by `rules`, with a different syntax.

For Android RISC-V builds, this was used to deny job execution on
kernels < 4.19, so let's translate this condition with the rules format,
and do a similar change for the `rt-tests`-based jobs.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config/pipeline.yaml: Fix x86 typo in kcidebug job names

The kcidebug jobs that run on MediaTek and Qualcomm platforms should
have arm64 in the name rather than x86. Fix the typo.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config: pipeline: remove params

The parameters are only needed when they are changed or appended.
Remove the parameters which aren't being modified.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* validate_yaml.py: Jobs are required to have template parameter

Add more validation to config files of mandatory parameters.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Add more job validations

Add basic validation: each job must have a kind parameter

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* workflows: Add label on CI check failures

Automatically add a label so broken PRs won't go to staging

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

---------

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
Signed-off-by: Laura Nao <laura.nao@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>
Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>
Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-authored-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Co-authored-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Co-authored-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Co-authored-by: Helen Koike <helen.koike@collabora.com>
Co-authored-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
Co-authored-by: Laura Nao <laura.nao@collabora.com>
Co-authored-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Co-authored-by: Shreeya Patel <shreeya.patel@collabora.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Milosz Wasilewski <milosz.wasilewski@foundries.io>
Co-authored-by: Paweł Wieczorek <pawiecz@collabora.com>
Co-authored-by: Milosz Wasil…
nuclearcat added a commit to nuclearcat/kernelci-pipeline that referenced this pull request Jul 24, 2024
* src/scheduler: store error message when job fails with "submit_error"

It is helpful for debugging to capture the error message when
the scheduler fails to submit a job to the runtime.
Store the error message in the `data.error_msg` field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: Set minimum kernel version for DT kselftest to 6.7

The test was introduced upstream in version 6.7, so no point in trying
to run it on earlier versions.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* configs/: Update volteer device

Update volteer devices according to lab availability

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary templates: detailed output for active/inactive regressions

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: new presets for active regressions

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: update CHANGELOG

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* data: chmod -R 777 ./data/output to avoid permission error

Avoid errors like

PermissionError: [Errno 13] Permission denied: '/home/kernelci/data/output/stable-rc-boot.html'

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: move code to _get_logs

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: use ThreadPoolExecutor to fetch logs

Fetching logs is the bottleneck of the script. Fetch them in parallel
with ThreadPoolExecutor.

Signed-off-by: Helen Koike <helen.koike@collabora.com>
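
A minimal sketch of fetching logs in parallel with `ThreadPoolExecutor`. The `fetch` callable, the `max_workers` default and the choice to store `None` for a failed download are assumptions for illustration; this is not the result_summary implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_logs(urls, fetch, max_workers=8):
    """Fetch every URL in parallel and return {url: log text or None}.

    fetch(url) is any callable that downloads one log; a failing download
    yields None instead of aborting the whole run.
    """
    urls = list(urls)

    def safe_fetch(url):
        try:
            return fetch(url)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map returns results in the same order as the inputs.
        return dict(zip(urls, executor.map(safe_fetch, urls)))
```

Because fetching is I/O-bound, threads are sufficient here: each worker spends most of its time waiting on the network.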

* result_summary: fix result presets

stable-rc-build-failures and stable-rc-boot-failures weren't querying
specifically for test failures.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src/regression_tracker: rework regression detection

Take into account "active" and "inactive" regressions when creating them
and when processing new passed or failed nodes.

When a node passes, it checks if it "inactivates" an existing "active"
regression. When a node fails, it checks if it needs to create a new
regression or update an existing "active" one.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
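
The pass/fail handling described above can be sketched as a small state update. Everything here is an illustration under stated assumptions: regressions are kept in a dict keyed by test path, `previous_result` is the result of the prior run of the same test, and `process_node` and its return labels are hypothetical names. A failure only opens a regression when the previous run passed.

```python
def process_node(node, active_regressions, previous_result):
    """Update active_regressions for one finished node and describe the action.

    active_regressions: {tuple(path): {"active": bool, "node_sequence": [ids]}}
    previous_result: result of the previous run of this test, or None.
    """
    key = tuple(node["path"])
    regression = active_regressions.get(key)

    if node["result"] == "pass":
        if regression is None:
            return "no-op"
        # A pass "inactivates" the existing active regression.
        regression["active"] = False
        del active_regressions[key]
        return "inactivated"

    if node["result"] != "fail":
        return "ignored"

    if regression is not None:
        # Re-run of a test that already has an active regression.
        regression["node_sequence"].append(node["id"])
        return "updated"
    if previous_result == "pass":
        active_regressions[key] = {"active": True, "node_sequence": [node["id"]]}
        return "created"
    return "not-a-regression"
```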

* src/regression_tracker: link failed nodes to active regressions

When a failed node generates a regression, or when it's a re-run of a
run that generated a still active regression, link the node to the
regression id.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: support for date ranges for creation and update

New command line options to let the user specify date ranges for node
creation and last update: --created-from, --created-to,
--last-updated-from, --last-updated-to

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: support for date ranges for creation and last update

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: support for extra query parameters in cmdline

New command line option: --query-params to specify a set of extra query
parameters to complete or override preset parameters.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: html markup in some preset titles

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: update and move to docs folder

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: move parameter loading and processing to 'setup'

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: refactor and split into two classes (single, run)

Split the ResultSummary class into a base class and two child classes:
ResultSummarySingle and ResultSummaryLoop (only a stub at this point).

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: WIP initial implementation of the "loop" command

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: huge refactoring

Implement "summary" (single-shot) and "monitor" (loop) modes based on
preset parameters instead of on the command-line main command.

Split the logic into multiple files, move all monitor-specific and
summary-specific code to independent files, common code in a separate
file.

Full of kludges, I don't like how this is looking so far, might consider
reimplementing it without any dependencies on pipeline code.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: fix markup and indentation

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: new generic templates for monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: examples for "monitor" and "summary" modes

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: summary and monitor modes

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: fix generic regression report

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: summary: fix last_updated option handling

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: embed css stylesheet in html files

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* regression_tracker: [trivial] make regression active by default

Fixup for commit fcb29501663d78920bcd129bd57c36b9af624bc4
If the "result" field is ever made non-optional in the models we can
probably remove this.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* regression_tracker: [trivial] set default empty node sequence

Fixup for commit fcb29501663d78920bcd129bd57c36b9af624bc4
If the "node_sequence" field is ever made non-optional in the models we
can probably remove this.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: add cmdline option --output-dir

Introduce a new command-line option: --output-dir, and rename the old
--output to --output-file.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: command-line options change

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: jobs-chromeos: remove meaningless Tast tests

Several Tast tests can only fail in the context of KernelCI:
* `video.PlatformDecoding.v4l2_state*_vp9_0_svc` do not actually exist,
  causing the whole test job to fail
* `platform.DLCService*` and `platform.Memd` rely on features only
  present in the downstream Chrom{e,ium}OS kernel (see b/247467814 and
  b/244479619 for those having access to Google's issue tracker)
* `kernel.ConfigVerify.chromeos` relies on downstream-only config
  options such as `CONFIG_SECURITY_CHROMIUMOS` and other similar ones,
  and therefore can only fail when testing upstream kernels

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: scheduler-chromeos: don't execute non-working Tast tests

Currently, HEVC-related tests are known to either fail or be skipped as
ChromeOS doesn't yet handle hardware decoding of HEVC media. This is
expected to be fixed at some point though, so we're keeping the job
definitions and only remove the corresponding scheduler entries in order
to reinstate those jobs when relevant.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: exclude Tast tests known to always fail

Several decoder tests always fail on all platforms where they're
executed, adding only noise to otherwise useful test results. Disable
those for improving the quality of the results.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: chromeos: add special case for pre-6.7 qcom codec tests

On Qualcomm-based ChromeBooks (`trogdor` being the only model in
Collabora's lab), we noticed systematic failures of all
`vp9_*_frm_resize` and `vp9_*_sub8x8_sf` tests when using a kernel up to
6.6. With 6.7 and above, all of those tests (except one) now pass. It
therefore makes sense to exclude those on pre-6.7 kernels so we don't
report known failures and get rid of some noise.

This involves "duplicating" affected test jobs (although I did my best
to minimize that) and setting rules so only the working variant is
executed, based on the version of the kernel being tested.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* lava_callback: Compress the log files to save storage space

As storage space in cloud and egress have high costs,
better to compress potentially large files.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
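
A minimal sketch of compressing a log before upload, using Python's standard `gzip` module. The helper name `compress_log` and the sample log are made up; the actual lava_callback code may use a different format or compression level.

```python
import gzip


def compress_log(text):
    """Return the log text gzip-compressed as bytes."""
    return gzip.compress(text.encode("utf-8"))


# Repetitive console output, as boot logs often are, compresses well.
sample_log = "[    0.000000] Booting Linux\n" * 1000
blob = compress_log(sample_log)
restored = gzip.decompress(blob).decode("utf-8")
```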

* tests: Add basic yaml validation

Add a yaml load step to catch yaml issues earlier

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: drop stoneyridge/pineview naming in platforms anchors

The "stoneyridge" and "pineview" naming used in the Chromebook platform
anchors refers to ChromiumOS specific config fragments, but doesn't
necessarily match the actual platform of all the devices listed.
Use more generic names to distinguish amd and intel Chromebooks.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: rename test job anchors that use chromeos specific configs

Rename test job anchors that use chromeos specific kernel configurations
to include the 'chromeos' infix.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: add baseline tests

Enable the baseline tests on all the supported Chromebooks with their
default kernel configuration.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: drop stoneyridge/pineview naming in job defs

The "stoneyridge" and "pineview" naming used in some Chromebook job
definitions refers to ChromiumOS specific config fragments, but
doesn't necessarily match the actual platforms targeted by the jobs.
Replace all occurrences with more generic intel/amd naming.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: drop chromeos infix from baseline jobs

Keeping different job names for tests targeting different kernel configs
might cause too much duplication. Drop the 'chromeos' infix from the job
name for the tests using the chromeos config fragment. Users will be
able to filter the results using the data.defconfig/data.config_full
fields anyway.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary: post-process results for summary and monitor modes

Split the post-processing of nodes to a common function that can be used
for both summary and monitor modes. Currently, post-processing involves
only the collection of logs.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: update and fix presets and templates

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* doc/result-summary-CHANGELOG: update

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config/pipeline.yaml: enable 'BayLibre' lab

Add lab configuration for BayLibre.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add `lab-baylibre` runtime

Add runtime argument `lab-baylibre` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to BayLibre.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-x86-baylibre` job

Add job configuration `baseline-x86-baylibre` for BayLibre.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-armel-baylibre` job

Add job configuration `baseline-armel-baylibre` for BayLibre.
Add scheduler entry and platform config as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline: enable `android` tree and build configs

Monitor linux `android` tree. Add build configs for `android-mainline`
branch.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/pipeline.yaml: add kbuild definitions for android-mainline

Add kbuild jobs to compile the kernel for android-mainline branch

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/pipeline.yaml: add entries to schedule to build android-mainline

Add entries to `scheduler:` section to run the builds for
android-mainline.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: fix node filter in monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* kernelci.toml: set `checkout` node timeout to `180 min`

The currently set `60 min` timeout is not enough, as some
`kbuild` jobs and their sub-tests take around 2 hrs to
complete after being submitted to the runtime.

Here is an example from staging. See the information
for a `checkout` and its child nodes:

| id                       | name                | created                    | updated                    | timeout                    |
|--------------------------|---------------------|----------------------------|----------------------------|----------------------------|
| 661c9d59b60b785eb9fc42b0 | checkout            | 2024-04-15T03:22:01.317000 | 2024-04-15T03:51:03.870000 | 2024-04-15T04:22:01.284000 |
| 661c9d97b60b785eb9fc42b4 | kbuild-gcc-10-arm64 | 2024-04-15T03:23:03.399000 | 2024-04-15T03:50:15.031000 | 2024-04-15T09:23:03.399000 |
| 661ca3f7b60b785eb9fc4ead | baseline-arm64      | 2024-04-15T03:50:15.304000 | 2024-04-15T05:09:45.247000 | 2024-04-15T09:50:15.304000 |

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
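
The timestamps in the table can be checked with a few lines of Python. The values are copied from the table; the variable names are only for illustration. The last child update lands about 1 h 48 min after the checkout was created, roughly 48 min past its 60 min timeout and well within 180 min.

```python
from datetime import datetime

checkout_created = datetime.fromisoformat("2024-04-15T03:22:01.317000")
checkout_timeout = datetime.fromisoformat("2024-04-15T04:22:01.284000")
baseline_updated = datetime.fromisoformat("2024-04-15T05:09:45.247000")

# Time from checkout creation until the last child test was updated.
elapsed = baseline_updated - checkout_created
# How far past the old checkout timeout that last update happened.
overrun = baseline_updated - checkout_timeout

print(f"elapsed: {elapsed}, past old timeout: {overrun}")
```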

* result_summary: add email report capabilities for monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: plain text single report templates

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: chromeos: add baseline-nfs tests

Enable the baseline-nfs tests on all the supported Chromebooks, with
both the default and the chromeos kernel configurations.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src/timeout: set `checkout` result

For `TIMEOUT` mode, set the `checkout` node result to `fail`
if its state is `running`, as that means the code checkout was still
in progress when the node timed out. Set it to `pass` if its state
is anything other than `running`.
Set the `checkout` node result to `pass` if the mode is `DONE`, as
that means `checkout` has been in the `available` or `closing`
state and could successfully complete the source code checkout.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* regression_tracker: bugfix, failed test with no prior runs

Handle the case of a failed test run when it's the first occurrence of
that test case. Consider it "not a regression" for now, since we're
defining a regression as a "breaking point" between a success and a
failure.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: platforms-chromeos: fix dalboz device type

Due to a copy/paste mishap, the device type for
`asus-CM1400CXA-dalboz` had a trailing `_chromeos`, leading LAVA to fail
finding the correct device type, and no job from the new system running
on this platform.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: run Tast tests only on 5.4+

Current ChromeOS images have `ext4` filesystems using options not
present in 4.19. Therefore tests cannot run on kernels that old, and
this leads to false positives in corrupt device identification, so we
should only run those tests on 5.4 and later kernels.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: platforms-chromeos: drop non-existent platform

`hp-x360-12b-ca0500na-n4000-octopus` isn't a device type available in
Collabora's LAVA lab, so let's drop its definition.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: exclude android tree from kbuild jobs

Only Android-specific kbuild jobs should run for this tree, let's not
overload our system with unneeded builds.

Take this opportunity to limit mediatek kbuilds to 6.1+ as that's the
earliest version that has upstream support for at least one of our
devices.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src/timeout: a bug fix in `_submit_lapsed_nodes`

Fix a glitch in the code related to setting `checkout`
node result.

Fixes: 361fc0d ("src/timeout: set `checkout` result")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* pipeline.yaml: Update early access FQDN

We are moving k8s from eastus to westus3 as it is cheaper

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/tarball: fix `_kdir` in `update_repo`

Fix the below error:
```
kernelci-pipeline-tarball |   File "/home/kernelci/./pipeline/tarball.py", line 79, in _update_repo
kernelci-pipeline-tarball |     kernelci.shell_cmd(f"rm -rf {self._kdir}")
kernelci-pipeline-tarball |                                  ^^^^^^^^^^
kernelci-pipeline-tarball | AttributeError: 'Tarball' object has no attribute '_kdir'
```

Fixes: 0a2fe9c ("src/patchset.py: Implement Patchset service")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: fix method to get child nodes recursively

`TimeoutService._get_child_nodes_recursive` is used to get
pending child nodes recursively for closing and timed-out
nodes. It overwrites the result while being called recursively.
Fix the method to make it work properly.
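
The kind of bug described here can be avoided by sharing one accumulator
across the recursive calls. The sketch below is only an illustration under
stated assumptions: the function is standalone, and `children_of` is a
hypothetical in-memory mapping that stands in for the API lookups the
service performs.

```python
def get_child_nodes_recursive(node_id, children_of, found=None):
    """Collect all descendants of node_id into a single shared list."""
    if found is None:
        found = []
    for child_id in children_of.get(node_id, []):
        found.append(child_id)
        # Pass the same list down so that deeper calls extend it
        # instead of replacing what was already collected.
        get_child_nodes_recursive(child_id, children_of, found)
    return found
```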

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: rename "armel" arch to "arm"

`armel` has various meanings depending on the system: for ChromeOS, it
is ARMv7, while in Debian it's ARMv{5T,6}. Moreover, this project is
*Kernel*CI and the kernel uses `arm` for all 32-bits ARM devices. In
order to avoid confusion (including those wondering what the heck does
`armel` mean), let's rename `armel` to `arm`.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: use per-system arch property where relevant

With the new `*arch` fields present in the platform configurations, we
don't have to hardcode the architecture strings in some specific cases.
Let's adapt the config files so we use `{cros,deb,k}arch` wherever it
makes sense.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src/timeout: set timed-out `checkout` result

Set the result of a timed-out `checkout` node to `incomplete`
if it is still in the `running` state, as this denotes that
the node timed out while the checkout was still in progress.
Also set the error-related information, i.e. `error_code`
and `error_msg`.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/tarball: update checkout node when update repo fails

The tarball service updates the source code repo and creates a tarball.
If the repo update fails even on the second attempt,
it means the source code could not be checked out.
Hence, update the `checkout` node with state `done` and
result `fail`. Also, set appropriate error information
in the `data` field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: enable collabora-next tree and build config

Monitor the collabora-next tree. Add build config for the for-kernelci
branch.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: enable acpi kselftest on collabora-next tree

Run the ACPI kselftest on the for-kernelci branch of the collabora-next
tree.

See: https://lore.kernel.org/linux-kselftest/20240308144933.337107-1-laura.nao@collabora.com/T/#t

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary: restore missing split_query_params function

Restore this function that was accidentally removed during the last
refactoring.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* lava_callback: Don't upload empty files to Azure

There is no use for a lot of empty files on Azure;
they only complicate cleanup.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: unify preset and output names

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: update preset for aferraris

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: new presets for laura.nao

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: fixes and new presets for nfraprado

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: fix arch query parameters

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* k8s: Lot of deployment tested fixes

Fixes in yaml files for k8s production deployment.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result-summary presets: Fix build failure and regression monitors

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* result_summary: added debug traces to the monitor

Show detailed info of the node filterings in real time.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: fix corner case bug when no logs are found

Cover rare case where neither the node nor any of its parents up to the
checkout node have any log artifacts.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: refine stable-rc presets

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: add regression info to test reports

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: escape log snippets

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src: lava_callback: add device ID to node data

It can be useful to know the exact device on which a job ran, without
having to open the LAVA job page. This is done by querying the device ID
from the callback data and appending it to the node data.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: lava_callback: upload raw callback data as well

Debugging callback issues is complex due to the raw data not being saved
after processing. This change ensures we save the callback data as a
JSON file in order to ease development.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* DONOTMERGE lava_callback: add debug statements

Why the heck doesn't this just work???

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* result_summary_templates: fix error 'node' is undefined

The object is named test and not node, so s/node/test

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/runtime/kunit: set architecture info

Set architecture field for `kunit` test
nodes.
If no `arch` argument is supplied, kunit takes
`um` (User Mode Linux) as architecture to run
tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: count running child jobs of build nodes

Add a method to count running jobs of `kbuild`
nodes i.e. jobs being submitted after successful
builds. For example `baseline` or `tast` jobs.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: handle closing `checkout` node differently

Usually, `checkout` should be transited to `done` state
when all its child nodes are completed.
In case of closing `checkout`, take into account
running child jobs of build nodes before transiting
its state to `done`. Otherwise, `checkout` will be
assigned to `done` state even if some child jobs are still
running.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: handle holdoff reached `checkout` node differently

Usually, available `checkout` for which holdoff is
reached should be transited to `done` state only when
all its child nodes are completed.
In case of such `checkout` node, take into account
running child jobs of build nodes before transiting
its state to `done`. Otherwise, `checkout` will be
assigned to `done` state even if some child jobs are
still running.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* Revert "DONOTMERGE lava_callback: add debug statements"

This reverts commit 5ed8218d99840373bbba5830b1976813b52bf4b1.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* Create dependabot.yml

* result_summary_templates: make generic-test-failures generic to all
results

The generic-test-failures templates can be used to show general results
by just replacing the name "failures" with "results". This makes them
easier to reuse for communities that want presets listing all results of
the tests, so:

	s/generic-test-failures/generic-test-results

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result-summary.yaml: add preset to list android build tests

Since we now build android, add a preset to allow result-summary.yaml to
list all build results from Android tree.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* tarball: Implement checkout for specific commit

We often need a specific commit rather than ToT, so implement this.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* jobs-chromeos.yaml: Disable module compression for every kernel version

Commit d4bbe942098b ("kbuild: remove CONFIG_MODULE_COMPRESS"),
introduced in kernel v5.13, substituted CONFIG_MODULE_COMPRESS=n for
CONFIG_MODULE_COMPRESS_NONE=y as the way to disable module compression.
Since module compression causes "Invalid ELF header magic: != ELF"
errors during boot on the ChromeOS base config, add the missing config
to disable module compression on kernels > v5.13 as well.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* src: lava_callback: reduce callback data size

The callback data is quite large, especially as it includes the full log
which we already upload separately. By dropping it and compressing the
whole file with `gzip` we can avoid wasting too much storage space.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: lava_callback: don't leak secret token

The callback data contains the secret token's value, which shouldn't be
leaked. Ensure we drop it from the uploaded data.
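
A minimal sketch of this kind of redaction follows. The key name `token`
and the helper name are assumptions made for illustration; the actual
field name in the LAVA callback payload is not stated in this message.

```python
import copy


def redact_callback_data(data, secret_key="token"):
    """Return a copy of the callback data without the secret field.

    `secret_key` is a hypothetical name used for this sketch only.
    """
    cleaned = copy.deepcopy(data)
    # pop() with a default does nothing if the key is absent.
    cleaned.pop(secret_key, None)
    return cleaned
```

Working on a copy leaves the original callback data usable for
authentication checks that may still need the token.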

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: platforms-chromeos: use new cros-flash image

This ensures we use the new version of the `install-modules` script.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: regression_tracker: add the "device" field to regression data

This can be helpful. We're not using it as a search param though, as we
don't want to narrow down the search that much, using the platform only
is better.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: result_summary_templates: report device used for job

This information is now available, and it can be useful to know the
affected device without having to look at the LAVA job details.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* kubernetes: Update deployment recipe

Update list of labs and add KCI_INSTANCE variable.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava-callback: Limit threads of lava-callback

Due to the inrush of LAVA callbacks and slow Azure Files
processing, we need to make sure we don't spawn too many
threads.
Also add a hard memory limit of 1 GByte.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: add presets for fluster test

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Make template generic for all v4l2 tests
- Rebase on main

* result_summary presets: make the name of fluster test generic

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: enable first fluster test for mt8195-cherry-tomato-r2

Enable first fluster test, AV1-TEST-VECTORS for mt8195-cherry-tomato-r2.
Run the test on mainline and next until more trees are added.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Create generic v4l2-decoder-conformance-job and use anchors from it
- Update the rootfs address
- Move anchor to _anchor
- Update with nitpicks

* config: jobs-chromeos: Add kernelci tree for testing purpose

Remove this commit before merging.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: chromeos: Enable cpufreq kselftest

Enable cpufreq kselftest on all the trees and branches.

Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>

* result_summary presets: fix preset for kselftest-dt failures monitor

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: new presets for kselftest-cpufreq

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: mt8195-cherry-tomato-r2: enable all fluster tests for all branches

Add all the trees and branches on which the tests will be run. Enable
all the tests for tomato.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- The build config cannot be added yet. Just list the trees, it will only use
  the branches configured in build_configs:
  - mainline will use master
  - next will use master
  - collabora-chromeos-kernel will use for-kernelci
  - media will use master and fixes
- Remove kernelci tree as it was added just for testing purpose

* config: mt8183-kukui-jacuzzi-juniper-sku16: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

jacuzzi

* config: mt8186-corsola-steelix-sku131072: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: mt8192-asurada-spherion-r0: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Don't specify the platforms manually as they are already mentioned in
  test-job-arm64-mediatek

* config: sc7180-trogdor-kingoftown/lazor-limozeen: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Use test-job-arm64-qualcomm instead and create separate jobs for
  qualcomm devices
- Don't specify platforms manually as they are already mentioned in
  test-job-arm64-qualcomm

* build(deps): bump uwsgi from 2.0.21 to 2.0.22 in /docker/lava-callback

Bumps [uwsgi](https://uwsgi-docs.readthedocs.io/en/latest/) from 2.0.21 to 2.0.22.

---
updated-dependencies:
- dependency-name: uwsgi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* pipeline.yaml: Add stable-rc build variants

Add more build variants for stable-rc tree to match legacy system.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary: add error classification

Classify errors according to patterns in the logs

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary presets: add collabora-chromeos-kernel and media trees for fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: Use media-stage instead of media-tree

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config/pipeline: enable android branches from legacy

Enable all android branches from the legacy system

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* trigger: Add exclude/include tree list for trigger

As we need to restrict the list of kernels run on staging,
we need an option that allows that.
It would also be good to exclude staging kernels from the production
kernel list.

So in case of staging we need to run kernels only from tree "kernelci"
and sometimes something else, for example "mediatek".
Option will look like:

--trees kernelci,mediatek
or
--trees kernelci

On production we need to exclude trees kernelci and buggytree:
--trees !kernelci,buggytree
or just kernelci:
--trees !kernelci

The purpose of this option is to cope with our limited compile capacity:
right now staging and production both compile a very large set
of kernels, and we need to reduce this amount to cut costs.
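
The include/exclude semantics described above can be sketched as follows.
This is an illustration only, not the trigger's actual implementation:
both function names are hypothetical, and a leading `!` is assumed to
apply to the whole comma-separated list, as in the examples.

```python
def parse_trees_option(value):
    """Split a --trees value into (exclude, names)."""
    exclude = value.startswith("!")
    names = {
        name.strip()
        for name in value.lstrip("!").split(",")
        if name.strip()
    }
    return exclude, names


def tree_allowed(tree, value):
    """Decide whether `tree` should run for a given --trees value."""
    if not value:
        # No option given: every tree is allowed.
        return True
    exclude, names = parse_trees_option(value)
    return (tree not in names) if exclude else (tree in names)
```

With `--trees kernelci,mediatek` only those two trees pass the check,
while `--trees !kernelci,buggytree` lets every tree through except those
two.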

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: platforms-chromeos: use CrOS R124 files

ChromeBooks were upgraded with a new image based on ChromiumOS R124, so
we must use those files now.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: drop non-existent Tast tests

Those were removed between R120 and R124 and therefore cause test
failures with the new images.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* result_summary presets: fix acpi kselftest presets

We're interested in catching regressions and failures in both the
kselftest-acpi test suites and its test cases. Match the nodes by group
in the presets accordingly.
Fix template used by the failure monitor preset.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src: update return values of `APIHelper.receive_event_node`

`APIHelper.receive_event_node` method is used to receive
node data from PubSub event. The method has been updated
to return `is_hierarchy` flag as well which represents
events related to node hierarchy.
Update pipeline services using the method accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: refine presets for v4l2-decoder-conformance

Modify the regression preset to monitor regressions on both the
v4l2-decoder-conformance test suites and its test cases, by matching the
nodes by group instead of by name.
Also, change the failure preset to monitor for all errors caused by
runtime errors.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary presets: add summary presets for v4l2-decoder-conformance

Add summary presets to fetch regressions and failures on
v4l2-decoder-conformance tests. Two of the presets are the same used by
the monitor; add one additional preset to fetch all the failures on
both the test suites and their test cases.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* lava_callback.py: Remove error_code/error_msg on lava-callback

Sometimes, due to congestion, a node might be marked as timed out, but
its result may still arrive late, and we need to use it properly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: fix dt kselftest presets

Fix the dt kselftest preset, just like was done for the acpi one, as the
current preset doesn't match the actual results we're interested in.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* doc/connecting-lab: refine documentation

Refine documentation for connecting LAVA labs
and submitting jobs to the lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* lava_callback: Sometimes we get totally invalid log file uploaded

Most likely the problem lies in Flask threading, and callbacks
are possibly getting mixed up. This commit attempts to introduce
several countermeasures against that.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* doc: add `_index.md` page

Add index documentation page.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc: add `pipeline-details` page

Move `pipeline-details` documentation from the API
repository to this repo to make it close to the source.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc/connecting-lab: adjust `weight` property

Change `weight` property of existing doc page to
accommodate with transition of pipeline related docs
to pipeline repo.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc: add `developer-documentation` page

Add developer manual documentation.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add lab config for Qualcomm

Add an entry to `runtimes` section for Qualcomm
lab configurations.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-x86` job for qualcomm

Add job configuration `baseline-x86-qualcomm` for
running baseline job in Qualcomm LAVA lab.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add lab-qualcomm runtime

Add runtime argument `lab-qualcomm` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to Qualcomm LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-arm64` job for qualcomm

Add job configuration `baseline-arm64-qualcomm` for
running baseline job for `arm64` in Qualcomm LAVA lab.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* pipeline.yaml: Update RISC-V configs

1) The rv32 defconfig doesn't exist; remove it
2) nommu_k210_defconfig has modules disabled

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava_callback.py: Sanitize lava log data

As we use this data in reports, let's remove all
non-printable characters, as they confuse Grafana, browsers and other tools.
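
One way to do this kind of filtering is sketched below. It is not the code
from `lava_callback.py`; the function name is hypothetical, and keeping
newlines and tabs is an assumption made so that the log stays readable.

```python
def sanitize_log(text):
    """Drop non-printable characters, keeping newlines and tabs."""
    return "".join(
        ch for ch in text
        if ch.isprintable() or ch in "\n\t"
    )
```

Note that `str.isprintable()` treats `\n` and `\t` as non-printable, which
is why they are allowed explicitly.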

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config/runtime/kunit.jinja2: fix result map

Fix result map for skipped tests. Initially, API
didn't have `skip` available node result in the schema.
That's why it was mapped to `None` result. But now API
has `skip` result to denote skipped tests.
Fix the result mapping accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: jobs-chromeos: Add lab-setup fragment

Add the lab-setup fragment to the chromebook builds, which contains the
architecture independent kernel configs needed to run tests on the
platform. Notably this disables IP autoconfig by the kernel.

The result of this change is that the 12 seconds boot delay and the
consequent deferred probe pending warnings will no longer happen on any
platform. Particularly on mt8186-corsola-steelix-sku131072 (due to a
different network adapter being used) on which it was still happening.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* lava_callback: bump up slightly threads number

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: enable watchdog reset test on Chromebooks

Add a basic test to verify watchdog reset functionality. Enable the
test on all ARM64 and AMD x86_64 Chromebooks. For Intel
Chromebooks, enable the test only on octopus, as ACPI PM Timer on the
other devices has been disabled in coreboot.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src/send_kcidb: use schema version 4.3

Test status `MISS` was added to KCIDB in schema
v4.2 and supported by the latest version i.e. v4.3.
Hence, use the latest version for submission as
API may send a few tests with "MISS" status.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* send_kcidb: re-structure code for parsing checkout node

Move code for parsing checkout node to a separate
method.
Add `valid` field to parsed checkout node. It denotes
if source code was successfully checked out.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: print more information on invalid data

Print details for invalid revision data for the
sake of debugging.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: optimize `kcidb` import

Remove redundant `kcidb` import and adjust
kcidb Client call accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: remove keys with `None` values

KCIDB doesn't allow `None` as field value.
Remove all optional fields with `None` value
to make it valid data for submitting to KCIDB.
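
A minimal sketch of such a cleanup, assuming a flat dictionary (the helper
name is hypothetical and nested objects are not handled here):

```python
def drop_none_fields(record):
    """Return a copy of `record` without keys whose value is None."""
    return {key: value for key, value in record.items() if value is not None}
```

Values such as `0`, `False` or an empty string are kept, since only `None`
is removed.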

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: add `kcidb_test_suite` property

Every KernelCI test will be mapped to a unified
test suite for KCIDB data submission.
Add `kcidb_test_suite` property to test job
definitions in YAML configuration files.
The added property will store the mapped
KCIDB test suite name.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: parse and submit node test and build data

Listen to all the node events with node state
`done` or `available` and submit the node to KCIDB.
Parse node received from the event and create KCIDB
schema compatible object based on type of the node
i.e. checkout, build or test.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: set `log_excerpt` for builds and tests

Fetch logs from compressed log file(*.log.gz) URL
and send last 16*1024 characters for setting `log_excerpt`
field for build and test nodes as it is the max allowed
length of the KCIDB field.
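
The truncation step can be sketched as below. This is an illustration
under stated assumptions: the log is taken as already-downloaded gzip
bytes, the function and constant names are hypothetical, and UTF-8 with
replacement of bad bytes is an assumed decoding choice.

```python
import gzip

# Limit stated in this commit message: 16*1024 characters.
MAX_LOG_EXCERPT = 16 * 1024


def log_excerpt_from_gzip(compressed):
    """Decompress a *.log.gz payload and keep only its tail."""
    text = gzip.decompress(compressed).decode("utf-8", errors="replace")
    # Slicing from the end returns the whole text if it is shorter
    # than the limit.
    return text[-MAX_LOG_EXCERPT:]
```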

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/jobs-chromes: add kcidb test suite property for watchdog test

Add KCIDB test suite mapping for `watchdog_reset` test.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* lava_callback.py: disable log removal from callback data

We need it for investigations if we have any critical data
loss during log sanitizing.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/send_kcidb: add error info to build nodes

Add error metadata fields such as `error_code` and
`error_msg` to `misc` field for build nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: add watchdog-reset presets for mainline/next

Add monitor and summary presets to track the results from the watchdog
reset test on the mainline and next trees.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* pipeline.yaml: Fix fluster rootfs URL

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/send_kcidb: get error metadata for failed/incomplete tests

Tweak condition to get error metadata for test nodes.
It should get error info for incomplete nodes as well
and not just failed nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: send tests only if KCIDB test mapping exists

All test suite definitions must have `kcidb_test_suite`
property i.e. KCIDB test suite mapping.
Only send tests for which the mapping is found.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* tests/validate_yaml: add validation for KCIDB mapping

To submit KernelCI generated data to KCIDB, it is required
to have a mapping for all the job definition with
`kcidb_test_suite` property.
Add validation to ensure all the jobs have a mapping
present to avoid missing data submission.
This check is to notify test authors trying to enable tests
in maestro to include the required property for the mapping
in their definition.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add qcs6490-rb3gen2 boot test

Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>

* config: chromeos: Enable kselftest-dt on Qualcomm platforms

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* pipeline.yaml: Add one um build for android trees

As requested by the Android team, it would be good to check UM builds
for breakages as well.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: use `kind=job` for test suites

As part of re-structuring test hierarchy, `Job` model
has been introduced for test suite/job nodes.
It uses node kind `job`.
Update test configurations in `pipeline.yaml` and
`jobs-chromeos.yaml` to use `kind=job` to
generate job nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: provide `kind` value for child tests

In case of submitting test hierarchy, child nodes by default
inherit `kind` value from parent node.
As we are re-structuring test hierarchy, test suite/job nodes
will have `kind=job` where its child test nodes will have
`kind=test`. Provide `kind` field explicitly to test result
hierarchy to preserve different kind value than the parent
node.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: fix `NameError`

Fix the below error in `_submit` method:
```
Traceback (most recent call last):
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 287, in main
    job.submit(results)
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 138, in submit
    self._submit(result)
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 265, in _submit
    return node
NameError: name 'node' is not defined
```

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: evaluate job node result

Evaluate job node result from child node results if
`null` result is received from test result parser.
For example nodes such as `fortify`:
https://staging.kernelci.org:9000/viewer?node_id=6670ab43d0b7694b399897c4

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: fix parsing of KUnit log file

Handle both compressed(gzip) and plain text log files
for getting log excerpt.
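
One possible way to handle both formats is to look at the gzip magic
bytes, as in the sketch below. Whether the service detects the format this
way is an assumption; the function name is hypothetical.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def decode_log(raw):
    """Return log text from either gzip-compressed or plain bytes."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return raw.decode("utf-8", errors="replace")
```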

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: HTTP exception handling for log excerpt

Add HTTP exception handling for getting
log excerpt data.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: platforms-chromeos: Add serial delay for some Mediatek platforms

Add test_character_delay to the Spherion, Tomato and Steelix platforms
to workaround the fact that they're sometimes unable to process serial
input fast enough, resulting in mangled commands and consequently flaky
test results, as described in
https://github.com/kernelci/kernelci-project/issues/366.

The right place to do this change would be in the device-type template
as described in LAVA's documentation [1]. This overriding in KernelCI is
meant only as a temporary workaround to verify whether this fixes the
issue. If it does, then we'll do it in LAVA upstream instead.

[1] https://docs.lavasoftware.org/lava/debugging.html#differences-in-input-speeds
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config: chromeos: Enable error-logs kselftest for MediaTek Chromebooks

Run the error-logs kselftest on MediaTek Chromebooks. This test is
currently under review upstream [1] so, in the meantime, it has been
added to the collabora-next tree so it can prove its value by helping to
detect issues upstream.

[1] https://lore.kernel.org/all/20240423-dev-err-log-selftest-v1-0-690c1741d68b@collabora.com

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config/pipeline.yaml: enable CIP lab

Add configuration for LAVA CIP lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add baseline-x86 test for CIP

Add `baseline-x86-cip` test to be submitted to CIP
LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add `lab-cip` runtime

Add runtime argument `lab-cip` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to CIP LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: enable `job` node submission to KCIDB

Parse newly added job node and its child tests
for KCIDB submission.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: don't submit `setup` test suite nodes

`setup` test suite has been introduced to store test results
for environment setup checks before running actual test suite.
KCIDB doesn't require `setup` test suite result as long as
main test job result is submitted.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: add a check before sending data

Check if parsed data is available before
sending revision data to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: fix logs

Fix log statement about submitting node to KCIDB
as we are not sending to KCIDB every node we receive
an event for.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: handle skipped tests

Do not retrieve artifacts or metadata from parent
node for skipped tests as in practice only kernel
revision, test runtime and platform will be
available for skipped tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary/utils: ignore failures on log retrieval

Make the script continue running if there was an error fetching a test
log.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* doc/developer-documentation: add docs for enabling new tests

Add developer documentation for enabling new tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* Fix links after docs page migration

Documentation has been migrated to the "docs.*" subdomain.

Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>

* pipeline.yaml: Add kcidebug fragment

Add useful low-overhead debug option to kernel,
and test on most x86 boards we have available,
with minimal baseline tests.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* configs: update gcc-10 to gcc-12

As we upgrade compiler images, we need to update the gcc version.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* regression_tracker: workaround: match node paths programmatically

Don't use 'path' as an api search parameter. The use of lists as query
parameters (path is a list) is undefined. Instead, do the filtering in
code.
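
A minimal sketch of the in-code filtering described above, assuming nodes are plain dicts whose `path` field is a list of strings. `filter_nodes_by_path` is a hypothetical helper for illustration, not the tracker's actual function.

```python
def filter_nodes_by_path(nodes, wanted_path):
    """Return the nodes whose 'path' list equals wanted_path."""
    wanted = list(wanted_path)
    return [node for node in nodes if node.get("path") == wanted]


# Example data; in the tracker these would come from an API query
# that leaves out the 'path' parameter.
nodes = [
    {"id": "1", "path": ["checkout", "kbuild-gcc-12-x86", "baseline-x86"]},
    {"id": "2", "path": ["checkout", "kbuild-gcc-12-arm64"]},
]
matching = filter_nodes_by_path(nodes, ["checkout", "kbuild-gcc-12-arm64"])
print([node["id"] for node in matching])
```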

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: remove qemu jobs from lab-qualcomm

QEMU jobs use container pulled from hub.docker.com. After the lab move
pulling from this registry is no longer possible at Qualcomm. This patch
disables QEMU jobs from Qualcomm lab.

Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>

* validate_yaml.py: Improve pipeline validation

Add validation that every scheduler entry has a matching job entry
(this is the critical check) and that every job entry has at least
one entry in the scheduler.
Fix one entry detected by this validation.
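
The two checks can be sketched roughly as follows, assuming the loaded YAML has a `jobs` mapping and a `scheduler` list whose entries carry a `job` key. `check_scheduler_jobs` is a hypothetical name, not the function used in validate_yaml.py.

```python
def check_scheduler_jobs(config):
    """Cross-check scheduler entries against job definitions.

    Returns (missing_jobs, unscheduled_jobs) as two sorted lists of names.
    """
    jobs = set(config.get("jobs") or {})
    scheduled = {entry["job"] for entry in config.get("scheduler") or []}
    # Scheduler entries that point at a job which is not defined.
    missing_jobs = sorted(scheduled - jobs)
    # Job definitions that no scheduler entry ever references.
    unscheduled_jobs = sorted(jobs - scheduled)
    return missing_jobs, unscheduled_jobs


config = {
    "jobs": {"baseline-x86": {}, "kbuild-gcc-12-x86": {}},
    "scheduler": [{"job": "baseline-x86"}, {"job": "baseline-arm64"}],
}
missing_jobs, unscheduled_jobs = check_scheduler_jobs(config)
print(missing_jobs, unscheduled_jobs)
```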

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* pipeline.yaml: Add broonie(Mark Brown) trees to pipeline

It is time to enable even more trees.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Add additional verification for duplicate keys

The same keys might be redefined in different yaml files;
this tool will ensure consistency of these entries.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Remove path separator

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Rename variable to schedules

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config/kernelci.toml: update KCIDB origin name

As we agreed to refer new KernelCI API & Pipeline as
"maestro", use the new name while submitting data
to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: update KCI result mapping with KCIDB status

Update evaluation of KCIDB status from KCI result.

Create 2 categories for error codes:
1. When pre-check tests completed but actual test suite
couldn't run - this will have `MISS` status
2. When pre-check tests completed, actual test suite could
run but somehow couldn't complete - this will have `ERROR` status

Some LAVA error codes, such as `Cancelled` and `Test`,
can occur at any point of execution.
Such error codes are assigned to the most relevant category
based on analysis of available results.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: fix presets for v4l2-decoder-conformance

Following recent updates to data representation on KernelCI nodes,
the top-level nodes for tests now have their kind set to 'job' instead
of  'test'. Update the presets for v4l2-decoder-conformance tests
accordingly.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary presets: fix output file name in kselftest-acpi preset

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: enable dmabuf-heaps, exec and iommu kselftest suites

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Add kcidb_test_suite

* config: result-summary: add generic rule to monitor failures and regression

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: Add rt-stable builds

Copy rt-stable builds from legacy KernelCI.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Major changes to move to new way of writing kbuild jobs

* config: pipeline: Add v6.6-rt branch for builds

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: result-summary: add rt-stable kbuilds presets

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: chromeos: Add 'nfs' suffix to KCIDB suite name for baseline-nfs

The baseline test is currently run with both ramdisk and nfs rootfs. To
distinguish baseline-nfs tests in KCIDB, add an 'nfs' suffix to the KCIDB
test suite name.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* aks: Add kubernetes kcidb deployment

We need a file that will manage the deployment of the kcidb bridge
in the kubernetes production deployment.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* kubernetes: Adjust trigger k8s options

Ignore the kernelci tree on production, as it is a special
"staging"-only tree, and read the whole /config directory, not just the
default pipeline.yaml.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* regression_tracker: bugfix: catch empty search condition

Fix _get_last_matching_node(), after the previous change there was an
unhandled scenario where nodes may be empty but the function wouldn't
return None immediately.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: pipeline: correct the kind of kselftest suites to job

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* scheduler-chromeos.yaml: Temporarily disable non-essential tast tests

As per discussion, temporarily disable tast tests that are
unlikely to be reviewed.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* k8s/aks: Update deployment files

1) Update memory limit, as working with linux sources might require 3 Gbyte of RAM.
2) Update config file path
3) Add callback environment variable
4) Update image reference to a fresh one

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: enable android builds with gcc-12 for all architectures

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: enable android builds with clang-17 for all architectures

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: remove build_variants from android build_configs

build_variants is the legacy way to specify the different variants. We
have moved to the newer way of specifying variants. Hence remove
build_variants from the android build_configs.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add android15-6.6-lts branch for build as well

The android15-6.6-lts has been included recently in legacy KernelCI:
https://github.com/kernelci/kernelci-core/pull/2597

Add the same in newer KernelCI.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add blocklist for riscv older kernels for android builds

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: update KCIDB test suite mapping for baseline

Use `boot` as KCIDB test suite mapping for all
baseline tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* callback_url: Update config and README

As we are moving callback URL to environment variable,
updating config and README accordingly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: enable android baseline (boot) testing for arm and arm64 in only allmodconfig

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* scheduler.py: If event has jobfilter, inject it into the node data

When someone generates an artificial event with a jobfilter, this is
likely a maintainer trying to repeat a job. Treat this accordingly
and inject the job filter into the job node, so we run only the tests
the maintainer wants.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava_callback: migrate to fastapi

It will be easier to maintain API and Pipeline, as
both will be powered by FastAPI framework.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: Update fluster rootfs URL

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: pipeline: fix defconfigs in fragments

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* kbuild.jinja2: support defconfig as list or str

As required in https://github.com/kernelci/kernelci-core/pull/2608
defconfig might be one of two types, list or str. Support both in jinja2 accordingly.
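
The same idea expressed in Python, as a rough sketch only: the actual change lives in the jinja2 template, and `normalize_defconfig` is a hypothetical helper used purely for illustration.

```python
def normalize_defconfig(defconfig):
    """Accept a defconfig given as a str or a list and return a list."""
    if isinstance(defconfig, str):
        return [defconfig]
    return list(defconfig)


print(normalize_defconfig("defconfig"))
print(normalize_defconfig(["defconfig", "kselftest"]))
```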

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: add kbuilds of lee-mfd with default defconfigs

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: enable baseline testing for mfd for one board of each arch

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: fix platform sections for Qualcomm and Android schedules

Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>

* k8s: Update deployment to uvicorn, as we use fastapi now

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: Unblock android runs on lava-collabora

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* pipeline: Enable preempt-rt cyclictest test

Enable the first preempt-rt test, cyclictest in new KernelCI. Enable it
on all platforms.

Since these are all smoke tests, there is no point in running them for too
long. Thus reduce the runtime per test to one minute. This should keep
the total preempt-rt runtime roughly in the same time frame.

The changes have been ported from Daniel's PR [1].

[1] https://github.com/kernelci/kernelci-core/pull/2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* pipeline: add all the test jobs for all rt-test

Add job definitions for all the rt-tests. Enable cyclicdeadline and rtla
tests to run on all targets.

The changes have been ported from Daniel's PR [1].

[1] https://github.com/kernelci/kernelci-core/pull/2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add template and test properties for preempt_rt jobs

Add template, job and kcidb_test_suite properties for all preempt-rt jobs.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: rename preempt-rt to rt-tests which is correct name of tests

The legacy system used the preempt-rt name for these tests, but the
repository is named rt-tests. We must use the same name so that results
merge with execution results coming from other CIs in KCIDB.

Suggested-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add the correct nfsroot for rt-tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: Remove android's deprecated branches

It has been confirmed with Todd that we should remove the deprecated
branches. Hence remove those branches.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: run baseline on non-allmodconfig

allmodconfig generates a very large kernel image. It cannot be booted
on the arm64 and arm targets, as tftp errors out because the size is too
large. Reduce the kernel image size by using the default defconfig. The
same defconfigs have been booting for other trees.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* doc: developer-documentation: Update documentation by adding more details

- Reorganize some things
- Specify how to write different variants by removing old syntax
- Give two separate templates for kbuild and test
- Try to put more details for new contributors

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes since v1:
- Fix typo
- Apply suggestions from code review

* doc/developer-documentation: fix a glitch in enabling new tree section

Fix a minor bug in YAML block formatting.

Fixes: f5f57de ("doc: developer-documentation: Update documentation by adding more details")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc/developer-documentation: update a section title

Rename a section from "Enabling a new Kernel tree" to
"Enabling new KernelCI trees, builds, and tests" as it explains
enabling tests as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: use the new `tree:branch` format for rules

For cases where we want a single branch to be allowed for a given tree,
we can now use the `tree:branch` format in rules. Convert existing rules
accordingly.
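
A simplified sketch of how such entries could be matched, assuming a bare `tree` entry allows every branch of that tree and a `tree:branch` entry allows only that branch. `tree_rules_allow` is a hypothetical helper; the real rule evaluation may support more forms than shown here.

```python
def tree_rules_allow(rules, tree, branch):
    """Return True if any rule entry allows the given tree/branch pair."""
    for rule in rules:
        rule_tree, sep, rule_branch = rule.partition(":")
        if rule_tree != tree:
            continue
        # No ':' means the whole tree is allowed (assumption).
        if not sep or rule_branch == branch:
            return True
    return False


rules = ["mainline:master", "next"]
print(tree_rules_allow(rules, "mainline", "master"))
print(tree_rules_allow(rules, "mainline", "other"))
```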

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: pipeline: fix improper use of "filters" attribute

The `filters` param was used in the legacy system but has been replaced
by `rules`, with a different syntax.

For Android RISC-V builds, this was used to deny job execution on
kernels < 4.19, so let's translate this condition with the rules format,
and do a similar change for the `rt-tests`-based jobs.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config/pipeline.yaml: Fix x86 typo in kcidebug job names

The kcidebug jobs that run on MediaTek and Qualcomm platforms should
have arm64 in the name rather than x86. Fix the typo.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config: pipeline: remove params

The parameters are only needed when they are changed or appended.
Remove the parameters which aren't being modified.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* validate_yaml.py: Jobs are required to have template parameter

Add more validation of mandatory parameters in config files.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Add more job validations

Add basic validation: each job must have a kind parameter.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* workflows: Add label on CI check failures

Automatically add a label so a broken PR won't go to staging.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

---------

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
Signed-off-by: Laura Nao <laura.nao@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>
Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>
Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-authored-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Co-authored-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Co-authored-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Co-authored-by: Helen Koike <helen.koike@collabora.com>
Co-authored-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
Co-authored-by: Laura Nao <laura.nao@collabora.com>
Co-authored-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Co-authored-by: Shreeya Patel <shreeya.patel@collabora.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Milosz Wasilewski <milosz.wasilewski@foundries.io>
Co-authored-by: Paweł Wieczorek <pawiecz@collabora.com>
Co-authored-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>
Co-authored-by: Daniel Wagner <wagi@monom.org>
Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
nuclearcat added a commit to nuclearcat/kernelci-pipeline that referenced this pull request Jul 24, 2024
* src/scheduler: store error message when job fails with "submit_error"

It is helpful for debugging to capture the error message when the
scheduler fails to submit a job to the runtime.
Store the error message in the `data.error_msg` field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: Set minimum kernel version for DT kselftest to 6.7

The test was introduced upstream in version 6.7, so no point in trying
to run it on earlier versions.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* configs/: Update volteer device

Update volteer devices according to lab availability.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary templates: detailed output for active/inactive regressions

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: new presets for active regressions

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: update CHANGELOG

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* data: chmod -R 777 ./data/output to avoid permission error

Avoid errors like

PermissionError: [Errno 13] Permission denied: '/home/kernelci/data/output/stable-rc-boot.html'

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: move code to _get_logs

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: use ThreadPoolExecutor to fetch logs

Fetching logs is the bottleneck of the script. Fetch them in parallel
with ThreadPoolExecutor.
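
A minimal sketch of the approach, using the standard-library `concurrent.futures.ThreadPoolExecutor`. `fetch_logs` and `fake_fetch` are hypothetical names; the real script downloads logs over HTTP, which is replaced here by a stand-in function.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_logs(urls, fetch_one, max_workers=8):
    """Fetch every URL in parallel and return a {url: log} dict."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map() yields results in the same order as 'urls'.
        return dict(zip(urls, executor.map(fetch_one, urls)))


def fake_fetch(url):
    # Stand-in for a real HTTP download.
    return f"log for {url}"


logs = fetch_logs(["job/1", "job/2"], fake_fetch)
print(logs)
```

Threads suit this case because the work is I/O-bound: each worker spends most of its time waiting on the network.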

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: fix result presets

stable-rc-build-failures and stable-rc-boot-failures weren't querying
specifically for test failures.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src/regression_tracker: rework regression detection

Take into account "active" and "inactive" regressions when creating them
and when processing new passed or failed nodes.

When a node passes, it checks if it "inactivates" an existing "active"
regression. When a node fails, it checks if it needs to create a new
regression or update an existing "active" one.
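
The pass/fail handling above could look roughly like this. It is a sketch under assumptions: regressions are kept in a dict keyed by a hypothetical `test` name, and `process_result` is not the tracker's real function.

```python
def process_result(node, regressions):
    """Update 'regressions' in place and return the action taken."""
    regression = regressions.get(node["test"])
    is_active = regression is not None and regression["active"]
    if node["result"] == "pass":
        if is_active:
            # A passing run inactivates an existing active regression.
            regression["active"] = False
            return "inactivated"
        return "none"
    if node["result"] == "fail":
        if is_active:
            # A further failure updates the active regression.
            regression["nodes"].append(node["id"])
            return "updated"
        regressions[node["test"]] = {"active": True, "nodes": [node["id"]]}
        return "created"
    return "none"


regressions = {}
actions = [
    process_result({"id": "n1", "test": "baseline", "result": "fail"}, regressions),
    process_result({"id": "n2", "test": "baseline", "result": "fail"}, regressions),
    process_result({"id": "n3", "test": "baseline", "result": "pass"}, regressions),
]
print(actions)
```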

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src/regression_tracker: link failed nodes to active regressions

When a failed node generates a regression, or when it's a re-run of a
run that generated a still active regression, link the node to the
regression id.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: support for date ranges for creation and update

New command line options to let the user specify date ranges for node
creation and last update: --created-from, --created-to,
--last-updated-from, --last-updated-to

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: support for date ranges for creation and last update

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: support for extra query parameters in cmdline

New command line option: --query-params to specify a set of extra query
parameters to complete or override preset parameters.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: html markup in some preset titles

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: update and move to docs folder

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: move parameter loading and processing to 'setup'

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: refactor and split into two classes (single, run)

Split the ResultSummary class into a base class and two child classes:
ResultSummarySingle and ResultSummaryLoop (only a stub at this point).

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: WIP initial implementation of the "loop" command

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: huge refactoring

Implement "summary" (single-shot) and "monitor" (loop) modes based on
preset parameters instead of on the command-line main command.

Split the logic into multiple files, move all monitor-specific and
summary-specific code to independent files, common code in a separate
file.

Full of kludges, I don't like how this is looking so far, might consider
reimplementing it without any dependencies on pipeline code.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: fix markup and indentation

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: new generic templates for monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: examples for "monitor" and "summary" modes

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: summary and monitor modes

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: fix generic regression report

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: summary: fix last_updated option handling

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: embed css stylesheet in html files

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* regression_tracker: [trivial] make regression active by default

Fixup for commit fcb29501663d78920bcd129bd57c36b9af624bc4
If the "result" field is ever made non-optional in the models we can
probably remove this.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* regression_tracker: [trivial] set default empty node sequence

Fixup for commit fcb29501663d78920bcd129bd57c36b9af624bc4
If the "node_sequence" field is ever made non-optional in the models we
can probably remove this.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: add cmdline option --output-dir

Introduce a new command-line option: --output-dir, and rename the old
--output to --output-file.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: command-line options change

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: jobs-chromeos: remove meaningless Tast tests

Several Tast tests can only fail in the context of KernelCI:
* `video.PlatformDecoding.v4l2_state*_vp9_0_svc` do not actually exist,
  causing the whole test job to fail
* `platform.DLCService*` and `platform.Memd` rely on features only
  present in the downstream Chrom{e,ium}OS kernel (see b/247467814 and
  b/244479619 for those having access to Google's issue tracker)
* `kernel.ConfigVerify.chromeos` relies on downstream-only config
  options such as `CONFIG_SECURITY_CHROMIUMOS` and other similar ones,
  and therefore can only fail when testing upstream kernels

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: scheduler-chromeos: don't execute non-working Tast tests

Currently, HEVC-related tests are known to either fail or be skipped as
ChromeOS doesn't yet handle hardware decoding of HEVC media. This is
expected to be fixed at some point though, so we're keeping the job
definitions and only remove the corresponding scheduler entries in order
to reinstate those jobs when relevant.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: exclude Tast tests known to always fail

Several decoder tests always fail on all platforms where they're
executed, adding only noise to otherwise useful test results. Disable
those for improving the quality of the results.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: chromeos: add special case for pre-6.7 qcom codec tests

On Qualcomm-based ChromeBooks (`trogdor` being the only model in
Collabora's lab), we noticed systematic failures of all
`vp9_*_frm_resize` and `vp9_*_sub8x8_sf` tests when using a kernel up to
6.6. With 6.7 and above, all of those tests (except one) now pass. It
therefore makes sense to exclude those on pre-6.7 kernels so we don't
report known failures and get rid of some noise.

This involves "duplicating" affected test jobs (although I did my best
to minimize that) and setting rules so only the working variant is
executed, based on the version of the kernel being tested.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* lava_callback: Compress the log files to save storage space

As cloud storage space and egress have high costs,
it is better to compress potentially large files.
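
As an illustration only, compression of a log with the standard-library `gzip` module could look like this. The commit does not say which format is used, so gzip is an assumption, and `compress_log` is a hypothetical helper.

```python
import gzip


def compress_log(text):
    """Return the gzip-compressed bytes of a log given as a str."""
    return gzip.compress(text.encode("utf-8"))


log_text = "kernel: booting\n" * 1000
compressed = compress_log(log_text)
print(len(log_text), len(compressed))
```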

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* tests: Add basic yaml validation

Add a yaml load step to catch yaml issues earlier.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: drop stoneyridge/pineview naming in platforms anchors

The "stoneyridge" and "pineview" naming used in the Chromebook platform
anchors refers to ChromiumOS specific config fragments, but doesn't
necessarily match the actual platform of all the devices listed.
Use more generic names to distinguish amd and intel Chromebooks.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: rename test job anchors that use chromeos specific configs

Rename test job anchors that use chromeos specific kernel configurations
to include the 'chromeos' infix.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: add baseline tests

Enable the baseline tests on all the supported Chromebooks with their
default kernel configuration.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: drop stoneyridge/pineview naming in job defs

The "stoneyridge" and "pineview" naming used in some Chromebook job
definitions refers to ChromiumOS specific config fragments, but
doesn't necessarily match the actual platforms targeted by the jobs.
Replace all occurrences with more generic intel/amd naming.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: drop chromeos infix from baseline jobs

Keeping different job names for tests targeting different kernel configs
might cause too much duplication. Drop the 'chromeos' infix from the job
name for the tests using the chromeos config fragment. Users will be
able to filter the results using the data.defconfig/data.config_full
fields anyway.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary: post-process results for summary and monitor modes

Split the post-processing of nodes to a common function that can be used
for both summary and monitor modes. Currently, post-processing involves
only the collection of logs.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: update and fix presets and templates

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* doc/result-summary-CHANGELOG: update

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config/pipeline.yaml: enable 'BayLibre' lab

Add lab configuration for BayLibre.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add `lab-baylibre` runtime

Add runtime argument `lab-baylibre` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to BayLibre.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-x86-baylibre` job

Add job configuration `baseline-x86-baylibre` for BayLibre.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-armel-baylibre` job

Add job configuration `baseline-armel-baylibre` for BayLibre.
Add scheduler entry and platform config as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline: enable `android` tree and build configs

Monitor linux `android` tree. Add build configs for `android-mainline`
branch.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/pipeline.yaml: add kbuild definitions for android-mainline

Add kbuild jobs to compile the kernel for android-mainline branch

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/pipeline.yaml: add entries to schedule to build android-mainline

Add entries to `scheduler:` section to run the builds for
android-mainline.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: fix node filter in monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* kernelci.toml: set `checkout` node timeout to `180 min`

The currently set `60 min` timeout is not enough, as some
`kbuild` jobs and their sub-tests take around 2 hrs to
complete after being submitted to the runtime.

Here is an example from staging. See the information
for a `checkout` and its child nodes:

| id                       | name                | created                    | updated                    | timeout                    |
|--------------------------|---------------------|----------------------------|----------------------------|----------------------------|
| 661c9d59b60b785eb9fc42b0 | checkout            | 2024-04-15T03:22:01.317000 | 2024-04-15T03:51:03.870000 | 2024-04-15T04:22:01.284000 |
| 661c9d97b60b785eb9fc42b4 | kbuild-gcc-10-arm64 | 2024-04-15T03:23:03.399000 | 2024-04-15T03:50:15.031000 | 2024-04-15T09:23:03.399000 |
| 661ca3f7b60b785eb9fc4ead | baseline-arm64      | 2024-04-15T03:50:15.304000 | 2024-04-15T05:09:45.247000 | 2024-04-15T09:50:15.304000 |

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary: add email report capabilities for monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: plain text single report templates

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: chromeos: add baseline-nfs tests

Enable the baseline-nfs tests on all the supported Chromebooks, with
both the default and the chromeos kernel configurations.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src/timeout: set `checkout` result

For `TIMEOUT` mode, set the `checkout` node result to `fail`
if its state is `running`, as that means the code checkout was still
in progress when the node timed out. Set it to `pass` if its state
is anything other than `running`.
Set the `checkout` node result to `pass` if the mode is `DONE`, as
that means `checkout` has at some point been in the `available` or
`closing` state and could successfully complete the source code checkout.
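
The decision described in this commit can be sketched as a small function. Modes and states are modelled as plain strings here, which is an assumption, and `checkout_result` is a hypothetical name.

```python
def checkout_result(mode, state):
    """Return the result to set on a 'checkout' node, or None if unchanged."""
    if mode == "TIMEOUT":
        # Still running at timeout: the checkout itself never finished.
        return "fail" if state == "running" else "pass"
    if mode == "DONE":
        return "pass"
    return None


print(checkout_result("TIMEOUT", "running"))
print(checkout_result("TIMEOUT", "available"))
print(checkout_result("DONE", "closing"))
```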

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* regression_tracker: bugfix, failed test with no prior runs

Handle the case of a failed test run when it's the first occurrence of
that test case. Consider it "not a regression" for now, since we're
defining a regression as a "breaking point" between a success and a
failure.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: platforms-chromeos: fix dalboz device type

Due to a copy/paste mishap, the device type for
`asus-CM1400CXA-dalboz` had a trailing `_chromeos`, leading LAVA to fail
finding the correct device type, and no job from the new system running
on this platform.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: run Tast tests only on 5.4+

Current ChromeOS images have `ext4` filesystems using options not
present in 4.19. Therefore tests cannot run on kernels that old, and
this leads to false positives in corrupt device identification, so we
should only run those tests on 5.4 and later kernels.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: platforms-chromeos: drop non-existent platform

`hp-x360-12b-ca0500na-n4000-octopus` isn't a device type available in
Collabora's LAVA lab, so let's drop its definition.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: exclude android tree from kbuild jobs

Only Android-specific kbuild jobs should run for this tree, let's not
overload our system with unneeded builds.

Take this opportunity to limit mediatek kbuilds to 6.1+ as that's the
earliest version that has upstream support for at least one of our
devices.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src/timeout: a bug fix in `_submit_lapsed_nodes`

Fix a glitch in the code related to setting `checkout`
node result.

Fixes: 361fc0d ("src/timeout: set `checkout` result")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* pipeline.yaml: Update early access FQDN

We are moving k8s from eastus to westus3 as it is cheaper

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/tarball: fix `_kdir` in `update_repo`

Fix the below error:
```
kernelci-pipeline-tarball |   File "/home/kernelci/./pipeline/tarball.py", line 79, in _update_repo
kernelci-pipeline-tarball |     kernelci.shell_cmd(f"rm -rf {self._kdir}")
kernelci-pipeline-tarball |                                  ^^^^^^^^^^
kernelci-pipeline-tarball | AttributeError: 'Tarball' object has no attribute '_kdir'
```

Fixes: 0a2fe9c ("src/patchset.py: Implement Patchset service")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: fix method to get child nodes recursively

`TimeoutService._get_child_nodes_recursive` is used to get
pending child nodes recursively for closing and timed-out
nodes. It overwrites the result while being called recursively.
Fix the method to make it work properly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
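
The failure mode described above (a recursive helper overwriting its own result) is easiest to see in a small sketch. This is not the actual `TimeoutService` code: the real method queries the API for child nodes, while the hypothetical `children_of` dict below stands in for those lookups. The point is only that a single list is threaded through every recursive call and appended to, never reassigned.

```python
def get_child_nodes_recursive(children_of, node_id, found=None):
    """Collect every descendant of node_id, depth first.

    children_of is a hypothetical in-memory stand-in for the API lookup:
    it maps a node ID to the list of its direct child IDs.
    """
    if found is None:
        found = []
    for child_id in children_of.get(node_id, []):
        # Append to the shared list instead of replacing it, so results
        # gathered by deeper calls are not lost when the call returns.
        found.append(child_id)
        get_child_nodes_recursive(children_of, child_id, found)
    return found


tree = {
    "checkout": ["kbuild-a", "kbuild-b"],
    "kbuild-a": ["baseline"],
}
descendants = get_child_nodes_recursive(tree, "checkout")
```

With the sample tree, `descendants` contains `kbuild-a`, `baseline` and `kbuild-b`, i.e. children and grandchildren alike.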

* config: pipeline: rename "armel" arch to "arm"

`armel` has various meanings depending on the system: for ChromeOS, it
is ARMv7, while in Debian it's ARMv{5T,6}. Moreover, this project is
*Kernel*CI and the kernel uses `arm` for all 32-bit ARM devices. In
order to avoid confusion (including those wondering what the heck does
`armel` mean), let's rename `armel` to `arm`.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: use per-system arch property where relevant

With the new `*arch` fields present in the platform configurations, we
don't have to hardcode the architecture strings in some specific cases.
Let's adapt the config files so we use `{cros,deb,k}arch` wherever it
makes sense.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src/timeout: set timed-out `checkout` result

Set the result of a timed-out `checkout` node to `incomplete` if it is
still in the `running` state, since this means the node timed out while
the checkout was still in progress.
Also set the error-related information, i.e. `error_code`
and `error_msg`.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/tarball: update checkout node when update repo fails

The tarball service updates the source code repo and creates a tarball.
If the repo update fails even on the second attempt,
the source code could not be checked out.
Hence, update the `checkout` node with state `done` and
result `fail`. Also, set appropriate error information
in the `data` field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: enable collabora-next tree and build config

Monitor the collabora-next tree. Add build config for the for-kernelci
branch.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: enable acpi kselftest on collabora-next tree

Run the ACPI kselftest on the for-kernelci branch of the collabora-next
tree.

See: https://lore.kernel.org/linux-kselftest/20240308144933.337107-1-laura.nao@collabora.com/T/#t

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary: restore missing split_query_params function

Restore this function that was accidentally removed during the last
refactoring.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* lava_callback: Don't upload empty files to Azure

There is no use for lots of empty files on Azure;
they only complicate cleanup.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: unify preset and output names

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: update preset for aferraris

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: new presets for laura.nao

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: fixes and new presets for nfraprado

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: fix arch query parameters

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* k8s: Lot of deployment tested fixes

Fixes in yaml files for k8s production deployment.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result-summary presets: Fix build failure and regression monitors

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* result_summary: added debug traces to the monitor

Show detailed info of the node filterings in real time.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: fix corner case bug when no logs are found

Cover rare case where neither the node nor any of its parents up to the
checkout node have any log artifacts.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: refine stable-rc presets

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: add regression info to test reports

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: escape log snippets

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src: lava_callback: add device ID to node data

It can be useful to know the exact device on which a job ran, without
having to open the LAVA job page. This is done by querying the device ID
from the callback data and appending it to the node data.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: lava_callback: upload raw callback data as well

Debugging callback issues is complex due to the raw data not being saved
after processing. This change ensures we save the callback data as a
JSON file in order to ease development.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* DONOTMERGE lava_callback: add debug statements

Why the heck doesn't this just work???

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* result_summary_templates: fix error 'node' is undefined

The object is named test and not node, so s/node/test

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/runtime/kunit: set architecture info

Set architecture field for `kunit` test
nodes.
If no `arch` argument is supplied, kunit takes
`um` (User Mode Linux) as architecture to run
tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: count running child jobs of build nodes

Add a method to count running jobs of `kbuild`
nodes, i.e. jobs submitted after successful
builds. For example, `baseline` or `tast` jobs.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: handle closing `checkout` node differently

Usually, `checkout` should transition to the `done` state
when all its child nodes are completed.
When closing a `checkout`, take into account the
running child jobs of build nodes before transitioning
its state to `done`. Otherwise, `checkout` will be
assigned the `done` state even if some child jobs are still
running.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: handle holdoff reached `checkout` node differently

Usually, an available `checkout` for which the holdoff has been
reached should transition to the `done` state only when
all its child nodes are completed.
For such a `checkout` node, take into account the
running child jobs of build nodes before transitioning
its state to `done`. Otherwise, `checkout` will be
assigned the `done` state even if some child jobs are
still running.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* Revert "DONOTMERGE lava_callback: add debug statements"

This reverts commit 5ed8218d99840373bbba5830b1976813b52bf4b1.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* Create dependabot.yml

* result_summary_templates: make generic-test-failures generic to all
results

The generic-test-failures templates can be used to show general results
by just replacing the name "failures" with "results". This makes them
easier to reuse for communities that want presets listing all test
results, so:

	s/generic-test-failures/generic-test-results

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result-summary.yaml: add preset to list android build tests

Since we now build android, add a preset to allow result-summary.yaml to
list all build results from Android tree.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* tarball: Implement checkout for specific commit

We often need a specific commit rather than the tip of the tree (ToT); implement this.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* jobs-chromeos.yaml: Disable module compression for every kernel version

Commit d4bbe942098b ("kbuild: remove CONFIG_MODULE_COMPRESS"),
introduced in kernel v5.13, substituted CONFIG_MODULE_COMPRESS=n for
CONFIG_MODULE_COMPRESS_NONE=y as the way to disable module compression.
Since module compression causes "Invalid ELF header magic: != ELF"
errors during boot on the ChromeOS base config, add the missing config
to disable module compression on kernels > v5.13 as well.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* src: lava_callback: reduce callback data size

The callback data is quite large, especially as it includes the full log
which we already upload separately. By dropping it and compressing the
whole file with `gzip` we can avoid wasting too much storage space.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: lava_callback: don't leak secret token

The callback data contains the secret token's value, which shouldn't be
leaked. Ensure we drop it from the uploaded data.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
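
The two `lava_callback` changes above (shrinking the stored callback data and dropping the secret token) can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the key names `log` and `token` and the helper name `pack_callback_data` are hypothetical and chosen only for the example.

```python
import gzip
import json


def pack_callback_data(data, drop_keys=("log", "token")):
    """Return a gzip-compressed JSON blob without bulky or secret keys.

    The key names in drop_keys are assumptions for this sketch: "log"
    stands for the full log that is uploaded separately, "token" for
    the secret value that must not be stored.
    """
    slim = {key: value for key, value in data.items() if key not in drop_keys}
    return gzip.compress(json.dumps(slim).encode("utf-8"))


payload = {"id": 42, "status": "complete", "log": "x" * 10000, "token": "s3cret"}
blob = pack_callback_data(payload)
restored = json.loads(gzip.decompress(blob))
```

After the round trip, `restored` holds only the `id` and `status` fields, so neither the log nor the token reaches storage.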

* config: platforms-chromeos: use new cros-flash image

This ensures we use the new version of the `install-modules` script.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: regression_tracker: add the "device" field to regression data

This can be helpful. We're not using it as a search param though, as we
don't want to narrow down the search that much; using the platform only
is better.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: result_summary_templates: report device used for job

This information is now available, and it can be useful to know the
affected device without having to look at the LAVA job details.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* kubernetes: Update deployment recipe

Update list of labs and add KCI_INSTANCE variable.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava-callback: Limit threads of lava-callback

Due to the inrush of LAVA callbacks and slow Azure Files
processing, we need to make sure we don't spawn too many
threads.
Also add a hard memory limit of 1 GB.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: add presetes for fluster test

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Make template generic for all v4l2 tests
- Rebase on main

* result_summary presets: make the name of fluster test generic

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: enable first fluster test for mt8195-cherry-tomato-r2

Enable first fluster test, AV1-TEST-VECTORS for mt8195-cherry-tomato-r2.
Run the test on mainline and next until more trees are added.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Create generic v4l2-decoder-conformance-job and use anchors from it
- Update the rootfs address
- Move anchor to _anchor
- Update with nitpicks

* config: jobs-chromeos: Add kernelci tree for testing purpose

Remove this commit before merging.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: chromeos: Enable cpufreq kselftest

Enable cpufreq kselftest on all the trees and branches.

Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>

* result_summary presets: fix preset for kselftest-dt failures monitor

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: new presets for kselftest-cpufreq

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: mt8195-cherry-tomato-r2: enable all fluster tests for all branches

Add all the trees and branches on which the tests will be run. Enable
all the tests for tomato.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- The build config cannot be added yet. Just list the trees, it will only use
  the branches configured in build_configs:
  - mainline will use master
  - next will use master
  - collabora-chromeos-kernel will use for-kernelci
  - media will use master and fixes
- Remove kernelci tree as it was added just for testing purpose

* config: mt8183-kukui-jacuzzi-juniper-sku16: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: mt8186-corsola-steelix-sku131072: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: mt8192-asurada-spherion-r0: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Don't specify the platforms manually as they are already mentioned in
  test-job-arm64-mediatek

* config: sc7180-trogdor-kingoftown/lazor-limozeen: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Use test-job-arm64-qualcomm instead and create separate jobs for
  qualcomm devices
- Don't specify platforms manually as they are already mentioned in
  test-job-arm64-qualcomm

* build(deps): bump uwsgi from 2.0.21 to 2.0.22 in /docker/lava-callback

Bumps [uwsgi](https://uwsgi-docs.readthedocs.io/en/latest/) from 2.0.21 to 2.0.22.

---
updated-dependencies:
- dependency-name: uwsgi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* pipeline.yaml: Add stable-rc build variants

Add more build variants for stable-rc tree to match legacy system.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary: add error classification

Classify errors according to patterns in the logs

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary presets: add collabora-chromeos-kernel and media trees for fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: Use media-stage instead of media-tree

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config/pipeline: enable android branches from legacy

Enable all android branches from the legacy system

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* trigger: Add exclude/include tree list for trigger

Since we need to restrict the list of kernels built on staging,
we need an option that allows that.
It would also be good to exclude staging kernels from the production
kernel list.

On staging, we need to run kernels only from the "kernelci" tree
and sometimes something else, for example "mediatek".
The option will look like:

--trees kernelci,mediatek
or
--trees kernelci

On production, we need to exclude the kernelci and buggytree trees:
--trees !kernelci,buggytree
or just kernelci:
--trees !kernelci

The purpose of this option is that our build capacity is limited,
and right now staging and production are both compiling a very large set
of kernels; we need to reduce this amount to cut costs.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
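
The include/exclude semantics of `--trees` described above can be sketched as a small predicate. This is only an illustration of the syntax given in the commit message; the function name `tree_allowed` is hypothetical, and treating a missing option as "allow everything" is an assumption.

```python
def tree_allowed(tree, trees_opt):
    """Decide whether a tree passes the --trees filter.

    "a,b"  -> only trees a and b are allowed.
    "!a,b" -> every tree except a and b is allowed.
    An empty or missing option allows everything (assumption).
    """
    if not trees_opt:
        return True
    exclude = trees_opt.startswith("!")
    names = set((trees_opt[1:] if exclude else trees_opt).split(","))
    return (tree not in names) if exclude else (tree in names)


staging = "kernelci,mediatek"
production = "!kernelci,buggytree"
```

With these values, `tree_allowed("mainline", staging)` is false while `tree_allowed("mainline", production)` is true, which matches the split between staging and production described in the commit message.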

* config: platforms-chromeos: use CrOS R124 files

ChromeBooks were upgraded with a new image based on ChromiumOS R124, so
we must use those files now.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: drop non-existent Tast tests

Those were removed between R120 and R124 and therefore cause test
failures with the new images.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* result_summary presets: fix acpi kselftest presets

We're interested in catching regressions and failures in both the
kselftest-acpi test suites and their test cases. Match the nodes by group
in the presets accordingly.
Fix template used by the failure monitor preset.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src: update return values of `APIHelper.receive_event_node`

`APIHelper.receive_event_node` method is used to receive
node data from PubSub event. The method has been updated
to return `is_hierarchy` flag as well which represents
events related to node hierarchy.
Update pipeline services using the method accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: refine presets for v4l2-decoder-conformance

Modify the regression preset to monitor regressions on both the
v4l2-decoder-conformance test suites and their test cases, by matching the
nodes by group instead of by name.
Also, change the failure preset to monitor for all errors caused by
runtime errors.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary presets: add summary presets for v4l2-decoder-conformance

Add summary presets to fetch regressions and failures on
v4l2-decoder-conformance tests. Two of the presets are the same used by
the monitor; add one additional preset to fetch all the failures on
both the test suites and their test cases.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* lava_callback.py: Remove error_code/error_msg on lava-callback

Sometimes, due to congestion, a node might be marked as timed out, but
its result may still arrive late, and we need to handle it properly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: fix dt kselftest presets

Fix the dt kselftest preset, just like was done for the acpi one, as the
current preset doesn't match the actual results we're interested in.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* doc/connecting-lab: refine documentation

Refine documentation for connecting LAVA labs
and submitting jobs to the lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* lava_callback: Sometimes we get totally invalid log file uploaded

Most likely the problem lies in Flask's threading, and callbacks are
possibly getting mixed up. This commit attempts to introduce
several countermeasures against that.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* doc: add `_index.md` page

Add index documentation page.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc: add `pipeline-details` page

Move `pipeline-details` documentation from the API
repository to this repo to make it close to the source.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc/connecting-lab: adjust `weight` property

Change the `weight` property of the existing doc page to
accommodate the transition of pipeline-related docs
to the pipeline repo.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc: add `developer-documentation` page

Add developer manual documentation.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add lab config for Qualcomm

Add an entry to `runtimes` section for Qualcomm
lab configurations.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-x86` job for qualcomm

Add job configuration `baseline-x86-qualcomm` for
running baseline job in Qualcomm LAVA lab.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add lab-qualcomm runtime

Add runtime argument `lab-qualcomm` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to Qualcomm LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-arm64` job for qualcomm

Add job configuration `baseline-arm64-qualcomm` for
running baseline job for `arm64` in Qualcomm LAVA lab.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* pipeline.yaml: Update RISC-V configs

1) The rv32 defconfig doesn't exist; remove it
2) nommu_k210_defconfig has modules disabled

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava_callback.py: Sanitize lava log data

As we use this data in reports, let's remove all
non-printable characters, as they confuse Grafana, browsers and other tools.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
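
A minimal sketch of this kind of sanitizing, assuming that "non-printable" means anything Python's `str.isprintable()` rejects and that newlines and tabs should survive. The helper name `sanitize_log` is hypothetical and the project's actual filter may differ.

```python
def sanitize_log(text):
    """Drop non-printable characters from a log, keeping newlines and tabs.

    Uses str.isprintable() as the definition of "printable", which is an
    assumption of this sketch rather than the project's exact rule.
    """
    return "".join(ch for ch in text if ch.isprintable() or ch in "\n\t")


raw = "boot\x00 ok\x1b[0m\n"
clean = sanitize_log(raw)
```

Note that this removes the NUL and ESC bytes but leaves the rest of an ANSI sequence (`[0m`) in place; stripping whole escape sequences would need a separate pattern.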

* config/runtime/kunit.jinja2: fix result map

Fix the result map for skipped tests. Initially, the API
schema did not have `skip` as an available node result,
so skipped tests were mapped to a `None` result. Now that the API
has a `skip` result to denote skipped tests,
fix the result mapping accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: jobs-chromeos: Add lab-setup fragment

Add the lab-setup fragment to the chromebook builds, which contains the
architecture independent kernel configs needed to run tests on the
platform. Notably this disables IP autoconfig by the kernel.

The result of this change is that the 12 seconds boot delay and the
consequent deferred probe pending warnings will no longer happen on any
platform. This applies particularly to mt8186-corsola-steelix-sku131072,
on which (due to a different network adapter being used) it was still happening.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* lava_callback: bump up slightly threads number

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: enable watchdog reset test on Chromebooks

Add a basic test to verify watchdog reset functionality. Enable the
test on all ARM64 and AMD x86_64 Chromebooks. For Intel
Chromebooks, enable the test only on octopus, as ACPI PM Timer on the
other devices has been disabled in coreboot.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src/send_kcidb: use schema version 4.3

Test status `MISS` was added to KCIDB in schema
v4.2 and supported by the latest version i.e. v4.3.
Hence, use the latest version for submission as
API may send a few tests with "MISS" status.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* send_kcidb: re-structure code for parsing checkout node

Move code for parsing checkout node to a separate
method.
Add `valid` field to parsed checkout node. It denotes
if source code was successfully checked out.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: print more information on invalid data

Print details for invalid revision data for the
sake of debugging.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: optimize `kcidb` import

Remove redundant `kcidb` import and adjust
kcidb Client call accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: remove keys with `None` values

KCIDB doesn't allow `None` as field value.
Remove all optional fields with `None` value
to make it valid data for submitting to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
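
Dropping `None`-valued fields before submission can be sketched as below. The helper name `drop_none_fields` and the sample field names are hypothetical; recursing into nested dicts and lists is an assumption of this sketch, since the commit message only says that optional fields with `None` values are removed.

```python
def drop_none_fields(obj):
    """Return a copy of obj without dict entries whose value is None.

    Nested dicts and lists are processed recursively (an assumption of
    this sketch); other values are returned unchanged.
    """
    if isinstance(obj, dict):
        return {k: drop_none_fields(v) for k, v in obj.items() if v is not None}
    if isinstance(obj, list):
        return [drop_none_fields(v) for v in obj]
    return obj


parsed = {
    "id": "maestro:1",
    "comment": None,
    "misc": {"arch": "arm64", "runtime": None},
}
cleaned = drop_none_fields(parsed)
```

Here `cleaned` keeps `id` and `misc.arch` and loses both `None` entries.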

* config: add `kcidb_test_suite` property

Every KernelCI test will be mapped to a unified
test suite for KCIDB data submission.
Add `kcidb_test_suite` property to test job
definitions in YAML configuration files.
The added property will store the mapped
KCIDB test suite name.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: parse and submit node test and build data

Listen to all the node events with node state
`done` or `available` and submit the node to KCIDB.
Parse node received from the event and create KCIDB
schema compatible object based on type of the node
i.e. checkout, build or test.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: set `log_excerpt` for builds and tests

Fetch the log from the compressed log file (`*.log.gz`) URL
and use its last 16*1024 characters to set the `log_excerpt`
field for build and test nodes, as that is the maximum allowed
length of the KCIDB field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
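
The excerpt logic amounts to decompressing the log and keeping its tail. A minimal sketch, assuming the compressed bytes have already been downloaded; the helper name `log_excerpt` and the lenient UTF-8 decoding are assumptions, not the project's code.

```python
import gzip

# Maximum length of the KCIDB log_excerpt field, as stated in the commit message.
MAX_EXCERPT = 16 * 1024


def log_excerpt(raw_gz):
    """Return the last MAX_EXCERPT characters of a gzip-compressed log.

    raw_gz is the already-downloaded content of a *.log.gz file; the
    download step itself is left out of this sketch.
    """
    text = gzip.decompress(raw_gz).decode("utf-8", errors="replace")
    return text[-MAX_EXCERPT:]


sample = gzip.compress(("A" * 20000 + "END").encode("utf-8"))
excerpt = log_excerpt(sample)
```

Taking the tail rather than the head keeps the part of the log where a failure is most likely reported.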

* config/jobs-chromes: add kcidb test suite property for watchdog test

Add KCIDB test suite mapping for `watchdog_reset` test.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* lava_callback.py: disable log removal from callback data

We need it for investigations if we have any critical data
loss during log sanitizing.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/send_kcidb: add error info to build nodes

Add error metadata fields such as `error_code` and
`error_msg` to `misc` field for build nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: add watchdog-reset presets for mainline/next

Add monitor and summary presets to track the results from the watchdog
reset test on the mainline and next trees.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* pipeline.yaml: Fix fluster rootfs URL

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/send_kcidb: get error metadata for failed/incomplete tests

Tweak condition to get error metadata for test nodes.
It should get error info for incomplete nodes as well
and not just failed nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: send tests only if KCIDB test mapping exists

All test suite definitions must have `kcidb_test_suite`
property i.e. KCIDB test suite mapping.
Only send tests for which the mapping is found.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* tests/validate_yaml: add validation for KCIDB mapping

To submit KernelCI generated data to KCIDB, it is required
to have a mapping for all the job definitions via the
`kcidb_test_suite` property.
Add validation to ensure all the jobs have a mapping
present to avoid missing data submission.
This check is to notify test authors trying to enable tests
in maestro to include the required property for the mapping
in their definition.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
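
A check of this kind can be sketched as a function over the already-parsed `jobs` section. The function name and the sample job entries are hypothetical, and the real validation may restrict itself to certain job kinds; only the `kcidb_test_suite` property name comes from the commit messages above.

```python
def jobs_missing_kcidb_mapping(jobs):
    """Return the sorted names of jobs lacking a kcidb_test_suite value.

    jobs is a dict of job name -> job definition, as it might look after
    loading the YAML configuration (the loading step is omitted here).
    """
    return sorted(
        name
        for name, definition in jobs.items()
        if not (definition or {}).get("kcidb_test_suite")
    )


jobs = {
    "kunit": {"kcidb_test_suite": "kunit"},
    "tast-basic": {"template": "tast.jinja2"},
    "baseline": {"kcidb_test_suite": "boot"},
}
missing = jobs_missing_kcidb_mapping(jobs)
```

A validation script could fail when `missing` is non-empty, which is how a test author would be notified about the absent property.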

* config/pipeline.yaml: add qcs6490-rb3gen2 boot test

Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>

* config: chromeos: Enable kselftest-dt on Qualcomm platforms

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* pipeline.yaml: Add one um build for android trees

At the request of the Android team, it would be good to check UM builds
for breakages as well.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: use `kind=job` for test suites

As part of re-structuring test hierarchy, `Job` model
has been introduced for test suite/job nodes.
It uses node kind `job`.
Update test configurations in `pipeline.yaml` and
`jobs-chromeos.yaml` to use `kind=job` to
generate job nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: provide `kind` value for child tests

In case of submitting test hierarchy, child nodes by default
inherit `kind` value from parent node.
As we are re-structuring test hierarchy, test suite/job nodes
will have `kind=job` where its child test nodes will have
`kind=test`. Provide `kind` field explicitly to test result
hierarchy to preserve different kind value than the parent
node.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: fix `NameError`

Fix the below error in `_submit` method:
```
Traceback (most recent call last):
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 287, in main
    job.submit(results)
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 138, in submit
    self._submit(result)
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 265, in _submit
    return node
NameError: name 'node' is not defined
```

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: evaluate job node result

Evaluate job node result from child node results if a
`null` result is received from the test result parser.
For example nodes such as `fortify`:
https://staging.kernelci.org:9000/viewer?node_id=6670ab43d0b7694b399897c4

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: fix parsing of KUnit log file

Handle both compressed(gzip) and plain text log files
for getting log excerpt.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
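
One way to handle both kinds of log file is to look at the gzip magic bytes before decoding. This is a sketch of that idea only: the helper name `read_log` is hypothetical, and the actual fix may distinguish the two cases differently (for example by file name).

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def read_log(raw):
    """Decode log content that may or may not be gzip-compressed."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return raw.decode("utf-8", errors="replace")


plain = b"ok 1 example_test\n"
compressed = gzip.compress(plain)
```

Both `read_log(plain)` and `read_log(compressed)` yield the same text, so the caller does not need to know which form it received.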

* src/send_kcidb: HTTP exception handling for log excerpt

Add HTTP exception handling for getting
log excerpt data.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: platforms-chromeos: Add serial delay for some Mediatek platforms

Add test_character_delay to the Spherion, Tomato and Steelix platforms
to work around the fact that they're sometimes unable to process serial
input fast enough, resulting in mangled commands and consequently flaky
test results, as described in
https://github.com/kernelci/kernelci-project/issues/366.

The right place to do this change would be in the device-type template
as described in LAVA's documentation [1]. This overriding in KernelCI is
meant only as a temporary workaround to verify whether this fixes the
issue. If it does, then we'll do it in LAVA upstream instead.

[1] https://docs.lavasoftware.org/lava/debugging.html#differences-in-input-speeds
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config: chromeos: Enable error-logs kselftest for MediaTek Chromebooks

Run the error-logs kselftest on MediaTek Chromebooks. This test is
currently under review upstream [1] so, in the meantime, it has been
added to the collabora-next tree so it can prove its value by helping to
detect issues upstream.

[1] https://lore.kernel.org/all/20240423-dev-err-log-selftest-v1-0-690c1741d68b@collabora.com

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config/pipeline.yaml: enable CIP lab

Add configuration for LAVA CIP lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add baseline-x86 test for CIP

Add `baseline-x86-cip` test to be submitted to CIP
LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add `lab-cip` runtime

Add runtime argument `lab-cip` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to CIP LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: enable `job` node submission to KCIDB

Parse newly added job node and its child tests
for KCIDB submission.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: don't submit `setup` test suite nodes

`setup` test suite has been introduced to store test results
for environment setup checks before running actual test suite.
KCIDB doesn't require `setup` test suite result as long as
main test job result is submitted.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: add a check before sending data

Check if parsed data is available before
sending revision data to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: fix logs

Fix the log statement about submitting a node to KCIDB,
since not all nodes we receive events for
are sent to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: handle skipped tests

Do not retrieve artifacts or metadata from the parent
node for skipped tests, as in practice only the kernel
revision, test runtime and platform will be
available for skipped tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary/utils: ignore failures on log retrieval

Make the script continue running if there was an error fetching a test
log.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* doc/developer-documentation: add docs for enabling new tests

Add developer documentation for enabling new tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* Fix links after docs page migration

Documentation has been migrated to the "docs.*" subdomain.

Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>

* pipeline.yaml: Add kcidebug fragment

Add useful low-overhead debug option to kernel,
and test on most x86 boards we have available,
with minimal baseline tests.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* configs: update gcc-10 to gcc-12

As we upgrade the compiler images, we need to update the gcc version.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* regression_tracker: workaround: match node paths programatically

Don't use 'path' as an api search parameter. The use of lists as query
parameters (path is a list) is undefined. Instead, do the filtering in
code.
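
A minimal sketch of what "filtering in code" could look like, assuming
each node is a dict whose `path` field is a list of names. The helper
name and the sample nodes are hypothetical, not taken from the tracker:

```python
def filter_nodes_by_path(nodes, path):
    """Return only the nodes whose 'path' list equals the wanted path."""
    wanted = list(path)
    return [node for node in nodes if node.get("path") == wanted]


# Hypothetical nodes, shaped like the results of a broader API search
nodes = [
    {"id": "a", "path": ["checkout", "kbuild-gcc-12-x86", "baseline-x86"]},
    {"id": "b", "path": ["checkout", "kbuild-gcc-12-arm64"]},
]
matches = filter_nodes_by_path(nodes, ["checkout", "kbuild-gcc-12-arm64"])
```

The API query stays free of list-valued parameters; the comparison
happens on the client side after the search returns.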

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: remove qemu jobs from lab-qualcomm

QEMU jobs use container pulled from hub.docker.com. After the lab move
pulling from this registry is no longer possible at Qualcomm. This patch
disables QEMU jobs from Qualcomm lab.

Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>

* validate_yaml.py: Improve pipeline validation

Add a critical validation that every scheduler entry has a
matching job entry, and a check that every job entry has at
least one entry in the scheduler.
Fix one entry detected by this validation.
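
The two checks can be sketched roughly as below. This assumes `jobs` is
a mapping keyed by job name and each scheduler entry is a dict with a
`job` key; the function name and data are illustrative only and are not
the actual validate_yaml.py code:

```python
def check_jobs_and_scheduler(jobs, scheduler):
    """Return (missing_jobs, unscheduled_jobs) for a pipeline config.

    missing_jobs: names used by the scheduler but not defined as jobs.
    unscheduled_jobs: defined jobs that no scheduler entry refers to.
    """
    scheduled = {entry["job"] for entry in scheduler}
    missing_jobs = sorted(scheduled - set(jobs))
    unscheduled_jobs = sorted(set(jobs) - scheduled)
    return missing_jobs, unscheduled_jobs


# Hypothetical configuration data
jobs = {"baseline-x86": {}, "kbuild-gcc-12-x86": {}}
scheduler = [{"job": "baseline-x86"}, {"job": "kunit"}]
missing, unscheduled = check_jobs_and_scheduler(jobs, scheduler)
```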

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* pipeline.yaml: Add broonie(Mark Brown) trees to pipeline

It is time to enable even more trees.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Add additional verification for duplicate keys

We might have redefined the same keys in different yaml files;
this tool will ensure the consistency of these entries.
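
One possible way to detect such collisions, assuming every YAML file
has already been loaded into a dict and that duplicates are looked for
within one top-level section at a time. The function name and sample
data are hypothetical:

```python
from collections import defaultdict


def find_duplicate_keys(configs, section):
    """Map each key defined in more than one file to the files defining it."""
    seen = defaultdict(list)
    for filename, data in configs.items():
        for key in data.get(section, {}):
            seen[key].append(filename)
    return {key: files for key, files in seen.items() if len(files) > 1}


# Hypothetical, already-parsed config files
configs = {
    "pipeline.yaml": {"jobs": {"baseline-x86": {}, "kunit": {}}},
    "jobs-chromeos.yaml": {"jobs": {"baseline-x86": {}}},
}
duplicates = find_duplicate_keys(configs, "jobs")
```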

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Remove path separator

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Rename variable to schedules

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config/kernelci.toml: update KCIDB origin name

As we agreed to refer new KernelCI API & Pipeline as
"maestro", use the new name while submitting data
to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: update KCI result mapping with KCIDB status

Update evaluation of KCIDB status from KCI result.

Create 2 categories for error codes:
1. When pre-check tests completed but actual test suite
couldn't run - this will have `MISS` status
2. When pre-check tests completed, actual test suite could
run but somehow couldn't complete - this will have `ERROR` status

Some LAVA error codes can occur at any point of execution
such as `Cancelled` and `Test`.
Listed such error codes to the most relevant category
based on analysis of available results.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: fix presets for v4l2-decoder-conformance

Following recent updates to data representation on KernelCI nodes,
the top-level nodes for tests now have their kind set to 'job' instead
of  'test'. Update the presets for v4l2-decoder-conformance tests
accordingly.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary presets: fix output file name in kselftest-acpi preset

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: enable dmabuf-heaps, exec and iommu kselftest suites

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Add kcidb_test_suite

* config: result-summary: add generic rule to monitor failures and regression

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: Add rt-stable builds

Copy rt-stable builds from legacy KernelCI.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Major changes to move to new way of writing kbuild jobs

* config: pipeline: Add v6.6-rt branch for builds

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: result-summary: add rt-stable kbuilds presets

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: chromeos: Add 'nfs' suffix to KCIDB suite name for baseline-nfs

The baseline test is currently run with both ramdisk and nfs rootfs. To
distinguish baseline-nfs tests in KCIDB, add an 'nfs' suffix to the KCIDB
test suite name.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* aks: Add kubernetes kcidb deployment

We need a file to manage the deployment of the kcidb bridge
in the Kubernetes production deployment.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* kubernetes: Adjust trigger k8s options

Ignore the kernelci tree on production, as it is a special
"staging"-only tree, and read the whole /config directory, not just
the default pipeline.yaml.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* regression_tracker: bugfix: catch empty search condition

Fix _get_last_matching_node(), after the previous change there was an
unhandled scenario where nodes may be empty but the function wouldn't
return None immediately.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: pipeline: correct the kind of kselftest suites to job

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* scheduler-chromeos.yaml: Temporarily disable non-essential tast tests

As per discussion, we temporarily disable Tast tests that are
unlikely to be reviewed.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* k8s/aks: Update deployment files

1) Update the memory limit, as working with Linux sources might require 3 GB of RAM.
2) Update the config file path
3) Add the callback environment variable
4) Update the image reference to a fresh one

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: enable android builds with gcc-12 for all architectures

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: enable android builds with clang-17 for all architectures

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: remove build_variants from android build_configs

build_variants is the legacy way to specify the different variants. We
have moved to the newer way of specifying variants. Hence remove
build_variants from the android build_configs.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add android15-6.6-lts branch for build as well

The android15-6.6-lts has been included recently in legacy KernelCI:
https://github.com/kernelci/kernelci-core/pull/2597

Add the same in newer KernelCI.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add blocklist for riscv older kernels for android builds

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: update KCIDB test suite mapping for baseline

Use `boot` as KCIDB test suite mapping for all
baseline tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* callback_url: Update config and README

As we are moving callback URL to environment variable,
updating config and README accordingly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: enable android baseline (boot) testing for arm and arm64 in only allmodconfig

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* scheduler.py: If event have jobfilter, inject it to the node data

When someone generates an artificial event with a jobfilter, it is
likely a maintainer trying to repeat a job. Treat it accordingly
and inject the job filter into the job node, so we run only the tests
the maintainer wants.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava_callback: migrate to fastapi

It will be easier to maintain API and Pipeline, as
both will be powered by FastAPI framework.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: Update fluster rootfs URL

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: pipeline: fix defconfigs in fragments

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* kbuild.jinja2: support defconfig as list or str

As required in https://github.com/kernelci/kernelci-core/pull/2608
defconfig might be two types. Support it in jinja2 accordingly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: piepline: add kbuilds of lee-mfd with default defconfigs

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: enable baseline testing for mfd for one board of each arch

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: fix platform sections for Qualcomm and Android schedules

Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>

* k8s: Update deployment to uvicorn, as we use fastapi now

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: Unblock android runs on lava-collabora

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* pipeline: Enable preempt-rt cyclictest test

Enable the first preempt-rt test, cyclictest in new KernelCI. Enable it
on all platforms.

Since these are all smoke tests there is no point in running them too
long. Thus reduce the runtime per test to one minute. This should keep
the total preempt-rt runtime roughly in the same time frame.

The changes have been ported from Daniel's PR [1].

[1] https://github.com/kernelci/kernelci-core/pull/2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* pipeline: add all the test jobs for all rt-test

Add jobs definition of all the rt-tests. Enable cyclicdeadline and rtla
tests to run on all targets.

The changes have been ported from Daniel's PR [1].

[1] https://github.com/kernelci/kernelci-core/pull/2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add template and test properties for preempt_rt jobs

Add template, job and kcidb_test_suite properties for all preempt-rt jobs.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: rename preempt-rt to rt-tests which is correct name of tests

The legacy system used the name preempt-rt for these tests, but the
repository is named rt-tests. We must use the same name so results
merge with execution results coming from other CIs in KCIDB.

Suggested-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add the correct nfsroot for rt-tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: Remove android's deprecated branches

It has been confirmed with Todd that we should remove the deprecated
branches. Hence remove those branches.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: run baseline on non-allmodconfig

The allmodconfig generates very large kernel image. It cannot be booted
on the arm64 and arm targets as tftp errors out that size is too large.
Reduce the kernel image size. Use the default defconfig. The same
defconfigs have been booting for other trees.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* doc: developer-documentation: Update documentation by adding more details

- Reorganize some things
- Specify how to write different variants by removing old syntax
- Give two separate templates for kbuild and test
- Try to put more details for new contributors

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes since v1:
- Fix type
- Apply suggestions from code review

* doc/developer-documentation: fix a glitch in enabling new tree section

Fix a minor bug in YAML block formatting.

Fixes: f5f57de ("doc: developer-documentation: Update documentation by adding more details")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc/developer-documentation: update a section title

Rename a section from "Enabling a new Kernel tree" to
"Enabling new KernelCI trees, builds, and tests" as it explains
enabling tests as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: use the new `tree:branch` format for rules

For cases where we want a single branch to be allowed for a given tree,
we can now use the `tree:branch` format in rules. Convert existing rules
accordingly.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: pipeline: fix improper use of "filters" attribute

The `filters` param was used in the legacy system but has been replaced
by `rules`, with a different syntax.

For Android RISC-V builds, this was used to deny job execution on
kernels < 4.19, so let's translate this condition with the rules format,
and do a similar change for the `rt-tests`-based jobs.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config/pipeline.yaml: Fix x86 typo in kcidebug job names

The kcidebug jobs that run on MediaTek and Qualcomm platforms should
have arm64 in the name rather than x86. Fix the typo.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config: pipeline: remove params

The parameters are only needed when they are changed or appended.
Remove the parameters which aren't being modified.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* validate_yaml.py: Jobs are required to have template parameter

Add more validation to config files of mandatory parameters.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Add more job validations

Add basic validation, each job must have kind parameter

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* workflows: Add label on CI check failures

Automatically add a label so a broken PR won't go to staging.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

---------

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
Signed-off-by: Laura Nao <laura.nao@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>
Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>
Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-authored-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Co-authored-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Co-authored-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Co-authored-by: Helen Koike <helen.koike@collabora.com>
Co-authored-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
Co-authored-by: Laura Nao <laura.nao@collabora.com>
Co-authored-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Co-authored-by: Shreeya Patel <shreeya.patel@collabora.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Milosz Wasilewski <milosz.wasilewski@foundries.io>
Co-authored-by: Paweł Wieczorek <pawiecz@collabora.com>
Co-authored-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>
Co-authored-by: Daniel Wagner <wagi@monom.org>
Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
nuclearcat added a commit to nuclearcat/kernelci-pipeline that referenced this pull request Jul 24, 2024
* src/scheduler: store error message when job fails with "submit_error"

It is helpful for debugging to catch error message when
scheduler fails to submit job to runtime.
Store the error message to `data.error_msg` field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: Set minimum kernel version for DT kselftest to 6.7

The test was introduced upstream in version 6.7, so no point in trying
to run it on earlier versions.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* configs/: Update volteer device

Update volteer devices according to lab availability.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary templates: detailed output for active/inactive regressions

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: new presets for active regressions

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: update CHANGELOG

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* data: chmod -R 777 ./data/output to avoid permission error

Avoid errors like

PermissionError: [Errno 13] Permission denied: '/home/kernelci/data/output/stable-rc-boot.html'

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: move code to _get_logs

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: use ThreadPoolExecutor to fetch logs

Fetching logs is the bottleneck of the script. Fetch them in parallel
with ThreadPoolExecutor.
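
A minimal sketch of the approach using the standard library, assuming a
caller-supplied `fetch(url)` function that returns the log text. The
names `fetch_logs` and `safe_fetch` are illustrative, and turning a
failed download into `None` is an assumption, not necessarily what the
script does:

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_logs(urls, fetch, max_workers=8):
    """Fetch all logs in parallel and return a {url: text or None} dict."""

    def safe_fetch(url):
        # One failed download should not abort the whole report
        try:
            return fetch(url)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map() yields results in the same order as 'urls'
        return dict(zip(urls, executor.map(safe_fetch, urls)))
```

Threads fit here because the work is I/O-bound: each worker spends most
of its time waiting on the network rather than holding the interpreter.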

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: fix result presets

stable-rc-build-failures and stable-rc-boot-failures weren't querying
specifically for test failures.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src/regression_tracker: rework regression detection

Take into account "active" and "inactive" regressions when creating them
and when processing new passed or failed nodes.

When a node passes, it checks if it "inactivates" an existing "active"
regression. When a node fails, it checks if it needs to create a new
regression or update an existing "active" one.
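
The pass/fail handling can be sketched as below. The data layout (a
dict of regressions keyed by test name, plus a dict of last results) is
hypothetical, and so is the rule that a new regression is only created
when the previous run passed:

```python
def process_node(node, regressions, last_result):
    """Update 'regressions' in place for one finished node."""
    name = node["name"]
    regression = regressions.get(name)
    is_active = regression is not None and regression["active"]

    if node["result"] == "pass":
        if is_active:
            regression["active"] = False  # a pass "inactivates" it
    elif node["result"] == "fail":
        if is_active:
            # Still failing: extend the existing active regression
            regression["node_sequence"].append(node["id"])
        elif last_result.get(name) == "pass":
            # Pass followed by fail: this is a new regression
            regressions[name] = {"active": True, "node_sequence": [node["id"]]}
    last_result[name] = node["result"]
```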

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src/regression_tracker: link failed nodes to active regressions

When a failed node generates a regression, or when it's a re-run of a
run that generated a still active regression, link the node to the
regression id.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: support for date ranges for creation and update

New command line options to let the user specify date ranges for node
creation and last update: --created-from, --created-to,
--last-updated-from, --last-updated-to

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: support for date ranges for creation and last update

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: support for extra query parameters in cmdline

New command line option: --query-params to specify a set of extra query
parameters to complete or override preset parameters.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: html markup in some preset titles

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: update and move to docs folder

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: move parameter loading and processing to 'setup'

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: refactor and split into two clases (single, run)

Split the ResultSummary class into a base class and two child classes:
ResultSummarySingle and ResultSummaryLoop (only a stub at this point).

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: WIP initial implementation of the "loop" command

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: huge refactoring

Implement "summary" (single-shot) and "monitor" (loop) modes based on
preset parameters instead of on the command-line main command.

Split the logic into multiple files, move all monitor-specific and
summary-specific code to independent files, common code in a separate
file.

Full of kludges, I don't like how this is looking so far, might consider
reimplementing it without any dependencies on pipeline code.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: fix markup and indentation

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: new generic templates for monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: examples for "monitor" and "summary" modes

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: summary and monitor modes

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: fix generic regression report

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: summary: fix last_updated option handling

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: embed css stylesheet in html files

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* regression_tracker: [trivial] make regression active by default

Fixup for commit fcb29501663d78920bcd129bd57c36b9af624bc4
If the "result" field is ever made non-optional in the models we can
probably remove this.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* regression_tracker: [trivial] set default empty node sequence

Fixup for commit fcb29501663d78920bcd129bd57c36b9af624bc4
If the "node_sequence" field is ever made non-optional in the models we
can probably remove this.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: add cmdline option --output-dir

Introduce a new command-line option: --output-dir, and rename the old
--output to --output-file.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: command-line options change

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: jobs-chromeos: remove meaningless Tast tests

Several Tast tests can only fail in the context of KernelCI:
* `video.PlatformDecoding.v4l2_state*_vp9_0_svc` do not actually exist,
  causing the whole test job to fail
* `platform.DLCService*` and `platform.Memd` rely on features only
  present in the downstream Chrom{e,ium}OS kernel (see b/247467814 and
  b/244479619 for those having access to Google's issue tracker)
* `kernel.ConfigVerify.chromeos` relies on downstream-only config
  options such as `CONFIG_SECURITY_CHROMIUMOS` and other similar ones,
  and therefore can only fail when testing upstream kernels

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: scheduler-chromeos: don't execute non-working Tast tests

Currently, HEVC-related tests are known to either fail or be skipped as
ChromeOS doesn't yet handle hardware decoding of HEVC media. This is
expected to be fixed at some point though, so we're keeping the job
definitions and only remove the corresponding scheduler entries in order
to reinstate those jobs when relevant.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: exclude Tast tests known to always fail

Several decoder tests always fail on all platforms where they're
executed, adding only noise to otherwise useful test results. Disable
those for improving the quality of the results.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: chromeos: add special case for pre-6.7 qcom codec tests

On Qualcomm-based ChromeBooks (`trogdor` being the only model in
Collabora's lab), we noticed systematic failures of all
`vp9_*_frm_resize` and `vp9_*_sub8x8_sf` tests when using a kernel up to
6.6. With 6.7 and above, all of those tests (except one) now pass. It
therefore makes sense to exclude those on pre-6.7 kernels so we don't
report known failures and get rid of some noise.

This involves "duplicating" affected test jobs (although I did my best
to minimize that) and setting rules so only the working variant is
executed, based on the version of the kernel being tested.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* lava_callback: Compress the log files to save storage space

As storage space in cloud and egress have high costs,
better to compress potentially large files.
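
As an illustration of the saving, here is a small sketch using the
standard `gzip` module. The helper name is hypothetical and the actual
compression format used by lava_callback is not stated here:

```python
import gzip


def compress_log(text):
    """Return the gzip-compressed bytes of a log given as a string."""
    return gzip.compress(text.encode("utf-8"))


# Logs tend to be repetitive, so they usually compress well
raw = "[    0.000000] Booting Linux on physical CPU 0x0\n" * 1000
compressed = compress_log(raw)
```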

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* tests: Add basic yaml validation

Add a YAML load step to catch YAML issues earlier.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: drop stoneyridge/pineview naming in platforms anchors

The "stoneyridge" and "pineview" naming used in the Chromebook platform
anchors refers to ChromiumOS specific config fragments, but doesn't
necessarily match the actual platform of all the devices listed.
Use more generic names to distinguish amd and intel Chromebooks.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: rename test job anchors that use chromeos specific configs

Rename test job anchors that use chromeos specific kernel configurations
to include the 'chromeos' infix.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: add baseline tests

Enable the baseline tests on all the supported Chromebooks with their
default kernel configuration.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: drop stoneyridge/pineview naming in job defs

The "stoneyridge" and "pineview" naming used in some Chromebook job
definitions refers to ChromiumOS specific config fragments, but
doesn't necessarily match the actual platforms targeted by the jobs.
Replace all occurrences with more generic intel/amd naming.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: drop chromeos infix from baseline jobs

Keeping different job names for tests targeting different kernel configs
might cause too much duplication. Drop the 'chromeos' infix from the job
name for the tests using the chromeos config fragment. Users will be
able to filter the results using the data.defconfig/data.config_full
fields anyway.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary: post-process results for summary and monitor modes

Split the post-processing of nodes to a common function that can be used
for both summary and monitor modes. Currently, post-processing involves
only the collection of logs.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: update and fix presets and templates

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* doc/result-summary-CHANGELOG: update

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config/pipeline.yaml: enable 'BayLibre' lab

Add lab configuration for BayLibre.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add `lab-baylibre` runtime

Add runtime argument `lab-baylibre` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to BayLibre.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-x86-baylibre` job

Add job configuration `baseline-x86-baylibre` for BayLibre.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-armel-baylibre` job

Add job configuration `baseline-armel-baylibre` for BayLibre.
Add scheduler entry and platform config as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline: enable `android` tree and build configs

Monitor linux `android` tree. Add build configs for `android-mainline`
branch.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/pipeline.yaml: add kbuild definitions for android-mainline

Add kbuild jobs to compile the kernel for android-mainline branch

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/pipeline.yaml: add entries to schedule to build android-mainline

Add entries to `scheduler:` section to run the builds for
android-mainline.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: fix node filter in monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* kernelci.toml: set `checkout` node timeout to `180 min`

The current `60 min` timeout is not enough, as some
`kbuild` jobs and their sub-tests take around 2 hrs to
complete after being submitted to the runtime.

Here is an example from staging. See the information
for a `checkout` and its child nodes:

| id                       | name                | created                    | updated                    | timeout                    |
|--------------------------|---------------------|----------------------------|----------------------------|----------------------------|
| 661c9d59b60b785eb9fc42b0 | checkout            | 2024-04-15T03:22:01.317000 | 2024-04-15T03:51:03.870000 | 2024-04-15T04:22:01.284000 |
| 661c9d97b60b785eb9fc42b4 | kbuild-gcc-10-arm64 | 2024-04-15T03:23:03.399000 | 2024-04-15T03:50:15.031000 | 2024-04-15T09:23:03.399000 |
| 661ca3f7b60b785eb9fc4ead | baseline-arm64      | 2024-04-15T03:50:15.304000 | 2024-04-15T05:09:45.247000 | 2024-04-15T09:50:15.304000 |
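
The arithmetic behind the table can be checked with a few lines, using
only the timestamps shown above (the variable names are illustrative):

```python
from datetime import datetime, timedelta

# 'created' of the checkout node and 'updated' of its last child node
checkout_created = datetime.fromisoformat("2024-04-15T03:22:01.317000")
baseline_updated = datetime.fromisoformat("2024-04-15T05:09:45.247000")

# Total time from checkout creation until the last child finished
elapsed = baseline_updated - checkout_created

exceeds_old_timeout = elapsed > timedelta(minutes=60)
fits_new_timeout = elapsed < timedelta(minutes=180)
```

The child finished roughly 1 h 48 min after the checkout was created,
which is past the old 60 min limit but well inside 180 min.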

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary: add email report capabilities for monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: plain text single report templates

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: chromeos: add baseline-nfs tests

Enable the baseline-nfs tests on all the supported Chromebooks, with
both the default and the chromeos kernel configurations.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src/timeout: set `checkout` result

In `TIMEOUT` mode, set the `checkout` node result to `fail`
if its state is `running`, since that means the node timed out
while the code checkout was still in progress. Set it to `pass`
for any other state.
In `DONE` mode, set the `checkout` node result to `pass`, since
the node has already been in the `available` or `closing` state
and therefore completed the source code checkout successfully.
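
The decision described above fits in a small function. This is a sketch
with a hypothetical name and plain string arguments, not the service
code itself:

```python
def checkout_result(mode, state):
    """Return the result to set on a 'checkout' node, or None if unknown."""
    if mode == "TIMEOUT":
        # Timed out while the checkout itself was still in progress
        return "fail" if state == "running" else "pass"
    if mode == "DONE":
        # The node reached 'available' or 'closing', so checkout succeeded
        return "pass"
    return None
```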

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* regression_tracker: bugfix, failed test with no prior runs

Handle the case of a failed test run when it's the first occurrence of
that test case. Consider it "not a regression" for now, since we're
defining a regression as a "breaking point" between a success and a
failure.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: platforms-chromeos: fix dalboz device type

Due to a copy/paste mishap, the device type for
`asus-CM1400CXA-dalboz` had a trailing `_chromeos`, leading LAVA to fail
finding the correct device type, and no job from the new system running
on this platform.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromes: run Tast tests only on 5.4+

Current ChromeOS images have `ext4` filesystems using options not
present in 4.19. Therefore tests cannot run on kernels that old, and
this leads to false positives in corrupt device identification, so we
should only run those tests on 5.4 and later kernels.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: platforms-chromes: drop non-existent platform

`hp-x360-12b-ca0500na-n4000-octopus` isn't a device type available in
Collabora's LAVA lab, so let's drop its definition.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: exclude android tree from kbuild jobs

Only Android-specific kbuild jobs should run for this tree, let's not
overload our system with unneeded builds.

Take this opportunity to limit mediatek kbuilds to 6.1+ as that's the
earliest version that has upstream support for at least one of our
devices.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src/timeout: a bug fix in `_submit_lapsed_nodes`

Fix a glitch in the code related to setting `checkout`
node result.

Fixes: 361fc0d ("src/timeout: set `checkout` result")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* pipeline.yaml: Update early access FQDN

We are moving k8s from eastus to westus3 as it is cheaper

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/tarball: fix `_kdir` in `update_repo`

Fix the below error:
```
kernelci-pipeline-tarball |   File "/home/kernelci/./pipeline/tarball.py", line 79, in _update_repo
kernelci-pipeline-tarball |     kernelci.shell_cmd(f"rm -rf {self._kdir}")
kernelci-pipeline-tarball |                                  ^^^^^^^^^^
kernelci-pipeline-tarball | AttributeError: 'Tarball' object has no attribute '_kdir'
```

Fixes: 0a2fe9c ("src/patchset.py: Implement Patchset service")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: fix method to get child nodes recursively

`TimeoutService._get_child_nodes_recursive` is used to get
pending child nodes recursively for closing and timed-out
nodes. It overwrites the result while being called recursively.
Fix the method to make it work properly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
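
The overwrite problem described above can be illustrated with a minimal
sketch. It is not the actual `TimeoutService` code: the function
signature, the `find_children` callback and the in-memory `TREE` are
assumptions made only for this example.

```python
from typing import Callable, Dict, List

Node = Dict[str, str]


def get_child_nodes_recursive(
    node_id: str,
    find_children: Callable[[str], List[Node]],
) -> List[Node]:
    """Collect every descendant of node_id, depth first."""
    collected: List[Node] = []
    for child in find_children(node_id):
        collected.append(child)
        # Extend the list instead of reassigning it, so nodes gathered
        # from earlier subtrees are not overwritten by deeper calls.
        collected.extend(get_child_nodes_recursive(child["id"], find_children))
    return collected


# Hypothetical in-memory tree standing in for the API lookup.
TREE = {
    "checkout": [{"id": "kbuild"}, {"id": "kunit"}],
    "kbuild": [{"id": "baseline"}],
}

descendants = get_child_nodes_recursive("checkout", lambda i: TREE.get(i, []))
print([node["id"] for node in descendants])
```

Each call returns its own list and the caller merges it, so no shared
state is needed between recursion levels.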

* config: pipeline: rename "armel" arch to "arm"

`armel` has various meanings depending on the system: for ChromeOS, it
is ARMv7, while in Debian it's ARMv{5T,6}. Moreover, this project is
*Kernel*CI and the kernel uses `arm` for all 32-bit ARM devices. In
order to avoid confusion (including those wondering what the heck does
`armel` mean), let's rename `armel` to `arm`.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: use per-system arch property where relevant

With the new `*arch` fields present in the platform configurations, we
don't have to hardcode the architecture strings in some specific cases.
Let's adapt the config files so we use `{cros,deb,k}arch` wherever it
makes sense.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src/timeout: set timed-out `checkout` result

Set the result of a timed-out `checkout` node to `incomplete`
when it is still in the `running` state, as this denotes that
the node timed out while the checkout was still in progress.
Also set the error-related information, i.e. `error_code`
and `error_msg`.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/tarball: update checkout node when update repo fails

Tarball updates the source code repo and creates the tarball.
If the update repo operation fails even on the second attempt,
it means it failed to check out the source code.
Hence, update the `checkout` node with state `done` and
result `fail`. Also, set appropriate error information
in the `data` field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: enable collabora-next tree and build config

Monitor the collabora-next tree. Add build config for the for-kernelci
branch.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: enable acpi kselftest on collabora-next tree

Run the ACPI kselftest on the for-kernelci branch of the collabora-next
tree.

See: https://lore.kernel.org/linux-kselftest/20240308144933.337107-1-laura.nao@collabora.com/T/#t

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary: restore missing split_query_params function

Restore this function that was accidentally removed during the last
refactoring.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* lava_callback: Don't upload empty files to Azure

There is no use for a lot of empty files on Azure;
they only complicate cleanup.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: unify preset and output names

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: update preset for aferraris

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: new presets for laura.nao

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: fixes and new presets for nfraprado

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: fix arch query parameters

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* k8s: Lot of deployment tested fixes

Fixes in yaml files for k8s production deployment.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result-summary presets: Fix build failure and regression monitors

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* result_summary: added debug traces to the monitor

Show detailed info of the node filterings in real time.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: fix corner case bug when no logs are found

Cover rare case where neither the node nor any of its parents up to the
checkout node have any log artifacts.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: refine stable-rc presets

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: add regression info to test reports

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: escape log snippets

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src: lava_callback: add device ID to node data

It can be useful to know the exact device on which a job ran, without
having to open the LAVA job page. This is done by querying the device ID
from the callback data and appending it to the node data.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: lava_callback: upload raw callback data as well

Debugging callback issues is complex due to the raw data not being saved
after processing. This change ensures we save the callback data as a
JSON file in order to ease development.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* DONOTMERGE lava_callback: add debug statements

Why the heck doesn't this just work???

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* result_summary_templates: fix error 'node' is undefined

The object is named test and not node, so s/node/test

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/runtime/kunit: set architecture info

Set architecture field for `kunit` test
nodes.
If no `arch` argument is supplied, kunit takes
`um` (User Mode Linux) as architecture to run
tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: count running child jobs of build nodes

Add a method to count running jobs of `kbuild`
nodes i.e. jobs being submitted after successful
builds. For example `baseline` or `tast` jobs.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: handle closing `checkout` node differently

Usually, `checkout` should be transited to `done` state
when all its child nodes are completed.
In case of closing `checkout`, take into account
running child jobs of build nodes before transiting
its state to `done`. Otherwise, `checkout` will be
assigned to `done` state even if some child jobs are still
running.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: handle holdoff reached `checkout` node differently

Usually, available `checkout` for which holdoff is
reached should be transited to `done` state only when
all its child nodes are completed.
In case of such `checkout` node, take into account
running child jobs of build nodes before transiting
its state to `done`. Otherwise, `checkout` will be
assigned to `done` state even if some child jobs are
still running.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
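
The rule shared by the two `checkout` commits above can be sketched as
follows. This is only an illustration under stated assumptions: the
function name, the `state` key and the precomputed job count are
hypothetical and do not come from `src/timeout`.

```python
from typing import Dict, List


def can_transit_checkout_to_done(
    child_nodes: List[Dict[str, str]],
    running_build_child_jobs: int,
) -> bool:
    """Decide whether a `checkout` node may move to the `done` state."""
    # Direct children (e.g. kbuild nodes) must all be completed.
    children_done = all(node["state"] == "done" for node in child_nodes)
    # Jobs submitted after successful builds must have finished too,
    # otherwise the checkout would be closed while work is in flight.
    return children_done and running_build_child_jobs == 0


builds = [{"state": "done"}, {"state": "done"}]
print(can_transit_checkout_to_done(builds, running_build_child_jobs=2))
print(can_transit_checkout_to_done(builds, running_build_child_jobs=0))
```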

* Revert "DONOTMERGE lava_callback: add debug statements"

This reverts commit 5ed8218d99840373bbba5830b1976813b52bf4b1.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* Create dependabot.yml

* result_summary_templates: make generic-test-failures generic to all
results

The generic-test-failures templates can be used to show general results
by just replacing the name "failures" with "results". This makes them
easier to reuse by communities that want presets listing all results of
the tests, so:

	s/generic-test-failures/generic-test-results

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result-summary.yaml: add preset to list android build tests

Since we now build android, add a preset to allow result-summary.yaml to
list all build results from Android tree.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* tarball: Implement checkout for specific commit

We often need a specific commit rather than the tip of tree (ToT); implement this.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* jobs-chromeos.yaml: Disable module compression for every kernel version

Commit d4bbe942098b ("kbuild: remove CONFIG_MODULE_COMPRESS"),
introduced in kernel v5.13, substituted CONFIG_MODULE_COMPRESS=n for
CONFIG_MODULE_COMPRESS_NONE=y as the way to disable module compression.
Since module compression causes "Invalid ELF header magic: != ELF"
errors during boot on the ChromeOS base config, add the missing config
to disable module compression on kernels >= v5.13 as well.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* src: lava_callback: reduce callback data size

The callback data is quite large, especially as it includes the full log
which we already upload separately. By dropping it and compressing the
whole file with `gzip` we can avoid wasting too much storage space.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: lava_callback: don't leak secret token

The callback data contains the secret token's value, which shouldn't be
leaked. Ensure we drop it from the uploaded data.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
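
A minimal sketch of scrubbing and compressing callback data before
upload is shown below. The key name `token` and the helper name are
assumptions for illustration; the real callback payload layout is not
shown in this log.

```python
import gzip
import json
from typing import Dict, Iterable


def pack_callback_data(data: Dict, secret_keys: Iterable[str] = ("token",)) -> bytes:
    """Drop secret fields, then gzip the remaining JSON document."""
    secrets = set(secret_keys)
    # Build a cleaned copy so the caller's dictionary is left untouched.
    cleaned = {key: value for key, value in data.items() if key not in secrets}
    return gzip.compress(json.dumps(cleaned).encode("utf-8"))


# Hypothetical payload: only the non-secret field should survive.
packed = pack_callback_data({"token": "s3cret", "status": "complete"})
print(json.loads(gzip.decompress(packed)))
```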

* config: platforms-chromeos: use new cros-flash image

This ensures we use the new version of the `install-modules` script.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: regression_tracker: add the "device" field to regression data

This can be helpful. We're not using it as a search param though, as we
don't want to narrow down the search that much, using the platform only
is better.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: result_summary_templates: report device used for job

This information is now available, and it can be useful to know the
affected device without having to look at the LAVA job details.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* kubernetes: Update deployment recipe

Update list of labs and add KCI_INSTANCE variable.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava-callback: Limit threads of lava-callback

Due to the inrush of LAVA callbacks and slow Azure Files
processing, we need to make sure we don't spawn too many
threads.
Also add a hard memory limit of 1 GB.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: add presets for fluster test

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Make template generic for all v4l2 tests
- Rebase on main

* result_summary presets: make the name of fluster test generic

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: enable first fluster test for mt8195-cherry-tomato-r2

Enable first fluster test, AV1-TEST-VECTORS for mt8195-cherry-tomato-r2.
Run the test on mainline and next until more trees are added.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Create generic v4l2-decoder-conformance-job and use anchors from it
- Update the rootfs address
- Move anchor to _anchor
- Update with nitpicks

* config: jobs-chromeos: Add kernelci tree for testing purpose

Remove this commit before merging.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: chromeos: Enable cpufreq kselftest

Enable cpufreq kselftest on all the trees and branches.

Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>

* result_summary presets: fix preset for kselftest-dt failures monitor

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: new presets for kselftest-cpufreq

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: mt8195-cherry-tomato-r2: enable all fluster tests for all branches

Add all the trees and branches on which the tests will be run. Enable
all the tests for tomato.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- The build config cannot be added yet. Just list the trees, it will only use
  the branches configured in build_configs:
  - mainline will use master
  - next will use master
  - collabora-chromeos-kernel will use for-kernelci
  - media will use master and fixes
- Remove kernelci tree as it was added just for testing purpose

* config: mt8183-kukui-jacuzzi-juniper-sku16: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

jacuzzi

* config: mt8186-corsola-steelix-sku131072: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: mt8192-asurada-spherion-r0: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Don't specify the platforms manually as they are already mentioned in
  test-job-arm64-mediatek

* config: sc7180-trogdor-kingoftown/lazor-limozeen: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Use test-job-arm64-qualcomm instead and create separate jobs for
  qualcomm devices
- Don't specify platforms manually as they are already mentioned in
  test-job-arm64-qualcomm

* build(deps): bump uwsgi from 2.0.21 to 2.0.22 in /docker/lava-callback

Bumps [uwsgi](https://uwsgi-docs.readthedocs.io/en/latest/) from 2.0.21 to 2.0.22.

---
updated-dependencies:
- dependency-name: uwsgi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* pipeline.yaml: Add stable-rc build variants

Add more build variants for stable-rc tree to match legacy system.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary: add error classification

Classify errors according to patterns in the logs

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary presets: add collabora-chromeos-kernel and media trees for fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: Use media-stage instead of media-tree

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config/pipeline: enable android branches from legacy

Enable all android branches from the legacy system

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* trigger: Add exclude/include tree list for trigger

As we need to restrict the list of kernels run on staging,
we need an option that allows this.
It would also be good to exclude staging kernels from the production
kernel list.

So in case of staging we need to run kernels only from tree "kernelci"
and sometimes something else, for example "mediatek".
Option will look like:

--trees kernelci,mediatek
or
--trees kernelci

On production we need to exclude trees kernelci and buggytree:
--trees !kernelci,buggytree
or just kernelci:
--trees !kernelci

The purpose of this option is that our compile capacity is limited,
and right now staging and production both compile a very large set
of kernels; we need to reduce this amount to cut costs.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
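
The `--trees` semantics described above (a plain list includes only the
named trees, a leading `!` excludes all of them) could be parsed along
these lines. This is a sketch, not the trigger service's code; both
function names are hypothetical.

```python
from typing import List, Tuple


def parse_trees_option(value: str) -> Tuple[bool, List[str]]:
    """Return (exclude, names) for a --trees value such as '!a,b'."""
    exclude = value.startswith("!")
    if exclude:
        value = value[1:]
    names = [name.strip() for name in value.split(",") if name.strip()]
    return exclude, names


def tree_allowed(tree: str, option: str) -> bool:
    """Check whether a tree passes the include/exclude filter."""
    exclude, names = parse_trees_option(option)
    if not names:
        # An empty option is treated as "no restriction" in this sketch.
        return True
    return (tree not in names) if exclude else (tree in names)


print(tree_allowed("mediatek", "kernelci,mediatek"))
print(tree_allowed("buggytree", "!kernelci,buggytree"))
```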

* config: platforms-chromeos: use CrOS R124 files

ChromeBooks were upgraded with a new image based on ChromiumOS R124, so
we must use those files now.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: drop non-existent Tast tests

Those were removed between R120 and R124 and therefore cause test
failures with the new images.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* result_summary presets: fix acpi kselftest presets

We're interested in catching regressions and failures in both the
kselftest-acpi test suites and its test cases. Match the nodes by group
in the presets accordingly.
Fix template used by the failure monitor preset.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src: update return values of `APIHelper.receive_event_node`

`APIHelper.receive_event_node` method is used to receive
node data from PubSub event. The method has been updated
to return `is_hierarchy` flag as well which represents
events related to node hierarchy.
Update pipeline services using the method accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: refine presets for v4l2-decoder-conformance

Modify the regression preset to monitor regressions on both the
v4l2-decoder-conformance test suites and its test cases, by matching the
nodes by group instead of by name.
Also, change the failure preset to monitor for all errors caused by
runtime errors.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary presets: add summary presets for v4l2-decoder-conformance

Add summary presets to fetch regressions and failures on
v4l2-decoder-conformance tests. Two of the presets are the same used by
the monitor; add one additional preset to fetch all the failures on
both the test suites and their test cases.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* lava_callback.py: Remove error_code/error_msg on lava-callback

Sometimes, due to congestion, a node might be set to timeout, but
the result might then arrive late and we need to use it properly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: fix dt kselftest presets

Fix the dt kselftest preset, just like was done for the acpi one, as the
current preset doesn't match the actual results we're interested in.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* doc/connecting-lab: refine documentation

Refine documentation for connecting LAVA labs
and submitting jobs to the lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* lava_callback: Sometimes we get totally invalid log file uploaded

Most likely the problem lies in Flask's threading, and callbacks
are possibly getting mixed up. This commit attempts to introduce
several countermeasures against that.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* doc: add `_index.md` page

Add index documentation page.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc: add `pipeline-details` page

Move `pipeline-details` documentation from the API
repository to this repo to make it close to the source.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc/connecting-lab: adjust `weight` property

Change the `weight` property of the existing doc page to
accommodate the transition of pipeline-related docs
to the pipeline repo.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc: add `developer-documentation` page

Add developer manual documentation.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add lab config for Qualcomm

Add an entry to `runtimes` section for Qualcomm
lab configurations.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-x86` job for qualcomm

Add job configuration `baseline-x86-qualcomm` for
running baseline job in Qualcomm LAVA lab.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add lab-qualcomm runtime

Add runtime argument `lab-qualcomm` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to Qualcomm LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-arm64` job for qualcomm

Add job configuration `baseline-arm64-qualcomm` for
running baseline job for `arm64` in Qualcomm LAVA lab.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* pipeline.yaml: Update RISC-V configs

1) rv32 defconfig doesn't exist, remove it
2) nommu_k210_defconfig has modules disabled

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava_callback.py: Sanitize lava log data

As we use this data in reports, let's remove all
non-printable characters as they confuse Grafana, browsers and others.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
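
One way to strip non-printable characters from a log is sketched below.
It is an assumption about the approach, not the code in
`lava_callback.py`; in particular, keeping newlines and tabs is a choice
made for this example.

```python
def sanitize_log(text: str) -> str:
    """Remove non-printable characters, keeping newlines and tabs."""
    # str.isprintable() is False for control characters such as NUL
    # and ESC, and also for "\n" and "\t", which are kept explicitly.
    return "".join(ch for ch in text if ch.isprintable() or ch in "\n\t")


raw = "boot\x00 ok\x1b[0m\n"
print(repr(sanitize_log(raw)))
```

Only the escape byte itself is dropped here, so the visible remainder of
an ANSI sequence (`[0m`) stays in the output; removing whole sequences
would need an additional pattern-based step.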

* config/runtime/kunit.jinja2: fix result map

Fix the result map for skipped tests. Initially, the API schema
didn't have `skip` as an available node result.
That's why it was mapped to a `None` result. But now the API
has a `skip` result to denote skipped tests.
Fix the result mapping accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: jobs-chromeos: Add lab-setup fragment

Add the lab-setup fragment to the chromebook builds, which contains the
architecture independent kernel configs needed to run tests on the
platform. Notably this disables IP autoconfig by the kernel.

The result of this change is that the 12 seconds boot delay and the
consequent deferred probe pending warnings will no longer happen on any
platform. Particularly on mt8186-corsola-steelix-sku131072 (due to a
different network adapter being used) on which it was still happening.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* lava_callback: bump up slightly threads number

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: enable watchdog reset test on Chromebooks

Add a basic test to verify watchdog reset functionality. Enable the
test on all ARM64 and AMD x86_64 Chromebooks. For Intel
Chromebooks, enable the test only on octopus, as ACPI PM Timer on the
other devices has been disabled in coreboot.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src/send_kcidb: use schema version 4.3

Test status `MISS` was added to KCIDB in schema
v4.2 and supported by the latest version i.e. v4.3.
Hence, use the latest version for submission as
API may send a few tests with "MISS" status.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* send_kcidb: re-structure code for parsing checkout node

Move code for parsing checkout node to a separate
method.
Add `valid` field to parsed checkout node. It denotes
if source code was successfully checked out.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: print more information on invalid data

Print details for invalid revision data for the
sake of debugging.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: optimize `kcidb` import

Remove redundant `kcidb` import and adjust
kcidb Client call accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: remove keys with `None` values

KCIDB doesn't allow `None` as field value.
Remove all optional fields with `None` value
to make it valid data for submitting to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
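
Dropping `None`-valued fields before submission can be sketched as a
small recursive helper. The function name and the sample record are
hypothetical; only the rule (no `None` values) comes from the commit
message above.

```python
from typing import Any


def remove_none_fields(data: Any) -> Any:
    """Return a copy of data without dictionary keys whose value is None."""
    if isinstance(data, dict):
        return {
            key: remove_none_fields(value)
            for key, value in data.items()
            if value is not None
        }
    if isinstance(data, list):
        # Lists are kept as-is apart from cleaning nested dictionaries.
        return [remove_none_fields(item) for item in data]
    return data


record = {"id": "maestro:1", "log_url": None, "misc": {"arch": None, "runtime": "lava"}}
print(remove_none_fields(record))
```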

* config: add `kcidb_test_suite` property

Every KernelCI test will be mapped to a unified
test suite for KCIDB data submission.
Add `kcidb_test_suite` property to test job
definitions in YAML configuration files.
The added property will store the mapped
KCIDB test suite name.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: parse and submit node test and build data

Listen to all the node events with node state
`done` or `available` and submit the node to KCIDB.
Parse node received from the event and create KCIDB
schema compatible object based on type of the node
i.e. checkout, build or test.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: set `log_excerpt` for builds and tests

Fetch logs from the compressed log file (*.log.gz) URL
and send the last 16*1024 characters as the `log_excerpt`
field for build and test nodes, as that is the maximum allowed
length of the KCIDB field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
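
The truncation step can be sketched as below. Fetching the file over
HTTP is left out, so the function takes the already downloaded bytes;
the function and constant names are assumptions for illustration.

```python
import gzip

# Maximum length of the KCIDB `log_excerpt` field, per the commit message.
MAX_LOG_EXCERPT = 16 * 1024


def log_excerpt_from_gzip(raw: bytes) -> str:
    """Decompress a *.log.gz payload and keep only its tail."""
    text = gzip.decompress(raw).decode("utf-8", errors="replace")
    # Negative slicing keeps the last MAX_LOG_EXCERPT characters and
    # returns the whole text when it is shorter than the limit.
    return text[-MAX_LOG_EXCERPT:]


sample = gzip.compress(("x" * 20000 + "END").encode("utf-8"))
excerpt = log_excerpt_from_gzip(sample)
print(len(excerpt), excerpt[-3:])
```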

* config/jobs-chromes: add kcidb test suite property for watchdog test

Add KCIDB test suite mapping for `watchdog_reset` test.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* lava_callback.py: disable log removal from callback data

We need it for investigations if we have any critical data
loss during log sanitizing.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/send_kcidb: add error info to build nodes

Add error metadata fields such as `error_code` and
`error_msg` to `misc` field for build nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: add watchdog-reset presets for mainline/next

Add monitor and summary presets to track the results from the watchdog
reset test on the mainline and next trees.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* pipeline.yaml: Fix fluster rootfs URL

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/send_kcidb: get error metadata for failed/incomplete tests

Tweak condition to get error metadata for test nodes.
It should get error info for incomplete nodes as well
and not just failed nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: send tests only if KCIDB test mapping exists

All test suite definitions must have `kcidb_test_suite`
property i.e. KCIDB test suite mapping.
Only send tests for which the mapping is found.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* tests/validate_yaml: add validation for KCIDB mapping

To submit KernelCI-generated data to KCIDB, every job
definition needs a mapping via the
`kcidb_test_suite` property.
Add validation to ensure all the jobs have a mapping
present to avoid missing data submission.
This check is to notify test authors trying to enable tests
in maestro to include the required property for the mapping
in their definition.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
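
A check of this kind might look like the sketch below. It operates on an
already loaded `jobs` dictionary; the job names and suite values in the
sample are hypothetical, and the real `tests/validate_yaml` script may
restrict the check to certain job kinds.

```python
from typing import Dict, List, Optional


def jobs_missing_kcidb_mapping(jobs: Dict[str, Optional[dict]]) -> List[str]:
    """List job definitions that lack a `kcidb_test_suite` property."""
    return sorted(
        name
        for name, definition in jobs.items()
        if not (definition or {}).get("kcidb_test_suite")
    )


# Hypothetical job definitions, as they might look after YAML loading.
jobs = {
    "watchdog-reset": {"kcidb_test_suite": "example.watchdog"},
    "new-test": {"template": "generic.jinja2"},
}
missing = jobs_missing_kcidb_mapping(jobs)
print(missing)
```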

* config/pipeline.yaml: add qcs6490-rb3gen2 boot test

Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>

* config: chromeos: Enable kselftest-dt on Qualcomm platforms

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* pipeline.yaml: Add one um build for android trees

As requested by the Android team, it would be good to check UM builds
for breakages as well.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: use `kind=job` for test suites

As part of re-structuring the test hierarchy, a `Job` model
has been introduced for test suite/job nodes.
It uses node kind `job`.
Update test configurations in `pipeline.yaml` and
`jobs-chromeos.yaml` to use `kind=job` to
generate job nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: provide `kind` value for child tests

In case of submitting test hierarchy, child nodes by default
inherit `kind` value from parent node.
As we are re-structuring test hierarchy, test suite/job nodes
will have `kind=job` where its child test nodes will have
`kind=test`. Provide `kind` field explicitly to test result
hierarchy to preserve different kind value than the parent
node.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: fix `NameError`

Fix the below error in `_submit` method:
```
Traceback (most recent call last):
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 287, in main
    job.submit(results)
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 138, in submit
    self._submit(result)
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 265, in _submit
    return node
NameError: name 'node' is not defined
```

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: evaluate job node result

Evaluate job node result from child node results if
`null` result is received from the test result parser.
For example nodes such as `fortify`:
https://staging.kernelci.org:9000/viewer?node_id=6670ab43d0b7694b399897c4

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: fix parsing of KUnit log file

Handle both compressed(gzip) and plain text log files
for getting log excerpt.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
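
Handling both kinds of log file can be done by checking for the gzip
magic bytes, as in this sketch. The detection method is an assumption;
the commit does not say how the two cases are told apart.

```python
import gzip

# The first two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def decode_log(raw: bytes) -> str:
    """Return log text from either a gzip-compressed or a plain payload."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return raw.decode("utf-8", errors="replace")


print(decode_log(gzip.compress(b"ok 1 - example\n")))
print(decode_log(b"ok 1 - example\n"))
```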

* src/send_kcidb: HTTP exception handling for log excerpt

Add HTTP exception handling for getting
log excerpt data.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: platforms-chromeos: Add serial delay for some Mediatek platforms

Add test_character_delay to the Spherion, Tomato and Steelix platforms
to workaround the fact that they're sometimes unable to process serial
input fast enough, resulting in mangled commands and consequently flaky
test results, as described in
https://github.com/kernelci/kernelci-project/issues/366.

The right place to do this change would be in the device-type template
as described in LAVA's documentation [1]. This overriding in KernelCI is
meant only as a temporary workaround to verify whether this fixes the
issue. If it does, then we'll do it in LAVA upstream instead.

[1] https://docs.lavasoftware.org/lava/debugging.html#differences-in-input-speeds
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config: chromeos: Enable error-logs kselftest for MediaTek Chromebooks

Run the error-logs kselftest on MediaTek Chromebooks. This test is
currently under review upstream [1] so, in the meantime, it has been
added to the collabora-next tree so it can prove its value by helping to
detect issues upstream.

[1] https://lore.kernel.org/all/20240423-dev-err-log-selftest-v1-0-690c1741d68b@collabora.com

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config/pipeline.yaml: enable CIP lab

Add configuration for LAVA CIP lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add baseline-x86 test for CIP

Add `baseline-x86-cip` test to be submitted to CIP
LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add `lab-cip` runtime

Add runtime argument `lab-cip` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to CIP LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: enable `job` node submission to KCIDB

Parse newly added job node and its child tests
for KCIDB submission.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: don't submit `setup` test suite nodes

`setup` test suite has been introduced to store test results
for environment setup checks before running actual test suite.
KCIDB doesn't require `setup` test suite result as long as
main test job result is submitted.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: add a check before sending data

Check if parsed data is available before
sending revision data to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: fix logs

Fix the log statement about submitting a node to KCIDB,
as we do not send to KCIDB every node we receive
an event for.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: handle skipped tests

Do not retrieve artifacts or metadata from parent
node for skipped tests as in practice only kernel
revision, test runtime and platform will be
available for skipped tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary/utils: ignore failures on log retrieval

Make the script continue running if there was an error fetching a test
log.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* doc/developer-documentation: add docs for enabling new tests

Add developer documentation for enabling new tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* Fix links after docs page migration

Documentation has been migrated to the "docs.*" subdomain.

Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>

* pipeline.yaml: Add kcidebug fragment

Add useful low-overhead debug option to kernel,
and test on most x86 boards we have available,
with minimal baseline tests.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* configs: update gcc-10 to gcc-12

As we upgrade compiler images, we need to update the gcc version

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* regression_tracker: workaround: match node paths programmatically

Don't use 'path' as an api search parameter. The use of lists as query
parameters (path is a list) is undefined. Instead, do the filtering in
code.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
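
Filtering in code instead of passing `path` as a query parameter can be
sketched as follows. The node layout (a `path` list per node) follows
the description above; the function name and sample nodes are
hypothetical.

```python
from typing import Dict, List


def filter_nodes_by_path(nodes: List[Dict], path: List[str]) -> List[Dict]:
    """Keep only the nodes whose `path` list equals the wanted one."""
    # List equality compares element by element, in order.
    return [node for node in nodes if node.get("path") == path]


nodes = [
    {"id": "1", "path": ["checkout", "kbuild-gcc-12-x86", "baseline-x86"]},
    {"id": "2", "path": ["checkout", "kbuild-gcc-12-arm64", "baseline-arm64"]},
]
wanted = ["checkout", "kbuild-gcc-12-x86", "baseline-x86"]
print([node["id"] for node in filter_nodes_by_path(nodes, wanted)])
```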

* config: remove qemu jobs from lab-qualcomm

QEMU jobs use container pulled from hub.docker.com. After the lab move
pulling from this registry is no longer possible at Qualcomm. This patch
disables QEMU jobs from Qualcomm lab.

Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>

* validate_yaml.py: Improve pipeline validation

Add validation that scheduler entries have a matching job entry
(this is a critical validation) and that job entries have at least
one entry in the scheduler.
Fix one entry detected by this validation.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
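
A rough sketch of the two checks described above, under stated assumptions:
`jobs` is taken to be a mapping of job names to definitions and `scheduler`
a list of entries with a `job` key. The function name and data layout are
illustrative and are not taken from validate_yaml.py.

```python
def check_scheduler_jobs(jobs, scheduler):
    """Cross-check job definitions against scheduler entries.

    Assumed layout (illustrative only): `jobs` maps job names to their
    definitions and `scheduler` is a list of entries with a 'job' key.
    Returns (missing, unscheduled).
    """
    scheduled = {entry["job"] for entry in scheduler}
    # Scheduler entries pointing at jobs that are not defined.
    missing = sorted(scheduled - set(jobs))
    # Defined jobs that no scheduler entry references.
    unscheduled = sorted(set(jobs) - scheduled)
    return missing, unscheduled


jobs = {"baseline-x86": {}, "kbuild-gcc-12-x86": {}}
scheduler = [{"job": "baseline-x86"}, {"job": "baseline-arm64"}]
missing, unscheduled = check_scheduler_jobs(jobs, scheduler)
print("missing job definitions:", missing)
print("jobs never scheduled:", unscheduled)
```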

* pipeline.yaml: Add broonie(Mark Brown) trees to pipeline

It is time to enable even more trees.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Add additional verification for duplicate keys

We might have redefined the same keys in different yaml files;
this tool will ensure consistency of these entries.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
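
One possible shape for such a duplicate-key check is sketched below. It
assumes the yaml files have already been parsed into dicts of the form
`{section: {key: value}}`; that layout and the name `find_duplicate_keys`
are assumptions for illustration, not the actual validate_yaml.py code.

```python
from collections import defaultdict


def find_duplicate_keys(configs):
    """Report keys defined in more than one file.

    `configs` maps a file name to its parsed content, assumed here to be
    {section: {key: value}}; this layout is an illustration only.
    Returns {(section, key): [file names]} for keys seen in several files.
    """
    seen = defaultdict(list)
    for filename, content in configs.items():
        for section, entries in content.items():
            for key in entries:
                seen[(section, key)].append(filename)
    return {name: files for name, files in seen.items() if len(files) > 1}


configs = {
    "pipeline.yaml": {"jobs": {"baseline-x86": {}, "kbuild-gcc-12-x86": {}}},
    "jobs-chromeos.yaml": {"jobs": {"baseline-x86": {}}},
}
duplicates = find_duplicate_keys(configs)
print(duplicates)
```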

* validate_yaml.py: Remove path separator

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Rename variable to schedules

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config/kernelci.toml: update KCIDB origin name

As we agreed to refer new KernelCI API & Pipeline as
"maestro", use the new name while submitting data
to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: update KCI result mapping with KCIDB status

Update evaluation of KCIDB status from KCI result.

Create 2 categories for error codes:
1. When pre-check tests completed but actual test suite
couldn't run - this will have `MISS` status
2. When pre-check tests completed, actual test suite could
run but somehow couldn't complete - this will have `ERROR` status

Some LAVA error codes can occur at any point of execution
such as `Cancelled` and `Test`.
Listed such error codes to the most relevant category
based on analysis of available results.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: fix presets for v4l2-decoder-conformance

Following recent updates to data representation on KernelCI nodes,
the top-level nodes for tests now have their kind set to 'job' instead
of  'test'. Update the presets for v4l2-decoder-conformance tests
accordingly.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary presets: fix output file name in kselftest-acpi preset

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: enable dmabuf-heaps, exec and iommu kselftest suites

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Add kcidb_test_suite

* config: result-summary: add generic rule to monitor failures and regression

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: Add rt-stable builds

Copy rt-stable builds from legacy KernelCI.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Major changes to move to new way of writing kbuild jobs

* config: pipeline: Add v6.6-rt branch for builds

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: result-summary: add rt-stable kbuilds presets

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: chromeos: Add 'nfs' suffix to KCIDB suite name for baseline-nfs

The baseline test is currently run with both ramdisk and nfs rootfs. To
distinguish baseline-nfs tests in KCIDB, add an 'nfs' suffix to the KCIDB
test suite name.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* aks: Add kubernetes kcidb deployment

We need a file that will manage the deployment of the kcidb bridge
in the kubernetes production deployment.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* kubernetes: Adjust trigger k8s options

Ignore the kernelci tree on production, as it is a special
"staging"-only tree, and read the whole /config directory, not just the
default pipeline.yaml.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* regression_tracker: bugfix: catch empty search condition

Fix _get_last_matching_node(), after the previous change there was an
unhandled scenario where nodes may be empty but the function wouldn't
return None immediately.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: pipeline: correct the kind of kselftest suites to job

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* scheduler-chromeos.yaml: Temporarily disable non-essential tast tests

As per discussion, we temporarily disable tast tests which are
unlikely to be reviewed.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* k8s/aks: Update deployment files

1) Update memory limit, as working with linux sources might require 3 GB of RAM.
2) Update config file path
3) Add callback environment variable
4) Update image reference to a fresh one

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: enable android builds with gcc-12 for all architectures

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: enable android builds with clang-17 for all architectures

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: remove build_variants from android build_configs

build_variants is the legacy way to specify the different variants. We
have moved to the newer way of specifying variants. Hence remove
build_variants from the android build_configs.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add android15-6.6-lts branch for build as well

The android15-6.6-lts has been included recently in legacy KernelCI:
https://github.com/kernelci/kernelci-core/pull/2597

Add the same in newer KernelCI.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add blocklist for riscv older kernels for android builds

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: update KCIDB test suite mapping for baseline

Use `boot` as KCIDB test suite mapping for all
baseline tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* callback_url: Update config and README

As we are moving callback URL to environment variable,
updating config and README accordingly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: enable android baseline (boot) testing for arm and arm64 in only allmodconfig

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* scheduler.py: If event has jobfilter, inject it into the node data

When someone generates an artificial event with a jobfilter, it is
likely a maintainer trying to repeat a job. Treat this accordingly
and inject the job filter into the job node, so that we run only the
tests the maintainer wants.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava_callback: migrate to fastapi

It will be easier to maintain the API and the Pipeline, as
both will be powered by the FastAPI framework.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: Update fluster rootfs URL

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: pipeline: fix defconfigs in fragments

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* kbuild.jinja2: support defconfig as list or str

As required in https://github.com/kernelci/kernelci-core/pull/2608
defconfig might be two types. Support it in jinja2 accordingly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
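
The "list or str" handling above amounts to normalizing the value before
use. The actual change lives in a jinja2 template; the Python sketch below
only illustrates the idea, and `normalize_defconfig` is a hypothetical
helper name.

```python
def normalize_defconfig(defconfig):
    """Return defconfig as a list, whether it was given as str or list.

    Hypothetical helper used only to illustrate the str-or-list handling.
    """
    if isinstance(defconfig, str):
        return [defconfig]
    return list(defconfig)


print(normalize_defconfig("defconfig"))
print(normalize_defconfig(["defconfig", "allmodconfig"]))
```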

* config: pipeline: add kbuilds of lee-mfd with default defconfigs

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: enable baseline testing for mfd for one board of each arch

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: fix platform sections for Qualcomm and Android schedules

Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>

* k8s: Update deployment to uvicorn, as we use fastapi now

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: Unblock android runs on lava-collabora

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* pipeline: Enable preempt-rt cyclictest test

Enable the first preempt-rt test, cyclictest in new KernelCI. Enable it
on all platforms.

Since these are all smoke tests there is no point in running them too
long. Thus reduce the runtime per test to one minute. This should keep
the total preempt-rt runtime roughly in the same time frame.

The changes have been ported from Daniel's PR [1].

[1] https://github.com/kernelci/kernelci-core/pull/2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* pipeline: add all the test jobs for all rt-test

Add jobs definition of all the rt-tests. Enable cyclicdeadline and rtla
tests to run on all targets.

The changes have been ported from Daniel's PR [1].

[1] https://github.com/kernelci/kernelci-core/pull/2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add template and test properties for preempt_rt jobs

Add template, job and kcidb_test_suite properties for all preempt-rt jobs.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: rename preempt-rt to rt-tests which is correct name of tests

The legacy system used the preempt-rt name for these tests, but the
repository is named rt-tests. We must use the same name so that results
merge with execution results coming from other CIs in KCIDB.

Suggested-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add the correct nfsroot for rt-tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: Remove android's deprecated branches

It has been confirmed with Todd that we should remove the deprecated
branches. Hence remove those branches.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: run baseline on non-allmodconfig

allmodconfig generates a very large kernel image. It cannot be booted
on the arm64 and arm targets because tftp errors out when the image is
too large. Reduce the kernel image size by using the default defconfig.
The same defconfigs have been booting for other trees.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* doc: developer-documentation: Update documentation by adding more details

- Reorganize some things
- Specify how to write different variants by removing old syntax
- Give two separate templates for kbuild and test
- Try to put more details for new contributors

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes since v1:
- Fix typo
- Apply suggestions from code review

* doc/developer-documentation: fix a glitch in enabling new tree section

Fix a minor bug in YAML block formatting.

Fixes: f5f57de ("doc: developer-documentation: Update documentation by adding more details")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc/developer-documentation: update a section title

Rename a section from "Enabling a new Kernel tree" to
"Enabling new KernelCI trees, builds, and tests" as it explains
enabling tests as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: use the new `tree:branch` format for rules

For cases where we want a single branch to be allowed for a given tree,
we can now use the `tree:branch` format in rules. Convert existing rules
accordingly.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: pipeline: fix improper use of "filters" attribute

The `filters` param was used in the legacy system but has been replaced
by `rules`, with a different syntax.

For Android RISC-V builds, this was used to deny job execution on
kernels < 4.19, so let's translate this condition with the rules format,
and do a similar change for the `rt-tests`-based jobs.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config/pipeline.yaml: Fix x86 typo in kcidebug job names

The kcidebug jobs that run on MediaTek and Qualcomm platforms should
have arm64 in the name rather than x86. Fix the typo.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config: pipeline: remove params

The parameters are only needed when they are changed or appended.
Remove the parameters which aren't being modified.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* validate_yaml.py: Jobs are required to have template parameter

Add more validation to config files of mandatory parameters.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Add more job validations

Add basic validation: each job must have a kind parameter.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* workflows: Add label on CI check failures

Automatically add a label so broken PRs won't go to staging.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

---------

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
Signed-off-by: Laura Nao <laura.nao@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>
Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>
Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-authored-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Co-authored-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Co-authored-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Co-authored-by: Helen Koike <helen.koike@collabora.com>
Co-authored-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
Co-authored-by: Laura Nao <laura.nao@collabora.com>
Co-authored-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Co-authored-by: Shreeya Patel <shreeya.patel@collabora.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Milosz Wasilewski <milosz.wasilewski@foundries.io>
Co-authored-by: Paweł Wieczorek <pawiecz@collabora.com>
Co-authored-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>
Co-authored-by: Daniel Wagner <wagi@monom.org>
Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
nuclearcat added a commit to nuclearcat/kernelci-pipeline that referenced this pull request Jul 24, 2024
* src/scheduler: store error message when job fails with "submit_error"

It is helpful for debugging to capture the error message when the
scheduler fails to submit a job to the runtime.
Store the error message in the `data.error_msg` field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: Set minimum kernel version for DT kselftest to 6.7

The test was introduced upstream in version 6.7, so no point in trying
to run it on earlier versions.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* configs/: Update volteer device

Update volteer devices according to lab availability

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary templates: detailed output for active/inactive regressions

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: new presets for active regressions

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: update CHANGELOG

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* data: chmod -R 777 ./data/output to avoid permission error

Avoid errors like

PermissionError: [Errno 13] Permission denied: '/home/kernelci/data/output/stable-rc-boot.html'

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: move code to _get_logs

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: use ThreadPoolExecutor to fetch logs

Fetching logs is the bottleneck of the script. Fetch them in parallel
with ThreadPoolExecutor.

Signed-off-by: Helen Koike <helen.koike@collabora.com>
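
A minimal sketch of fetching logs in parallel with `ThreadPoolExecutor`.
The `fetch_log` function here is a stand-in that returns a dummy string
instead of performing a download, and the URLs are made up; neither
reflects the actual result_summary code.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_log(url):
    """Stand-in for a real download; returns a dummy string per URL."""
    return f"log contents for {url}"


def fetch_logs(urls, fetch=fetch_log, max_workers=8):
    """Fetch all logs concurrently and return {url: log text}."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as the input URLs.
        return dict(zip(urls, executor.map(fetch, urls)))


urls = ["https://example.org/log1.txt", "https://example.org/log2.txt"]
logs = fetch_logs(urls)
for url, text in logs.items():
    print(url, "->", text)
```

Threads suit this case because the work is I/O-bound: while one download
waits on the network, the others can proceed.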

* result_summary: fix result presets

stable-rc-build-failures and stable-rc-boot-failures weren't querying
specifically for test failures.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src/regression_tracker: rework regression detection

Take into account "active" and "inactive" regressions when creating them
and when processing new passed or failed nodes.

When a node passes, it checks if it "inactivates" an existing "active"
regression. When a node fails, it checks if it needs to create a new
regression or update an existing "active" one.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
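
The pass/fail handling described above could look roughly like the sketch
below. It rests on several assumptions: regressions are kept in a plain
dict keyed by a test identifier, each regression has an `active` flag and
a `node_sequence` list, and a new regression is opened only on a
pass-to-fail transition. The function name `track` is hypothetical.

```python
def track(regressions, test_key, node_id, result, previous_result):
    """Update the regression state for one new test result.

    Illustrative only: `regressions` maps a test key to a dict with an
    'active' flag and a 'node_sequence' list of failing node ids.
    """
    regression = regressions.get(test_key)
    is_active = regression is not None and regression["active"]
    if result == "pass":
        if is_active:
            # A passing run "inactivates" the active regression.
            regression["active"] = False
    elif result == "fail":
        if is_active:
            # Still failing: link the node to the existing regression.
            regression["node_sequence"].append(node_id)
        elif previous_result == "pass":
            # Pass-to-fail transition: open a new active regression.
            regressions[test_key] = {"active": True, "node_sequence": [node_id]}
    return regressions.get(test_key)


regressions = {}
track(regressions, "baseline-x86", "n1", "pass", None)
track(regressions, "baseline-x86", "n2", "fail", "pass")
track(regressions, "baseline-x86", "n3", "fail", "fail")
state_while_failing = dict(regressions["baseline-x86"], node_sequence=list(regressions["baseline-x86"]["node_sequence"]))
track(regressions, "baseline-x86", "n4", "pass", "fail")
print(state_while_failing)
print(regressions["baseline-x86"])
```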

* src/regression_tracker: link failed nodes to active regressions

When a failed node generates a regression, or when it's a re-run of a
run that generated a still active regression, link the node to the
regression id.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: support for date ranges for creation and update

New command line options to let the user specify date ranges for node
creation and last update: --created-from, --created-to,
--last-updated-from, --last-updated-to

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
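
A small `argparse` sketch of the four date-range options named above. The
option names come from the commit message; parsing the values with
`datetime.fromisoformat` and the help texts are assumptions, since the
accepted date format is not stated here.

```python
import argparse
from datetime import datetime


def build_parser():
    """Build a parser with the four date-range options (sketch only)."""
    parser = argparse.ArgumentParser(description="date range sketch")
    for name in ("created-from", "created-to",
                 "last-updated-from", "last-updated-to"):
        # Assumption: dates are given in ISO 8601 form.
        parser.add_argument(f"--{name}", type=datetime.fromisoformat,
                            default=None, help=f"ISO date for {name}")
    return parser


args = build_parser().parse_args(
    ["--created-from", "2024-04-01", "--created-to", "2024-04-15T12:00:00"])
print(args.created_from, args.created_to, args.last_updated_from)
```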

* result_summary templates: support for date ranges for creation and last update

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: support for extra query parameters in cmdline

New command line option: --query-params to specify a set of extra query
parameters to complete or override preset parameters.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: html markup in some preset titles

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: update and move to docs folder

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: move parameter loading and processing to 'setup'

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: refactor and split into two classes (single, run)

Split the ResultSummary class into a base class and two child classes:
ResultSummarySingle and ResultSummaryLoop (only a stub at this point).

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: WIP initial implementation of the "loop" command

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: huge refactoring

Implement "summary" (single-shot) and "monitor" (loop) modes based on
preset parameters instead of on the command-line main command.

Split the logic into multiple files, move all monitor-specific and
summary-specific code to independent files, common code in a separate
file.

Full of kludges, I don't like how this is looking so far, might consider
reimplementing it without any dependencies on pipeline code.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: fix markup and indentation

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: new generic templates for monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: examples for "monitor" and "summary" modes

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: summary and monitor modes

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: fix generic regression report

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: summary: fix last_updated option handling

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: embed css stylesheet in html files

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* regression_tracker: [trivial] make regression active by default

Fixup for commit fcb29501663d78920bcd129bd57c36b9af624bc4
If the "result" field is ever made non-optional in the models we can
probably remove this.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* regression_tracker: [trivial] set default empty node sequence

Fixup for commit fcb29501663d78920bcd129bd57c36b9af624bc4
If the "node_sequence" field is ever made non-optional in the models we
can probably remove this.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: add cmdline option --output-dir

Introduce a new command-line option: --output-dir, and rename the old
--output to --output-file.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: command-line options change

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: jobs-chromeos: remove meaningless Tast tests

Several Tast tests can only fail in the context of KernelCI:
* `video.PlatformDecoding.v4l2_state*_vp9_0_svc` do not actually exist,
  causing the whole test job to fail
* `platform.DLCService*` and `platform.Memd` rely on features only
  present in the downstream Chrom{e,ium}OS kernel (see b/247467814 and
  b/244479619 for those having access to Google's issue tracker)
* `kernel.ConfigVerify.chromeos` relies on downstream-only config
  options such as `CONFIG_SECURITY_CHROMIUMOS` and other similar ones,
  and therefore can only fail when testing upstream kernels

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: scheduler-chromeos: don't execute non-working Tast tests

Currently, HEVC-related tests are known to either fail or be skipped as
ChromeOS doesn't yet handle hardware decoding of HEVC media. This is
expected to be fixed at some point though, so we're keeping the job
definitions and only remove the corresponding scheduler entries in order
to reinstate those jobs when relevant.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: exclude Tast tests known to always fail

Several decoder tests always fail on all platforms where they're
executed, adding only noise to otherwise useful test results. Disable
those for improving the quality of the results.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: chromeos: add special case for pre-6.7 qcom codec tests

On Qualcomm-based ChromeBooks (`trogdor` being the only model in
Collabora's lab), we noticed systematic failures of all
`vp9_*_frm_resize` and `vp9_*_sub8x8_sf` tests when using a kernel up to
6.6. With 6.7 and above, all of those tests (except one) now pass. It
therefore makes sense to exclude those on pre-6.7 kernels so we don't
report known failures and get rid of some noise.

This involves "duplicating" affected test jobs (although I did my best
to minimize that) and setting rules so only the working variant is
executed, based on the version of the kernel being tested.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* lava_callback: Compress the log files to save storage space

As cloud storage space and egress have high costs,
it is better to compress potentially large files.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
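
For illustration, compressing a log before upload can be as simple as the
sketch below. Using `gzip` is an assumption; the commit message does not
say which compression format lava_callback uses, and the sample log text
is made up.

```python
import gzip


def compress_log(text):
    """Return the gzip-compressed bytes of a log given as text."""
    return gzip.compress(text.encode("utf-8"))


# Repetitive text, as is typical for boot and test logs.
log_text = "[    0.000000] Booting Linux on physical CPU 0x0\n" * 1000
compressed = compress_log(log_text)
print(len(log_text.encode("utf-8")), "bytes ->", len(compressed), "bytes")
```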

* tests: Add basic yaml validation

Add a yaml load to detect yaml issues earlier.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: drop stoneyridge/pineview naming in platforms anchors

The "stoneyridge" and "pineview" naming used in the Chromebook platform
anchors refers to ChromiumOS specific config fragments, but doesn't
necessarily match the actual platform of all the devices listed.
Use more generic names to distinguish amd and intel Chromebooks.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: rename test job anchors that use chromeos specific configs

Rename test job anchors that use chromeos specific kernel configurations
to include the 'chromeos' infix.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: add baseline tests

Enable the baseline tests on all the supported Chromebooks with their
default kernel configuration.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: drop stoneyridge/pineview naming in job defs

The "stoneyridge" and "pineview" naming used in some Chromebook job
definitions refers to ChromiumOS specific config fragments, but
doesn't necessarily match the actual platforms targeted by the jobs.
Replace all occurrences with more generic intel/amd naming.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: drop chromeos infix from baseline jobs

Keeping different job names for tests targeting different kernel configs
might cause too much duplication. Drop the 'chromeos' infix from the job
name for the tests using the chromeos config fragment. Users will be
able to filter the results using the data.defconfig/data.config_full
fields anyway.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary: post-process results for summary and monitor modes

Split the post-processing of nodes to a common function that can be used
for both summary and monitor modes. Currently, post-processing involves
only the collection of logs.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: update and fix presets and templates

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* doc/result-summary-CHANGELOG: update

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config/pipeline.yaml: enable 'BayLibre' lab

Add lab configuration for BayLibre.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add `lab-baylibre` runtime

Add runtime argument `lab-baylibre` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to BayLibre.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-x86-baylibre` job

Add job configuration `baseline-x86-baylibre` for BayLibre.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-armel-baylibre` job

Add job configuration `baseline-armel-baylibre` for BayLibre.
Add scheduler entry and platform config as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline: enable `android` tree and build configs

Monitor linux `android` tree. Add build configs for `android-mainline`
branch.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/pipeline.yaml: add kbuild definitions for android-mainline

Add kbuild jobs to compile the kernel for android-mainline branch

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/pipeline.yaml: add entries to schedule to build android-mainline

Add entries to `scheduler:` section to run the builds for
android-mainline.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: fix node filter in monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* kernelci.toml: set `checkout` node timeout to `180 min`

The currently set `60 min` timeout is not enough, as some
`kbuild` jobs and their sub-tests take around 2 hrs to
complete after being submitted to the runtime.

Here is an example from staging. See the information
for a `checkout` and its child nodes:

| id                       | name                | created                    | updated                    | timeout                    |
|--------------------------|---------------------|----------------------------|----------------------------|----------------------------|
| 661c9d59b60b785eb9fc42b0 | checkout            | 2024-04-15T03:22:01.317000 | 2024-04-15T03:51:03.870000 | 2024-04-15T04:22:01.284000 |
| 661c9d97b60b785eb9fc42b4 | kbuild-gcc-10-arm64 | 2024-04-15T03:23:03.399000 | 2024-04-15T03:50:15.031000 | 2024-04-15T09:23:03.399000 |
| 661ca3f7b60b785eb9fc4ead | baseline-arm64      | 2024-04-15T03:50:15.304000 | 2024-04-15T05:09:45.247000 | 2024-04-15T09:50:15.304000 |

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
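
The arithmetic behind the table above can be checked with a few lines of
Python. The timestamps are copied from the table; the 60 and 180 minute
values are the old and new timeouts from the commit message.

```python
from datetime import datetime, timedelta

# Timestamps taken from the table above.
checkout_created = datetime.fromisoformat("2024-04-15T03:22:01.317000")
baseline_updated = datetime.fromisoformat("2024-04-15T05:09:45.247000")

# Time from checkout creation until the last child node finished updating.
elapsed = baseline_updated - checkout_created
print("elapsed:", elapsed)
print("fits in 60 min:", elapsed <= timedelta(minutes=60))
print("fits in 180 min:", elapsed <= timedelta(minutes=180))
```

The child `baseline-arm64` node was last updated well after the checkout's
60 minute timeout had expired, but comfortably within 180 minutes.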

* result_summary: add email report capabilities for monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: plain text single report templates

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: chromeos: add baseline-nfs tests

Enable the baseline-nfs tests on all the supported Chromebooks, with
both the default and the chromeos kernel configurations.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src/timeout: set `checkout` result

For `TIMEOUT` mode, set the `checkout` node result to `fail`
if its state is `running`, as that means the code checkout was
still in progress when the node timed out. Set it to `pass` for
any other state.
For `DONE` mode, set the `checkout` node result to `pass`, as
the node has already been in the `available` or `closing` state
and therefore completed the source code checkout.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* regression_tracker: bugfix, failed test with no prior runs

Handle the case of a failed test run when it's the first occurrence of
that test case. Consider it "not a regression" for now, since we're
defining a regression as a "breaking point" between a success and a
failure.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: platforms-chromeos: fix dalboz device type

Due to a copy/paste mishap, the device type for
`asus-CM1400CXA-dalboz` had a trailing `_chromeos`, leading LAVA to fail
finding the correct device type, and no job from the new system running
on this platform.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: run Tast tests only on 5.4+

Current ChromeOS images have `ext4` filesystems using options not
present in 4.19. Therefore tests cannot run on kernels that old, and
this leads to false positives in corrupt device identification, so we
should only run those tests on 5.4 and later kernels.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: platforms-chromeos: drop non-existent platform

`hp-x360-12b-ca0500na-n4000-octopus` isn't a device type available in
Collabora's LAVA lab, so let's drop its definition.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: exclude android tree from kbuild jobs

Only Android-specific kbuild jobs should run for this tree, let's not
overload our system with unneeded builds.

Take this opportunity to limit mediatek kbuilds to 6.1+ as that's the
earliest version that has upstream support for at least one of our
devices.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src/timeout: a bug fix in `_submit_lapsed_nodes`

Fix a glitch in the code related to setting `checkout`
node result.

Fixes: 361fc0d ("src/timeout: set `checkout` result")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* pipeline.yaml: Update early access FQDN

We are moving k8s from eastus to westus3 as it is cheaper

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/tarball: fix `_kdir` in `update_repo`

Fix the below error:
```
kernelci-pipeline-tarball |   File "/home/kernelci/./pipeline/tarball.py", line 79, in _update_repo
kernelci-pipeline-tarball |     kernelci.shell_cmd(f"rm -rf {self._kdir}")
kernelci-pipeline-tarball |                                  ^^^^^^^^^^
kernelci-pipeline-tarball | AttributeError: 'Tarball' object has no attribute '_kdir'
```

Fixes: 0a2fe9c ("src/patchset.py: Implement Patchset service")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: fix method to get child nodes recursively

`TimeoutService._get_child_nodes_recursive` is used to get
pending child nodes recursively for closing and timed-out
nodes. It overwrites the result while being called recursively.
Fix the method to make it work properly.
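
The overwrite described above is a common pitfall with recursive collectors. A minimal sketch of an accumulating version follows; it is not the actual `TimeoutService` code, and `find_children` is a hypothetical callable standing in for the real API lookup:

```python
def get_child_nodes_recursive(node_id, find_children, collected=None):
    """Collect all descendants of node_id into a single list.

    find_children is a hypothetical callable returning the direct
    children (dicts with an 'id' key) of a given node ID.
    """
    if collected is None:
        collected = []
    for child in find_children(node_id):
        collected.append(child)
        # Pass the same list down so deeper results extend it
        # instead of replacing what was gathered so far.
        get_child_nodes_recursive(child['id'], find_children, collected)
    return collected
```

Sharing one accumulator across the recursive calls is what keeps results from sibling branches.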

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: rename "armel" arch to "arm"

`armel` has various meanings depending on the system: for ChromeOS, it
is ARMv7, while in Debian it's ARMv{5T,6}. Moreover, this project is
*Kernel*CI and the kernel uses `arm` for all 32-bit ARM devices. In
order to avoid confusion (including those wondering what the heck does
`armel` mean), let's rename `armel` to `arm`.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: use per-system arch property where relevant

With the new `*arch` fields present in the platform configurations, we
don't have to hardcode the architecture strings in some specific cases.
Let's adapt the config files so we use `{cros,deb,k}arch` wherever it
makes sense.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src/timeout: set timed-out `checkout` result

Set the result of a timed-out `checkout` node to `incomplete`
when it is still in the `running` state, since this means the node
timed out while the checkout was still in progress.
Also set the error-related information, i.e. `error_code`
and `error_msg`.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/tarball: update checkout node when update repo fails

The tarball service updates the source code repo and creates a tarball.
If the repo update fails even on the second attempt,
it means the source code could not be checked out.
Hence, update the `checkout` node with state `done` and
result `fail`. Also, set appropriate error information
in the `data` field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: enable collabora-next tree and build config

Monitor the collabora-next tree. Add build config for the for-kernelci
branch.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: enable acpi kselftest on collabora-next tree

Run the ACPI kselftest on the for-kernelci branch of the collabora-next
tree.

See: https://lore.kernel.org/linux-kselftest/20240308144933.337107-1-laura.nao@collabora.com/T/#t

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary: restore missing split_query_params function

Restore this function that was accidentally removed during the last
refactoring.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* lava_callback: Don't upload empty files to Azure

There is no use for a lot of empty files on Azure;
they only complicate cleanup.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: unify preset and output names

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: update preset for aferraris

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: new presets for laura.nao

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: fixes and new presets for nfraprado

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: fix arch query parameters

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* k8s: Lot of deployment tested fixes

Fixes in yaml files for k8s production deployment.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result-summary presets: Fix build failure and regression monitors

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* result_summary: added debug traces to the monitor

Show detailed info of the node filterings in real time.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: fix corner case bug when no logs are found

Cover rare case where neither the node nor any of its parents up to the
checkout node have any log artifacts.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: refine stable-rc presets

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: add regression info to test reports

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: escape log snippets

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src: lava_callback: add device ID to node data

It can be useful to know the exact device on which a job ran, without
having to open the LAVA job page. This is done by querying the device ID
from the callback data and appending it to the node data.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: lava_callback: upload raw callback data as well

Debugging callback issues is complex due to the raw data not being saved
after processing. This change ensures we save the callback data as a
JSON file in order to ease development.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* DONOTMERGE lava_callback: add debug statements

Why the heck doesn't this just work???

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* result_summary_templates: fix error 'node' is undefined

The object is named test and not node, so s/node/test

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/runtime/kunit: set architecture info

Set architecture field for `kunit` test
nodes.
If no `arch` argument is supplied, kunit takes
`um` (User Mode Linux) as architecture to run
tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: count running child jobs of build nodes

Add a method to count running jobs of `kbuild`
nodes, i.e. jobs submitted after successful
builds. For example, `baseline` or `tast` jobs.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: handle closing `checkout` node differently

Usually, `checkout` should transition to the `done` state
when all its child nodes are completed.
When a `checkout` node is closing, take into account the
running child jobs of build nodes before moving it
to `done`. Otherwise, `checkout` will be
set to `done` even if some child jobs are still
running.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: handle holdoff reached `checkout` node differently

Usually, an available `checkout` node for which the holdoff is
reached should transition to the `done` state only when
all its child nodes are completed.
For such a `checkout` node, take into account the
running child jobs of build nodes before moving it
to `done`. Otherwise, `checkout` will be
set to `done` even if some child jobs are
still running.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* Revert "DONOTMERGE lava_callback: add debug statements"

This reverts commit 5ed8218d99840373bbba5830b1976813b52bf4b1.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* Create dependabot.yml

* result_summary_templates: make generic-test-failures generic to all
results

The generic-test-failures templates can be used to show general results
by just replacing the name "failures" with "results". This makes them easier
to reuse by communities that want presets listing all results of
the tests, so:

	s/generic-test-failures/generic-test-results

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result-summary.yaml: add preset to list android build tests

Since we now build android, add a preset to allow result-summary.yaml to
list all build results from Android tree.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* tarball: Implement checkout for specific commit

We often need a specific commit rather than ToT (tip of tree), so implement this.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* jobs-chromeos.yaml: Disable module compression for every kernel version

Commit d4bbe942098b ("kbuild: remove CONFIG_MODULE_COMPRESS"),
introduced in kernel v5.13, replaced CONFIG_MODULE_COMPRESS=n with
CONFIG_MODULE_COMPRESS_NONE=y as the way to disable module compression.
Since module compression causes "Invalid ELF header magic: != ELF"
errors during boot on the ChromeOS base config, add the missing config
to disable module compression on kernels > v5.13 as well.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* src: lava_callback: reduce callback data size

The callback data is quite large, especially as it includes the full log
which we already upload separately. By dropping it and compressing the
whole file with `gzip` we can avoid wasting too much storage space.
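
As a rough illustration of the approach, the sketch below drops the log and gzips the rest. The `log` key name is an assumption about the callback layout, and `pack_callback_data` is a hypothetical helper, not the function used in `lava_callback`:

```python
import gzip
import json


def pack_callback_data(data):
    """Return gzip-compressed JSON of the callback data without the log.

    The 'log' key name is an assumption about the callback layout.
    """
    # The full log is uploaded separately, so leave it out here.
    trimmed = {key: value for key, value in data.items() if key != 'log'}
    return gzip.compress(json.dumps(trimmed).encode('utf-8'))
```

The input dictionary is left untouched, so later processing can still read the log from it.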

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: lava_callback: don't leak secret token

The callback data contains the secret token's value, which shouldn't be
leaked. Ensure we drop it from the uploaded data.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: platforms-chromeos: use new cros-flash image

This ensures we use the new version of the `install-modules` script.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: regression_tracker: add the "device" field to regression data

This can be helpful. We're not using it as a search param though, as we
don't want to narrow down the search that much, using the platform only
is better.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: result_summary_templates: report device used for job

This information is now available, and it can be useful to know the
affected device without having to look at the LAVA job details.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* kubernetes: Update deployment recipe

Update list of labs and add KCI_INSTANCE variable.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava-callback: Limit threads of lava-callback

Due to the inrush of LAVA callbacks and slow Azure Files
processing, we need to make sure we don't spawn too many
threads.
Also add a hard memory limit of 1 GB.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: add presetes for fluster test

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Make template generic for all v4l2 tests
- Rebase on main

* result_summary presets: make the name of fluster test generic

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: enable first fluster test for mt8195-cherry-tomato-r2

Enable first fluster test, AV1-TEST-VECTORS for mt8195-cherry-tomato-r2.
Run the test on mainline and next until more trees are added.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Create generic v4l2-decoder-conformance-job and use anchors from it
- Update the rootfs address
- Move anchor to _anchor
- Update with nitpicks

* config: jobs-chromeos: Add kernelci tree for testing purpose

Remove this commit before merging.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: chromeos: Enable cpufreq kselftest

Enable cpufreq kselftest on all the trees and branches.

Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>

* result_summary presets: fix preset for kselftest-dt failures monitor

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: new presets for kselftest-cpufreq

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: mt8195-cherry-tomato-r2: enable all fluster tests for all branches

Add all the trees and branches on which the tests will be run. Enable
all the tests for tomato.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- The build config cannot be added yet. Just list the trees, it will only use
  the branches configured in build_configs:
  - mainline will use master
  - next will use master
  - collabora-chromeos-kernel will use for-kernelci
  - media will use master and fixes
- Remove kernelci tree as it was added just for testing purpose

* config: mt8183-kukui-jacuzzi-juniper-sku16: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>


* config: mt8186-corsola-steelix-sku131072: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: mt8192-asurada-spherion-r0: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Don't specify the platforms manually as they are already mentioned in
  test-job-arm64-mediatek

* config: sc7180-trogdor-kingoftown/lazor-limozeen: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Use test-job-arm64-qualcomm instead and create separate jobs for
  qualcomm devices
- Don't specify platforms manually as they are already mentioned in
  test-job-arm64-qualcomm

* build(deps): bump uwsgi from 2.0.21 to 2.0.22 in /docker/lava-callback

Bumps [uwsgi](https://uwsgi-docs.readthedocs.io/en/latest/) from 2.0.21 to 2.0.22.

---
updated-dependencies:
- dependency-name: uwsgi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* pipeline.yaml: Add stable-rc build variants

Add more build variants for stable-rc tree to match legacy system.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary: add error classification

Classify errors according to patterns in the logs

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary presets: add collabora-chromeos-kernel and media trees for fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: Use media-stage instead of media-tree

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config/pipeline: enable android branches from legacy

Enable all android branches from the legacy system

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* trigger: Add exclude/include tree list for trigger

As we need to restrict the list of kernels running on staging,
we need to add an option allowing that.
It would also be good to exclude staging kernels from the production
kernel list.

So on staging we need to run kernels only from the tree "kernelci"
and sometimes something else, for example "mediatek".
The option will look like:

--trees kernelci,mediatek
or
--trees kernelci

On production we need to exclude trees kernelci and buggytree:
--trees !kernelci,buggytree
or just kernelci:
--trees !kernelci

The purpose of this option is that our compile capacity is limited,
and right now staging and production are both compiling a very large set
of kernels; we need to reduce this amount to cut costs.
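
The include/exclude semantics described above could be parsed roughly as follows. This is a sketch only: the function names are hypothetical, and treating an empty value as "no filtering" is an assumption rather than something stated in this commit:

```python
def parse_trees_option(value):
    """Split a --trees value into (exclude, names).

    A leading '!' turns the whole list into an exclusion list.
    """
    exclude = value.startswith('!')
    names = [name for name in value.lstrip('!').split(',') if name]
    return exclude, names


def tree_is_selected(tree, value):
    """Return True if the given tree should be processed."""
    if not value:
        return True  # assumption: no option means no filtering
    exclude, names = parse_trees_option(value)
    return (tree not in names) if exclude else (tree in names)
```

With `--trees !kernelci,buggytree`, both `kernelci` and `buggytree` are skipped and every other tree is kept.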

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: platforms-chromeos: use CrOS R124 files

ChromeBooks were upgraded with a new image based on ChromiumOS R124, so
we must use those files now.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: drop non-existent Tast tests

Those were removed between R120 and R124 and therefore cause test
failures with the new images.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* result_summary presets: fix acpi kselftest presets

We're interested in catching regressions and failures in both the
kselftest-acpi test suites and its test cases. Match the nodes by group
in the presets accordingly.
Fix template used by the failure monitor preset.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src: update return values of `APIHelper.receive_event_node`

`APIHelper.receive_event_node` method is used to receive
node data from PubSub event. The method has been updated
to return `is_hierarchy` flag as well which represents
events related to node hierarchy.
Update pipeline services using the method accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: refine presets for v4l2-decoder-conformance

Modify the regression preset to monitor regressions on both the
v4l2-decoder-conformance test suites and its test cases, by matching the
nodes by group instead of by name.
Also, change the failure preset to monitor for all errors caused by
runtime errors.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary presets: add summary presets for v4l2-decoder-conformance

Add summary presets to fetch regressions and failures on
v4l2-decoder-conformance tests. Two of the presets are the same used by
the monitor; add one additional preset to fetch all the failures on
both the test suites and their test cases.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* lava_callback.py: Remove error_code/error_msg on lava-callback

Sometimes, due to congestion, a node might be set to timeout, but
then the result might arrive late and we need to use it properly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: fix dt kselftest presets

Fix the dt kselftest preset, just like was done for the acpi one, as the
current preset doesn't match the actual results we're interested in.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* doc/connecting-lab: refine documentation

Refine documentation for connecting LAVA labs
and submitting jobs to the lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* lava_callback: Sometimes we get totally invalid log file uploaded

Most likely the problem lies in Flask's threading, and callbacks
are possibly getting mixed up. This commit attempts to introduce
several countermeasures against that.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* doc: add `_index.md` page

Add index documentation page.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc: add `pipeline-details` page

Move `pipeline-details` documentation from the API
repository to this repo to make it close to the source.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc/connecting-lab: adjust `weight` property

Change the `weight` property of the existing doc page to
accommodate the transition of pipeline-related docs
to the pipeline repo.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc: add `developer-documentation` page

Add developer manual documentation.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add lab config for Qualcomm

Add an entry to `runtimes` section for Qualcomm
lab configurations.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-x86` job for qualcomm

Add job configuration `baseline-x86-qualcomm` for
running baseline job in Qualcomm LAVA lab.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add lab-qualcomm runtime

Add runtime argument `lab-qualcomm` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to Qualcomm LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-arm64` job for qualcomm

Add job configuration `baseline-arm64-qualcomm` for
running baseline job for `arm64` in Qualcomm LAVA lab.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* pipeline.yaml: Update RISC-V configs

1) The rv32 defconfig doesn't exist, so remove it
2) nommu_k210_defconfig has modules disabled

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava_callback.py: Sanitize lava log data

As we use this data in reports, let's remove all
non-printable characters as they confuse Grafana, browsers and others.
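
One possible way to do this in Python is sketched below. Keeping newlines and tabs is an assumption made here so the log stays readable, and `sanitize_log` is a hypothetical name rather than the function in `lava_callback.py`:

```python
def sanitize_log(text):
    """Drop non-printable characters, keeping newlines and tabs."""
    return ''.join(
        ch for ch in text
        if ch.isprintable() or ch in '\n\t'
    )
```

`str.isprintable()` rejects control characters such as NUL and ESC while accepting printable non-ASCII text.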

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config/runtime/kunit.jinja2: fix result map

Fix result map for skipped tests. Initially, the API
schema didn't have `skip` as an available node result.
That's why it was mapped to the `None` result. But now the API
has a `skip` result to denote skipped tests.
Fix the result mapping accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: jobs-chromeos: Add lab-setup fragment

Add the lab-setup fragment to the chromebook builds, which contains the
architecture independent kernel configs needed to run tests on the
platform. Notably this disables IP autoconfig by the kernel.

The result of this change is that the 12-second boot delay and the
consequent deferred probe pending warnings will no longer happen on any
platform, in particular on mt8186-corsola-steelix-sku131072, on which they
were still happening due to a different network adapter being used.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* lava_callback: bump up slightly threads number

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: enable watchdog reset test on Chromebooks

Add a basic test to verify watchdog reset functionality. Enable the
test on all ARM64 and AMD x86_64 Chromebooks. For Intel
Chromebooks, enable the test only on octopus, as ACPI PM Timer on the
other devices has been disabled in coreboot.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src/send_kcidb: use schema version 4.3

Test status `MISS` was added to KCIDB in schema
v4.2 and supported by the latest version i.e. v4.3.
Hence, use the latest version for submission as
API may send a few tests with "MISS" status.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* send_kcidb: re-structure code for parsing checkout node

Move code for parsing checkout node to a separate
method.
Add `valid` field to parsed checkout node. It denotes
if source code was successfully checked out.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: print more information on invalid data

Print details for invalid revision data for the
sake of debugging.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: optimize `kcidb` import

Remove redundant `kcidb` import and adjust
kcidb Client call accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: remove keys with `None` values

KCIDB doesn't allow `None` as field value.
Remove all optional fields with `None` value
to make it valid data for submitting to KCIDB.
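
A minimal sketch of that filtering step, using a hypothetical helper name and illustrative field names:

```python
def drop_none_values(obj):
    """Return a copy of a dict without keys whose value is None."""
    return {key: value for key, value in obj.items() if value is not None}


# Illustrative object only; the field names are not taken from this commit.
build = {'id': 'maestro:123', 'log_url': None, 'valid': False}
cleaned = drop_none_values(build)
```

Only `None` is removed; falsy values such as `False` or `0` are kept because they carry information.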

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: add `kcidb_test_suite` property

Every KernelCI test will be mapped to a unified
test suite for KCIDB data submission.
Add `kcidb_test_suite` property to test job
definitions in YAML configuration files.
The added property will store the mapped
KCIDB test suite name.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: parse and submit node test and build data

Listen to all the node events with node state
`done` or `available` and submit the node to KCIDB.
Parse node received from the event and create KCIDB
schema compatible object based on type of the node
i.e. checkout, build or test.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: set `log_excerpt` for builds and tests

Fetch logs from the compressed log file (*.log.gz) URL
and send the last 16*1024 characters as the `log_excerpt`
field for build and test nodes, as that is the maximum allowed
length of the KCIDB field.
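
The truncation itself can be sketched as below. Downloading the file is left out, and `log_excerpt` here is a hypothetical helper operating on bytes that have already been fetched:

```python
import gzip

MAX_EXCERPT = 16 * 1024  # limit stated in this commit message


def log_excerpt(compressed_log):
    """Return the last MAX_EXCERPT characters of a gzip-compressed log."""
    text = gzip.decompress(compressed_log).decode('utf-8', errors='replace')
    # A negative slice keeps the tail; shorter logs are returned whole.
    return text[-MAX_EXCERPT:]
```

Taking the tail rather than the head keeps the part of the log closest to the failure.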

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/jobs-chromes: add kcidb test suite property for watchdog test

Add KCIDB test suite mapping for `watchdog_reset` test.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* lava_callback.py: disable log removal from callback data

We need it for investigations if we have any critical data
loss during log sanitizing.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/send_kcidb: add error info to build nodes

Add error metadata fields such as `error_code` and
`error_msg` to `misc` field for build nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: add watchdog-reset presets for mainline/next

Add monitor and summary presets to track the results from the watchdog
reset test on the mainline and next trees.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* pipeline.yaml: Fix fluster rootfs URL

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/send_kcidb: get error metadata for failed/incomplete tests

Tweak condition to get error metadata for test nodes.
It should get error info for incomplete nodes as well
and not just failed nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: send tests only if KCIDB test mapping exists

All test suite definitions must have the `kcidb_test_suite`
property, i.e. the KCIDB test suite mapping.
Only send tests for which the mapping is found.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* tests/validate_yaml: add validation for KCIDB mapping

To submit KernelCI generated data to KCIDB, it is required
to have a mapping for all the job definitions with
`kcidb_test_suite` property.
Add validation to ensure all the jobs have a mapping
present to avoid missing data submission.
This check is to notify test authors trying to enable tests
in maestro to include the required property for the mapping
in their definition.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add qcs6490-rb3gen2 boot test

Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>

* config: chromeos: Enable kselftest-dt on Qualcomm platforms

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* pipeline.yaml: Add one um build for android trees

As requested by the Android team, it would be good to check UM builds
for breakages as well.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: use `kind=job` for test suites

As part of restructuring the test hierarchy, a `Job` model
has been introduced for test suite/job nodes.
It uses node kind `job`.
Update test configurations in `pipeline.yaml` and
`jobs-chromeos.yaml` to use `kind=job` to
generate job nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: provide `kind` value for child tests

In case of submitting test hierarchy, child nodes by default
inherit `kind` value from parent node.
As we are restructuring the test hierarchy, test suite/job nodes
will have `kind=job` where its child test nodes will have
`kind=test`. Provide `kind` field explicitly to test result
hierarchy to preserve different kind value than the parent
node.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: fix `NameError`

Fix the below error in `_submit` method:
```
Traceback (most recent call last):
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 287, in main
    job.submit(results)
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 138, in submit
    self._submit(result)
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 265, in _submit
    return node
NameError: name 'node' is not defined
```

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: evaluate job node result

Evaluate job node result from child node results if
a `null` result is received from the test result parser.
For example nodes such as `fortify`:
https://staging.kernelci.org:9000/viewer?node_id=6670ab43d0b7694b399897c4

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: fix parsing of KUnit log file

Handle both compressed(gzip) and plain text log files
for getting log excerpt.
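
One way to handle both formats is to check for the gzip magic bytes before decoding. This is only a sketch: the commit does not say how the format is detected, and `read_log` is a hypothetical helper:

```python
import gzip

GZIP_MAGIC = b'\x1f\x8b'  # first two bytes of every gzip stream


def read_log(raw):
    """Decode log bytes, decompressing first when they look gzipped."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return raw.decode('utf-8', errors='replace')
```

Checking the content rather than the file extension also covers logs served under a URL without a `.gz` suffix.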

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: HTTP exception handling for log excerpt

Add HTTP exception handling for getting
log excerpt data.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: platforms-chromeos: Add serial delay for some Mediatek platforms

Add test_character_delay to the Spherion, Tomato and Steelix platforms
to work around the fact that they're sometimes unable to process serial
input fast enough, resulting in mangled commands and consequently flaky
test results, as described in
https://github.com/kernelci/kernelci-project/issues/366.

The right place to do this change would be in the device-type template
as described in LAVA's documentation [1]. This overriding in KernelCI is
meant only as a temporary workaround to verify whether this fixes the
issue. If it does, then we'll do it in LAVA upstream instead.

[1] https://docs.lavasoftware.org/lava/debugging.html#differences-in-input-speeds
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config: chromeos: Enable error-logs kselftest for MediaTek Chromebooks

Run the error-logs kselftest on MediaTek Chromebooks. This test is
currently under review upstream [1] so, in the meantime, it has been
added to the collabora-next tree so it can prove its value by helping to
detect issues upstream.

[1] https://lore.kernel.org/all/20240423-dev-err-log-selftest-v1-0-690c1741d68b@collabora.com

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config/pipeline.yaml: enable CIP lab

Add configuration for LAVA CIP lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add baseline-x86 test for CIP

Add `baseline-x86-cip` test to be submitted to CIP
LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add `lab-cip` runtime

Add runtime argument `lab-cip` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to CIP LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: enable `job` node submission to KCIDB

Parse newly added job node and its child tests
for KCIDB submission.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: don't submit `setup` test suite nodes

`setup` test suite has been introduced to store test results
for environment setup checks before running actual test suite.
KCIDB doesn't require `setup` test suite result as long as
main test job result is submitted.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: add a check before sending data

Check if parsed data is available before
sending revision data to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: fix logs

Fix the log statement about submitting a node to KCIDB,
as not every node we receive an event for
is sent to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: handle skipped tests

Do not retrieve artifacts or metadata from parent
node for skipped tests, as in practice only the kernel
revision, test runtime and platform will be
available for skipped tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary/utils: ignore failures on log retrieval

Make the script continue running if there was an error fetching a test
log.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* doc/developer-documentation: add docs for enabling new tests

Add developer documentation for enabling new tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* Fix links after docs page migration

Documentation has been migrated to the "docs.*" subdomain.

Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>

* pipeline.yaml: Add kcidebug fragment

Add useful low-overhead debug option to kernel,
and test on most x86 boards we have available,
with minimal baseline tests.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* configs: update gcc-10 to gcc-12

As we upgrade compiler images, we need to update the gcc version

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* regression_tracker: workaround: match node paths programatically

Don't use 'path' as an api search parameter. The use of lists as query
parameters (path is a list) is undefined. Instead, do the filtering in
code.
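
Filtering on the client side can be as simple as the sketch below, which assumes each node is a dict carrying its `path` as a list; `filter_by_path` is a hypothetical name, not the tracker's own function:

```python
def filter_by_path(nodes, path):
    """Keep only the nodes whose 'path' list equals the wanted one."""
    return [node for node in nodes if node.get('path') == path]
```

Nodes without a `path` key are skipped rather than raising an error.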

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: remove qemu jobs from lab-qualcomm

QEMU jobs use a container pulled from hub.docker.com. After the lab move,
pulling from this registry is no longer possible at Qualcomm. This patch
removes QEMU jobs from the Qualcomm lab.

Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>

* validate_yaml.py: Improve pipeline validation

Add validation that scheduler entries have a matching job entry
(this is a critical validation) and that job entries have at least
one entry in the scheduler.
Fix one entry detected by this validation.
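
The two checks can be sketched as follows. This is not the code from
validate_yaml.py; it assumes the parsed config has a `jobs` mapping and
a `scheduler` list whose entries carry a `job` key, and the function
name and sample config are made up for illustration.

```python
def check_scheduler_jobs(config):
    """Return (errors, warnings) for a parsed pipeline config.

    Assumption: 'jobs' maps job name -> definition and 'scheduler'
    is a list of dicts with a 'job' key.
    """
    jobs = set(config.get('jobs', {}))
    scheduled = {
        entry['job'] for entry in config.get('scheduler', [])
        if 'job' in entry
    }
    # Critical: a scheduler entry pointing at a job that is not defined
    errors = [
        f"scheduler entry refers to undefined job: {name}"
        for name in sorted(scheduled - jobs)
    ]
    # Less severe: a job that is defined but never scheduled
    warnings = [
        f"job has no scheduler entry: {name}"
        for name in sorted(jobs - scheduled)
    ]
    return errors, warnings


# Illustrative config only
config = {
    'jobs': {'baseline-x86': {}, 'kunit': {}},
    'scheduler': [{'job': 'baseline-x86'}, {'job': 'baseline-arm64'}],
}
errors, warnings = check_scheduler_jobs(config)
```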

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* pipeline.yaml: Add broonie(Mark Brown) trees to pipeline

It is time to enable even more trees.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Add additional verification for duplicate keys

We might have redefined the same keys in different YAML files;
this tool will ensure the consistency of these entries.
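
A rough sketch of such a duplicate-key check, working on already parsed
YAML content. The function name, the `section` argument and the file
contents below are assumptions for illustration, not the actual
validate_yaml.py code.

```python
def find_duplicate_keys(files, section):
    """Return {key: [filenames]} for keys defined in more than one file.

    'files' maps a file name to its already-parsed YAML content (a dict).
    """
    seen = {}
    duplicates = {}
    for filename, data in files.items():
        for key in data.get(section, {}):
            if key in seen:
                # Record the first file once, then every later redefinition
                duplicates.setdefault(key, [seen[key]]).append(filename)
            else:
                seen[key] = filename
    return duplicates


# Illustrative content only
files = {
    'pipeline.yaml': {'jobs': {'baseline-x86': {}, 'kunit': {}}},
    'jobs-chromeos.yaml': {'jobs': {'baseline-x86': {}}},
}
duplicates = find_duplicate_keys(files, 'jobs')
```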

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Remove path separator

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Rename variable to schedules

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config/kernelci.toml: update KCIDB origin name

As we agreed to refer to the new KernelCI API & Pipeline as
"maestro", use the new name while submitting data
to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: update KCI result mapping with KCIDB status

Update evaluation of KCIDB status from KCI result.

Create 2 categories for error codes:
1. When pre-check tests completed but the actual test suite
couldn't run - this will have `MISS` status
2. When pre-check tests completed and the actual test suite could
run but somehow couldn't complete - this will have `ERROR` status

Some LAVA error codes, such as `Cancelled` and `Test`, can occur
at any point of execution.
Assign such error codes to the most relevant category
based on analysis of available results.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: fix presets for v4l2-decoder-conformance

Following recent updates to data representation on KernelCI nodes,
the top-level nodes for tests now have their kind set to 'job' instead
of  'test'. Update the presets for v4l2-decoder-conformance tests
accordingly.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary presets: fix output file name in kselftest-acpi preset

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: enable dmabuf-heaps, exec and iommu kselftest suites

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Add kcidb_test_suite

* config: result-summary: add generic rule to monitor failures and regression

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: Add rt-stable builds

Copy rt-stable builds from legacy KernelCI.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Major changes to move to new way of writing kbuild jobs

* config: pipeline: Add v6.6-rt branch for builds

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: result-summary: add rt-stable kbuilds presets

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: chromeos: Add 'nfs' suffix to KCIDB suite name for baseline-nfs

The baseline test is currently run with both ramdisk and nfs rootfs. To
distinguish baseline-nfs tests in KCIDB, add an 'nfs' suffix to the KCIDB
test suite name.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* aks: Add kubernetes kcidb deployment

We need a file that will manage the deployment of the kcidb bridge
in the Kubernetes production deployment.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* kubernetes: Adjust trigger k8s options

Ignore the kernelci tree on production, as it is a special
"staging"-only tree, and read the whole /config directory, not just the
default pipeline.yaml.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* regression_tracker: bugfix: catch empty search condition

Fix _get_last_matching_node(), after the previous change there was an
unhandled scenario where nodes may be empty but the function wouldn't
return None immediately.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: pipeline: correct the kind of kselftest suites to job

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* scheduler-chromeos.yaml: Temporarily disable non-essential tast tests

As per discussion, temporarily disable tast tests which are unlikely
to be reviewed.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* k8s/aks: Update deployment files

1) Update memory limit, as working with Linux sources might require 3 GB of RAM.
2) Update config file path
3) Add callback environment variable
4) Update image reference to a fresh one

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: enable android builds with gcc-12 for all architectures

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: enable android builds with clang-17 for all architectures

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: remove build_variants from android build_configs

build_variants is the legacy way to specify the different variants. We
have moved to the newer way of specifying variants. Hence remove
build_variants from the android build_configs.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add android15-6.6-lts branch for build as well

The android15-6.6-lts has been included recently in legacy KernelCI:
https://github.com/kernelci/kernelci-core/pull/2597

Add the same in newer KernelCI.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add blocklist for riscv older kernels for android builds

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: update KCIDB test suite mapping for baseline

Use `boot` as KCIDB test suite mapping for all
baseline tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* callback_url: Update config and README

As we are moving the callback URL to an environment variable,
update the config and README accordingly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: enable android baseline (boot) testing for arm and arm64 in only allmodconfig

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* scheduler.py: If event has jobfilter, inject it into the node data

When someone generates an artificial event with a jobfilter, it is
likely a maintainer trying to repeat a job. Treat this accordingly,
and inject the job filter into the job node, so we run only the tests
the maintainer wants.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava_callback: migrate to fastapi

It will be easier to maintain the API and the Pipeline, as
both will be powered by the FastAPI framework.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: Update fluster rootfs URL

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: pipeline: fix defconfigs in fragments

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* kbuild.jinja2: support defconfig as list or str

As required in https://github.com/kernelci/kernelci-core/pull/2608
defconfig might be one of two types. Support both in jinja2 accordingly.
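
The actual change lives in the Jinja2 template; the same idea expressed
in plain Python might look like this sketch, where the helper name
`normalize_defconfig` is hypothetical.

```python
def normalize_defconfig(defconfig):
    """Return defconfig as a list, whether it was given as str or list."""
    if isinstance(defconfig, str):
        return [defconfig]
    return list(defconfig)


single = normalize_defconfig('defconfig')
several = normalize_defconfig(['defconfig', 'extra.config'])
```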

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: add kbuilds of lee-mfd with default defconfigs

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: enable baseline testing for mfd for one board of each arch

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: fix platform sections for Qualcomm and Android schedules

Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>

* k8s: Update deployment to uvicorn, as we use fastapi now

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: Unblock android runs on lava-collabora

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* pipeline: Enable preempt-rt cyclictest test

Enable the first preempt-rt test, cyclictest in new KernelCI. Enable it
on all platforms.

Since these are all smoke tests, there is no point in running them for
too long. Thus reduce the runtime per test to one minute. This should
keep the total preempt-rt runtime roughly in the same time frame.

The changes have been ported from Daniel's PR [1].

[1] https://github.com/kernelci/kernelci-core/pull/2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* pipeline: add all the test jobs for all rt-test

Add job definitions for all the rt-tests. Enable cyclicdeadline and rtla
tests to run on all targets.

The changes have been ported from Daniel's PR [1].

[1] https://github.com/kernelci/kernelci-core/pull/2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add template and test properties for preempt_rt jobs

Add template, job and kcidb_test_suite properties for all preempt-rt jobs

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: rename preempt-rt to rt-tests which is correct name of tests

The legacy system used the name preempt-rt for these tests, but the
repository is named rt-tests. We must use the same name so the results
can be merged with execution results coming from other CIs in KCIDB.

Suggested-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add the correct nfsroot for rt-tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: Remove android's deprecated branches

It has been confirmed with Todd that we should remove the deprecated
branches. Hence remove those branches.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: run baseline on non-allmodconfig

allmodconfig generates a very large kernel image. It cannot be booted
on the arm64 and arm targets, as tftp errors out because the image is
too large. Reduce the kernel image size by using the default defconfig.
The same defconfigs have been booting for other trees.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* doc: developer-documentation: Update documentation by adding more details

- Reorganize some things
- Specify how to write different variants by removing old syntax
- Give two separate templates for kbuild and test
- Try to put more details for new contributors

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes since v1:
- Fix typo
- Apply suggestions from code review

* doc/developer-documentation: fix a glitch in enabling new tree section

Fix a minor bug in YAML block formatting.

Fixes: f5f57de ("doc: developer-documentation: Update documentation by adding more details")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc/developer-documentation: update a section title

Rename a section from "Enabling a new Kernel tree" to
"Enabling new KernelCI trees, builds, and tests" as it explains
enabling tests as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: use the new `tree:branch` format for rules

For cases where we want a single branch to be allowed for a given tree,
we can now use the `tree:branch` format in rules. Convert existing rules
accordingly.
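
One way to read the `tree:branch` format is sketched below. This is an
assumption about the matching semantics, not the rule engine used by the
pipeline; the function name `rule_allows` is hypothetical.

```python
def rule_allows(rule, tree, branch):
    """Check a 'tree' or 'tree:branch' rule against a tree and branch.

    Assumed semantics: a bare 'tree' rule allows every branch of that
    tree, while 'tree:branch' allows only that single branch.
    """
    rule_tree, _, rule_branch = rule.partition(':')
    if rule_tree != tree:
        return False
    return not rule_branch or rule_branch == branch


allowed = rule_allows('collabora-next:for-kernelci',
                      'collabora-next', 'for-kernelci')
```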

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: pipeline: fix improper use of "filters" attribute

The `filters` param was used in the legacy system but has been replaced
by `rules`, with a different syntax.

For Android RISC-V builds, this was used to deny job execution on
kernels < 4.19, so let's translate this condition with the rules format,
and do a similar change for the `rt-tests`-based jobs.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config/pipeline.yaml: Fix x86 typo in kcidebug job names

The kcidebug jobs that run on MediaTek and Qualcomm platforms should
have arm64 in the name rather than x86. Fix the typo.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config: pipeline: remove params

The parameters are only needed when they are changed or appended.
Remove the parameters which aren't being modified.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* validate_yaml.py: Jobs are required to have template parameter

Add more validation of mandatory parameters in config files.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Add more job validations

Add basic validation: each job must have a kind parameter.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* workflows: Add label on CI check failures

Automatically add a label so a broken PR won't go to staging.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

---------

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
Signed-off-by: Laura Nao <laura.nao@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>
Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>
Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-authored-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Co-authored-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Co-authored-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Co-authored-by: Helen Koike <helen.koike@collabora.com>
Co-authored-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
Co-authored-by: Laura Nao <laura.nao@collabora.com>
Co-authored-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Co-authored-by: Shreeya Patel <shreeya.patel@collabora.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Milosz Wasilewski <milosz.wasilewski@foundries.io>
Co-authored-by: Paweł Wieczorek <pawiecz@collabora.com>
Co-authored-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>
Co-authored-by: Daniel Wagner <wagi@monom.org>
Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
nuclearcat added a commit to nuclearcat/kernelci-pipeline that referenced this pull request Jul 24, 2024
* src/scheduler: store error message when job fails with "submit_error"

It is helpful for debugging to catch the error message when the
scheduler fails to submit a job to the runtime.
Store the error message in the `data.error_msg` field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: Set minimum kernel version for DT kselftest to 6.7

The test was introduced upstream in version 6.7, so no point in trying
to run it on earlier versions.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* configs/: Update volteer device

Update volteer devices according to lab availability

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary templates: detailed output for active/inactive regressions

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: new presets for active regressions

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: update CHANGELOG

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* data: chmod -R 777 ./data/output to avoid permission error

Avoid errors like

PermissionError: [Errno 13] Permission denied: '/home/kernelci/data/output/stable-rc-boot.html'

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: move code to _get_logs

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: use ThreadPoolExecutor to fetch logs

Fetching logs is the bottleneck of the script. Fetch them in parallel
with ThreadPoolExecutor.
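
A minimal sketch of the approach. `fetch_log` is a stand-in that only
returns a fake string (the real code would download the log), and
`get_logs`, `max_workers=8` and the example.com URLs are illustrative
assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_log(url):
    """Placeholder for the real download; returns fake log text."""
    return f"log contents for {url}"


def get_logs(urls, max_workers=8):
    """Fetch all logs in parallel and return a {url: text} mapping."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the order of the input list
        return dict(zip(urls, executor.map(fetch_log, urls)))


logs = get_logs(['https://example.com/a.log', 'https://example.com/b.log'])
```

Threads fit here because the work is I/O-bound: each worker spends most
of its time waiting on the network.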

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: fix result presets

stable-rc-build-failures and stable-rc-boot-failures weren't querying
specifically for test failures.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src/regression_tracker: rework regression detection

Take into account "active" and "inactive" regressions when creating them
and when processing new passed or failed nodes.

When a node passes, it checks if it "inactivates" an existing "active"
regression. When a node fails, it checks if it needs to create a new
regression or update an existing "active" one.
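
The flow described above, reduced to a sketch. The data layout (a dict
of active regressions keyed by test name and platform) and all names
are assumptions; the check for a previous passing run before creating a
regression is left out for brevity.

```python
def process_node(node, active_regressions):
    """Update the active regressions for a finished test node.

    Returns a short string describing the action taken.
    """
    key = (node['name'], node['platform'])
    regression = active_regressions.get(key)
    if node['result'] == 'pass':
        if regression is None:
            return 'nothing to do'
        # A pass "inactivates" the matching active regression
        regression['active'] = False
        del active_regressions[key]
        return 'inactivated'
    if regression is None:
        # Simplification: a real tracker would first look at prior runs
        active_regressions[key] = {'active': True, 'nodes': [node['id']]}
        return 'created'
    regression['nodes'].append(node['id'])
    return 'updated'


active = {}
base = {'name': 'baseline', 'platform': 'qemu'}
actions = [
    process_node({**base, 'id': 'a', 'result': 'fail'}, active),
    process_node({**base, 'id': 'b', 'result': 'fail'}, active),
    process_node({**base, 'id': 'c', 'result': 'pass'}, active),
]
```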

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src/regression_tracker: link failed nodes to active regressions

When a failed node generates a regression, or when it's a re-run of a
run that generated a still active regression, link the node to the
regression id.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: support for date ranges for creation and update

New command line options to let the user specify date ranges for node
creation and last update: --created-from, --created-to,
--last-updated-from, --last-updated-to

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: support for date ranges for creation and last update

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: support for extra query parameters in cmdline

New command line option: --query-params to specify a set of extra query
parameters to complete or override preset parameters.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: html markup in some preset titles

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: update and move to docs folder

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: move parameter loading and processing to 'setup'

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: refactor and split into two classes (single, run)

Split the ResultSummary class into a base class and two child classes:
ResultSummarySingle and ResultSummaryLoop (only a stub at this point).

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: WIP initial implementation of the "loop" command

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: huge refactoring

Implement "summary" (single-shot) and "monitor" (loop) modes based on
preset parameters instead of on the command-line main command.

Split the logic into multiple files, move all monitor-specific and
summary-specific code to independent files, common code in a separate
file.

Full of kludges, I don't like how this is looking so far, might consider
reimplementing it without any dependencies on pipeline code.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: fix markup and indentation

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: new generic templates for monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: examples for "monitor" and "summary" modes

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: summary and monitor modes

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: fix generic regression report

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: summary: fix last_updated option handling

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: embed css stylesheet in html files

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* regression_tracker: [trivial] make regression active by default

Fixup for commit fcb29501663d78920bcd129bd57c36b9af624bc4
If the "result" field is ever made non-optional in the models we can
probably remove this.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* regression_tracker: [trivial] set default empty node sequence

Fixup for commit fcb29501663d78920bcd129bd57c36b9af624bc4
If the "node_sequence" field is ever made non-optional in the models we
can probably remove this.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: add cmdline option --output-dir

Introduce a new command-line option: --output-dir, and rename the old
--output to --output-file.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: command-line options change

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: jobs-chromeos: remove meaningless Tast tests

Several Tast tests can only fail in the context of KernelCI:
* `video.PlatformDecoding.v4l2_state*_vp9_0_svc` do not actually exist,
  causing the whole test job to fail
* `platform.DLCService*` and `platform.Memd` rely on features only
  present in the downstream Chrom{e,ium}OS kernel (see b/247467814 and
  b/244479619 for those having access to Google's issue tracker)
* `kernel.ConfigVerify.chromeos` relies on downstream-only config
  options such as `CONFIG_SECURITY_CHROMIUMOS` and other similar ones,
  and therefore can only fail when testing upstream kernels

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: scheduler-chromeos: don't execute non-working Tast tests

Currently, HEVC-related tests are known to either fail or be skipped as
ChromeOS doesn't yet handle hardware decoding of HEVC media. This is
expected to be fixed at some point though, so we're keeping the job
definitions and only remove the corresponding scheduler entries in order
to reinstate those jobs when relevant.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: exclude Tast tests known to always fail

Several decoder tests always fail on all platforms where they're
executed, adding only noise to otherwise useful test results. Disable
those for improving the quality of the results.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: chromeos: add special case for pre-6.7 qcom codec tests

On Qualcomm-based ChromeBooks (`trogdor` being the only model in
Collabora's lab), we noticed systematic failures of all
`vp9_*_frm_resize` and `vp9_*_sub8x8_sf` tests when using a kernel up to
6.6. With 6.7 and above, all of those tests (except one) now pass. It
therefore makes sense to exclude those on pre-6.7 kernels so we don't
report known failures and get rid of some noise.

This involves "duplicating" affected test jobs (although I did my best
to minimize that) and setting rules so only the working variant is
executed, based on the version of the kernel being tested.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* lava_callback: Compress the log files to save storage space

As storage space in the cloud and egress have high costs,
it is better to compress potentially large files.
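
As an illustration of the saving, here is a sketch using Python's
`gzip` module. The commit does not say which compression format the
callback uses, so gzip is an assumption, as are the helper name and the
sample log text.

```python
import gzip


def compress_log(text):
    """Return the gzip-compressed bytes of a log given as a string."""
    return gzip.compress(text.encode('utf-8'))


# Repetitive text, as kernel logs often are, compresses well
log = "[    0.000000] Booting Linux\n" * 1000
compressed = compress_log(log)
restored = gzip.decompress(compressed).decode('utf-8')
```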

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* tests: Add basic yaml validation

Add a YAML load step to catch YAML issues earlier.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: drop stoneyridge/pineview naming in platforms anchors

The "stoneyridge" and "pineview" naming used in the Chromebook platform
anchors refers to ChromiumOS specific config fragments, but doesn't
necessarily match the actual platform of all the devices listed.
Use more generic names to distinguish amd and intel Chromebooks.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: rename test job anchors that use chromeos specific configs

Rename test job anchors that use chromeos specific kernel configurations
to include the 'chromeos' infix.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: add baseline tests

Enable the baseline tests on all the supported Chromebooks with their
default kernel configuration.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: drop stoneyridge/pineview naming in job defs

The "stoneyridge" and "pineview" naming used in some Chromebook job
definitions refers to ChromiumOS specific config fragments, but
doesn't necessarily match the actual platforms targeted by the jobs.
Replace all occurrences with more generic intel/amd naming.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: drop chromeos infix from baseline jobs

Keeping different job names for tests targeting different kernel configs
might cause too much duplication. Drop the 'chromeos' infix from the job
name for the tests using the chromeos config fragment. Users will be
able to filter the results using the data.defconfig/data.config_full
fields anyway.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary: post-process results for summary and monitor modes

Split the post-processing of nodes to a common function that can be used
for both summary and monitor modes. Currently, post-processing involves
only the collection of logs.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: update and fix presets and templates

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* doc/result-summary-CHANGELOG: update

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config/pipeline.yaml: enable 'BayLibre' lab

Add lab configuration for BayLibre.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add `lab-baylibre` runtime

Add runtime argument `lab-baylibre` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to BayLibre.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-x86-baylibre` job

Add job configuration `baseline-x86-baylibre` for BayLibre.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-armel-baylibre` job

Add job configuration `baseline-armel-baylibre` for BayLibre.
Add scheduler entry and platform config as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline: enable `android` tree and build configs

Monitor linux `android` tree. Add build configs for `android-mainline`
branch.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/pipeline.yaml: add kbuild definitions for android-mainline

Add kbuild jobs to compile the kernel for android-mainline branch

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/pipeline.yaml: add entries to schedule to build android-mainline

Add entries to `scheduler:` section to run the builds for
android-mainline.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: fix node filter in monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* kernelci.toml: set `checkout` node timeout to `180 min`

The current `60 min` timeout is not enough, as some
`kbuild` jobs and their sub-tests take around 2 hrs to
complete after being submitted to the runtime.

Here is an example from staging. See the information
for a `checkout` and its child nodes:

| id                       | name                | created                    | updated                    | timeout                    |
|--------------------------|---------------------|----------------------------|----------------------------|----------------------------|
| 661c9d59b60b785eb9fc42b0 | checkout            | 2024-04-15T03:22:01.317000 | 2024-04-15T03:51:03.870000 | 2024-04-15T04:22:01.284000 |
| 661c9d97b60b785eb9fc42b4 | kbuild-gcc-10-arm64 | 2024-04-15T03:23:03.399000 | 2024-04-15T03:50:15.031000 | 2024-04-15T09:23:03.399000 |
| 661ca3f7b60b785eb9fc4ead | baseline-arm64      | 2024-04-15T03:50:15.304000 | 2024-04-15T05:09:45.247000 | 2024-04-15T09:50:15.304000 |
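
The arithmetic behind the table can be checked with a short sketch. The
timestamps are copied from the rows above; the variable names are
illustrative.

```python
from datetime import datetime, timedelta

# 'created' of the checkout node and 'updated' of its last child node
checkout_created = datetime.fromisoformat('2024-04-15T03:22:01.317000')
baseline_updated = datetime.fromisoformat('2024-04-15T05:09:45.247000')

# Time from checkout creation until the last child finished
elapsed = baseline_updated - checkout_created

fits_60_min = elapsed <= timedelta(minutes=60)
fits_180_min = elapsed <= timedelta(minutes=180)
```

The child finished about 107 minutes after the checkout was created,
which is past a 60 minute limit but well within 180 minutes.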

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary: add email report capabilities for monitor mode

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: plain text single report templates

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: chromeos: add baseline-nfs tests

Enable the baseline-nfs tests on all the supported Chromebooks, with
both the default and the chromeos kernel configurations.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src/timeout: set `checkout` result

For `TIMEOUT` mode, set the `checkout` node result to `fail`
if its state is `running`, as this means the code checkout was
still in progress when the node timed out. Set it to `pass` if its
state is anything other than `running`.
For `DONE` mode, set the `checkout` node result to `pass`, as
it means `checkout` has already been in the `available` or `closing`
state and the source code checkout completed successfully.
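
The decision described above fits in a few lines. This is only a sketch
of the stated logic; the function name and the plain string arguments
are assumptions and do not mirror the timeout service code.

```python
def checkout_result(mode, state):
    """Return the result for a 'checkout' node given mode and state."""
    if mode == 'TIMEOUT':
        # Still running at timeout means the checkout never finished
        return 'fail' if state == 'running' else 'pass'
    if mode == 'DONE':
        return 'pass'
    raise ValueError(f"unexpected mode: {mode}")


results = [
    checkout_result('TIMEOUT', 'running'),
    checkout_result('TIMEOUT', 'available'),
    checkout_result('DONE', 'closing'),
]
```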

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* regression_tracker: bugfix, failed test with no prior runs

Handle the case of a failed test run when it's the first occurrence of
that test case. Consider it "not a regression" for now, since we're
defining a regression as a "breaking point" between a success and a
failure.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: platforms-chromeos: fix dalboz device type

Due to a copy/paste mishap, the device type for
`asus-CM1400CXA-dalboz` had a trailing `_chromeos`, causing LAVA to fail
to find the correct device type, so no job from the new system ran
on this platform.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: run Tast tests only on 5.4+

Current ChromeOS images have `ext4` filesystems using options not
present in 4.19. Therefore tests cannot run on kernels that old, and
this leads to false positives in corrupt device identification, so we
should only run those tests on 5.4 and later kernels.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: platforms-chromeos: drop non-existent platform

`hp-x360-12b-ca0500na-n4000-octopus` isn't a device type available in
Collabora's LAVA lab, so let's drop its definition.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: exclude android tree from kbuild jobs

Only Android-specific kbuild jobs should run for this tree, let's not
overload our system with unneeded builds.

Take this opportunity to limit mediatek kbuilds to 6.1+ as that's the
earliest version that has upstream support for at least one of our
devices.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src/timeout: a bug fix in `_submit_lapsed_nodes`

Fix a glitch in the code related to setting `checkout`
node result.

Fixes: 361fc0d ("src/timeout: set `checkout` result")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* pipeline.yaml: Update early access FQDN

We are moving k8s from eastus to westus3 as it is cheaper

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/tarball: fix `_kdir` in `update_repo`

Fix the below error:
```
kernelci-pipeline-tarball |   File "/home/kernelci/./pipeline/tarball.py", line 79, in _update_repo
kernelci-pipeline-tarball |     kernelci.shell_cmd(f"rm -rf {self._kdir}")
kernelci-pipeline-tarball |                                  ^^^^^^^^^^
kernelci-pipeline-tarball | AttributeError: 'Tarball' object has no attribute '_kdir'
```

Fixes: 0a2fe9c ("src/patchset.py: Implement Patchset service")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: fix method to get child nodes recursively

`TimeoutService._get_child_nodes_recursive` is used to get
pending child nodes recursively for closing and timed-out
nodes. It overwrites the result while being called recursively.
Fix the method to make it work properly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
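
The kind of bug described above (a recursive helper overwriting its own result) is commonly avoided by accumulating into one list. A minimal sketch, assuming a hypothetical `find_children` lookup and an in-memory tree; this is not the actual `TimeoutService` code:

```python
from typing import Callable, Dict, List

Node = Dict[str, str]


def get_child_nodes_recursive(
    node_id: str,
    find_children: Callable[[str], List[Node]],
) -> List[Node]:
    """Collect every descendant of node_id, depth first."""
    collected: List[Node] = []
    for child in find_children(node_id):
        collected.append(child)
        # Extend with the nested result instead of assigning it,
        # so nodes gathered earlier are not overwritten.
        collected.extend(get_child_nodes_recursive(child["id"], find_children))
    return collected


# Hypothetical in-memory tree standing in for the API lookup.
TREE = {
    "checkout": [{"id": "kbuild-a"}, {"id": "kbuild-b"}],
    "kbuild-a": [{"id": "baseline-a"}],
    "kbuild-b": [],
    "baseline-a": [],
}


def lookup(node_id: str) -> List[Node]:
    return TREE.get(node_id, [])
```

Calling `get_child_nodes_recursive("checkout", lookup)` returns all three descendants rather than only the last branch visited.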

* config: pipeline: rename "armel" arch to "arm"

`armel` has various meanings depending on the system: for ChromeOS, it
is ARMv7, while in Debian it's ARMv{5T,6}. Moreover, this project is
*Kernel*CI and the kernel uses `arm` for all 32-bit ARM devices. In
order to avoid confusion (including those wondering what the heck does
`armel` mean), let's rename `armel` to `arm`.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: use per-system arch property where relevant

With the new `*arch` fields present in the platform configurations, we
don't have to hardcode the architecture strings in some specific cases.
Let's adapt the config files so we use `{cros,deb,k}arch` wherever it
makes sense.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src/timeout: set timed-out `checkout` result

Set the result of a timed-out `checkout` node to `incomplete` if it is
still in the `running` state, since this means the node timed out
while the checkout was still in progress.
Also set the error-related information, i.e. `error_code`
and `error_msg`.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/tarball: update checkout node when update repo fails

The tarball service updates the source code repo and creates a tarball.
If the repo update fails even on the second attempt,
it means the source code could not be checked out.
Hence, update the `checkout` node with state `done` and
result `fail`. Also, add the appropriate error information
to the `data` field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: enable collabora-next tree and build config

Monitor the collabora-next tree. Add build config for the for-kernelci
branch.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: enable acpi kselftest on collabora-next tree

Run the ACPI kselftest on the for-kernelci branch of the collabora-next
tree.

See: https://lore.kernel.org/linux-kselftest/20240308144933.337107-1-laura.nao@collabora.com/T/#t

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary: restore missing split_query_params function

Restore this function that was accidentally removed during the last
refactoring.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* lava_callback: Don't upload empty files to Azure

There is no use for a lot of empty files on Azure;
they only complicate cleanup.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: unify preset and output names

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: update preset for aferraris

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: new presets for laura.nao

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: fixes and new presets for nfraprado

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: fix arch query parameters

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* k8s: Lot of deployment tested fixes

Fixes in yaml files for k8s production deployment.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result-summary presets: Fix build failure and regression monitors

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* result_summary: added debug traces to the monitor

Show detailed info of the node filterings in real time.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: fix corner case bug when no logs are found

Cover rare case where neither the node nor any of its parents up to the
checkout node have any log artifacts.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: refine stable-rc presets

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: add regression info to test reports

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: escape log snippets

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src: lava_callback: add device ID to node data

It can be useful to know the exact device on which a job ran, without
having to open the LAVA job page. This is done by querying the device ID
from the callback data and appending it to the node data.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: lava_callback: upload raw callback data as well

Debugging callback issues is complex due to the raw data not being saved
after processing. This change ensures we save the callback data as a
JSON file in order to ease development.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* DONOTMERGE lava_callback: add debug statements

Why the heck doesn't this just work???

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* result_summary_templates: fix error 'node' is undefined

The object is named test and not node, so s/node/test

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/runtime/kunit: set architecture info

Set architecture field for `kunit` test
nodes.
If no `arch` argument is supplied, kunit takes
`um` (User Mode Linux) as architecture to run
tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: count running child jobs of build nodes

Add a method to count the running jobs of `kbuild`
nodes, i.e. jobs submitted after successful
builds. For example, `baseline` or `tast` jobs.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: handle closing `checkout` node differently

Usually, `checkout` should transition to the `done` state
when all its child nodes are completed.
When a `checkout` node is closing, take into account the
running child jobs of build nodes before transitioning
its state to `done`. Otherwise, `checkout` will be
moved to the `done` state even if some child jobs are still
running.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
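
The condition described above can be sketched as a small predicate. This is only an illustration: the node dictionaries and the `running_build_jobs` counter are assumed shapes, not the service's real data model.

```python
from typing import Dict, List


def can_close_checkout(child_nodes: List[Dict[str, str]],
                       running_build_jobs: int) -> bool:
    """Return True only when nothing below the checkout is still active."""
    # Direct children (e.g. kbuild nodes) must all be done ...
    children_done = all(child["state"] == "done" for child in child_nodes)
    # ... and no job spawned by those builds may still be running.
    return children_done and running_build_jobs == 0
```

With this check, a `checkout` whose builds have finished but whose `baseline` jobs are still running stays open.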

* src/timeout: handle holdoff reached `checkout` node differently

Usually, an available `checkout` node for which the holdoff has been
reached should transition to the `done` state only when
all its child nodes are completed.
For such a `checkout` node, take into account the
running child jobs of build nodes before transitioning
its state to `done`. Otherwise, `checkout` will be
moved to the `done` state even if some child jobs are
still running.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* Revert "DONOTMERGE lava_callback: add debug statements"

This reverts commit 5ed8218d99840373bbba5830b1976813b52bf4b1.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* Create dependabot.yml

* result_summary_templates: make generic-test-failures generic to all
results

The generic-test-failures templates can be used to show general results
by just replacing the name "failures" with "results". This makes them
easier to reuse by communities that want presets listing all results of
the tests, so:

	s/generic-test-failures/generic-test-results

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result-summary.yaml: add preset to list android build tests

Since we now build android, add a preset to allow result-summary.yaml to
list all build results from Android tree.

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* tarball: Implement checkout for specific commit

We often need a specific commit rather than the tip of the tree (ToT); implement this.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* jobs-chromeos.yaml: Disable module compression for every kernel version

Commit d4bbe942098b ("kbuild: remove CONFIG_MODULE_COMPRESS"),
introduced in kernel v5.13, substituted CONFIG_MODULE_COMPRESS=n for
CONFIG_MODULE_COMPRESS_NONE=y as the way to disable module compression.
Since module compression causes "Invalid ELF header magic: != ELF"
errors during boot on the ChromeOS base config, add the missing config
to disable module compression on kernels >= v5.13 as well.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* src: lava_callback: reduce callback data size

The callback data is quite large, especially as it includes the full log
which we already upload separately. By dropping it and compressing the
whole file with `gzip` we can avoid wasting too much storage space.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: lava_callback: don't leak secret token

The callback data contains the secret token's value, which shouldn't be
leaked. Ensure we drop it from the uploaded data.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
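
The two callback-data changes above (dropping bulky or sensitive fields, then compressing) can be sketched as follows. The key names `log` and `token` are assumptions for illustration; the real callback payload may name them differently.

```python
import gzip
import json
from typing import Any, Dict

# Assumed key names; not confirmed against the real LAVA callback schema.
DROPPED_KEYS = ("log", "token")


def pack_callback_data(data: Dict[str, Any]) -> bytes:
    """Strip unwanted fields and return gzip-compressed JSON."""
    trimmed = {key: value for key, value in data.items()
               if key not in DROPPED_KEYS}
    return gzip.compress(json.dumps(trimmed).encode("utf-8"))
```

The result can be written directly to a `.json.gz` file and read back with `gzip.decompress` followed by `json.loads`.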

* config: platforms-chromeos: use new cros-flash image

This ensures we use the new version of the `install-modules` script.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: regression_tracker: add the "device" field to regression data

This can be helpful. We're not using it as a search param though, as we
don't want to narrow down the search that much; using the platform only
is better.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: result_summary_templates: report device used for job

This information is now available, and it can be useful to know the
affected device without having to look at the LAVA job details.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* kubernetes: Update deployment recipe

Update list of labs and add KCI_INSTANCE variable.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava-callback: Limit threads of lava-callback

Due to the inrush of LAVA callbacks and slow Azure Files
processing, we need to make sure we don't spawn too many
threads.
Also add a hard memory limit of 1 Gbyte.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: add presetes for fluster test

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Make template generic for all v4l2 tests
- Rebase on main

* result_summary presets: make the name of fluster test generic

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: enable first fluster test for mt8195-cherry-tomato-r2

Enable first fluster test, AV1-TEST-VECTORS for mt8195-cherry-tomato-r2.
Run the test on mainline and next until more trees are added.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Create generic v4l2-decoder-conformance-job and use anchors from it
- Update the rootfs address
- Move anchor to _anchor
- Update with nitpicks

* config: jobs-chromeos: Add kernelci tree for testing purpose

Remove this commit before merging.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: chromeos: Enable cpufreq kselftest

Enable cpufreq kselftest on all the trees and branches.

Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>

* result_summary presets: fix preset for kselftest-dt failures monitor

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: new presets for kselftest-cpufreq

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: mt8195-cherry-tomato-r2: enable all fluster tests for all branches

Add all the trees and branches on which the tests will be run. Enable
all the tests for tomato.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- The build config cannot be added yet. Just list the trees, it will only use
  the branches configured in build_configs:
  - mainline will use master
  - next will use master
  - collabora-chromeos-kernel will use for-kernelci
  - media will use master and fixes
- Remove kernelci tree as it was added just for testing purpose

* config: mt8183-kukui-jacuzzi-juniper-sku16: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

jacuzzi

* config: mt8186-corsola-steelix-sku131072: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: mt8192-asurada-spherion-r0: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Don't specify the platforms manually as they are already mentioned in
  test-job-arm64-mediatek

* config: sc7180-trogdor-kingoftown/lazor-limozeen: enable add all supported fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Use test-job-arm64-qualcomm instead and create separate jobs for
  qualcomm devices
- Don't specify platforms manually as they are already mentioned in
  test-job-arm64-qualcomm

* build(deps): bump uwsgi from 2.0.21 to 2.0.22 in /docker/lava-callback

Bumps [uwsgi](https://uwsgi-docs.readthedocs.io/en/latest/) from 2.0.21 to 2.0.22.

---
updated-dependencies:
- dependency-name: uwsgi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* pipeline.yaml: Add stable-rc build variants

Add more build variants for stable-rc tree to match legacy system.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary: add error classification

Classify errors according to patterns in the logs

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary presets: add collabora-chromeos-kernel and media trees for fluster tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: Use media-stage instead of media-tree

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config/pipeline: enable android branches from legacy

Enable all android branches from the legacy system

Signed-off-by: Helen Koike <helen.koike@collabora.com>

* trigger: Add exclude/include tree list for trigger

As we need to restrict the list of kernels running on staging,
we need an option that allows it.
It will also be good to exclude staging kernels from the production
kernel list.

So on staging we need to run kernels only from the tree "kernelci"
and sometimes something else, for example "mediatek".
The option will look like:

--trees kernelci,mediatek
or
--trees kernelci

On production we need to exclude trees kernelci and buggytree:
--trees !kernelci,buggytree
or just kernelci:
--trees !kernelci

The purpose of this option is that our compile capacity is limited,
and right now staging and production are both compiling a very large set
of kernels; we need to reduce this amount to cut costs.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
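
A minimal sketch of how a `--trees` value like the ones above could be interpreted, assuming (as the examples suggest) that a leading `!` turns the whole comma-separated list into an exclude list. The function names are hypothetical and not taken from the trigger service.

```python
from typing import List, Tuple


def parse_trees_option(value: str) -> Tuple[bool, List[str]]:
    """Return (exclude, names) for a --trees style value."""
    exclude = value.startswith("!")
    names = [name.strip()
             for name in value.lstrip("!").split(",")
             if name.strip()]
    return exclude, names


def tree_allowed(tree: str, value: str) -> bool:
    """Decide whether a tree should be triggered for the given option."""
    if not value:
        # No option given: every tree is allowed.
        return True
    exclude, names = parse_trees_option(value)
    return (tree not in names) if exclude else (tree in names)
```

For staging, `tree_allowed("mainline", "kernelci,mediatek")` is false, while on production `tree_allowed("mainline", "!kernelci,buggytree")` is true.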

* config: platforms-chromeos: use CrOS R124 files

Chromebooks were upgraded with a new image based on ChromiumOS R124, so
we must use those files now.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: drop non-existent Tast tests

Those were removed between R120 and R124 and therefore cause test
failures with the new images.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* result_summary presets: fix acpi kselftest presets

We're interested in catching regressions and failures in both the
kselftest-acpi test suites and their test cases. Match the nodes by group
in the presets accordingly.
Fix the template used by the failure monitor preset.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src: update return values of `APIHelper.receive_event_node`

The `APIHelper.receive_event_node` method is used to receive
node data from a PubSub event. The method has been updated
to also return the `is_hierarchy` flag, which marks
events related to node hierarchy.
Update the pipeline services using the method accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: refine presets for v4l2-decoder-conformance

Modify the regression preset to monitor regressions on both the
v4l2-decoder-conformance test suites and its test cases, by matching the
nodes by group instead of by name.
Also, change the failure preset to monitor for all errors caused by
runtime errors.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary presets: add summary presets for v4l2-decoder-conformance

Add summary presets to fetch regressions and failures on
v4l2-decoder-conformance tests. Two of the presets are the same used by
the monitor; add one additional preset to fetch all the failures on
both the test suites and their test cases.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* lava_callback.py: Remove error_code/error_msg on lava-callback

Sometimes, due to congestion, a node might be marked as timed out, but
its result may still arrive late, and we need to use it properly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: fix dt kselftest presets

Fix the dt kselftest preset, just as was done for the acpi one, as the
current preset doesn't match the actual results we're interested in.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* doc/connecting-lab: refine documentation

Refine documentation for connecting LAVA labs
and submitting jobs to the lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* lava_callback: Sometimes we get totally invalid log file uploaded

Most likely the problem lies in Flask threading, and callbacks
are possibly getting mixed up. This commit attempts to introduce
several countermeasures against that.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* doc: add `_index.md` page

Add index documentation page.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc: add `pipeline-details` page

Move the `pipeline-details` documentation from the API
repository to this repo to keep it close to the source.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc/connecting-lab: adjust `weight` property

Change the `weight` property of the existing doc page to
accommodate the transition of pipeline-related docs
to the pipeline repo.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc: add `developer-documentation` page

Add developer manual documentation.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add lab config for Qualcomm

Add an entry to `runtimes` section for Qualcomm
lab configurations.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-x86` job for qualcomm

Add job configuration `baseline-x86-qualcomm` for
running baseline job in Qualcomm LAVA lab.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add lab-qualcomm runtime

Add runtime argument `lab-qualcomm` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to Qualcomm LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-arm64` job for qualcomm

Add job configuration `baseline-arm64-qualcomm` for
running baseline job for `arm64` in Qualcomm LAVA lab.
Add scheduler entry as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* pipeline.yaml: Update RISC-V configs

1) rv32 defconfig doesn't exist, remove it
2) nommu_k210_defconfig has modules disabled

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava_callback.py: Sanitize lava log data

As we use this data in reports, let's remove all
non-printable characters, as they confuse Grafana, browsers and others.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
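
One possible way to do this kind of sanitizing is shown below. It is a sketch only: keeping newlines and tabs is an assumption made here so the log stays readable, and the real callback code may use a different rule.

```python
def sanitize_log(text: str) -> str:
    """Drop non-printable characters, keeping newlines and tabs."""
    return "".join(
        ch for ch in text
        # str.isprintable() is False for control characters such as
        # NUL or ESC; line breaks and tabs are kept on purpose.
        if ch.isprintable() or ch in "\n\t"
    )
```

Note that only the control character itself is removed, so the tail of an ANSI escape sequence (for example `[0m`) remains in the text.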

* config/runtime/kunit.jinja2: fix result map

Fix the result map for skipped tests. Initially, the API
didn't have `skip` as an available node result in the schema.
That's why it was mapped to the `None` result. But now the API
has a `skip` result to denote skipped tests.
Fix the result mapping accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: jobs-chromeos: Add lab-setup fragment

Add the lab-setup fragment to the chromebook builds, which contains the
architecture independent kernel configs needed to run tests on the
platform. Notably this disables IP autoconfig by the kernel.

The result of this change is that the 12 seconds boot delay and the
consequent deferred probe pending warnings will no longer happen on any
platform, in particular on mt8186-corsola-steelix-sku131072, where it
was still happening due to a different network adapter being used.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* lava_callback: bump up slightly threads number

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: enable watchdog reset test on Chromebooks

Add a basic test to verify watchdog reset functionality. Enable the
test on all ARM64 and AMD x86_64 Chromebooks. For Intel
Chromebooks, enable the test only on octopus, as ACPI PM Timer on the
other devices has been disabled in coreboot.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src/send_kcidb: use schema version 4.3

Test status `MISS` was added to KCIDB in schema
v4.2 and supported by the latest version i.e. v4.3.
Hence, use the latest version for submission as
API may send a few tests with "MISS" status.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* send_kcidb: re-structure code for parsing checkout node

Move code for parsing checkout node to a separate
method.
Add `valid` field to parsed checkout node. It denotes
if source code was successfully checked out.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: print more information on invalid data

Print details for invalid revision data for the
sake of debugging.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: optimize `kcidb` import

Remove redundant `kcidb` import and adjust
kcidb Client call accordingly.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: remove keys with `None` values

KCIDB doesn't allow `None` as field value.
Remove all optional fields with `None` value
to make it valid data for submitting to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
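
A minimal sketch of stripping `None` values before submission. Recursing into nested dictionaries and lists is an assumption made here; the commit only states that optional fields with `None` values are removed.

```python
from typing import Any


def drop_none_values(obj: Any) -> Any:
    """Return a copy of obj without dictionary keys whose value is None."""
    if isinstance(obj, dict):
        return {key: drop_none_values(value)
                for key, value in obj.items()
                if value is not None}
    if isinstance(obj, list):
        # List items are kept; only dictionaries inside them are cleaned.
        return [drop_none_values(item) for item in obj]
    return obj
```

The input object is left untouched, so the original node data can still be logged for debugging.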

* config: add `kcidb_test_suite` property

Every KernelCI test will be mapped to a unified
test suite for KCIDB data submission.
Add `kcidb_test_suite` property to test job
definitions in YAML configuration files.
The added property will store the mapped
KCIDB test suite name.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: parse and submit node test and build data

Listen to all the node events with node state
`done` or `available` and submit the node to KCIDB.
Parse node received from the event and create KCIDB
schema compatible object based on type of the node
i.e. checkout, build or test.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: set `log_excerpt` for builds and tests

Fetch logs from the compressed log file (`*.log.gz`) URL
and send the last 16*1024 characters to set the `log_excerpt`
field for build and test nodes, as that is the maximum allowed
length of the KCIDB field.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
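
The truncation step can be sketched as below. Downloading the file is left out; the function takes the already-fetched bytes, and its name and the `errors="replace"` decoding policy are assumptions for illustration.

```python
import gzip

# Maximum length of the KCIDB `log_excerpt` field, as stated above.
MAX_EXCERPT = 16 * 1024


def log_excerpt_from_gzip(raw: bytes, limit: int = MAX_EXCERPT) -> str:
    """Decompress a gzip log and keep only its last `limit` characters."""
    text = gzip.decompress(raw).decode("utf-8", errors="replace")
    # Keeping the tail preserves the end of the log, where errors
    # usually show up.
    return text[-limit:]
```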

* config/jobs-chromes: add kcidb test suite property for watchdog test

Add KCIDB test suite mapping for `watchdog_reset` test.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* lava_callback.py: disable log removal from callback data

We need it to investigate whether any critical data is
lost during log sanitizing.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/send_kcidb: add error info to build nodes

Add error metadata fields such as `error_code` and
`error_msg` to `misc` field for build nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: add watchdog-reset presets for mainline/next

Add monitor and summary presets to track the results from the watchdog
reset test on the mainline and next trees.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* pipeline.yaml: Fix fluster rootfs URL

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/send_kcidb: get error metadata for failed/incomplete tests

Tweak condition to get error metadata for test nodes.
It should get error info for incomplete nodes as well
and not just failed nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: send tests only if KCIDB test mapping exists

All test suite definitions must have the `kcidb_test_suite`
property, i.e. the KCIDB test suite mapping.
Only send tests for which the mapping is found.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* tests/validate_yaml: add validation for KCIDB mapping

To submit KernelCI-generated data to KCIDB, every job definition
must provide a mapping through the
`kcidb_test_suite` property.
Add validation to ensure all the jobs have a mapping
present, to avoid missing data submissions.
This check notifies test authors who are enabling tests
in maestro that they must include the required mapping property
in their definition.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
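
A check of this kind could look roughly like the sketch below. It assumes the job definitions are already loaded into a dictionary keyed by job name and that every entry passed in is a test job; the job and suite names in the usage note are illustrative only.

```python
from typing import Any, Dict, List


def jobs_missing_kcidb_mapping(jobs: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return the names of jobs without a `kcidb_test_suite` property."""
    return sorted(
        name for name, definition in jobs.items()
        # An empty string is treated as missing as well.
        if not definition.get("kcidb_test_suite")
    )
```

A validation script can fail when the returned list is non-empty, for example for `{"watchdog-reset": {}}`.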

* config/pipeline.yaml: add qcs6490-rb3gen2 boot test

Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>

* config: chromeos: Enable kselftest-dt on Qualcomm platforms

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* pipeline.yaml: Add one um build for android trees

As requested by the Android team, it will be good to check UM builds
for breakages as well.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: use `kind=job` for test suites

As part of restructuring the test hierarchy, the `Job` model
has been introduced for test suite/job nodes.
It uses node kind `job`.
Update test configurations in `pipeline.yaml` and
`jobs-chromeos.yaml` to use `kind=job` to
generate job nodes.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: provide `kind` value for child tests

When submitting a test hierarchy, child nodes by default
inherit the `kind` value from the parent node.
As we are restructuring the test hierarchy, test suite/job nodes
will have `kind=job` while their child test nodes will have
`kind=test`. Provide the `kind` field explicitly in the test result
hierarchy to keep a kind value different from that of the parent
node.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: fix `NameError`

Fix the below error in `_submit` method:
```
Traceback (most recent call last):
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 287, in main
    job.submit(results)
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 138, in submit
    self._submit(result)
  File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 265, in _submit
    return node
NameError: name 'node' is not defined
```

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/runtime/kunit.jinja2: evaluate job node result

Evaluate the job node result from child node results if
a `null` result is received from the test result parser.
For example, nodes such as `fortify`:
https://staging.kernelci.org:9000/viewer?node_id=6670ab43d0b7694b399897c4

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
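
The fallback described above can be sketched as follows. The aggregation rule used here (any `fail` wins, all `skip` gives `skip`, otherwise `pass`) is an assumption for illustration and may differ from what the kunit template does.

```python
from typing import Iterable, Optional


def evaluate_job_result(parsed: Optional[str],
                        child_results: Iterable[Optional[str]]) -> Optional[str]:
    """Use the parser's result if present, else derive it from children."""
    if parsed is not None:
        return parsed
    results = [result for result in child_results if result is not None]
    if not results:
        # Nothing to derive a result from.
        return None
    if "fail" in results:
        return "fail"
    if all(result == "skip" for result in results):
        return "skip"
    return "pass"
```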

* src/send_kcidb: fix parsing of KUnit log file

Handle both compressed (gzip) and plain text log files
when getting the log excerpt.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
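
One way to accept both kinds of file is to look at the gzip magic bytes before decoding. This is a sketch under that assumption; the actual fix may rely on the file name or HTTP headers instead.

```python
import gzip

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def decode_log(raw: bytes) -> str:
    """Return log text from either gzip-compressed or plain bytes."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return raw.decode("utf-8", errors="replace")
```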

* src/send_kcidb: HTTP exception handling for log excerpt

Add HTTP exception handling for getting
log excerpt data.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: platforms-chromeos: Add serial delay for some Mediatek platforms

Add test_character_delay to the Spherion, Tomato and Steelix platforms
to work around the fact that they're sometimes unable to process serial
input fast enough, resulting in mangled commands and consequently flaky
test results, as described in
https://github.com/kernelci/kernelci-project/issues/366.

The right place to do this change would be in the device-type template
as described in LAVA's documentation [1]. This overriding in KernelCI is
meant only as a temporary workaround to verify whether this fixes the
issue. If it does, then we'll do it in LAVA upstream instead.

[1] https://docs.lavasoftware.org/lava/debugging.html#differences-in-input-speeds
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config: chromeos: Enable error-logs kselftest for MediaTek Chromebooks

Run the error-logs kselftest on MediaTek Chromebooks. This test is
currently under review upstream [1] so, in the meantime, it has been
added to the collabora-next tree so it can prove its value by helping to
detect issues upstream.

[1] https://lore.kernel.org/all/20240423-dev-err-log-selftest-v1-0-690c1741d68b@collabora.com

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config/pipeline.yaml: enable CIP lab

Add configuration for LAVA CIP lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add baseline-x86 test for CIP

Add `baseline-x86-cip` test to be submitted to CIP
LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add `lab-cip` runtime

Add runtime argument `lab-cip` to `scheduler-lava`
container. This will enable the pipeline to run and
submit jobs to CIP LAVA lab.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: enable `job` node submission to KCIDB

Parse newly added job node and its child tests
for KCIDB submission.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: don't submit `setup` test suite nodes

`setup` test suite has been introduced to store test results
for environment setup checks before running actual test suite.
KCIDB doesn't require `setup` test suite result as long as
main test job result is submitted.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: add a check before sending data

Check if parsed data is available before
sending revision data to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: fix logs

Fix the log statement about submitting nodes to KCIDB,
as we are not sending to KCIDB all the nodes we receive
events for.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: handle skipped tests

Do not retrieve artifacts or metadata from the parent
node for skipped tests, as in practice only kernel
revision, test runtime and platform will be
available for skipped tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary/utils: ignore failures on log retrieval

Make the script continue running if there was an error fetching a test
log.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* doc/developer-documentation: add docs for enabling new tests

Add developer documentation for enabling new tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* Fix links after docs page migration

Documentation has been migrated to the "docs.*" subdomain.

Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>

* pipeline.yaml: Add kcidebug fragment

Add useful low-overhead debug option to kernel,
and test on most x86 boards we have available,
with minimal baseline tests.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* configs: update gcc-10 to gcc-12

As we upgrade compiler images, we need to update the gcc version.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* regression_tracker: workaround: match node paths programatically

Don't use 'path' as an api search parameter. The use of lists as query
parameters (path is a list) is undefined. Instead, do the filtering in
code.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
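
Filtering in code rather than through the query can be sketched like this. The node dictionaries and the example path are illustrative assumptions; only the idea of comparing the `path` list client-side comes from the commit above.

```python
from typing import Any, Dict, List


def filter_by_path(nodes: List[Dict[str, Any]],
                   wanted_path: List[str]) -> List[Dict[str, Any]]:
    """Keep only the nodes whose `path` list equals wanted_path."""
    # The API is queried without `path`; the comparison happens here.
    return [node for node in nodes if node.get("path") == wanted_path]
```

For instance, a hypothetical path of `["checkout", "kbuild-gcc-12-x86", "baseline-x86"]` matches only nodes with exactly that list, element by element.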

* config: remove qemu jobs from lab-qualcomm

QEMU jobs use a container pulled from hub.docker.com. After the lab move,
pulling from this registry is no longer possible at Qualcomm. This patch
disables QEMU jobs in the Qualcomm lab.

Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>

* validate_yaml.py: Improve pipeline validation

Add validation that scheduler entries have a matching job entry
(this is a critical validation) and that job entries have at least
one entry in the scheduler.
Fix one entry detected by this validation.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
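
The two checks can be expressed as set differences. The sketch below assumes the parsed config has a `jobs` mapping and a `scheduler` list whose entries carry a `job` key; `check_scheduler_jobs` is a hypothetical helper, not the validate_yaml.py implementation.

```python
def check_scheduler_jobs(config):
    """Cross-check the 'jobs' mapping against the 'scheduler' list."""
    jobs = set(config.get("jobs", {}))
    scheduled = {entry["job"] for entry in config.get("scheduler", [])}
    undefined = sorted(scheduled - jobs)    # scheduler entries with no job definition
    unscheduled = sorted(jobs - scheduled)  # jobs that are never scheduled
    return undefined, unscheduled


# Hypothetical, already-parsed configuration.
config = {
    "jobs": {"baseline-x86": {}, "kbuild-gcc-12-x86": {}},
    "scheduler": [{"job": "baseline-x86"}, {"job": "rt-tests-cyclictest"}],
}
undefined, unscheduled = check_scheduler_jobs(config)
```

Here `undefined` corresponds to the check the commit calls critical, while `unscheduled` covers the second check.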

* pipeline.yaml: Add broonie(Mark Brown) trees to pipeline

It is time to enable even more trees.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Add additional verification for duplicate keys

We might have redefined the same keys in different YAML files;
this tool will ensure consistency of these entries.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
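
One way to detect keys redefined across files is to remember which file first defined each key. This is a sketch under assumptions: the input is a mapping of file name to already-parsed top-level sections, and `find_duplicate_keys` plus the sample job names are hypothetical.

```python
def find_duplicate_keys(files):
    """Report (section, key, first_file, other_file) for keys defined twice."""
    seen = {}
    duplicates = []
    for filename, sections in files.items():
        for section, entries in sections.items():
            for key in entries:
                # setdefault returns the file that defined the key first.
                first = seen.setdefault((section, key), filename)
                if first != filename:
                    duplicates.append((section, key, first, filename))
    return duplicates


# Hypothetical parsed contents of two config files.
files = {
    "pipeline.yaml": {"jobs": {"baseline-x86": {}}},
    "scheduler-chromeos.yaml": {"jobs": {"baseline-x86": {}, "tast-basic": {}}},
}
duplicates = find_duplicate_keys(files)
```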

* validate_yaml.py: Remove path separator

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Rename variable to schedules

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config/kernelci.toml: update KCIDB origin name

As we agreed to refer to the new KernelCI API & Pipeline as
"maestro", use the new name when submitting data
to KCIDB.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/send_kcidb: update KCI result mapping with KCIDB status

Update evaluation of KCIDB status from KCI result.

Create 2 categories for error codes:
1. When pre-check tests completed but the actual test suite
couldn't run - this will have `MISS` status
2. When pre-check tests completed and the actual test suite could
run but somehow couldn't complete - this will have `ERROR` status

Some LAVA error codes, such as `Cancelled` and `Test`, can occur
at any point of execution.
Such error codes are listed under the most relevant category,
based on analysis of available results.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
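
The two-category idea can be sketched as a lookup. The error codes below are placeholders invented for illustration, since the commit message does not list which codes fall into which category; `kcidb_status_for_error` is likewise a hypothetical name.

```python
# Placeholder error codes, NOT the real LAVA list used by send_kcidb.
SUITE_DID_NOT_RUN = {"placeholder_suite_not_started"}
SUITE_DID_NOT_COMPLETE = {"placeholder_suite_interrupted"}


def kcidb_status_for_error(error_code):
    """Map an error code to a KCIDB status, or None if it is uncategorised."""
    if error_code in SUITE_DID_NOT_RUN:
        return "MISS"   # category 1: the test suite could not run
    if error_code in SUITE_DID_NOT_COMPLETE:
        return "ERROR"  # category 2: the suite ran but did not complete
    return None


statuses = [
    kcidb_status_for_error("placeholder_suite_not_started"),
    kcidb_status_for_error("placeholder_suite_interrupted"),
    kcidb_status_for_error("something_else"),
]
```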

* result_summary presets: fix presets for v4l2-decoder-conformance

Following recent updates to data representation on KernelCI nodes,
the top-level nodes for tests now have their kind set to 'job' instead
of 'test'. Update the presets for v4l2-decoder-conformance tests
accordingly.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary presets: fix output file name in kselftest-acpi preset

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: enable dmabuf-heaps, exec and iommu kselftest suites

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Add kcidb_test_suite

* config: result-summary: add generic rule to monitor failures and regression

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: Add rt-stable builds

Copy rt-stable builds from legacy KernelCI.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes:
- Major changes to move to new way of writing kbuild jobs

* config: pipeline: Add v6.6-rt branch for builds

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: result-summary: add rt-stable kbuilds presets

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: chromeos: Add 'nfs' suffix to KCIDB suite name for baseline-nfs

The baseline test is currently run with both ramdisk and nfs rootfs. To
distinguish baseline-nfs tests in KCIDB, add an 'nfs' suffix to the KCIDB
test suite name.

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* aks: Add kubernetes kcidb deployment

We need a file that manages deployment of the kcidb bridge
in the Kubernetes production deployment.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* kubernetes: Adjust trigger k8s options

Ignore the kernelci tree on production, as it is a special
"staging"-only tree, and read the whole /config directory, not just the default
pipeline.yaml.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* regression_tracker: bugfix: catch empty search condition

Fix _get_last_matching_node(), after the previous change there was an
unhandled scenario where nodes may be empty but the function wouldn't
return None immediately.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: pipeline: correct the kind of kselftest suites to job

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* scheduler-chromeos.yaml: Temporarily disable non-essential tast tests

As per discussion, we temporarily disable tast tests that are
unlikely to be reviewed.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* k8s/aks: Update deployment files

1) Update memory limit, as working with linux sources might require 3 GB of RAM.
2) Update config file path
3) Add callback environment variable
4) Update image reference to a fresh one

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: enable android builds with gcc-12 for all architectures

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: enable android builds with clang-17 for all architectures

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: remove build_variants from android build_configs

build_variants is the legacy way to specify the different variants. We
have moved to the newer way of specifying variants. Hence remove
build_variants from the android build_configs.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add android15-6.6-lts branch for build as well

The android15-6.6-lts has been included recently in legacy KernelCI:
https://github.com/kernelci/kernelci-core/pull/2597

Add the same in newer KernelCI.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add blocklist for riscv older kernels for android builds

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: update KCIDB test suite mapping for baseline

Use `boot` as KCIDB test suite mapping for all
baseline tests.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* callback_url: Update config and README

As we are moving the callback URL to an environment variable,
update the config and README accordingly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: enable android baseline (boot) testing for arm and arm64 in only allmodconfig

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* scheduler.py: If an event has a jobfilter, inject it into the node data

When someone generates an artificial event with a jobfilter, this is
likely a maintainer trying to repeat a job. Treat this accordingly,
and inject the job filter into the job node, so we will run only the tests
the maintainer wants.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
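
The injection step might look like the sketch below. It assumes the event and node are plain dictionaries and that the filter is stored under a `jobfilter` key in both; that layout and the helper name `inject_jobfilter` are assumptions, not the scheduler.py code.

```python
def inject_jobfilter(event, node):
    """Return node data carrying the event's jobfilter, if the event has one."""
    jobfilter = event.get("jobfilter")
    if not jobfilter:
        return node  # ordinary event: leave the node untouched
    updated = dict(node)  # shallow copy so the caller's node is not mutated
    updated["jobfilter"] = list(jobfilter)
    return updated


# Hypothetical artificial event asking to re-run a single job.
event = {"op": "updated", "jobfilter": ["rt-tests-cyclictest"]}
node = {"name": "kbuild-gcc-12-x86", "kind": "kbuild"}
filtered_node = inject_jobfilter(event, node)
plain_node = inject_jobfilter({"op": "updated"}, node)
```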

* lava_callback: migrate to fastapi

It will be easier to maintain the API and Pipeline, as
both will be powered by the FastAPI framework.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: Update fluster rootfs URL

Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: pipeline: fix defconfigs in fragments

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* kbuild.jinja2: support defconfig as list or str

As required in https://github.com/kernelci/kernelci-core/pull/2608
defconfig may be one of two types, a list or a string. Support both in jinja2 accordingly.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
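
The actual change lives in the jinja2 template, but the underlying normalisation can be sketched in Python. `normalize_defconfig` is a hypothetical helper, and the defconfig names are only examples.

```python
def normalize_defconfig(defconfig):
    """Return defconfig as a list, whether it was given as a str or a list."""
    if isinstance(defconfig, str):
        return [defconfig]
    return list(defconfig)


as_str = normalize_defconfig("defconfig")
as_list = normalize_defconfig(["defconfig", "multi_v7_defconfig"])
```

Normalising once up front lets the rest of the template loop over the value without checking its type again.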

* config: pipeline: add kbuilds of lee-mfd with default defconfigs

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: enable baseline testing for mfd for one board of each arch

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: fix platform sections for Qualcomm and Android schedules

Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>

* k8s: Update deployment to uvicorn, as we use fastapi now

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: pipeline: Unblock android runs on lava-collabora

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* pipeline: Enable preempt-rt cyclictest test

Enable the first preempt-rt test, cyclictest in new KernelCI. Enable it
on all platforms.

Since these are all smoke tests there is no point in running them too
long. Thus reduce the runtime per test to one minute. This should keep
the total preempt-rt runtime roughly in the same time frame.

The changes have been ported from Daniel's PR [1].

[1] https://github.com/kernelci/kernelci-core/pull/2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* pipeline: add all the test jobs for all rt-test

Add job definitions for all the rt-tests. Enable cyclicdeadline and rtla
tests to run on all targets.

The changes have been ported from Daniel's PR [1].

[1] https://github.com/kernelci/kernelci-core/pull/2397

Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add template and test properties for preempt_rt jobs

Add template, job and kcidb_test_suite properties for all preempt-rt jobs.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: rename preempt-rt to rt-tests which is correct name of tests

The legacy system used the name preempt-rt for these tests, but the
repository is named rt-tests. We must use the same name to merge with
execution results coming from other CIs in KCIDB.

Suggested-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: add the correct nfsroot for rt-tests

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: Remove android's deprecated branches

It has been confirmed with Todd that we should remove the deprecated
branches. Hence remove those branches.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: pipeline: run baseline on non-allmodconfig

allmodconfig generates a very large kernel image. It cannot be booted
on the arm64 and arm targets, as tftp errors out because the size is too large.
Reduce the kernel image size by using the default defconfig. The same
defconfigs have been booting for other trees.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* doc: developer-documentation: Update documentation by adding more details

- Reorganize some things
- Specify how to write different variants by removing old syntax
- Give two separate templates for kbuild and test
- Try to put more details for new contributors

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes since v1:
- Fix typo
- Apply suggestions from code review

* doc/developer-documentation: fix a glitch in enabling new tree section

Fix a minor bug in YAML block formatting.

Fixes: f5f57de ("doc: developer-documentation: Update documentation by adding more details")
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc/developer-documentation: update a section title

Rename a section from "Enabling a new Kernel tree" to
"Enabling new KernelCI trees, builds, and tests" as it explains
enabling tests as well.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: use the new `tree:branch` format for rules

For cases where we want a single branch to be allowed for a given tree,
we can now use the `tree:branch` format in rules. Convert existing rules
accordingly.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
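
How a `tree:branch` entry might be matched can be sketched as follows. The semantics here are an assumption (a bare `tree` entry allows every branch, `tree:branch` allows only that branch), and `tree_rule_allows` plus the example tree and branch names are hypothetical.

```python
def tree_rule_allows(entries, tree, branch):
    """Return True if any 'tree' or 'tree:branch' entry matches."""
    for entry in entries:
        rule_tree, _, rule_branch = entry.partition(":")
        if rule_tree != tree:
            continue
        # An entry without ':branch' is assumed to allow all branches.
        if not rule_branch or rule_branch == branch:
            return True
    return False


entries = ["mainline", "stable-rt:v6.6-rt"]
allowed = tree_rule_allows(entries, "stable-rt", "v6.6-rt")
denied = tree_rule_allows(entries, "stable-rt", "v5.15-rt")
any_branch = tree_rule_allows(entries, "mainline", "master")
```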

* config: pipeline: fix improper use of "filters" attribute

The `filters` param was used in the legacy system but has been replaced
by `rules`, with a different syntax.

For Android RISC-V builds, this was used to deny job execution on
kernels < 4.19, so let's translate this condition with the rules format,
and do a similar change for the `rt-tests`-based jobs.

Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
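
The "deny kernels < 4.19" condition amounts to a version comparison. The sketch below assumes a rule shaped like `min_version: {version: 4, patchlevel: 19}`; that shape and the helper name `meets_min_version` are assumptions, not confirmed by this commit message.

```python
def meets_min_version(kernel, min_version):
    """Return True if the kernel is at least min_version (major.minor)."""
    have = (kernel["version"], kernel["patchlevel"])
    need = (min_version["version"], min_version["patchlevel"])
    return have >= need  # tuple comparison: major first, then minor


min_version = {"version": 4, "patchlevel": 19}
old_kernel_ok = meets_min_version({"version": 4, "patchlevel": 14}, min_version)
new_kernel_ok = meets_min_version({"version": 6, "patchlevel": 6}, min_version)
```

Comparing tuples avoids the classic mistake of comparing version strings, where "4.9" would sort after "4.19".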

* config/pipeline.yaml: Fix x86 typo in kcidebug job names

The kcidebug jobs that run on MediaTek and Qualcomm platforms should
have arm64 in the name rather than x86. Fix the typo.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* config: pipeline: remove params

The parameters are only needed when they are changed or appended.
Remove the parameters which aren't being modified.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* validate_yaml.py: Jobs are required to have template parameter

Add more validation to config files of mandatory parameters.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* validate_yaml.py: Add more job validations

Add basic validation: each job must have a kind parameter.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* workflows: Add label on CI check failures

Automatically add a label so a broken PR won't go to staging.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

---------

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
Signed-off-by: Laura Nao <laura.nao@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>
Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com>
Signed-off-by: Daniel Wagner <wagi@monom.org>
Co-authored-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Co-authored-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Co-authored-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Co-authored-by: Helen Koike <helen.koike@collabora.com>
Co-authored-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
Co-authored-by: Laura Nao <laura.nao@collabora.com>
Co-authored-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Co-authored-by: Shreeya Patel <shreeya.patel@collabora.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Milosz Wasilewski <milosz.wasilewski@foundries.io>
Co-authored-by: Paweł Wieczorek <pawiecz@collabora.com>
Co-authored-by: Milosz Wasilewski <quic_mwasilew@quicinc.com>
Co-authored-by: Daniel Wagner <wagi@monom.org>
Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
@igaw
Contributor Author

igaw commented Aug 7, 2024

I've shared the dashboard with my fellow stable-rt maintainers and we played around a bit. It seems everyone likes the direction this project is taking.

Anyway, one remark was that no rt config was used for the builds, just the default ones. Which is good, as it already found a bug in one of our rt patch series. So the question is whether it would be possible to re-enable the rt configs as well. Thanks!

@musamaanjum
Contributor

Thank you so much for reporting.

The preempt_rt config builds are already enabled. There is some problem on the dashboard side and those results are missing. I'll investigate.

@padovan
Contributor

padovan commented Aug 7, 2024

@igaw thanks for the valuable feedback! Let's also set up email notifications for your trees and tests (cc @JenySadadia)

@igaw
Contributor Author

igaw commented Aug 7, 2024

For the email notifications, could these emails be sent to stable-rt@vger.kernel.org too? We use this mailing list for stable work, so it would be the right place to post any build results etc. Thanks!

@JenySadadia
Collaborator

Hello @igaw
For starters, I'll begin working on reporting build failures and rt-tests failures in the email notification.
Are there any other test failures you would like to see in the report, such as boot failures?

@igaw
Contributor Author

igaw commented Aug 19, 2024

Hi @JenySadadia, cool! Yes, boot failures would also be good to get.

@JenySadadia
Collaborator

Hi @JenySadadia, cool! Yes, boot failures would also be good to get.

Okay, thanks.

@nuclearcat
Member

Opened issue #2643,
as it's better to continue the discussion there and close this PR, which is no longer relevant. Thank you!

@nuclearcat nuclearcat closed this Aug 21, 2024