add build config and test-plan for PREEMPT_RT #475

Merged
merged 3 commits into from Sep 23, 2020

khilman
Contributor

@khilman khilman commented Aug 25, 2020

Add minimal set of build configurations, including PREEMPT_RT=y config for the rt-stable branches.
Also add a test-plan based on the LAVA test-definition from Daniel Wagner's fork of the Linaro test-definitions repo.

NOTE: PR currently based on the kselftest PR #445 due to conflicts in adding test-plan to test-configs and lab-configs.

Things to fix

  • move parameters to device-specific part of test-configs (e.g. num threads will be SoC specific)

@khilman khilman linked an issue Aug 25, 2020 that may be closed by this pull request
@broonie
Member

broonie commented Aug 25, 2020

Makes sense to me.

@khilman khilman changed the title from "build-configs: update RT preempt configurations" to "and build config and test-plan for PREEMPT_RT" Aug 25, 2020
@khilman khilman mentioned this pull request Aug 25, 2020
@khilman khilman force-pushed the dev/preempt-rt branch 2 times, most recently from 6e05adc to 00489c0 Aug 25, 2020
@khilman khilman changed the title from "and build config and test-plan for PREEMPT_RT" to "add build config and test-plan for PREEMPT_RT" Aug 25, 2020
@khilman
Contributor Author

khilman commented Aug 25, 2020

Manually triggered LAVA job: https://lava.baylibre.com/scheduler/job/31249
and results on staging: https://staging.kernelci.org/test/plan/id/5f4581c91902afaf5f544a7b/

@gctucker something kind of curious. The frontend shows some of the results as unknown, but the LAVA_SIGNAL sent says fail. Here's the relevant part of the LAVA job log:

- {"dt": "2020-08-25T22:56:06.243696", "lvl": "target", "msg": "t0-min-latency pass 21 us"}
- {"dt": "2020-08-25T22:56:06.243948", "lvl": "target", "msg": "t0-avg-latency pass 42 us"}
- {"dt": "2020-08-25T22:56:06.248144", "lvl": "target", "msg": "t0-max-latency fail 4344 us"}
- {"dt": "2020-08-25T22:56:06.248434", "lvl": "target", "msg": "t1-min-latency pass 24 us"}
- {"dt": "2020-08-25T22:56:06.248614", "lvl": "target", "msg": "t1-avg-latency pass 39 us"}
- {"dt": "2020-08-25T22:56:06.248743", "lvl": "target", "msg": "t1-max-latency fail 4782 us"}
- {"dt": "2020-08-25T22:56:06.254064", "lvl": "target", "msg": "+ ../../utils/send-to-lava.sh ./output/result.txt"}
- {"dt": "2020-08-25T22:56:06.340646", "lvl": "target", "msg": "<LAVA_SIGNAL_TESTCASE TEST_CASE_ID=t0-min-latency RESULT=pass UNITS=us MEASUREMENT=21>"}
- {"dt": "2020-08-25T22:56:06.350176", "lvl": "debug", "msg": "Received signal: <TESTCASE> TEST_CASE_ID=t0-min-latency RESULT=pass UNITS=us MEASUREMENT=21"}
- {"dt": "2020-08-25T22:56:06.350818", "lvl": "results", "msg": {"case": "t0-min-latency", "definition": "1_cyclictest", "measurement": !!python/object/apply:decimal.Decimal ["21"], "result": "pass", "units": "us"}}
- {"dt": "2020-08-25T22:56:06.420675", "lvl": "target", "msg": "<LAVA_SIGNAL_TESTCASE TEST_CASE_ID=t0-avg-latency RESULT=pass UNITS=us MEASUREMENT=42>"}
- {"dt": "2020-08-25T22:56:06.423155", "lvl": "debug", "msg": "Received signal: <TESTCASE> TEST_CASE_ID=t0-avg-latency RESULT=pass UNITS=us MEASUREMENT=42"}
- {"dt": "2020-08-25T22:56:06.423739", "lvl": "results", "msg": {"case": "t0-avg-latency", "definition": "1_cyclictest", "measurement": !!python/object/apply:decimal.Decimal ["42"], "result": "pass", "units": "us"}}
- {"dt": "2020-08-25T22:56:06.498716", "lvl": "target", "msg": "<LAVA_SIGNAL_TESTCASE TEST_CASE_ID=t0-max-latency RESULT=fail UNITS=us MEASUREMENT=4344>"}
- {"dt": "2020-08-25T22:56:06.499538", "lvl": "debug", "msg": "Received signal: <TESTCASE> TEST_CASE_ID=t0-max-latency RESULT=fail UNITS=us MEASUREMENT=4344"}
- {"dt": "2020-08-25T22:56:06.500106", "lvl": "results", "msg": {"case": "t0-max-latency", "definition": "1_cyclictest", "measurement": !!python/object/apply:decimal.Decimal ["4344"], "result": "fail", "units": "us"}}
- {"dt": "2020-08-25T22:56:06.593601", "lvl": "target", "msg": "<LAVA_SIGNAL_TESTCASE TEST_CASE_ID=t1-min-latency RESULT=pass UNITS=us MEASUREMENT=24>"}
- {"dt": "2020-08-25T22:56:06.594449", "lvl": "debug", "msg": "Received signal: <TESTCASE> TEST_CASE_ID=t1-min-latency RESULT=pass UNITS=us MEASUREMENT=24"}
- {"dt": "2020-08-25T22:56:06.595037", "lvl": "results", "msg": {"case": "t1-min-latency", "definition": "1_cyclictest", "measurement": !!python/object/apply:decimal.Decimal ["24"], "result": "pass", "units": "us"}}
- {"dt": "2020-08-25T22:56:06.680753", "lvl": "target", "msg": "<LAVA_SIGNAL_TESTCASE TEST_CASE_ID=t1-avg-latency RESULT=pass UNITS=us MEASUREMENT=39>"}
- {"dt": "2020-08-25T22:56:06.681478", "lvl": "debug", "msg": "Received signal: <TESTCASE> TEST_CASE_ID=t1-avg-latency RESULT=pass UNITS=us MEASUREMENT=39"}
- {"dt": "2020-08-25T22:56:06.682296", "lvl": "results", "msg": {"case": "t1-avg-latency", "definition": "1_cyclictest", "measurement": !!python/object/apply:decimal.Decimal ["39"], "result": "pass", "units": "us"}}
- {"dt": "2020-08-25T22:56:06.936498", "lvl": "target", "msg": "<LAVA_SIGNAL_TESTCASE TEST_CASE_ID=t1-max-latency RESULT=fail UNITS=us MEASUREMENT=4782>"}
- {"dt": "2020-08-25T22:56:06.936910", "lvl": "target", "msg": "+ set +x"}
- {"dt": "2020-08-25T22:56:06.937279", "lvl": "target", "msg": "<LAVA_SIGNAL_ENDRUN 1_cyclictest 31249_1.6.2.4.5>"}
- {"dt": "2020-08-25T22:56:06.937505", "lvl": "target", "msg": "<LAVA_TEST_RUNNER EXIT>"}
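
For anyone wanting to compare what the target sent against what the frontend displays, here is a minimal Python sketch that pulls the test case signals out of a log excerpt like the one above. The regular expression is inferred only from the lines shown here, and the names `SIGNAL_RE` and `parse_signals` are hypothetical; this is not LAVA's own parser.

```python
import re
from decimal import Decimal

# Pattern inferred from the LAVA_SIGNAL_TESTCASE lines in the log above.
# UNITS and MEASUREMENT are treated as optional (an assumption).
SIGNAL_RE = re.compile(
    r"<LAVA_SIGNAL_TESTCASE\s+TEST_CASE_ID=(?P<case>[^\s>]+)"
    r"\s+RESULT=(?P<result>[^\s>]+)"
    r"(?:\s+UNITS=(?P<units>[^\s>]+))?"
    r"(?:\s+MEASUREMENT=(?P<measurement>[^\s>]+))?>"
)


def parse_signals(log_text):
    """Return one dict per test case signal found in the log text."""
    results = []
    for match in SIGNAL_RE.finditer(log_text):
        value = match.group("measurement")
        results.append({
            "case": match.group("case"),
            "result": match.group("result"),
            "units": match.group("units"),
            # Decimal mirrors how the measurement shows up in the "results" lines.
            "measurement": Decimal(value) if value is not None else None,
        })
    return results
```

Only the lines emitted by the target match; the "Received signal: <TESTCASE> ..." debug lines are skipped because they do not carry the `LAVA_SIGNAL_TESTCASE` prefix.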

@gctucker
Contributor

gctucker commented Aug 26, 2020

@khilman Yes, that's because tests that have always reported a "FAIL" are not considered regressions. It's an artifact of the frontend UI; we should show the status differently to reflect failures that aren't regressions. Something like this maybe:

  • Green check: pass
  • Orange cross: always failed
  • Red cross: regression
  • Grey question mark: unknown

That deserves an issue in kernelci-frontend.
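
The four states proposed above could be derived roughly as in the sketch below. It assumes a "regression" means a failure for a test case that has passed at least once before; that rule, and the function name `classify_status`, are illustrative assumptions and not taken from kernelci-frontend or kernelci-backend.

```python
from typing import Optional, Sequence


def classify_status(current: Optional[str], history: Sequence[str]) -> str:
    """Map a test result plus its earlier results to one of the four states.

    `current` is the latest result ("pass", "fail", or anything else),
    `history` holds the earlier results for the same test case.
    """
    if current == "pass":
        return "pass"            # green check
    if current == "fail":
        if "pass" in history:
            return "regression"  # red cross: it used to pass
        return "always failed"   # orange cross: never seen passing
    return "unknown"             # grey question mark
```

With this rule, the t0-max-latency failure from the job above would show as "always failed" until a passing run exists to compare against.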

@gctucker gctucker requested a review from a team Sep 1, 2020
@gctucker gctucker added this to In Progress in KernelCI project board via automation Sep 1, 2020
Contributor

@gctucker gctucker left a comment

A few minor things - and yes that will need to be rebased once #445 has been merged. Or, the dependency could be inverted if we wanted to merge this first and then #445 with kselftests.

build-configs.yaml (outdated)
configs:
- 'CONFIG_EXPERT=y'
- 'CONFIG_PREEMPT_RT=y'
- 'CONFIG_PREEMPT_RT_FULL=y' # <= v4.19
Contributor

@gctucker gctucker Sep 1, 2020

I guess it won't hurt to have this config in the fragment with newer kernels, as it should just result in a warning from merge_config.sh. If, however, we start having strict checks where all the config options must appear in the resulting .config, then we would need to create a separate fragment, e.g. preempt_rt_4.19 with that RT_FULL config option, and the default one without it.
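
A strict check of the kind described here could look roughly like the following Python sketch: it lists the options requested by a fragment that do not appear verbatim in the resulting .config. The function name `missing_options` and the sample .config contents are hypothetical, and handling of `# CONFIG_FOO is not set` lines is deliberately left out.

```python
def missing_options(fragment_lines, dot_config_text):
    """Return fragment options that are absent from the resulting .config."""
    # Collect the non-comment, non-empty lines of the generated .config.
    enabled = {
        line.strip()
        for line in dot_config_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    missing = []
    for raw in fragment_lines:
        # Drop trailing comments such as "# <= v4.19" from the fragment entry.
        option = raw.split("#", 1)[0].strip()
        if option and option not in enabled:
            missing.append(option)
    return missing
```

If a kernel build dropped the RT_FULL option, the check would report `CONFIG_PREEMPT_RT_FULL=y` as missing, which is the case where a separate per-version fragment would become necessary.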

@khilman
Contributor Author

khilman commented Sep 16, 2020

Fixed alphabetical order issues mentioned above and rebased onto master now that kselftest is merged.
Should be ready for broader testing in staging now.

@gctucker
Contributor

gctucker commented Sep 16, 2020

Great, the PR had been conflicting for a while so let's hope this gets tested in the next staging job.

name: preempt-rt-prereq
path: inline/preempt-rt-prereq.yaml

- repository: https://github.com/igaw/test-definitions.git
Contributor

@gctucker gctucker Sep 16, 2020

We should be getting those test definitions from kernelci/test-definitions now, with the kernelci.org branch. Let's get a couple of staging jobs to run with this repo first but then we would need to get it changed before deploying in production.

Contributor Author

@khilman khilman Sep 17, 2020

I don't think we'll get any runs on staging because there are no rt-stable trees built in staging. Maybe you could add rt-stable v5.4, at least temporarily?

Contributor Author

@khilman khilman Sep 17, 2020

Looks like I need to switch to the kernelci repo right now. Daniel submitted his stuff upstream and removed his preempt-rt branch.

Contributor

@gctucker gctucker Sep 21, 2020

Starting tree monitor jobs for some stable-rt build configs manually is fine for now; we shouldn't be building that all the time as it's a significant load. If we do want to keep testing real-time on staging, we should make a special build config, i.e. kernelci_staging-rt, or pull from stable-rt into the kernelci staging kernel branch.

test-configs.yaml (outdated)
@khilman
Contributor Author

khilman commented Sep 17, 2020

Updated to use test-definitions repo/branch from kernelci, and dropped rootfs pattern based on comments.

Results from manually submitted jobs: https://staging.kernelci.org/test/job/rt-stable_v5.4-rt/branch/HEAD/kernel/v5.4.61-rt37/plan/preempt-rt/

LAVA jobs:

@gctucker
Contributor

gctucker commented Sep 17, 2020

https://staging.kernelci.org/test/case/id/5f63999274b6645009581060/
9223372036854776000us latency? That's 292271 years :p

I don't know what went wrong, looks like an unsigned integer overflow. What is in the original LAVA results? That would help narrow down the problem, between LAVA, kernelci-backend or kernelci-frontend.
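
As a quick sanity check on that number, the Python sketch below converts the reported value to years and compares it against the largest signed 64-bit integer. The comparison is only an observation about the value itself (it is what 2^63 - 1 looks like once rounded to a double-precision float); it does not say where in the pipeline the value originated. The variable names are illustrative.

```python
# Value shown by the frontend, in microseconds.
reported_us = 9223372036854776000

# Largest signed 64-bit integer.
INT64_MAX = 2**63 - 1

# Julian year (365.25 days) expressed in seconds.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Microseconds -> seconds -> years.
years = reported_us / 1_000_000 / SECONDS_PER_YEAR

# Both values round to the same double, so a double-based renderer
# could not tell them apart.
same_as_int64_max_as_double = float(reported_us) == float(INT64_MAX)

print(int(years), same_as_int64_max_as_double)
```

The year count matches the figure quoted above, and the value is indistinguishable from the signed 64-bit maximum once stored as a double, which may help when checking the raw LAVA results.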

@khilman
Contributor Author

khilman commented Sep 17, 2020

The results sent to the backend match the results from LAVA: https://lava.baylibre.com/scheduler/job/117611#L1293

So this looks like a cyclictest issue, not something related to the test-definition or this PR. IMO, we can merge this and then get rt-tests folks like Daniel Wagner to have a look.

@gctucker
Contributor

gctucker commented Sep 17, 2020

Great, thanks for confirming.

How about the item in the description to add parameters such as number of threads?

@khilman
Contributor Author

khilman commented Sep 17, 2020

> How about the item in the description to add parameters such as number of threads?

I meant for that to be a future work item as more platforms are added, not a prereq for this PR.

@khilman
Contributor Author

khilman commented Sep 17, 2020

Here's some results from the khadas-vim3l (arm64: 2x A53): https://lava.baylibre.com/scheduler/job/118418#L1361

Those look more normal than the odroid-n2 (arm64: big.LITTLE 2x A53, 4x A72)

@gctucker
Contributor

gctucker commented Sep 21, 2020

templates/preempt-rt/preempt-rt.jinja2 (outdated)
description: Pre-requisites for PREEMPT_RT
run:
steps:
- apt-get update && apt-get install -y procps rt-tests
Contributor

@gctucker gctucker Sep 21, 2020

Looks like something to add to the rootfs image as a follow-up - either buster-rt if we want to keep adding test suites for real-time, if not this could be added to the basic buster one to avoid rootfs proliferation.

Contributor Author

@khilman khilman Sep 22, 2020

Yes, this can be done if/when creating a test-plan specific rootfs.

khilman added 3 commits Sep 22, 2020
1) add config fragments for building RT-enabled kernels for the
rt-stable tree and branches.

2) Add preempt_rt_variant for minimal set of builds.  Based on
minimal_variant, and adds preempt_rt fragment.

3) Add v4.19-rt and v5.4-rt branches

Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Add preempt-rt test-plan based on the Linaro test-definitions repo.

Uses vanilla KernelCI debian buster NFS root filesystem, and installs
rt-tests (and dependencies) via apt-get before running the test plan.

Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
@gctucker gctucker merged commit 5fa4394 into kernelci:master Sep 23, 2020
KernelCI project board automation moved this from In Progress to Done Sep 23, 2020

Successfully merging this pull request may close these issues.

Real-Time test coverage
3 participants