This repository was archived by the owner on Mar 22, 2019. It is now read-only.

WIP: send data to experimental ES#68

Closed
chaws wants to merge 2 commits into kernelci:master from chaws:add-experimental-callback

Conversation


@chaws chaws commented Dec 14, 2018

On last Monday's call (10/12/18), Milosz suggested a patch to kernelci so that job results from LAVA would also be sent to our experimental Elasticsearch instance.

This patch does two things:

  1. Sends a copy of build.json to our server
  2. Changes the lava test job definition to add an extra callback when callback_type != 'custom'
    2.1. Apparently only LAVA V2 labs after 2018.5 support multiple callbacks, and from https://github.com/kernelci/kernelci-core-staging/blob/master/labs.ini I came up with this list:
  • lab-baylibre: 2018.11-1~bpo9+1
  • lab-baylibre-dev: 2018.11-1~bpo9+1
  • lab-baylibre-seattle: offline
  • lab-baylibre-seattle-dev: offline
  • lab-broonie: offline
  • lab-broonie-dev: offline
  • lab-collabora: 2018.11-1~bpo9+1.1debian9.1
  • lab-collabora-dev: 2018.11-1~bpo9+1.1debian9.1
  • lab-embeddedbits: offline
  • lab-embeddedbits-dev: offline
  • lab-pengutronix: offline
  • lab-pengutronix-dev: offline
  • lab-free-electrons: offline
  • lab-free-electrons-dev: offline
  • lab-jsmoeller: offline
  • lab-jsmoeller-dev: offline
  • lab-mhart: 2018.5.post1-1+stretch
  • lab-mhart-dev: 2018.5.post1-1+stretch
  • lab-linaro-lkft: 2018.11+stretch
  • lab-linaro-lkft-dev: 2018.11+stretch
  • lab-theobroma-systems: 2018.2-1+stretch
  • lab-theobroma-systems-dev: 2018.2-1+stretch

Since kernelci-core-staging uses matt's, collabora's and baylibre's labs, we're safe adding the multiple-callbacks block to the test job definition.
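For illustration, a notify block with two callbacks in a LAVA v2 job definition might look like the sketch below. The URLs and tokens are placeholders, and the exact field set and `callbacks` list syntax should be checked against the LAVA documentation for the versions listed above:

```yaml
notify:
  criteria:
    status: finished
  callbacks:
    # Existing KernelCI backend callback (URL is a placeholder)
    - url: https://api.kernelci.org/callback/lava/test
      method: POST
      dataset: all
      token: kernelci-backend-token
      content-type: json
    # Extra callback for the experimental Elasticsearch instance
    - url: https://es.example.com/callback
      method: POST
      dataset: all
      token: experimental-es-token
      content-type: json
```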

Contributor

khilman commented Dec 17, 2018

I'm OK with this approach to send live data to additional backends for experimentation.
Could you make the URLs a bit more configurable, either via env variables or args?

Author

chaws commented Jan 15, 2019

@khilman sorry for the delay, I've parameterized it now, but for one experimental_url only; if needed I can add multiple URLs (maybe comma-separated).

Contributor

@khilman khilman left a comment


I don't understand why build.py is involved at all; it plays no role for the current callback URLs. This is not a kernel-build-time decision, but one that should be made when creating the LAVA jobs (lava-v2-jobs-from-api.py).

Rather, I think that the --callback feature of lava-v2-jobs-from-api.py should be extended to support more than one callback URL. Also, why limit ourselves to 2?

@gctucker what's your thought on the right way to support multiple callbacks in LAVA jobs in a flexible way?

Author

chaws commented Jan 17, 2019

@khilman the intention of having the experimental URL in build.py was to send ES a fresh copy of the build.json file, then associate builds to jobs (boot/test) by using the git commit as the link. Do you have other ideas for doing this?

Agreed on --callback; might've been naive on my end. Will update it later today, thanks for the input!
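The linking idea above can be sketched in plain Python: once build.json documents and LAVA job results both land in Elasticsearch, they can be joined on the git commit. Field names here (`git_commit`, `builds`) are assumptions for illustration, not the actual document schema:

```python
def index_by_commit(builds):
    """Map each git commit SHA to the list of build documents for it."""
    index = {}
    for build in builds:
        index.setdefault(build['git_commit'], []).append(build)
    return index


def link_jobs_to_builds(jobs, builds):
    """Attach the matching build documents to each boot/test job result."""
    by_commit = index_by_commit(builds)
    return [dict(job, builds=by_commit.get(job['git_commit'], []))
            for job in jobs]
```

In an ES deployment the same join would be expressed as a query filtered on the commit field rather than an in-memory dict.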

Contributor

khilman commented Jan 17, 2019

@chaws OK, right. I forgot that build.py is already publishing build.json, so yeah, it will need to be modified too. Again, this should probably be done in a way that supports more than one additional URL.

@gctucker
Contributor

@gctucker what's your thought on the right way to support multiple callbacks in LAVA jobs in a flexible way?

@khilman I think this should be part of a general architecture to make the system more modular and enable alternatives (i.e. builds not in Jenkins, non-LAVA labs, alternative backends and frontends). It's probably worth a design document in Google docs as something to aim for, I can dump some ideas there to get this started.

@gctucker
Contributor

@khilman While I understand that sending the builds to both the current KernelCI backend and Elasticsearch helps us with this experiment, is there a good reason why we would keep both in a production environment?

It would seem fine to make it possible to configure a KernelCI instance to use either the current backend, or ES, or something else. But is there any use case for a single production instance using more than one?

Contributor

khilman commented Jan 30, 2019

@gctucker yes. It's not just for experiments like ES. The other obvious example is to have production and development backends operating on the same "real" data. To get around this in the past, I've just copied the db from a production backend to a development backend, but that's just a workaround for not being able to send the same data to multiple backends.

That's why I also suggested we should support even more than 2. As the project grows, I think there will always be needs for production, development and experimental backends.

@gctucker
Contributor

OK. I guess it would also help on staging to have multiple backend instances to test separate things being developed in parallel, running with different branches. It would still not strictly be part of the production path though.

One thing that might cause some issues is the storage server. If we need to send the kernel source tarballs, the kernel build artifacts, the rootfs, etc. to several places, then it could start slowing things down as the number of them increases.

So I think we may need to keep them separate, essentially having one or several "storage" backends independently from one or several "results" backends. It's not a trivial thing to do though, especially as the current backend is entangled with the storage part, so probably something to keep in mind for later if we're having issues with scaling and network bandwidth.

chaws added 2 commits January 30, 2019 14:02

  1. This option will allow multiple callbacks to be defined in a yaml file, later to be used when generating test job definitions.

  2. `build.py` was a bit limited due to `getopt`; it now works with `argparse`, which allows more flexible command-line setup. Also, the script now allows notifications to be sent to multiple backends when a new build is complete.
broken_callbacks = []

for c in callbacks:
    if not c.get('url'):


I would check whether all mandatory fields ('expected_fields') are present. There is no point checking any further if a field is missing.

Author


@mwasilew, only the url field would be mandatory. The rest can be optional. If an expected field is not present, it will be set to LAVA's default.


for c in callbacks:
    if not c.get('url'):
        print('Warning: one of the callbacks has no "url" set, removing it from the list')


logging sounds like a better fit, but that's a case for another patch.

Author


Agreed, I could propose some refactoring later on.

[c.pop(k) for k in unknown_fields]

for broken in broken_callbacks:
    callbacks.remove(broken)


log which callbacks were removed?
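Pulling the review suggestions together, the validation could look like the sketch below: drop callbacks missing the mandatory `url`, strip unknown fields, and log everything that gets removed. Names like `KNOWN_FIELDS` and `validate_callbacks` are illustrative, not the actual PR code:

```python
import logging

logger = logging.getLogger(__name__)

# Fields LAVA accepts in a callback block (assumed set, for illustration)
KNOWN_FIELDS = {'url', 'method', 'token', 'dataset', 'content-type'}


def validate_callbacks(callbacks):
    """Return only usable callbacks, logging anything that is dropped."""
    valid = []
    for c in callbacks:
        if not c.get('url'):
            logger.warning("dropping callback with no 'url': %s", c)
            continue
        for field in set(c) - KNOWN_FIELDS:
            logger.warning("dropping unknown field %r from %s", field, c['url'])
            c.pop(field)
        valid.append(c)
    return valid
```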

Author

chaws commented Feb 5, 2019

@khilman, @gctucker I rebased this PR and updated build.py to use argparse instead of getopt; it now accepts multiple URLs (url[:token]) through the command line, which are used to send the build.json file whenever there's a new build in place. Also, lava-v2-jobs-from-api.py was updated to accept a file name where to look for callbacks.

The callbacks file is a YAML file that contains pretty much what LAVA accepts in its callbacks block. The only mandatory field is url.
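A hedged sketch of how argparse can accept repeated url[:token] arguments; the flag name and the split heuristic are assumptions here, not the actual PR code. `rpartition` is used from the right so the scheme colon in `https://...` survives, treating a trailing part containing `/` as still part of the URL (a bare `host:port` URL without a path would confuse this heuristic):

```python
import argparse


def parse_callback(value):
    """Split a 'url[:token]' argument into a (url, token) pair."""
    url, _, token = value.rpartition(':')
    # With no token, rpartition eats the scheme colon, leaving the rest
    # of the URL (which contains '/') in the token slot: undo that.
    if not url or '/' in token:
        return value, None
    return url, token


parser = argparse.ArgumentParser()
parser.add_argument('--callback-url', action='append', type=parse_callback,
                    dest='callbacks', default=[],
                    help='url[:token] of a backend to notify; repeatable')
```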

Contributor

gctucker commented Feb 5, 2019


OK, I guess the file with the ES callback can be on a branch to test on staging as long as it doesn't contain any secrets?

Contributor

khilman commented Feb 5, 2019

@chaws I'd really like to see the build.py argparse changes and new features separated into two separate patches.

@khilman khilman closed this Feb 5, 2019
@khilman khilman reopened this Feb 5, 2019
Author

chaws commented Feb 6, 2019

@khilman yeah, it makes sense. Please check #90. I'll rebase this PR after that one gets merged.

@gctucker I guess it can. I was just unsure how to pass the filename to lava-v2...py. Should it be something like ${WORKSPACE}/callbacks.yaml?

Contributor

gctucker commented Mar 4, 2019

So I think the steps to get this merged are:

  1. test the changes in build.py: changed getopt to argparse #90 on staging Jenkins
  2. get build.py: changed getopt to argparse #90 merged
  3. rebase this PR WIP: send data to experimental ES #68
  4. see how this fits in the modular design proposal and adjust if required
    https://docs.google.com/document/d/15F42HdHTO6NbSL53_iLl77lfe1XQKdWaHAf7XCNkKD8/edit?usp=sharing
    In particular, I think the callbacks config is essentially backend components configuration. I find it more appropriate to put them in a plain configuration format like the existing labs.ini rather than YAML. We could have backends.ini or something like that (the details aren't defined in the design document linked above).

@gctucker
Contributor

@chaws Could you please rebase and submit a new PR in https://github.com/kernelci/kernelci-core as we're shutting down this kernelci-core-staging repo this week?

@gctucker
Contributor

@chaws Please resubmit this PR in https://github.com/kernelci/kernelci-core as we're shutting down kernelci-core-staging today.

cc @khilman

@gctucker gctucker closed this Mar 22, 2019
