
@kparal (Collaborator) commented Feb 1, 2018

This converts the task to the Standard Interface [1] [2]. It's a slight variation of SI, because SI is only concerned with package-specific tasks, not generic tasks (running on all packages). The small differences are described in libtaskotron's readme file (currently in the develop branch). We call such tasks "Taskotron generic tasks". [3]

In a nutshell, runtask.yml is removed and tests.yml (an ansible playbook) is used instead. The task needs to run as root when run through runtask, because that's what SI demands. Of course you can adjust the playbook however you see fit and change how it's run locally (ansible-playbook tests.yml), so it might be possible to have a local way of executing the same task without root.

I had to disable integration testing in the test suite for this very reason (unless you want to run it as root locally, hum). I'm not sure how soon (or whether) we'll have a solution for this in the taskotron runner itself.

One of the few upsides to SI is that you can now run the task pretty much independently on any test system. I.e. you don't need to run it through runtask or any other runner; you run it directly through ansible (on a remote machine, as root, of course), and you see exactly what's happening. The only thing required is a couple of ansible variables, in this case just taskotron_item (and perhaps taskotron_supported_arches).
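
For illustration, a direct run could pass those via an extra-vars file (a sketch only; vars.yml is a made-up name, and the values are just the playbook defaults quoted later in this PR):

```yaml
# vars.yml - hypothetical extra-vars file, used as:
#   ansible-playbook tests.yml -e @vars.yml
taskotron_item: python-gear-0.11.0-1.fc27  # NEVR of the Koji build to test
taskotron_supported_arches:                # arches the task should consider
  - x86_64
```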

I had to add download_rpms.py as a workaround for downloading RPMs. Originally I had written this using just the koji download-build command (called directly from the playbook), but with the recent feature of also downloading build.log files, I had to go back to using libtaskotron's koji directive (the koji command can't download build.log files when called from the command line).

Currently all tasks running on http://taskotron-dev.fedoraproject.org/ need to be based on SI and are pulled from the develop branch, so I had to disable python-versions there until this change is accepted. From my point of view, since SI is the inevitable future, I'd be glad to have this code merged to develop so that we can continue running it and polishing the rough edges. If you'd prefer having it in a differently named branch, our tooling can't really do that right now, but of course everything can be patched with enough effort. (For us, it would be much easier to have this in develop.)

https://taskotron.fedoraproject.org/ continues to run the old-style tasks (with runtask.yml) pulled from the master branch for the moment. I don't have an ETA for using SI tasks on the production server.
(The stg server might be going away soon, so I'm not mentioning it.)

[1] https://fedoraproject.org/wiki/CI/Tests
[2] https://fedoraproject.org/wiki/Changes/InvokingTests
[3] https://pagure.io/taskotron/libtaskotron/blob/develop/f/README.rst

@hroncok self-requested a review on February 1, 2018 14:28
taskotron_generic_task: true
# below are fallback vars for local testing
artifacts: ./artifacts
taskotron_item: python-gear-0.11.0-1.fc27 # you should really override at least this :)
@hroncok (Member):

What does this do? What is python-gear and why does it say we need to change it?

@kparal (Collaborator, author):

That's a default variable value (a random recent koji package with python code in it), so that you can run ansible-playbook tests.yml without studying and adding any variables on the command line, and still see it doing something.

@hroncok (Member):

ok

download_rpms.py Outdated

koji = koji_directive.KojiDirective()

print 'Downloading rpms for %s into %s' % (koji_build, rpmsdir)
@hroncok (Member):

It's 2018 and you've just written a Python 2 only script. Every time you do that, some little part of me dies. We are not going to maintain a Legacy Python only script in our repo. It's already tedious to keep in mind that our code here needs to be Python 2 compatible; this just makes it worse. I suppose this code should be part of taskotron, not our repo, but I'll get to that in a different section.

@kparal (Collaborator, author):

That's because I'm not familiar with Python 3, even in 2018 :-/ This code is a semi-temporary file that I've already copied to several tasks, and I'm definitely not happy about it. It's there because:
a) invoking the koji download-build command directly is not sufficient for this task (you need build.log files)
b) you can't interact with python directly in ansible, so you need a wrapper that accepts cmdline arguments and converts them into python calls which use our koji_directive helper library

In the future, we'd like to have a native ansible module that exposes all the important input arguments, so that you can call it directly and don't need a wrapper.

Of course, if you decide to convert this to Python 2/3 compatible code in the meantime (or have an even better idea), I have no issue with it (quite the opposite).

on a production machine!)::

.. code:: console

    $ ansible-playbook tests.yml -e taskotron_item=<nevr>
@hroncok (Member):

Is this how you run the checks in production now instead of the runtask command? Will this command give the yaml output that we can parse as we did in our integration tests? We want to make sure the tests work in Taskotron, not that some ansible playbook works.

@kparal (Collaborator, author):

Basically, yes. We set up more variables. Here's an example:
http://taskotron-dev.fedoraproject.org/artifacts/all/6ea80e0c-0756-11e8-920b-525400ee7c53/task_output/tests.yml/taskotron/ansible_vars.json.gz

The ResultYAML file will be stored in {{artifacts}}/taskotron/results.yaml, which for local execution defaults to ./artifacts/taskotron/results.yaml. That's exactly the file we will then send to ResultsDB.
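
For a rough idea of what that file contains, a ResultYAML file for this check might look something like this (a sketch; the field values are illustrative, not taken from an actual run):

```yaml
results:
  - item: python-gear-0.11.0-1.fc27   # the tested Koji build (NEVR)
    type: koji_build
    checkname: dist.python-versions
    outcome: PASSED
```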

This is as close as possible to actual taskotron execution, so verifying this is useful, I believe. Of course it's not the same as running through the whole taskotron stack.

@hroncok (Member):

Our integration tests now read the results.yaml file.

artifacts: ./artifacts
taskotron_item: python-gear-0.11.0-1.fc27 # you should really override at least this :)
taskotron_supported_binary_arches:
- x86_64
@hroncok (Member):

Did we just get rid of all the other architectures?

@kparal (Collaborator, author):

Again, that's a default variable value, for local execution when developing/experimenting. It is expected that you override this (and the taskotron stack will do that for you, with the list of arches we want to execute on).

@hroncok (Member):

ok

tests.yml Outdated
- name: Run task
shell: >
./python_versions_check.py {{ taskotron_item }} {{ workdir.path }}
@hroncok (Member):

Can we invoke python2 here directly instead of adding a shebang and executable flag? Would that break something?

@kparal (Collaborator, author):

Shouldn't break anything, I believe. I'm simply used to the current approach, that's why I wrote it that way.
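
For clarity, the suggested change would look roughly like this (a sketch of the same Run task step, with the interpreter named explicitly instead of relying on the shebang):

```yaml
# Sketch: invoke the interpreter explicitly; the script then needs no
# executable bit or shebang line.
- name: Run task
  shell: >
    python2 ./python_versions_check.py {{ taskotron_item }} {{ workdir.path }}
```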

tests.yml Outdated
- block:
- name: Download RPMs from Koji
shell: >
./download_rpms.py {{ taskotron_item }} {{ workdir.path }}
@hroncok (Member):

Instead of our home-brewed script, maybe a taskotron_download_rpms --with_logs command should exist, be maintained upstream, and be used here. How are the other checks doing this?

@kparal (Collaborator, author):

It could, yes. As I replied above, a preferable approach is a native Ansible module. But right now, this is what I have ready.
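
To sketch where this could be heading, such a native module might eventually be used like this (entirely hypothetical; neither the module name nor its parameters exist yet):

```yaml
# Hypothetical future step, once a native Ansible module exists upstream:
- name: Download RPMs from Koji
  taskotron_koji_download:         # made-up module name
    nevr: "{{ taskotron_item }}"   # build to download
    dest: "{{ workdir.path }}"     # where to store the RPMs
    build_logs: true               # also fetch build.log files
```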

tox.ini Outdated
basepython = python3
commands = python -m pytest -v {posargs} test/integration
sitepackages = False
;; disabled during transition to ansiblized Taskotron (Standard Interface support)
@hroncok (Member):

What is the actual reason to disable the tests? If I change the integration tests to run the ansible playbook in mock, would that work?

@hroncok (Member):

On it.

@hroncok (Member):

OK, there is a way. I have some very rough code ready and am testing it.

@kparal (Collaborator, author):

I have no idea :-) The reason I disabled it is that I don't have time ATM to implement running the test suite in docker/mock/something else, and I hoped you either wouldn't care that much (since with the ansible playbook, the taskotron runner is now a very thin wrapper around all of this), or would implement it yourself. Or I might get to it eventually. Just not immediately.

@hroncok (Member):

pushed, let's see what Travis says

dnf:
name: "{{ item }}"
state: latest
with_items:
@hroncok (Member):

What Fedora version is this running on in production?

@kparal (Collaborator, author):

The approach hasn't changed: we pick the image to run on based on the item tested. So in this case we parse the NVR and use the disttag to decide which image to spin up.

Once everything is installed you can run the task on a Koji build
using the
``name-(epoch:)version-release`` (``nevr``) identifier.
To run this task locally, execute the following command as root (don't do this
@hroncok (Member):

This is bad design: it worked before, and now it requires this extra step (getting a system where this can run). Once I set up a mock-based workaround for this, I'll provide a way to run it not only in the integration tests, but also manually.

@kparal (Collaborator, author):

Agreed. I proposed some ideas on how to mitigate this, at least for local execution, in the PR description.

@hroncok (Member):

Added a mock section below.

For example::

.. code:: console

    $ ansible-playbook tests.yml -e taskotron_item=python-gear-0.11.0-1.fc27
@hroncok (Member):

There was a --debug switch to runtask. How do I turn on debug logging now?

@kparal (Collaborator, author) commented Feb 1, 2018:

Currently, when you use runtask --debug, we do two things:

  1. Switch taskotron runner to debug mode
  2. Append -vv to ansible-playbook invocation.

Since the task no longer runs on the same machine and in the same process as the runner, our setting no longer applies to the libtaskotron libraries used inside the task. However, you can set up your own ansible variable and, if it is present, switch your task to debug output (for libtaskotron libraries, simply use logging.getLogger('libtaskotron').setLevel(logging.DEBUG)). Even better, we can expose an ansible variable taskotron_debug: bool and you can react to that. That way the task will show debug output even when run through runtask, and there will be a standard way to toggle this for all tasks (that support it).

That being said, all of this might be of limited use and I'm actually not sure it's needed. Since the task is now run through ansible, you no longer see its standard output (at least not in real time). Instead, you're required to create a {{artifacts}}/test.log file containing the log output of your task. That can always be at debug level, if you want. Or you can create test.log with standard-level output and test.log.debug with debug-level output, every time. It's your preference. We will always archive the whole {{artifacts}} dir. Still, if you prefer a boolean toggle, I can expose it easily.
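
If we did expose such a toggle, a task could react to it with something along these lines (a sketch only; taskotron_debug is merely the proposed variable, and the --debug flag on the script is made up for illustration):

```yaml
# Sketch: pass a (hypothetical) --debug flag to the check script when
# the proposed taskotron_debug variable is set to true.
- name: Run task
  shell: >
    ./python_versions_check.py {{ taskotron_item }} {{ workdir.path }}
    {{ '--debug' if taskotron_debug | default(false) else '' }}
```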

@hroncok (Member):

keep it how it is, thanks for the explanation

tests.yml Outdated
- name: Compute architectures to download
set_fact:
download_arches: "{{ taskotron_supported_arches | join(',') }}"
download_arches: "{{ taskotron_supported_binary_arches | join(',') }}"
@kparal (Collaborator, author):

Thanks. Actually, I messed this up a little when converting from koji download-build to download_rpms.py, so I'll fix it in a follow-up commit.

@hroncok force-pushed the feature/ansiblize branch 2 times, most recently from 37f5cc1 to b1833b5 on February 1, 2018 18:59
@hroncok (Member) commented Feb 1, 2018

Rebased. Integration tests now use mock. The Tests section of the README needs some adjusting, but I don't have enough energy any more.

Thoughts, @irushchyshyn?

@hroncok requested a review from irushchyshyn on February 1, 2018 19:43
@hroncok added the WIP label on Feb 1, 2018
@kparal (Collaborator, author) commented Feb 2, 2018

Awesome work, Miro, thanks. You deserve a 🍪 .

@irushchyshyn (Collaborator):

> Thoughts, @irushchyshyn?

I just took a quick look over the code and the general idea and it looks good. Great work @kparal and @hroncok !

I would like to play with it a little more today and get a better idea of what is going on. I will let you know, but feel free to merge if this is pressing.

files = ['taskotron_python_versions'] + glob.glob('*.py') + ['tests.yml']
mockenv.copy_in(files)
yield mockenv
mockenv.orphanskill()
@hroncok (Member):

I'm not sure if this is the right way of doing this, but I found no way to "kill all processes from mock".

Note that sometimes when I tried to Ctrl+C the tests, it only killed the mock process and the entire pytest thing detached from my stdin, so it kept running in the background and I could not Ctrl+C it any more. This is very inconvenient, so I recommend running the integration tests with --maxfail=1 when running interactively.

print(err, file=sys.stderr) # always print stderr in this case
raise RuntimeError('runtask exited with {}'.format(proc.returncode))
results = parse_results(err)
mock.shell('ansible-playbook tests.yml -e taskotron_item={}'.format(nevr))
@hroncok (Member):

This ends up with a positive exit code when the check FAILs. So I didn't check the result. However, there is failure (of the check) and failure (of the entire circus), and we should check the exit status and try to guess (at least if it's negative, it was killed and we should raise KeyboardInterrupt or something like that).

@hroncok (Member):

Added, it's either 0 or 2.

@kparal (Collaborator, author):

To clarify: the SI spec says the playbook exit code reflects the result of the check - 0 means PASS, non-zero means FAIL. It doesn't have the concept of generic tests. To stay close to the SI spec, we opted (at least for now) for exiting with 0 if all tests pass and non-zero if at least one check fails in our generic tasks. Unfortunately that prevents us from distinguishing a failed test from a crashed test. So, as long as {{artifacts}}/taskotron/results.yaml is present, we consider the test to have finished fine. This is something we should discuss with the SI spec authors in the future.

mock.shell('ansible-playbook tests.yml -e taskotron_item={}'.format(nevr))
mock.copy_out('artifacts', clean_target=True)

with open('artifacts/test.log') as f:
@hroncok (Member):

The log contains all the previous runs; we need to purge it before each test.

@hroncok (Member):

testing this

@kparal (Collaborator, author):

I might be missing something, but test.log should be overwritten in the Download RPMs from Koji ansible step. A possible approach is also to use a different artifacts directory for each integration test suite invocation.

@hroncok (Member):

It contained other stuff. Now I need to preserve a separate artifacts dir for each nevr, but that's fine. It will also make debugging test suite failures easier.

@hroncok (Member) commented Feb 2, 2018:

The test suite actually only worked because it got preserved. So this was clearly a bug in the design of the tests (previously, each runtask run had a different artifact path; now they shared it).

@hroncok (Member):

Pushed. Also, this may allow us to add a --fake switch that will not run the checks but use the stuff from the artifacts directories (e.g. if you are only changing the tests but not the implementation).

@hroncok (Member):

Added that as well.

Previously this only worked by accident

Bonus: We can now inspect the contents of the folder, if there is a failure
*.pyo
*.pyc
__pycache__
/artifacts
@irushchyshyn (Collaborator):

I'm not sure why this gets removed.
A separate directory gets created for each nevr only during the integration tests so far. When running the ansible-playbook ... command directly, the results get saved to artifacts.

@kparal (Collaborator, author):

Are you overriding the artifacts variable on the command line?
ansible-playbook ... -e artifacts=path

@kparal (Collaborator, author):

Never mind, maybe I misunderstood the question. Either way, I also believe it's better to keep this in .gitignore. When somebody runs the playbook with the default vars, that's where the artifacts get created.

@irushchyshyn (Collaborator):

I am not overriding artifacts, I just took the README and tried running it as it says :)
I will add it back.

Besides, while test.log gets overwritten with each subsequent run, output.log keeps previous results (also with default vars).

@hroncok (Member):

ok, add it back, my bad, I hadn't considered direct runs.

@kparal (Collaborator, author) commented Feb 3, 2018:

@irushchyshyn When running it locally, it's your duty to clear the artifacts dir (or not). The task can't know your intended use case. Of course, you can e.g. generate random artifacts dir paths inside the playbook (the same way it generates workdirs), in case the variable is not set externally. It's your task; you can do whatever you prefer for local runs :) Local runs are mostly for development.
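
A sketch of how the playbook could generate such a path when none is supplied (this assumes Ansible's tempfile module and would also require dropping the static artifacts: ./artifacts default from the playbook vars):

```yaml
# Create a unique artifacts dir unless one was passed in via -e artifacts=...
- name: Create a fresh artifacts directory for this run
  tempfile:
    state: directory
    prefix: artifacts_
  register: generated_artifacts
  when: artifacts is not defined

- name: Point the artifacts variable at the generated directory
  set_fact:
    artifacts: "{{ generated_artifacts.path }}"
  when: artifacts is not defined
```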

@irushchyshyn (Collaborator):

@kparal thanks for the explanation.

@irushchyshyn (Collaborator) left a review:

I have added some changes to the README Tests section; otherwise it looks good to me 👍

@irushchyshyn merged commit 569ed88 into develop on Feb 2, 2018
@irushchyshyn (Collaborator):

@kparal thank you for this change!
Merged to develop, let me know if there are any issues enabling it for http://taskotron-dev.fedoraproject.org/

@irushchyshyn deleted the feature/ansiblize branch on February 2, 2018 18:22
@kparal (Collaborator, author) commented Feb 3, 2018

I'll enable it on Monday and let you know if there are any issues. Thanks for the fast responses in this PR.

@kparal (Collaborator, author) commented Feb 5, 2018

It's running on taskotron-dev now once again:
http://taskotron-dev.fedoraproject.org/resultsdb/results?&testcases=dist.python-versions

@hroncok (Member) commented Feb 21, 2018

@kparal Can we merge this to master?

@kparal (Collaborator, author) commented Feb 21, 2018

Unfortunately, not yet. I'm still working on getting the libtaskotron code merged to master.

@hroncok (Member) commented Feb 21, 2018

Do you have any (even rough) estimate? I just need to know whether we need to backport new checks that we want to get running before the Bodhi activation point of F28.

@kparal (Collaborator, author) commented Feb 22, 2018

I'll discuss this with @tflink and get back to you tomorrow.

@kparal (Collaborator, author) commented Feb 22, 2018

The current plan is that we could do the switch in 1-2 weeks, hopefully. Does that work for you?

@hroncok (Member) commented Feb 22, 2018

Doing this before 2018-03-06 (Bodhi activation point) would be perfect, but 2 weeks is fine. Thanks.

@hroncok (Member) commented Mar 6, 2018

For whoever is reading this: today is the Bodhi activation point, yet this was already deployed via #55.
