
Fixtures: Add argument use_subprocess to run_cli_command #5846

Merged

Conversation


@sphuber sphuber commented Dec 19, 2022

Fixes #5843

Release v2.2.0 was badly broken because of a bug introduced in the PsqlDosBackend that was not caught by any of the tests. The problem was that changes made by verdi commands were no longer automatically committed and persisted to the database.

This went unnoticed because the unit tests run commands through the click.testing.CliRunner utility, which suffers from a number of problems. We already knew that it only executes the (sub)command passed to the runner, skipping any code that is part of its parent commands. The run_cli_command fixture already had to compensate for this, for example by manually adding the verbosity option to each command, which is normally added automatically by the verdi top-level command when invoking through the command line. The v2.2.0 release highlighted another problem: the tests make their assertions against the state of the test's own interpreter, which contains the correct changes after the command has executed. Since this does not require the changes to actually be persisted, the tests did not notice the problem.

To ensure that tests execute the exact same pathway as a real invocation over the command line, the run_cli_command fixture was updated with the keyword use_subprocess, which, when set to True, invokes the command as an actual subprocess using the subprocess module. This forces the command to be tested as in a real-life situation.

Running the tests in this mode caused a number of them to fail. Some of these failures were due to unrelated bugs that had been hidden by the use of the CliRunner, but 32 tests failed because of the bug in the PsqlDosBackend. Had these tests been run with use_subprocess=True before the release, we would have caught the bug before shipping it with v2.2.0.

Ideally we would always run all tests with use_subprocess=True, but this is quite a bit slower and would add roughly another 30 minutes to the unit test suite, which is unacceptable. Therefore, I went through a procedure to determine exactly which tests fail because of the session bug and run only those with use_subprocess=True. This should give us a lot more confidence that we will catch similar bugs in future PRs.

The PR consists of the following commits (see commit messages for more details):

  1. Refactor the run_cli_command fixture to add the use_subprocess keyword and set True as the default.
  2. Switch certain tests to use_subprocess=False because they use monkeypatching or other strategies that change things only in the current interpreter. They would fail when using a subprocess, since that runs in a different interpreter where the monkeypatch is not applied. Similarly, certain tests are run with suppress_warnings=True, since their assertions strictly check output and the printing of deprecation warnings would cause them to fail. This was not a problem with the test runner, because in that case the warning has already been printed at test startup and is not printed again.
  3. Fix a number of minor bugs in the tests that were not related to the PsqlDosBackend bug, but rather to how defaults are processed by the CliRunner and when running as a subprocess.
  4. Fix the use_subprocess=True pathway for the verdi data commands: it tries to determine the full command line arguments from the click.Command, which failed for these commands because their actual name is not taken from the command definition but from the entry point with which they are registered.
  5. Change the default for use_subprocess to False, but set use_subprocess=True for all 32 tests that failed because of the PsqlDosBackend session bug.

Note that when these 5 commits are run on top of v2.2.0, there are 32 tests that fail because of the session bug (and 2 that hang for the same reason). This branch, however, puts these commits on top of main, where the session bug has already been reverted, and so it should pass. The branch with the broken tests can be found here: https://github.com/sphuber/aiida-core/tree/feature/5843/run-cli-command-subprocess-broken
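To illustrate the resulting interface, here is a hypothetical test sketch (the command, parameters and assertions are made up for illustration; only the `use_subprocess` and `suppress_warnings` keywords are what this PR adds, and the exact fixture signature may differ):

```python
def test_group_delete_persists_changes(run_cli_command):
    """Hypothetical test: run the command in an actual subprocess so it only
    passes if the changes were really committed to the database."""
    from aiida.cmdline.commands import cmd_group  # assumed location of the ``verdi group`` commands

    result = run_cli_command(
        cmd_group.group_delete,           # the click.Command to run
        ['some-group-label', '--force'],  # CLI parameters (illustrative)
        use_subprocess=True,              # invoke ``verdi`` as a real subprocess
        suppress_warnings=True,           # keep deprecation warnings out of strict output checks
    )
    assert 'Success' in result.output     # illustrative assertion
```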


ltalirz commented Dec 20, 2022

If you use this in the node deletion test you mentioned in #5842, does that test now fail?


sphuber commented Dec 20, 2022

If you use this in the node deletion test you mentioned in #5842, does that test now fail?

It does indeed. The fix I added in #5848 also makes them pass again. Just need to figure out why the fix makes other tests fail.

@sphuber sphuber force-pushed the feature/5843/run-cli-command-subprocess branch 17 times, most recently from cccbeec to 429fae3 on December 24, 2022 08:42

ltalirz commented Dec 26, 2022

Thanks a lot @sphuber for the detailed investigation and the fix!

One question before I go through this:

Ideally we would always run all tests with use_subprocess=True, but this is quite a bit slower and would add roughly another 30 minutes to the unit test suite

I understand that running the cli tests in a subprocess is the safest way to catch any "oddities" that may be overlooked due to running through the cli runner. I think it was the right strategy to proceed here, and I understand it did catch some unrelated bugs as well.

Now that we know how costly this operation is, I wonder: for the bugs related to changes not being persisted in the database, would there also be a "cheaper" way to detect them?
E.g. could the cli_runner try a rollback before returning?


sphuber commented Dec 27, 2022

Now that we know how costly this operation is, I wonder: for the bugs related to changes not being persisted in the database, would there also be a "cheaper" way to detect them?
E.g. could the cli_runner try a rollback before returning?

I might not be understanding properly what you are trying to suggest here, but I don't think this would work. When you roll back the session, you are undoing all the changes. How would that help detect the problem?

Just to make sure the problem is clear: if we take the three A's of testing (arrange, act, assert), when we "act" with the runner, the assertion is made using the same session, and so it will pass regardless of whether the changes have been persisted (flushed and committed) to the database or not. You could force the test runner to flush and commit the session after it "acts" and before the "assert" runs, but that would hide the very problem we just faced, where the code does not persist its changes properly.
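To make the point concrete, here is a small self-contained illustration in plain SQLAlchemy (not aiida-core code) of why an assertion made through the same session passes even though nothing was ever committed:

```python
from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

# Arrange: create and commit a row.
with Session(engine) as session:
    session.add(Node())
    session.commit()

# Act: delete the row but never commit, mimicking the PsqlDosBackend bug.
with Session(engine) as acting_session:
    acting_session.delete(acting_session.query(Node).one())
    # Assert through the same session: passes, because the pending (flushed
    # but uncommitted) delete is visible inside this transaction.
    assert acting_session.query(Node).count() == 0

# A fresh session (analogous to a separate subprocess) still sees the row,
# because the uncommitted change was rolled back when the session closed.
with Session(engine) as fresh_session:
    assert fresh_session.query(Node).count() == 1
```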

@sphuber sphuber force-pushed the feature/5843/run-cli-command-subprocess branch from 429fae3 to 9ffaf41 on March 8, 2023 09:19
@sphuber sphuber force-pushed the feature/5843/run-cli-command-subprocess branch 5 times, most recently from ee5d4af to baeeea9 on March 8, 2023 14:36
@sphuber sphuber force-pushed the feature/5843/run-cli-command-subprocess branch 2 times, most recently from f403864 to c6179f5 on March 8, 2023 18:10

sphuber commented Mar 8, 2023

@ltalirz This is now ready for review. I just have to take a final look at whether some of the tests that use use_subprocess=True can be switched to running with False. There are a few commands whose tests now take more than a minute, since they include multiple invocations of the CLI. But the main logic can already be reviewed.

ltalirz previously approved these changes Mar 12, 2023

@ltalirz ltalirz left a comment


thanks @sphuber, this looks good to me.

nice to see the (test) bugs that were caught by this investigation!

just minor comments, no further review required

return self.stderr_bytes.decode('utf-8', 'replace').replace('\r\n', '\n')

@property
def output_lines(self) -> list[str]:

output_lines only returns stdout, while output returns stdout+stderr, which is likely to confuse users of the class.

I'd suggest reusing .output in .output_lines
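One way to follow this suggestion (a sketch, not necessarily the implementation that lands in the PR):

```python
@property
def output_lines(self) -> list[str]:
    """Return the output as a list of lines, reusing ``self.output`` so the two properties stay consistent."""
    return self.output.splitlines()
```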

:param use_subprocess: Boolean, if ``True``, runs the command in a subprocess, otherwise it is run in the same
interpreter using :class:`click.testing.CliRunner`. The advantage of running in a subprocess is that it
simulates exactly what a user would invoke through the CLI. The test runner provided by ``click`` invokes
commands in a way that is not always a 100% analogous to an actual CLI call and so tests may not cover the
it may be worth giving the actual example here that triggered this need, i.e. that an ORM object may be in the session but not committed to the database, causing existence checks to be ineffective

tested, then a subprocess cannot be used, since the monkeypatch only applies to the current interpreter. In
these cases it is necessary to set ``use_subprocesses = False``.
:param suppress_warnings: Boolean, if ``True``, the ``PYTHONWARNINGS`` environment variable is set to
``ignore::Warning``. This is important when testing a command using ``use_subprocess = True`` and the check

Reading this, it seems to me that, when running via a subprocess, we may want to suppress warnings by default (e.g. it could be that some random new deprecation warning starts popping up that breaks tests where people forgot to set suppress_warnings=True).

On the other hand, you probably don't want to change the default behavior for the regular tests.

Fine to keep; I don't have a better solution

Comment on lines 521 to 522
command_path = cli_command_map[command]
args = command_path[:1] + ['-p', aiida_profile.name] + command_path[1:] + parameters

might make sense to move this into the run_cli_command_subprocess function (and add the aiida profile name as a function argument), so it becomes more parallel to run_cli_command_runner
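A rough sketch of what that refactor could look like (the function name is taken from the comment above; the signature and body are assumptions, not the code in this PR):

```python
import os
import subprocess

def run_cli_command_subprocess(command_path, parameters, profile_name, suppress_warnings=False):
    """Run the command as an actual ``verdi`` subprocess, injecting the profile option after ``verdi``."""
    args = command_path[:1] + ['-p', profile_name] + command_path[1:] + list(parameters)
    env = os.environ.copy()
    if suppress_warnings:
        env['PYTHONWARNINGS'] = 'ignore::Warning'  # suppress warnings in the child interpreter
    return subprocess.run(args, env=env, capture_output=True, check=False)
```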

The `verdi data` subcommands, which are pluggable through the
`aiida.cmdline.data` entry point group, get their actual name from the
corresponding entry point name, not the name that is declared in the
code. However, the `cli_command_map` fixture gets the name from the code
and not the entry point name. A mismatch in these names will cause the
`run_cli_command` fixture to fail in the case of `use_subprocess=True`,
so we correct the command names to match the entry point.
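A hypothetical sketch of such a correction (the real fixture may implement this differently; it assumes `cli_command_map` maps a `click.Command` to its list of command-line words and uses the Python 3.10+ `importlib.metadata` API):

```python
from importlib.metadata import entry_points

def fix_data_command_names(cli_command_map):
    """Replace the declared name of each ``verdi data`` subcommand with its entry point name."""
    for entry_point in entry_points(group='aiida.cmdline.data'):
        command = entry_point.load()
        command_path = cli_command_map.get(command)
        if command_path is not None:
            command_path[-1] = entry_point.name  # use the registered name, not the one declared in code
    return cli_command_map
```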
By default the `run_cli_command` fixture will run the provided command
through the `CliRunner` utility provided by `click`. However, this
suffers from a problem: the code pathway followed through the runner
is not identical to the one followed when the command is actually
called through the CLI.

The reason is that in `click` commands do not have a reference to their
parents, and so when a subcommand is invoked through the runner, only
the code of the subcommand is executed. Any code that is part of its
parents, such as parameter validation and callbacks, is skipped. In
addition, the runner is executed in the same interpreter as where the
test suite is being run, whereas a command invoked through the actual
CLI runs in a standalone interpreter. This may hide subtle bugs.

Therefore, the `use_subprocess` argument is added, which is `False` by
default. When set to `True`, the command is not called through the
runner, but as an actual subprocess using `subprocess.run`. This
guarantees that the tested pathway is identical to what an actual
invocation through the command line would be. It is not enabled by
default since running through a subprocess is slower and not always
necessary.
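
The limitation described above can be reproduced with plain `click` (a self-contained illustration, not aiida-core code):

```python
import click
from click.testing import CliRunner

@click.group()
@click.option('-v', '--verbosity', default='REPORT')
def cli(verbosity):
    """Stand-in for the ``verdi`` top-level command."""
    click.echo(f'parent callback ran with verbosity={verbosity}')

@cli.command()
def status():
    """Stand-in for a subcommand such as ``verdi status``."""
    click.echo('status ran')

runner = CliRunner()

# Invoking the subcommand object directly, as unit tests often do, skips the parent entirely.
result = runner.invoke(status)
assert 'parent callback' not in result.output

# Invoking through the group, like a real CLI call, executes the parent callback as well.
result = runner.invoke(cli, ['status'])
assert 'parent callback ran' in result.output
```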
Certain command line tests cannot be run with `use_subprocess=True`.
Examples are tests that use the `monkeypatch` fixture to patch certain
resources. If the command were to be executed in a subprocess, the patch
would no longer apply since it is only active in the current interpreter.
For those tests, the `use_subprocess` keyword of `run_cli_command` needs
to be set to `False`.

Also, tests that use the `empty_config` or `config_with_profile_factory`
fixtures need to run with `use_subprocess=False`, since these fixtures
generate a temporary config that would not be used by the subprocess,
which uses the config at the `AIIDA_PATH` environment variable.

Then there are tests whose assertions very strictly check the output. If
the command is executed through a subprocess, any deprecation warnings
that are hit will be printed and the test might fail. This is not a
problem when running through the test runner because then the
deprecation warning will already have been emitted when the test suite
loaded and won't be fired another time when the test code is executed.

For this purpose the `suppress_warnings` argument is added to the
`run_cli_command` fixture. It is `False` by default, but when set to
`True`, the `PYTHONWARNINGS` environment variable is set to
`ignore::Warning`, which should cause all warnings to be suppressed.

The addition of `use_subprocess=True` to the `run_cli_command` fixture,
to actually run the command as a subprocess instead of through the test
runner in the same interpreter, revealed a number of small bugs in the
tests themselves that would cause them to fail. They are fixed such that
the tests pass regardless of the value of `use_subprocess`.

The `run_cli_command` fixture was updated a few commits ago to allow
running the command as a subprocess instead of through the test runner.
The reason is that the test runner does not execute the exact same
pathway as when run as a subprocess and this was hiding subtle bugs.

Specifically, a bug in the `PsqlDosBackend`, where changes made by a CLI
command were not automatically persisted to the database, was not caught
by the tests: when running in the test runner, the assertions do not
explicitly check that changes are persisted, just that they are present,
even if only in memory.

Ideally, all tests are run with `use_subprocess=True`, which was set as
the default originally, but unfortunately this is quite a bit slower
than running with the test runner and would make the test suite almost
30 minutes slower. Instead, the tests that failed because of the storage
backend bug are now explicitly marked as using `use_subprocess=True` and
the default is changed. This now guarantees that at least these tests
that are known to depend on the subprocess pathway are properly run in
that way. This should provide greater coverage to prevent these types of
bugs in the future.

An exception is made for the tests of `verdi process list` and a number
of subcommands of `verdi data` (mostly the list commands, but also some
of the import and export commands). These all failed for the same
reason: the group that was created in the setup of the test was not
persisted and so was not found when the command was run in a
subprocess. Since these tests are numerous and take multiple minutes to
run, and the same behavior is already explicitly tested in the
`verdi group` tests, they are run with `use_subprocess=False` to
prevent the test suite from taking too much time.

@sphuber sphuber force-pushed the feature/5843/run-cli-command-subprocess branch from c6179f5 to 2f92964 on March 12, 2023 14:10
@sphuber sphuber merged commit 49bfa0b into aiidateam:main Mar 12, 2023
@sphuber sphuber deleted the feature/5843/run-cli-command-subprocess branch March 12, 2023 15:20

Successfully merging this pull request may close these issues.

Add option to run_cli_command fixture to run command through subprocess instead of click.TestRunner