Avocado logging cleanup #5645
Conversation
Great job @richtja, will do so tomorrow and let you know!
Hi @richtja I am still getting logging messages on stdout, together with the test statuses like: but this is because I have done the logging adjustment as explained in #5587 (comment). Without this logging adjustment I don't see the messages, and now it seems at least they appear in the debug log (they did not appear before this PR) but are marked as [stderr] instead of [stdlog] like Perhaps something else should be adjusted too so that wanted (for additional test debugging) DEBUG messages like these don't end up marked as errors?
Hmmm, but it seems the job.log file's content is now completely missing and replaced by mere test statuses. Is this intentional? If so, has it been discussed with the community? The test statuses I see in the job.log right now are also seen on stdout, but at least on stdout there are also clear error messages after each status, and the same statuses can also be found in multiple result formats like tap, json, xml, etc. So this replacement of useful avocado plugin messages used for debugging various plugins with yet another source of status messages seems unintentional. Is this the case?
All the useful messages from our own scheduler are gone, and at present I cannot debug it with this branch, so I will have to roll back some of my use of it right now and request changes here.
Hi @pevogam thank you for testing this. I was able to reproduce it, I am looking into that.
Before this PR, the job.log file had job-related logs, avocado internal logs, and all test logs. The occurrence of internal logs and test-related logs in this file was unintentional and has therefore been removed in this PR. We discussed the separation of test logs and job logs here. I am not sure which logs you are missing, because this PR only removes avocado internal logs and duplicates of test-result logs.
You should be able to see the logs with
I am not sure if I could provide an example, but to be precise - I understand that any job-related plugins (including test loaders, schedulers, command line parsing) would emit to the job log, and I think it was meaningful, as these log messages won't make any sense in any test-specific log file.
Hmmm, I participated in this discussion, and I wasn't left with the impression that job-related logging would be replaced with yet another stdout and results duplicate.
Our Avocado I2N plugin has its own scheduler and test loading process and all these messages (info and even debug level) that were related to the job are completely missing now - not even present in the newly found

log = logging.getLogger('avocado.test.' + __name__)
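As an aside, the `log = logging.getLogger('avocado.test.' + __name__)` pattern mentioned above relies on the standard Python logger hierarchy: records from a child of `avocado.test` propagate up to whatever handlers are attached to that stream. A stdlib-only sketch of this (the list handler is just a stand-in for avocado's real file handler, not avocado code):

```python
import logging

records = []

class ListHandler(logging.Handler):
    """Stand-in for the handler avocado attaches to the avocado.test stream."""
    def emit(self, record):
        records.append(record.getMessage())

test_stream = logging.getLogger("avocado.test")
test_stream.setLevel(logging.DEBUG)
test_stream.addHandler(ListHandler())

# The per-module child logger pattern from the comment above
# ("my_plugin" is a placeholder module name):
log = logging.getLogger("avocado.test.my_plugin")
log.debug("scheduling test A before test B")

print(records)
```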
Do you think that the
IMO, those are internal logs and might be distracting during debugging tests, but you can still store them with
I am not sure what you mean here, because job-related logging wasn't removed. Removed were
No, maybe this is the main misunderstanding here. This PR is splitting the avocado test and job logs. It creates the
I see, this explains the change then. Indeed the change to a separate
Indeed the test runner focus is mostly on debugging tests but imagine a case where the same user is interested in information on "how the job ran". For instance, they might be interested in test loading information ("why wasn't test A included in the current test set"), scheduling information ("why did test A run before test B"), reusable dependency information ("how did we prepare the preconditions for test A"), job cleanup and sync information ("was the dependency synced across the cluster of hosts and containers"), and so on. Perhaps avocado's state machine is not very informative in that regard and hence the aforementioned decision, but plugins like ours strive to provide these details and consider the job log the most natural place for them, as it is comparable in style and purpose to the separate test debug logs. So additional job information would not be distracting in such cases and would actually add information of value to the test runner (user) as well.
Hi @richtja ,
Thanks for your work here. This is definitely a little can of worms, and one must be very brave to open it.
I've found a few issues and have questions about the goals and working state of the commits, etc. Please let me know if my comments make sense.
Other than that, while hoping to make sense of the current logging code and trying to simplify it a bit more, I came up with #5652. It's not intended to conflict with this work; it is just one extra simplification.
I understand that those logs are useful in some use cases, but IMO this shouldn't be the default behavior. And if you want to have those logs, you should use From what you are saying, maybe the
Not just in some use cases but in all use cases involving running a job with tests for our plugin - their order, dependencies, scheduling logic, and all else matter there and vary from job to job. But as I mentioned, Avocado likely doesn't include much information of this kind to begin with, and as you suggested above, moving these messages to the job stream is a good enough solution to have them in the job log.
I assume currently by default
Force-pushed: compare b6df754 to aa2ae11
For now, we were using the `avocado.test` logging stream both for job logs and for test logs. This has been creating confusion since we have been splitting the logs into separate files. This change introduces the `avocado.job` logging stream for job-specific logs, and `avocado.test` will be used only in runners for test-specific logs. Signed-off-by: Jan Richter <jarichte@redhat.com>
We have predefined streams which can be used by the `--show` option to show the user predefined outputs. Let's add a dictionary which will translate these options to real avocado logging streams. This will make manipulation of those streams and options much easier. Signed-off-by: Jan Richter <jarichte@redhat.com>
For some time we had to support avocado legacy logging together with nrunner logging, which made the avocado logging configuration too complicated. Now that the legacy code has been removed, we can focus only on nrunner logging. In nrunner logging, all loggers have to be in the 'avocado' namespace, and all logs in the 'avocado' namespace which are not test specific will be stored in the job.log file. All test-specific logs have to be under the 'avocado.test' namespace and will be handled by the nrunner messages mechanism. Because of these rules, the avocado.core.output package only has to take care of early logging and console output, which simplifies things and lets us clean the avocado.core.output package of legacy code. After this change, the job.log file will have only logs under the `avocado.job` namespace and the test-result/*/debug.log file will have only logs related to one specific test under the `avocado.test` namespace. Reference: avocado-framework#4121 Signed-off-by: Jan Richter <jarichte@redhat.com>
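The namespace split described in this commit message can be sketched with plain stdlib logging; the handlers below are stand-ins for the job.log and per-test debug.log file handlers, not avocado's actual code:

```python
import logging

job_records, test_records = [], []

class Collect(logging.Handler):
    """Stand-in for a file handler (job.log or test-results/*/debug.log)."""
    def __init__(self, sink):
        super().__init__()
        self.sink = sink
    def emit(self, record):
        self.sink.append(record.getMessage())

job = logging.getLogger("avocado.job")    # job-specific logs
test = logging.getLogger("avocado.test")  # test-specific logs
job.setLevel(logging.DEBUG)
test.setLevel(logging.DEBUG)
job.addHandler(Collect(job_records))
test.addHandler(Collect(test_records))

# "avocado.job" and "avocado.test" are sibling namespaces, so neither
# stream's records leak into the other's handler:
job.info("resolver found 2 tests")
test.info("test 1 started")

print(job_records, test_records)
```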
This adds the possibility to use any avocado stream in the `--store-logging-stream` option. Until now, we only supported separation of streams by tests, which means that the stored streams were stored in the `test-result` directory for each test where they were used. Now, the `--store-logging-stream` option adds one more file to the job_log directory for each stream, which is a combination of that stream from each test. This will be useful when you don't want to switch between files and want to have everything in one place during debugging. Since the job.log file doesn't contain the `avocado.core` stream anymore, the `--store-logging-stream` option is the only way to store avocado internal logs. This also adds a new default `avocado` value to the `--store-logging-stream` option which will create a `full.log` file in the job_log directory with all logs created during the job run. Signed-off-by: Jan Richter <jarichte@redhat.com>
This commit adds support for logs which are not under the `avocado.*` namespace. This will be useful when tests use external libraries with their own logging namespaces. Since avocado doesn't manipulate the root logger and stores logs only from the `avocado.*` namespace, users have to specify the external namespaces they want to store as well. For this, they have to use the `--store-logging-stream` option. Reference: Signed-off-by: Jan Richter <jarichte@redhat.com>
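The constraint described here — that avocado only sees the `avocado.*` namespace unless told otherwise — follows from the stdlib logger tree: a handler attached under `avocado` never receives records from a logger rooted elsewhere, such as `qemu.*`. A sketch with illustrative names:

```python
import logging

seen = []

class Collect(logging.Handler):
    """Stand-in for a handler avocado attaches under its own namespace."""
    def emit(self, record):
        seen.append(record.name)

avocado_root = logging.getLogger("avocado")
avocado_root.setLevel(logging.DEBUG)
avocado_root.addHandler(Collect())

# An external library namespace, enabled but never attached to avocado:
logging.getLogger("qemu").setLevel(logging.DEBUG)

logging.getLogger("avocado.test.foo").info("reaches the avocado handler")
logging.getLogger("qemu.machine").info("propagates to root, not to avocado")

print(seen)
```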
Force-pushed: compare aa2ae11 to a628696
Hi @pevogam and @clebergnu, thank you for your reviews. I fixed the issues you found and updated the documentation to better describe the logging changes. I also added support for external logging. I am moving this out of the draft state, so when you have time, please take a look. Thank you.
Definitely good improvements @richtja, I left some more comments from testing this out and looking through the most recent changes.
logging.getLogger("avocado").level = logging.DEBUG

CONFIG = []

def split_loggers_and_levels(loggers):
    """Separates logger names and legger leves.
logger levels
"app": "avocado.app",
"test": "avocado.test",
"job": "avocado.job",
"debug": "avocado.app.debug",
Why is there a separate namespace for the app debug level instead of using the app namespace with debug level of verbosity?
Thanks for spotting this; this was removed in #5652, which was introduced in parallel with this PR, and I wrongly resolved the conflicts. I will fix it.
.. note:: You have to specify separated logging streams. You can't use the
   built-in streams in this function.
   built-in streams in this function.
What is the reason for the trailing empty space? Perhaps this should have been caught by the linters?
Good catch. Yes, we can think about adding a linter check for this.
:full.log: All generated logs in one job run. It contains all logs, from
           avocado internals to logs from all tests. If you run your tests
           in parallel, the logs might overlap since they are not
           sorted and are stored at the time they are created.
I assume this could be disabled by default? If not then I assume this can still be disabled for those of us with large logs that don't want duplication of the same data?
I discussed this with @clebergnu and we decided to enable this by default, because users might forget to enable it, lose some important logs, and then reproducing the failure and gathering logs might be difficult.
But I can add an option for disabling this. @clebergnu what do you think?
But if we have set the relevant stream-storing switches, won't full.log contain the same as the separate log files, just put in one? In other words, no logging will be lost if users forget to enable it; it will just be in separate logs.
@clebergnu Could you motivate this decision? IMO the full.log is next to useless when it comes to parallel runs where everything is jumbled together and I thought the future direction for avocado is towards parallelism and away from serial runs. Isn't full.log just a remnant of serial runs then? Why is it provided by default?
My hunch is that users would complain louder about the lack of an all-in-one log file, than they would about the extra storage spent. But, another characteristic of what I valued about a "full log file", would be to have logs from non-avocado sources. See here what I mean by it.
One real-world use case is the QEMU tests, that happen to use loggers named after their own project, that is, qemu.*. Those logs are critical to understanding the outcome of tests.
We can require users to manually register all loggers they need, or have a default that will persist them by default in a full.log kind of file. The same logic applies to per-test logging, so my point was not just about a job-wide full log file.
While I completely agree - we have also lost messages critical to understanding our own tests (in experimental branches), which is why I kept advocating for a nice and standard way to handle such libraries - I am not really sure such a full log file will be tractable with ten or more parallel workers. How do you intend to separate the external library qemu logs of one test from another?
The same logic applies to per-test logging, so my point was not just about a job-wide full log file.
Alright, but the job-wide full log file is still going to be almost unreadable in terms of the order of events logged in parallel.
We can require users to manually register all loggers they need, or have a default that will persist them by default in a full.log kind of file.
I also don't understand the need for users to manually register all loggers, since IMO the default test behavior should always be verbose and easy to debug instead of requiring manual switches. Especially since the larger verbosity can easily be grepped or filtered out and doesn't carry much of a performance penalty. It would make more sense to me if users manually disabled some stream from some namespace for being too verbose, or even just tweaked its logging level, instead of having to re-enable all the streams that are harder to regenerate than to filter out.
We can require users to manually register all loggers they need, or have a default that will persist them by default in a full.log kind of file.
I also don't understand the need for users to manually register all loggers, since IMO the default test behavior should always be verbose and easy to debug instead of requiring manual switches. Especially since the larger verbosity can easily be grepped or filtered out and doesn't carry much of a performance penalty. It would make more sense to me if users manually disabled some stream from some namespace for being too verbose, or even just tweaked its logging level, instead of having to re-enable all the streams that are harder to regenerate than to filter out.
"the default test behavior should always be verbose and easy to debug instead of requiring manual switches to do so" is actually the point I tried to make here. Again using the QEMU tests, it would not be convenient for people running the tests, to have to create a command line or configuration file that would enable the logging of qemu.*, which is critical to understanding the test outcome, but it's not generated directly by the tests.
And your next point, "It would make more sense to me if users manually disable some stream", is also what I've tried to communicate. That's why I'm advocating for the root logger to have a handler that will persist everything by default in a full.log kind of file.
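The root-logger idea advocated here can also be sketched with the stdlib alone: one handler on the root logger receives everything, including loggers avocado was never told about — a `full.log`-style sink (the collector and names below are illustrative, not avocado's implementation):

```python
import logging

captured = []

class Collect(logging.Handler):
    """Stand-in for a full.log-style handler on the root logger."""
    def emit(self, record):
        captured.append(record.name)

root = logging.getLogger()
root.setLevel(logging.DEBUG)
root.addHandler(Collect())

# Both a known avocado stream and an unregistered external one land in
# the same sink without any explicit registration:
logging.getLogger("avocado.job").info("job message")
logging.getLogger("qemu.machine").info("external library message")

print(captured)
```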
I guess the discussion is now mainly on whether a job level full.log file makes sense (assuming the test level does make sense). I'll concede gladly to a compromise of having the test-level by default, and not having a job-level by default, because my concern is really with users being able to understand what really happened to a test, and not the order of events between (possible parallel) tests.
I also don't understand the need for users to manually register all loggers, since IMO the default test behavior should always be verbose and easy to debug instead of requiring manual switches. Especially since the larger verbosity can easily be grepped or filtered out and doesn't carry much of a performance penalty. It would make more sense to me if users manually disabled some stream from some namespace for being too verbose, or even just tweaked its logging level, instead of having to re-enable all the streams that are harder to regenerate than to filter out.
"the default test behavior should always be verbose and easy to debug instead of requiring manual switches to do so" is actually the point I tried to make here. Again using the QEMU tests, it would not be convenient for people running the tests to have to create a command line or configuration file that would enable the logging of qemu.*, which is critical to understanding the test outcome, but is not generated directly by the tests.
Exactly, but my point is somewhat bolder here: these store-logging-stream switches are not needed in the first place. Right now we take as the default not to store these streams unless explicitly mentioned, and then try to cancel out that default decision by introducing a full log file which might not have been needed in the first place. So I share your opinion about what needs to be done but believe it could be done better. Because the same logging is just as critical for me, right now I am left with two choices: either sift through a big and possibly scrambled full log file, or manually restore each external library I need in order to understand the tests.
I guess the discussion is now mainly on whether a job level full.log file makes sense (assuming the test level does make sense). I'll concede gladly to a compromise of having the test-level by default, and not having a job-level by default, because my concern is really with users being able to understand what really happened to a test, and not the order of events between (possible parallel) tests.
Indeed, this is perhaps a better partial solution of the above even though the nomenclature is not really ideal: debug.log sounds to me like it should contain everything or at least the names themselves don't clearly differentiate between debug.log and full.log. Perhaps avocado.log and full.log?
Or even better: test.log and job.log for the not-everything and full.log for everything.
class MatplotlibTest(Test):
    def test(self):
        logging.getLogger("matplotlib").level = logging.DEBUG
Hmmm, but something like this doesn't work on my side and I need this to make it work:

logging.getLogger("matplotlib").parent = logging.getLogger("avocado.test")

In this way the matplotlib logger is told to delegate its messages to the test namespace and the debug.log. Is this correct or am I missing something? On my side, if I don't use the above line the output will appear in the [stderr] stream even with the most up-to-date changes here so far. Same as before:
[stderr] DEBUG:guibot.finder:Next best match is not acceptable
Could it be related to Avocado VT doing some additional logging manipulation?
The messages from your test also show in the job log, not sure if this is intended.
Line 11 of this example:

logging.getLogger("matplotlib").level = logging.DEBUG

only enables logging in matplotlib and is not necessary for avocado. If your library is already generating logs, you don't need to make any changes. But what you have to do is tell avocado to store those logs with the --store-logging-stream=matplotlib option
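The distinction being drawn here is a stdlib one: setting a logger's level only lowers the emission threshold; some handler up the hierarchy still has to consume the records, which is what the `--store-logging-stream` option arranges on avocado's side. A minimal illustration (no avocado involved):

```python
import logging

lib = logging.getLogger("matplotlib")
lib.setLevel(logging.DEBUG)

# The level is now low enough for DEBUG records to be created...
print(lib.isEnabledFor(logging.DEBUG))

# ...but this logger itself has no handlers; without a handler somewhere
# up the tree, the records still go nowhere.
print(lib.handlers)
```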
Hmmm but something like this doesn't work on my side and I need this to make it work:

logging.getLogger("matplotlib").parent = logging.getLogger("avocado.test")

In this way the matplotlib logger is told to delegate its messages to the test namespace and the debug.log. Is this correct or am I missing something? On my side if I don't use the above line the output will appear in the [stderr] stream even with the most up-to-date changes here so far. Same as before:

[stderr] DEBUG:guibot.finder:Next best match is not acceptable

Could it be related to Avocado VT doing some additional logging manipulation?
If some messages are in [stderr] then the test itself sends the logs to stderr; maybe avocado-vt or guibot is doing that. Can you please provide an easy reproducer with guibot as a test? Thank you.
The messages from your test also show in the job log, not sure if this is intended.
When I run avocado run --store-logging-stream=matplotlib examples/tests/external_logging_stream.py I can't see any logs in the job log. Can you provide a reproducer for this as well? Thank you.
Since I need to have these logging streams permanently, and thus in a config file, does something like this make sense:
[root@c4 ~]# cat /etc/avocado/avocado.conf
[job.run]
store_logging_stream = avocado:DEBUG, aexpect:DEBUG, guibot:DEBUG
I cannot find any example using a config file in the documentation, examples, or blueprints, and the above produces an error that might be caused by me in some way:
File "/mnt/local/intra2net-qa/tp_i2n/tools/cli/run_api/suite.py", line 98, in run_suite
from avocado.core.app import AvocadoApp
File "/usr/lib/python3.10/site-packages/avocado/__init__.py", line 36, in <module>
settings.merge_with_configs()
File "/usr/lib/python3.10/site-packages/avocado/core/settings.py", line 576, in merge_with_configs
self.update_option(namespace, value, convert=True)
File "/usr/lib/python3.10/site-packages/avocado/core/settings.py", line 776, in update_option
self._namespaces[namespace].set_value(value, convert)
File "/usr/lib/python3.10/site-packages/avocado/core/settings.py", line 279, in set_value
self._value = self._as_list(value)
File "/usr/lib/python3.10/site-packages/avocado/core/settings.py", line 226, in _as_list
return ast.literal_eval(value)
File "/usr/lib64/python3.10/ast.py", line 62, in literal_eval
node_or_string = parse(node_or_string.lstrip(" \t"), mode='eval')
File "/usr/lib64/python3.10/ast.py", line 50, in parse
return compile(source, filename, mode, flags,
File "<unknown>", line 1
avocado:DEBUG, aexpect:DEBUG, guibot:DEBUG, vncdotool:DEBUG, pyi2ncommon:DEBUG
^
SyntaxError: invalid syntax
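For what it's worth, the traceback shows the raw config value being passed through `ast.literal_eval`, which only accepts Python literals, and `avocado:DEBUG, aexpect:DEBUG` is not one (the colon makes it an invalid expression). Whether avocado's settings layer then accepts a quoted list form is an assumption worth verifying, but such a form at least parses:

```python
import ast

raw = "avocado:DEBUG, aexpect:DEBUG, guibot:DEBUG"
try:
    ast.literal_eval(raw)
    parsed_raw = True
except SyntaxError:
    parsed_raw = False  # the colons make this an invalid Python expression

# A Python list literal, by contrast, survives literal_eval:
quoted = '["avocado:DEBUG", "aexpect:DEBUG", "guibot:DEBUG"]'
parsed = ast.literal_eval(quoted)
print(parsed_raw, parsed)
```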
If some messages are in the [stderr] then the test itself send the logs to the stderr, maybe avocado vt or the guibot is doing that. Can you please provide some easy reproducible with guibot for test? Thank you.
Since this requires additional pip installations on your side, let's first see if you have any problems with Avocado VT so that we can exclude the elephant in the room.
ax.set_ylabel("fruit supply")
ax.set_title("Fruit supply by kind and color")
ax.legend(title="Fruit color")
Do we have a place to automatically CI test out examples? Like verify the output here?
Here is an alternative minimal example using aexpect:
import logging
from avocado import Test
from aexpect import remote
host="localhost"
username="root"
password="test1234"
client="ssh"
port=22
prompt="^\\[.*\\][\\#\\$]\\s*$"
linesep="\n"
status_test_command="echo $?"
class MatplotlibTest(Test):
    def test(self):
        logging.getLogger("aexpect").level = logging.DEBUG
        logging.getLogger("aexpect").parent = logging.getLogger("avocado.test")
        session = remote.remote_login(client,
                                      host, port, username, password,
                                      prompt, verbose=True)
        session.close()

where attaching the logger makes the library output available in the debug log in the correct format but seems to produce some spurious outputs also on stdout.
Right now we don't run CI on our examples, but this example is tested by test_store_logging_stream_external
There also seems to be a new duplicate log file
Yes, you are right, I didn't test with avocado-vt. I will look at that. Thanks
So it seems that we have converged on the understanding that:
If there's no disagreement about those points, then I believe the proposal in this PR is suitable. It could, of course, take some adjustments in 3rd party plugins, etc.
I see the default as:
@richtja is the list above correct?
@richtja functionally-wise, there's one use case that I'm missing. For context, in the past, we added a handler to the root logger ( Now, the idea of the The arguments in favor of never touching the root logger have a lot of strength if Avocado introduces a change that impacts the producers of those logs. For instance, in your The question that I can think of: what is the probability that matplotlib would check the logging configuration, and act differently if they find a handler that is not known to them? To me, this seems highly improbable. What I am arguing in favor of here is that, provided there's a tiny chance that we change the behavior of external code, logging everything into Now, there are also a couple of things that I found out with the current implementation: I) The It's something that we must decide how to act on. I've thought of a few possible solutions, including: a) giving II) I've also found some duplication. For a test such as:

import logging
from avocado import Test

class T(Test):
    def test(self):
        logging.getLogger("foo").info("bar")

Running it results in: III) The Using the test above, the following should work: Or: The format itself is also a point to be discussed.
@clebergnu I just noticed your new points about the A small side note: has anyone checked for any performance effects from streaming so many logging messages to so many files? In particular, a full log containing all log messages, some of which are duplicated from other files, might have some increased IO load; do any of you think this could be significant?
The
That's a good point. Using experience alone, and no benchmarking so far, I believe this can hurt performance if the actual writing is synchronous. It's by far (many orders of magnitude) more important than the actual size of the data. If tests are only transmitting once, and the logging handlers are writing to a couple (say test-specific and job-wide files), I'd be very surprised if this adds a significant load.
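If synchronous writes ever did show up as the bottleneck, the stdlib already offers a decoupling mechanism worth noting: `logging.handlers.QueueHandler` makes emitters only enqueue records, while a `QueueListener` thread performs the actual (slow) handling. A minimal sketch, not avocado code:

```python
import logging
import queue
from logging.handlers import QueueHandler, QueueListener

collected = []

class Collect(logging.Handler):
    """Stand-in for a slow file handler; runs on the listener thread."""
    def emit(self, record):
        collected.append(record.getMessage())

log_queue = queue.Queue()
listener = QueueListener(log_queue, Collect())
listener.start()

log = logging.getLogger("iodemo")
log.setLevel(logging.DEBUG)
log.addHandler(QueueHandler(log_queue))

log.info("written off the hot path")  # only enqueues, no I/O here

listener.stop()  # drains the queue before returning
print(collected)
```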
Yeah, as you could tell, I dislike my own proposal, but I also dislike the UI messages in the logs. But I can live with the UI in the log messages, and maybe in the future the UI will be reimplemented and not use (abuse?) the logging system. So take this more like a rant, and sorry for the noise! :)
I try to avoid manipulating the root logger; that is why you have to specify the streams which you want to store with
I understand your concern, but IMO we can keep the
Good catch, I will look into this.
If we decide to use the root logger, then this shouldn't be a problem, because there won't be a need for
Hi @richtja, something else I spotted too: some lines in the job.log file contain a test name prefix and a logging message that belongs to a given test: What I would expect is that test logging messages remain within the test debug log files and there are no messages of the form Is this intentional? Has anyone else seen it? Could it be related to using avocado utilities within tests?
IMO, the reason here is that the avocado_i2n creates logs into the The
Indeed, I did exactly as you recommended in order to have the log messages from Avocado I2N in the job log.
The problem here is that this comes from utilities used by the tests themselves, and I never intended for any test messages to end up in the job log. In other words, there are tests that use avocado, virttest, or avocado_i2n utilities, and I would prefer the messages of those tests to remain all within the debug log. In the case of Avocado VT, for instance, there are plenty of avocado utilities in use, all of which will end up spamming the job log right now with unwanted noise. Can we circumvent this somehow?
IMO, this should be the responsibility of those utilities. The
I see, but how about utilities like
I didn't consider this option, and it is a really hard nut to crack. IMO, we have two options here:
As I can think of no use case where we would really want tests to emit messages to the job log (but I might be wrong), I don't think there is any need to allow tests to add any logs into
This is what we have more or less had in the past, and I think this is the most compatible and clean way forward. Any other opinions are welcome; I can't think of many arguments for not separating the test and job log completely, so I am curious to see if there are some.
Hi @pevogam and @clebergnu. Thank you for your reviews. I prepared v2 #5665 based on your comments. When you have time, please have a look.
This PR simplifies the avocado logging mechanism and removes logging related to legacy code.
It separates avocado test logging and avocado job logging. It creates a new stream for job logs called avocado.job and removes all unnecessary logs from the job.log file. Furthermore, it adds new options to --store-logging-stream and adds a new log file called 'full.log' for collecting all avocado logs into one file.
Reference: #4121 #4856 #5587
Signed-off-by: Jan Richter jarichte@redhat.com