Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Save actor tracebacks in leapp.db #832

Merged
merged 3 commits into from Jun 26, 2023
Merged

Conversation

fernflower
Copy link
Member

Now when an exception is thrown during actor
execution it won't be shown only on stdout, but
will appear in leapp.db and leapp report as high
risk report message.

OAMG-4186

Now when an exception is thrown during actor
execution it won't be shown only on stdout, but
will appear in leapp.db and leapp report as high
risk report message.

OAMG-4186
@github-actions
Copy link

github-actions bot commented Jun 7, 2023

Thank you for contributing to the Leapp project!

Please note that every PR needs to comply with the Leapp Guidelines and must pass all tests in order to be mergeable.
If you want to request a review or rebuild a package in copr, you can use following commands as a comment:

  • review please @oamg/developers to notify leapp developers of the review request
  • /packit copr-build to submit a public copr build using packit

To launch regression testing public members of oamg organization can leave the following comment:

  • /rerun to schedule basic regression tests using this pr build and leapp-repository*master* as artifacts
  • /rerun 42 to schedule basic regression tests using this pr build and leapp-repository*PR42* as artifacts
  • /rerun-sst to schedule sst tests using this pr build and leapp-repository*master* as artifacts
  • /rerun-sst 42 to schedule sst tests using this pr build and leapp-repository*PR42* as artifacts

Please open ticket in case you experience technical problem with the CI. (RH internal only)

Note: In case there are problems with tests not being triggered automatically on new PR/commit or pending for a long time, please consider rerunning the CI by commenting leapp-ci build (might require several comments). If the problem persists, contact leapp-infra.

leapp/workflows/__init__.py Outdated Show resolved Hide resolved
@fernflower
Copy link
Member Author

/rerun

@github-actions
Copy link

github-actions bot commented Jun 8, 2023

Copr build succeeded: https://copr.fedorainfracloud.org/coprs/build/6020934

@github-actions
Copy link

github-actions bot commented Jun 8, 2023

Testing Farm request for RHEL-8.6-rhui/5990856;6020934 regression testing has been created.
Once finished, results should be available here.
Full pipeline log.

@github-actions
Copy link

github-actions bot commented Jun 8, 2023

Testing Farm request for RHEL-7.9-rhui/5990856;6020934 regression testing has been created.
Once finished, results should be available here.
Full pipeline log.

@fernflower
Copy link
Member Author

/rerun

@github-actions
Copy link

Copr build succeeded: https://copr.fedorainfracloud.org/coprs/build/6091405

@github-actions
Copy link

Testing Farm request for RHEL-8.6-rhui/6080984;6091405 regression testing has been created.
Once finished, results should be available here.
Full pipeline log.

@github-actions
Copy link

Testing Farm request for RHEL-7.9-rhui/6080984;6091405 regression testing has been created.
Once finished, results should be available here.
Full pipeline log.

@PeterMocary
Copy link
Member

LGTM, although I agree with the suggestion from @pirat89 in previous comment - changing the immediate CLI output to something similar to the StopActorExecutionError would be nice.

Tried on rhel7 vm by raising an exception in scancpu actor.

Leapp report:
/var/log/leapp/leapp-report.txt contains expected output:

Risk Factor: high
Title: Actor scancpu unexpectedly terminated with exit code: 1
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/leapp/repository/actor_definition.py", line 74, in _do_run
    actor_instance.run(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/leapp/actors/__init__.py", line 289, in run
    self.process(*args)
  File "/usr/share/leapp-repository/repositories/system_upgrade/common/actors/scancpu/actor.py", line 16, in process
    scancpu.process()
  File "/usr/share/leapp-repository/repositories/system_upgrade/common/actors/scancpu/libraries/scancpu.py", line 142, in process
    raise Exception("Testing exception")
Exception: Testing exception

Summary:
Key: d6c2efd3b09890ac78b14f0fb428e5236fe1c36c
----------------------------------------

Database:
leapp-inspector messages --actor=scancpu:

######################################################################
                          PRODUCED MESSAGES
######################################################################
Stamp: 2023-06-19T11:37:56.658310Z
Actor: scancpu
Phase: FactsCollection
Type: ErrorModel
Message_data:
{
    "actor": "scancpu",
    "details": null,
    "message": "Actor scancpu unexpectedly terminated with exit code: 1\nTraceback (most recent call last):\n  File \"/usr/lib/python2.7/site-packages/leapp/repository/actor_definition.py\", line 74, in _do_run\n    actor_instance.run(*args, **kwargs)\n  File \"/usr/lib/python2.7/site-packages/leapp/actors/__init__.py\", line 289, in run\n    self.process(*args)\n  File \"/usr/share/leapp-repository/repositories/system_upgrade/common/actors/scancpu/actor.py\", line 16, in process\n    scancpu.process()\n  File \"/usr/share/leapp-repository/repositories/system_upgrade/common/actors/scancpu/libraries/scancpu.py\", line 142, in process\n    raise Exception(\"Testing exception\")\nException: Testing exception\n",
    "severity": "fatal",
    "time": "2023-06-19T11:37:56.657966Z"
}
######################################################################

leapp-inspector actors --actor=scancpu

Actor: scancpu
Executed: True
Phase: FactsCollection
Started: 2023-06-19T11:37:56.593592Z
Produced messages:
    - ErrorModel
Executed shell commands:
    - ['lscpu']
Logs:
--- {"message": "2023-06-19 11:37:56.627 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: External command has started: ['lscpu']", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.633 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: Architecture:          x86_64", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.634 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: CPU op-mode(s):        32-bit, 64-bit", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.635 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: Byte Order:            Little Endian", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.636 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: CPU(s):                2", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.637 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: On-line CPU(s) list:   0,1", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.638 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: Thread(s) per core:    1", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.639 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: Core(s) per socket:    1", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.639 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: Socket(s):             2", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.640 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: NUMA node(s):          1", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.641 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: Vendor ID:             GenuineIntel", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.642 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: CPU family:            6", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.642 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: Model:                 142", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.643 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: Model name:            Intel(R) Core(TM) i7-10610U CPU @ 1.80GHz", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.644 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: Stepping:              12", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.645 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: CPU MHz:               2303.996", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.645 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: BogoMIPS:              4607.99", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.646 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: Virtualization:        VT-x", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.647 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: Hypervisor vendor:     KVM", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.648 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: Virtualization type:   full", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.648 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: L1d cache:             32K", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.649 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: L1i cache:             32K", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.650 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: L2 cache:              4096K", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.650 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: L3 cache:              16384K", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.651 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: NUMA node0 CPU(s):     0,1", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.652 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology eagerfpu pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 arat umip md_clear spec_ctrl intel_stibp arch_capabilities", "level": "DEBUG"}
--- {"message": "2023-06-19 11:37:56.653 DEBUG    PID: 8240 leapp.workflow.FactsCollection.scancpu: External command has finished: ['lscpu']", "level": "DEBUG"}

I'd suggest to add an ERROR log as well, if that makes sense and would be doable.

@fernflower
Copy link
Member Author

@PeterMocary @pirat89 Thanks for the review!
If I understand you guys correctly, you would like to wrap the hard crash into a leapp exception and put the traceback in the details, right?
(I honestly prefer the hard crash way, as any actor crashing is a severe thing that has to be as obvious as possible, but I will implement it the way you suggest and we'll see :)

@pirat89
Copy link
Member

pirat89 commented Jun 20, 2023

@fernflower In case of the report I am more wondering WDYT about putting it into the report like this:

Risk Factor: high
Title: Actor scancpu unexpectedly terminated with exit code: 1
Summary:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/leapp/repository/actor_definition.py", line 74, in _do_run
    actor_instance.run(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/leapp/actors/__init__.py", line 289, in run
    self.process(*args)
  File "/usr/share/leapp-repository/repositories/system_upgrade/common/actors/scancpu/actor.py", line 16, in process
    scancpu.process()
  File "/usr/share/leapp-repository/repositories/system_upgrade/common/actors/scancpu/libraries/scancpu.py", line 142, in process
    raise Exception("Testing exception")
Exception: Testing exception

Key: d6c2efd3b09890ac78b14f0fb428e5236fe1c36c
----------------------------------------
######################################################################
                          PRODUCED MESSAGES
######################################################################
Stamp: 2023-06-19T11:37:56.658310Z
Actor: scancpu
Phase: FactsCollection
Type: ErrorModel
Message_data:
{
    "actor": "scancpu",
    "details": "{'traceback': 'Traceback (most recent call last):\n  File \"/usr/lib/python2.7/site-packages/leapp/repository/actor_definition.py\", line 74, in _do_run\n    actor_instance.run(*args, **kwargs)\n  File \"/usr/lib/python2.7/site-packages/leapp/actors/__init__.py\", line 289, in run\n    self.process(*args)\n  File \"/usr/share/leapp-repository/repositories/system_upgrade/common/actors/scancpu/actor.py\", line 16, in process\n    scancpu.process()\n  File \"/usr/share/leapp-repository/repositories/system_upgrade/common/actors/scancpu/libraries/scancpu.py\", line 142, in process\n    raise Exception(\"Testing exception\")\nException: Testing exception\n'}"
    "message": "Actor scancpu unexpectedly terminated with exit code: 1",
    "severity": "fatal",
    "time": "2023-06-19T11:37:56.657966Z"
}
######################################################################

I mean to keep the title as shorter msg and use details/summary/... for the rest. But as I told I am ok with this solution also. Btw, for the console output it's same as before:

Failed to get unit file state for ip6tables.service: No such file or directory
2023-06-20 13:16:00.228 DEBUG    PID: 3771 leapp.workflow.FactsCollection.system_facts: Command ['systemctl', 'is-enabled', 'ip6tables'] failed with exit code 1.
2023-06-20 13:16:00.230 DEBUG    PID: 3771 leapp.workflow.FactsCollection.system_facts: External command has finished: ['systemctl', 'is-enabled', 'ip6tables']
2023-06-20 13:16:00.231 DEBUG    PID: 3771 get_firewalld_status: The ip6tables service is likely not enabled nor running
Process Process-220:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python2.7/site-packages/leapp/repository/actor_definition.py", line 74, in _do_run
    actor_instance.run(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/leapp/actors/__init__.py", line 289, in run
    self.process(*args)
  File "/usr/share/leapp-repository/repositories/system_upgrade/common/actors/systemfacts/actor.py", line 62, in process
    if not architecture.matches_architecture(architecture.ARCH_S390X).mooo():
AttributeError: 'bool' object has no attribute 'mooo'


=============================================================================================
Actor system_facts unexpectedly terminated with exit code: 1 - Please check the above details
=============================================================================================

I actually realized the "banner" statement just now :-D It's cool.

@pirat89
Copy link
Member

pirat89 commented Jun 20, 2023

@fernflower @PeterMocary btw, I've realized that for error models in the report.txt file, we do not print that it's error, inhibitor, ... it looks just a high severity msg. I am thinking we should improve it in future somehow to make it clear that it's error, but I do not want to change the structure/schema because of that.

@pirat89
Copy link
Member

pirat89 commented Jun 20, 2023

@PeterMocary @pirat89 Thanks for the review! If I understand you guys correctly, you would like to wrap the hard crash into a leapp exception and put the traceback in the details, right? (I honestly prefer the hard crash way, as any actor crashing is a severe thing that has to be as obvious as possible, but I will implement it the way you suggest and we'll see :)

Just realized you summarized that right! Sorry, my brain is slow today :/

@fernflower
Copy link
Member Author

/rerun

@github-actions
Copy link

Copr build succeeded: https://copr.fedorainfracloud.org/coprs/build/6107459

@github-actions
Copy link

Testing Farm request for RHEL-8.6-rhui/6096363;6107459 regression testing has been created.
Once finished, results should be available here.
Full pipeline log.

@github-actions
Copy link

Testing Farm request for RHEL-7.9-rhui/6096363;6107459 regression testing has been created.
Once finished, results should be available here.
Full pipeline log.

@fernflower
Copy link
Member Author

@PeterMocary @pirat89 All your comments should be resolved - moved trace to the message details (so that a nice Summary with shorter message name will be shown in the report), error is logged at the time of the crash as well. Could you please re-review?
(leapp-report.txt)

----------------------------------------
Risk Factor: high
Title: Actor bacula_check unexpectedly terminated with exit code: 1
Summary: Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/leapp/repository/actor_definition.py", line 74, in _do_run
    actor_instance.run(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/leapp/actors/__init__.py", line 289, in run
    self.process(*args)
  File "/etc/leapp/repos.d/system_upgrade/common/actors/baculacheck/actor.py", line 20, in process
    42 / 0
ZeroDivisionError: division by zero

Key: 4117a75983f26b1c6987c1bcc84ac31f9a2d8480

(except from leapp-upgrade.log)

2023-06-23 09:09:23.81  INFO     PID: 235468 leapp.workflow.Checks: Executing actor bacula_check 
2023-06-23 09:09:23.131 ERROR    PID: 235468 leapp.workflow.Checks: Actor bacula_check has crashed: Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/leapp/repository/actor_definition.py", line 74, in _do_run
    actor_instance.run(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/leapp/actors/__init__.py", line 289, in run
    self.process(*args)
  File "/etc/leapp/repos.d/system_upgrade/common/actors/baculacheck/actor.py", line 20, in process
    42 / 0
ZeroDivisionError: division by zero

@fernflower fernflower requested a review from pirat89 June 23, 2023 13:12
@pirat89
Copy link
Member

pirat89 commented Jun 26, 2023

@fernflower thanks! lgtm and tests passed. Merging.

@pirat89 pirat89 merged commit c4c1800 into oamg:master Jun 26, 2023
12 checks passed
@pirat89 pirat89 added the changelog-checked The merger/reviewer checked the changelog draft document and updated it when relevant label Aug 23, 2023
pirat89 added a commit to pirat89/leapp that referenced this pull request Aug 23, 2023
## Packaging
-  Provide leapp-framework 5.0 (oamg#818, oamg#840)

## Framework
### Fixes
- Improve processing of reports with UTF-8 characters (oamg#821, oamg#828)
- Fix info reporting with only one path to log (oamg#834)

### Enhancements
- Include tracebacks of actors into the leapp.db (oamg#832)

## Leapp (tool)
### Fixes
- Dialog creation fails not more for a component without choices (oamg#826)
- Empty report is generated correctly (oamg#828)
- Fix processing data in remediation instructions with non-ascii characters ()

### Enhancements
- Improve report summary output to make it more visible (oamg#818, oamg#840)

## stdlib
### Fixes
- Fixed the call when the execute cannot be performed (oamg#836)
### Enhancements
- changes related just to stdlib - under leapp/libraries/stdlib
@pirat89 pirat89 mentioned this pull request Aug 23, 2023
pirat89 added a commit to pirat89/leapp that referenced this pull request Aug 23, 2023
## Packaging
-  Provide leapp-framework 5.0 (oamg#818, oamg#840)

## Framework
### Fixes
- Improve processing of reports with UTF-8 characters (oamg#821, oamg#828)
- Fix info reporting with only one path to log (oamg#834)

### Enhancements
- Include tracebacks of actors into the leapp.db (oamg#832)

## Leapp (tool)
### Fixes
- Dialog creation fails not more for a component without choices (oamg#826)
- Empty report is generated correctly (oamg#828)
- Fix processing data in remediation instructions with non-ascii characters ()

### Enhancements
- Improve report summary output to make it more visible (oamg#818, oamg#840)

## stdlib
### Fixes
- Fixed the call when the execute cannot be performed (oamg#836)
### Enhancements
- changes related just to stdlib - under leapp/libraries/stdlib
Rezney pushed a commit that referenced this pull request Aug 23, 2023
## Packaging
-  Provide leapp-framework 5.0 (#818, #840)

## Framework
### Fixes
- Improve processing of reports with UTF-8 characters (#821, #828)
- Fix info reporting with only one path to log (#834)

### Enhancements
- Include tracebacks of actors into the leapp.db (#832)

## Leapp (tool)
### Fixes
- Dialog creation fails not more for a component without choices (#826)
- Empty report is generated correctly (#828)
- Fix processing data in remediation instructions with non-ascii characters ()

### Enhancements
- Improve report summary output to make it more visible (#818, #840)

## stdlib
### Fixes
- Fixed the call when the execute cannot be performed (#836)
### Enhancements
- changes related just to stdlib - under leapp/libraries/stdlib
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog-checked The merger/reviewer checked the changelog draft document and updated it when relevant
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants