Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[openshift_ovn] Collect additional ovnkube node logs #3336

Merged
merged 2 commits into from Aug 22, 2023

Conversation

pperiyasamy
Copy link
Contributor

With Interconnect support in latest OVN-Kubernetes, ovnkube-nodes logs grew large. This commit adds the ability to collect those additional logs.


Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines

  • Is the commit message split over multiple lines and hard-wrapped at 72 characters?
  • Is the subject and message clear and concise?
  • Does the subject start with [plugin_name] if submitting a plugin patch or a [section_name] if part of the core sosreport code?
  • Does the commit contain a Signed-off-by: First Lastname email@example.com?
  • Are any related Issues or existing PRs properly referenced via a Closes (Issue) or Resolved (PR) line?

@pperiyasamy
Copy link
Contributor Author

@packit-as-a-service
Copy link

Congratulations! One of the builds has completed. 🍾

You can install the built RPMs by following these steps:

  • sudo yum install -y dnf-plugins-core on RHEL 8
  • sudo dnf install -y dnf-plugins-core on Fedora
  • dnf copr enable packit/sosreport-sos-3336
  • And now you can install the packages.

Please note that the RPMs should be used only in a testing environment.

Comment on lines 36 to 35
self.add_copy_spec([
"/var/lib/ovn-ic/etc/*log*"
])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be a one-line addition to the add_copy_spec() call above, since we're working in the same directory. Also, a_c_s accepts single strings as well as lists if this were to be kept separate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK @TurboTurtle , moved this log path into existing list.

@@ -32,6 +32,10 @@ def setup(self):
"/var/lib/ovn-ic/etc/ovnnb_db.db",
"/var/lib/ovn-ic/etc/ovnsb_db.db"
])
# Collect additional ovnkube logs for interconnect setup.
self.add_copy_spec([
"/var/lib/ovn-ic/etc/*log*"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also like to know what kind of logs are here, capturing every log file by default may not be ideal if there are other components logging here or if there is significant log rotation happening.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @TurboTurtle reasonable questions. With the ovn-ic feature, we opted for having the libovsdb logs separate from the regular ovnkube controller. General details on this feature are here:
https://issues.redhat.com/browse/OCPSTRAT-379
More info on the logs are here: https://issues.redhat.com/browse/SDN-3507
CNO PR: openshift/cluster-network-operator#1938
The logs are rotated via lumberjack and have the following values: 100Mb (uncompressed) and 5 rotated and compressed. With that, the worst case scenario will mean that the the SOS will have less than 500Mb.
How about we restrict the log files to use "libovsdblog" wildcard?
Here is an example output of what the files look like:

❯ POD=$(oc get pod -n openshift-ovn-kubernetes -l app=ovnkube-node -o jsonpath='{.items[0].metadata.name}')
ocn exec -it $POD -c ovnkube-controller -- bash -c "ls -lah /var/log/ovnkube/libovsdb*log*"
-rw-r-----. 1 root root  41K Aug 14 19:45 /var/log/ovnkube/libovsdb-2023-08-14T19-45-34.678.log.gz
-rw-r-----. 1 root root 660K Aug 14 23:38 /var/log/ovnkube/libovsdb.log

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about collecting /var/log/ovnkube/*log* when --all-logs is specified, and /var/log/ovnkube/*.log otherwise. This follows the same kind of scenario for many of the plugins. This ensures that the size of sosreport is not as big if --all-logs is not specified

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say we want to have these files collected by default (since communicating the right flags that should be used is not that easy and may result in losing precious data), and we know this plugin is only active for openshift nodes, so this change shouldn't affect the majority of setups, wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The prevailing approach in sos plugins is to capture *.log normally, and get *log* when either --all-logs or some plugin-specific option is used.

The sizelimit parameter here applies per item (in a list) handed to add_copy_spec, not the entire call or the individual matches for a regex (meaning a sizelimit setting of 600 is a bit heavy handed). So, another option would be having one entry for *.log and another for *.log.gz and setting sizelimit to 100 meaning by default we'd capture the active log and up to 100 MB of rotated logs. If something truly needs all the rotated logs, then that's where --all-logs would come into play.

Would that be a reasonable middle ground?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@npinaeva @TurboTurtle @arif-ali , Thanks! Based on your concerns and suggestions, updated the log collection with commit b098433, Is that ok now ?

"/var/log/openvswitch/ovs-monitor-ipsec.log"
])
else:
self.add_copy_spec("/var/log/openvswitch")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This folder is also collected in the openvswitch plugin, so you could have a conflict of collections. It may be better to target the logname and asterisk. or better yet, if the openvswitch plugin does get activated on an openshift node, that these items could be added to that plugin?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this is mainly to collect ipsec related logs on ovn/openvswitch setup.
now collecting only ipsec logs from ovn plugin when --all-logs option is not set.

sos/report/plugins/openshift_ovn.py Outdated Show resolved Hide resolved
@pperiyasamy pperiyasamy force-pushed the ovnkube-collect-extra-logs branch 2 times, most recently from cc70b68 to b098433 Compare August 15, 2023 15:48
Comment on lines 44 to 46
self.add_copy_spec(["/var/lib/ovn-ic/etc/libovsdb.log",
"/var/lib/ovn-ic/etc/libovsdb*log.gz"],
sizelimit=100)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick on style guidelines, this should be

self.add_copy_spec([
    "/var/lib/ovn-ic/etc/libovsdb.log",
    "/var/lib/ovn-ic/etc/libovsdb*log.gz"
], sizelimit=100)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Comment on lines 40 to 42
self.add_copy_spec([
"/var/log/openvswitch/libreswan.log",
"/var/log/openvswitch/ovs-monitor-ipsec.log"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was it moved here? I would say we need these logs even when all_logs is set?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, these file patterns dont seem to be collected in all_logs (not sure if intentionally or by mistake).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/var/log/openvswitch is already being captured by the openvswitch plugin. Would that not be activated?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have seen several such feedbacks that a PR wants to collect a copypath in another plugin; maybe does it make sense to have a CI test for it? Though it can be tricky to have it working properly (e.g. due to multiline a_s_c, asterisks in paths, paths with variables etc)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be collected for openvswitch plugin, but I don't think that happens for some reason, may be useful to check/debug that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok @npinaeva , I will check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@npinaeva This works as expected as per openvswitch plugin logic here on an ovn cluster, in fact we can remove collecting ipsec logs here as openvswitch plugin already taking care of collecting *.log when --all-logs not specified.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, thanks for checking! then we can just remove these files

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed collecting ipsec logs from /var/log/openvswitch directory.

With Interconnect support in latest OVN-Kubernetes, ovnkube-nodes
logs grew large. This commit adds the ability to collect those
additional logs.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
The sosreport limits to collect logs at maximum of 25 MB in a given
collection passed into add_copy_spec method. so this may lead into
logs wouldn't have fully collected when user collected sos report
without --all-logs option.
Hence this commit ensures logs and dbs collected as much as possible
when --all-logs option is not specified.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
@npinaeva
Copy link
Contributor

/lgtm
thanks all for the nice discussion!

@TurboTurtle TurboTurtle merged commit e11a594 into sosreport:main Aug 22, 2023
34 checks passed
@pperiyasamy pperiyasamy deleted the ovnkube-collect-extra-logs branch August 22, 2023 07:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants