I am seeing a massive impact on my 3-node cluster running 5 SAP SIDs. If all exporters are active before the SAP systems are running, sapstartsrv goes defunct, and this blocks all other start procedures of the cluster.
I was able to reproduce this by stopping sap_host_exporter. It looks to me like we have an issue with sapstartsrv when too many requests are made.
First of all, I'll get in touch with SAP to verify the situation. Second, we should think about whether to reduce the data we collect or to use a different method, perhaps the socket directly instead of an HTTP request.
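To make the "socket directly instead of an HTTP request" idea a bit more concrete, here is a minimal sketch of sending an HTTP request over a Unix domain socket rather than a TCP port. It is not the exporter's code. The class `UnixHTTPConnection`, the helper `post_over_unix_socket`, the socket path and the request body are all hypothetical, and the thread does not confirm which socket sapstartsrv exposes or what payload it expects.

```python
import http.client
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        # "localhost" only fills the Host header; no TCP connection is opened.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        # Replace the default TCP connect with a Unix domain socket connect.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def post_over_unix_socket(socket_path, body, path="/", timeout=5.0):
    """POST `body` (bytes) to `path` over the socket; return (status, payload)."""
    conn = UnixHTTPConnection(socket_path, timeout=timeout)
    try:
        conn.request(
            "POST",
            path,
            body=body,
            headers={"Content-Type": "text/xml; charset=utf-8"},
        )
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

A call would look like `post_over_unix_socket("/path/to/start-service.sock", b"<request/>")`, with both arguments being placeholders. Whether this actually avoids the defunct sapstartsrv is something only a test against a real system can show.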
Just to keep track of the developments about this:
This is 99.9% an upstream issue: some SAPControl methods will cause the start service to crash, especially if they return a 500 response.
The problem is very difficult to reproduce. It most probably involves an internal race condition in the start service, and it shows up in many different environments, with or without suse-cluster-connector involved. There is nothing inherently wrong with the exporter, other than the assumption that sending HTTP requests wouldn't cause an entire SAP cluster to go haywire.
#36 partly works around this issue, and implementing #22 should completely avoid it.
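For illustration only, one defensive pattern on the client side is to stop polling for a while after the start service answers with a 5xx status, so that a struggling sapstartsrv is not hit again immediately. This is a sketch under my own assumptions and not a description of what #36 or #22 actually do. The `PollGate` class and its parameters are hypothetical.

```python
import time


class PollGate:
    """Skips polls for a cool-down period after a 5xx response."""

    def __init__(self, cooldown_seconds=60.0, clock=time.monotonic):
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock  # injectable so the gate can be tested without sleeping
        self.blocked_until = float("-inf")

    def allow(self):
        """Return True if a poll may be sent now."""
        return self.clock() >= self.blocked_until

    def record(self, status):
        """Record the HTTP status of the last poll; a 5xx starts the cool-down."""
        if status >= 500:
            self.blocked_until = self.clock() + self.cooldown_seconds
```

A collector would call `gate.allow()` before each request and `gate.record(status)` after it. The cool-down length is a guess and would need tuning against a real system.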