Feature request: Log count and intrusion detection #16

Closed
NosIreland opened this issue Jul 9, 2020 · 5 comments

NosIreland commented Jul 9, 2020

Would it be possible to add metrics for log entries and intrusion detection? Both of these change the system/chassis health to Warning or Critical, but at the moment there is no way to see what is causing the warning/critical state when an intrusion is detected or there are entries in the system logs.
Log entries: https://hostname/redfish/v1/Systems/1/LogServices/Log1/Entries

{
    "@odata.context": "/redfish/v1/$metadata#LogEntryCollection.LogEntryCollection",
    "@odata.type": "#LogEntryCollection.LogEntryCollection",
    "@odata.id": "/redfish/v1/Systems/1/LogServices/Log1/Entries",
    "Name": "Health Event Log Service Collection",
    "Description": "Collection of Health Event Logs",
    "Members@odata.count": 2,
    "Members": [
        {
            "@odata.id": "/redfish/v1/Systems/1/LogServices/Log1/Entries/1",
            "@odata.type": "#LogEntry.v1_3_0.LogEntry",
            "Id": "1",
            "Name": "Health Event Log Entry 1",
            "EntryType": "Event",
            "Severity": "Warning",
            "Created": "2020-07-07T10:21:02+00:00",
            "EntryCode": "Deassert",
            "SensorType": "Battery",
            "SensorNumber": 93,
            "Message": "BBU presence (StorageController0)",
            "MessageArgs": [
                "ArrayOfMessageArgs"
            ],
            "Links": {
                "Oem": {}
            },
            "Oem": {
                "Supermicro": {
                    "MarkAsAcknowledged": false,
                    "@odata.type": "#SmcLogEntryExtensions.v1_0_0.LogEntry",
                    "RawEventData": {
                        "EventDirAndType": "0xF0",
                        "SensorType": "0x29",
                        "EventData1": "0x02",
                        "EventData2": "0x00",
                        "EventData3": "0x00"
                    }
                }
            }
        },
        {
            "@odata.id": "/redfish/v1/Systems/1/LogServices/Log1/Entries/2",
            "@odata.type": "#LogEntry.v1_3_0.LogEntry",
            "Id": "2",
            "Name": "Health Event Log Entry 2",
            "EntryType": "Event",
            "Severity": "OK",
            "Created": "2020-07-07T10:21:29+00:00",
            "EntryCode": "Assert",
            "SensorType": "Battery",
            "SensorNumber": 93,
            "Message": "BBU presence (StorageController0)",
            "MessageArgs": [
                "ArrayOfMessageArgs"
            ],
            "Links": {
                "Oem": {}
            },
            "Oem": {
                "Supermicro": {
                    "@odata.type": "#SmcLogEntryExtensions.v1_0_0.LogEntry",
                    "RawEventData": {
                        "EventDirAndType": "0x70",
                        "SensorType": "0x29",
                        "EventData1": "0x02",
                        "EventData2": "0x00",
                        "EventData3": "0x00"
                    }
                }
            }
        }
    ]
}

Intrusion: https://hostname/redfish/v1/Chassis/1

{
    "@odata.context": "/redfish/v1/$metadata#Chassis.Chassis",
    "@odata.type": "#Chassis.v1_4_0.Chassis",
    "@odata.id": "/redfish/v1/Chassis/1",
    "Id": "1",
    "Name": "Computer System Chassis",
    "ChassisType": "RackMount",
    "Manufacturer": "Supermicro",
    "Model": "X11SPW-TF",
    "SKU": "",
    "SerialNumber": "XXXXXXXX",
    "PartNumber": "CSE-116TS-R504WBP",
    "AssetTag": "",
    "IndicatorLED": "Off",
    "Status": {
        "State": "Enabled",
        "Health": "Critical",
        "HealthRollup": "Critical"
    },
    "PhysicalSecurity": {
        "IntrusionSensorNumber": 170,
        "IntrusionSensor": "HardwareIntrusion",
        "IntrusionSensorReArm": "Manual"
    },
    "Power": {
        "@odata.id": "/redfish/v1/Chassis/1/Power"
    },
    "Thermal": {
        "@odata.id": "/redfish/v1/Chassis/1/Thermal"
    },
    "Links": {
        "ComputerSystems": [
            {
                "@odata.id": "/redfish/v1/Systems/1"
            }
        ],
        "PCIeDevices": [
            {
                "@odata.id": "/redfish/v1/Systems/1/PCIeDevices/NIC1"
            }
        ],
        "ManagedBy": [
            {
                "@odata.id": "/redfish/v1/Managers/1"
            }
        ]
    },
    "Oem": {
        "Supermicro": {
            "@odata.type": "#SmcChassisExtensions.v1_0_0.Chassis",
            "BoardSerialNumber": "XXXXXX",
            "GUID": "34313031-4D53-3CEC-EF06-B1D500000000",
            "BoardID": "0x953"
        }
    }
}

@jenningsloy318 (Owner)

At the very beginning I also wondered whether this needed to be implemented, but I finally decided against it, for two reasons:

  1. The plug/unplug action triggers this, but it is actually an event, not a metric.
  2. The log contains too many arbitrary attributes; it is not easy to filter them into a common pattern, which is essential for a monitoring metric set.

@NosIreland (Author)

Thanks for the info. Here is my take:

  1. The intrusion alert is a sensor, and it goes red when triggered, the same way as a dead DIMM, fan, or PSU. So I assume there would be 2 states.
  2. For the log, it would be enough just to have a total count, with no need to filter:
    "Members@odata.count": 2

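The total-count idea in point 2 could be sketched roughly as follows. This is only an illustration in Python using the standard library; `log_entry_count` is a hypothetical helper, not something the exporter provides, and it assumes the collection payload has already been fetched and decoded from JSON.

```python
import json


def log_entry_count(collection: dict) -> int:
    """Return the number of entries in a Redfish LogEntryCollection payload."""
    # Prefer the count reported by the service itself.
    count = collection.get("Members@odata.count")
    if count is None:
        # Assumption: fall back to counting Members when the field is absent.
        count = len(collection.get("Members", []))
    return int(count)


# Trimmed-down version of the payload shown earlier in this issue.
payload = json.loads(
    '{"Members@odata.count": 2, "Members": [{"Id": "1"}, {"Id": "2"}]}'
)
print(log_entry_count(payload))
```

The fallback to `len(Members)` is a design choice for this sketch only; a service that paginates its collection would make that fallback undercount.
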
@jenningsloy318 (Owner)

I checked the gofish code, and indeed there is a struct that holds the PhysicalSecurity data, so at this point I can add this metric.
Meanwhile, only one metric is possible: check whether IntrusionSensorReArm is Manual or Automatic, and treat IntrusionSensor and IntrusionSensorNumber as labels.
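
As a rough illustration of that shape (one sample, with the value derived from IntrusionSensorReArm and the other two fields as labels), here is a Python sketch over the decoded Chassis payload. Everything named here is an assumption for illustration: the metric name `chassis_intrusion_rearm`, the label names, and the numeric encoding in `REARM_VALUES` are hypothetical and are not taken from the exporter.

```python
# Hypothetical numeric encoding of IntrusionSensorReArm; 0 means unknown.
REARM_VALUES = {"Manual": 1, "Automatic": 2}


def intrusion_sample(chassis: dict):
    """Build (labels, value) from a Chassis payload, or None if unsupported."""
    security = chassis.get("PhysicalSecurity")
    if not security:
        return None  # chassis does not report physical security data
    labels = {
        "intrusion_sensor": str(security.get("IntrusionSensor", "")),
        "intrusion_sensor_number": str(security.get("IntrusionSensorNumber", "")),
    }
    value = REARM_VALUES.get(security.get("IntrusionSensorReArm"), 0)
    return labels, value


def render(name: str, labels: dict, value: int) -> str:
    """Render one sample as a line of Prometheus-style text."""
    body = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
    return f"{name}{{{body}}} {value}"


# Values taken from the Chassis payload shown earlier in this issue.
chassis = {
    "PhysicalSecurity": {
        "IntrusionSensorNumber": 170,
        "IntrusionSensor": "HardwareIntrusion",
        "IntrusionSensorReArm": "Manual",
    }
}
labels, value = intrusion_sample(chassis)
print(render("chassis_intrusion_rearm", labels, value))
```

Because IntrusionSensor is carried as a label, a change from a normal reading to HardwareIntrusion would show up as a new label set rather than as a change in the sample value.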

For log metrics, I need to give this more consideration. At a minimum, the metrics would collect the current entry counts, grouped by severity (warning or critical). But there is also a tricky point: log entries are not cleared automatically, so this metric will always have some value. Also, the log entry timestamp is unrelated to the metric timestamp, so it is not easy to define rules to determine the health state. So I think it is not practical here.
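
For reference, the per-severity grouping described above could look roughly like this in Python. `count_by_severity` is a hypothetical helper, and treating a missing Severity as "Unknown" is an assumption made for this sketch.

```python
from collections import Counter


def count_by_severity(collection: dict) -> Counter:
    """Count log entries in a LogEntryCollection payload by their Severity."""
    return Counter(
        # Assumption: entries without a Severity field are grouped as "Unknown".
        entry.get("Severity", "Unknown")
        for entry in collection.get("Members", [])
    )


# Severities of the two entries shown earlier in this issue.
collection = {"Members": [{"Severity": "Warning"}, {"Severity": "OK"}]}
counts = count_by_severity(collection)
print(counts["Warning"], counts["Critical"], counts["OK"])
```

The caveat above still applies: these counts only fall when someone clears the log, so they describe what has been recorded rather than the current health of the system.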

@jenningsloy318 (Owner)

@NosIreland I updated this exporter and implemented the physical security part. You can grab the source code and test it now.

@jenningsloy318 (Owner)

No update on this issue, so I am closing it.
