Create smartctl plugin to get disk status #37

Merged
merged 6 commits into from
Sep 25, 2023

AtaxyaNetwork
Contributor

My latest idea is a plugin that parses smartctl output to retrieve disk health.
I tested a few things some weeks ago and found that smartctl can output JSON, which is easier to parse.
The output looks like this:

root@Ataxya:~# smartctl -H /dev/sda --json
{
  "json_format_version": [
    1,
    0
  ],
  "smartctl": {
    "version": [
      7,
      3
    ],
    "svn_revision": "5338",
    "platform_info": "x86_64-linux-5.18.0-3-amd64",
    "build_info": "(local build)",
    "argv": [
      "smartctl",
      "-H",
      "/dev/sda",
      "--json"
    ],
    "drive_database_version": {
      "string": "7.3/5319"
    },
    "exit_status": 0
  },
  "local_time": {
    "time_t": 1663097979,
    "asctime": "Tue Sep 13 21:39:39 2022 CEST"
  },
  "device": {
    "name": "/dev/sda",
    "info_name": "/dev/sda [SAT]",
    "type": "sat",
    "protocol": "ATA"
  },
  "smart_status": {
    "passed": true
  }
}
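
A minimal sketch of how the health flag could be read from that JSON. The sample is a trimmed copy of the output above; the function name and the "FAILED"/"UNKNOWN" labels are assumptions made for illustration, not something smartctl itself prints.

```python
import json

# Trimmed copy of the `smartctl -H /dev/sda --json` output shown above.
SAMPLE = """
{
  "smartctl": {"version": [7, 3], "exit_status": 0},
  "device": {"name": "/dev/sda", "type": "sat", "protocol": "ATA"},
  "smart_status": {"passed": true}
}
"""


def health_from_json(text):
    """Return (device name, status label) from smartctl JSON text.

    The labels "FAILED" and "UNKNOWN" are illustrative choices.
    """
    data = json.loads(text)
    name = data.get("device", {}).get("name", "unknown")
    passed = data.get("smart_status", {}).get("passed")
    if passed is None:
        # Key missing: the health status could not be determined.
        return name, "UNKNOWN"
    return name, "PASSED" if passed else "FAILED"


print(health_from_json(SAMPLE))
```

Using `.get()` with defaults keeps the function from raising when a key such as `smart_status` is missing from the output.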

Unfortunately, --json is only available from version 7 onward:

root@Ataxya:~# smartctl --version
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-5.18.0-3-amd64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

On XCP-ng 8.2.1, I'm on 6.5:

[13:51 Chouffe ~]# smartctl --version
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.19.0+1] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
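
A small sketch of how a script could detect this before relying on --json. It parses the first line of `smartctl --version` as shown in the two outputs above; the function names are hypothetical, and the regular expression assumes the line always starts with `smartctl <major>.<minor>`.

```python
import re


def smartctl_major_version(version_output):
    """Extract the major version from the first line of `smartctl --version`."""
    match = re.match(r"smartctl (\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("unrecognized smartctl --version output")
    return int(match.group(1))


def supports_json(version_output):
    """--json needs smartmontools 7 or later, per the discussion above."""
    return smartctl_major_version(version_output) >= 7


NEW = "smartctl 7.3 2022-02-28 r5338 [x86_64-linux-5.18.0-3-amd64] (local build)"
OLD = "smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.19.0+1] (local build)"

print(supports_json(NEW), supports_json(OLD))
```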

So my question is: is it possible to ship smartctl version 7 in XCP-ng? I tried the EPEL repos and others, and 6.5 is the latest version available.
Could I package it manually to test it? If so, how can I do this?
Maybe @stormi can guide me?

Thank you!

Signed-off-by: Cécile MORANGE contact@ataxya.net

Signed-off-by: Cécile MORANGE <contact@ataxya.net>
Signed-off-by: AtaxyaNetwork <contact@ataxya.net>
@AtaxyaNetwork
Contributor Author

AtaxyaNetwork commented Apr 14, 2023

So I have created two scripts for two uses:

  • One that returns just the health part of smartctl for each disk
  • One that returns the full output of smartctl for each disk

Each script returns a JSON string, which can be parsed by XO (for example :D)
The idea is to use these scripts to send the JSON output to XO, which can then parse it and display the information nicely :)

To call the script:

[15:42 Chouffe plugins]# xe host-call-plugin host-uuid=8b80bcc2-d31c-4f7d-85a7-e921f67c4ec5  plugin=smartctlHealth.py fn=check_smartctl 
{"/dev/sdf": "PASSED", "/dev/sdg": "PASSED", "/dev/sdd": "PASSED", "/dev/sde": "PASSED", "/dev/sdb": "PASSED", "/dev/sdc": "PASSED", "/dev/sda": "PASSED"}

[15:42 Chouffe plugins]# xe host-call-plugin host-uuid=8b80bcc2-d31c-4f7d-85a7-e921f67c4ec5  plugin=smartctlFull.py fn=check_smartctl 
{"/dev/sdf": {"power_on_time": {"hours": 5713}, "ata_version": {"minor_value": 94, "string": "ACS-4 T13/BSR INCITS 529 revision 5", "major_value": 2556}, "form_factor": {"ata_value": 3, "name": "2.5 inches"}, "firmware_version": "SVQ02B6Q", "wwn": {"oui": 9528, "naa": 5, "id": 65536604056}, "smart_status": {"passed": true}, "smartctl": {"build_info": "(local build)", "exit_status": 0, "argv": ["smartctl", "-j", "-a", "/dev/sdf"], "version": [7, 0], "svn_revision": "4883", "platform_info": "x86_64-linux-4.19.0+1"}, "temperature": {"current": 36}, "rotation_rate": 0, "interface_speed": {"current": {"sata_value": 3, "units_per_second": 60, "string": "6.0 Gb/s", "bits_per_unit": 100000000}, "max": {"sata_value": 14, "units_per_second": 60, "string": "6.0 Gb/s", "bits_per_unit": 100000000}}, "user_capacity": {"bytes": 1000204886016, "blocks": 1953525168}, "ata_smart_attributes": {"table": [{"name": "Reallocated_Sector_Ct", "flags": {"error_rate": false, "string": "PO--CK ", "event_count": true, "value": 51, "updated_online": true, "performance": false, "auto_keep": true, "prefailure": true}, "value": 100, "raw": {"string": "0", "value": 0}, "thresh": 10, "when_failed": "", "worst": 100, "id": 5}, {"name": "Power_On_Hours", "flags": {"error_rate": false, "string": "-O--CK ", "event_count": true, "value": 50, "updated_online": true, "performance": false, "auto_keep": true, "prefailure": false}, "value": 98, "raw": {"string": "5713", "value": 5713}, "thresh": 0, "when_failed": "", "worst": 98, "id": 9}, {"name": "Power_Cycle_Count", "flags": {"error_rate": false, "string": "-O--CK ", "event_count": true, "value": 50, "updated_online": true, "performance": false, "auto_keep": true, "prefailure": false}, "value": 99, "raw": {"string": "4", "value": 4}, etc etc it's a really long string
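
Since the full output is long, a consumer would probably pick out a few fields per disk. Here is a rough sketch of that, using a trimmed copy of the /dev/sdf entry above as sample data; the function name and the choice of fields are assumptions for illustration only.

```python
# Trimmed copy of the /dev/sdf entry from the plugin output above.
SAMPLE = {
    "/dev/sdf": {
        "power_on_time": {"hours": 5713},
        "smart_status": {"passed": True},
        "temperature": {"current": 36},
        "ata_smart_attributes": {
            "table": [
                {"name": "Reallocated_Sector_Ct", "id": 5,
                 "raw": {"string": "0", "value": 0}},
                {"name": "Power_Cycle_Count", "id": 12,
                 "raw": {"string": "4", "value": 4}},
            ]
        },
    }
}


def summarize(full_info):
    """Reduce the per-disk smartctl dictionaries to a few selected fields."""
    summary = {}
    for disk, info in full_info.items():
        table = info.get("ata_smart_attributes", {}).get("table", [])
        # Index the attribute table by name to look up raw values easily.
        raw_by_name = {attr["name"]: attr["raw"]["value"] for attr in table}
        summary[disk] = {
            "passed": info.get("smart_status", {}).get("passed"),
            "temperature": info.get("temperature", {}).get("current"),
            "power_on_hours": info.get("power_on_time", {}).get("hours"),
            "reallocated_sectors": raw_by_name.get("Reallocated_Sector_Ct"),
        }
    return summary


print(summarize(SAMPLE))
```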

I'm not a dev, so my code may be bad; don't hesitate to change/review/rm -rf it! :)

@AtaxyaNetwork
Contributor Author

Reminder: smartmontools must be at version 7 or later (it can be installed from the base repo)

@stormi
Member

stormi commented Aug 24, 2023

XCP-ng 8.3 will have smartmontools 7, so JSON output will be available.

Regarding this PR, the first obvious thing I see is that this should be a single plugin with two functions, rather than two separate plugins.

Update: I'd name the plugin simply smartctl.py.
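
To illustrate the single-plugin layout, here is a rough sketch with two functions sharing one disk-listing helper. It is not the code from this PR: the XAPI plugin wiring is left out, the command runner is injected as a hypothetical `run` callable so the sketch works without smartctl installed, the "FAILED" label is an assumption, and the parsing assumes each `smartctl --scan` line starts with the device path.

```python
import json


def list_disks(run):
    """Return device paths from `smartctl --scan` (first token of each line)."""
    disks = []
    for line in run(["smartctl", "--scan"]).splitlines():
        if line.strip():
            disks.append(line.split()[0])
    return disks


def check_health(run):
    """Return a JSON string mapping each disk to a short health label."""
    result = {}
    for disk in list_disks(run):
        data = json.loads(run(["smartctl", "-j", "-H", disk]))
        result[disk] = "PASSED" if data["smart_status"]["passed"] else "FAILED"
    return json.dumps(result)


def check_smartctl(run):
    """Return a JSON string mapping each disk to its full smartctl output."""
    result = {}
    for disk in list_disks(run):
        result[disk] = json.loads(run(["smartctl", "-j", "-a", disk]))
    return json.dumps(result)


def fake_run(command):
    """Stand-in for a real command runner, returning canned output."""
    if command == ["smartctl", "--scan"]:
        return "/dev/sda -d scsi # /dev/sda, SCSI device\n"
    return '{"smart_status": {"passed": true}}'


print(check_health(fake_run))
```

Keeping the disk discovery in one helper means both functions see the same device list, and injecting the runner makes the logic testable on a machine without disks to scan.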

Signed-off-by: Cécile MORANGE <contact@ataxya.net>
@AtaxyaNetwork
Contributor Author

AtaxyaNetwork commented Sep 4, 2023

I just pushed the change you suggested!
Command to run:

[18:06 Chouffe plugins]# xe host-call-plugin host-uuid=8b80bcc2-d31c-4f7d-85a7-e921f67c4ec5  plugin=smartctl.py fn=check_smartctl
{"/dev/sdf": {"power_on_time": {"hours": 9147}, "ata_version": {"minor_value": 94, "string": "ACS-4 T13/BSR INCITS 529 revision 5", "major_value": 2556}, "form_factor": {"ata_value": 3, "name": "2.5 inches"}, "firmware_version": "SVQ02B6Q", "wwn": {"oui": 9528, "naa": 5, "id": 65536604056}, "smart_status": {"passed": true}, "smartctl": {"build_info": "(local build)", "exit_status": 0, "argv": ["smartctl", "-j", "-a", "/dev/sdf"], "version": [7, 0], "svn_revision": "4883", "platform_info": "x86_64-linux-4.19.0+1"}, "temperature": {"current": 32}, "rotation_rate": 0, [...]}


[18:06 Chouffe plugins]# xe host-call-plugin host-uuid=8b80bcc2-d31c-4f7d-85a7-e921f67c4ec5  plugin=smartctl.py fn=check_health
{"/dev/sdf": "PASSED", "/dev/sdg": "PASSED", "/dev/sdd": "PASSED", "/dev/sde": "PASSED", "/dev/sdb": "PASSED", "/dev/sdc": "PASSED", "/dev/sda": "PASSED"}
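
On the consuming side, the check_health result is easy to act on. A minimal sketch, assuming the caller already has the plugin's output as a string; the function name is hypothetical and any label other than "PASSED" is treated as a problem.

```python
import json

# The check_health output shown above.
HEALTH_OUTPUT = (
    '{"/dev/sdf": "PASSED", "/dev/sdg": "PASSED", "/dev/sdd": "PASSED", '
    '"/dev/sde": "PASSED", "/dev/sdb": "PASSED", "/dev/sdc": "PASSED", '
    '"/dev/sda": "PASSED"}'
)


def failing_disks(plugin_output):
    """Return the sorted list of disks whose status is not "PASSED"."""
    statuses = json.loads(plugin_output)
    return sorted(disk for disk, status in statuses.items() if status != "PASSED")


print(failing_disks(HEALTH_OUTPUT))
```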

@gthvn1
Contributor

gthvn1 commented Sep 11, 2023

@AtaxyaNetwork can you also update the README.md with examples and provide a unit test?

@gthvn1
Contributor

gthvn1 commented Sep 12, 2023

…ss things

Signed-off-by: Cécile MORANGE <contact@ataxya.net>
@AtaxyaNetwork
Contributor Author

All suggested changes are pushed; I just need to do the unit testing.

@gthvn1 gthvn1 changed the title WiP: Create smartctl plugin to get disk status Create smartctl plugin to get disk status Sep 12, 2023
Signed-off-by: Cécile MORANGE <contact@ataxya.net>
@AtaxyaNetwork
Contributor Author

Suggested changes are pushed, with the unit testing. Thanks, @gthvn1, for the help!

Member

@stormi stormi left a comment


Just a minor typo. Otherwise, looks good to me, on the condition that someone has tested or will test the error handling (and if you have leads on getting it unit-tested or "xcp-ng-tests"-tested, even better).

Contributor

@benjamreis benjamreis left a comment


LGTM after addressing @stormi's typo comment.

@gthvn1
Contributor

gthvn1 commented Sep 14, 2023

I tested manually. If smartctl is not installed, the plugin returns a backtrace that ends with OSError: [Errno 2] No such file or directory, and the error code is set to 2:

Error code: 2
Error parameters: No such file or directory, None, Traceback (most recent call last):
  File "/etc/xapi.d/plugins/xcpngutils/__init__.py", line 119, in wrapper
    return func(*args, **kwds)
  File "/etc/xapi.d/plugins/smartctl.py", line 15, in _list_disks
    result = run_command(['smarttctl', '--scan'])
  File "/etc/xapi.d/plugins/xcpngutils/__init__.py", line 67, in run_command
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

If smartctl returns a non-zero exit status, that is reported as well:

Error code: -1
Error parameters: Command '['smartctl', '--scan']' returned non-zero exit status 1, , Traceback (most recent call last):
  File "/etc/xapi.d/plugins/xcpngutils/__init__.py", line 119, in wrapper
    return func(*args, **kwds)
  File "/etc/xapi.d/plugins/smartctl.py", line 15, in _list_disks
    result = run_command(['smartctl', '--scan'])
  File "/etc/xapi.d/plugins/xcpngutils/__init__.py", line 71, in run_command
    raise subprocess.CalledProcessError(code, command, None)
CalledProcessError: Command '['smartctl', '--scan']' returned non-zero exit status 1

I'm looking to add this in unittest...
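
As a starting point, the two failure modes above can be reproduced without smartctl installed. The sketch below is an assumption-laden stand-in, not the plugin's code: `run_command` here is a local reimplementation modeled on the calls visible in the tracebacks, `classify_failure` is a hypothetical helper, and it targets Python 3 whereas the tracebacks come from Python 2.7.

```python
import subprocess
import sys


def run_command(command):
    """Run a command and return its stdout; raise on a non-zero exit status.

    Local stand-in modeled on the helper seen in the tracebacks above.
    """
    process = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    stdout, _stderr = process.communicate()
    if process.returncode != 0:
        raise subprocess.CalledProcessError(process.returncode, command, None)
    return stdout


def classify_failure(command):
    """Return ("missing" | "nonzero" | "ok", code) for a command."""
    try:
        run_command(command)
    except OSError as exc:
        # Raised when the executable cannot be found or started.
        return ("missing", exc.errno)
    except subprocess.CalledProcessError as exc:
        # Raised by run_command when the exit status is not 0.
        return ("nonzero", exc.returncode)
    return ("ok", 0)


# A binary name that should not exist, then a command that exits with 1.
print(classify_failure(["no-such-binary-for-this-sketch"]))
print(classify_failure([sys.executable, "-c", "import sys; sys.exit(1)"]))
```

Using the Python interpreter itself as the failing command avoids depending on any particular tool being present on the test machine.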

Signed-off-by: Cécile MORANGE <contact@ataxya.net>
@gthvn1 gthvn1 requested a review from stormi September 18, 2023 07:12
@stormi stormi merged commit c0778b0 into xcp-ng:master Sep 25, 2023
1 check passed