
fix for hrStorageIndex agility #15028

Merged
merged 5 commits into from Jul 7, 2023
Merged

Conversation

peejaychilds
Contributor


It is possible that HOST-RESOURCES-MIB::hrStorageTable has changed between discovery and polling. If we assume it hasn't, we can end up assigning the values of one mount point to another. This can cause issues not only with display but also with alerting.

We can compare the expected description with the one found at the stored hrStorageIndex; if they are not equal, we filter the table to return the entry whose description does match.
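LibreNMS implements this in PHP (includes/polling/storage/hrstorage.inc.php); the following is a rough, language-neutral sketch in Python of the lookup logic described above. The function name and table shape are illustrative assumptions, not the actual LibreNMS code.

```python
def resolve_storage_entry(table, stored_index, expected_descr):
    """Return the hrStorageTable row matching expected_descr.

    'table' maps hrStorageIndex -> row dict (as walked from
    HOST-RESOURCES-MIB::hrStorageTable). If the row at the index saved
    during discovery no longer carries the expected description, fall
    back to a scan by description -- the "quickfix until discovery runs"
    behaviour from this PR.
    """
    entry = table.get(stored_index)
    if entry is not None and entry['hrStorageDescr'] == expected_descr:
        return entry  # index still valid, use it directly
    # Index moved (e.g. a tmpfs mount shifted the table); re-match by description.
    for index, row in table.items():
        if row['hrStorageDescr'] == expected_descr:
            print(f"Storage {expected_descr} changed index "
                  f"{stored_index} > {index}, applying quickfix until discovery runs")
            return row
    return None  # entry vanished entirely; skip the update this cycle
```

If no entry matches the stored description, the safe choice is to skip the update rather than write another mount point's values into the row.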

We have found that on JunOS EVO, whenever anyone logs into the box it mounts

tmpfs 3.1G 0 3.1G 0% /run/user/{uid}

which changes the ordering of HOST-RESOURCES-MIB::hrStorageTable and causes all sorts of issues, such as storage alarms firing because one partition's values got muddled up with those of a suppressed partition (e.g. a loopback mount) with 0% free space.

Please note

Please read this information carefully. You can run ./lnms dev:check to check your code before submitting.

  • Have you followed our code guidelines?
  • If my Pull Request does some changes/fixes/enhancements in the WebUI, I have inserted a screenshot of it.
  • If my Pull Request makes discovery/polling/yaml changes, I have added/updated test data.

Testers

If you would like to test this pull request then please run: ./scripts/github-apply <pr_id>, e.g. ./scripts/github-apply 5926
After you are done testing, you can remove the changes with ./scripts/github-remove. If there are schema changes, you can ask on discord how to revert.

@PJGuyTen

PJGuyTen commented May 9, 2023

Thank you for looking into this! I've been trying to figure out a fix for this for many versions!

@Jellyfrog
Member

Jellyfrog commented May 9, 2023

Would it be possible to index on something else, like mountpoint instead? Guessing it breaks the design too much?

@peejaychilds
Contributor Author

Would it be possible to index on something else, like mountpoint instead? Guessing it breaks the design too much?

HOST-RESOURCES-MIB::hrStorageTable doesn't have the mount point unfortunately.

    29 => 
    array (
      'hrStorageIndex' => '29',
      'hrStorageType' => 'hrStorageFixedDisk',
      'hrStorageDescr' => 're0:/dev/sda6, mounted on /config',
      'hrStorageAllocationUnits' => '4096',
      'hrStorageSize' => '251878',
      'hrStorageUsed' => '375',
      'hrStorageAllocationFailures' => '0',
    ),
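Since the table above has no dedicated mount-point column, indexing by mount point would mean parsing it out of hrStorageDescr, whose format is vendor-specific. A hedged sketch in Python (the helper name and regex are illustrative assumptions) shows why that is fragile as a stable key:

```python
import re

def mount_point_from_descr(descr):
    """Best-effort mount point extraction from hrStorageDescr.

    JunOS EVO embeds it as '..., mounted on /path'; net-snmp on Linux
    typically reports the bare path (e.g. '/boot'); other agents report
    labels like 'Memory buffers' with no path at all. The format is
    implementation-specific, which is why hrStorageDescr cannot be
    parsed uniformly across vendors.
    """
    m = re.search(r'mounted on (\S+)$', descr)
    if m:
        return m.group(1)
    return descr if descr.startswith('/') else None
```

This is why the fix matches on the full description string rather than trying to derive a mount point.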

@murrant
Member

murrant commented May 10, 2023

Mounts changing hrStorageIndex is a really annoying bug in your snmpd implementation...

@peejaychilds
Contributor Author

@murrant tested, works OK.

Test with JunOS EVO box

Pre-Test: set device 42 into a non-polling poller group (used group 5, TYO)
Test Command - ./poller.php -d -v -h 42 -m storage
Examine re0:/dev/sda1, mounted on /boot (id=1579)

Control Test
On one dispatcher perform the following

  1. 'unpatch my patch' >> git checkout includes/polling/storage/hrstorage.inc.php
  2. run ./poller.php -d -v -h 42 -m storage > hr-test1
  3. log user into device
  4. run ./poller.php -d -v -h 42 -m storage > hr-test2
  5. log user out of device
  6. run ./poller.php -d -v -h 42 -m storage > hr-test3
grep 1579] hr-te*
hr-test1:SQL[UPDATE `storage` set `storage_used`=?,`storage_free`=?,`storage_size`=?,`storage_units`=?,`storage_perc`=? WHERE `storage_id` = ? ["19242496","186197504","205440000",512,9,1579] 1.07ms]
hr-test2:SQL[UPDATE `storage` set `storage_used`=?,`storage_free`=?,`storage_size`=?,`storage_units`=?,`storage_perc`=? WHERE `storage_id` = ? ["39051264","0","39051264",8192,100,1579] 0.91ms]
hr-test3:SQL[UPDATE `storage` set `storage_used`=?,`storage_free`=?,`storage_size`=?,`storage_units`=?,`storage_perc`=? WHERE `storage_id` = ? ["19242496","186197504","205440000",512,9,1579] 1.03ms]
grep "/boot" hr-te* | grep :hrStorage
hr-test1:hrStorageDescr.25 = re0:/dev/sda1, mounted on /boot
hr-test2:hrStorageDescr.26 = re0:/dev/sda1, mounted on /boot
hr-test3:hrStorageDescr.25 = re0:/dev/sda1, mounted on /boot

Results -> Notice the index changes, and storage_used, storage_free, etc. follow the index, so they are set incorrectly in 'test2' for id=1579.

Patch Test

On second dispatcher perform the following

  1. 'unpatch' >> git checkout includes/polling/storage/hrstorage.inc.php
  2. apply new patch via ./scripts/github-apply 15028
  3. run ./poller.php -d -v -h 42 -m storage > hr-test1
  4. log user into device
  5. run ./poller.php -d -v -h 42 -m storage > hr-test2
  6. log user out of device
  7. run ./poller.php -d -v -h 42 -m storage > hr-test3
grep 1579] hr-te*
hr-test1:SQL[UPDATE `storage` set `storage_used`=?,`storage_free`=?,`storage_size`=?,`storage_units`=?,`storage_perc`=? WHERE `storage_id` = ? ["19242496","186197504","205440000",512,9,1579] 1.37ms]
hr-test2:SQL[UPDATE `storage` set `storage_used`=?,`storage_free`=?,`storage_size`=?,`storage_units`=?,`storage_perc`=? WHERE `storage_id` = ? ["19242496","186197504","205440000",512,9,1579] 0.46ms]
hr-test3:SQL[UPDATE `storage` set `storage_used`=?,`storage_free`=?,`storage_size`=?,`storage_units`=?,`storage_perc`=? WHERE `storage_id` = ? ["19242496","186197504","205440000",512,9,1579] 0.83ms]
grep "/boot" hr-te* | grep :hrStorage
hr-test1:hrStorageDescr.25 = re0:/dev/sda1, mounted on /boot
hr-test2:hrStorageDescr.26 = re0:/dev/sda1, mounted on /boot
hr-test3:hrStorageDescr.25 = re0:/dev/sda1, mounted on /boot
 grep quick hr*
hr-test2:Storage re0:/dev/sda5, mounted on /etc changed index 27 > 28, applying quickfix until discovery runs
hr-test2:Storage re0:/dev/sda6, mounted on /config changed index 28 > 29, applying quickfix until discovery runs
hr-test2:Storage re0:/dev/sda7, mounted on /var changed index 29 > 30, applying quickfix until discovery runs
hr-test2:Storage re0:/dev/sda2, mounted on /soft changed index 26 > 27, applying quickfix until discovery runs
hr-test2:Storage re0:/dev/loop0, mounted on /data/var/external changed index 9 > 10, applying quickfix until discovery runs
hr-test2:Storage re0:/dev/sda1, mounted on /boot changed index 25 > 26, applying quickfix until discovery runs

Results -> Notice the index changes, HOWEVER storage_used, storage_free, etc. follow the description, so they are set correctly in 'test2' for id=1579.

Post-Test: set the device back to poller group 0.

@murrant murrant merged commit 7e22b12 into librenms:master Jul 7, 2023
8 checks passed
@librenms-bot

This pull request has been mentioned on LibreNMS Community. There might be relevant details there:

https://community.librenms.org/t/23-7-0-changelog/21841/1

TheMysteriousX pushed a commit to TheMysteriousX/librenms that referenced this pull request Aug 9, 2023
* fix for hrStorageIndex agility

* test for array

* Handle not found data

* Handle description changed correctly

* remove debug

---------

Co-authored-by: Tony Murray <murraytony@gmail.com>
peejaychilds added a commit to peejaychilds/librenms that referenced this pull request Oct 26, 2023
* fix for hrStorageIndex agility

* test for array

* Handle not found data

* Handle description changed correctly

* remove debug

---------

Co-authored-by: Tony Murray <murraytony@gmail.com>
@peejaychilds peejaychilds deleted the hrstorage_fix branch November 22, 2023 02:32