Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data updates with past timestamps can cause boost errors #2977

Closed
eschoeller opened this issue Sep 26, 2019 · 31 comments
Closed

Data updates with past timestamps can cause boost errors #2977

eschoeller opened this issue Sep 26, 2019 · 31 comments
Labels
bug Undesired behaviour resolved A fixed issue
Milestone

Comments

@eschoeller
Copy link

eschoeller commented Sep 26, 2019

The title may not reflect what's actually occurring here, so feel free to change it. I've seen this behavior crop up once or twice now, once in 1.2.4 and now recently on 1.2.6 (with recent boost updates from 1.2.x branch). It's very infrequent, and I have very little idea of how to troubleshoot it or resolve it, but I wanted to make sure you were aware that it has occurred.

Here is some output I'm seeing in the boost.log which is of concern:

ERROR: unknown DS name '1569353360'
ERROR: unknown DS name '1569353358'
2019/09/24 13:45:10 - SYSTEM BOOST STATS: Time:38.3000 RRDUpdates:198246
ERROR: Not enough arguments
ERROR: Not enough arguments
ERROR: Not enough arguments
ERROR: Not enough arguments
ERROR: Not enough arguments
ERROR: Not enough arguments
ERROR: Not enough arguments
ERROR: opening '1569354366:3185:3185': No such file or directory
ERROR: opening '1569354366:0.00:0.00:0.35:0.02': No such file or directory
ERROR: opening '1569354366:0.00:0.00:0.00:0.00': No such file or directory
ERROR: opening '1569354366:0.00:0.00:0.00:0.00': No such file or directory
ERROR: opening '1569354366:0.00:0.00:0.32:0.00': No such file or directory
ERROR: opening '1569354366:0.00:0.00:0.00:0.00': No such file or directory
ERROR: opening '1569354366:0.00:0.00:0.00:0.00': No such file or directory
ERROR: opening '1569354366:0.00:0.00:0.00:0.00': No such file or directory
ERROR: opening '1569354366:0.00:0.00:0.05:0.00': No such file or directory
ERROR: opening '1569354366:0:16:0:8:0:1154386:0:0:0:0:24:0:0:0:30:0': No such file or directory
ERROR: opening '1569354366:67:0:1169973:1029509:1016:1015:1058477:970532:95907:96125': No such file or directory
ERROR: opening '1569354366:0:0:0:0:0:0:259:702:0:0:0:0:0:6899:0:0:0:0:0:0:1016:0:0:0:0': No such file or directory
ERROR: opening '1569354366:30460:898:2:24083:0:26333:2641:1977': No such file or directory
2019/09/24 14:01:08 - SYSTEM BOOST STATS: Time:39.0000 RRDUpdates:195759

This is just a snippet from one bad boost run that's occurred since 9/22. But that run itself had lots of OK messages, it just got tripped up towards the end.

@cigamit
Copy link
Member

cigamit commented Sep 26, 2019

Are you removing Graphs between updates? I'm thinking we have an issue when Data Sources are removed while boost is in flight. What do you think?

@cigamit cigamit changed the title [1.2.6] Boost command overflows Updates to boost fail with various error messages when timespamp is in the past Sep 26, 2019
cigamit added a commit that referenced this issue Sep 26, 2019
Updates to boost fail with various error messages when timespamp is in the past
@cigamit cigamit added bug Undesired behaviour resolved A fixed issue labels Sep 26, 2019
@cigamit
Copy link
Member

cigamit commented Sep 26, 2019

@eschoeller, thanks for the QA. Anything else. Just keep logging.

@cigamit cigamit added this to the v1.2.7 milestone Sep 26, 2019
@eschoeller
Copy link
Author

eschoeller commented Sep 26, 2019

Applied the updates. I'll keep an eye on this, thanks!

And no, I am certain that I did not remove any graphs or devices during that boost run. The only change that occurred that day was the addition of three Linux servers, sometime between 11:45AM and noon (I have an audit report generated every 15 minutes).

@cigamit
Copy link
Member

cigamit commented Sep 26, 2019

I was experiencing the issue. So, I'm like 100% percent certain it was fixed, cause it was fixed for me, and I found the root cause in the logic.

@cigamit cigamit removed the resolved A fixed issue label Sep 26, 2019
@cigamit
Copy link
Member

cigamit commented Sep 26, 2019

Well, I just tested this again, and getting more errors. Joy.

cigamit added a commit that referenced this issue Sep 26, 2019
cigamit added a commit that referenced this issue Sep 26, 2019
I think it's fixed now.  Will check in the morning.
cigamit added a commit that referenced this issue Sep 27, 2019
This restores the lastupdate check for boost.  Performance will not be as good, but it's better to have good updates.
@cigamit
Copy link
Member

cigamit commented Sep 27, 2019

Just did another commit. I reverted the check for the rrdupdate time. Best to just used RRDtool 1.5 and ensure that you select that under Console > Configuration > Settings > General

@eschoeller
Copy link
Author

eschoeller commented Sep 27, 2019 via email

@eschoeller
Copy link
Author

eschoeller commented Sep 27, 2019

Alright. I updated boost.php and poller_boost.php. I went through a boost run. I'm still seeing these items in the log:

ERROR: /cacti/cacti-1.2.6-prod/rra/thorn-a_procs_71179.rrd: found extra data on update argument: all
ERROR: /cacti/cacti-1.2.6-prod/rra/thorn-a_cswchs_71180.rrd: found extra data on update argument: all
ERROR: /cacti/cacti-1.2.6-prod/rra/thorn-a_runq_sz_71181.rrd: found extra data on update argument: all

I wasn't having this type of problem prior to yesterday. Let me know what you want to do. In this particular case I do believe that the "CPU:all" should have never been there in the first place, and I could get our systems folks to update that script everywhere, but I know there has been a change in functionality that could have reaching impacts into other areas.

I'm also still seeing stuff like this:

ERROR: /cacti/cacti-1.2.6-prod/rra/thorn-b_time_71302.rrd: expected 2 data source readings (got 1) from RRDUpdates

From a script output like:

2019/09/26 17:41:09 - SPINE: Poller[1] PID[15875] Device[1] HT[5] DS[59733] SS[3] SERVER: /cacti/cacti-1.2.6-prod/scripts/ss_poller.php ss_boost, output: Time:36.4200 RRDUpdates:378596

Which seems completely compliant to me

@netniV
Copy link
Member

netniV commented Sep 27, 2019

No, when we say RRDTool 1.5, it means as a minimum

@cigamit
Copy link
Member

cigamit commented Sep 27, 2019

@netniV is correct. RRDtool less than 1.5 causes all sorts of extra work for boost.

@eschoeller
Copy link
Author

OK sounds good thanks for the clarification. I’ll stay on the rrdtool version I’m on. So what do we need to do about this issue? I’ll be back at a PC in an hour or so and can report back on what has transpired in the logs.

@eschoeller
Copy link
Author

I'm still seeing all the same problems as before.

@netniV netniV changed the title Updates to boost fail with various error messages when timespamp is in the past Data updates with past timestamps can cause boost errors Sep 28, 2019
@cigamit
Copy link
Member

cigamit commented Sep 28, 2019

I've reproduced this behavior and traced the root cause. Have to check the code to resolve now.

ERROR: /cacti/cacti-1.2.6-prod/rra/thorn-b_time_71302.rrd: expected 2 data source readings (got 1) from RRDUpdates

cigamit added a commit that referenced this issue Sep 28, 2019
This will resolve the issue around template problems.
@cigamit
Copy link
Member

cigamit commented Sep 28, 2019

Okay, all fixed up now.

@eschoeller
Copy link
Author

eschoeller commented Sep 28, 2019

Applied the most recent updates. Went through a boost run. Here were the results:

2019/09/28 13:19:21 - SYSTEM BOOST STATS: Time:42.5700 RRDUpdates:396126
ERROR: /cacti/cacti-1.2.6-prod/rra/unitleavefluid_9365.rrd: expected 3 data source readings (got 2) from 1569698365
ERROR: /cacti/cacti-1.2.6-prod/rra/unitthreshreturn_9381.rrd: expected 2 data source readings (got 1) from 1569698368
ERROR: /cacti/cacti-1.2.6-prod/rra/unitrackinlet3_9382.rrd: expected 7 data source readings (got 6) from 1569699566
ERROR: /cacti/cacti-1.2.6-prod/rra/unitrackinlet3_9410.rrd: expected 7 data source readings (got 5) from 1569699867
ERROR: /cacti/cacti-1.2.6-prod/rra/unitrackinlet3_9424.rrd: expected 7 data source readings (got 4) from 1569700047
ERROR: /cacti/cacti-1.2.6-prod/rra/unitcooldemand_9453.rrd: expected 2 data source readings (got 1) from 1569699208
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedphasecurrent3_15783.rrd: expected 3 data source readings (got 2) from 1569699220
ERROR: /cacti/cacti-1.2.6-prod/rra/temperature_15793.rrd: expected 2 data source readings (got 1) from 1569700120
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedload3_16516.rrd: expected 3 data source readings (got 2) from 1569699273
ERROR: /cacti/cacti-1.2.6-prod/rra/main_kwh_del_count_16572.rrd: not a simple unsigned integer: '1535275.000000'
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedload3_16653.rrd: expected 3 data source readings (got 2) from 1569700120
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedvoltage3_16661.rrd: expected 3 data source readings (got 2) from 1569700120
ERROR: /cacti/cacti-1.2.6-prod/rra/traffic_in_16744.rrd: expected 2 data source readings (got 1) from 1569698354
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedphasecurrent3_17610.rrd: expected 3 data source readings (got 2) from 1569698317
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedcrest3_18123.rrd: expected 3 data source readings (got 2) from 1569700119
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedload3_18132.rrd: expected 3 data source readings (got 1) from 1569700119
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedphasevoltage3_18139.rrd: expected 3 data source readings (got 1) from 1569700119
ERROR: /cacti/cacti-1.2.6-prod/rra/snmpinnosuchnames_19390.rrd: expected 25 data source readings (got 24) from 1569698473
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedload3_27130.rrd: expected 3 data source readings (got 1) from 1569700120
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedload3_29327.rrd: expected 3 data source readings (got 2) from 1569699384
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedload3_31628.rrd: expected 3 data source readings (got 2) from 1569699037
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedload3_51155.rrd: expected 3 data source readings (got 2) from 1569699744
ERROR: /cacti/cacti-1.2.6-prod/rra/branchcurrentutil1_63010.rrd: expected 6 data source readings (got 4) from 1569698317
ERROR: /cacti/cacti-1.2.6-prod/rra/established_71306.rrd: expected 9 data source readings (got 1) from 1569698347
2019/09/28 13:49:32 - SYSTEM BOOST STATS: Time:50.9000 RRDUpdates:384781

Significantly less errors. I'll keep watching it. Note there is a 'not a simple unsigned integer' error mixed in there. I have not reported that, and it comes through consistently on every run. I think that's on me to fix it though.

@cigamit
Copy link
Member

cigamit commented Sep 28, 2019

For each of of these, just before the error, can you also post the RRDtool update debug item? Also, take a snapshot of your boost table before you force the boost run.

create table testing like poller_output_boost;
insert into testing select * from poller_output_boost;

Then, when you get the error, let's look at the items in the testing table, and compare to the errors in the boost log. So, when you do this next time:

  1. Create the testing table
  2. Run poller_boost.php --force >> /somelog 2>&1
  3. Edit the log and look for the errors and the local_data_id
  4. Select * from testing where local_data_id = ?
  5. Post the debug for that local data id, and the contents to the select here.

@cigamit
Copy link
Member

cigamit commented Sep 28, 2019

Any chance of getting that new data today?

@eschoeller
Copy link
Author

Sorry just getting back to a computer right now. I'll see what I can do

@eschoeller
Copy link
Author

eschoeller commented Sep 29, 2019

OK here's one. This could end up being a lot of output if I do all of them

mysql> select * from testing where local_data_id=9281;
+---------------+------------------+---------------------+--------+
| local_data_id | rrd_name         | time                | output |
+---------------+------------------+---------------------+--------+
|          9281 | unitMaxInlet     | 2019-09-28 21:16:27 | 726    |
|          9281 | unitRackInlet1   | 2019-09-28 21:16:27 | 726    |
|          9281 | unitRackInlet2   | 2019-09-28 21:16:27 | 716    |
|          9281 | unitRackInlet3   | 2019-09-28 21:16:27 | 714    |
|          9281 | unitSupplyAir    | 2019-09-28 21:16:27 | 712    |
|          9281 | unitThreshInlet  | 2019-09-28 21:16:27 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:16:27 | 850    |
|          9281 | unitMaxInlet     | 2019-09-28 21:17:24 | 726    |
|          9281 | unitRackInlet1   | 2019-09-28 21:17:24 | 726    |
|          9281 | unitRackInlet2   | 2019-09-28 21:17:24 | 716    |
|          9281 | unitRackInlet3   | 2019-09-28 21:17:24 | 714    |
|          9281 | unitSupplyAir    | 2019-09-28 21:17:24 | 712    |
|          9281 | unitThreshInlet  | 2019-09-28 21:17:24 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:17:24 | 850    |
|          9281 | unitMaxInlet     | 2019-09-28 21:18:25 | 725    |
|          9281 | unitRackInlet1   | 2019-09-28 21:18:25 | 725    |
|          9281 | unitRackInlet2   | 2019-09-28 21:18:25 | 715    |
|          9281 | unitRackInlet3   | 2019-09-28 21:18:25 | 713    |
|          9281 | unitSupplyAir    | 2019-09-28 21:18:25 | 712    |
|          9281 | unitThreshInlet  | 2019-09-28 21:18:25 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:18:25 | 850    |
|          9281 | unitMaxInlet     | 2019-09-28 21:19:25 | 724    |
|          9281 | unitRackInlet1   | 2019-09-28 21:19:25 | 723    |
|          9281 | unitRackInlet2   | 2019-09-28 21:19:25 | 715    |
|          9281 | unitRackInlet3   | 2019-09-28 21:19:25 | 714    |
|          9281 | unitSupplyAir    | 2019-09-28 21:19:25 | 712    |
|          9281 | unitThreshInlet  | 2019-09-28 21:19:25 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:19:25 | 850    |
|          9281 | unitMaxInlet     | 2019-09-28 21:20:24 | 726    |
|          9281 | unitRackInlet1   | 2019-09-28 21:20:24 | 726    |
|          9281 | unitRackInlet2   | 2019-09-28 21:20:24 | 714    |
|          9281 | unitRackInlet3   | 2019-09-28 21:20:24 | 713    |
|          9281 | unitSupplyAir    | 2019-09-28 21:20:24 | 711    |
|          9281 | unitThreshInlet  | 2019-09-28 21:20:24 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:20:24 | 850    |
|          9281 | unitMaxInlet     | 2019-09-28 21:21:24 | 724    |
|          9281 | unitRackInlet1   | 2019-09-28 21:21:24 | 724    |
|          9281 | unitRackInlet2   | 2019-09-28 21:21:24 | 713    |
|          9281 | unitRackInlet3   | 2019-09-28 21:21:24 | 714    |
|          9281 | unitSupplyAir    | 2019-09-28 21:21:24 | 711    |
|          9281 | unitThreshInlet  | 2019-09-28 21:21:24 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:21:24 | 850    |
|          9281 | unitMaxInlet     | 2019-09-28 21:22:26 | 725    |
|          9281 | unitRackInlet1   | 2019-09-28 21:22:26 | 725    |
|          9281 | unitRackInlet2   | 2019-09-28 21:22:26 | 715    |
|          9281 | unitRackInlet3   | 2019-09-28 21:22:26 | 714    |
|          9281 | unitSupplyAir    | 2019-09-28 21:22:26 | 710    |
|          9281 | unitThreshInlet  | 2019-09-28 21:22:26 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:22:26 | 850    |
|          9281 | unitMaxInlet     | 2019-09-28 21:23:26 | 723    |
|          9281 | unitRackInlet1   | 2019-09-28 21:23:26 | 723    |
|          9281 | unitRackInlet2   | 2019-09-28 21:23:26 | 714    |
|          9281 | unitRackInlet3   | 2019-09-28 21:23:26 | 715    |
|          9281 | unitSupplyAir    | 2019-09-28 21:23:26 | 709    |
|          9281 | unitThreshInlet  | 2019-09-28 21:23:26 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:23:26 | 850    |
|          9281 | unitMaxInlet     | 2019-09-28 21:24:25 | 723    |
|          9281 | unitRackInlet1   | 2019-09-28 21:24:25 | 723    |
|          9281 | unitRackInlet2   | 2019-09-28 21:24:25 | 713    |
|          9281 | unitRackInlet3   | 2019-09-28 21:24:25 | 712    |
|          9281 | unitSupplyAir    | 2019-09-28 21:24:25 | 706    |
|          9281 | unitThreshInlet  | 2019-09-28 21:24:25 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:24:25 | 850    |
|          9281 | unitMaxInlet     | 2019-09-28 21:25:25 | 722    |
|          9281 | unitRackInlet1   | 2019-09-28 21:25:25 | 722    |
|          9281 | unitRackInlet2   | 2019-09-28 21:25:25 | 711    |
|          9281 | unitRackInlet3   | 2019-09-28 21:25:25 | 711    |
|          9281 | unitSupplyAir    | 2019-09-28 21:25:25 | 705    |
|          9281 | unitMaxInlet     | 2019-09-28 21:26:19 | 723    |
|          9281 | unitRackInlet1   | 2019-09-28 21:26:19 | 723    |
|          9281 | unitRackInlet2   | 2019-09-28 21:26:19 | 710    |
|          9281 | unitRackInlet3   | 2019-09-28 21:26:19 | 711    |
|          9281 | unitSupplyAir    | 2019-09-28 21:26:19 | 705    |
|          9281 | unitThreshInlet  | 2019-09-28 21:26:19 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:26:19 | 850    |
|          9281 | unitMaxInlet     | 2019-09-28 21:27:27 | 722    |
|          9281 | unitRackInlet1   | 2019-09-28 21:27:27 | 722    |
|          9281 | unitRackInlet2   | 2019-09-28 21:27:27 | 709    |
|          9281 | unitRackInlet3   | 2019-09-28 21:27:27 | 712    |
|          9281 | unitSupplyAir    | 2019-09-28 21:27:27 | 704    |
|          9281 | unitThreshInlet  | 2019-09-28 21:27:27 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:27:27 | 850    |
|          9281 | unitMaxInlet     | 2019-09-28 21:28:25 | 723    |
|          9281 | unitRackInlet1   | 2019-09-28 21:28:25 | 723    |
|          9281 | unitRackInlet2   | 2019-09-28 21:28:25 | 710    |
|          9281 | unitRackInlet3   | 2019-09-28 21:28:25 | 710    |
|          9281 | unitSupplyAir    | 2019-09-28 21:28:25 | 704    |
|          9281 | unitThreshInlet  | 2019-09-28 21:28:25 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:28:25 | 850    |
|          9281 | unitMaxInlet     | 2019-09-28 21:29:27 | 721    |
|          9281 | unitRackInlet1   | 2019-09-28 21:29:27 | 721    |
|          9281 | unitRackInlet2   | 2019-09-28 21:29:27 | 711    |
|          9281 | unitRackInlet3   | 2019-09-28 21:29:27 | 712    |
|          9281 | unitSupplyAir    | 2019-09-28 21:29:27 | 706    |
|          9281 | unitThreshInlet  | 2019-09-28 21:29:27 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:29:27 | 850    |
|          9281 | unitMaxInlet     | 2019-09-28 21:30:26 | 723    |
|          9281 | unitRackInlet1   | 2019-09-28 21:30:26 | 723    |
|          9281 | unitRackInlet2   | 2019-09-28 21:30:26 | 712    |
|          9281 | unitRackInlet3   | 2019-09-28 21:30:26 | 713    |
|          9281 | unitSupplyAir    | 2019-09-28 21:30:26 | 707    |
|          9281 | unitThreshInlet  | 2019-09-28 21:30:26 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:30:26 | 850    |
|          9281 | unitMaxInlet     | 2019-09-28 21:31:25 | 723    |
|          9281 | unitRackInlet1   | 2019-09-28 21:31:25 | 723    |
|          9281 | unitRackInlet2   | 2019-09-28 21:31:25 | 713    |
|          9281 | unitRackInlet3   | 2019-09-28 21:31:25 | 714    |
|          9281 | unitSupplyAir    | 2019-09-28 21:31:25 | 708    |
|          9281 | unitThreshInlet  | 2019-09-28 21:31:25 | 850    |
|          9281 | unitThreshSupply | 2019-09-28 21:31:25 | 850    |
+---------------+------------------+---------------------+--------+
110 rows in set (0.01 sec)
DEBUG: Updating Local Data Id:'9281', Template:unitMaxInlet:unitRackInlet1:unitRackInlet2:unitRackInlet3:unitSupplyAir:unitThreshInlet:unitThreshSupply, Output: 1569726987:726:726:716:714:712:850:850 1569727044:726:726:716:714:712:850:850 1569727105:725:725:715:713:712:850:850 1569727165:724:723:715:714:712:850:850 1569727224:726:726:714:713:711:850:850 1569727284:724:724:713:714:711:850:850 1569727346:725:725:715:714:710:850:850 1569727406:723:723:714:715:709:850:850 1569727465:723:723:713:712:706:850:850 1569727525:722:722:711:711:705 1569727579:723:723:710:711:705:850:850 1569727647:722:722:709:712:704:850:850 1569727705:723:723:710:710:704:850:850 1569727767:721:721:711:712:706:850:850 1569727826:723:723:712:713:707:850:850 1569727885:723:723:713:714:708:850:850 1569727945:722:723:715:714:709:850:850
ERROR: /cacti/cacti-1.2.6-prod/rra/unitrackinlet3_9281.rrd: expected 7 data source readings (got 5) from 1569727525

@eschoeller
Copy link
Author

Indeed the 1569727525 data set appears to be missing the unitThreshInlet and unitThreshSupply

@eschoeller
Copy link
Author

Overall errors from that run:

ERROR: /cacti/cacti-1.2.6-prod/rra/unitrackinlet3_9281.rrd: expected 7 data source readings (got 5) from 1569727525
ERROR: /cacti/cacti-1.2.6-prod/rra/unitthreshreturn_9409.rrd: expected 2 data source readings (got 1) from 1569727348
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedload3_14553.rrd: expected 3 data source readings (got 2) from 1569727997
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedload3_14563.rrd: expected 3 data source readings (got 2) from 1569727114
ERROR: /cacti/cacti-1.2.6-prod/rra/kwh_del_count_16572.rrd: not a simple unsigned integer: '1536856.250000'
ERROR: /cacti/cacti-1.2.6-prod/rra/traffic_in_16744.rrd: expected 2 data source readings (got 1) from 1569727153
ERROR: /cacti/cacti-1.2.6-prod/rra/errors_in_16816.rrd: expected 4 data source readings (got 2) from 1569727273
ERROR: /cacti/cacti-1.2.6-prod/rra/traffic_in_16817.rrd: expected 2 data source readings (got 1) from 1569727273
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedload3_17623.rrd: expected 3 data source readings (got 2) from 1569727111
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedphasecurrent3_18164.rrd: expected 3 data source readings (got 1) from 1569727997
ERROR: /cacti/cacti-1.2.6-prod/rra/infeedload3_29368.rrd: expected 3 data source readings (got 1) from 1569728000
ERROR: /cacti/cacti-1.2.6-prod/rra/temperature_29387.rrd: expected 2 data source readings (got 1) from 1569728000
ERROR: /cacti/cacti-1.2.6-prod/rra/linecurrent1_48855.rrd: expected 3 data source readings (got 2) from 1569727998
ERROR: /cacti/cacti-1.2.6-prod/rra/tcpretranssegs_52378.rrd: expected 8 data source readings (got 5) from 1569727999
ERROR: /cacti/cacti-1.2.6-prod/rra/phasepowerfactor1_59248.rrd: expected 3 data source readings (got 2) from 1569727998
ERROR: /cacti/cacti-1.2.6-prod/rra/established_71306.rrd: expected 9 data source readings (got 1) from 1569726966

@eschoeller
Copy link
Author

I've mailed the entire testing table and boost output log to you.

@cigamit
Copy link
Member

cigamit commented Sep 29, 2019

That data is perfect. I suspect snmp timeouts. Reviewed the one your pasted for applicability. Then, I'll look at the others.

cigamit added a commit that referenced this issue Sep 29, 2019
Data updates with past timestamps can cause boost errors
@cigamit
Copy link
Member

cigamit commented Sep 29, 2019

Okay @eschoeller, all you need to do is update poller_boost.php, and then force a run. If all is working fine, you will have none of the legacy errors. Let us know ASAP.

@cigamit
Copy link
Member

cigamit commented Sep 29, 2019

Should be faster too.

cigamit added a commit that referenced this issue Sep 29, 2019
This change will allow RRDtool to return the lastupdate time of the RRDfile if you are using RRDtool version 1.4 or 1.3, or if using a newer version of RRDtool, but the RRDtool version is not set correctly in Cacti.

Better to simply upgrade to a more recent RRDtool as the different in performance will be pretty high.

I may may one more change to bring back some old code in order to reduce the impact on performance.
cigamit added a commit that referenced this issue Sep 29, 2019
* Update the stored rrdtool version to match the installed version
* This is done to save quite a few cycles if the end user has updated to a newer RRDtool version, and they have forgotten to update their setting in Cacti.
* This was done since performance of boost since it assumed the stored value of RRDtool is valid, and perfoamnce will suffer if that version is 1.4 or below.
@eschoeller
Copy link
Author

Applied the most recent updates
Here are the results from the forced run:

$ egrep 'ERROR|STAT' boost.test.2.log 
ERROR: /cacti/cacti-1.2.6-prod/rra/_main_kwh_del_count_16572.rrd: not a simple unsigned integer: '1539248.250000'
ERROR: /cacti/cacti-1.2.6-prod/rra/_established_71306.rrd: found extra data on update argument: U
2019/09/29 10:01:13 - SYSTEM BOOST STATS: Time:38.6400 RRDUpdates:252925

Significantly less errors ... but I do agree with you, the missing data is likely SNMP timeouts, so what's happening now? I suspect I still have missing data here and there, how is it all being handled?

@cigamit
Copy link
Member

cigamit commented Sep 30, 2019

So, for these last two errors, they are corner cases, for the first issue. Take a look at the boost table, run this query:

SELECT * FROM poller_output_boost WHERE local_data_id = 16572;

It is an integer or a float?

For the second issue, compare the data sources in the rrdtool info command to it's corresponding Data Template, and let's see if they are different. Also, for that error, please post the debug output.

Lastly, let's do all this in a new ticket. I think these corner cases still need addressing, but we did take care of just about all of them.

@cigamit cigamit closed this as completed Sep 30, 2019
@eschoeller
Copy link
Author

mysql> SELECT * FROM poller_output_boost WHERE local_data_id = 16572;
+---------------+--------------------+---------------------+----------------+
| local_data_id | rrd_name           | time                | output         |
+---------------+--------------------+---------------------+----------------+
|         16572 | main_kwh_del_count | 2019-10-23 20:23:24 | 1652722.500000 |
|         16572 | main_kwh_del_count | 2019-10-23 20:24:34 | 1652726.375000 |
|         16572 | main_kwh_del_count | 2019-10-23 20:25:33 | 1652729.625000 |
|         16572 | main_kwh_del_count | 2019-10-23 20:26:34 | 1652733.000000 |
|         16572 | main_kwh_del_count | 2019-10-23 20:27:33 | 1652736.250000 |
+---------------+--------------------+---------------------+----------------+
5 rows in set (0.00 sec)

@cigamit
Copy link
Member

cigamit commented Nov 10, 2019

Going to re-open this one. Might as well handle the float issue.

@cigamit cigamit reopened this Nov 10, 2019
@eschoeller
Copy link
Author

Sounds good, thanks:)

@cigamit
Copy link
Member

cigamit commented Nov 29, 2019

@eschoeller, I suggest you change the Data Source type from COUNTER to DCOUNTER for local_data_id 16572. DCOUNTER has been out there for some time now. For the other one with the 'found extra data', it would be nice to get the RRDtool update output when this took place.

@cigamit cigamit closed this as completed Dec 11, 2019
@github-actions github-actions bot locked and limited conversation to collaborators Jun 30, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Undesired behaviour resolved A fixed issue
Projects
None yet
Development

No branches or pull requests

3 participants