Datasource Statistics may obtain invalid data for some rebooted devices #4495

Closed
jdcoats opened this issue Dec 8, 2021 · 11 comments
Labels
bug Undesired behaviour data collection Issues related to data collection resolved A fixed issue
@jdcoats

jdcoats commented Dec 8, 2021

I have been finding that from time to time dsstats will get invalid data that then has to be purged. I have finally found a way to reproduce this issue on demand. It's not always this device or data type, but here is an example of what happens when I reboot this Linux server.

image

2021/12/08 10:25:14 - CMDPHP intropage-hr Backtrace: (/plugins/intropage/intropage.php[39]:process_page_request_variables(), /plugins/intropage/include/functions.php[87]:intropage_detail_panel(), /plugins/intropage/include/functions.php[594]:busiest_traffic_detail(), /plugins/intropage/panellib/busiest.php[1143]:human_readable(), /plugins/intropage/include/functions.php[1579]:cacti_debug_backtrace())
--
2021/12/08 10:25:14 - CMDPHP INTROPAGE WARNING: Bytes = [4.919131751469E+18], Factor = [1024], i = [6] d = [1152921504606846976]
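
For readers checking the numbers in the warning above, here is a rough Python sketch of the arithmetic. It is not the intropage `human_readable()` code; the function name `to_binary_unit` and the unit labels are assumptions made for illustration. It only shows that a byte count of 4.919131751469E+18 with a factor of 1024 lands at index 6 with a divisor of 1152921504606846976 (1024^6), which works out to roughly 4.27 EiB, far more than a single server interface could plausibly move.

```python
def to_binary_unit(num_bytes: float, factor: int = 1024):
    """Scale a byte count down by repeated division by `factor`.

    Returns (scaled_value, unit_label, index, divisor).
    The unit labels are illustrative and not taken from intropage.
    """
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]
    index = 0
    divisor = 1
    # Step up one unit at a time while the value still fits the next unit.
    while num_bytes >= divisor * factor and index < len(units) - 1:
        divisor *= factor
        index += 1
    return num_bytes / divisor, units[index], index, divisor


value, unit, i, d = to_binary_unit(4.919131751469e18)
print(f"{value:.2f} {unit} (i = {i}, d = {d})")
```

For scale, 2**64 is about 1.84E+19, so a value this large is in the range you would expect from a delta taken across a counter reset, although the log itself does not confirm that.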

image

@jdcoats jdcoats added bug Undesired behaviour unverified Some days we don't have a clue labels Dec 8, 2021
@TheWitness TheWitness self-assigned this Dec 9, 2021
@TheWitness
Member

Just finishing this up today on my day off.

@TheWitness TheWitness removed the unverified Some days we don't have a clue label Dec 10, 2021
@TheWitness TheWitness added this to the v1.2.20 milestone Dec 10, 2021
TheWitness added a commit that referenced this issue Dec 10, 2021
* DSStats Does not Scale on Large Systems
* DSStats will get invalid data for some devices after reboot requiring purging of stats

This is way less than what I want to do, but these are all essentially bugs.

There is also a change to lib/boost.php to provide better debug logging when testing.
@TheWitness TheWitness added the resolved A fixed issue label Dec 10, 2021
@TheWitness
Member

@jdcoats, you need to update lib/dsstats.php, poller_dsstats.php and include/global_settings.php. After that see if you can reproduce again. Make sure the device has been up for a while and has a lot of traffic.

This change will implement RRDtool-like clipping for COUNTER and DERIVE type objects. If the resulting value is above the rrd_maximum value, it'll return NULL for the delta. A small gap will appear in the data, but that's fine. Maybe I can improve that. Not today.
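
A minimal Python sketch of the clipping idea described above, for illustration only. It is not the code in lib/dsstats.php: the function name `counter_delta` and its parameters are hypothetical, and the wrap handling is modeled on RRDtool's documented COUNTER behavior (add 2^32, then 2^64 - 2^32, when the difference is negative) rather than on anything stated in this thread. The one part taken from the comment is the final check: a rate above `rrd_maximum` becomes `None` (NULL) instead of being stored.

```python
from typing import Optional

WRAP_32 = 2 ** 32
WRAP_64 = 2 ** 64


def counter_delta(previous: int, current: int, elapsed_seconds: float,
                  rrd_maximum: Optional[float]) -> Optional[float]:
    """Per-second rate for a COUNTER-style source, or None when clipped."""
    if elapsed_seconds <= 0:
        return None
    delta = current - previous
    if delta < 0:
        # Assumed COUNTER semantics: treat a negative difference as a wrap.
        delta += WRAP_32
        if delta < 0:
            delta += WRAP_64 - WRAP_32
    rate = delta / elapsed_seconds
    # Clipping: an implausibly large rate is reported as unknown.
    if rrd_maximum is not None and rate > rrd_maximum:
        return None
    return rate


# Normal poll: 3000 bytes over 300 seconds.
print(counter_delta(1000, 4000, 300, 1.25e9))
# Reboot: the counter restarts near zero, the "wrap" yields a huge rate,
# and the maximum check discards it.
print(counter_delta(4_000_000_000_000, 1_000_000, 300, 1.25e9))
```

The second call is the reboot case from this issue: without the maximum check, the reset would be read as a 64-bit wrap and produce a rate on the order of 1E+16 per second.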

It also addresses #4500, which will allow DSStats to finish sooner. Warning, the PHP processes take a lot of CPU.

@TheWitness TheWitness added the data collection Issues related to data collection label Dec 10, 2021
@jdcoats
Author

jdcoats commented Dec 10, 2021

Thanks! No longer able to reproduce on demand.

@TheWitness
Member

Cool. Ignore my other comment. The new parallel setting should speed things up too.

@TheWitness
Member

Here is without boost.

image

Here is with boost.

image

@jdcoats
Author

jdcoats commented Dec 10, 2021

image

@TheWitness
Member

Cool. You should try like 4 processes and see if it's better.

@jdcoats
Author

jdcoats commented Dec 10, 2021

Before:
image

After:
image

@jdcoats
Author

jdcoats commented Dec 11, 2021

8 processes is about as good as I can get.

image

@TheWitness
Member

I'm going to do a code review tomorrow to see if there is a way to squeeze a few more seconds out. Stay tuned. Glad the problem is fixed though.

@jdcoats
Author

jdcoats commented Dec 13, 2021

perfect

@jdcoats jdcoats closed this as completed Dec 13, 2021
TheWitness added a commit that referenced this issue Dec 21, 2021
* Graph Templating is quite slow on very large systems
* Update ChangeLog for #4500 and #4495 as well
@github-actions github-actions bot locked and limited conversation to collaborators Mar 14, 2022
@netniV netniV changed the title DSSTATS will get invalid data for some devices after reboot requiring purging of stats Datasource Statistics may obtain invalid data for some rebooted devices Apr 3, 2022