Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Moving hosts between data collectors is slow #2978

Closed
eschoeller opened this issue Sep 26, 2019 · 15 comments
Closed

Moving hosts between data collectors is slow #2978

eschoeller opened this issue Sep 26, 2019 · 15 comments
Labels
bug Undesired behaviour resolved A fixed issue
Milestone

Comments

@eschoeller
Copy link

Not sure if there's anything that can be done for this, but it takes a very long time to move multiple devices between data collectors. On the device Management page I select numerous devices and then select the "Change Device Settings" option. I then change the Poller Association.
There is a query that runs for every single device that takes 6 seconds to run:

# Time: 190925 18:59:04
# User@Host: apache[apache] @ localhost []  Id: 16810339
# Query_time: 5.228184  Lock_time: 0.000449 Rows_sent: 191  Rows_examined: 1031541
SET timestamp=1569459544;
SELECT did.* FROM data_input_data AS did INNER JOIN data_template_data AS dtd ON did.data_template_data_id=dtd.id INNER JOIN data_local AS dl ON dl.id=dtd.local_data_id INNER JOIN host AS h ON h.id=dl.host_id WHERE h.id = '728';

So, moving 50 devices from one collector to another takes a very long time.

@cigamit
Copy link
Member

cigamit commented Sep 26, 2019

Run an explain on that same query:

explain SELECT did.* FROM data_input_data AS did INNER JOIN data_template_data AS dtd ON did.data_template_data_id=dtd.id INNER JOIN data_local AS dl ON dl.id=dtd.local_data_id INNER JOIN host AS h ON h.id=dl.host_id WHERE h.id = '728';

@eschoeller
Copy link
Author

+----+-------------+-------+--------+-------------------------------+---------+---------+---------------------------------+--------+-------------+
| id | select_type | table | type   | possible_keys                 | key     | key_len | ref                             | rows   | Extra       |
+----+-------------+-------+--------+-------------------------------+---------+---------+---------------------------------+--------+-------------+
|  1 | SIMPLE      | h     | const  | PRIMARY                       | PRIMARY | 3       | const                           |      1 | Using index |
|  1 | SIMPLE      | did   | ALL    | NULL                          | NULL    | NULL    | NULL                            | 446616 | NULL        |
|  1 | SIMPLE      | dtd   | eq_ref | PRIMARY,local_data_id         | PRIMARY | 3       | cacti.did.data_template_data_id |      1 | NULL        |
|  1 | SIMPLE      | dl    | eq_ref | PRIMARY,host_id_snmp_query_id | PRIMARY | 3       | cacti.dtd.local_data_id         |      1 | Using where |
+----+-------------+-------+--------+-------------------------------+---------+---------+---------------------------------+--------+-------------+
4 rows in set (0.00 sec)

@cigamit
Copy link
Member

cigamit commented Sep 26, 2019

Well, that is a problem.

@cigamit
Copy link
Member

cigamit commented Sep 26, 2019

Try this, and then run that query again:

ALTER TABLE data_input_data ADD INDEX data_template_data_id (data_template_data_id);

@eschoeller
Copy link
Author

mysql> ALTER TABLE data_input_data ADD INDEX data_template_data_id (data_template_data_id);
Query OK, 0 rows affected (3.91 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> explain SELECT did.* FROM data_input_data AS did INNER JOIN data_template_data AS dtd ON did.data_template_data_id=dtd.id INNER JOIN data_local AS dl ON dl.id=dtd.local_data_id INNER JOIN host AS h ON h.id=dl.host_id WHERE h.id = '728';
+----+-------------+-------+-------+-------------------------------+-----------------------+---------+--------------+------+--------------------------+
| id | select_type | table | type  | possible_keys                 | key                   | key_len | ref          | rows | Extra                    |
+----+-------------+-------+-------+-------------------------------+-----------------------+---------+--------------+------+--------------------------+
|  1 | SIMPLE      | h     | const | PRIMARY                       | PRIMARY               | 3       | const        |    1 | Using index              |
|  1 | SIMPLE      | dl    | ref   | PRIMARY,host_id_snmp_query_id | host_id_snmp_query_id | 3       | const        |   25 | Using where; Using index |
|  1 | SIMPLE      | dtd   | ref   | PRIMARY,local_data_id         | local_data_id         | 3       | cacti.dl.id  |    1 | Using index              |
|  1 | SIMPLE      | did   | ref   | data_template_data_id         | data_template_data_id | 3       | cacti.dtd.id |    4 | NULL                     |
+----+-------------+-------+-------+-------------------------------+-----------------------+---------+--------------+------+--------------------------+
4 rows in set (0.00 sec)

@cigamit
Copy link
Member

cigamit commented Sep 26, 2019

No, just run that same query without the explain. How long?

@eschoeller
Copy link
Author

191 rows in set (0.01 sec)

@cigamit
Copy link
Member

cigamit commented Sep 26, 2019

Bingo! Now, try to move a Device.

@eschoeller
Copy link
Author

Yup. Moved 47 devices, nothing in the mysql-slow.log and it completed much faster. Cool!
So, as per usual, somehow I missed a database update along the way ...

@cigamit
Copy link
Member

cigamit commented Sep 26, 2019

Nope, I think we optimized out some indexes thinking things like just having the primary hold the value would be fast. However, that's not how it works.

@cigamit cigamit changed the title [1.2.6] Moving Hosts between Data Collectors is slow Moving Hosts between Data Collectors is slow Sep 26, 2019
@cigamit cigamit added bug Undesired behaviour resolved A fixed issue labels Sep 26, 2019
cigamit added a commit that referenced this issue Sep 26, 2019
Moving Hosts between Data Collectors is slow
@eschoeller
Copy link
Author

Sadly, I think this broke my "hacked" version of graph_image.php that I use for some dashboards. I'm poking around at it right now.

@cigamit
Copy link
Member

cigamit commented Sep 26, 2019

Keep us posted. I doubt the index change had anything to do with it.

@cigamit cigamit added this to the v1.2.7 milestone Sep 26, 2019
@eschoeller
Copy link
Author

eschoeller commented Sep 26, 2019

Somehow it did. The dashboard went blank right after I ran that alter command. Or maybe it was the two boost updates ... but I find that even more remote of a possibility. Beyond me why at this point. But, this version of graph_image.php was from 2013. After every upgrade I've been shocked that it continues to work. I just copied over the most current graph_image and made two small changes to restore it. Works just fine, so this is for the better in the end. One of those ticking timebombs:)

cigamit added a commit that referenced this issue Sep 26, 2019
@eschoeller
Copy link
Author

Another thing I noticed... for each device that is moved, I see this activity:

2019/09/25 20:46:20 - WEBUI NOTE: Table host Replicated to Poller With 1 Rows Updated
2019/09/25 20:46:20 - WEBUI NOTE: Table host_snmp_cache Replicated to Poller With 92 Rows Updated
2019/09/25 20:46:20 - WEBUI NOTE: Table data_local Replicated to Poller With 85 Rows Updated
2019/09/25 20:46:20 - WEBUI NOTE: Table graph_local Replicated to Poller With 77 Rows Updated
2019/09/25 20:46:20 - WEBUI NOTE: Table data_template_data Replicated to Poller With 85 Rows Updated
2019/09/25 20:46:20 - WEBUI NOTE: Table data_template_rrd Replicated to Poller With 319 Rows Updated
2019/09/25 20:46:21 - WEBUI NOTE: Table graph_templates_item Replicated to Poller With 1114 Rows Updated
2019/09/25 20:46:21 - WEBUI NOTE: Table data_input_data Replicated to Poller With 846 Rows Updated

Seems like perhaps the replication should happen just once at the end, instead of for each device?

@cigamit
Copy link
Member

cigamit commented Sep 26, 2019

Hmm. I'll have to check. I think it's just be replicating the impacted rows. I guess we should turn down the logging a bit.

Log a separate issue so that we don't loose it.

@netniV netniV changed the title Moving Hosts between Data Collectors is slow Moving Hosts between data collectors is slow Sep 28, 2019
@netniV netniV changed the title Moving Hosts between data collectors is slow Moving hosts between data collectors is slow Sep 28, 2019
@cigamit cigamit closed this as completed Sep 30, 2019
@github-actions github-actions bot locked and limited conversation to collaborators Jun 30, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Undesired behaviour resolved A fixed issue
Projects
None yet
Development

No branches or pull requests

2 participants