Devices still show in lists despite being deleted #2937

Closed
bmfmancini opened this issue Sep 12, 2019 · 54 comments
Labels: bug (Undesired behaviour), resolved (A fixed issue)

@bmfmancini
Member

Hey guys,

I deleted about 100 devices so that I could re-import them into Cacti via automation. When I try to re-import them, Cacti skips the IPs and reports that the devices are already in Cacti.

I tried the remove_device script with a regex that matched all the sites, and passed --confirm. The script reports success.

Yet AutoM8 still skips the IPs, and if you run the script again the devices are still there!

@netniV
Member

netniV commented Sep 12, 2019

Are you running this on the poller that normally collects from the devices?

@bmfmancini
Member Author

I actually ran it from the main poller first, which showed the devices in the DB. Then I ran it on the remote poller, and that time it didn't show the devices in the DB.

@netniV
Member

netniV commented Sep 12, 2019

I believe that now when a device gets deleted in Cacti, it is flagged as deleted using the deleted column and then this is replicated. This was a change in 1.2 to ensure that deleted devices were properly replicated for deletion rather than simply removing the record only to have it recreated because the main or remote thought it should be there.

I'm wondering if there is a conflict between this deleted flag being set and the record not removed yet, and automation checking if a device exists without filtering on the deleted column.
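
The suspected conflict can be sketched in plain PHP. This is only an illustration, not Cacti's automation code: the `$hosts` array stands in for rows of the `host` table, the function names `device_exists_naive` and `device_exists_active` are hypothetical, and it assumes that a soft-deleted row carries `deleted = 'on'` (the value used in the purge query quoted further down the thread) while a live row has an empty value.

```php
<?php
// Hypothetical stand-in for rows of the host table; not real Cacti data.
// Assumption: deleted = 'on' marks a soft-deleted row, '' a live one.
$hosts = [
    ['id' => 1, 'hostname' => '192.168.1.253', 'deleted' => 'on'],
    ['id' => 2, 'hostname' => '4.2.2.2',       'deleted' => ''],
];

// Naive check: any row with a matching hostname counts,
// even one that is already flagged as deleted.
function device_exists_naive(array $hosts, string $hostname): bool {
    foreach ($hosts as $host) {
        if ($host['hostname'] === $hostname) {
            return true;
        }
    }
    return false;
}

// Filtered check: rows flagged as deleted are ignored.
function device_exists_active(array $hosts, string $hostname): bool {
    foreach ($hosts as $host) {
        if ($host['hostname'] === $hostname && $host['deleted'] !== 'on') {
            return true;
        }
    }
    return false;
}

var_dump(device_exists_naive($hosts, '192.168.1.253'));  // bool(true): a re-import would be skipped
var_dump(device_exists_active($hosts, '192.168.1.253')); // bool(false): a re-import could proceed
```

If the existence check behaves like the naive variant, a device that is flagged but not yet purged would still be reported as "already in Cacti".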

@bmfmancini
Member Author

It's weird. I did a full sync as well and that didn't seem to help.
I restored from a DB backup to resolve it for now, but it's definitely something to look into.

@cigamit
Member

cigamit commented Sep 15, 2019

The only times this could possibly happen are:

  1. When devices are removed while the remote poller is running.
  2. When the main data collector cannot reach the remote database over a MySQL connection.

Case 1) is difficult to completely overcome. We do have a purge command that happens at the bottom of each poller run, but if the remote database is not accessible, the purge will not complete successfully.

@bmfmancini
Member Author

bmfmancini commented Sep 15, 2019 via email

@cigamit
Member

cigamit commented Sep 15, 2019

From the main system, edit the Remote Data Collector and test the connection. Is everything coming back successful?

@bmfmancini
Member Author

bmfmancini commented Sep 15, 2019 via email

@bmfmancini
Member Author

bmfmancini commented Sep 15, 2019 via email

@cigamit cigamit added the unverified Some days we don't have a clue label Sep 15, 2019
@bmfmancini
Member Author

OK, tested.

The main poller is able to reach the remote poller's DB just fine.
I tested this by messing around with the credentials.

Still very weird.

@bmfmancini
Member Author

OK, so I revisited this and something is still not right.

When you delete the device from the main poller it is gone, and you can add the site back manually, but through AutoM8 it will show as still in Cacti.

If you select the hostname from the host table, the entry is still in the DB!

The main poller can reach the remote pollers just fine, and replication works as well.
There are no SQL errors in the log and no errors in the MariaDB log either.

@netniV
Member

netniV commented Oct 6, 2019

That is likely because AutoM8 isn't ignoring devices that are flagged as deleted in the host table. Because of replication, they are not totally removed until all replication is complete.

@bmfmancini
Member Author

bmfmancini commented Oct 6, 2019 via email

@bmfmancini
Member Author

I deleted a device directly from the DB.

Here is the command I ran:

delete from host where hostname like 'IP address';

I re-ran AutoM8 and it picked up the device and re-added it.
I did notice duplicate records of the device that I added manually. I don't think Cacti is marking the device as deleted.

No errors are seen from the SQL call, and there was no trouble issuing the command as the same user that Cacti uses, so it is not a permissions issue.

I am not sure if this is specific to 1.2.5.

@netniV
Member

netniV commented Oct 8, 2019

Deleting the device manually will leave behind data sources and graphs related to it. It will also not sync to the other pollers. You should make sure that you don't have rows in other tables (especially tables with a host_id) that match. If you do, you'll have to start cascading through the tables, or you could end up with other issues where code starts at a data source or graph and works backwards.
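
A rough sketch of that check, in plain PHP so it runs without a database. Everything here is illustrative: `find_orphaned_host_ids` is a hypothetical helper, the arrays stand in for query results, and `data_local` and `graph_local` are used only as examples of tables assumed to carry a `host_id` column.

```php
<?php
// Hypothetical snapshot: ids still present in the host table, and the
// host_id values found in dependent tables (assumed to have host_id).
$hostIds = [1, 2];
$dependentTables = [
    'data_local'  => [1, 2, 3],
    'graph_local' => [1, 3, 3],
];

// Return, per table, the distinct host_id values that no longer
// have a matching row in the host table.
function find_orphaned_host_ids(array $hostIds, array $dependentTables): array {
    $orphans = [];
    foreach ($dependentTables as $table => $referencedIds) {
        $missing = array_values(array_unique(array_diff($referencedIds, $hostIds)));
        if ($missing !== []) {
            $orphans[$table] = $missing;
        }
    }
    return $orphans;
}

// Host id 3 was removed from host by hand, so both tables report it.
print_r(find_orphaned_host_ids($hostIds, $dependentTables));
```

Any table that shows up in the result still references a device that no longer exists and would need to be cleaned up as part of the cascade.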

@netniV
Member

netniV commented Oct 8, 2019

I would do a select on the device you want to delete manually before you do to keep a record of it.

@bmfmancini
Member Author

bmfmancini commented Oct 8, 2019 via email

@bmfmancini
Member Author

This is still ongoing. I have run the audit DB script (all good) and analyze DB (all good).

I still see entries in the DB from deleted devices. I don't think the entries are being marked for deletion.

@bmfmancini
Member Author

Oh and this is on 1.2.7

@cigamit
Member

cigamit commented Nov 10, 2019

You need to be on branch 1.2.x. Also, perform a diff between config.php.dist and config.php, let me know what is different. I really suspect that you are having some database communication problems. Are you sure you don't have a database issue? Test connection comes back good for all collectors?

@bmfmancini
Member Author

bmfmancini commented Nov 10, 2019 via email

@cigamit
Member

cigamit commented Nov 10, 2019

How large is the install? What's your lowest collection frequency? I'll do some testing now that I've installed a remote data collector.

@bmfmancini
Member Author

bmfmancini commented Nov 10, 2019 via email

@bmfmancini
Member Author

OK, a fresh example.

This is a multi-poller setup on 1.2.7, not the same system as in the examples above.
I deleted 2 devices. Cacti shows 0 devices to poll, yet the DB shows the 2 devices and the templates show 2 devices using them. See below.

image

image

image

I forced a sync between the pollers. Here is the output:

2019/11/12 20:19:45 - WEBUI NOTE: All selected Remote Data Collectors in [2] synchronized correctly by user admin

2019/11/12 20:19:45 - CMDPHP NOTE: Table data_input_data Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:45 - CMDPHP NOTE: Table graph_templates_item Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:44 - CMDPHP NOTE: Table data_template_rrd Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:44 - CMDPHP NOTE: Table data_template_data Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:44 - CMDPHP NOTE: Table graph_local Replicated to Remote Poller 2 With 31 Rows Updated
2019/11/12 20:19:44 - CMDPHP NOTE: Table data_local Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:44 - CMDPHP NOTE: Table poller_reindex Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:44 - CMDPHP NOTE: Table host_snmp_cache Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:44 - CMDPHP NOTE: Table host Replicated to Remote Poller 2 With 2 Rows Updated
2019/11/12 20:19:43 - CMDPHP NOTE: Table poller_item Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:43 - CMDPHP NOTE: Table poller_command Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:43 - CMDPHP NOTE: Table host_snmp_query Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:43 - CMDPHP NOTE: Table user_domains_ldap Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:43 - CMDPHP NOTE: Table user_domains Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:43 - CMDPHP NOTE: Table user_auth_realm Replicated to Remote Poller 2 With 28 Rows Updated
2019/11/12 20:19:43 - CMDPHP NOTE: Table user_auth_group_realm Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:43 - CMDPHP NOTE: Table user_auth_group_perms Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:42 - CMDPHP NOTE: Table user_auth_group_members Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:42 - CMDPHP NOTE: Table user_auth_group Not Replicated to Remote Poller 2 Due to No Rows Found
2019/11/12 20:19:42 - CMDPHP NOTE: Table user_auth Replicated to Remote Poller 2 With 2 Rows Updated
2019/11/12 20:19:42 - CMDPHP NOTE: Table version Replicated to Remote Poller 2 With 1 Rows Updated
2019/11/12 20:19:42 - CMDPHP NOTE: Table poller Replicated to Remote Poller 2 With 2 Rows Updated
2019/11/12 20:19:42 - CMDPHP NOTE: Table data_input_fields Replicated to Remote Poller 2 With 98 Rows Updated
2019/11/12 20:19:42 - CMDPHP NOTE: Table snmp_query Replicated to Remote Poller 2 With 9 Rows Updated
2019/11/12 20:19:42 - CMDPHP NOTE: Table host_template_snmp_query Replicated to Remote Poller 2 With 14 Rows Updated
2019/11/12 20:19:42 - CMDPHP NOTE: Table host_template_graph Replicated to Remote Poller 2 With 31 Rows Updated
2019/11/12 20:19:41 - CMDPHP NOTE: Table host_template Replicated to Remote Poller 2 With 6 Rows Updated
2019/11/12 20:19:41 - CMDPHP NOTE: Table data_input Replicated to Remote Poller 2 With 31 Rows Updated

the remote poller shows 2 devices assigned to it ???

image

Second poller device view

image

See below for the SQL connection test (bi-directional):

[root@localhost ~]# mysql -u cacti -p -h 192.168.1.228
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 15916
Server version: 5.5.64-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use cacti;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [cacti]> quit
Bye
[root@localhost ~]# ^C

MariaDB [cacti]> select hostname from host;
+---------------+
| hostname      |
+---------------+
| 192.168.1.253 |
| 4.2.2.2       |
+---------------+
2 rows in set (0.00 sec)

@bmfmancini bmfmancini changed the title from "1.2.5 Devices still show in cacti despite being deleted" to "Devices still show in cacti despite being deleted" on Nov 13, 2019
@bmfmancini
Member Author

Hey guys,

I am still seeing this. Any new ideas?
I think I have provided all that I can; even the lab is doing it.
I believe the tables are not replicating to the remote pollers to reflect that the device is deleted, as this doesn't happen with the main poller.

@bmfmancini
Member Author

Hey guys,

Wondering if you have had a chance to look at this. If there is any info I can provide, please let me know!

@bmfmancini
Member Author

So, funny thing I found: when I delete the device it does not show up in the device list, but it shows up everywhere else.

It shows up in the auto-complete dropdown when you search for it in the device view, but if you click it nothing shows up.

In the monitor plugin tab the device shows in the device view.
If you try to re-add the device via AutoM8, the log shows the device as already in Cacti.
If you add the device manually, the device shows up twice in the auto-complete dropdown.

Please, I am at a loss here. Any help would be appreciated.

@bmfmancini
Member Author

Oh, by the way, on the recent device deletion I forced a poller sync as well. Same outcome.

@bmfmancini
Member Author

Yeah, I am not sure. I know that in 1.2.4 this was not an issue. I'm not quite sure when the deleted column was introduced, but when I was doing testing for AutoM8 we would delete and re-add devices a lot without issues.

I am sure that something in the background is not picking up the flag. I believe it's also happening on the main poller, so I don't think it's a replication issue, but I will test again just to be sure.

@bmfmancini
Member Author

I have a theory: could this be THOLD holding on to the record?
The reason I say so is that I have a device that was deleted 50 days ago that still shows as a down device in THOLD. If you click edit device, all of the info (IP and description) is present on the device page, but there are no graphs or data sources.

The device table in the DB does indeed have this set as a deleted device, but why does it insist on staying even after 50 days?

@bmfmancini
Member Author

Hey guys,

While we look into this, would you be able to tell me of a manual way to clear these entries?

@bmfmancini
Member Author

bmfmancini commented Jan 21, 2020

I found this block of code in api_device

function api_device_purge_deleted_devices() {
    $devices = db_fetch_assoc_prepared('SELECT id, poller_id
        FROM host
        WHERE deleted = "on"
        AND UNIX_TIMESTAMP(last_updated) < UNIX_TIMESTAMP()-86400');

So this would mean that the devices should be purged after 24 hours. I believe something is wrong with this block, since I have seen devices in both my production system and multiple labs stick around for days.
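
To make the condition in that query concrete, here is a minimal sketch in plain PHP. It only mirrors the quoted WHERE clause; `is_purge_candidate` and the sample rows are hypothetical and are not part of Cacti.

```php
<?php
date_default_timezone_set('UTC'); // keep the timestamp arithmetic deterministic

// Mirror of the quoted WHERE clause: the row must be flagged deleted = "on"
// and its last_updated must be more than $maxAge seconds in the past.
function is_purge_candidate(array $host, int $now, int $maxAge = 86400): bool {
    return $host['deleted'] === 'on'
        && strtotime($host['last_updated']) < $now - $maxAge;
}

$now = strtotime('2020-01-22 10:00:00');

// Hypothetical rows: one untouched for two days, one updated minutes ago.
$stale   = ['id' => 1, 'deleted' => 'on', 'last_updated' => '2020-01-20 09:00:00'];
$touched = ['id' => 2, 'deleted' => 'on', 'last_updated' => '2020-01-22 09:41:06'];

var_dump(is_purge_candidate($stale, $now));   // bool(true)
var_dump(is_purge_candidate($touched, $now)); // bool(false)
```

Note that the age is measured from `last_updated`, not from the moment of deletion, so a flagged row whose `last_updated` keeps moving forward would never satisfy the condition.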

When I run the api_device.php I see the below

php api_device.php -h
PHP Notice: Undefined variable: config in /var/www/html/cacti/lib/api_device.php on line 25
PHP Warning: include_once(/lib/poller.php): failed to open stream: No such file or directory in /var/www/html/cacti/lib/api_device.php on line 25
PHP Warning: include_once(): Failed opening '/lib/poller.php' for inclusion (include_path='.:/usr/share/pear:/usr/share/php') in /var/www/html/cacti/lib/api_device.php on line 25

Here is line 25 of api_device.php

include_once($config['base_path'] . '/lib/poller.php');

/* api_device_crc_update - update hash stored in settings table to inform
   remote pollers to update their caches
   @arg $poller_id - the id of the poller impacted by hash update
   @arg $variable - the hash variable prefix for the replication setting. */
function api_device_cache_crc_update($poller_id, $variable = 'poller_replicate_device_cache_crc') {
    $hash = hash('ripemd160', date('Y-m-d H:i:s') . rand() . $poller_id);

    db_execute_prepared("REPLACE INTO settings SET value = ?, name='$variable" . "_" . "$poller_id'", array($hash));

poller.php is for sure there

ls /var/www/html/cacti/lib/poller.php -lah
-rw-r--r-- 1 apache apache 58K Sep 29 13:57 /var/www/html/cacti/lib/poller.php

@TheWitness
Member

Run this, post the output:

SELECT id, deleted, last_updated FROM host WHERE deleted="on";

In general, I don't know why there is a library include inside of lib/api_device.php. That is simply not right, especially if it's not within a function definition.
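
The warnings above are consistent with `$config` simply not being defined when the library file is executed on its own: the base path evaluates to nothing and the include target collapses to `/lib/poller.php`. A small sketch of that path arithmetic, with a hypothetical helper `resolve_library_path` and on the assumption that `$config['base_path']` is normally populated before library files are loaded:

```php
<?php
// Build the include path the same way the quoted line does, but read the
// base path defensively so no notice is raised when $config is missing.
function resolve_library_path(?array $config, string $relative): string {
    $base = $config['base_path'] ?? '';
    return $base . $relative;
}

// Library file run standalone: no $config, so the path points at the filesystem root.
echo resolve_library_path(null, '/lib/poller.php'), PHP_EOL;
// prints: /lib/poller.php

// With a base path set (value taken from the warning messages above).
echo resolve_library_path(['base_path' => '/var/www/html/cacti'], '/lib/poller.php'), PHP_EOL;
// prints: /var/www/html/cacti/lib/poller.php
```

This would explain why the include fails even though `poller.php` is present on disk.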

TheWitness added a commit that referenced this issue Jan 22, 2020
* We should not be performing includes in the base of any library file.
@TheWitness
Member

I've removed the include in lib/api_device.php. But to apply it, you have to take all the files. It should be pretty harmless.

@bmfmancini
Member Author

bmfmancini commented Jan 22, 2020 via email

@bmfmancini
Member Author

Here is the output from the db

SELECT id, deleted, last_updated FROM host WHERE deleted="on";
+------+---------+---------------------+
| id   | deleted | last_updated        |
+------+---------+---------------------+
|  996 | on      | 2020-01-22 09:41:11 |
| 1405 | on      | 2020-01-22 09:41:06 |
| 1430 | on      | 2020-01-22 09:41:06 |
| 1444 | on      | 2020-01-22 09:41:07 |
| 2404 | on      | 2020-01-22 09:41:10 |
| 2416 | on      | 2020-01-22 09:41:06 |
| 2428 | on      | 2020-01-22 09:41:06 |
| 2437 | on      | 2020-01-22 09:41:05 |
| 2442 | on      | 2020-01-22 09:41:05 |
| 2445 | on      | 2020-01-22 09:41:08 |
| 2458 | on      | 2020-01-22 09:41:11 |
| 2461 | on      | 2020-01-22 09:41:05 |
| 2462 | on      | 2020-01-22 09:41:09 |
| 2472 | on      | 2020-01-22 09:41:08 |
| 2478 | on      | 2020-01-22 09:41:09 |
| 2553 | on      | 2020-01-22 09:41:05 |
| 2747 | on      | 2020-01-22 09:41:04 |
| 2750 | on      | 2020-01-22 09:41:03 |
| 2756 | on      | 2020-01-22 09:41:09 |
| 2937 | on      | 2020-01-22 09:41:06 |
| 2938 | on      | 2020-01-22 09:41:09 |
| 2939 | on      | 2020-01-22 09:41:12 |
| 2940 | on      | 2020-01-22 09:41:11 |
| 2941 | on      | 2020-01-22 09:41:05 |
| 2942 | on      | 2020-01-22 09:41:13 |
| 2943 | on      | 2020-01-22 09:41:08 |
| 2944 | on      | 2020-01-22 09:41:11 |
| 2945 | on      | 2020-01-22 09:41:13 |
| 2946 | on      | 2020-01-22 09:41:24 |
| 2947 | on      | 2020-01-22 09:41:05 |
| 2948 | on      | 2020-01-22 09:41:13 |
| 2949 | on      | 2020-01-22 09:41:04 |
| 2995 | on      | 2020-01-22 09:40:34 |
| 2998 | on      | 2020-01-22 09:41:17 |
| 3050 | on      | 2020-01-22 09:41:10 |
| 3100 | on      | 2020-01-22 09:41:06 |
| 3101 | on      | 2020-01-22 09:41:05 |
| 3184 | on      | 2020-01-22 09:41:09 |
| 3222 | on      | 2020-01-22 09:41:10 |
| 3269 | on      | 2020-01-22 09:41:07 |
| 3284 | on      | 2020-01-22 09:41:04 |
| 3300 | on      | 2020-01-22 09:41:08 |
| 3777 | on      | 2020-01-22 09:41:06 |
| 4498 | on      | 2020-01-22 09:41:06 |
| 4533 | on      | 2020-01-22 09:41:08 |
| 4564 | on      | 2020-01-22 09:41:08 |
| 4609 | on      | 2020-01-22 09:41:09 |
| 4640 | on      | 2020-01-22 09:41:06 |
| 4728 | on      | 2020-01-22 09:41:06 |
| 4756 | on      | 2020-01-22 09:41:05 |
| 4813 | on      | 2020-01-22 09:41:05 |
+------+---------+---------------------+

@TheWitness
Member

Well, that explains it then. Just need to find out where the update is occurring.

@bmfmancini
Member Author

I applied the new files. AutoM8 is working fine, with no errors seen as of yet.

@bmfmancini
Member Author

@TheWitness the update is happening at each poll.
It seems to update each minute.

@bmfmancini
Member Author

OK, so I added a device via AutoM8 just now and deleted it. It's been 5 minutes and last_updated still shows the time from 5 minutes ago, so something is definitely different. The odd thing is that the other devices still show as updating currently.

@bmfmancini
Member Author

| 4756 | on      | 2020-01-22 10:36:05 |
| 4813 | on      | 2020-01-22 10:36:05 |
| 4908 | on      | 2020-01-22 10:26:17 |
+------+---------+---------------------+
52 rows in set (0.00 sec)

@TheWitness
Member

I'm in the process of reviewing code that makes updates to the host table. There will be an update later this evening to cover some opportunities where updates may come in, say for example, as mentioned, with autom8.

@TheWitness TheWitness added bug Undesired behaviour and removed unverified Some days we don't have a clue labels Jan 23, 2020
TheWitness added a commit that referenced this issue Jan 23, 2020
* Devices still show in cacti despite being deleted
* PHP Error when Creating New Graphs through Automatically Added Devices using Sync Device Template
@TheWitness
Member

Okay, think this is fixed now. There is a spine update too. Should be committed within a few minutes.

@TheWitness TheWitness added the resolved A fixed issue label Jan 23, 2020
@TheWitness TheWitness added this to the v1.2.9 milestone Jan 23, 2020
@bmfmancini
Member Author

Dude thanks so much I will definitely test this and get back to you !

cigamit added a commit that referenced this issue Jan 24, 2020
Devices still show in cacti despite being deleted
@cigamit
Member

cigamit commented Jan 24, 2020

So, I did some testing and found that discovery made on a remote data collector did not get pushed to the local databases, forcing a full sync to resolve that. The commit that I just made ensures that you don't have to do a full sync to make the discovery work. I think this thing's goose is cooked now. I also tightened the screws on the purge timing to just under 10 minutes.

@netniV netniV changed the title from "Devices still show in cacti despite being deleted" to "Devices still show in lists despite being deleted" on Feb 10, 2020
@ddb4github
Contributor

ddb4github commented Feb 12, 2020

The previous commit e4fc8d7 rolls back #2126 and impacts all callers of the APIs api_device_save, api_data_source_remove_multi, and push_out_host. The callers now have to explicitly include lib/poller.php.

@netniV
Member

netniV commented Feb 12, 2020

Not sure where you are going with that?

@bmfmancini
Member Author

On the original issue: this seems to be fixed in 1.2.9, in my lab testing anyway.

@github-actions github-actions bot locked and limited conversation to collaborators Jun 30, 2020