Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Analyzing self-referential foreign keys triggering an infinite loop #1268

Closed
andy-wm-arthur opened this issue Sep 16, 2022 · 3 comments · Fixed by #1272
Closed

Analyzing self-referential foreign keys triggering an infinite loop #1268

andy-wm-arthur opened this issue Sep 16, 2022 · 3 comments · Fixed by #1272
Assignees

Comments

@andy-wm-arthur
Copy link
Contributor

The cause seems to be related to the FK depth rule

repro:

CREATE TABLE `dcim_interface` (
  `id` char(32) NOT NULL,
  `_custom_field_data` json NOT NULL,
  `name` varchar(64) NOT NULL,
  `label` varchar(64) NOT NULL,
  `description` varchar(200) NOT NULL,
  `_cable_peer_id` char(32),
  `enabled` tinyint NOT NULL,
  `mac_address` varchar(18),
  `mtu` int unsigned,
  `mode` varchar(50) NOT NULL,
  `_name` varchar(100) NOT NULL,
  `type` varchar(50) NOT NULL,
  `mgmt_only` tinyint NOT NULL,
  `_cable_peer_type_id` int,
  `_path_id` char(32),
  `cable_id` char(32),
  `device_id` char(32) NOT NULL,
  `lag_id` char(32),
  `untagged_vlan_id` char(32),
  `status_id` char(32),
  `parent_interface_id` char(32),
  `bridge_id` char(32),
  PRIMARY KEY (`id`),
  KEY `dcim_interface__cable_peer_type_id_ce52cb81` (`_cable_peer_type_id`),
  KEY `dcim_interface__name_8796fa61` (`_name`),
  KEY `dcim_interface__path_id_f8f4f7f0` (`_path_id`),
  KEY `dcim_interface_bridge_id_f2a8df85` (`bridge_id`),
  KEY `dcim_interface_cable_id_1b264edb` (`cable_id`),
  KEY `dcim_interface_device_id_359c6115` (`device_id`),
  KEY `dcim_interface_lag_id_ea1a1d12` (`lag_id`),
  KEY `dcim_interface_name_bc4e48ab` (`name`),
  KEY `dcim_interface_parent_interface_id_dc46b61a` (`parent_interface_id`),
  KEY `dcim_interface_status_id_5d68d3d6` (`status_id`),
  KEY `dcim_interface_untagged_vlan_id_838dc7be` (`untagged_vlan_id`),
  UNIQUE KEY `device_idname` (`device_id`,`name`),
  CONSTRAINT `dcim_interface_bridge_id_f2a8df85_fk_dcim_interface_id` FOREIGN KEY (`bridge_id`) REFERENCES `dcim_interface` (`id`),
  CONSTRAINT `dcim_interface_lag_id_ea1a1d12_fk_dcim_interface_id` FOREIGN KEY (`lag_id`) REFERENCES `dcim_interface` (`id`),
  CONSTRAINT `dcim_interface_parent_interface_id_dc46b61a_fk_dcim_interface_id` FOREIGN KEY (`parent_interface_id`) REFERENCES `dcim_interface` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin;

UPDATE 
  dcim_interface 
SET 
  _custom_field_data = '{}', 
  status_id = 'f707776203834a4e86262cb131b7d0f8', 
  device_id = '681637a50af540d8a6c30c73cad68a6b', 
  name = 'Int1', 
  label = '', 
  description = '', 
  cable_id = '6d65eb3f8e624eaba030f144dd889ea5', 
  _cable_peer_type_id = 4, 
  _cable_peer_id = '6cd070b494d141ccb875abfc9ed9e97a', 
  _path_id = NULL, 
  enabled = 1, 
  mac_address = '00:11:11:11:11:11', 
  mtu = NULL, 
  mode = 'access', 
  parent_interface_id = NULL, 
  bridge_id = NULL, 
  _name = '9999999999999999Int000001............', 
  lag_id = NULL, 
  type = 'virtual', 
  mgmt_only = 0, 
  untagged_vlan_id = '082d01401dad431eb566e6040fe437c3' 
WHERE 
  dcim_interface.id = '16488814982942cb8dfeb09a651ae41f';
@Hydrocharged
Copy link
Contributor

Terrible news: this isn't an infinite loop, it's exponential time! The exponent is the max depth (15), and the base is however many foreign keys are cyclical on the query (3 in the repro). So this roughly creates a foreign key tree of size 3¹⁵. Right now, a tree of foreign key nodes is acyclical, so we determine the max depth at analysis time. There are two "solutions" here:

  1. Reduce the depth. 3⁵ is way smaller, but I'm sure someone is relying on this behavior somewhere, and 5 passes seems way too small (even a depth of 15 doesn't seem that big). Even then, we're still in exponential land, we've just pushed the bar further out.
  2. Allow the nodes to reference each other, creating a cyclical tree. We move depth analysis from analysis time to execution time. This requires a partial rewrite of the foreign key analyzer path.

I'm attempting #2. I've got something that handles the above case just fine, but it's broken some other self-referential tests. I'll continue working on this.

@andy-wm-arthur
Copy link
Contributor Author

@Hydrocharged Thanks for the update here. Is there anyway to memoize the search to make it faster? I'm curious what MySQL does here

@Hydrocharged
Copy link
Contributor

It's possible that MySQL may memoize the resulting trees, but I'm not sure how significant the speedup would be with the partial rewrite. Right now it takes forever (might even be practically infinite given memory pressure) to construct the tree given enough foreign keys. Caching the resulting nodes is a part of Solution 2, which also creates a cyclical tree. With the rewrite, fetching a memoized tree versus constructing a new one won't have a large enough of a speedup to be worth the additional complexity, at least not right now. We'd need to do a lot more state tracking for not just foreign keys, but tables as well.

I'd wager that MySQL is doing something closer to the partial rewrite, but they very well may be memoizing the resulting tree to really optimize execution times.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants