Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Analyzer error: exceeded max analysis iterations (8) #4816

Closed
andy-wm-arthur opened this issue Nov 17, 2022 · 2 comments · Fixed by dolthub/go-mysql-server#1411
Closed

Analyzer error: exceeded max analysis iterations (8) #4816

andy-wm-arthur opened this issue Nov 17, 2022 · 2 comments · Fixed by dolthub/go-mysql-server#1411
Assignees
Labels
bug Something isn't working good repro Easily reproducible bugs sql Issue with SQL

Comments

@andy-wm-arthur
Copy link
Contributor

repro:

SET NAMES utf8;
SET autocommit=0;
SET autocommit=1;
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET autocommit=0;
CREATE TABLE `dcim_region` (
  `id` char(32) COLLATE utf8mb4_0900_ai_ci NOT NULL,
  `lft` int unsigned NOT NULL,
  `rght` int unsigned NOT NULL,
  `tree_id` int unsigned NOT NULL,
  `level` int unsigned NOT NULL,
  `parent_id` char(32) COLLATE utf8mb4_0900_ai_ci,
  PRIMARY KEY (`id`),
  KEY `dcim_region_parent_id_2486f5d4` (`parent_id`),
  KEY `dcim_region_tree_id_a09ea9a7` (`tree_id`),
  CONSTRAINT `dcim_region_parent_id_2486f5d4_fk_dcim_region_id` FOREIGN KEY (`parent_id`) REFERENCES `dcim_region` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
CREATE TABLE `dcim_site` (
  `id` char(32) COLLATE utf8mb4_0900_ai_ci NOT NULL,
  `region_id` char(32) COLLATE utf8mb4_0900_ai_ci,
  PRIMARY KEY (`id`),
  KEY `dcim_site_region_id_45210932` (`region_id`),
  CONSTRAINT `dcim_site_region_id_45210932_fk_dcim_region_id` FOREIGN KEY (`region_id`) REFERENCES `dcim_region` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
INSERT INTO `dcim_region` (`id`, `parent_id`, `lft`, `rght`, `tree_id`, `level`) VALUES ('37aa15fdf5434e90a411cabb1fa8365f', NULL, 1, 2, 1, 0);
INSERT INTO `dcim_region` (`id`, `parent_id`, `lft`, `rght`, `tree_id`, `level`) VALUES ('14ee755c20834b49a29f133fee980cb2', NULL, 1, 2, 2, 0);
INSERT INTO `dcim_region` (`id`, `parent_id`, `lft`, `rght`, `tree_id`, `level`) VALUES ('132aa97624404b6aab3bf82157def7e4', NULL, 1, 2, 3, 0);
SAVEPOINT `s139777131587392_x1967`;
INSERT INTO `dcim_region` (`id`, `parent_id`, `lft`, `rght`, `tree_id`, `level`) VALUES ('3ab5d9743e25401da3d4aac4c73d38e2', NULL, 1, 2, 4, 0);
INSERT INTO `dcim_region` (`id`, `parent_id`, `lft`, `rght`, `tree_id`, `level`) VALUES ('40e8871599fc4643b8b827d74be13d32', NULL, 1, 2, 5, 0);
INSERT INTO `dcim_region` (`id`, `parent_id`, `lft`, `rght`, `tree_id`, `level`) VALUES ('11f8430cb5fb44ee8b8969d27a883c00', NULL, 1, 2, 6, 0);
INSERT INTO `dcim_region` (`id`, `parent_id`, `lft`, `rght`, `tree_id`, `level`) VALUES ('e6dabc406e114c4bae71b450e1058f8f', NULL, 1, 2, 7, 0);
SELECT COUNT(*) 
FROM (
  SELECT (
    SELECT count(*) 
    FROM (
      SELECT U0.`id` 
      FROM `dcim_site` U0 
      INNER JOIN `dcim_region` U1 
      ON (U0.`region_id` = U1.`id`)
       WHERE (
        U1.`lft` >= `dcim_region`.`lft` AND 
        U1.`lft` <= `dcim_region`.`rght` AND 
        U1.`tree_id` = `dcim_region`.`tree_id`
      )
    ) _count
  ) AS `site_count` 
  FROM `dcim_region` 
  WHERE `dcim_region`.`id` 
  IN (
    '3ab5d9743e25401da3d4aac4c73d38e2', 
    '40e8871599fc4643b8b827d74be13d32', 
    '11f8430cb5fb44ee8b8969d27a883c00', 
    'e6dabc406e114c4bae71b450e1058f8f'
  )
) subquery;
SET autocommit=1;
@fulghum fulghum self-assigned this Nov 17, 2022
@fulghum
Copy link
Contributor

fulghum commented Nov 17, 2022

I ran this repro and confirmed that without data in the tables, analysis completes successfully; when there is data inserted in the tables, analysis gets stuck in a loop and doesn't complete.

Diffing the analyzer output for the two runs shows that the plans diverge during the optimizeJoins rule. When there is no data in the tables, a HashJoin with HashLookup is used; when there is data in the tables, an InnerJoin is used instead.

Plan with no data in the tables:

Project  
 ├─ columns: [[U0.id, idx=6, type=char(32) COLLATE utf8mb4_0900_ai_ci, nullable=false]]  
 └─ Filter((([U1.lft, idx=9, type=int unsigned, nullable=false] >= [dcim_region.lft, idx=1, type=int unsigned, nullable=false]) AND ([U1.lft, idx=9, type=int unsigned, nullable=false] <= [dcim_region.rght, idx=2, type=int unsigned, nullable=false])) AND ([U1.tree_id, idx=11, type=int unsigned, nullable=false] = [dcim_region.tree_id, idx=3, type=int unsigned, nullable=false]))  
     └─ HashJoin([U0.region_id, idx=7, type=char(32) COLLATE utf8mb4_0900_ai_ci, nullable=true] = [U1.id, idx=8, type=char(32) COLLATE utf8mb4_0900_ai_ci, nullable=false]), comment=  
         ├─ TableAlias(U0)  
         │   └─ Table  
         │       └─ columns: [id region_id]  
         └─ HashLookup(child: TUPLE([U1.id, idx=6, type=char(32) COLLATE utf8mb4_0900_ai_ci, nullable=false]), lookup: TUPLE([U0.region_id, idx=7, type=char(32) COLLATE utf8mb4_0900_ai_ci, nullable=true]))  
             └─ CachedResults  
                 └─ TableAlias(U1)  
                     └─ Table  
                         └─ columns: [id lft rght tree_id]

Plan with data in the tables:

Project  
 ├─ columns: [[U0.id, idx=6, type=char(32) COLLATE utf8mb4_0900_ai_ci, nullable=false]]  
 └─ Filter((([U1.lft, idx=9, type=int unsigned, nullable=false] >= [dcim_region.lft, idx=1, type=int unsigned, nullable=false]) AND ([U1.lft, idx=9, type=int unsigned, nullable=false] <= [dcim_region.rght, idx=2, type=int unsigned, nullable=false])) AND ([U1.tree_id, idx=11, type=int unsigned, nullable=false] = [dcim_region.tree_id, idx=3, type=int unsigned, nullable=false]))  
     └─ InnerJoin([U0.region_id, idx=11, type=char(32) COLLATE utf8mb4_0900_ai_ci, nullable=true] = [U1.id, idx=6, type=char(32) COLLATE utf8mb4_0900_ai_ci, nullable=false]), comment=  
         ├─ TableAlias(U1)  
         │   └─ Table  
         │       └─ columns: [id lft rght tree_id]  
         └─ TableAlias(U0)  
             └─ Table  
                 └─ columns: [id region_id]

I'll continue digging in...

@timsehn timsehn added bug Something isn't working sql Issue with SQL good repro Easily reproducible bugs labels Nov 17, 2022
@fulghum
Copy link
Contributor

fulghum commented Nov 17, 2022

During join planning, the cost of the tables is slightly different depending on if they contain rows or not. Having data in the table triggers an InnerJoin instead of a HashLookup to be used. InnerJoin is commutative, so on the next round of analysis, the expression group contains both orderings. The plan with the left/right sides reversed is chosen because it appears in the expression list first and has an identical cost to the non-flipped plan. This repeats on the next analysis cycle and causes the analyzer to error out since it can't find a stable plan.

I've got a fix coded up to ensure we list the non-flipped plan first in the expression group and this query is now working. I'll get it cleaned up, add some more tests, and get a PR out shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good repro Easily reproducible bugs sql Issue with SQL
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants