Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug Report: query rewrite incorrectly moves HAVING clause to WHERE #15836

Open
derekperkins opened this issue May 3, 2024 · 0 comments
Open
Labels
Component: Query Serving Needs Triage This issue needs to be correctly labelled and triaged Type: Bug

Comments

@derekperkins
Copy link
Member

derekperkins commented May 3, 2024

Overview of the Issue

We have a query that does aggregation in the SELECT, which is then referenced in HAVING. The query planner appears to try to optimize the query by moving it to WHERE, but since it contains aggregations, the column name hasn't resolved yet, causing a query error.

Query that works in MySQL

SELECT
  location_id,
  MIN(deadline_at),
  COUNT(phrase_id) total_unique,
  CASE
    WHEN COUNT(*) >= 999 THEN 'full_batch'
    WHEN TIMESTAMPDIFF(MINUTE, MIN(deadline_at), NOW()) >= 3 THEN 'deadline'
    WHEN TIMESTAMPDIFF(HOUR, MIN(created_at), NOW()) >= 6 THEN 'created_at'
  END AS reason

FROM keyword_metrics__pipeline
WHERE batch_id IS NULL
GROUP BY location_id
HAVING reason IS NOT NULL
ORDER BY MIN(deadline_at)

Query as rewritten by Vitess

select 
    location_id, 
    min(deadline_at), 
    count(phrase_id) as total_unique, 
    case 
      when count(*) >= 999 then 'full_batch' 
      when timestampdiff(minute, min(deadline_at), now()) >= 3 then 'deadline' 
      when timestampdiff(hour, min(created_at), now()) >= 6 then 'created_at' 
    end as reason 

from keyword_metrics__pipeline 
where batch_id is null 
  and reason is not null 
group by location_id 
order by min(deadline_at) asc

Reproduction Steps

Create Table

CREATE TABLE `keyword_metrics__pipeline` (
  `pipeline_id` bigint NOT NULL AUTO_INCREMENT,
  `phrase_id` int unsigned NOT NULL,
  `location_id` mediumint unsigned NOT NULL,
  `yy_mm` smallint NOT NULL,
  `deadline_at` timestamp NOT NULL,
  `batch_id` bigint DEFAULT NULL,
  `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `modified_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`pipeline_id`),
  UNIQUE KEY `phrase_id__location_id__yy_mm__uq` (`phrase_id`,`location_id`,`yy_mm`),
  KEY `keywords_metrics__pipeline__location_id__fk` (`location_id`),
  KEY `keywords_metrics__pipeline__pipeline__batches` (`batch_id`,`location_id`,`yy_mm`),
  CONSTRAINT `keywords_metrics__pipeline__location_id__fk` FOREIGN KEY (`location_id`) REFERENCES `locations` (`location_id`),
  CONSTRAINT `keywords_metrics__pipeline__phrase_id__fk` FOREIGN KEY (`phrase_id`) REFERENCES `phrases` (`phrase_id`),
  CONSTRAINT `keywords_metrics__pipeline__pipeline__batches` FOREIGN KEY (`batch_id`, `location_id`, `yy_mm`) REFERENCES `keyword_metrics__pipeline_batches` (`batch_id`, `location_id`, `yy_mm`)
) ENGINE=InnoDB AUTO_INCREMENT=17 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPRESSED

vschema

{
   "sharded": false,
   "tables": {
      "keyword_metrics__pipeline_batches": {}
   }
}

Binary Version

Vitess v19.0.3
Percona Server v8.0.36

Operating System and Environment details

GKE v1.29

Log Fragments

No response

@derekperkins derekperkins added Type: Bug Component: Query Serving Needs Triage This issue needs to be correctly labelled and triaged labels May 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Query Serving Needs Triage This issue needs to be correctly labelled and triaged Type: Bug
Projects
None yet
Development

No branches or pull requests

1 participant