Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug Report: vitess generates corrupted results on cross keyspace non full group by queries #11625

Closed
tokikanno opened this issue Nov 2, 2022 · 3 comments · Fixed by #11633
Closed

Comments

@tokikanno
Copy link
Contributor

tokikanno commented Nov 2, 2022

Overview of the Issue

While doing cross keyspace query, vitess generates corrupted results if SQL statement not fulfill full group by constraint.

Reproduction Steps

Assume we have 2 keyspaces called commerce and meta
Running following sql will reproduce this issue.

I think vitess should either raise a error for non-full group by queries if the ONLY_FULL_GROUP_BY sql mode option is on.
Or just return the results as same as mysql default behavior do.

-- create table for commerce.orders
DROP TABLE IF EXISTS commerce.orders;
CREATE TABLE commerce.orders (
	id INTEGER(10) NOT NULL,
	buyer INTEGER(10) NOT NULL,
	total DECIMAL(10, 2) NOT NULL,
	created DATETIME DEFAULT CURRENT_TIMESTAMP,
	PRIMARY KEY(id)
);


-- create table for meta.users
DROP TABLE IF EXISTS meta.users;
CREATE TABLE meta.users (
	id INTEGER(10) NOT NULL,
	name VARCHAR(64) NOT NULL,
	created DATETIME DEFAULT CURRENT_TIMESTAMP,
	PRIMARY KEY(id)
);

-- create table for commerce.users, this table has the same schema with meta.users table
DROP TABLE IF EXISTS commerce.users;
CREATE TABLE commerce.users (
	id INTEGER(10) NOT NULL,
	name VARCHAR(64) NOT NULL,
	created DATETIME DEFAULT CURRENT_TIMESTAMP,
	PRIMARY KEY(id)
);


-- insert some order into commerce.orders
insert into commerce.orders (id, buyer, total) VALUES(1, 1, 100);
insert into commerce.orders (id, buyer, total) VALUES(2, 1, 200);
insert into commerce.orders (id, buyer, total) VALUES(3, 1, 300);
insert into commerce.orders (id, buyer, total) VALUES(4, 2, 111);
insert into commerce.orders (id, buyer, total) VALUES(5, 2, 222);
insert into commerce.orders (id, buyer, total) VALUES(6, 2, 333);

-- insert some user into meta.users
insert into meta.users (id, name) VALUES(1, 'user 1');
insert into meta.users (id, name) VALUES(2, 'user 2');

-- insert some user into commerce.users
-- now you have 2 users table with same data in both commerce and meta keyspace
insert into commerce.users (id, name) VALUES(1, 'user 1');
insert into commerce.users (id, name) VALUES(2, 'user 2');


-- cross keyspace join with non full group by
-- this query will generated corrupted results
--
-- | id | name | sum(o.total) | created |
-- |----|------|--------------|---------|
-- |  1 |    0 |       600.00 |    2022 |
-- |  2 |    0 |       666.00 |    2022 |
SELECT
	u.id,
	u.name,
	sum(o.total),
	o.created
FROM
	commerce.orders AS o
	JOIN meta.users AS u ON o.buyer = u.id
GROUP BY
	u.id;

-- the same non full group by query, but this time we join tables from the same keyspce
-- this is mysql default behavior
--
-- | id | name   | sum(o.total) | created             |
-- |----|--------|--------------|---------------------|
-- |  1 | user 1 |       600.00 | 2022-10-31 16:39:12 |
-- |  2 | user 2 |       666.00 | 2022-10-31 16:39:51 |
SELECT
	u.id,
	u.name,
	sum(o.total),
	o.created
FROM
	commerce.orders AS o
	JOIN commerce.users AS u ON o.buyer = u.id
GROUP BY
	u.id;

-- fixed query with full group by + proper aggregation statement (min)
--
-- | id | name   | sum(o.total) | created             |
-- |----|--------|--------------|---------------------|
-- |  1 | user 1 |       600.00 | 2022-10-31 16:39:12 |
-- |  2 | user 2 |       666.00 | 2022-10-31 16:39:51 |
SELECT
	u.id,
	u.name,
	sum(o.total),
	min(o.created)
FROM
	commerce.orders AS o
	JOIN meta.users AS u ON o.buyer = u.id
GROUP BY
	u.id, u.name;

Binary Version

This issue has been tested with

v14.0.1
v14.0.3
v15.0.0

All above versions have the same behavior

Operating System and Environment details

vitess was setup by vitess k8s operator.
We're using AWS x86_64 ec2 instance with ubuntu 20.20 as k8s worker nodes.

Log Fragments

No response

@tokikanno tokikanno added Needs Triage This issue needs to be correctly labelled and triaged Type: Bug labels Nov 2, 2022
@systay systay added Component: Query Serving and removed Needs Triage This issue needs to be correctly labelled and triaged labels Nov 2, 2022
@frouioui
Copy link
Member

frouioui commented Nov 3, 2022

Hello @tokikanno, thank you for finding and detailing this issue. This is really helpful. This is indeed a bug that was introduced when we added the "random" aggregation used on non full group by queries.

I have opened a Pull Request to fix the issue you're describing: #11633. This Pull Request will be backported to the release branches 14 and 15. Do you need a patch release for this bug?

@frouioui frouioui self-assigned this Nov 3, 2022
@tokikanno
Copy link
Contributor Author

@frouioui : May I know how fast will the patch be merged & released?

Also I think we could fork and build the vitess by our own if it takes too long 😂

@frouioui
Copy link
Member

frouioui commented Nov 3, 2022

@tokikanno, I'll talk with the rest of the maintainers and let you know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants