Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong assert in ruleutils_xx.c #7591

Open
Green-Chan opened this issue Apr 26, 2024 · 0 comments
Open

Wrong assert in ruleutils_xx.c #7591

Green-Chan opened this issue Apr 26, 2024 · 0 comments

Comments

@Green-Chan
Copy link
Contributor

In #7379 assert in ruleutils_15.c was changed from

	while (i < colinfo->num_cols && colinfo->colnames[i] == NULL)
		i++;
	Assert(i == colinfo->num_cols);

to

	for (int col_index = 0; col_index < colinfo->num_cols; col_index++)
	{
		/*
		 * In the above processing-loops, "i" advances only if
		 * the column is not new, check if this is a new column.
		 */
		if (colinfo->is_new_col[col_index])
			i++;
	}
	Assert(i == colinfo->num_cols);

(later ruleutils_14.c and ruleutils_16.c were also changed).
But that is still wrong. The following script leads to assert failure:

CREATE EXTENSION citus;
CREATE TABLE t1 (a int, b int);
CREATE TABLE t2 (x int, y int);
INSERT INTO t1 VALUES (1, 2), (4, 5);
INSERT INTO t2 VALUES (1, 1), (5, 5);
SELECT create_distributed_table('t2','x');
CREATE VIEW v AS SELECT * FROM t2 JOIN t1 on (t1.a=t2.x);
ALTER TABLE t2 ADD COLUMN z int;
SELECT * FROM v;

Here is the explanation of why assert was changed: #7379 (comment)
It is wrong. is_new_col has nothing to do with colnames. If you take a look at the deparse_columns definition and comments there as a whole, you'll see that is_new_col actually about columns described by new_colnames. Yes, may be that comment should be changed to

Entries with is_new_col false must match the non-NULL new_colnames entries one-for-one.
I also spent some time exploring what set_join_column_names() does, and that confirms that is_new_col is about new_colnames, not colnames.

So you should probably return the old assert back and reinvestigate why it fails. One of examples of failure is from multi_alter_table_statements test:

-- Create a reference table
CREATE TABLE tbl_ref_mats(row_id integer primary key);
INSERT INTO tbl_ref_mats VALUES (1), (2);
SELECT create_reference_table('tbl_ref_mats');

-- Create a distributed table
CREATE TABLE tbl_dist_mats(series_id integer);
INSERT INTO tbl_dist_mats VALUES (1), (1), (2), (2);
SELECT create_distributed_table('tbl_dist_mats', 'series_id');

-- Create a view that joins the distributed table with the reference table on the distribution key.
CREATE VIEW vw_citus_views as
SELECT d.series_id FROM tbl_dist_mats d JOIN tbl_ref_mats r ON d.series_id = r.row_id;

-- The view initially works fine
SELECT * FROM vw_citus_views ORDER BY 1;
-- Now, alter the table
ALTER TABLE tbl_ref_mats ADD COLUMN category1 varchar(50);
SELECT * FROM vw_citus_views ORDER BY 1;

From what I could understand, set_join_column_names() initialises colinfo->colnames and noldcolumns from rte->eref->colnames, and it expects that this list contains only old columns (meaning "columns that existed when the query was parsed"). For the above example rte->eref->colnames consists of "series_id", "row_id" and "category1".
If you comment out

SELECT create_reference_table('tbl_ref_mats');

then rte->eref->colnames consists of "series_id" and "row_id", as expected.

Looks like rte->eref->colnames is formed wrong for reference table

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant