sql/delegate: show changefeed jobs apply db-level filters#156873
sql/delegate: show changefeed jobs apply db-level filters#156873log-head wants to merge 1 commit intocockroachdb:masterfrom
Conversation
| FROM unnest(targets.names) AS x | ||
| WHERE NOT (x = ANY(COALESCE(jt.filter_tables, ARRAY[]::STRING[]))) | ||
| ) | ||
| END AS full_table_names, |
There was a problem hiding this comment.
I'm pointing out what we discussed sync. Because our source of watched tables is this sql query, it's possible in the event of a bug with the table watcher or something that this query shows what tables we should be watching, even if we're not actually watching them.
Maybe more concretely: Let's say we have a db level feed that includes table foo and it's lagging. Then we rename that table. We are in some sense continuing to emit events for it until it catches up, but we will immediately stop showing it in the list of watched tables. Maybe that's expected.
Maybe we can put some kind of note somewhere to point out some of this nuance. Seems like the most important thing for us to remember, if we're ever debugging a db-level feed issue with what set of tables is being watched is not to use this query to show us what tables are actually being watched.
| ) AS sink_uri, | ||
| targets.names AS full_table_names, | ||
| CASE | ||
| WHEN COALESCE((jt.filter_type::string)::INT, 0) = 1 THEN |
There was a problem hiding this comment.
Could we have comments or some kind of named expression that tells us that 0 corresponds to include and 1 to exclude?
| ft.filter_type, | ||
| array_agg(k) AS filter_tables | ||
| FROM filter_targets ft | ||
| CROSS JOIN LATERAL ( |
There was a problem hiding this comment.
This cross join takes a cartesian product which could theoretically be expensive. When we spoke you said you tested the performance of this and it didn't seem to make an impact.
| WHEN COALESCE((jt.filter_type::string)::INT, 0) = 1 THEN | ||
| ( | ||
| SELECT COALESCE(array_agg(x), ARRAY[]::STRING[]) | ||
| FROM unnest(targets.names) AS x |
There was a problem hiding this comment.
In this case "targets.names" is actually the list of all tables in the database, is that right? If that's right, we should probably name it to express that.
| (sd.filter_list::json)->'tables' AS tables | ||
| FROM spec_details sd | ||
| ), | ||
| joined_targets AS ( |
There was a problem hiding this comment.
"joined_targets" is a somewhat vague name that doesn't help so much with interpreting what's going on. It would be helpful to choose a more descriptive name and/or leave a comment.
SHOW CHANGEFEED JOBS did not filter the tables in full_table_names with INCLUDE/EXCLUDE TABLES (e.g. CREATE CHANGEFEED FOR DATABASE movr INCLUDE TABLES users). Because of this, it did not accurately reflect the tables being watched by the db-level changefeed. This change filters the table list, either including only specified tables or else excluding specified tables. Epic: CRDB-1421 Fixes: cockroachdb#156798 Release note (sql change): For DB-level changefeeds with a table filter (e.g. CREATE CHANGEFEED FOR DATABASE movr INCLUDE TABLES users), apply this filter to the results of SHOW CHANGEFEED JOBS full_table_names.
54412a3 to
02a665a
Compare
aerfrei
left a comment
There was a problem hiding this comment.
Approved pending we discuss through the issue of this potentially being out of sync with the actual watched tables.
asg0451
left a comment
There was a problem hiding this comment.
i think this is very hairy sql and question whether we shouldnt just make separate queries and reuse the filtering logic by turning this into a non-delegate thing
| filter_targets_map AS ( | ||
| SELECT | ||
| sd.id, | ||
| sd.filter_specification::json AS filter_specification_json, | ||
| (sd.filter_specification::json)->'filter_type' AS filter_type, | ||
| (sd.filter_specification::json)->'tables' AS filter_tables_map | ||
| FROM spec_details sd | ||
| ), | ||
| filter_targets_array AS ( | ||
| SELECT | ||
| ftm.id, | ||
| ftm.filter_type, | ||
| array_agg(k) AS filter_tables | ||
| FROM filter_targets_map ftm | ||
| CROSS JOIN LATERAL ( | ||
| SELECT key | ||
| FROM json_object_keys(ftm.filter_tables_map) AS key | ||
| WHERE json_typeof(ftm.filter_tables_map) = 'object' | ||
| ) AS j(k) | ||
| GROUP BY ftm.id, ftm.filter_type |
There was a problem hiding this comment.
can you add some more comments about what these intermediate sections do? i'm finding it hard to follow
| targets.names AS full_table_names, | ||
| CASE | ||
| -- Include filter is enumerated to 1. | ||
| WHEN COALESCE((fta.filter_type::string)::INT, 0) = 1 THEN |
| targets.names AS full_table_names, | ||
| CASE | ||
| -- Include filter is enumerated to 1. | ||
| WHEN COALESCE((fta.filter_type::string)::INT, 0) = 1 THEN |
There was a problem hiding this comment.
can you not hardcode 0 and 1 but interpolate from the enum values?
| SELECT COALESCE(array_agg(x), ARRAY[]::STRING[]) | ||
| -- targets.names is the list, before filtering, of fully qualified table names. | ||
| FROM unnest(targets.names) AS x | ||
| WHERE x = ANY(COALESCE(fta.filter_tables, ARRAY[]::STRING[])) |
There was a problem hiding this comment.
i'm having a hard time understanding these bits too
|
i have a higher level UX question: why should
|
|
I agree that this query has become too hard to follow. Tried a few different ways to simplify but the intermediate tables remain hairy. Opened an issue to implement as an opaque statement #157764. I see the point about not adding too many fields to |
|
is it worth adding a case to |
SHOW CHANGEFEED JOBS did not filter the tables in full_table_names with INCLUDE/EXCLUDE TABLES (e.g. CREATE CHANGEFEED FOR DATABASE movr INCLUDE TABLES users). Because of this, it did not accurately reflect the tables being watched by the db-level changefeed.
This change filters the table list, either including only specified tables or else excluding specified tables.
Epic: CRDB-1421
Fixes: #156798
Release note (sql change): For DB-level changefeeds with a table filter (e.g. CREATE CHANGEFEED FOR DATABASE movr INCLUDE TABLES users), apply this filter to the results of SHOW CHANGEFEED JOBS full_table_names.