Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tables column in sys.snapshots is missing quotes #14087

Open
hammerhead opened this issue May 8, 2023 · 10 comments · May be fixed by #16219
Open

tables column in sys.snapshots is missing quotes #14087

hammerhead opened this issue May 8, 2023 · 10 comments · May be fixed by #16219

Comments

@hammerhead
Copy link
Member

hammerhead commented May 8, 2023

CrateDB version

5.3.0

CrateDB setup information

No response

Problem description

Table names in sys.snapshots's tables array column aren't quoted, making it hard to use its values.

Steps to Reproduce

CREATE TABLE doc."I-need-quotes" (a INTEGER);

CREATE REPOSITORY demo_repository TYPE s3 WITH (bucket = '...', base_path = 'debug', access_key = '...', secret_key = '...');

CREATE SNAPSHOT demo_repository.s1 TABLE doc."I-need-quotes";

SELECT tables[1]
FROM sys.snapshots
WHERE name = 's1';
-- returns doc.I-need-quotes 

In a situation where I want to automate dropping tables before a restore, I may want to write queries such as:

SELECT 'DROP TABLE ' || t
FROM (
  SELECT UNNEST(tables) AS t
  FROM sys.snapshots
  WHERE name = 's1'
);
-- returns DROP TABLE doc.I-need-quotes (will fail, missing quotes)

-- or
SELECT 'DROP TABLE ' || QUOTE_IDENT(t)
FROM (
  SELECT UNNEST(tables) AS t
  FROM sys.snapshots
  WHERE name = 's1'
);
-- returns DROP TABLE "doc.I-need-quotes" (will fail, incorrectly quoted)

-- only working alternative
SELECT 'DROP TABLE ' || QUOTE_IDENT(SPLIT_PART(t, '.', 1)) || '.' || QUOTE_IDENT(SPLIT_PART(t, '.', 2))
FROM (
  SELECT UNNEST(tables) AS t
  FROM sys.snapshots
  WHERE name = s1'
);

Actual Result

See above.

Expected Result

Return a properly quoted name (doc."I-need-quotes") in tables that can easily be used further. Alternatively, split up table_name and schema_name into separate columns, so QUOTE_IDENT can get applied to them.

@hammerhead hammerhead added the triage An issue that needs to be triaged by a maintainer label May 8, 2023
@mfussenegger mfussenegger added feature: ux and removed triage An issue that needs to be triaged by a maintainer labels May 8, 2023
@matriv
Copy link
Contributor

matriv commented May 8, 2023

Imho we'd want to keep having this table names in sys.snapshots in sync with pg_tables and information_schema.tables , and in those we don't have " when they are need, and we are in sync with postgres:

matriv=> create table "55-tbl-needs-quotes"(a int);
CREATE TABLE
matriv=> select table_name from information_schema.tables where table_name like '55%';
     table_name      
---------------------
 55-tbl-needs-quotes
(1 row)

matriv=> select tablename from pg_tables where tablename like '55%';
      tablename      
---------------------
 55-tbl-needs-quotes
(1 row)

matriv=> select * from 55-tbl-needs-quotes;
ERROR:  syntax error at or near "55"
LINE 1: select * from 55-tbl-needs-quotes;
                      ^
matriv=> select * from "55-tbl-needs-quotes";
 a 
--11-
(0 rows)

@hammerhead
Copy link
Member Author

However, in your case, @matriv, the schema and table names are returned separately in information_schema.tables/pg_tables. In the sys.snapshots example, a fully qualified name is returned.
I think not applying quoting for individually returned schema/table names are fine. But in case a FQN is returned, it would be great if it was correctly quoted. Otherwise, the FQN needs to be split again into schema/table name by the client to fix it.

@matriv
Copy link
Contributor

matriv commented May 8, 2023

It sounds like a valid point, thx for explaining it further!

@RichardIcecube
Copy link

Hello, noobie contributor here. I'd like to tackle this issue, but am struggling to understand where to start. Doing a file search of sys.snapshots didn't help a ton. Could you give me a little bit of insight on where to start looking?

@matriv
Copy link
Contributor

matriv commented May 15, 2023

Thx for your interest @RichardIcecube, the class you want to check is SysSnapshotsTableInfo , there you can see that the method to adjust is SysSnapshot#tables()

@mfussenegger
Copy link
Member

I'm not sure if it's a good idea to just add quoting now after the fact. The workaround posted would stop working so it is kinda a breaking change.

Maybe instead we could add an additional object array with table_name and schema_name, similar to how table_partitions already works.
Then users can select what they want and make use of quote_ident if they need the quotes.

@RichardIcecube
Copy link

@matriv Thank you for helping me locate the points of interest in the program. I will do my best to build an understanding of how the code works.

I apologize for being a bit green, but I'm trying my best to follow along. To clarify what I need to do, @mfussenegger, are you suggesting creating another object array along with the existing one labeled table_partitions where there are additional table_name and schema_name columns?

If so, my current understanding of doing this would involve editing SysSnapshotsTableInfo#create() include an additional object array along with expanding SysSnapshot to accommodate the additional object array.

If I'm mistaken at all on this process, please correct errors

@mfussenegger
Copy link
Member

are you suggesting creating another object array along with the existing one labeled table_partitions where there are additional table_name and schema_name columns?

Yes. (But I don't really have a good name for it)

If so, my current understanding of doing this would involve editing SysSnapshotsTableInfo#create() include an additional object array along with expanding SysSnapshot to accommodate the additional object array.

SysSnapshot could probably expose a List<RelationName> (RelationName has both schema and name properties).

See

public List<String> tables() {
return concreteIndices.stream()
.map(RelationName::fqnFromIndexName)
.distinct()
.collect(Collectors.toList());
}

and:

public static RelationName fromIndexName(String indexName) {
IndexParts indexParts = new IndexParts(indexName);
return indexParts.toRelationName();
}

@RichardIcecube
Copy link

Unfortunately, I am a bit out of my depth here. Best of luck to whoever decides to tackle this issue!

@matriv
Copy link
Contributor

matriv commented May 23, 2023

no worries @RichardIcecube, and apologies, I shouldn't have rushed to put the good first issue label on this, as this needs first some more careful thought.

seut added a commit that referenced this issue Jun 24, 2024
The new column `table_relations` contains an array of objects
including the `table_schema` and `table_name` of all tables inside
the snapshot.

Closes #14087.
@seut seut linked a pull request Jun 24, 2024 that will close this issue
seut added a commit that referenced this issue Jun 24, 2024
The new column `table_relations` contains an array of objects
including the `table_schema` and `table_name` of all tables inside
the snapshot.

Closes #14087.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants