
Quote schema name in operations to support special characters and spaces #1175

Merged
brandur merged 1 commit into master from brandur-quoted-schema-name
Mar 16, 2026

@brandur
Contributor

@brandur brandur commented Mar 15, 2026

Came up in #1170: we don't quote schema names in SQL operations, which
causes problems for names containing spaces, special characters, or
uppercase characters. It's not the end of the world, since such names can
be considered a sizable anti-pattern anyway, but quoting is good for
general correctness.

Fixes #1170.

@brandur brandur force-pushed the brandur-quoted-schema-name branch from bc39c77 to 8133a84 Compare March 15, 2026 22:11
@brandur brandur requested a review from bgentry March 15, 2026 22:11
@mitar
Contributor

mitar commented Mar 16, 2026

Thanks!

There is a regex which limits what schemas can contain. If you are quoting them now (which I do like, as general principle), then I think that regex check could be removed?

  if params.Schema != "" {
- 	maybeSchema = params.Schema + "."
+ 	// quote the schema name to support uppercase schema names and odd characters
+ 	maybeSchema = `"` + params.Schema + `".`
Contributor

Doesn't proper quoting also require replacing any " with "" inside the schema name?

I think defining a quoting function and then using it everywhere would be the easiest.

Contributor Author

Good point. Replaced these instances with a new dbutil.SafeIdentifier function. I thought about using pgx.Identifier too, but it's a tad awkward to use, and I didn't want to pull in the dependency for the non-Pgx drivers. I checked its implementation and luckily it's quite simple.
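
A minimal sketch of the quoting rule discussed in this thread. The helper names `quoteIdentifier` and `schemaPrefix` are hypothetical and are not the PR's actual `dbutil.SafeIdentifier`; the sketch only illustrates the same idea that `pgx.Identifier` applies: wrap the name in double quotes, double any embedded double quote, and drop NUL bytes.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdentifier is a hypothetical helper for illustration only. It strips
// NUL bytes, doubles any embedded double quote, and wraps the result in
// double quotes so Postgres treats it as a case-sensitive identifier.
func quoteIdentifier(name string) string {
	name = strings.ReplaceAll(name, "\x00", "")
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// schemaPrefix returns the quoted schema followed by a dot, or an empty
// string when no schema is configured.
func schemaPrefix(schema string) string {
	if schema == "" {
		return ""
	}
	return quoteIdentifier(schema) + "."
}

func main() {
	// Print an illustrative query for a few schema names, including awkward ones.
	for _, schema := range []string{"", "river", "mySchema", "my schema", `odd"name`} {
		fmt.Printf("SELECT count(*) FROM %sriver_job\n", schemaPrefix(schema))
	}
}
```

Routing every schema reference through one function like this avoids the naive concatenation shown in the diff above, which would produce broken SQL for a name that itself contains a double quote.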

@brandur brandur force-pushed the brandur-quoted-schema-name branch from 8133a84 to f03674e Compare March 16, 2026 01:18
exerciseSQLFragments(ctx, t, executorWithTx)
exerciseExecutorTx(ctx, t, driverWithSchema, executorWithTx)
exerciseSchemaIntrospection(ctx, t, driverWithSchema, executorWithTx)
exerciseSchemaName(ctx, t, driverWithSchema)
Contributor Author

This new category doesn't feel super great. That said, I started with adding a specific test case for JobInsert (since I don't want to have to test this for every single operation), and that didn't feel great either.

Contributor

Yeah, I would say it could be grouped with other structural / foundational stuff but we don't really have a group for that yet 🤔

@brandur brandur force-pushed the brandur-quoted-schema-name branch 2 times, most recently from 442e00c to b351040 Compare March 16, 2026 01:22
@brandur brandur force-pushed the brandur-quoted-schema-name branch from b351040 to dfc6497 Compare March 16, 2026 01:26
@brandur
Contributor Author

brandur commented Mar 16, 2026

> There is a regex which limits what schemas can contain. If you are quoting them now (which I do like, as general principle), then I think that regex check could be removed?

Hm, could do, but geez, I wonder if there's actually any valid reason to go with all these crazy schema formats. We've never had anyone complain about this before, and I wonder if we should keep the constraint tighter until we get a first non-synthetic complaint about it (where non-synthetic = one driven by a real-world need as opposed to a test case thing). cc @bgentry for thoughts as well.

Contributor

@bgentry bgentry left a comment

Great stuff! I think we may need corresponding Pro fixes too, I see a couple similar schema += bits in there too.

@brandur
Contributor Author

brandur commented Mar 16, 2026

> think we may need corresponding Pro fixes too, I see a couple similar schema += bits in there too.

+1. We can probably wait until your big workflow changes are in, but shouldn't be hard to add after.

@mitar
Contributor

mitar commented Mar 16, 2026

My non-synthetic use case is that for our project we do not currently have any restrictions on what schema name could be. Now that I integrated River into it, I would prefer to keep it this way. But it is not a big deal either. People do not really use strange schema names. They do use camelCase though and this is why we had those tests.

@brandur
Contributor Author

brandur commented Mar 16, 2026

> My non-synthetic use case is that for our project we do not currently have any restrictions on what schema name could be. Now that I integrated River into it, I would prefer to keep it this way. But it is not a big deal either. People do not really use strange schema names. They do use camelCase though and this is why we had those tests.

Hmm, camel case is a fair ask.

You're not saying that camel case is banned though are you? Just looking at postgresSchemaNameRE, that should be allowed now that we have our quoted schema fix here right?

var postgresSchemaNameRE = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*$`)
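
To make the discussion concrete, here is a small sketch of which names the pattern above accepts. The regular expression is copied from the line above; the wrapper `isAllowedSchemaName` and the sample names are assumptions for illustration, not River's actual validation code.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as postgresSchemaNameRE quoted in the comment above.
var schemaNameRE = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*$`)

// isAllowedSchemaName is a hypothetical wrapper that reports whether name
// matches the pattern: a letter or underscore first, then letters, digits,
// or underscores.
func isAllowedSchemaName(name string) bool {
	return schemaNameRE.MatchString(name)
}

func main() {
	names := []string{"river", "myCamelSchema", "_private", "1schema", "my schema", `odd"name`}
	for _, name := range names {
		fmt.Printf("%-16q allowed=%t\n", name, isAllowedSchemaName(name))
	}
}
```

Camel case passes the check, while a leading digit, a space, or a double quote does not. Passing the check was not enough before this change, though: Postgres folds unquoted identifiers to lowercase, so a camel-case schema only resolves correctly once it is quoted.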

@mitar
Contributor

mitar commented Mar 16, 2026

Yes, camel case is already allowed. What I am saying is that we have observed only camel case in practice, so we had a test for that. River already allows it (but it was broken). So it is not a big deal if names are restricted to just this regex, except that formally we never defined our schemas as restricted.

I think it is fine if we leave it restricted for now if you feel uneasy about lifting restrictions. I think current regex is fine.

(Now I remember, we also had a problem that some of our schemas started with numbers. We had to change that in our tests, too.)

@bgentry
Contributor

bgentry commented Mar 16, 2026

I think I missed your earlier request for my input because I was reviewing at the same time. This definitely feels pretty obscure as far as the use case for having schema names with special characters in them, so I guess my opinion would be based on how much work it is to do this or how much code complexity it is. If it's quick and clean and not too much code, then I guess we might as well do it the right way. 🤷‍♂️

@brandur
Contributor Author

brandur commented Mar 16, 2026

> I think it is fine if we leave it restricted for now if you feel uneasy about lifting restrictions. I think current regex is fine.

K cool. Let's leave it as is for now, but I'm open to easing it if someone's got a justifiable complaint.

> I think I missed your earlier request for my input because I was reviewing at the same time. This definitely feels pretty obscure as far as the use case for having schema names with special characters in them, so I guess my opinion would be based on how much work it is to do this or how much code complexity it is. If it's quick and clean and not too much code, then I guess we might as well do it the right way. 🤷‍♂️

No worries! To be fair, the "fix" to relax the name constraints is very straightforward (just have to remove a regex and check). That said, as per above, I don't think it's compromising too many cases in reality, so let's just leave it for the time being and see if we get some more feedback.

Thanks both!

@brandur brandur merged commit 366a8db into master Mar 16, 2026
15 checks passed
@brandur brandur deleted the brandur-quoted-schema-name branch March 16, 2026 05:23
Development

Successfully merging this pull request may close these issues.

Using upper case schemas do not correctly

3 participants