Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pg_tables and pg_views to catalog #1748

Closed
wants to merge 6 commits into from

Conversation

skabbes
Copy link
Contributor

@skabbes skabbes commented Jul 16, 2022

Fixes: #1747

Problem - I was unable to write a query against pg_timezone_names.

Actual failing query

select
  name,
  abbrev,
  utc_offset,
  to_char((current_timestamp at time zone name), 'HH:MI am') as current_time
from pg_catalog.pg_timezone_names
order by utc_offset;

This because sqlc did not know about these built-in tables / views.

Warning
I tried to make sqlc-pg-gen and make regen, but I was getting slightly different output, so in this commit I only committed the changes that affected the tables, not the existing functions (which I do not know why they were changed, perhaps a different version of postgres).

@skabbes
Copy link
Contributor Author

skabbes commented Jul 16, 2022

@kyleconroy I might need your help regenerating the files. I am having a heck-uva time.

➜  sqlc_skabbes git:(catalog_tables_views) ✗ make regen
go build -o ~/bin/sqlc-dev ./cmd/sqlc/
go run ./scripts/regenerate/
2022/07/16 11:35:01 internal/endtoend/testdata/wasm_plugin_sqlc_gen_greeter: sqlc-dev generate failed
# package greeter
error generating code: call: Exited with i32 exit status 1
wasm backtrace:
    0: 0x1cf49 - <unknown>!__wasi_proc_exit
note: using the `WASMTIME_BACKTRACE_DETAILS=1` environment variable to may show more debugging information

exit status 1
make: *** [regen] Error 1

@kyleconroy
Copy link
Collaborator

Very strange error message. What os / arch are you on?

@skabbes
Copy link
Contributor Author

skabbes commented Jul 18, 2022

I'm on an M1 Mac, arm64 I suppose.

@skabbes
Copy link
Contributor Author

skabbes commented Jul 18, 2022

I added the env var, and get this next error message.

➜  wasm_plugin_sqlc_gen_greeter git:(catalog_tables_views) ✗ WASMTIME_BACKTRACE_DETAILS=1 sqlc-dev generate
# package greeter
error generating code: loadModule: Module was compiled without WebAssembly backtrace support but it is enabled for the host

@skabbes
Copy link
Contributor Author

skabbes commented Jul 18, 2022

@kyleconroy given this commit fixed the build, I think it has something to do with the pure size of the data or number of tables in pg_catalog.

I believe since all of a sudden there are tables in the pg_catalog schema, it is exporting them - but probably that's not what you want / wanted with your plugin design.

@@ -36743,5 +36743,36 @@ func genPGCatalog() *catalog.Schema {
},
},
}

s.Tables = []*catalog.Table{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just overwrote the Tables slice to make the output purposefully smaller - just to debug.

@kyleconroy
Copy link
Collaborator

I'm not sure what limit we're running into, but for the meantime let's drop the pg_catalog schema before sending it to the WASM plugin.

diff --git a/internal/ext/wasm/wasm.go b/internal/ext/wasm/wasm.go
index a55d8551..937ece21 100644
--- a/internal/ext/wasm/wasm.go
+++ b/internal/ext/wasm/wasm.go
@@ -175,7 +175,23 @@ func (r *Runner) loadWASM(ctx context.Context, cache string, expected string) ([
        return wmod, nil
 }
 
+// Due to some memory or file size limit, we must drop the pg_catalog
+// schema
+func removePGCatalog(req *plugin.CodeGenRequest) {
+       if req.Catalog != nil && req.Catalog.Schemas != nil {
+               schemas := req.Catalog.Schemas
+               for i, _ := range schemas {
+                       if schemas[i].Name == "pg_catalog" {
+                               req.Catalog.Schemas = append(schemas[:i], schemas[i+1:]...)
+                               return
+                       }
+               }
+       }
+}
+
 func (r *Runner) Generate(ctx context.Context, req *plugin.CodeGenRequest) (*plugin.CodeGenResponse, error) {
+       removePGCatalog(req)

Copy link
Contributor Author

@skabbes skabbes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I applied the fixes, and now there is just some other problems to address, particularly percentile_* functions seems to have changed, and I'm not sure why.

@@ -29,7 +29,37 @@ LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace
WHERE n.nspname OPERATOR(pg_catalog.~) '^(pg_catalog)$'
AND p.proargmodes IS NULL
AND pg_function_is_visible(p.oid)
ORDER BY 1;
-- The order isn't too important - just that it is stable between runs
ORDER BY 1, 2, 3, 4, 5;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kyleconroy These columns previously weren't ordered at all, so I think the existing output in pg_catalog.go is kind of in almost 'random' order. I ordered by basically every column here, not because it is important - but to keep the output stable between runs of sqlc-pg-gen.

@@ -34,9 +31,21 @@ func Adminpack() *catalog.Schema {
{
Type: &ast.TypeName{Name: "text"},
},
{
Type: &ast.TypeName{Name: "text"},
},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a reordering of the polymorphic variants of pg_file_rename

},
ReturnType: &ast.TypeName{Name: "boolean"},
},
{
Name: "pg_file_sync",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is new, but appears to be correct.
https://www.postgresql.org/docs/13/adminpack.html

@@ -361,6 +352,15 @@ func Ltree() *catalog.Schema {
},
ReturnType: &ast.TypeName{Name: "ltree"},
},
{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -16544,12 +16993,27 @@ func genPGCatalog() *catalog.Schema {
Args: []*catalog.Argument{},
ReturnType: &ast.TypeName{Name: "double precision"},
},
{
Name: "percentile_cont",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kyleconroy These functions seem to have been changed. I have no idea why.

},
{
Name: "percentile_disc",
Args: []*catalog.Argument{
{
Type: &ast.TypeName{Name: "double precision[]"},
},
{
Type: &ast.TypeName{Name: "anyelement"},
},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function definition has changed, but I am not sure why.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is because there are special functions which have "normal" arguments, and "aggregated group" arguments.

https://www.postgresql.org/docs/9.4/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, I ended up writing the SQL to distinguish between the 2 types of arguments, and it does indeed fix percentile_* functions, but now the basic aggregates are failing. (MAX, SUM, etc).

I think there needs to be more work in the compiler to treat / lookup these functions differently based on the context in the query. So not sure where to go from here.
c873256

@kyleconroy
Copy link
Collaborator

@skabbes Something is wrong with the updated pg-gen. We should be generating two functions for COUNT. Here's the current function signatures on main.

		{
			Name: "count",
			Args: []*catalog.Argument{
				{
					Type: &ast.TypeName{Name: "any"},
				},
			},
			ReturnType: &ast.TypeName{Name: "bigint"},
		},
		{
			Name:       "count",
			Args:       []*catalog.Argument{},
			ReturnType: &ast.TypeName{Name: "bigint"},
		},

And here's the new definition:

		{
			Name:       "count",
			Args:       []*catalog.Argument{},
			ReturnType: &ast.TypeName{Name: "bigint"},
		},
		{
			Name:       "count",
			Args:       []*catalog.Argument{},
			ReturnType: &ast.TypeName{Name: "bigint"},
		},

We're missing the any parameter. Maybe it's something to do with the cross join?

@skabbes
Copy link
Contributor Author

skabbes commented Jul 21, 2022

No, sorry I didn't explain well. routines in Postgres can be regular functions, procedures, aggregates, and window functions. https://www.postgresql.org/docs/current/catalog-pg-proc.html

If it is an aggregate, there are 2 kinds of arguments, the "direct" arguments, and the "aggregate" arguments. Previously, sqlc did not differentiate between them, and classified all of them as "direct" arguments.

Most aggregate functions that are used don't have any "direct" arguments at all, so it all kinda worked out just fine. However, percentile_cont and friends have both direct and aggregate arguments that sqlc was not equipped to handle.

You can see that you manually overrode the pg_catalog just to make that work in this commit.

My commit did the work to differentiate between these direct and aggregate arguments to routines.

So when I said:

I think there needs to be more work in the compiler to treat / lookup these functions differently based on the context in the query. So not sure where to go from here.

What I meant was we need to make "looking up a function" (actually, a routine) smarter in the sqlc internals, before this can work. Either that, or re-apply the manual workaround you did from September 2021.

The reality is that the function percentile_disc actually takes 2 parameters, the 0.5 (direct arguments) and the within group (order by authors.name) (the aggregate argument).

select percentile_disc(0.5) within group (order by authors.name)

@kyleconroy I think looking forward for sqlc, it needs to differentiate between them. I'd be happy to put some work into making that a reality, but just wanted to check-in and make sure you agree that's the right path.

@kyleconroy
Copy link
Collaborator

You can see that you manually overrode the pg_catalog just to make that work in this commit.

I honestly forgot about the manual changes I made to pg_catalog.go. While it was the right thing to do to fix the bug, it should have been fixed in sqlc-pg-gen. I'll take the // Code generated by sqlc-pg-gen. DO NOT EDIT. seriously in the future.

I'd be happy to put some work into making that a reality, but just wanted to check-in and make sure you agree that's the right path.

Yep, I think we're on the right path. That said, we're now fixing something unrelated to the original intent of this PR (aggregate functions vs PostgreSQL internal tables / views). I'd be happy to split this up if you'd like to get the tables and views in first, but it's up to you.

@skabbes
Copy link
Contributor Author

skabbes commented Aug 21, 2022

I think this:
#1809

Is the first step in actually making progress here.

@skabbes
Copy link
Contributor Author

skabbes commented Aug 22, 2022

I islolated the pg_* fixes in this PR:
#1811

I think the "aggreagated function args vs direct args" can / should be addressed independently.

@kyleconroy
Copy link
Collaborator

Agreed. Going to close this PR.

@kyleconroy kyleconroy closed this Aug 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pg_catalog tables and views not queryable (specifically pg_timezone_names)
2 participants