Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: Add the builtin function _pg_expandarray() #24422

Merged
merged 1 commit into from
Apr 4, 2018

Conversation

BramGruneir
Copy link
Member

@BramGruneir BramGruneir commented Apr 2, 2018

This adds the information_schema._pg_expandarray() function. It is needed for
compatibility with a number of ORMs. See #16971.

information_schema._pg_expandarray takes an array and returns a set of the
value and an index into the array. This is a very old function and from what I
can discern, was designed for internal use only, but was picked up by a few
ORMs. There is no real supporting documentation for the feature. The code for
it can be seen here:
https://sourcegraph.com/github.com/postgres/postgres/-/blob/src/backend/catalog/information_schema.sql#L43:17

Furthermore, if the function is a Set Retruning Function, it returns tuples
when evaluated in a scalar context:

In Postgres:

SELECT information_schema._pg_expandarray(array['i','i','o','t','b']);

 _pg_expandarray
-----------------
 (i,1)
 (i,2)
 (o,3)
 (t,4)
 (b,5)
(5 rows)

With this patch Cockroach now supports this directly as well:

SELECT information_schema._pg_expandarray(array['i','i','o','t','b']);

+------------------------------------+
| information_schema._pg_expandarray |
+------------------------------------+
| ('i',1)                            |
| ('i',2)                            |
| ('o',3)                            |
| ('t',4)                            |
| ('b',5)                            |
+------------------------------------+
(5 rows)

However, when evaluating an SRF in the FROM context, it should return
columns. Luckily, we already do exactly that. This is true for unnest() as
well as the new _pg_expandarray. The difference is a little subtle and I
couldn't find any good documentation on it, but this answer seems to cover it
quite well:
https://dba.stackexchange.com/questions/172463/understanding-set-returning-function-srf-in-the-select-list

In Postgres:

SELECT * FROM information_schema._pg_expandarray(array['i','i','o','t','b']);

 x | n
---+---
 i | 1
 i | 2
 o | 3
 t | 4
 b | 5
(5 rows)

In Cockroach:

SELECT * from information_schema._pg_expandarray(array['i','i','o','t','b']);

+---+---+
| x | n |
+---+---+
| i | 1 |
| i | 2 |
| o | 3 |
| t | 4 |
| b | 5 |
+---+---+
(5 rows)

Note that after froming the resulting set to a table, this function is
essentially the same as:

SELECT * from unnest(array['i','1','3','r']) with ordinality as c(x,n);

+---+---+
| x | n |
+---+---+
| i | 1 |
| 1 | 2 |
| 3 | 3 |
| r | 4 |
+---+---+
(4 rows)

But to retain direct compatibility with Postgres, the original SET response
needs to be maintained as well.

Part of #16971.

Release note (sql change): Added support for the
information_schema._pg_expandarray() function.

@BramGruneir BramGruneir added this to the 2.1 milestone Apr 2, 2018
@BramGruneir BramGruneir requested review from jordanlewis, knz and a team April 2, 2018 18:51
@cockroach-teamcity
Copy link
Member

This change is Reviewable

makeGeneratorBuiltinWithReturnType(
tree.ArgTypes{{"input", types.AnyArray}},
func(args []tree.TypedExpr) types.T {
if len(args) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

len(args) == 0 is impossible since the function always accepts exactly one argument.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it is needed or else this testcase will panic:

query error pq: information_schema\._pg_expandarray\(\): cannot determine type of empty array\. Consider annotating with the desired type, for example ARRAY\[\]:::int\[\]
SELECT information_schema._pg_expandarray(ARRAY[])

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. That's a different case, you would test that with len(args[0]) == 0, not len(args) == 0. (but please don't do that)

  2. that error is not a panic, it's a legitimate error and it is expected. array[] without an explicit type will give that error in any context, no tjust with your new built-in. Make the test case use array[]:::int[] and see what happens.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already added the test cases for array[]:::int[].

So, to be clear, by removing the if len(args) == 0.

SELECT information_schema._pg_expandarray(ARRAY[])

Results in a panic and not an error. It's that if condition that returns the tree.UnknownReturnType error:

panic: runtime error: index out of range

goroutine 1 [running]:
github.com/cockroachdb/cockroach/pkg/sql/sem/builtins.glob..func191(0x0, 0x0, 0x0, 0x527eee0, 0xc420502300)
	/Users/bram/go/src/github.com/cockroachdb/cockroach/pkg/sql/sem/builtins/generator_builtins.go:107 +0x1dc
github.com/cockroachdb/cockroach/pkg/sql/sem/tree.returnTypeToFixedType(0x4aee5e0, 0xc4201b3a20, 0x13)
	/Users/bram/go/src/github.com/cockroachdb/cockroach/pkg/sql/sem/tree/overload.go:265 +0x3b
github.com/cockroachdb/cockroach/pkg/sql/sem/tree.Builtin.FixedReturnType(0x4be1540, 0xc420502320, 0x4aee5e0, 0x100, 0x3, 0x4a673dd, 0xd, 0x4a8dcc5, 0x36, 0x0, ...)
	/Users/bram/go/src/github.com/cockroachdb/cockroach/pkg/sql/sem/tree/builtin.go:125 +0x30
main.generateFunctions(0xc42012c000, 0x1, 0x2, 0xc4205001e0, 0x1f)
	/Users/bram/go/src/github.com/cockroachdb/cockroach/pkg/cmd/docgen/funcs.go:192 +0x37e
main.init.0.func1(0xc4200ce240, 0xc420465960, 0x1, 0x2, 0x0, 0x0)
	/Users/bram/go/src/github.com/cockroachdb/cockroach/pkg/cmd/docgen/funcs.go:52 +0x171
github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).execute(0xc4200ce240, 0xc420465940, 0x2, 0x2, 0xc4200ce240, 0xc420465940)
	/Users/bram/go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:698 +0x46d
github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc4200ce900, 0x4a75f49, 0x1e, 0xc4200ce900)
	/Users/bram/go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:783 +0x2e4
github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).Execute(0xc4200ce900, 0x1, 0xc4200ce240)
	/Users/bram/go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:736 +0x2b
main.main()
	/Users/bram/go/src/github.com/cockroachdb/cockroach/pkg/cmd/docgen/main.go:37 +0x27

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nvanbenschoten can you advise what is the best course of action here.

@knz
Copy link
Contributor

knz commented Apr 2, 2018

Very nice. I am flabbergasted on how elegant/simple this is.

@BramGruneir
Copy link
Member Author

Just added a few extra test cases for edge cases.

@nvanbenschoten
Copy link
Member

s/froming/forming/ in commit description


Reviewed 1 of 3 files at r1, 1 of 2 files at r2.
Review status: 2 of 3 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.


pkg/sql/sem/builtins/generator_builtins.go, line 107 at r1 (raw file):

Previously, knz (kena) wrote…

@nvanbenschoten can you advise what is the best course of action here.

This isn't great, but I think it's correct at the moment (see unnest). For generic functions where the return type is a function of the input parameter types, we can't know the return type without knowing the input types. In this case, the contract for ReturnTyper is that it should return UnknownReturnType.


Comments from Reviewable

This adds the information_schema._pg_expandarray() function. It is needed for
compatibility with a number of ORMs. See cockroachdb#16971.

information_schema._pg_expandarray takes an array and returns a set of the
value and an index into the array. This is a very old function and from what I
can discern, was designed for internal use only, but was picked up by a few
ORMs. There is no real supporting documentation for the feature. The code for
it can be seen here:
https://sourcegraph.com/github.com/postgres/postgres/-/blob/src/backend/catalog/information_schema.sql#L43:17

Furthermore, if the function is a Set Retruning Function, it returns tuples
when evaluated in a scalar context:

In Postgres:

```sql
SELECT information_schema._pg_expandarray(array['i','i','o','t','b']);

 _pg_expandarray
-----------------
 (i,1)
 (i,2)
 (o,3)
 (t,4)
 (b,5)
(5 rows)
```

With this patch Cockroach now supports this directly as well:

```sql
SELECT information_schema._pg_expandarray(array['i','i','o','t','b']);

+------------------------------------+
| information_schema._pg_expandarray |
+------------------------------------+
| ('i',1)                            |
| ('i',2)                            |
| ('o',3)                            |
| ('t',4)                            |
| ('b',5)                            |
+------------------------------------+
(5 rows)
```

However, when evaluating an SRF in the `FROM` context, it should return
columns. Luckily, we already do exactly that. This is true for `unnest()` as
well as the new `_pg_expandarray`. The difference is a little subtle and I
couldn't find any good documentation on it, but this answer seems to cover it
quite well:
https://dba.stackexchange.com/questions/172463/understanding-set-returning-function-srf-in-the-select-list

In Postgres:

```sql
SELECT * FROM information_schema._pg_expandarray(array['i','i','o','t','b']);

 x | n
---+---
 i | 1
 i | 2
 o | 3
 t | 4
 b | 5
(5 rows)

```

In Cockroach:

```sql
SELECT * FROM information_schema._pg_expandarray(array['i','i','o','t','b']);

+---+---+
| x | n |
+---+---+
| i | 1 |
| i | 2 |
| o | 3 |
| t | 4 |
| b | 5 |
+---+---+
(5 rows)
```

Note that this function with the FROM context is essentially the same as:

```sql
SELECT * FROM unnest(array['i','1','3','r']) with ordinality as c(x,n);

+---+---+
| x | n |
+---+---+
| i | 1 |
| 1 | 2 |
| 3 | 3 |
| r | 4 |
+---+---+
(4 rows)
```

But to retain direct compatibility with Postgres, the original SET response
needs to be maintained as well.

Part of cockroachdb#16971.

Release note (sql change): Added support for the
information_schema._pg_expandarray() function.
@BramGruneir
Copy link
Member Author

Thanks.

I guess it should be FROMing. Just re-worded to make it clearer.


Review status: 2 of 3 files reviewed at latest revision, 1 unresolved discussion.


Comments from Reviewable

@BramGruneir
Copy link
Member Author

@knz, just waiting on a final LGTM and I'll get bors to merge this.

@knz
Copy link
Contributor

knz commented Apr 4, 2018

:lgtm:


Review status: 2 of 3 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.


Comments from Reviewable

@BramGruneir
Copy link
Member Author

bors r=knz

craig bot added a commit that referenced this pull request Apr 4, 2018
24422: sql: Add the builtin function _pg_expandarray() r=knz a=BramGruneir

This adds the information_schema._pg_expandarray() function. It is needed for
compatibility with a number of ORMs. See #16971.

information_schema._pg_expandarray takes an array and returns a set of the
value and an index into the array. This is a very old function and from what I
can discern, was designed for internal use only, but was picked up by a few
ORMs. There is no real supporting documentation for the feature. The code for
it can be seen here:
https://sourcegraph.com/github.com/postgres/postgres/-/blob/src/backend/catalog/information_schema.sql#L43:17

Furthermore, if the function is a Set Retruning Function, it returns tuples
when evaluated in a scalar context:

In Postgres:

```sql
SELECT information_schema._pg_expandarray(array['i','i','o','t','b']);

 _pg_expandarray
-----------------
 (i,1)
 (i,2)
 (o,3)
 (t,4)
 (b,5)
(5 rows)
```

With this patch Cockroach now supports this directly as well:

```sql
SELECT information_schema._pg_expandarray(array['i','i','o','t','b']);

+------------------------------------+
| information_schema._pg_expandarray |
+------------------------------------+
| ('i',1)                            |
| ('i',2)                            |
| ('o',3)                            |
| ('t',4)                            |
| ('b',5)                            |
+------------------------------------+
(5 rows)
```

However, when evaluating an SRF in the `FROM` context, it should return
columns. Luckily, we already do exactly that. This is true for `unnest()` as
well as the new `_pg_expandarray`. The difference is a little subtle and I
couldn't find any good documentation on it, but this answer seems to cover it
quite well:
https://dba.stackexchange.com/questions/172463/understanding-set-returning-function-srf-in-the-select-list

In Postgres:

```sql
SELECT * FROM information_schema._pg_expandarray(array['i','i','o','t','b']);

 x | n
---+---
 i | 1
 i | 2
 o | 3
 t | 4
 b | 5
(5 rows)

```

In Cockroach:

```sql
SELECT * from information_schema._pg_expandarray(array['i','i','o','t','b']);

+---+---+
| x | n |
+---+---+
| i | 1 |
| i | 2 |
| o | 3 |
| t | 4 |
| b | 5 |
+---+---+
(5 rows)
```

Note that after froming the resulting set to a table, this function is
essentially the same as:

```sql
SELECT * from unnest(array['i','1','3','r']) with ordinality as c(x,n);

+---+---+
| x | n |
+---+---+
| i | 1 |
| 1 | 2 |
| 3 | 3 |
| r | 4 |
+---+---+
(4 rows)
```

But to retain direct compatibility with Postgres, the original SET response
needs to be maintained as well.

Part of #16971.

Release note (sql change): Added support for the
information_schema._pg_expandarray() function.
@craig
Copy link
Contributor

craig bot commented Apr 4, 2018

Build succeeded

@craig craig bot merged commit 0e85331 into cockroachdb:master Apr 4, 2018
@BramGruneir BramGruneir deleted the expandarray branch April 4, 2018 19:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants