Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spanner/spannertest: GROUP BY without an aggregation function returns all rows #5676

Closed
slugclub opened this issue Feb 21, 2022 · 1 comment · Fixed by #5717
Closed

spanner/spannertest: GROUP BY without an aggregation function returns all rows #5676

slugclub opened this issue Feb 21, 2022 · 1 comment · Fixed by #5717
Assignees
Labels
api: spanner Issues related to the Spanner API. 🚨 This issue needs some love. triage me I really want to be triaged.

Comments

@slugclub
Copy link

In Spanner, a GROUP BY clause with no aggregation functions in the selected columns yields the same behavior as a DISTINCT clause. The spannertest package doesn't handle this properly.

Environment

macOS 12.2.1

Go Environment

$ go version
go version go1.17.1 darwin/amd64
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/evem/Library/Caches/go-build"
GOENV="/Users/evem/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/evem/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/evem/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="go1.17.1"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/evem/sos/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/jk/d5g3m1w57lv7pw_ltvvb49w8008sgq/T/go-build3706391793=/tmp/go-build -gno-record-gcc-switches -fno-common"

Code

If I add this test case where a GROUP BY is used to give the same behaviour as a DISTINCT:

--- a/spanner/spannertest/integration_test.go
+++ b/spanner/spannertest/integration_test.go
@@ -947,6 +947,15 @@ func TestIntegration_ReadsAndQueries(t *testing.T) {
                        },
                },
                // From https://cloud.google.com/spanner/docs/query-syntax#group-by-clause_1:
+               {
+                       `SELECT LastName FROM PlayerStats GROUP BY LastName`,
+                       nil,
+                       [][]interface{}{
+                               {"Adams"},
+                               {"Buchanan"},
+                               {"Coolidge"},
+                       },
+               },
                {
                        // TODO: Ordering matters? Our implementation sorts by the GROUP BY key,
                        // but nothing documented seems to guarantee that.

Expected behavior

I expect the test to pass.

Actual behavior

The test fails:

integration_test.go:1280: Results from Query("SELECT LastName FROM PlayerStats GROUP BY LastName", map[]) are wrong.
         got [[Adams] [Adams] [Buchanan] [Buchanan] [Coolidge]]
        want [[Adams] [Buchanan] [Coolidge]]

Additional context

The code expects that any query with a GROUP BY clause will include an aggregation function (e.g. SUM, MAX), but you don't need to include an aggregation function if the GROUP BY covers all of the selected columns.

See the Spanner docs:

GROUP BY is commonly used when aggregate functions are present in the SELECT list, or to eliminate redundancy in the output.

I have confirmed this behavior running against Google Cloud Spanner.

This fixes it but I'm not sure that it's the right approach:

--- a/spanner/spannertest/db_query.go
+++ b/spanner/spannertest/db_query.go
@@ -531,7 +531,7 @@ func (d *database) evalSelect(sel spansql.Select, qc *queryContext) (si *selIter
                }
                aggI = append(aggI, i)
        }
-       if len(aggI) > 0 {
+       if len(aggI) > 0 || len(sel.GroupBy) > 0 {
                raw, err := toRawIter(ri)
                if err != nil {
                        return nil, err
@slugclub slugclub added the triage me I really want to be triaged. label Feb 21, 2022
@product-auto-label product-auto-label bot added the api: spanner Issues related to the Spanner API. label Feb 21, 2022
@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Feb 26, 2022
@rahul2393 rahul2393 removed the triage me I really want to be triaged. label Mar 3, 2022
@rahul2393
Copy link
Contributor

rahul2393 commented Mar 3, 2022

Thanks @slugclub for opening the issue, spansql/spannertest packages lag behind compared to the actualy cloud spanner feature wise and functionality wise, We are thinking of freezing the development of spansql/spannertest packages and recommend to use officially supported https://cloud.google.com/spanner/docs/emulator for in-memory usage.
Thanks, for this issue I have opened the PR and will be released in next release.
cc: @hengfengli @olavloite

@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Mar 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: spanner Issues related to the Spanner API. 🚨 This issue needs some love. triage me I really want to be triaged.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants