Skip to content
This repository has been archived by the owner on Feb 21, 2024. It is now read-only.

forward-port tests from SQL1 tree #2261

Merged
merged 1 commit into from Mar 1, 2023
Merged

forward-port tests from SQL1 tree #2261

merged 1 commit into from Mar 1, 2023

Conversation

seebs
Copy link
Contributor

@seebs seebs commented Feb 17, 2023

This forward-ports a number of tests from the previous SQL implementation. The porting is approximate in a number of ways, and not all tests are implemented/tested yet.

In particular, several tests are currently disabled because we don't support limit n constructs.

The tests that were primarily tests of the parser have been brought forward as parser tests. One of them has been altered to add parentheses, because our parser interprets
fld1 between 1 and 3 and fld2 = 2
as:
fld1 between (1 and 3) and (fld2 = 2)
which is invalid, while the old parser apparently interpreted it as:
(fld1 between 1 and 3) and (fld2 = 2)

We have not yet verified the SQL spec's requirements here, but sqlite agrees with our old parser, not our new parser, so this may be a regression.

The old tests expected an INNER JOIN to suppress duplicate values. Our new code does not, which is consistent with other SQL implementations. This is a change, but the old behavior appears to have been wrong. (You can still suppress duplicate values by specifying DISTINCT.)

In the previous implementations, a value like count(*) had count(*) as its column name. In the new implementation, it has an empty string as its column name.

Related to this, the prior implementation allowed you to write
select age, count(*) from grouper group by age having count > 1
but the new implementationt requires that to be spelled as
having count(*) > 1

This is consistent with other SQL implementations, so I think the new behavior is correct.

The behavior of SHOW COLUMNS and SHOW TABLES has changed, in that the specific results returned are significantly different. Perhaps more significantly, the old system spelled the former query as SHOW FIELDS, rather than SHOW COLUMNS. This may be considered a regression, in that SHOW FIELDS no longer works, and we should consider whether any hypothetical users might have been relying on the output of either of these. (I hope not, the new output is much better.)

Some of the old tests (the ones in handler_test) were accommodated by adding a couple of specific test cases to existing tests, specifically:
* handling timestamp values with Z rather than +00:00
* a join with a WHERE clause referring to fields in both source tables

@seebs seebs force-pushed the fb-1870 branch 5 times, most recently from 39ab6b1 to 07af117 Compare February 24, 2023 16:55
This forward-ports a number of tests from the previous SQL
implementation. The porting is approximate in a number of ways,
and not all tests are implemented/tested yet.

In particular, several tests are currently disabled because
we don't support `limit n` constructs.

The tests that were primarily tests of the parser have been
brought forward as parser tests. One of them has been altered
to add parentheses, because our parser interprets
	fld1 between 1 and 3 and fld2 = 2
as:
	fld1 between (1 and 3) and (fld2 = 2)
which is invalid, while the old parser apparently interpreted it
as:
	(fld1 between 1 and 3) and (fld2 = 2)

We have not yet verified the SQL spec's requirements here, but
sqlite agrees with our old parser, not our new parser, so this
may be a regression.

The old tests expected an INNER JOIN to suppress duplicate
values. Our new code does not, which is consistent with other
SQL implementations. This is a change, but the old behavior
appears to have been wrong. (You can still suppress duplicate
values by specifying DISTINCT.)

In the previous implementations, a value like `count(*)` had
`count(*)` as its column name. In the new implementation,
it has an empty string as its column name.

Related to this, the prior implementation allowed you to
write
	select age, count(*) from grouper group by age having count > 1
but the new implementationt requires that to be spelled as
	having count(*) > 1

This is consistent with other SQL implementations, so I think
the new behavior is correct.

The behavior of SHOW COLUMNS and SHOW TABLES has changed, in
that the specific results returned are significantly different.
Perhaps more significantly, the old system spelled the former
query as SHOW FIELDS, rather than SHOW COLUMNS. This may be
considered a regression, in that `SHOW FIELDS` no longer works,
and we should consider whether any hypothetical users might
have been relying on the output of either of these. (I hope
not, the new output is much better.)

Some of the old tests (the ones in handler_test) were accommodated
by adding a couple of specific test cases to existing tests,
specifically:
	* handling timestamp values with `Z` rather than `+00:00`
	* a join with a WHERE clause referring to fields in both
	  source tables

We introduce a new "partial" comparison type, because there's
no way for a test of `SHOW TABLES` to contain a correct table
row, because `SHOW TABLES` includes timestamps from when tables
were created. I'm not sure this is the right way to do this.

We add corresponding changes to dax_test, because the DAX tree
tests against the SQL tests.

We change the returned types of field names and field types to
plain strings, ironically because DAX needs this -- the test code
in the DAX tree is getting them back as plain strings, rather
than as dax.FieldName and dax.BaseType.

The tests using `having` are commented out because they don't
seem to be working, a ticket has been filed for this.

Two of the tests that should return strings are instead returning
untranslated integer IDs, but only for DAX, not for the regular
SQL tests, and the `delete` test has been commented out for
DAX-specific errors. If we merge this, the next step is to
ticket those and address them separately.
@sonarcloud
Copy link

sonarcloud bot commented Feb 27, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 2 Code Smells

No Coverage information No Coverage information
39.3% 39.3% Duplication

@bruce-b-molecula bruce-b-molecula self-requested a review March 1, 2023 18:59
@paddyjok paddyjok merged commit b7e9879 into master Mar 1, 2023
@paddyjok paddyjok deleted the fb-1870 branch March 1, 2023 19:40
Herkemer pushed a commit to Herkemer/featurebase that referenced this pull request Mar 15, 2023
* use t.Fatal(f) to abort tests, not panic

* make perf_able run at all, make it debug a bit better

switch perf-able to using same node type we use for other spot instances,
because otherwise it never finds any available capacity.

we switch the perf-able script to use the standard get_value function
instead of direct jq calls.

we try to grab server logs if the restore fails in the hopes of finding
out why the restore very occasionally fails.

* Fix some issues with running IDK tests in docker. (FeatureBaseDB#2248)

*Stop running TestKafkaSourceIntegration with t.Parallel()

This test can't be run in parallel as it's currently written. Doing so
allows for interleaving of messages to the same kafka topic between
tests.

I didn't attempt to modify the test so it could be run in parallel. That
could be done, but left for someone more ambitious.

* Remove idk/testenv/certs which got accidentally committed.

also update .gitignore to include those.

* changes to add bool support in idk (FeatureBaseDB#2240)

* initial changes to add bool support in idk

* modifying some default parameters for testing, will revert them later

* adding support for bool in making fragments function

* boolean values implementation without supporting empty or null values at this point

* Implement bool support in batch using a map (and a slice for nulls) (FeatureBaseDB#2247)

* Implement bool support in batch using a map (and a slice for nulls)

* Keep the PackBools default for now

But set it explicity in the ingest tests which rely on it.

* Modify batch to construct bool update like mutex

The code in API.ImportRoaringShard has a switch statement which causes
bool fields to be handled like mutex fields. This means, that the
viewUpdate.Clear value should only contain data in the first "row" of
the fragment, which it will treat as records to clear for *all* rows.
This makes more sense for mutex fields; for bool fields, there's only
one other row to clear. But since the code is currently handling them
the same, we need to construct viewUpdate.Clear such that it conforms to
that pattern.

This commit also adds a test which covers this logic.

* Remove commented code; revert config for testing

This commit also removes the DELETE_SENTINEL case for non-packed bools,
since that isn't supported anyway.

* Revert default setting

* remove inconsistent type scope

* correcting the logic of string converstion to bool

* resolving an error in a test

* adding tests to cover code related to bool support in batch.go file and interface.go files

* modifying interfaces test

* added one more test case

Co-authored-by: Travis Turner <travis@pilosa.com>
Co-authored-by: Travis Turner <travis@molecula.com>

* resolving bool null field ingestion error (FeatureBaseDB#2254)

* resolving bool null field ingestion error

* testing issues

* adding null support for bools

* updating the null bool field ingestion

* trying to resolve issue when ingesting null value for bool type

* adding a clearing support for bool type

* resolving issues with bool null value ingestion

* updating the jwt go package version and removing changes made in docker compose file

* reverting jwt go version

* removing v4 of jwt

* adding a comment in test file to see if sonar cloud accepts this file

* don't obtain stack traces on rbf.Tx creation

We thought stack traces were mildly expensive. We were very wrong.
Due to a complicated issue in the Go runtime, simultaneous requests
for stack traces end up contending on a lock even when they're not
actually contending on any resources. I've filed a ticket in the
Go issue tracker for this:

	golang/go#56400

In the mean time: Under some workloads, we were seeing 85% of all
CPU time go into the stack backtraces, of which 81% went into the
contention on those locks. But even if you take away the contention,
that leaves us with 4/19 of all CPU time in our code going into
building those stack backtraces. That's a lot of overhead for a
feature we virtually never use.

We might consider adding a backtrace functionality here, possibly
using `runtime.Callers` which is much lower overhead, and allows us
to generate a backtrace on demand (no argument values available,
but then, we never read those because they're unformatted hex
values), but I don't think it's actually very informative to know
what the stack traces were of the Tx; they don't necessarily reflect
the current state of any ongoing use of the Tx, so we can't necessarily
correlate them to goroutine stack dumps, and so on.

* fb-1729 Enriched Table Metadata (FeatureBaseDB#2255)

enriched metadata for tables

added support for the concept of a table and field owners in metadata; mechanism to derive owner from http request metadata; metadata for table description

* tightened up is/is not null filter expressions (FB-1741) (FeatureBaseDB#2260)

Covers tightening up handling filter expressions that contain is/is not null ops. These filters may have to be translated into PQL calls to be passed to the executor and even though sql3 language supports nullability for any data type, currently only BSI fields are nullable at the storage engine level (there is a ticket to add support for non-BSI field here FB-1689: IS SQL Argument returns incorrect error) so when these fields are used in filter conditions we need to handle BSI and non-BSI fields differently.

* added a test to cover the keyword replace as being synonymous with insert (FeatureBaseDB#2261)

* update molecula references to featurebase (FeatureBaseDB#2262)

Co-authored-by: Seebs <seebs@molecula.com>
Co-authored-by: Travis Turner <travis@pilosa.com>
Co-authored-by: Pranitha-malae <56414132+Pranitha-malae@users.noreply.github.com>
Co-authored-by: Travis Turner <travis@molecula.com>
Co-authored-by: pokeeffe-molecula <85502298+pokeeffe-molecula@users.noreply.github.com>
Co-authored-by: Stephanie Yang <stephanie@pilosa.com>
Herkemer pushed a commit to Herkemer/featurebase that referenced this pull request Mar 15, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants