Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: more RDBMS integration tests in CI #2286

Merged
merged 35 commits into from
Mar 30, 2023
Merged

test: more RDBMS integration tests in CI #2286

merged 35 commits into from
Mar 30, 2023

Conversation

Jelenkee
Copy link
Contributor

I added integration test for different DBs (currently DuckDB, SQLite, MySQL, Postgres and SQL Server).
The testcases are in a separate text file which is read at build time. I created a setup sql file for all DBs. There are some hacks because it's not compatible with every dialect, but I guess it's ok for tests.

Closes #2211

@Jelenkee Jelenkee changed the title Integration tests in CI feat: Integration tests in CI Mar 22, 2023
@max-sixty
Copy link
Member

Overall this looks extremely impressive @Jelenkee !

I'll look more later but a couple of quick questions:

  • One thing that's important is a bare cargo test still working without someone having docker installed. Does that work now, or is the integration test run as part of normal tests? We can think of a way of handling that; maybe with a feature.
  • This seems to eschew the existing integration tests, is that correct? While those currently only run the in-process DBs in CI, there is a postgres version that should run once set up. The config isn't nearly as elegant as yours (e.g. it requires copying the build artifacts), it does have an existing set of queries and snapshots. Ideally we would share the test code & queries & snapshots between databases, both to ensure they work similarly and to improve maintainability.

@Jelenkee
Copy link
Contributor Author

  • Docker is not required because the test is ignored per default
  • The new tests do not cover everything from the old chinook test. That's why I did not delete them. Currently insta is not used to run the tests because I'm not familiar with it. So no snapshots are generated

@max-sixty
Copy link
Member

Docker is not required because the test is ignored per default

Ah yes, I see, thanks!

The new tests do not cover everything from the old chinook test. That's why I did not delete them. Currently insta is not used to run the tests because I'm not familiar with it. So no snapshots are generated

I'm quite hesitant for the project to maintain a different set of integration tests, with different queries & data. What do you think about working to integrate your database connection code, which seems like the big & difficult piece of this, with the existing tests' queries & data?

It doesn't need to be refactored to exactly the same pattern (i.e. current each DB is a mod: https://github.com/prql/prql/blob/da173c39bb1c3af90ac8681bec9bd5295c379432/prql-compiler/tests/integration/main.rs#L208) — instead it could just be mod external_dbs or something similar.

I'm happy to help if needed. FWIW I think you'll find libraries like insta really nice, it significantly reduces boilerplate, as well as making the whole PRQL library more maintainable.


Otherwise things look very good. I can have another look over later and confirm if that's helpful, so you don't feel like there's always another change request.

@Jelenkee
Copy link
Contributor Author

Jelenkee commented Mar 24, 2023

I added all of the old integration test to the new test file except for invoice_totals. I'm not sure if all of the DBs work with these s-string.

The problem is that the tests don't run successfully because MsSQL has problem with limit and MySQL has problems with loop (maybe others too). The behaviour should be fixed before this PR is merged.

You have to help me with the snapshotting stuff. I don't get, how and why those snapshots are generated and so on

Copy link
Member

@max-sixty max-sixty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure, very happy to help.

FWIW I'm out this weekend so won't really be online for a few days, but back at the start of next week.

I think we have two main choices:

I don't have a strong preference of whether which we move into which (though I do definitely think we should consolidate them, as discussed above).

Small nit: if we use the PR as a base and merge the existing code in, let's rename from "-vendors" to something like "-dbs" or "-rdmbs"; they're not really vendors per se

I left one comment for how to use insta — check that out — I don't think that part is much work at all. (Again, very happy to help there, though)

The main work will be loading the data into the DBs — the existing code loads CSVs directly, rather than INSERT INTO statements. I do think that gives us a bigger corpus of data to work with. And we are generally trying to move all our examples and tests in the direction of this specific dataset. Do all the DBs support loading from CSVs?


Does this make sense? I'm very appreciative of the contribution, I hope me asking for the change comes across as reasonable. I think this will be a great set of tests and I'm keen for the project keep using them and growing with them, hence the focus on ensuring these are easy to maintain. Very open to feedback.

prql-compiler/tests/integration-vendors/main.rs Outdated Show resolved Hide resolved
Copy link
Member

@eitsupi eitsupi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to see integration testing!
I added some comments around Docker.

.github/workflows/integration-test.yaml Outdated Show resolved Hide resolved
prql-compiler/tests/integration-vendors/docker-compose.yml Outdated Show resolved Hide resolved
prql-compiler/tests/integration-vendors/docker-compose.yml Outdated Show resolved Hide resolved
prql-compiler/tests/integration-vendors/docker-compose.yml Outdated Show resolved Hide resolved
prql-compiler/tests/integration-vendors/docker-compose.yml Outdated Show resolved Hide resolved
@max-sixty
Copy link
Member

The problem is that the tests don't run successfully because MsSQL has problem with limit and MySQL has problems with loop (maybe others too). The behaviour should be fixed before this PR is merged.

That's totally fine, let's just keep a list of exclusions, either in rust or with some comment in the test (e.g. -- mssql-exclude) — whatever is closer (probably the first is easier?)

I think we should try and merge this and then later figure out how to fix them & remove the exclusions — otherwise I worry this won't get merged for a while, and it's good code!


I added all of the old integration test to the new test file except for invoice_totals. I'm not sure if all of the DBs work with these s-string.

Sorry if I'm missing them — where are the old integration tests being run? I can't see them in testcases.txt or that the rust code in https://github.com/PRQL/prql/blob/b35f92eebaf5973f9f16c6098fde456a60e17912/prql-compiler/tests/integration-rdbms/main.rs is grabbing the old files.

Or maybe we haven't added them yet because of the problem above?


How can I run the tests locally? I've run docker-compose -d up and then cargo test --test integration-rdbms — they're passing but not running (I tried adding a panic!() to confirm.).

Possibly I have an issue running the docker containers, I get some odd debug logs from mssql.

Could it at least run the in-process tests such as duckdb without an external service? The existing integration tests do this.

@max-sixty
Copy link
Member

The main work will be loading the data into the DBs — the existing code loads CSVs directly, rather than INSERT INTO statements. I do think that gives us a bigger corpus of data to work with. And we are generally trying to move all our examples and tests in the direction of this specific dataset. Do all the DBs support loading from CSVs?

Just to reiterate this — there might be some refinements — e.g. excluding some tests that fail — but basically this is the one piece of work remaining

@max-sixty
Copy link
Member

max-sixty commented Mar 28, 2023

I think we're basically there. I just fixed the CLI test and now all ubuntu tests pass.

Given how large his already is — if you're up for refining this over the next few weeks, let's merge now and iterate on it in the future?

We get a failure on Mac, and I guess we will on Windows. Shall we just hack around that by conditioning running these tests on being in Linux? It's not permanent but it'll let us merge now.


Some follow-ups I can think of for future PRs. We can move into a separate issue:

  • Remove previous integration tests, move the data files into this path
  • Run in-process DBs even without an external connection (will be quite helpful if we start using this more)
  • Add a description & link to https://github.com/prql/prql/blob/7ae804a9d3f42f6ed3d33c04ab910dca164c7efc/web/book/src/contributing/development.md#L310
  • Implement whether to run external integration tests with a feature, rather than reading the CI env var (that could become a bit confusing otherwise)
  • (minor) Possibly output the snapshots as CSVs / something both machine & person-readable, e.g. with headers, rather than one long line
  • (minor) Consider whether we keep it in the current CI test vs move to another, possibly in test-all.yaml. It currently adds 40 seconds, which is not too bad. Or we condition the test-rust tests on a PR's paths, and avoid running it for small doc changes.
  • Any others @Jelenkee ?

@eitsupi
Copy link
Member

eitsupi commented Mar 29, 2023

Given the runner's resources, it would make sense to run tests that use Docker only on Linux runner.

@eitsupi eitsupi changed the title feat: Integration tests in CI test: more RDBMS integration tests in CI Mar 29, 2023
@Jelenkee
Copy link
Contributor Author

Wenn starting docker compose the owner of prql-compiler/tests/integration/data/chinook is changed on Linux. Probably on macOS too. That should be fixed because you cannot edit the files in there.

@Jelenkee
Copy link
Contributor Author

The CI is not running. Did I break something?

@max-sixty
Copy link
Member

The CI is not running. Did I break something?

Maybe a GitHub bug... I added a label to see if that would jolt it and it seems to have worked


#[test]
fn test_rdbms() {
if env::var("SKIP_INTEGRATION").is_ok() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very reasonable workaround!

@Jelenkee
Copy link
Contributor Author

Jelenkee commented Mar 29, 2023

It has a problem because insta found snaps that belong to no test becuase the tests were skipped.
https://github.com/PRQL/prql/actions/runs/4556292813/jobs/8036462791?pr=2286

if: ${{ inputs.os != 'windows-latest' }}
- name: skip integration if windows
run: echo "SKIP_INTEGRATION=true" >> $GITHUB_ENV
if: ${{ inputs.os == 'windows-latest' }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly also on Mac?

@max-sixty
Copy link
Member

max-sixty commented Mar 29, 2023

Almost there!!

It has a problem because insta found snaps that belong to no test becuase the tests were skipped.
https://github.com/PRQL/prql/actions/runs/4556292813/jobs/8036462791?pr=2286

Re the taskfile error — let's comment out --unreferenced=auto in the Taskfile.yml, and add a link to this PR? I can look at whether we can restore that in future.

@Jelenkee
Copy link
Contributor Author

Jelenkee commented Mar 29, 2023

For some reason test-js fails

And the linter is mad at me and I don't know why

@max-sixty
Copy link
Member

For some reason test-js fails

Weird, I think that's just flaky

And the linter is mad at me and I don't know why

I added a fix

@max-sixty
Copy link
Member

max-sixty commented Mar 30, 2023

I just resolved a merge conflict, then let's merge! I recognize it's been a long road.

I do think it would be quite valuable to refine parts; particularly where it touches the wider library — e.g. removing the old integration tests in favor of these / having a command run the same tests wherever it's run / etc.

To the extent you're up for opening an issue with the notes we made above, that would be great. No great rush on them, and possible others will pick up bits of it.

We could also add an entry to the changelog announcing the new tests.

Thank you very much indeed @Jelenkee !

@max-sixty max-sixty enabled auto-merge (squash) March 30, 2023 04:36
@max-sixty max-sixty merged commit 33fb9e0 into PRQL:main Mar 30, 2023
@Jelenkee Jelenkee mentioned this pull request Mar 30, 2023
7 tasks
max-sixty added a commit to max-sixty/prql that referenced this pull request Apr 2, 2023
* -added docker compose
-added test skeleton

* -added dependencies

* -added real test cases

* -added mssql

* refactoring

* -added go-task

* -more testcases

* -added github workflow

* -fixed formatting

* -fixed linting

* -fixed workflow yaml

* -fixed workflow yaml again

* -added more testcases

* Apply suggestions from code review

Co-authored-by: eitsupi <50911393+eitsupi@users.noreply.github.com>

* -refactored test workflow

* Fix order of args to `assert_display_snapshot`

Sorry for the incorrect suggestion prior!

* Specify platform for mysql image

* -added csv import

* -fixed or skipped all prql files

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* -fixed sqlite

* added snaps§

* Update prql-compiler/prqlc/src/cli.rs

* -removed test on macOS

* empty

* -skip integration test on windows

* -fixed taskfile

* -added double quotes

* Update .github/workflows/test-rust.yaml

---------

Co-authored-by: eitsupi <50911393+eitsupi@users.noreply.github.com>
Co-authored-by: Maximilian Roos <m@maxroos.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>
@Jelenkee Jelenkee deleted the integration-tests branch April 13, 2023 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integration test for different DBs in CI
3 participants