Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple groupings that compile to use _rn result in a DB query error #826

Closed
mklopets opened this issue Jul 18, 2022 · 9 comments
Closed
Labels
bug Invalid compiler output or panic priority

Comments

@mklopets
Copy link
Collaborator

If, in a given pipeline, you use group more than once in a way that results in partition by being used with _rn, you will end up with multiple _rn columns and the database getting confused (this example was tested on Postgres)

from y_orig

group [y_id] (
  take 2 # take 1 uses `distinct` instead of partitioning, which might be a separate bug
)

group [x_id] (
  take 3
)

compiles to

WITH table_0 AS (
  SELECT
    y_orig.*,
    ROW_NUMBER() OVER (PARTITION BY y_id) AS _rn
  FROM
    y_orig
),
table_1 AS (
  SELECT
    table_0.*,
    ROW_NUMBER() OVER (PARTITION BY x_id) AS _rn
  FROM
    table_0
  WHERE
    _rn <= 2
)
SELECT
  table_1.*
FROM
  table_1
WHERE
  _rn <= 3

which errors with

Query 1 ERROR: ERROR:  column reference "_rn" is ambiguous
LINE 22:   _rn <= 3
           ^

On a related note, take 1 compiles to use distinct instead of partition by, which seems like random/inconsistent behaviour.

@mklopets mklopets added the bug Invalid compiler output or panic label Jul 18, 2022
@max-sixty
Copy link
Member

Thanks @mklopets . At least this one should be a reasonably easy fix.

On a related note, take 1 compiles to use distinct instead of partition by, which seems like random/inconsistent behaviour.

This is intended — it's much easier to read the SQL if it's distinct. I agree the change from take 2 to take 1 is large in SQL, but if we produced complicated SQL rather than just using distinct I think that would also be bad...

@max-sixty
Copy link
Member

I think possibly we can qualify _rn with the table it comes from. We don't have that much context when we form _rn so we may need to do a pass at the end — I'll have to spend some more time checking.

@mklopets what issues would you want prioritizied out of the last few ones you posted?

@mklopets
Copy link
Collaborator Author

@max-sixty issue #820 is the one currently causing (by far) the most headache, but #822 would also become a problem when making a new npm release (though we can just do a temporary revert as described under #822)

@mklopets
Copy link
Collaborator Author

This one here is currently not blocking anything, but there's also no known hacky workaround for it.

@mklopets
Copy link
Collaborator Author

I think possibly we can qualify _rn with the table it comes from

FWIW, it sounds like that still wouldn't work for multiple groupings on the same table, for example

just adding an incremental number to each of those row number columns, e.g. _rn_0 wouldn't look super pertty but it'd a) match how table_0 is generated and b) fix the problem

@max-sixty
Copy link
Member

just adding an incremental number to each of those row number columns, e.g. _rn_0 wouldn't look super pertty but it'd a) match how table_0 is generated and b) fix the problem

Yes, I think this would be the easiest approach. We don't have that context at the time unfortunately given how the compiler is modularized, but I think there are still ways of doing something. And very worst case we could always do a pass at the end.

@qharlie
Copy link
Contributor

qharlie commented Jul 24, 2022

And very worst case we could always do a pass at the end.

@max-sixty Where does that happen ?

@max-sixty
Copy link
Member

@charlie-sanders I think somewhere here: https://github.com/prql/prql/blob/main/prql-compiler/src/sql/translator.rs#L94-L102

I would be completely fine with that as a hack, and then we fix it later; would be great to add a TODO in the code if you do do that!

max-sixty added a commit that referenced this issue Jul 25, 2022
* Adding row_number_id to ROW_NUMBER() SQL to make them unique

* Adding tests for #826

* Fixing clippy

* Testing raw flag to pass the deadlinks check

* Apply suggestions from code review

Try reverting raw link change

Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>
@qharlie
Copy link
Contributor

qharlie commented Jul 25, 2022

Closed in #853

@qharlie qharlie closed this as completed Jul 25, 2022
max-sixty added a commit that referenced this issue Jul 29, 2022
* Adding row_number_id to ROW_NUMBER() SQL to make them unique

* Adding tests for #826

* Fixing clippy

* Testing raw flag to pass the deadlinks check

* Apply suggestions from code review

Try reverting raw link change

Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>
max-sixty added a commit that referenced this issue Jul 29, 2022
* Add double-bracket strings

* Update prql-compiler/src/prql.pest

* Remove unnecessary steps from GHAs (#805)

* Remove unnecessary steps from GHAs

* Remove version from mdbook-admonish install (#808)

As dscribed in the comment, not worth the extra time to pin the version

* Add section to contributing.md explaining our tests (#807)

* Add section to contributing explaining our tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* .

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Release prql-java (#781)

* rename java test package name

* cross try

* release conf

* fix tests

* update doc

* profile adjustment

* pre-commit

* update developers

Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>
Co-authored-by: Maximilian Roos <m@maxroos.com>

* Bump actions/setup-java from 1 to 3 (#810)

Bumps [actions/setup-java](https://github.com/actions/setup-java) from 1 to 3.
- [Release notes](https://github.com/actions/setup-java/releases)
- [Commits](actions/setup-java@v1...v3)

---
updated-dependencies:
- dependency-name: actions/setup-java
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Improve GHA caching (#811)

* Improve GHA caching

Swap out actions-rs for @baptiste0928 crate

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Use default target for book build (#813)

Now we no longer have the playground, no need to use wasm to test the book

* Dialects for ident escaping (#809)

* prql-macros (#814)

* Bump rusqlite from 0.27.0 to 0.28.0 (#815)

* Bump @testing-library/user-event from 14.2.1 to 14.2.3 in /playground (#816)

* Remove incorrect link (#821)

Removing this because it generates a 404, as per #818.

Is there a way to link to a section on the homepage? Linking to prql-lang.org/index.html#Bindings doesn't seem to work either.

* Bump @testing-library/user-event from 14.2.3 to 14.2.5 in /playground (#817)

Bumps [@testing-library/user-event](https://github.com/testing-library/user-event) from 14.2.3 to 14.2.5.
- [Release notes](https://github.com/testing-library/user-event/releases)
- [Changelog](https://github.com/testing-library/user-event/blob/main/CHANGELOG.md)
- [Commits](testing-library/user-event@v14.2.3...v14.2.5)

---
updated-dependencies:
- dependency-name: "@testing-library/user-event"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add back link to list of bindings from FAQ (#823)

* docs: link to list of bindings from faq

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* avoid setting empty id for most sections

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Add "copy to clipboard" button to playground for output SQL (#825)

* add "copy to clipboard" to playground output sql

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Bump @testing-library/user-event from 14.2.5 to 14.2.6 in /playground (#824)

Bumps [@testing-library/user-event](https://github.com/testing-library/user-event) from 14.2.5 to 14.2.6.
- [Release notes](https://github.com/testing-library/user-event/releases)
- [Changelog](https://github.com/testing-library/user-event/blob/main/CHANGELOG.md)
- [Commits](testing-library/user-event@v14.2.5...v14.2.6)

---
updated-dependencies:
- dependency-name: "@testing-library/user-event"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump terser from 5.13.1 to 5.14.2 in /playground (#830)

Bumps [terser](https://github.com/terser/terser) from 5.13.1 to 5.14.2.
- [Release notes](https://github.com/terser/terser/releases)
- [Changelog](https://github.com/terser/terser/blob/master/CHANGELOG.md)
- [Commits](https://github.com/terser/terser/commits)

---
updated-dependencies:
- dependency-name: terser
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add from_json to compiler (#829)

* Add from_json to compiler

* Update prql-compiler/src/lib.rs

Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>

Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>

* Bump insta from 1.15.0 to 1.16.0 (#832)

Bumps [insta](https://github.com/mitsuhiko/insta) from 1.15.0 to 1.16.0.
- [Release notes](https://github.com/mitsuhiko/insta/releases)
- [Changelog](https://github.com/mitsuhiko/insta/blob/master/CHANGELOG.md)
- [Commits](mitsuhiko/insta@1.15.0...1.16.0)

---
updated-dependencies:
- dependency-name: insta
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add test for from_json (#831)

* Adding test for from_json

* Formatting

* Using better prql to test with

* Fixes #748, tests for prql-js  (#833)

* For #748 , tests for javascript package

* Bump @testing-library/user-event from 14.2.6 to 14.3.0 in /playground (#827)

Bumps [@testing-library/user-event](https://github.com/testing-library/user-event) from 14.2.6 to 14.3.0.
- [Release notes](https://github.com/testing-library/user-event/releases)
- [Changelog](https://github.com/testing-library/user-event/blob/main/CHANGELOG.md)
- [Commits](testing-library/user-event@v14.2.6...v14.3)

---
updated-dependencies:
- dependency-name: "@testing-library/user-event"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix some display trait bugs (#836)

* fix some display trait bugs

* convert single character string to char

* replace find and is_some with contains

* add snapshot test for display trait (#837)

* add snapshot test for display trait

* regenerate test

* Fix quoting issues in idents (#838)

* Fix quoting issues in idents

* Update docs

* Fixes #834, book/integrations for rust and python (#835)

* Fixes #834, integrations for rust and python

* Fix precedence issue (#839)

* Fix precedence issue

Not yet complete, but solves the immediate issue, and a nice framework to build off, thanks to @alijazerzen

* Complete the BinOps

* .

* Fix book build (#840)

The SQL wasn't being created in the book recently; and the mdbook uses an error as a value in its preprocessors, so this wasn't getting caught.

Not sure what a better design is, beyond me checking the actual website rather than my local build (or mdbook improving their interface...)

* Bump dependencies (#841)

* Fix book build

The SQL wasn't being created in the book recently; and the mdbook uses an error as a value in its preprocessors, so this wasn't getting caught.

Not sure what a better design is, beyond me checking the actual website rather than my local build (or mdbook improving their interface...)

* Bump dependencies

To bust cache, since #840 compiles new dependencies without changing Cargo.lock

* Attempt to silence codeql warning (#842)

ref github/codeql#4850

* Update gitignore (#844)

Don't include `.prql` files; consolidate `target` paths

* Improve error message on multi-part ident in equijoins (#845)

* Add release info to prql-macros crate (#846)

We don't publish this separately; `cargo-release` needs this info

* Add release info to prql-macros crate (#846) (#847)

We don't publish this separately; `cargo-release` needs this info

* Add cargo-release onto `task setup-dev` (#843)

* Add cargo-release onto `task setup-dev`

And remove deprecated task

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* (cargo-release) version 0.2.3 (#848)

* Disable prql-java release workflow (#851)

Until #850 is fixed.

(TBC, no great stress here -- better to have started and rolled back than never to have started :) )

* Fix python release workflow (#849)

* Fix python releaes workflow

Really we should have a test release workflow running which does a dry-run, so we don't get these sorts of problems (which I'm sure was my fault tbc...)

* Unique row number aliases (#853)

* Adding row_number_id to ROW_NUMBER() SQL to make them unique

* Adding tests for #826

* Fixing clippy

* Testing raw flag to pass the deadlinks check

* Apply suggestions from code review

Try reverting raw link change

Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>

* Only run check-links on modified files (#854)

* Only run check-links on modified files

And then run every day on all. This should reduce the issues we had in #853, not the first time we've had them.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Only install wasm-pack in CI when needed (#855)

* Only install wasm-pack in CI when needed

GHA builds are getting longer now, we're well past our 1 minute target...

Though this isn't going to help much!

* Add some workflows to track our compilation time (#856)

* Add some workflows to track our compilation time

It's getting quite long now — CI jobs take 90-150 seconds, almost all of
which is compilation, even with a cache.

Locally I'm seeing more like 20s after wiping `target`.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* .

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* .

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Update docs in ast_fold.rs (#857)

* Precedence for unary operators and IS NULL (#859)

* Test compilation time for all targets (#860)

* Upgrade dependencies (#861)

Busts cache so #860 will cache correctly

* cli revamp (#863)

* Reorganize top-level functions (#865)

* Remove cached cargo-timings files in GHA (#864)

* Remove cached cargo-timings files in GHA

* Bump deps (#867)

Another cache bust, this time for #864

* Fix links to stdlib.prql (#868)

Weirdly the book printed an error, but passed?!
https://github.com/prql/prql/runs/7530558870?check_suite_focus=true

(this was from #1513, zero stress tbc)

* Fix BigQuery quoting (#858)

* Fix BigQuery quoting

Not sure this will be the end of it, but progress

* .

* Bump insta from 1.16.0 to 1.17.0 (#870)

Bumps [insta](https://github.com/mitsuhiko/insta) from 1.16.0 to 1.17.0.
- [Release notes](https://github.com/mitsuhiko/insta/releases)
- [Changelog](https://github.com/mitsuhiko/insta/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mitsuhiko/insta/commits)

---
updated-dependencies:
- dependency-name: insta
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Remove miscrate eprintln statement (#871)

Guessing this was an oversight

* Final fix of BQ quotes (hopefully) (#874)

* Switch implicit `compile` tests to actual `compile` tests (#875)

* Switch implicit `compile` tests to actual `compile` tests

* Bump wasm-react-scripts from 5.0.2 to 5.0.3 in /playground (#876)

Bumps [wasm-react-scripts](https://github.com/Cosmian/create-react-app/tree/HEAD/packages/react-scripts) from 5.0.2 to 5.0.3.
- [Release notes](https://github.com/Cosmian/create-react-app/releases)
- [Changelog](https://github.com/Cosmian/create-react-app/blob/main/CHANGELOG-1.x.md)
- [Commits](https://github.com/Cosmian/create-react-app/commits/react-error-overlay@5.0.3/packages/react-scripts)

---
updated-dependencies:
- dependency-name: wasm-react-scripts
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* (cargo-release) version 0.2.4 (#877)

* Only one workflow event per release (#878)

* .

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jie Han <11144133+doki23@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Aljaž Mur Eržen <aljazerzen@users.noreply.github.com>
Co-authored-by: Marko Klopets <mklopets@gmail.com>
Co-authored-by: Arrizal Amin <arrizalamin@gmail.com>
Co-authored-by: charlie sanders <711385+charlie-sanders@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Invalid compiler output or panic priority
Projects
None yet
Development

No branches or pull requests

3 participants