Add simple parts of Table.take and Table.drop functions to Database table #7615

Merged: 35 commits merged on Aug 31, 2023

@GregoryTravis (Contributor) commented Aug 18, 2023

Pull Request Description

Implements database Table and Column take/drop, except While and Sample.

Additional features and optimizations are in #7614.

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the Scala, Java, and Rust style guides. In case you are using a language not listed above, follow the Rust style guide.
  • All code has been tested:
    • Unit tests have been written where possible.
    • If GUI codebase was changed, the GUI was tested when built using ./run ide build.

@GregoryTravis marked this pull request as ready for review August 18, 2023 19:15

@radeusgd (Member) left a comment

The overall structure looks good to me.

However, I was very surprised to see union used as part of the implementation. IMO it is too heavy and creates overly complex queries: the whole subquery, which can itself be complex, is duplicated for each sub-range, which can make the generated SQL much larger.

I'd suggest using OR conditions in the filter instead; then we still keep a single subquery for this.
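To make the size difference concrete, here is a rough sketch in Python (not Enso, and not the PR's actual implementation). The helper names `union_query` and `or_filter_query`, the `row_num` column, and the half-open zero-based ranges are all assumptions made for illustration. It only shows how often the subquery text ends up in the generated SQL under each approach.

```python
from typing import List, Tuple

# Hypothetical representation: half-open, zero-based row ranges [start, end).
Range = Tuple[int, int]


def union_query(subquery: str, ranges: List[Range]) -> str:
    """Build one SELECT per range and join them with UNION ALL.

    The (possibly large) subquery text is repeated once per range.
    """
    if not ranges:
        return f"SELECT * FROM ({subquery}) AS t WHERE 1 = 0"
    parts = [
        f"SELECT * FROM ({subquery}) AS t "
        f"WHERE row_num >= {start} AND row_num < {end}"
        for start, end in ranges
    ]
    return " UNION ALL ".join(parts)


def or_filter_query(subquery: str, ranges: List[Range]) -> str:
    """Build a single SELECT whose filter ORs the range conditions together.

    The subquery text appears exactly once, however many ranges there are.
    """
    conditions = " OR ".join(
        f"(row_num >= {start} AND row_num < {end})" for start, end in ranges
    ) or "1 = 0"  # no ranges selects no rows
    return f"SELECT * FROM ({subquery}) AS t WHERE {conditions}"
```

With two ranges, the union form embeds the subquery twice and the OR form embeds it once; the gap grows linearly with the number of ranges.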

Also, please add a test for consecutive dropping/taking of elements, i.e.:

t.take 5 . take 3
t.take 3 . take 5
t.drop 3 . drop 2
t.drop 2 . drop 3

I know it should work with the current implementation, but I'd like to make sure we don't regress this when implementing any optimizations in the future.
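For reference, the expected results of those four chains can be sketched with plain Python lists. This is only a model of the semantics, assuming non-negative counts and a fixed row order; `take` and `drop` here are hypothetical stand-ins, not the Enso methods.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def take(rows: Sequence[T], n: int) -> List[T]:
    """Keep the first n rows (n is assumed to be non-negative)."""
    return list(rows[:n])


def drop(rows: Sequence[T], n: int) -> List[T]:
    """Remove the first n rows (n is assumed to be non-negative)."""
    return list(rows[n:])


rows = list(range(10))

# Chained takes keep min(a, b) rows, regardless of order.
take_5_then_3 = take(take(rows, 5), 3)
take_3_then_5 = take(take(rows, 3), 5)

# Chained drops remove a + b rows, regardless of order.
drop_3_then_2 = drop(drop(rows, 3), 2)
drop_2_then_3 = drop(drop(rows, 2), 3)
```

In this model both take chains yield the first three rows, and both drop chains yield everything after the first five.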

@radeusgd (Member) left a comment

Looks all good.

If possible, I'd like to change the ordering of the sort operation in the newly added tests.

@radeusgd (Member)

Please also try adding an 'integration' test where we try to take and add_row_number after an aggregation, i.e.:

t0 = table_builder [["X", ["a", "b", "a", "c"]], ["Y", [1, 2, 4, 8]]]
t1 = t0.aggregate [Aggregate_Column.Group_By "X", Aggregate_Column.Sum "Y"]

t2 = t1.add_row_number
t3 = t1.take 2
t4 = t1.drop 1

I imagine the actual result may be non-deterministic, so we should just check that it is "sane": e.g. the column added by add_row_number contains the values 1, 2, 3, and take and drop return consistent rows, i.e. a subset of ["a", 5], ["b", 2], ["c", 8].
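One way to phrase such an order-independent check, sketched in Python rather than Enso: `group_sum` and `is_sane_subset` are hypothetical helpers, and the list slices merely stand in for take 2 and drop 1.

```python
from collections import defaultdict
from typing import List, Sequence, Tuple

Row = Tuple[str, int]


def group_sum(xs: Sequence[str], ys: Sequence[int]) -> List[Row]:
    """Group by the X values and sum the matching Y values."""
    totals = defaultdict(int)
    for x, y in zip(xs, ys):
        totals[x] += y
    return list(totals.items())


def is_sane_subset(result: List[Row], expected: List[Row], size: int) -> bool:
    """True if result has exactly `size` distinct rows, all drawn from expected."""
    distinct = set(result)
    return (
        len(result) == size
        and len(distinct) == size
        and distinct <= set(expected)
    )


expected = [("a", 5), ("b", 2), ("c", 8)]
aggregated = group_sum(["a", "b", "a", "c"], [1, 2, 4, 8])

# Row numbers should be 1..n whatever order the backend returns the groups in.
row_numbers = list(range(1, len(aggregated) + 1))

taken = aggregated[:2]    # stands in for take 2
dropped = aggregated[1:]  # stands in for drop 1
```

The check accepts any two distinct rows from the expected set, so it does not depend on which groups the database happens to return first.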

@GregoryTravis (Contributor, Author)

Aggregate test added.

@jdunkerley (Member) left a comment

Couple of minor tweaks.

Otherwise LGTM 👍

GregoryTravis and others added 4 commits August 22, 2023 13:16
…p_Helpers.enso

Co-authored-by: James Dunkerley <jdunkerley@users.noreply.github.com>
Co-authored-by: James Dunkerley <jdunkerley@users.noreply.github.com>
@GregoryTravis added the "CI: Ready to merge" label (This PR is eligible for automatic merge) Aug 23, 2023
@GregoryTravis added the "CI: Clean build required" label (CI runners will be cleaned before and after this PR is built) Aug 31, 2023
@mergify bot merged commit 061876e into develop Aug 31, 2023
26 checks passed
@mergify bot deleted the wip/gmt/5131-db-take-drop branch August 31, 2023 18:52

3 participants