Feat across #452

machow · 2022-09-28T14:24:59Z

This PR implements the across() bridge function for all verbs except arrange() (for now). It also refactors the SQL verb implementations, cleaning up and consolidating their logic.

Note that this PR introduces context variables (similar to dplyr). They are needed because across() requires access to the original SQL LazyTbl in order to translate its functions into their dialect-specific implementations. Currently, two context variables are set: one for the LazyTbl, and one to indicate whether to use a windowed or agg translation. This should be cleaned up in the future.
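As a rough illustration of the context-variable pattern described above, here is a minimal sketch using Python's standard `contextvars` module. The names `ctx_table`, `ctx_windowed`, `verb_context`, and `translate` are hypothetical and are not taken from this PR; the dict stands in for a LazyTbl.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Hypothetical names: the PR text does not show the actual variables.
ctx_table = ContextVar("ctx_table", default=None)
ctx_windowed = ContextVar("ctx_windowed", default=False)


@contextmanager
def verb_context(table, windowed):
    """Set both context variables for the duration of one verb call."""
    tok_table = ctx_table.set(table)
    tok_windowed = ctx_windowed.set(windowed)
    try:
        yield
    finally:
        # Restore whatever was set before this verb ran.
        ctx_windowed.reset(tok_windowed)
        ctx_table.reset(tok_table)


def translate(func_name):
    """Stand-in for across(): read the active table and mode to pick a translation."""
    table = ctx_table.get()
    if table is None:
        raise RuntimeError("translate() called outside of a verb")
    mode = "window" if ctx_windowed.get() else "agg"
    return f"{table['dialect']}:{mode}:{func_name}"


fake_tbl = {"dialect": "postgresql"}  # stand-in for a LazyTbl

with verb_context(fake_tbl, windowed=True):
    in_mutate = translate("mean")

with verb_context(fake_tbl, windowed=False):
    in_summarize = translate("mean")
```

The point of the pattern is that the function doing the translation never receives the table as an argument; it reads it from the surrounding verb call, and the values are reset once the verb returns.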

All-in-all, the following changes were made:

feat: add across(), which can be used inside verbs to apply one operation to multiple columns.
feat: count() and add_count() support the name argument.
feat: the new symbolic "formula" object, Fx, is now exposed as a top-level import.
feat: implement grouped distinct for both pandas and sql.
feat(tidyselect)!: a lambda can no longer be used to create a tidyselection specifier. Instead we match dplyr's behavior:
- old: lambda _: _.startswith("abc") was equivalent to _.startswith("abc")
- Now, when select() is given a callable function, it passes each column of data to it, and expects back a boolean.
- new: select(cars, lambda ser: ser.dtype == "int")
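To make the new callable semantics in select() concrete, here is a minimal pure-Python sketch of the idea: each column is passed to the function, and the column is kept when the function returns a truthy value. This is not the library's implementation; `select_where` and `is_int_column` are hypothetical helpers, and a dict of lists stands in for a DataFrame.

```python
def select_where(columns, predicate):
    """Keep the columns for which predicate(values) is truthy.

    `columns` maps a column name to its list of values.
    """
    return {name: values for name, values in columns.items() if predicate(values)}


def is_int_column(values):
    """Rough analogue of `ser.dtype == "int"` for a plain list."""
    return all(isinstance(v, int) for v in values)


cars = {
    "cyl": [6, 4, 8],
    "mpg": [21.0, 22.8, 18.7],
    "model": ["a", "b", "c"],
}

# The predicate sees the column's data, not the column's name.
int_cols = select_where(cars, is_int_column)
```

Note the contrast with the old behavior: the function receives the data in each column rather than a symbol to build a name-based selection from.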


fix: add_count() now correctly handles named arguments, and is tested on most of the cases count() is tested on.
fix(sql)!: calling arrange() twice now resets the order_by variables set by the first call (matching dbplyr behavior).
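The arrange() change can be pictured with a small sketch, assuming a simplified lazy query object that only tracks its ordering. `LazyQuery` and this `arrange` are hypothetical stand-ins, not the classes touched in this PR.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class LazyQuery:
    """Hypothetical stand-in for a lazy SQL table; tracks only ordering."""
    order_by: Tuple[str, ...] = ()


def arrange(query, *cols):
    """Return a new query ordered by `cols`, discarding any earlier ordering."""
    return replace(query, order_by=tuple(cols))


once = arrange(LazyQuery(), "a")
twice = arrange(once, "b")
```

After the second call only the second ordering remains, which is the reset described in the fix above.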


internal:
- improve group handling -- the _make_groupby_safe() function ensures group_keys is False.
- consolidate much of the mutation logic into a single function, _mutate_cols(). This is used in many verbs: mutate(), transmute(), and distinct(), and indirectly in group_by().

machow force-pushed the feat-across branch from 2f08010 to 1c582be Compare September 28, 2022 14:27

github-actions bot deployed to pr-452 September 28, 2022 14:28 View deployment

github-actions bot deployed to pr-452 September 28, 2022 14:30 View deployment

github-actions bot deployed to pr-452 September 28, 2022 18:26 View deployment

github-actions bot deployed to pr-452 September 30, 2022 18:43 View deployment

github-actions bot deployed to pr-452 October 3, 2022 00:33 View deployment

github-actions bot deployed to pr-452 October 3, 2022 02:59 View deployment

github-actions bot deployed to pr-452 October 3, 2022 18:05 View deployment

machow marked this pull request as ready for review October 3, 2022 18:48

github-actions bot deployed to pr-452 October 3, 2022 20:03 View deployment

github-actions bot deployed to pr-452 October 3, 2022 21:01 View deployment

github-actions bot deployed to pr-452 October 3, 2022 21:08 View deployment

github-actions bot deployed to pr-452 October 3, 2022 21:21 View deployment

github-actions bot deployed to pr-452 October 4, 2022 00:26 View deployment

github-actions bot deployed to pr-452 October 4, 2022 19:26 View deployment

github-actions bot deployed to pr-452 October 4, 2022 22:19 View deployment

github-actions bot deployed to pr-452 October 4, 2022 22:36 View deployment

machow force-pushed the feat-across branch from 05b08a2 to b11d6a5 Compare October 5, 2022 17:31

github-actions bot deployed to pr-452 October 5, 2022 17:34 View deployment

github-actions bot deployed to pr-452 October 5, 2022 17:58 View deployment

machow force-pushed the feat-across branch from d545280 to 9e87fb1 Compare October 5, 2022 17:59

github-actions bot deployed to pr-452 October 5, 2022 18:01 View deployment

github-actions bot deployed to pr-452 October 6, 2022 14:35 View deployment

github-actions bot deployed to pr-452 October 6, 2022 17:13 View deployment

github-actions bot deployed to pr-452 October 6, 2022 17:24 View deployment

github-actions bot deployed to pr-452 October 6, 2022 20:24 View deployment

github-actions bot deployed to pr-452 October 7, 2022 15:57 View deployment

machow force-pushed the feat-across branch from 7dd34fa to f6c0468 Compare October 7, 2022 22:36

github-actions bot deployed to pr-452 October 7, 2022 22:38 View deployment

github-actions bot deployed to pr-452 October 7, 2022 22:44 View deployment

machow added 15 commits October 7, 2022 18:48

feat(sql): support across in count

fbe8f6a

refactor(sql): prep _mutate_cols to support arrange

93bc927

fix(sql): sqlalchemy 1.3 compat for simplify_select

e386579

tests: fix bigquery failing due to ordering

0c32754

tests: assert_equal_query2 defaults to sql_ordered False

30fb5a5

feat(sql): extremely rough version of summarize across

4e66fa0

refactor(sql): prepare to split verbs file

27391d5

refactor(sql): move verbs into own files

d5be35b

refactor(sql): remove unused functions

f675ac1

tests: across in verbs, bare columns in verbs

e12f4a2

fix(sql): case_when now passes pandas tests

fb57567

feat: expose Fx, across as top-level imports

7e7144a

feat(sql): add grouped distinct, improve tests

7e04927

fix(sql): add_count more robust, can mutate group cols, supports across

2862f6d

fix: count, add_count proper name arg support

73d1b6e

machow force-pushed the feat-across branch from 4cb9819 to 73d1b6e Compare October 7, 2022 22:48

github-actions bot deployed to pr-452 October 7, 2022 22:51 View deployment

machow added 2 commits October 7, 2022 19:30

fix(sql): do not create subquery for custom sql_raw

3e5502c

refactor!: remove tests of using function to tidyselect columns

95b6dbd

github-actions bot deployed to pr-452 October 7, 2022 23:34 View deployment

fix: clean up arrange, raise on across for now

570fd9f

github-actions bot deployed to pr-452 October 10, 2022 14:24 View deployment

fix(sql)!: arrange resets order_by vars, matches dbplyr

13ad7f6

github-actions bot deployed to pr-452 October 10, 2022 14:35 View deployment

tests: more tests of mutating after a summarize

3d4a79a

machow force-pushed the feat-across branch from 95bfe84 to 3d4a79a Compare October 11, 2022 23:39

github-actions bot deployed to pr-452 October 11, 2022 23:41 View deployment

github-actions bot deployed to pr-452 October 11, 2022 23:42 View deployment

machow merged commit 5be9e2f into main Oct 12, 2022

machow deleted the feat-across branch October 12, 2022 14:26
