Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support bitwise operators #235

Merged
merged 7 commits into from Feb 7, 2019

Conversation

@davidchall
Copy link
Contributor

davidchall commented Feb 6, 2019

This PR adds support for the base R functions: bitwNot(), bitwAnd(), bitwOr(), bitwXor(), bitwShiftL(), bitwShiftR().

I've added support for most backends based on their documentation and some manual testing. However, I didn't consider the ODBC and Access backends - are these also needed?

Also, is there some standard way to systematically compare the results produced by each db connection type to those produced by dplyr? I saw the figure from #131 and was wondering if this was produced by a standard script?

@davidchall davidchall force-pushed the davidchall:bitwise branch from 84fc0b4 to 8ce1f4d Feb 6, 2019
@edgararuiz

This comment has been minimized.

Copy link
Collaborator

edgararuiz commented Feb 6, 2019

HI @davidchall , thank you, this PR is looking great to me! The way are are running systematic tests against individual database back ends is via infrastructure and a testing package that @colearendt has created.

@colearendt - Do you think we may be able to run this PR through your dbtest setup?

@colearendt

This comment has been minimized.

Copy link
Collaborator

colearendt commented Feb 6, 2019

Sure thing! @davidchall do you have some examples of the kinds of mutate, filter, or select statements that might be used with these verbs? I have to admit I am not super familiar with them. The input for the tests is a YAML file with these types of statements. I think some of the currently open PRs have some examples of this YAML syntax, but I am happy to build the file if you have some handy expressions that are good test cases.

Copy link
Member

hadley left a comment

Looking great!

Can you please add a bullet to NEWS? It should briefly describe the change and end with (@yourname, #issuenumber).

R/backend-.R Outdated Show resolved Hide resolved
R/backend-.R Outdated Show resolved Hide resolved
tests/testthat/test-backend-impala.R Outdated Show resolved Hide resolved
tests/testthat/test-backend-teradata.R Outdated Show resolved Hide resolved
@davidchall

This comment has been minimized.

Copy link
Contributor Author

davidchall commented Feb 6, 2019

Thank you all for your feedback! Let's wait until I've resolved the final comment from @hadley before running dbtest.

@colearendt Looking at your examples, I came up with the following YAML file. I've not seen the values in the fld_integer column, so it's possible that some of the hardcoded values are inappropriate.

Also, it's important to note that the results can depend on whether we're using a signed or unsigned integer data type. Will be interesting to see if anything is flagged due to this.

- bitwNot:
    mutate: bitwNot(fld_integer)
    filter: bitwNot(fld_integer) < 0
    summarize: n_distinct(bitwNot(fld_integer))
    group_by: bitwNot(fld_integer)
    arrange: bitwNot(fld_integer)
- bitwAnd:
    mutate: bitwAnd(fld_integer, 8L)
    filter: bitwAnd(fld_integer, 8L) > 0
    summarize: n_distinct(bitwAnd(fld_integer, 8L))
    group_by: bitwAnd(fld_integer, 8L)
    arrange: bitwAnd(fld_integer, 8L)
- bitwOr:
    mutate: bitwOr(fld_integer, 8L)
    filter: bitwOr(fld_integer, 8L) > 0
    summarize: n_distinct(bitwOr(fld_integer, 8L))
    group_by: bitwOr(fld_integer, 8L)
    arrange: bitwOr(fld_integer, 8L)
- bitwXor:
    mutate: bitwXor(fld_integer, 8L)
    filter: bitwXor(fld_integer, 8L) > 0
    summarize: n_distinct(bitwXor(fld_integer, 8L))
    group_by: bitwXor(fld_integer, 8L)
    arrange: bitwXor(fld_integer, 8L)
- bitwShiftL:
    mutate: bitwShiftL(fld_integer, 1L)
    filter: bitwShiftL(fld_integer, 1L) > 10
    summarize: n_distinct(bitwShiftL(fld_integer, 1L))
    group_by: bitwShiftL(fld_integer, 1L)
    arrange: bitwShiftL(fld_integer, 1L)
- bitwShiftR:
    mutate: bitwShiftR(fld_integer, 1L)
    filter: bitwShiftR(fld_integer, 1L) > 10
    summarize: n_distinct(bitwShiftR(fld_integer, 1L))
    group_by: bitwShiftR(fld_integer, 1L)
    arrange: bitwShiftR(fld_integer, 1L)
@davidchall

This comment has been minimized.

Copy link
Contributor Author

davidchall commented Feb 7, 2019

@colearendt Ok, I think it should be good to test now - and I've updated the YAML file above to account for the latest changes. Thanks!

@hadley hadley merged commit 8d38547 into tidyverse:master Feb 7, 2019
3 checks passed
3 checks passed
codecov/patch 100% of diff hit (target 82.46%)
Details
codecov/project 82.58% (+0.11%) compared to d1a3c0d
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@hadley

This comment has been minimized.

Copy link
Member

hadley commented Feb 7, 2019

Thanks!

@colearendt

This comment has been minimized.

Copy link
Collaborator

colearendt commented Feb 8, 2019

@davidchall Thanks for writing the YAML file! I will include that into our dbtest suite. Overall, it looks really good. Just a few problems (on Postgres / MSSQL, which is what I currently have handy to test):

image

Looks like semantics of "where" clauses are different in SQL vs. R. In particular, "WHERE integer" does not get handled appropriately. I don't know what would possibly fix this. It's almost like you need a context aware translation - didn't you run into this somewhere @edgararuiz ?

> t %>% filter(bitwAnd(fld_integer, 8L))
Error in new_result(connection@ptr, statement) : 
  nanodbc/nanodbc.cpp:1344: 42804: [RStudio][PostgreSQL] (30) Error occurred while trying to execute a query: [SQLState 42804] ERROR:  argument of WHERE must be type boolean, not type integer
LINE 3: WHERE ("fld_integer" & 8)
               ^
 
> t %>% filter(bitwAnd(fld_integer, 8L)) %>% show_query()
<SQL>
SELECT *
FROM "blah"
WHERE ("fld_integer" & 8)

The other MSSQL "failures" are just "not available on this SQL variant"

@davidchall

This comment has been minimized.

Copy link
Contributor Author

davidchall commented Feb 8, 2019

@colearendt Ah, this does make sense. The result of bitwise operations on integers is also an integer, but the WHERE clause expects to receive a boolean value. I've adjusted the YAML file to account for this.

@davidchall davidchall deleted the davidchall:bitwise branch Feb 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.