Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support bitwise operators #235

Merged
merged 7 commits into from
Feb 7, 2019
Merged

Support bitwise operators #235

merged 7 commits into from
Feb 7, 2019

Conversation

davidchall
Copy link
Contributor

This PR adds support for the base R functions: bitwNot(), bitwAnd(), bitwOr(), bitwXor(), bitwShiftL(), bitwShiftR().

I've added support for most backends based on their documentation and some manual testing. However, I didn't consider the ODBC and Access backends - are these also needed?

Also, is there some standard way to systematically compare the results produced by each db connection type to those produced by dplyr? I saw the figure from #131 and was wondering if this was produced by a standard script?

@edgararuiz-zz
Copy link
Contributor

HI @davidchall , thank you, this PR is looking great to me! The way are are running systematic tests against individual database back ends is via infrastructure and a testing package that @colearendt has created.

@colearendt - Do you think we may be able to run this PR through your dbtest setup?

@colearendt
Copy link
Collaborator

Sure thing! @davidchall do you have some examples of the kinds of mutate, filter, or select statements that might be used with these verbs? I have to admit I am not super familiar with them. The input for the tests is a YAML file with these types of statements. I think some of the currently open PRs have some examples of this YAML syntax, but I am happy to build the file if you have some handy expressions that are good test cases.

Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great!

Can you please add a bullet to NEWS? It should briefly describe the change and end with (@yourname, #issuenumber).

R/backend-.R Outdated Show resolved Hide resolved
R/backend-.R Outdated Show resolved Hide resolved
tests/testthat/test-backend-impala.R Outdated Show resolved Hide resolved
tests/testthat/test-backend-teradata.R Outdated Show resolved Hide resolved
@davidchall
Copy link
Contributor Author

davidchall commented Feb 6, 2019

Thank you all for your feedback! Let's wait until I've resolved the final comment from @hadley before running dbtest.

@colearendt Looking at your examples, I came up with the following YAML file. I've not seen the values in the fld_integer column, so it's possible that some of the hardcoded values are inappropriate.

Also, it's important to note that the results can depend on whether we're using a signed or unsigned integer data type. Will be interesting to see if anything is flagged due to this.

- bitwNot:
    mutate: bitwNot(fld_integer)
    filter: bitwNot(fld_integer) < 0
    summarize: n_distinct(bitwNot(fld_integer))
    group_by: bitwNot(fld_integer)
    arrange: bitwNot(fld_integer)
- bitwAnd:
    mutate: bitwAnd(fld_integer, 8L)
    filter: bitwAnd(fld_integer, 8L) > 0
    summarize: n_distinct(bitwAnd(fld_integer, 8L))
    group_by: bitwAnd(fld_integer, 8L)
    arrange: bitwAnd(fld_integer, 8L)
- bitwOr:
    mutate: bitwOr(fld_integer, 8L)
    filter: bitwOr(fld_integer, 8L) > 0
    summarize: n_distinct(bitwOr(fld_integer, 8L))
    group_by: bitwOr(fld_integer, 8L)
    arrange: bitwOr(fld_integer, 8L)
- bitwXor:
    mutate: bitwXor(fld_integer, 8L)
    filter: bitwXor(fld_integer, 8L) > 0
    summarize: n_distinct(bitwXor(fld_integer, 8L))
    group_by: bitwXor(fld_integer, 8L)
    arrange: bitwXor(fld_integer, 8L)
- bitwShiftL:
    mutate: bitwShiftL(fld_integer, 1L)
    filter: bitwShiftL(fld_integer, 1L) > 10
    summarize: n_distinct(bitwShiftL(fld_integer, 1L))
    group_by: bitwShiftL(fld_integer, 1L)
    arrange: bitwShiftL(fld_integer, 1L)
- bitwShiftR:
    mutate: bitwShiftR(fld_integer, 1L)
    filter: bitwShiftR(fld_integer, 1L) > 10
    summarize: n_distinct(bitwShiftR(fld_integer, 1L))
    group_by: bitwShiftR(fld_integer, 1L)
    arrange: bitwShiftR(fld_integer, 1L)

@davidchall
Copy link
Contributor Author

@colearendt Ok, I think it should be good to test now - and I've updated the YAML file above to account for the latest changes. Thanks!

@hadley hadley merged commit 8d38547 into tidyverse:master Feb 7, 2019
@hadley
Copy link
Member

hadley commented Feb 7, 2019

Thanks!

@colearendt
Copy link
Collaborator

colearendt commented Feb 8, 2019

@davidchall Thanks for writing the YAML file! I will include that into our dbtest suite. Overall, it looks really good. Just a few problems (on Postgres / MSSQL, which is what I currently have handy to test):

image

Looks like semantics of "where" clauses are different in SQL vs. R. In particular, "WHERE integer" does not get handled appropriately. I don't know what would possibly fix this. It's almost like you need a context aware translation - didn't you run into this somewhere @edgararuiz ?

> t %>% filter(bitwAnd(fld_integer, 8L))
Error in new_result(connection@ptr, statement) : 
  nanodbc/nanodbc.cpp:1344: 42804: [RStudio][PostgreSQL] (30) Error occurred while trying to execute a query: [SQLState 42804] ERROR:  argument of WHERE must be type boolean, not type integer
LINE 3: WHERE ("fld_integer" & 8)
               ^
 
> t %>% filter(bitwAnd(fld_integer, 8L)) %>% show_query()
<SQL>
SELECT *
FROM "blah"
WHERE ("fld_integer" & 8)

The other MSSQL "failures" are just "not available on this SQL variant"

@davidchall
Copy link
Contributor Author

davidchall commented Feb 8, 2019

@colearendt Ah, this does make sense. The result of bitwise operations on integers is also an integer, but the WHERE clause expects to receive a boolean value. I've adjusted the YAML file to account for this.

@davidchall davidchall deleted the bitwise branch February 8, 2019 03:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants