Filtering SQLite and data.frame give different results when using OR (|) #934

@EricEdwardBryant

In reference to this Stack Overflow question: http://stackoverflow.com/questions/28205184/

library(dplyr)
packageVersion('dplyr')
# [1] ‘0.4.1’

df <- tbl_df(data.frame(
  v1 = c('a', 'b', 'a', 'b'),
  v2 = c('b', 'a', 'a', 'b'),
  v3 = month.abb[1:4]))

db <- copy_to(src_sqlite('example.sqlite', create = TRUE), df)

filter(df, v1 == 'a' | v2 == 'a', v3 == 'Jan') 
# Source: local data frame [1 x 3]
#
#   v1 v2  v3
#1  a  b Jan

filter(db, v1 == 'a' | v2 == 'a', v3 == 'Jan')
# Source: sqlite 3.8.6 [example.sqlite]
# From: df [2 x 3]
# Filter: v1 == "a" | v2 == "a", v3 == "Jan" 
#
#  v1 v2  v3
#1  a  b Jan
#2  a  a Mar

filter(db, v1 == 'a' | v2 == 'a', v3 == 'Jan') %>% show_query()
# <SQL>
#   SELECT "v1", "v2", "v3"
# FROM "df"
# WHERE "v1" = 'a' OR "v2" = 'a' AND "v3" = 'Jan'

filter(db, (v1 == 'a' | v2 == 'a'), v3 == 'Jan')
# Source: sqlite 3.8.6 [example.sqlite]
# From: df [1 x 3]
# Filter: (v1 == "a" | v2 == "a"), v3 == "Jan" 
#
#  v1 v2  v3
#1  a  b Jan

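Filtering the data.frame returns 1 row (as expected), whereas filtering the database returns 2 rows. The generated SQL joins the filter arguments with AND but does not parenthesise them, and because AND binds more tightly than OR in SQL, the WHERE clause is evaluated as "v1" = 'a' OR ("v2" = 'a' AND "v3" = 'Jan'). Adding explicit parentheses around the argument containing the OR expression works around the problem when querying the database. Perhaps each ... argument given to filter should be implicitly wrapped in parentheses when it is translated to SQL?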
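The precedence issue can be reproduced without a database, since `&` also binds more tightly than `|` in R. The following is only a minimal sketch in base R: `join_conditions` is a hypothetical helper written for illustration, not dplyr's actual SQL translation code, and it assumes the conditions have already been translated to SQL strings.

```r
# Same data as in the report, as a plain data.frame
df <- data.frame(
  v1 = c('a', 'b', 'a', 'b'),
  v2 = c('b', 'a', 'a', 'b'),
  v3 = month.abb[1:4],
  stringsAsFactors = FALSE
)

# Mirrors the generated SQL: the conditions are joined without parentheses,
# so the AND part is grouped first
unparenthesised <- with(df, v1 == 'a' | v2 == 'a' & v3 == 'Jan')

# Mirrors filter() on the data frame: each argument is treated as one unit
parenthesised <- with(df, (v1 == 'a' | v2 == 'a') & v3 == 'Jan')

n_unparenthesised <- sum(unparenthesised)  # 2 rows, like the SQLite result
n_parenthesised <- sum(parenthesised)      # 1 row, like the data.frame result

# Hypothetical helper (an assumption, not part of dplyr): wrap every
# translated condition in parentheses before joining them with AND
join_conditions <- function(conditions) {
  paste0("(", conditions, ")", collapse = " AND ")
}

where_clause <- join_conditions(c("\"v1\" = 'a' OR \"v2\" = 'a'", "\"v3\" = 'Jan'"))
print(where_clause)
```

Wrapping every argument unconditionally produces some redundant parentheses, but it keeps the SQL semantics independent of operator precedence inside any single argument.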

Labels: bug (an unexpected problem or unintended behavior)
