rows_insert fails to populate duplicated entries to remote table with autoincrement ON #1149

tuge98 · 2023-02-10T14:18:30Z

The rows_insert function in the dplyr package is failing to populate duplicate entries in remote databases (SQLite and MSSQL) where a single column is designated as an auto-incrementing primary key. Although manual insertion of the duplicate rows works as expected, the rows_insert function is unable to properly insert the duplicates, leading to a discrepancy in the data stored in the remote database.

This might be either a bug in the rows_insert source code or a misuse of the function arguments on my side.

dplyr version: 1.1.0

Brief description of the problem

suppressPackageStartupMessages({
  library(dplyr)
  library(dbplyr)
  library(DBI)
  library(odbc)
})

con <- DBI::dbConnect(RSQLite::SQLite())


tbl <- "CREATE TABLE table1(
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        value int)" |> as.sql(con)

DBI::dbExecute(con, tbl)
#> [1] 0


table1 <- tbl(con, "table1")

# initializing inserted rows: I am not specifying id since it will be generated by the autoincrement column
rows_to_insrt <- tibble(value = 1)

# initializing the first row: rows_insert will populate the row with autoincrement ON
dplyr::rows_insert(table1, rows_to_insrt, copy = TRUE, conflict = "ignore", in_place = TRUE)
#> Matching, by = "value"


table1 |> select(everything())
#> # Source:   table<table1> [1 x 2]
#> # Database: sqlite 3.39.4 []
#>      id value
#>   <int> <int>
#> 1     1     1

# trying again: there should be an additional row with id = 2 and value = 1
dplyr::rows_insert(table1, rows_to_insrt, copy = TRUE, conflict = "ignore", in_place = TRUE)
#> Matching, by = "value"


table1 |> select(everything())
#> # Source:   table<table1> [1 x 2]
#> # Database: sqlite 3.39.4 []
#>      id value
#>   <int> <int>
#> 1     1     1

# this works as expected
DBI::dbExecute(con, "INSERT INTO table1 (value) values(1)")
#> [1] 1

table1 |> select(everything())
#> # Source:   table<table1> [2 x 2]
#> # Database: sqlite 3.39.4 []
#>      id value
#>   <int> <int>
#> 1     1     1
#> 2     2     1

Created on 2023-02-10 with reprex v2.0.2

mgirlich · 2023-02-10T14:40:21Z

You want to use rows_append(), not rows_insert(). From the documentation:

rows_insert() adds new rows (like INSERT). By default, key values in y must not exist in x.
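
For illustration, here is a minimal sketch of the reprex above switched to rows_append(). In the original calls, rows_insert() matched on "value" (see the "Matching, by" message), so with conflict = "ignore" the second row was treated as an existing key and skipped. rows_append() does no key matching. The sketch assumes an in-memory SQLite database and a dbplyr version that provides rows_append() for remote tables; the names rows_to_append and result are made up for the example.

```r
library(dplyr)
library(dbplyr)

# Assumption: a throwaway in-memory SQLite database, as in the reprex
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

DBI::dbExecute(con, "CREATE TABLE table1(
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  value int)")

table1 <- tbl(con, "table1")

# id is left out so the database can generate it
rows_to_append <- tibble(value = 1)

# rows_append() does not compare keys, so the same value can be added twice
rows_append(table1, rows_to_append, copy = TRUE, in_place = TRUE)
rows_append(table1, rows_to_append, copy = TRUE, in_place = TRUE)

# Pull the table into R to inspect it
result <- collect(table1)

DBI::dbDisconnect(con)
```

After both calls, result should hold two rows with value = 1 and ids generated by the autoincrement column, which is the outcome the manual INSERT produced.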

DavisVaughan transferred this issue from tidyverse/dplyr Feb 10, 2023

mgirlich closed this as completed Feb 10, 2023

mgirlich mentioned this issue Jun 6, 2023

Improve documentation for rows_insert() vs rows_append() tidyverse/dplyr#6864

Open
