
Feature/describe prior to write #313

Merged
12 commits merged into r-dbi:master on Nov 19, 2019

Conversation

@detule (Collaborator) commented Oct 30, 2019

Hi @jimhester @krlmlr

Please consider this patch-set as potentially a more robust alternative to #70.

  • With the current solution (master), FreeTDS users face truncation when inserting values longer than 255 characters.
  • An additional benefit could be a performance enhancement when writing to wide tables, as this approach makes fewer round-trip calls to the server.

Looking forward to hearing your feedback.

…beParam

Backport of e99606bc8d01257c7c1316dd06f3a9c30e0a71fd
This is an exported function that is envisioned as a thin wrapper around SQLColumns.
Default method points to renamed/reordered connection_sql_columns.

We hope to call this method in dbWriteTable, to set parameter descriptions prior to binding values. Given that drivers may have idiosyncratic implementations of SQLColumns (for example, some may not offer the ability to call this function on a table outside of the current catalog), we offer an S4 method giving the end-user the ability to write an implementation that works for them.
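An end-user override of the kind described might look roughly like the sketch below. The class name `QuirkyConnection` and the `information_schema` query are hypothetical stand-ins for a driver whose SQLColumns support is limited; only the generic `odbcConnectionColumns` comes from this patch-set.

```r
library(DBI)
library(odbc)

# Hypothetical subclass for a backend whose SQLColumns cannot describe
# tables outside the current catalog.
setClass("QuirkyConnection", contains = "OdbcConnection")

setMethod("odbcConnectionColumns", c("QuirkyConnection", "character"),
  function(conn, name, ...) {
    # Bypass the driver's SQLColumns and build the column description
    # from a metadata query instead (query shown is illustrative).
    dbGetQuery(conn, paste0(
      "SELECT column_name, data_type, column_size, decimal_digits",
      " FROM information_schema.columns",
      " WHERE table_name = ", dbQuoteString(conn, name)
    ))
  }
)
```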
Calls nanodbc::describe_parameters.
The call to result_describe_parameters, which in turn calls nanodbc::describe_parameters, is wrapped
in a tryCatch block - if there is a failure, we fall back to the nanodbc code-path where binding uses a call to SQLDescribeParam.
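The fallback described above amounts to something like the following sketch. Variable names follow the snippets quoted later in the review; this is illustrative, not the merged code.

```r
# Try to describe the INSERT parameters from SQLColumns metadata; on any
# error, return NULL so binding falls back to nanodbc's
# SQLDescribeParam-based code-path.
fieldDetails <- tryCatch(
  {
    details <- odbcConnectionColumns(conn, name)
    # Keep only the columns being written, in the order of `values`
    details[match(names(values), details$column_name), ]
  },
  error = function(e) NULL
)
if (!is.null(fieldDetails) && nrow(fieldDetails) > 0) {
  result_describe_parameters(rs@ptr, fieldDetails)
}
```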
@jimhester (Contributor)

Thanks for working on this!

It looks like this change is causing the tests to fail when using PostgreSQL, would you be able to look into this?

@detule (Collaborator, Author) commented Oct 30, 2019

Will do - thanks.

@detule (Collaborator, Author) commented Oct 31, 2019

@jimhester Should be ready for a second look.

The R-devel pipeline failure seems to be unrelated to this patch-set.

R/Table.R Outdated
Comment on lines 74 to 75
if (!is.null(fieldDetails) && nrow(fieldDetails))
result_describe_parameters(rs@ptr, fieldDetails)
Contributor:

Can you put braces around this conditional?

Collaborator Author:

Will do - thanks.

R/Table.R Outdated
tryCatch({
details <- odbcConnectionColumns(conn, name)
details <- details[match(names(values), details$column_name)]
details[, c("ordinal_position", "data_type", "column_size", "decimal_digits")]
Contributor:

This call has no effect, did you mean to assign it back to details?

Collaborator Author:

Jim - the output of the tryCatch block is assigned to fieldDetails - the line you quoted there is what gets assigned in the success code-path.

Let me know if you think there is a better approach.

Contributor:

It was due to the indentation style used, I would normally put the tryCatch() line on the same line as the assignment.

Collaborator Author:

Will update - thanks.

R/Table.R Outdated
fieldDetails <-
tryCatch({
details <- odbcConnectionColumns(conn, name)
details <- details[match(names(values), details$column_name)]
Contributor:

This could just be

details <- details[names(values)]

But do we need to reorder the columns at all?

@detule (Collaborator, Author) commented Nov 5, 2019

Jim, the intention here was to:

  • Subset the rows of the output of SQLColumns to those (columns) we are writing to / we need to describe.
  • I thought some care was needed with the ordering - AFAICT there is no guarantee that the columns in values (and therefore the parameters in the INSERT statement) are ordered the same way as the columns in the table we are writing to (i.e. the rows in details).

Having said that, I think part of the confusion is the use of ordinal_position to index the parameters we are describing - this seems incorrect. After re-ordering the rows in details to match the column order in values, as is done here, the index that gets passed to nanodbc::statement::describe_parameters should be 1:ncol(values).

Will update.
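The fix described above can be sketched as follows; after the rows of details are reordered to match the columns of values, the parameter indices are simply sequential (the `param_index` name is illustrative):

```r
# Pair row i of `details` with parameter i of the INSERT statement.
details <- details[match(names(values), details$column_name), ]
# Indices passed to nanodbc::statement::describe_parameters should be
# sequential, not the table's ordinal_position values.
details$param_index <- seq_len(ncol(values))
```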

R/Connection.R Outdated
"sql_datetime_subtype", "char_octet_length", "ordinal_position",
"nullable")
detail <-
detail[, c(5, 4, 3, 1, 6, 2, 7, 8, 9, 10, 17, 11, 12, 13, 14, 15, 16)]
Contributor:

Indexing by number seems too brittle; could we reorder by name first, then rename the columns if necessary?
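A name-based reordering along the lines suggested might look like this sketch (the `expected` vector is abbreviated and illustrative):

```r
# Reorder by explicit column names rather than positions; this stays
# correct even if connection_sql_columns changes its output order.
expected <- c("table_cat", "table_schem", "table_name", "column_name",
              "data_type", "type_name", "column_size")  # ...and so on
detail <- detail[, expected[expected %in% names(detail)]]
```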

Collaborator Author:

Will change - thanks.

R/Connection.R Outdated
Comment on lines 160 to 161
if (!is.null(column_name))
detail <- subset(detail, detail$column_name == column_name)
Contributor:

Could we add braces here?

Also, it is generally best to avoid using subset() inside functions; its non-standard evaluation is not robust enough.

detail <- detail[detail$column_name == column_name, ] is the equivalent.

Also, it is somewhat surprising that you would need to do additional filtering; if you search for a specific column in SQLColumns, only that column should be returned.
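The fragility of subset() inside functions comes from its non-standard evaluation: names in the filter expression are looked up in the data frame first, which can silently shadow a function argument of the same name. A contrived, runnable illustration:

```r
df <- data.frame(column_name = c("a", "b"), x = 1:2)

filter_cols <- function(detail, column_name) {
  # subset() resolves `column_name` inside `detail` first, so this
  # compares the column to itself and keeps every row, regardless of
  # the argument passed in.
  subset(detail, column_name == column_name)
}
nrow(filter_cols(df, "a"))   # 2, not 1

# Standard indexing evaluates the argument in the function's environment:
filter_cols2 <- function(detail, column_name) {
  detail[detail$column_name == column_name, ]
}
nrow(filter_cols2(df, "a"))  # 1
```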

Collaborator Author:

Will change - thanks.

- Wrap conditional expressions in braces
- Avoid using numeric indexing when re/ordering output of
  connection_sql_columns
- Bugfix: Appropriate index passed down to
  nanodbc::statement::describe_parameters
Check the case when, via `dbWriteTable`, we are writing a data.frame with columns
ordered differently than the table we are writing to.  In `dbWriteTable` we
attempt to describe parameters (types, length) of the INSERT query using
information about the columns of the table being written to.  Some care is needed
to make sure that the table column descriptions get paired with the appropriate parameters.
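Such a test might be sketched as below; the table name, connection object, and expectations are illustrative, not the test that was merged.

```r
library(DBI)
library(testthat)

test_that("out-of-order column writes pair parameters correctly", {
  dbExecute(con, "CREATE TABLE test_reorder (a INTEGER, b VARCHAR(10))")
  on.exit(dbExecute(con, "DROP TABLE test_reorder"))

  # Columns deliberately reversed relative to the table definition
  dbWriteTable(con, "test_reorder",
               data.frame(b = "x", a = 1L, stringsAsFactors = FALSE),
               append = TRUE)

  res <- dbReadTable(con, "test_reorder")
  expect_equal(res$a, 1L)
  expect_equal(res$b, "x")
})
```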
@detule
Copy link
Collaborator Author

detule commented Nov 6, 2019

@jimhester Thanks for the feedback!

  • Patched conditional braces, numeric column indexing
  • Fixed bug with pairing table column descriptions with INSERT statement parameters. Added a test.

test_roundtrip is perhaps not the best home for this test - happy to move it to a more appropriate location if you have any suggestions.

R/Connection.R Outdated
"decimal_digits", "numeric_precision_radix", "nullable", "remarks",
"column_default", "sql_data_type", "sql_datetime_subtype",
"char_octet_length", "ordinal_position")]
names(detail)[c(1, 2, 4, 6, 13)] <- c("table_cat", "table_schem",
Contributor:

I think I would prefer to not rename. And do we need to reorder the columns?

Collaborator Author:

Jim, my original thinking here was to make the output as close to the output of the ODBC SQLColumns function as possible. It seemed to me that, since we are offering the end-user the ability to write their own implementation, it is more natural to anchor on the field naming of ODBC's SQLColumns API than on the choices made when implementing the internal connection_sql_columns.

If you also think that's worthwhile, we can either re-name the columns here or change the output of connection_sql_columns. The latter seemed less desirable since, even though it is internal, I was not sure whether folks have developed dependencies on it.

The re-ordering followed the same line of thinking - making the output align with what I see when I execute sp_columns in my SQL Server client, for example.

I don't feel strongly here - happy to make this just a wrapper around connection_sql_columns if you think that's the best way forward.

Contributor:

I would prefer it to just be a wrapper around connection_sql_columns

@jimhester (Contributor)

I would prefer a separate test outside of the roundtrip that creates a table and appends to it.

Also move test checking for out-of-order column write to tests-PostgreSQL
@detule (Collaborator, Author) commented Nov 11, 2019

@jimhester Thanks again - appreciate the comments.

Let me know if I missed anything.

@jimhester jimhester merged commit 7c4bc6e into r-dbi:master Nov 19, 2019
@jimhester (Contributor)

Thanks!! You are da bomb!

@detule (Collaborator, Author) commented Nov 20, 2019

Jim: thank you very much - I know it was a chunky PR, and I appreciate the thoughtful comments.

Any thoughts on when this might make its way to CRAN?

@jimhester (Contributor)

I was aiming for a CRAN release on odbc next week sometime.
