Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data Type Conversion Option for RJDBC #8

Closed
wants to merge 2 commits into from
Closed

Conversation

joshbode
Copy link

Hi Simon,

I've added some code to a fork of RJDBC enabling the conversion of data returned from queries to specific R data-types, based on the Java SQL data type constants (see [http://docs.oracle.com/javase/7/docs/api/constant-values.html#java.sql]).

Not sure if this is something you want to incorporate, but my motivation for this is that I work with a lot of time-series data and just about every one of my queries has an as.POSIXlt or as.Date immediately following it. I much prefer RJDBC over other options for two reasons - 1) It's much more portable (I work on a Windows machine where I don't have admin access :( ), and 2) I've found some of the implicit type conversion in RODBC to be too enthusiastic at times (e.g. a string column containing only numbers converted to numeric that I'd rather keep as character).

To reduce impact on existing users, I've made it the default to perform no conversions, as per the existing code.

Happy to discuss and make changes if this is of any use.

@s-u
Copy link
Owner

s-u commented Jun 11, 2014

Thanks, this is a good idea. There are a few minor things (e.g., you don't want to use as.POSIXlt on any real data), but I like the approach.

@s-u
Copy link
Owner

s-u commented Jun 11, 2014

moved into data-type-conversion branch

@joshbode
Copy link
Author

Great :)

Do you think the default conversion should be as.POSIXct or as.Date? To do it right with POSIXct probably requires some understanding of the timezone setting on the database and client?

Is there anything else you think should be altered?

@yasminlucero
Copy link

Can I ask about the status of this pull request? This looks like it will allow us to bring BIGINT over as character, (and then later apply bit64 int64 type). If so, we could really use this.

Do you need help with testing?

@joshbode
Copy link
Author

joshbode commented Dec 4, 2015

I'd forgotten about this work! Based on Simon's comments, I've made some additional changes to fix the date conversion (DATE -> as.Date, DATETIME -> as.POSIXct) and brought my code up to date with master.

With the changes in this branch you should be able to achieve the conversions you described by setting up a higher priority type-conversion via the options - i.e. something like:

type.map = c('bigint'=c(types$BIGINT), default.type.map)
conversion.map = c('bigint'=bit64::as.integer64, default.conversion.map)
RJDBC.options(convert.types=TRUE, type.map=type.map, conversion.map=conversion.map)

@hhoeflin
Copy link

This code is very interesting for an issue I have with 64bit integers. However, I was wondering why over 2 years after the pull-request, it hasn't been merged. Is there some conflict anyone is aware of?

@joshbode
Copy link
Author

Not to my knowledge - I'd completely forgotten about this work again.
Any thoughts @s-u?

@maxmoro
Copy link

maxmoro commented Jun 23, 2017

Any news on this update. I'm really looking for this feature.
thanks

@joshbode
Copy link
Author

joshbode commented Jun 24, 2017

Hi @maxmoro - I'm not using R much any more, so I had let this sit.

I haven't heard back from @s-u on this, but I've just brought the fork/PR up to date with master.
If you wanted to test/use it, let me know and I'll see how you could use it.

@maxmoro
Copy link

maxmoro commented Jun 27, 2017 via email

@joshbode
Copy link
Author

Thinking about it, there's a build step in there (mkdist) that compiles the Java code and I don't believe there is a way to automate that with install_githib.

So, you may be better doing something like cloning the repo, running ./mkdist and installing the resultant package directly (e.g. install.packages("RJDBC_0.2-6.tar.gz", repos=NULL)).

If you have any issues (e.g. missing JDK, etc) let me know and I help or just compile it for you.

@joshbode joshbode force-pushed the master branch 2 times, most recently from 07e92da to cc976a7 Compare June 28, 2017 07:45
@joshbode
Copy link
Author

OK - by adding src/Makefile (and a few other files), the package is installable directly from github via:

require(devtools)
install_github("joshbode/RJDBC")

assuming you are on Linux/macOS (or possibly Rtools if on Windows), and have make and the JDK (javac and jar) in your path.

Data type conversions for specialised numeric (integer and boolean) and
non-character data types such as dates.

Uses Java SQL data type codes to map data rather than hard-coded flags.
@joshbode
Copy link
Author

joshbode commented Jun 25, 2018

Hi @s-u - do you have any plans to merge this? Is there anything else you would like to see here?

Apart from the type conversions work, this also streamlines the build of the library so it works directly with R CMD INSTALL and devtools via install_github (by including a Makefile)

$ R CMD build .
* checking for file ‘./DESCRIPTION’ ... OK
* preparing ‘RJDBC’:
* checking DESCRIPTION meta-information ... OK
* cleaning src
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
Removed empty directory ‘RJDBC/inst’
* building ‘RJDBC_0.2-7.tar.gz’

If not, I'll close the PR.

@joshbode
Copy link
Author

Closing - after all this time, I'm not using this package any more - if anyone wants to pick up this PR, feel free to fork my branch :)

@s-u
Copy link
Owner

s-u commented Mar 18, 2021

Thanks, I have added a slightly different implementation in 1c6399c with similar goals. The main difference is that I have opted for single-point entry where the fetching form Java stays as-is and only the post-processing is done in R on per-column basis.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants