Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to query long strings in cell contents with SQL Server #156

Closed
nwstephens opened this issue Feb 20, 2018 · 8 comments
Closed

Unable to query long strings in cell contents with SQL Server #156

nwstephens opened this issue Feb 20, 2018 · 8 comments

Comments

@nwstephens
Copy link

Issue Description and Expected Result

Posting this issue on behalf of Dean. He is unable to query long strings in cell contents from SQL Server using the pro drivers. Dean verified that the query works in isql so, this issue must be R related. He also mentioned that the query works with the RSQLServer package.

Database

SQL Server using the RStudio Pro Driver from shinyapps.io

Reproducible Example

Not available at this time

Details

Dean is having problems querying cell contents in SQL Server with very long strings. In short, he is querying data that is 1000's of characters long:

[xxx,yyy,...,zzz]

but the query only returns the last 17 characters:

2540, 2621, 2417]

The query he is using looks like this:

sql <- "SELECT [ConfigurationKey],[ConfigurationValue] FROM [Datalithic].[dbo].[UserConfiguration] WHERE [UserId]=19 AND [ConfigurationKey] IN ('CheckedConfig','CurrentWellList')"
res <- dbGetQuery(con, sql)

@daattali
Copy link

Because of the sensitivity of the data I cannot post the real data to reproduce. I haven't tried reproducing the error with a brand new database but I assume that if you create a very long string (in my case it's a json string of a 50k values array) and try to fetch it, you'll see similar behavior.

Three query also returns fine using my windows odbc driver, but not using the professional odbc driver found on shinyappsio.

@jimhester
Copy link
Contributor

What is the type of the column you are trying to retrieve? e.g. is it text, varchar, nchar etc?

@daattali
Copy link

Looking at the table definition, it's nvarchar(MAX)

Here are two examples of strings (json arrays) that don't work properly. Beware, long strings.

Example 1

"[3364, 3373, 3399, 1199, 1279, 3483, 3516, 2513, 2794, 503, 3043, 270, 3586, 410, 2987, 3119, 2803, 2765, 3625, 373, 3165, 2855, 3112, 4713, 2964, 2993, 3290, 2761, 3121, 3839, 4663, 3118, 3314, 4629, 4897, 5003, 3855, 4697, 4618, 3705, 4814, 3867, 4765, 5067, 5703, 1359, 1050, 5199, 861, 4213, 1453, 1482, 5779, 5149, 5384, 5299, 5389, 1615, 5144, 5343, 4348, 1054, 6082, 4434, 7004, 5640, 5643, 5648, 7114, 7807, 1711, 7223, 1725, 7054, 7389, 4341, 7323, 6246, 6358, 7591, 7403, 7454, 1774, 6565, 6396, 6532, 6465, 1947, 6271, 7951, 2220, 6597, 8016, 8068, 10517, 10263, 10790, 8183, 10258, 10559, 10918, 10853, 10859, 10907, 13125, 11128, 12350, 13150, 12624, 8383, 12311, 13541, 12766, 12643, 13340, 12689, 12994, 12806, 13329, 12725, 12995, 13599, 11351, 13342, 13923, 11434, 13953, 14088, 15775, 13737, 12814, 13688, 14156, 14336, 13794, 14203, 14274, 14166, 13877, 14675, 14279, 17151, 17050, 14326, 11422, 13832, 14712, 16696, 14534, 14382, 16712, 16715, 16855, 16732, 17163, 16920, 17146, 17470, 11591, 17352, 17315, 17696, 17815, 12092, 16311, 18800, 18779, 12096, 15097, 11757, 12018, 17833, 12078, 12037, 16357, 16521, 16448, 19188, 19898, 19155, 16325, 19804, 19133, 16603, 18853, 18994, 19822, 19012, 15083, 19949, 20010, 20759, 20463, 19797, 19923, 19976, 20354, 19909, 20627, 20794, 20194, 20115, 20144, 20244, 20324, 17986, 20378, 20567, 18026, 17990, 18032, 21011, 18075, 19473, 20654, 19596, 18255, 21596, 19673, 18302, 18290, 21884, 18382, 18381, 22731, 18413, 18483, 18412, 15174, 15274, 21975, 22020, 24097, 18475, 18519, 18625, 21843, 18626, 15214, 15236, 15259, 23983, 24131, 24370, 24363, 25009, 24301, 23996, 22577, 24953, 24984, 23109, 25522, 23163, 24886, 24940, 25059, 24967, 25448, 24442, 25139, 24484, 15439, 25036, 25137, 24421, 25426, 23283, 24386, 23465, 24406, 23181, 24488, 25511, 23598, 25854, 25715, 23578, 25948, 26473, 26475, 26565, 25859, 27294, 26066, 24561, 26523, 26814, 24522, 24538, 26308, 28682, 28187, 28666, 28571, 26929, 24617, 24606, 27422, 28446, 27669, 28537, 28443, 28630, 29302, 29424, 29468, 28836, 28829, 28859, 29409, 29547, 28545, 28548, 27670, 27842, 28593, 27913, 28864, 28942, 28954, 27069, 27142, 28950, 29174, 27137, 29186, 27090, 27132, 29176, 32578, 29256, 29605, 31573, 33448, 27265, 33837, 34201, 35309, 32421, 37989, 37981, 36020, 36175, 38456, 38385, 39296, 39567, 39944, 40010, 37622, 41452, 41489, 36994, 37677, 41665, 42290, 40582, 43612, 43685, 43414, 40668, 43382, 43424, 43891, 40832, 43780, 42461, 42497, 42995, 44894, 42134, 42819, 42007, 44527, 40950, 44788, 41975, 42848, 44745, 44702, 45156, 44069, 44887, 44922, 44889, 44811, 45538, 44161, 45270, 44236, 44320, 44148, 41335, 47755, 45636, 44417, 44399, 43258, 47752, 45597, 41274, 45709, 41144, 45531, 45533, 45227, 47950, 48056, 46836, 46702, 48850, 48798, 48423, 48236, 48817, 48670, 47785, 48736, 48242, 45773, 48634, 48418, 45790, 45747, 48966, 48808, 49073, 46000, 47503, 47513, 49142, 49245, 49051, 47149, 49275, 46232, 49989, 50033, 46322, 46500, 46529, 41, 103, 67, 2249, 3530, 2367, 3404, 2234, 215, 3414, 3460, 2431, 3391, 2540, 2621, 2417]"

The query for a column containing this value with the professional odbc sql server driver returns

"2540, 2621, 2417]"

Example 2

(longer) https://pastebin.com/4JL4DqFa

Results in

",2506,2554,231,237,240,266,276,295,330,2576,2621,3566,3576,3598,3605,2349,2352,2420,2423,122,129,145,151,183,190,193,209,212,3496,3499,3538,3544,2449,2452,2455,2458,2465,2468,2471,1103,1174,1190,3362,3372,3375,3394,3417,3439,152,200,210,216,2337,2363,2366,2369,2398,2401,2408,2414,2417,2437,2440,3614,3624,3627,2666]"

@jimhester
Copy link
Contributor

Ok I can confirm this. Interestingly this works fine with the freeTDS drivers, but not with the Simba drivers. Writing the data works fine, it is only reading that seems to have an issue. In both cases I am connecting to the same database, here is a reproducible example

x <- data.frame(stringsAsFactors = FALSE, x = scan("https://pastebin.com/raw/4JL4DqFa", what = "character"))

# Simba drivers
library(DBI);con <- dbConnect(odbc::odbc(), "SimbaSQLServer", UID = 'testuser', PWD = 'test');
dbWriteTable(con, "foo", x, field.types = c(x = "nvarchar(MAX)"), overwrite = TRUE)
dbReadTable(con, "foo")

# FreeTDS drivers
library(DBI);con2 <- dbConnect(odbc::odbc(), "SQLServer", UID = 'testuser', PWD = 'test');
dbReadTable(con2, "foo")

@daattali
Copy link

Are you not able to reproduce the problem with the first shorter test? Would be simpler to work with

@jimhester
Copy link
Contributor

I can reproduce it if I force the column type to be nvarchar(max), it works fine with a normal nvarchar() column

@jimhester
Copy link
Contributor

Ok I have determined a cause, this was a bug in the upstream nanodbc library, I fixed it locally by d2444e1 and opened an issue at the upstream repository nanodbc/nanodbc#160

@pythiantech
Copy link

pythiantech commented Nov 13, 2019

I am on macOS Catalina Version 10.15.1, running R version 3.6.1. Facing exactly the same issue as described here by @daattali. odbc version is 1.1.6 which I installed from source and am using the Simba driver. I do have the freeTDS driver installed, but odbcListDrivers() does not list it.

While reading columns of type nvarchar(MAX) the results are getting truncated to only the first letter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants