From 93cb8231a141a5340adc43f7228b3fae69818135 Mon Sep 17 00:00:00 2001
From: Matt Dowle
Date: Tue, 14 Mar 2017 13:45:25 -0700
Subject: [PATCH] fread complex field containing non-trivial quoting and escaping works under new rules. Test added. Closes #2051

---
 NEWS.md                   | 4 ++++
 inst/tests/issue_2051.csv | 3 +++
 inst/tests/tests.Rraw     | 4 ++++
 3 files changed, 11 insertions(+)
 create mode 100644 inst/tests/issue_2051.csv

diff --git a/NEWS.md b/NEWS.md
index f5afea8db..f6cc422da 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -17,6 +17,10 @@
 1. The type pun fix (using union) in 1.10.4 resolved some CRAN flavors but still failed the new `fwrite` nanotime test with R-devel on MacOS using latest clang from latest Xcode 8.2. It seems that clang optimizations in Xcode 8 are particularly aggressive and require even stricter adherence to C standards. The type pun was already centralized and now uses `memcpy` which is ok by C standards and compilers apparently know to optimize to avoid call overhead.
+2. The new quote rules handles this single field \code{"Our Stock Screen Delivers an Israeli Software Company (MNDO, CTCH)<\/a> SmallCapInvestor.com - Thu, May 19, 2011 10:02 AM EDT<\/cite><\/div>Yesterday in \""Google, But for Finding
+ Great Stocks\"", I discussed the value of stock screeners as a powerful tool"}, [#2051](https://github.com/Rdatatable/data.table/issues/2051). Thanks to @scarrascoso for reporting. Example file added to test suite.
+
+
 #### NOTES
diff --git a/inst/tests/issue_2051.csv b/inst/tests/issue_2051.csv
new file mode 100644
index 000000000..61b7474d2
--- /dev/null
+++ b/inst/tests/issue_2051.csv
@@ -0,0 +1,3 @@
+COLUMN1,COLUMN2,COLUMN3,COLUMN4,COLUMN5,COLUMN6,COLUMN7,COLUMN8,COLUMN9,COLUMN10,COLUMN11,COLUMN12,COLUMN13,COLUMN14,COLUMN15,COLUMN16,COLUMN17,COLUMN18,COLUMN19,COLUMN20,COLUMN21,COLUMN22,COLUMN23,COLUMN24,COLUMN25,COLUMN26,COLUMN27,COLUMN28,COLUMN29,COLUMN30,COLUMN31,COLUMN32,COLUMN33,COLUMN34,COLUMN35,COLUMN36,COLUMN37,COLUMN38,COLUMN39,COLUMN40,COLUMN41,COLUMN42,COLUMN43,COLUMN44,COLUMN45,COLUMN46,COLUMN47,COLUMN48,COLUMN49,COLUMN50
+2015-03-25 13:55:05.000,EFEBAB4B4F84404A4FB843ACFA861945,020E33,COMP,Royal Bank of Canada,CA,44,,,,,,,,,,,,,,,,,,,,,,,,,PRESS-RELEASE,5A5702,XYZ,0.04,-0.58,0,0,1,0,0,0,0,0,2,13,WE-BD,XYZ,5355872:6127514,R.W. Pressprich & Co. Expands Energy Team
+2015-03-25 13:55:05.653,F1C6459BB07D98926FEF3BBF2F0C4AEB,4A6F00,COMP,Alphabet Inc.,US,90,,,,,,,,,,,,,,,,,,,,,,,,,NEWS-FLASH,E5AA62,Yahoo! Finance,0.00,0.24,0,0,0,0,0,0,0,0,2,2,WE-BD,MRVR,10:20834205901,"Our Stock Screen Delivers an Israeli Software Company (MNDO, CTCH)<\/a> SmallCapInvestor.com - Thu, May 19, 2011 10:02 AM EDT<\/cite><\/div>Yesterday in \""Google, But for Finding Great Stocks\"", I discussed the value of stock screeners as a powerful tool"
diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw
index da93afd5c..e5ff19a6a 100644
--- a/inst/tests/tests.Rraw
+++ b/inst/tests/tests.Rraw
@@ -9832,6 +9832,10 @@ if ("package:nanotime" %in% search()) {
 cat("Test 1751 not run. If required call library(nanotime) first.\n")
 }
+# issue 2051 where a quoted field contains ", New quote rule detection handles it.
+test(1752, fread("issue_2051.csv")[2,grep("^Our.*tool$",COLUMN50)], 1L)
+
+
 ##########################
 # TODO: Tests involving GForce functions needs to be run with optimisation level 1 and 2, so that both functions are tested all the time.