Latest github(1/31/2017) haven still fails with row lengths approx 2048 bytes #272

Closed
rogerjdeangelis opened this issue Jan 31, 2017 · 15 comments

Comments

@rogerjdeangelis

commented Jan 31, 2017

Thanks for working on my issue!!

Latest github haven still fails with row lengths approx 2048 bytes

library(haven)
have<-structure(list(A1="",A2="",A3="",A4="",X1=0,X2=0,X3=0,X4=0,X5=0,X6=0,
X7=0,X8=0,X9=0,X10=0,X11=0,X12=0,X13=0,X14=0,X15=0,X16=0,X17=0,X18=0,
X19=0,X20=0,X21=0,X22=0,X23=0,X24=0,X25=0,X26=0,X27=0,X28=0,X29=0,
X30=0,X31=0,X32=0,X33=0,X34=0,X35=0,X36=0,X37=0,X38=0,X39=0,X40=0,
X41=0,X42=0,X43=0,X44=0,X45=0,X46=0,X47=0,X48=0,X49=0,X50=0,X51=0,
X52=0,X53=0,X54=0,X55=0,X56=0,X57=0,X58=0,X59=0,X60=0,X61=0,X62=0,
X63=0,X64=0,X65=0,X66=0,X67=0,X68=0,X69=0,X70=0,X71=0,X72=0,X73=0,
X74=0,X75=0,X76=0,X77=0,X78=0,X79=0,X80=0,X81=0,X82=0,X83=0,X84=0,
X85=0,X86=0,X87=0,X88=0,X89=0,X90=0,X91=0,X92=0,X93=0,X94=0,X95=0,
X96=0,X97=0,X98=0,X99=0,X100=0,X101=0,X102=0,X103=0,X104=0,X105=0,
X106=0,X107=0,X108=0,X109=0,X110=0,X111=0,X112=0,X113=0,X114=0,X115=0,
X116=0,X117=0,X118=0,X119=0,X120=0,X121=0,X122=0,X123=0,X124=0,X125=0,
X126=0,X127=0,X128=0,X129=0,X130=0,X131=0,X132=0,X133=0,X134=0,X135=0,
X136=0,X137=0,X138=0,X139=0,X140=0,X141=0,X142=0,X143=0,X144=0,X145=0,
X146=0,X147=0,X148=0,X149=0,X150=0,X151=0,X152=0,X153=0,X154=0,X155=0,
X156=0,X157=0,X158=0,X159=0,X160=0,X161=0,X162=0,X163=0,X164=0,X165=0,
X166=0,X167=0,X168=0,X169=0,X170=0,X171=0,X172=0,X173=0,X174=0,X175=0,
X176=0,X177=0,X178=0,X179=0,X180=0,X181=0,X182=0,X183=0,X184=0,X185=0,
X186=0,X187=0,X188=0,X189=0,X190=0,X191=0,X192=0,X193=0,X194=0,X195=0,
X196=0,X197=0,X198=0,X199=0,X200=0,X201=0,X202=0,X203=0,X204=0,X205=0,
X206=0,X207=0,X208=0,X209=0,X210=0,X211=0,X212=0,X213=0,X214=0,X215=0,
X216=0,X217=0,X218=0,X219=0,X220=0,X221=0,X222=0,X223=0,X224=0,X225=0,
X226=0,X227=0,X228=0,X229=0,X230=0,X231=0,X232=0,X233=0,X234=0,X235=0,
X236=0,X237=0,X238=0,X239=0,X240=0,X241=0,X242=0,X243=0,X244=0,X245=0,
X246=0,X247=0,X248=0,X249=0,X250=0,X251=0,X252=0,X253=0,X254=0,X255=0,
X256=0,X257=0),row.names=c(NA,-1L),
class=c("tbl_df","tbl","data.frame"))
write_sas(have,"d:/sd1/hangs.sas7bdat")

I did get this 'note' when installing the latest haven
DfReader.cpp:471:5: warning: 'dir' may be used uninitialized in this function [-Wmaybe-uninitialized]
file_.seekg(offset, dir);
^

Stderr output:
Error in write_sas_(data, normalizePath(path, mustWork = FALSE)) :
Writing failure: A row of data will not fit into the file format.
Calls: write_sas -> write_sas_ -> .Call
Execution halted
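
For context on the "approx 2048 bytes" figure, the row width of the reprex can be estimated in base R. This is only a rough sketch: `estimate_row_bytes()` is a hypothetical helper, not part of haven, and it assumes 8 bytes per numeric column and at least 1 byte per character column, which may not match how the SAS writer actually lays out a row.

```r
# Rough estimate of the bytes one row occupies.
# Assumptions: 8 bytes per numeric column; each character column takes the
# byte width of its longest value, with a minimum of 1 byte.
estimate_row_bytes <- function(df) {
  widths <- vapply(df, function(col) {
    if (is.character(col)) {
      max(1, nchar(col, type = "bytes"), na.rm = TRUE)
    } else {
      8
    }
  }, numeric(1))
  sum(widths)
}

# Same shape as the reprex: 4 empty character columns and 257 numeric columns.
cols <- c(
  setNames(rep(list(""), 4), paste0("A", 1:4)),
  setNames(rep(list(0), 257), paste0("X", 1:257))
)
have_small <- as.data.frame(cols, stringsAsFactors = FALSE)

estimate_row_bytes(have_small)  # 4 * 1 + 257 * 8 = 2060
```

Under these assumptions the row is slightly over 2048 bytes, which is consistent with the failure appearing around that size.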

@evanmiller

Contributor

commented Jul 8, 2017

Should be fixed in WizardMac/ReadStat@046cadd

@hadley

Member

commented Jan 7, 2018

@rogerjdeangelis Please try again with latest github version

@hadley hadley closed this Jan 7, 2018

@normark


commented Jan 10, 2018

@hadley The problem is still reproducible on the master branch.

This must be an issue in readstat. Should it be filed in the readstat repo instead? Also, I'm having trouble finding out which upstream version of readstat is used here.

@hadley

Member

commented Jan 10, 2018

@evanmiller could you please take another look?

@evanmiller

Contributor

commented Jan 10, 2018

@hadley I've added test coverage via WizardMac/ReadStat@1d40cb0, everything appears to work on the ReadStat end.

@hadley

Member

commented Jan 10, 2018

Hmmm, I'm not sure how this could be a haven problem, but I'll think about it.

@evanmiller

Contributor

commented Jan 11, 2018

Codecov indicates the fix isn't actually being triggered in tests, I'll tinker some more until I'm sure the code path is hot.

@rgayler


commented Jan 11, 2018

FWIW I have just run into the same problem.
Using haven 1.1.0.9000

I have been given a sas7bdat file with 700 variables (mostly doubles). I can read it fine with read_sas(), but trying to write it out with write_sas() gets an error:


> haven::write_sas(dd, path_data_raw("sas", "test.sas7bdat")) 
Error in write_sas_(data, normalizePath(path, mustWork = FALSE)) :
    Writing failure: A row of data will not fit into the file format. 

> traceback() 
4: stop(list(message = "Writing failure: A row of data will not fit into the file format.",
         call = write_sas_(data, normalizePath(path, mustWork = FALSE)),
         cppstack = list(file = "", line = -1L, stack = c("/home/ross/R/x86_64-pc-linux-gnu-library/3.4/haven/libs/haven.so(Rcpp::exception::exception(char const*, bool)+0x7a) [0x7fd8b223293a]",
         "/home/ross/R/x86_64-pc-linux-gnu-library/3.4/haven/libs/haven.so(void Rcpp::stop<char const*>(char const*, char const*&&)+0x4f) [0x7fd8b223f46f]",
         "/home/ross/R/x86_64-pc-linux-gnu-library/3.4/haven/libs/haven.so(Writer::write()+0xb7d) [0x7fd8b224262d]",
         "/home/ross/R/x86_64-pc-linux-gnu-library/3.4/haven/libs/haven.so(write_sas_(Rcpp::Vector<19, Rcpp::PreserveStorage>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)+0x11d) [0x7fd8b223eefd]",
         "/home/ross/R/x86_64-pc-linux-gnu-library/3.4/haven/libs/haven.so(haven_write_sas_+0x128) [0x7fd8b2244b08]",
         "/usr/lib/R/lib/libR.so(+0xd3a86) [0x7fd8cc5a5a86]", "/usr/lib/R/lib/libR.so(Rf_eval+0x818) [0x7fd8cc5e4718]",
         "/usr/lib/R/lib/libR.so(+0x1164a0) [0x7fd8cc5e84a0]", "/usr/lib/R/lib/libR.so(Rf_eval+0x6cf) [0x7fd8cc5e45cf]",
         "/usr/lib/R/lib/libR.so(+0x11519b) [0x7fd8cc5e719b]", "/usr/lib/R/lib/libR.so(Rf_eval+0x5bb) [0x7fd8cc5e44bb]",
         "/usr/lib/R/lib/libR.so(+0x11452e) [0x7fd8cc5e652e]", "/usr/lib/R/lib/libR.so(Rf_eval+0x362) [0x7fd8cc5e4262]",
         "/usr/lib/R/lib/libR.so(+0x11519b) [0x7fd8cc5e719b]", "/usr/lib/R/lib/libR.so(Rf_eval+0x5bb) [0x7fd8cc5e44bb]",
         "/usr/lib/R/lib/libR.so(+0x11452e) [0x7fd8cc5e652e]", "/usr/lib/R/lib/libR.so(Rf_eval+0x362) [0x7fd8cc5e4262]", 
        "/usr/lib/R/lib/libR.so(Rf_ReplIteration+0x212) [0x7fd8cc60d752]",         "/usr/lib/R/lib/libR.so(+0x13bb51) [0x7fd8cc60db51]", "/usr/lib/R/lib/libR.so(run_Rmainloop+0x48) [0x7fd8cc60dc08]",
         "/usr/lib/rstudio/bin/rsession(rstudio::r::session::runEmbeddedR(rstudio::core::FilePath const&, rstudio::core::FilePath const&, bool, bool, SA_TYPE, rstudio::r::session::Callbacks const&, rstudio::r::session::InternalCallbacks*)+0x16f) [0xe40fbf]",
         "/usr/lib/rstudio/bin/rsession(rstudio::r::session::run(rstudio::r::session::ROptions const&, rstudio::r::session::RCallbacks const&)+0x84f) [0xe1f56f]",
         "/usr/lib/rstudio/bin/rsession(main+0x3633) [0x718ad3]",
         "/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1) [0x7fd8cabcf1c1]",
         "/usr/lib/rstudio/bin/rsession(_start+0x29) [0x727f39]")))) 
3: .Call("haven_write_sas_", PACKAGE = "haven", data, path) 
2: write_sas_(data, normalizePath(path, mustWork = FALSE)) 
1: haven::write_sas(dd, path_data_raw("sas", "test.sas7bdat"))

@evanmiller

Contributor

commented Jan 11, 2018

Okay, the code path is hot now... apparently I don't know how to count. 4096+ byte rows should work fine provided that row-compression isn't explicitly turned on (didn't see it turned on anywhere in haven).

If anyone wants to mess with the default page size, feel free to futz with PAGE_SIZE in readstat_sas.c.

@normark


commented Jan 11, 2018

@evanmiller So the test is passing in the current readstat master? 😊

@evanmiller

Contributor

commented Jan 11, 2018

@normark Yes, but I haven't changed any relevant code, just added tests. Are you sure you're using haven master?

@normark


commented Jan 11, 2018

@evanmiller Yes, installed from master using devtools::install_github("tidyverse/haven"). And I just manually double-checked the libpaths: the version on my search path is the master from here.

The error message that shows up in R appears to be triggered in readstat at either of these places:
https://github.com/WizardMac/ReadStat/blob/046cadde5886e8bcecb9511153ccc6e61bb00194/src/sas/readstat_sas7bdat_write.c#L418

https://github.com/WizardMac/ReadStat/blob/046cadde5886e8bcecb9511153ccc6e61bb00194/src/sas/readstat_sas7bdat_write.c#L525

This might be an entirely wrong suspicion though. I am totally unfamiliar with the codebase, just guessing 😄
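
The "which copy of haven is actually loaded" check described above can be sketched in base R. `describe_install()` is a hypothetical helper, and the `RemoteSha` field is an assumption: it is expected in the installed DESCRIPTION only when the package was installed from GitHub via devtools/remotes, and is `NULL` otherwise.

```r
# Report where a package is found and which version/commit it is.
# describe_install() is a hypothetical helper for illustration only.
describe_install <- function(pkg) {
  desc <- utils::packageDescription(pkg)
  list(
    version = as.character(utils::packageVersion(pkg)),
    path = find.package(pkg),
    # Assumed to be set only for GitHub installs via devtools/remotes.
    remote_sha = desc$RemoteSha
  )
}

.libPaths()                  # library search path, in priority order
# describe_install("haven")  # run this where haven is installed
```

If `path` points at a library other than the one devtools installed into, an older copy earlier on the search path is likely shadowing the master build.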

@evanmiller

Contributor

commented Jan 11, 2018

@normark @hadley This issue should be closed, and #335 should be re-opened. The remaining issue has to do with the number and names of columns, not the row lengths.

@hadley hadley closed this Jan 11, 2018

@hadley

Member

commented Jan 11, 2018

Ok, done.

@lock


commented Jul 10, 2018

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Jul 10, 2018
