R package using Rcpp to parse a SAS file into a data.frame(). Currently,
read.sas() is the main function and feature of this package.
The package allows (experimental) reading of sas7bdat files that are
- (un)compressed
As with other releases of the read series, the focus is again on being as
accurate as possible. Speed is welcome, but a secondary goal.
With remotes:
remotes::install_github("JanMarvin/readsas")
With r-universe:
options(repos = c(
  janmarvin = 'https://janmarvin.r-universe.dev',
  CRAN = 'https://cloud.r-project.org'))
install.packages('readsas')
fl <- system.file("extdata", "cars.sas7bdat", package = "readsas")
dd <- read.sas(fl)
head(dd)
#>   speed dist
#> 1     4    2
#> 2     4   10
#> 3     7    4
#> 4     7   22
#> 5     8   16
#> 6     9   10
Reading only selected columns or rows should be much faster than reading the whole file, since unselected cells of the data frame are skipped, and it is also more memory efficient. However, the file header is always read in its entirety, so a file with a large header will still take some time to read.
fl <- system.file("extdata", "mtcars.sas7bdat", package = "readsas")
dd <- read.sas(fl, select.cols = c("VAR1", "mpg", "hp"),
               select.rows = c(2:5), rownames = TRUE)
head(dd)
#>                    mpg  hp
#> Mazda RX4 Wag     21.0 110
#> Datsun 710        22.8  93
#> Hornet 4 Drive    21.4 110
#> Hornet Sportabout 18.7 175
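One way to convince yourself that a selective read returns the same values as a full read is to compare it against the matching slice of the complete data frame. The sketch below is only an illustration: same_as_slice() is a hypothetical helper, not part of readsas, and it assumes that select.rows uses the same 1-based row positions as R indexing, as the example above suggests. The timings are indicative at best, since the bundled example file is small.

```r
# Hypothetical helper (not part of readsas): TRUE if `part` holds the same
# values as rows `rows` and columns `cols` of `full`.
same_as_slice <- function(full, part, rows, cols) {
  slice <- full[rows, cols, drop = FALSE]
  # Compare values only; row names and other attributes may differ.
  isTRUE(all.equal(slice, part, check.attributes = FALSE))
}

# Only run the readsas part if the package is installed.
if (requireNamespace("readsas", quietly = TRUE)) {
  fl <- system.file("extdata", "mtcars.sas7bdat", package = "readsas")

  full <- readsas::read.sas(fl)
  part <- readsas::read.sas(fl, select.cols = c("mpg", "hp"),
                            select.rows = 2:5)

  # Does the selective read match the slice of the full read?
  print(same_as_slice(full, part, rows = 2:5, cols = c("mpg", "hp")))

  # Rough timing comparison of a full read and a single-column read.
  print(system.time(readsas::read.sas(fl)))
  print(system.time(readsas::read.sas(fl, select.cols = "mpg")))
}
```

On large files the selective read is where the difference should show; the header is still parsed in full in both cases.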
The documentation of the sas7bdat format by Matt Shotwell and Clint
Cummins in their R package sas7bdat, by Jared Hobbs in the Python library
sas7bdat, and by EPAM in the Java library parso was crucial. Without their
reverse engineering of the SAS format, this package would not have been
possible.
Further testing was done using the R package haven by Hadley Wickham and
Evan Miller.