Common interface: read_(f)ods, write_(f)ods, list_(f)ods_sheets #151

Closed
chainsawriot opened this issue Aug 21, 2023 · 1 comment

chainsawriot commented Aug 21, 2023

The only difference between read_ods and read_fods (and between related functions such as list_ods_sheets and the upcoming write_ods) is the value of flat passed to .read_ods. I was therefore thinking about making read_ods and co. work like readxl::read_excel().

We should have a function like excel_format() https://github.com/tidyverse/readxl/blob/main/R/excel-format.R

For our case, detection is mostly by file extension (ods vs. fods/xml), and then by file signature: an ods file is a zip archive, so it starts with the bytes as.raw(c(0x50, 0x4B, 0x03, 0x04)).

A rough implementation (unlike readxl, I don't want it to be exported):

.determine_ods_format <- function(path, guess = FALSE) {
    ext <- tolower(tools::file_ext(path))
    formats <- c(
        ods = "ods",
        fods = "fods",
        xml = "fods"
    )
    if (!isTRUE(guess)) {
        return(unname(formats[ext]))
    }
    ## ods files are zip archives; fods files are plain XML
    zip_sig <- as.raw(c(0x50, 0x4B, 0x03, 0x04))
    if (identical(zip_sig, readBin(path, n = 4, what = "raw"))) {
        return("ods")
    } else {
        return("fods")
    }
}

.determine_ods_format(readODS::write_ods(iris))
#> [1] "ods"
## will be available soon
##.determine_ods_format(readODS::write_fods(iris)) # fods

.determine_ods_format(readODS::write_ods(iris, tempfile(fileext = ".fods"))) ## please don't imitate this.
#> [1] "fods"
.determine_ods_format(readODS::write_ods(iris, tempfile(fileext = ".fods")), guess = TRUE)
#> [1] "ods"

Created on 2023-08-21 with reprex v2.0.2
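
One edge case in the rough implementation above: for an extension that is not in formats, formats[ext] is NA, so the later comparison with "fods" would also be NA. A minimal sketch of one way to avoid that, assuming it is acceptable to fall back to the signature check whenever the extension is unrecognized. The name .determine_ods_format_strict is hypothetical and not part of readODS:

```r
## Hypothetical helper, not part of readODS: same idea as .determine_ods_format,
## but it never returns NA.
.determine_ods_format_strict <- function(path, guess = FALSE) {
    ext <- tolower(tools::file_ext(path))
    formats <- c(ods = "ods", fods = "fods", xml = "fods")
    ## Trust the extension only when it is known and guessing is off
    if (!isTRUE(guess) && ext %in% names(formats)) {
        return(unname(formats[ext]))
    }
    ## Otherwise inspect the first four bytes for the zip signature
    zip_sig <- as.raw(c(0x50, 0x4B, 0x03, 0x04))
    if (identical(zip_sig, readBin(path, what = "raw", n = 4L))) {
        return("ods")
    }
    "fods"
}

## Demo with fake files: only the leading bytes matter for the signature check
zip_like <- tempfile(fileext = ".dat")
writeBin(as.raw(c(0x50, 0x4B, 0x03, 0x04, 0x00)), zip_like)
xml_like <- tempfile(fileext = ".dat")
writeLines("<?xml version=\"1.0\"?>", xml_like)

.determine_ods_format_strict(zip_like)
.determine_ods_format_strict(xml_like)
```

Whether an unknown extension should trigger guessing or an error is a design choice; this sketch only shows the guessing variant.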

read_ods, for example, would then be:

read_ods <- function(path,
                     sheet = 1,
                     col_names = TRUE,
                     col_types = NULL,
                     na = "",
                     skip = 0,
                     formula_as_formula = FALSE,
                     range = NULL,
                     row_names = FALSE,
                     strings_as_factors = FALSE,
                     verbose = FALSE,
                     as_tibble = TRUE,
                     .name_repair = "unique",
                     guess = FALSE) {
    ## Should use match.call but there's a weird bug if one of the variable names is 'file'
    .read_ods(path = path,
        sheet = sheet,
        col_names = col_names,
        col_types = col_types,
        na = na,
        skip = skip,
        formula_as_formula = formula_as_formula,
        range = range,
        row_names = row_names,
        strings_as_factors = strings_as_factors,
        verbose = verbose,
        as_tibble = as_tibble,
        .name_repair = .name_repair,
        flat = .determine_ods_format(path, guess = guess) == "fods")
}

chainsawriot commented Aug 21, 2023

Or:

read_ods <- function(path,
                     sheet = 1,
                     col_names = TRUE,
                     col_types = NULL,
                     na = "",
                     skip = 0,
                     formula_as_formula = FALSE,
                     range = NULL,
                     row_names = FALSE,
                     strings_as_factors = FALSE,
                     verbose = FALSE,
                     as_tibble = TRUE,
                     .name_repair = "unique",
                     ods_format = "auto",
                     guess = FALSE) {
    if (ods_format == "auto") {
         ods_format <- .determine_ods_format(path, guess = guess)
    }
    ## Should use match.call but there's a weird bug if one of the variable names is 'file'
    .read_ods(path = path,
        sheet = sheet,
        col_names = col_names,
        col_types = col_types,
        na = na,
        skip = skip,
        formula_as_formula = formula_as_formula,
        range = range,
        row_names = row_names,
        strings_as_factors = strings_as_factors,
        verbose = verbose,
        as_tibble = as_tibble,
        .name_repair = .name_repair,
        flat = ods_format == "fods")
}

so that one can do crazy things like:

read_ods("it_is_actually_fods_but_I_dont_want_you_to_guess.ods", ods_format = "fods")
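
With a free-form ods_format argument, a typo such as "fod" would silently be treated as "not fods". A minimal sketch of validating the argument with base R's match.arg(), under the assumption that the allowed values are "auto", "ods" and "fods". Both helper names are hypothetical, and the detector here is a simplified stand-in that only looks at the extension:

```r
## Hypothetical stand-in for .determine_ods_format(): extension lookup only
.ext_format <- function(path) {
    formats <- c(ods = "ods", fods = "fods", xml = "fods")
    unname(formats[tolower(tools::file_ext(path))])
}

## Hypothetical helper: validate ods_format, then resolve "auto"
.resolve_ods_format <- function(path, ods_format = c("auto", "ods", "fods")) {
    ## match.arg() picks "auto" when the argument is missing and
    ## raises an error for any value outside the allowed set
    ods_format <- match.arg(ods_format)
    if (ods_format == "auto") {
        ods_format <- .ext_format(path)
    }
    ods_format
}

.resolve_ods_format("some_file.fods")
.resolve_ods_format("it_is_actually_fods_but_I_dont_want_you_to_guess.ods", ods_format = "fods")
```

The result could then be compared with "fods" to set flat, as in the second read_ods variant above.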
