quoted csv #3

Closed
certara-mtomashevskiy opened this issue Dec 24, 2019 · 3 comments

@certara-mtomashevskiy certara-mtomashevskiy commented Dec 24, 2019

If someone exports one of the rds datasets like this
write.csv(rds01, 'rds01.csv')
and then tries to import it again with
method.A(path.in = getwd(), file = 'rds01', ext = 'csv')
an error is raised saying that the column names were not found:
Error in get.data(path.in = path.in, path.out = path.out, file = file, :
Column names must be given as 'subject', 'period', 'sequence', 'treatment'.

This happens because the column names in the exported file are quoted.
To avoid this, the user should export with
write.csv(rds01, 'rds01.csv', quote = FALSE)
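
The difference between the two export calls can be seen by looking at the header line that write.csv() produces. The following is only a minimal sketch: the data frame `df` is a small hypothetical stand-in, not the real rds01 object, and the files go to temporary paths.

```r
# Hypothetical stand-in for a replicateBE dataset (NOT the real rds01)
df <- data.frame(subject   = c(1, 1),
                 period    = c(1, 2),
                 sequence  = c("RTRT", "RTRT"),
                 treatment = c("R", "T"),
                 stringsAsFactors = FALSE)

f_quoted <- tempfile(fileext = ".csv")
f_plain  <- tempfile(fileext = ".csv")

write.csv(df, f_quoted)                # defaults: quote = TRUE, row.names = TRUE
write.csv(df, f_plain, quote = FALSE)  # names and strings are written bare

# Inspect only the header line of each file
header_quoted <- readLines(f_quoted, n = 1)
header_plain  <- readLines(f_plain, n = 1)
print(header_quoted)  # column names wrapped in double quotes
print(header_plain)   # bare column names
```

In both files the header starts with an empty field, because the row names are exported as an extra first column by default.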

@certara-mtomashevskiy certara-mtomashevskiy commented Dec 24, 2019

We can avoid that by replacing

datawithdescr <- read.csv(file = full.name, sep = sep,
            dec = dec, quote = "", header = FALSE, strip.white = TRUE,
            na.strings = c("NA", "ND", ".", "", "Missing"),
            stringsAsFactors = FALSE)

with

datawithdescr <- read.csv(file = full.name, sep = sep,
            dec = dec, quote = "'\"", header = FALSE, strip.white = TRUE,
            na.strings = c("NA", "ND", ".", "", "Missing"),
            stringsAsFactors = FALSE)

I am not sure about side effects yet.
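
To illustrate what the `quote` argument changes, here is a small sketch using base R only. The two-line file is hand-written for the example and is an assumption, not an actual export of rds01; the other arguments of the call above (`sep`, `dec`, `na.strings`, `strip.white`) are left at their defaults for brevity.

```r
# Hand-written quoted CSV (hypothetical content, not a real rds01 export)
f <- tempfile(fileext = ".csv")
writeLines(c('"subject","period","sequence","treatment"',
             '"1","1","RTRT","R"'), f)

# quote = "" disables quote handling: the quote characters stay in the data
raw_read <- read.csv(f, quote = "", header = FALSE, stringsAsFactors = FALSE)

# quote = "'\"" treats both ' and " as quoting characters and strips them
unq_read <- read.csv(f, quote = "'\"", header = FALSE, stringsAsFactors = FALSE)

print(raw_read[1, 1])  # still carries the literal double quotes
print(unq_read[1, 1])  # plain name without quotes
```

Note that with `'` accepted as a quoting character, a single apostrophe inside an unquoted field would also be interpreted as the start of a quoted string.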

@Helmut01 Helmut01 commented Dec 24, 2019

Hi Michael,

Why would one do that? The CSV files are already provided in /library/replicateBE/extdata/*.csv… Granted, they come with the commentary header, but get.data.R handles it.

  • The rds files are named S3-objects (with the factorized data.frame within). It’s not our job to prevent all possible errors a user might make. Actually, the code for the export should be write.csv(rds01, 'rds01.csv', quote=FALSE, row.names=FALSE).
  • However, I will assess your suggestion.
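
The export call recommended in the list above can be checked with a short round trip. This is a sketch under assumptions: `df` is a small hypothetical data frame standing in for the data inside rds01, and plain read.csv() is used for the re-import instead of the package's own get.data().

```r
# Hypothetical stand-in for the data.frame inside rds01
df <- data.frame(subject   = c(1, 2),
                 period    = c(1, 1),
                 sequence  = c("RTRT", "TRTR"),
                 treatment = c("R", "T"),
                 stringsAsFactors = FALSE)

f <- tempfile(fileext = ".csv")

# No quotes around names/strings and no extra row-name column
write.csv(df, f, quote = FALSE, row.names = FALSE)

# Re-import without any quote handling, as in the original read.csv() call
back <- read.csv(f, quote = "", stringsAsFactors = FALSE)

print(readLines(f, n = 1))  # header line of the exported file
print(names(back))          # column names after re-import
```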

@Helmut01 Helmut01 commented Dec 25, 2019

Hi Michael,

there is a side effect.
With write.csv(rds01, 'rds01.csv') the row.names are exported as well. After line 120, str(data) gives the following for the export/import of rds01:
'data.frame': 298 obs. of 7 variables:
$ NA : int 1 2 3 4 5 6 7 8 9 10 ...
$ subject : chr "1" "1" "1" "1" ...
$ period : chr "1" "2" "3" "4" ...
$ sequence : chr "RTRT" "RTRT" "RTRT" "RTRT" ...
$ treatment: chr "R" "T" "R" "T" ...
$ PK : chr "2285.96" "1955.82" "1345.94" "2856.24" ...
$ logPK : chr "7.734541" "7.578565" "7.204848" "7.957261" ...
The first column is named NA and is an integer, not a character like the others. Hence, we need:
if (typeof(data[[1]]) == "integer") data <- data[, -1]
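
The effect of this guard can be tried on a toy example. The sketch below builds the data frame by hand to mimic the structure shown by str(data) above (an integer row-index column followed by character columns); the column name `rowid` and the values are assumptions for illustration only.

```r
# Hand-built stand-in: integer row index first, character columns after it
data <- data.frame(rowid   = 1:3,
                   subject = c("1", "1", "2"),
                   period  = c("1", "2", "1"),
                   stringsAsFactors = FALSE)

# Drop the first column only if it is the integer row index
if (typeof(data[[1]]) == "integer") data <- data[, -1]

print(names(data))

# Running the guard again is harmless: the first column is now character
if (typeof(data[[1]]) == "integer") data <- data[, -1]
print(names(data))
```

If a data frame could ever have only two columns at this point, data[, -1, drop = FALSE] would keep the result a data frame instead of collapsing it to a vector.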

@Helmut01 Helmut01 closed this Dec 25, 2019