Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop/long int #1008

Merged
merged 7 commits into from Aug 18, 2022
Merged

Drop/long int #1008

merged 7 commits into from Aug 18, 2022

Conversation

kylebaron
Copy link
Collaborator

@kylebaron kylebaron commented Aug 15, 2022

Summary

When data.table (and maybe readr) read in .csv files, they can create columns with class integer64. When the data frame is converted to matrix to pass into C++ layer, it can't handle that data ... pretty much undefined behavior (I'm not sure what all is going on but it looks like undefined). The fix for this is to coerce with as.double.

I am handing this by dropping any column that has a class attribute set. So valid data are bare numerics.

I am also refactoring the message that appears when columns are dropped (like character) so that the user is told the name of the column that is getting dropped along with the class of that column.

Test is added specifically for integer64 data. Previous tests for dropping character columns have been updated to look for the new message.

@kylebaron
Copy link
Collaborator Author

Broken

> data <- fread("ID,KA\n1,100000020200")
> data$TIME <- 1
> data$CMT <- 1
> mod <- house()
> mrgsim(mod, events = data)
Error: event record cmt must be between 1 and 3: 
 ID: 4.94066e-324, row: 1, cmt: 0, evid: 4.94066e-324

Fixed

> data <- fread("ID,KA\n1,100000020200")
> data$TIME <- 1
> data$CMT <- 1
> 
> mod <- house()
> mrgsim(mod, events = data)
[data-set] dropped column: KA (integer64)
Model:  housemodel 
Dim:    482 x 7 
Time:   0 to 120 
ID:     1 
    ID time GUT CENT RESP DV CP
1:   1 0.00   0    0   50  0  0
2:   1 0.25   0    0   50  0  0
3:   1 0.50   0    0   50  0  0
4:   1 0.75   0    0   50  0  0
5:   1 1.00   0    0   50  0  0
6:   1 1.00   0    0   50  0  0
7:   1 1.25   0    0   50  0  0
8:   1 1.50   0    0   50  0  0

@kylebaron kylebaron requested a review from kyleam August 16, 2022 20:11
Copy link
Contributor

@kyleam kyleam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. (Left a comment about a paste call, but even if I'm right, it doesn't change the behavior in any way.)

R/mrgindata.R Outdated
for(d in drop) {
type <- paste0(class(x[[d]]), collapse = ",")
msg <- c(context, " dropped column: ", d, " (", type, ")")
msg <- paste(msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, what's this paste doing? Won't the result be the same as the input?

> x <- c("foo", "bar")
> identical(x, paste(x))
[1] TRUE

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see ... I think I was intending to collapse that vector but message is doing it for me. Thanks.

@kylebaron kylebaron merged commit 5934663 into develop Aug 18, 2022
@kylebaron kylebaron deleted the drop/long-int branch August 18, 2022 14:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants