Treatment values change when ratio argument is used? #213

Closed
maxdrohde opened this issue Aug 28, 2023 · 6 comments · Fixed by #215

maxdrohde commented Aug 28, 2023

I am confused by the following result. Why does adding the ratio argument change the treatment label from 0,1 to 1,2?

Thank you for creating this package, just curious to know if I'm missing something here!

library(simstudy)
library(data.table)

Case 1

Input

dd <- genData(10)
dd <- trtAssign(dd,
                nTrt = 2,
                ratio = c(1,3),
                balanced = FALSE,
                grpName = "tx")

print(dd$tx)

Output

 [1] 1 2 2 1 1 2 1 2 2 1

Case 2

Input

dd <- genData(10)
dd <- trtAssign(dd,
                nTrt = 2,
                # ratio = c(1,3),
                balanced = FALSE,
                grpName = "tx")

print(dd$tx)

Output

[1] 0 1 0 1 0 0 0 1 0 0
@assignUser
Collaborator

I had a look at the code and the difference is this line:

formula <- .5
If we change this to c(.5, .5), it produces the same output, because trtObserve uses length(formulas) to set ncat, which is then used to generate the values.

That line takes advantage of the fact that trtObserve adds a 'remainder' column to the matrix used to generate the values, but it produces this inconsistent result. Unless @kgoldfeld has objections, I would say it makes sense to apply the minor change and make the results (and the code) consistent.
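
To make the mechanism concrete, here is a rough base-R sketch of the behavior described above. It is not simstudy's code: `sketch_observe` is a hypothetical stand-in, and the recoding of the remainder category to 0 for a single probability is an assumption made only to reproduce the 0/1 versus 1/2 labels seen in the issue.

```r
# Hypothetical stand-in for the mechanism described above; NOT simstudy's code.
sketch_observe <- function(n, formulas) {
  # The number of categories is taken from the length of the input.
  ncat <- length(formulas)
  # Append a 'remainder' probability so the probabilities sum to 1.
  probs <- c(formulas, max(0, 1 - sum(formulas)))
  grp <- sample(seq_along(probs), n, replace = TRUE, prob = probs)
  # Assumption for illustration: a single probability is treated as binary,
  # and the remainder category (2) is recoded to 0.
  if (ncat == 1) grp[grp == 2] <- 0L
  grp
}

set.seed(213)
table(sketch_observe(1000, 0.5))            # labels 0 and 1
table(sketch_observe(1000, c(0.5, 0.5)))    # labels 1 and 2
table(sketch_observe(1000, c(0.25, 0.75)))  # labels 1 and 2
```

Under these assumptions, passing a length-one probability and a length-two vector give differently labeled groups even though the allocation probabilities are the same, which matches the inconsistency reported here.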

@assignUser
Collaborator

Also @maxdrohde, thanks for the well-structured issue with reprex and everything, 10/10! 🎉

@kgoldfeld
Owner

I agree that the result is not ideal, and I agree that it should be changed. I do have concerns that it might impact some users who have learned to live with it.

As an aside, if you use trtAssign as a distribution in a dataDef, the results are more what you would expect:

d <- defData(varname = "tx", formula = "1;3", dist = "trtAssign")
genData(1000, d)[, table(tx)]
tx
  0   1 
250 750 
d <- defData(varname = "tx", formula = "1;2;3", dist = "trtAssign")
genData(1000, d)[, table(tx)]
tx
  1   2   3 
167 334 499 

@kgoldfeld
Owner

I just want to make clear what would be the ideal behavior. It seems to me that with two categories, the result should always be 0/1, and never 1/2. @maxdrohde Is that what you were thinking as well?

@maxdrohde
Author

@kgoldfeld Yes, just using 0/1 sounds good to me. My main concern was just that it wasn't consistent. Thanks for looking into this!

assignUser linked a pull request on Aug 30, 2023 that will close this issue
@kgoldfeld
Owner

kgoldfeld commented Aug 30, 2023

@maxdrohde Just wanted to let you know that the behavior of trtAssign is now consistent: 0/1 is generated with 2 treatment arms, and 1/2/3/... is used with more than 2 arms. The changes are available in the development version here on GitHub.
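
As a small illustration of the labeling convention described above, the sketch below maps 1-based arm indices to labels. `label_arms` is a hypothetical helper written for this example, not a simstudy function, and it only mirrors the rule as stated in this comment.

```r
# Hypothetical helper illustrating the stated convention; NOT part of simstudy.
label_arms <- function(idx, n_arms) {
  # idx holds 1-based arm indices in 1..n_arms
  stopifnot(all(idx %in% seq_len(n_arms)))
  # Two arms are labeled 0/1; three or more arms keep 1/2/3/...
  if (n_arms == 2) idx - 1L else idx
}

label_arms(c(1L, 2L, 2L, 1L), n_arms = 2)
label_arms(c(1L, 3L, 2L), n_arms = 3)
```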
