New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
User feedback from the test cycle before v 1.0 #155
Comments
One user has had a lot of trouble installing the package due to a mismatch between the version of haven on which R was built and the version of R that she's running. This seems to be due
This is a very specific problem with github packages installed through |
Thanks @aghaynes for reporting this. Its good to have it on the record in case anyone else runs into this. |
My thinking exactly |
Hi, the vignette is more written for users experienced in R. It is sometimes hard for me who is experienced in secuTrial but not in R. I miss some explanations for R. |
Complete date data can be transformed to xxx.date. I miss a similar transformation for incomplete dates. |
As a secuTrial user I am a bit confused about the meaning of "participants" used here. In secuTrial the "participant" is the one who enters data (user) and the one whos data are entered is the "patient". |
Hi @ClaudiaRokitta, thanks for your feedback.
SecuTrialR is primarily intended for those who will be using R to analyze data from SecuTrial... so basic knowledge of R is more or less assumed... depending what you want to do, some features are available in the shiny app (@PatrickRWright, is that still a part of the package?). There are also a lot of courses for R available of the internet, many for free...
We decided not to do anything with this as the imputation of dates is not trivial. A lot of thought needs to be put into how to handle them to produce sensible results. Note that xxx.date variables are created, they are just empty/NA for the observations without complete dates.
This has been a discussion among the developers. secuTrialR is not really intended for people that build databases in secuTrial, nor those that enter data into secuTrial. It is more intended for those that analyze the data (and are therefore not familiar with terms that secuTrial uses). The term "participant" (outside of secuTrial) is normally understood in trials as the people that take part in a trial (another term could be patient, but many trials include people that are not patients - e.g. controls - so participant is the more inclusive term). I hope that clarifies your queries... |
Feedback from a user in Bern... So in summary, good package and documentation, 2 possible to dos:
|
Really good package! All steps worked out perfectly for my data export. The My remark is rather formal:
|
Further feedback via email: I just have the following general initial installation remarks:
|
Regarding the installation: I agree. We currently have an intern here in Basel who ran into the same issues with his Windows machine thus further backing up that point. I will make some additions to the installation paragraph which will hopefully help. |
As this a point that is only relevant to github, I wouldn't add it to the vignette. Once it's on CRAN, the issue goes away... |
Still, even after publication on CRAN, some people might want to have the newest version from GitHub. Just a short sentence or two should be enough. |
I think some very brief comments will not do any harm especially because I would like the package to be easy to use as possible also for people outside of the R world. This is also relevant to the below comment. I will make a brief reference to an R book that can get people started. I.e. this one
|
Kudos to the creators of secuTrialR, a very comprehensive and well-documented package!!! Thank you. I especially found the subsetting functions very useful. data.extraction.date <- as.Date("04.02.2020", format="%d.%m.%Y") ## date of data extract
data.dir <- Sys.glob(paste("me16Engelter/datamanagement/exports/p_export_CSV_NRL08_",
format(data.extraction.date, "%Y%m%d"), "*",
sep=""))
ESTREL_data <- read_secuTrial(data.dir)
links_secuTrial(ESTREL_data)
_Error in split.default(x, x$var) : first argument must be a vector_ |
First of all, I think it’s really great that you started the downstream analysis of secuTrial data using R! I think the vignette is really a good seed for further data analysis. It is written clearly and comprehensibly and it’s very useful to have a functioning example data set to see how it’s working. Also the examples of regular expressions are nice for people to realize they can use this as tool for a better of data. I know you’ve written that “so basic knowledge of R is more or less assumed” but since you also describe some of the basic things, you might want to add one or two of the following as well (just as suggestion): For one it might be helpful to let people know that they can have a look at the details of the functions within R or via Github, in case they want to know what exactly is done by them. Another idea would be to let people know up to what point they need to re-run commands if they stopped one day and the return to the vignette the next day (I know you can save your workspace, but just in case). The export options are well described, however not really in the order they appear on the secuTrial ExportSearchTool. Maybe I’m too much into the screenshot idea (and I know these vignettes are not the place for that) but would be nice just to have a screenshot of the export settings. I was at little startled by the wording “The majority of the unrecognizable tables”-as I thought that meant some tables could be read by R; it’s probably just me but I would change it to “The majority of the probably unfamiliar tables…” Page 9: It would be good to label this and the other plots in the vignette, as for example what the color coding white/black means, just to make it easier for users. Page 10: Maybe add to “numbers for a few forms” “(number= 5)” or “The below table shows both absolute and relative numbers for 5 of the forms (n=5).” -but as you said, basic knowledge of R is required. Page 11: A short description of the seed would probably be helpful for some people. The inclusion of common warning messages is very helpful- maybe future messages could be pooled/ documented somewhere as well? The functions you implemented so far are very useful I think- hope there’s gonna be more contributors soon to implement further analysis! Trying own data sets I tested a couple of data sets with different export settings and the function works fine to tell the users for example if the dataset is the wrong format (SAS instead of CSV) or another encoding than UTF-8. If the export is rectangular (I double checked the export settings), it throws an error “Input to read_secuTrial() appears to be incompatible. Have you exported CSV format?”. This might be misleading as this export is CSV but in rectangular format…just in case people won’t read the export requirements…and I saw that actually this error message should appear: if (export_options$is_rectangular) { stop("Your export is rectangular. It can not be loaded with this function.") } However, the function only checks for the formal things as far as I have seen- but it does not check for example for completeness of entries like that the tables have the same length as the originally exported ones. Or am I wrong? I know you could probably only check this at random by for example giving the number of entries in a certain table and the user would need to re-check it manually… The diff_secuTrial() function works well and I think is very useful as a quick check for differences. Maybe you could add in the description though that hidden (from the visit plan for example) forms/ variables fall into the category of “removed” as well. I had some problems when running my own data with the functions annual_recruitment, return_random_participants and links_secuTrial. The first two were problems with finding columns Error in Unfortunately I didn’t have time to find the reason for the problem- it might be related to the fact that I tried a couple of things with the dataset; however I did not do any transformations to the original dataset. These columns are present and I figure it might just be a separator problem- maybe on import. The other functions all work fine. I might give it another try at some point. The links_secuTrial gave me following error: Error in split.default(x, x$var) : first argument must be a vector Probably a format problem as well… I hope this helps…thanks again for starting the R pipeline! |
Thanks both of you for pointing out the bug with links_secutrial. I'm looking at it now. It seems to be an issue with identifying the export settings which causes it to fail for exports without the audit trail. |
More feedback via Email:
Warning message:
In rm(list = cmd, envir = .tkplot.env) : object 'tkp.1' not found
vs <- visit_structure(my_exp)
# * Error in `[.data.frame`(r, form_order, c("formname", vis_order)) :
# undefined columns selected
links_secuTrial(my_exp)
# * Error in split.default(x, x$var) : first argument must be a vector
write_secuTrial(my_exp, format = "sav", path = tdir)
# * Error: SPSS only supports levels with <= 120 characters
# Problems: `investigatorinfo_f` |
Regarding the email
This should be solved now
Yes, this is probably due to lack of a fixed the visit structure. Not sure how we can test for that though...
I'm not sure... this is due to factor level labels being too long I assume? The error is reasonably informative, if you know the dataset then the user could reset the level labels... This type of thing is highly dependent on the data structure. |
I'm not a fan of this because it seems rather untidy. The propper way to remove the audit trail would be through the ExportSearchTool (i.e. just reexport with the option off). Thus I don't think we need an |
Fair point. I will make an appropriate adjustment.
I don't think such a function would be too useful, so I will not track the suggestion. |
This issue is intended to track feedback from the test cycle which starts today. Users may submit comments and feedback in this thread or - if they prefer - via email and I will track it here.
The deadline for review is 28th of February 2020.
The text was updated successfully, but these errors were encountered: