Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

User feedback from the test cycle before v 1.0 #155

Closed
PatrickRWright opened this issue Jan 13, 2020 · 22 comments · Fixed by #175
Closed

User feedback from the test cycle before v 1.0 #155

PatrickRWright opened this issue Jan 13, 2020 · 22 comments · Fixed by #175

Comments

@PatrickRWright
Copy link
Collaborator

This issue is intended to track feedback from the test cycle which starts today. Users may submit comments and feedback in this thread or - if they prefer - via email and I will track it here.

The deadline for review is 28th of February 2020.

@PatrickRWright PatrickRWright changed the title User feedback from the test cycle User feedback from the test cycle before v 1.0 Jan 13, 2020
@PatrickRWright
Copy link
Collaborator Author

#156

@aghaynes
Copy link
Member

One user has had a lot of trouble installing the package due to a mismatch between the version of haven on which R was built and the version of R that she's running.
* installing *source* package 'secuTrialR' ... ** using staged installation ** R ** inst ** byte-compile and prepare package for lazy loading Fehler: (konvertiert von Warnung) Paket 'haven' wurde unter R Version 3.6.1 erstellt Ausführung angehalten

This seems to be due remotes:
This issue has the same issue and leads to this section of remotes helpfile, which gives the solution...

Sys.setenv(R_REMOTES_NO_ERRORS_FROM_WARNINGS="true")

This is a very specific problem with github packages installed through remotes though, rather than secuTrialR.

@PatrickRWright
Copy link
Collaborator Author

Thanks @aghaynes for reporting this. Its good to have it on the record in case anyone else runs into this.

@aghaynes
Copy link
Member

Its good to have it on the record in case anyone else runs into this.

My thinking exactly

@ClaudiaRokitta
Copy link

Hi, the vignette is more written for users experienced in R. It is sometimes hard for me who is experienced in secuTrial but not in R. I miss some explanations for R.

@ClaudiaRokitta
Copy link

Complete date data can be transformed to xxx.date. I miss a similar transformation for incomplete dates.

@ClaudiaRokitta
Copy link

As a secuTrial user I am a bit confused about the meaning of "participants" used here. In secuTrial the "participant" is the one who enters data (user) and the one whos data are entered is the "patient".

@aghaynes
Copy link
Member

Hi @ClaudiaRokitta, thanks for your feedback.

Hi, the vignette is more written for users experienced in R. It is sometimes hard for me who is experienced in secuTrial but not in R. I miss some explanations for R.

SecuTrialR is primarily intended for those who will be using R to analyze data from SecuTrial... so basic knowledge of R is more or less assumed... depending what you want to do, some features are available in the shiny app (@PatrickRWright, is that still a part of the package?). There are also a lot of courses for R available of the internet, many for free...

Complete date data can be transformed to xxx.date. I miss a similar transformation for incomplete dates.

We decided not to do anything with this as the imputation of dates is not trivial. A lot of thought needs to be put into how to handle them to produce sensible results. Note that xxx.date variables are created, they are just empty/NA for the observations without complete dates.

As a secuTrial user I am a bit confused about the meaning of "participants" used here. In secuTrial the "participant" is the one who enters data (user) and the one whos data are entered is the "patient".

This has been a discussion among the developers. secuTrialR is not really intended for people that build databases in secuTrial, nor those that enter data into secuTrial. It is more intended for those that analyze the data (and are therefore not familiar with terms that secuTrial uses). The term "participant" (outside of secuTrial) is normally understood in trials as the people that take part in a trial (another term could be patient, but many trials include people that are not patients - e.g. controls - so participant is the more inclusive term).

I hope that clarifies your queries...

@aghaynes
Copy link
Member

Feedback from a user in Bern...
Really useful package... The vignette seems to be OK and the codes transfer well to a project she works on. She likes the warning messages. Wonders whether the warnings could be saved to a dataframe and returned in the object (would form the basis of a query list.... particularly if the participant ID (mnppid/mnpaid) were provided). linkage plots when audit trail is exported is not really helpful (which we know about). She tried to remove the at files manually and then use the linkage function, but it didnt work (as the class was wrong) - consider having a as.secuTrialdata function for such instances?

So in summary, good package and documentation, 2 possible to dos:

  1. add warning messages to a dataframe in the returned object
  2. as.secuTrialdata method

@Inessakraft
Copy link

Really good package! All steps worked out perfectly for my data export. The
return_random_participants-function is great!
I had some common issues that are already listed in frequent warning
messages. I checked the variable that was affected and I remembered that I
renamed the label of the variable after the first release.

My remark is rather formal:

  • Finding changes/differences in exports -> I would rename this in "Finding
    changes/differences in eCRF implementations between two versions"
    --> In my case I needed to change the export options and include the
    structure and document editing. That's why I thought that the
    diff_secuTrial-function would also show me the difference between my
    two export versions that did only differ in the export options...
    But maybe this is a good idea for a future extension of the package: an
    "export options diff function"

@PatrickRWright PatrickRWright pinned this issue Feb 3, 2020
@PatrickRWright
Copy link
Collaborator Author

PatrickRWright commented Feb 20, 2020

Further feedback via email:
Great vignette, thanks a lot for sharing. I worked through it with your example file and about half of it with my own export file, which worked very well. For me, as a data manager, I especially liked the graphs that can easily be plotted. And it was very easy to follow the vignette.

I just have the following general initial installation remarks:

  • I would suggest to add which version of R is needed (I tried it first on my office computer and was stuck, since I needed to update everything - after a stop at the IT office (the ones with the admin rights) it runs fine now)
  • I worked through the vignette with my private computer. And first I was again stuck, because 'Rtools' was missing (and is apparently needed for Windows computers and devtools). I would therefore suggest to add a remark (https://cran.r-project.org/bin/windows/Rtools/).
  • Then I was surprised about the amount of dependencies and how long it took to install the 'devtools' (about 20-30min). I am aware that this is only the first time you run it, but it might be worth mentioning those points in the 'Installing' chapter. I also tried running it under Linux, where it run through a bit faster, but also there it took quite a while getting all the necessary dependencies.

@PatrickRWright
Copy link
Collaborator Author

Regarding the installation: I agree. We currently have an intern here in Basel who ran into the same issues with his Windows machine thus further backing up that point. I will make some additions to the installation paragraph which will hopefully help.

@aghaynes
Copy link
Member

aghaynes commented Feb 20, 2020

Regarding the installation...

As this a point that is only relevant to github, I wouldn't add it to the vignette. Once it's on CRAN, the issue goes away...

@markomi
Copy link
Collaborator

markomi commented Feb 20, 2020

As this a point that is only relevant to github, I wouldn't add it to the vignette. Once it's on CRAN, the issue goes away...

Still, even after publication on CRAN, some people might want to have the newest version from GitHub. Just a short sentence or two should be enough.

@PatrickRWright
Copy link
Collaborator Author

As this a point that is only relevant to github, I wouldn't add it to the vignette. Once it's on CRAN, the issue goes away...

Still, even after publication on CRAN, some people might want to have the newest version from GitHub. Just a short sentence or two should be enough.

I think some very brief comments will not do any harm especially because I would like the package to be easy to use as possible also for people outside of the R world.

This is also relevant to the below comment. I will make a brief reference to an R book that can get people started. I.e. this one

Hi, the vignette is more written for users experienced in R. It is sometimes hard for me who is experienced in secuTrial but not in R. I miss some explanations for R.

@suvi-subra
Copy link

suvi-subra commented Feb 24, 2020

Kudos to the creators of secuTrialR, a very comprehensive and well-documented package!!! Thank you. I especially found the subsetting functions very useful.
One issue I faced while testing the package with another dataset:

data.extraction.date <- as.Date("04.02.2020", format="%d.%m.%Y") ## date of data extract

data.dir <- Sys.glob(paste("me16Engelter/datamanagement/exports/p_export_CSV_NRL08_", 
                                           format(data.extraction.date, "%Y%m%d"), "*",
                                           sep=""))

ESTREL_data <- read_secuTrial(data.dir)

links_secuTrial(ESTREL_data) 

_Error in split.default(x, x$var) : first argument must be a vector_

@OliviaEbnerIAS
Copy link

First of all, I think it’s really great that you started the downstream analysis of secuTrial data using R!

I think the vignette is really a good seed for further data analysis. It is written clearly and comprehensibly and it’s very useful to have a functioning example data set to see how it’s working.

Also the examples of regular expressions are nice for people to realize they can use this as tool for a better of data.

I know you’ve written that “so basic knowledge of R is more or less assumed” but since you also describe some of the basic things, you might want to add one or two of the following as well (just as suggestion):

For one it might be helpful to let people know that they can have a look at the details of the functions within R or via Github, in case they want to know what exactly is done by them.

Another idea would be to let people know up to what point they need to re-run commands if they stopped one day and the return to the vignette the next day (I know you can save your workspace, but just in case).

The export options are well described, however not really in the order they appear on the secuTrial ExportSearchTool. Maybe I’m too much into the screenshot idea (and I know these vignettes are not the place for that) but would be nice just to have a screenshot of the export settings.

I was at little startled by the wording “The majority of the unrecognizable tables”-as I thought that meant some tables could be read by R; it’s probably just me but I would change it to “The majority of the probably unfamiliar tables…”

Page 9: It would be good to label this and the other plots in the vignette, as for example what the color coding white/black means, just to make it easier for users.

Page 10: Maybe add to “numbers for a few forms” “(number= 5)” or “The below table shows both absolute and relative numbers for 5 of the forms (n=5).” -but as you said, basic knowledge of R is required.

Page 11: A short description of the seed would probably be helpful for some people.

The inclusion of common warning messages is very helpful- maybe future messages could be pooled/ documented somewhere as well?

The functions you implemented so far are very useful I think- hope there’s gonna be more contributors soon to implement further analysis!

Trying own data sets

I tested a couple of data sets with different export settings and the function works fine to tell the users for example if the dataset is the wrong format (SAS instead of CSV) or another encoding than UTF-8.

If the export is rectangular (I double checked the export settings), it throws an error “Input to read_secuTrial() appears to be incompatible. Have you exported CSV format?”. This might be misleading as this export is CSV but in rectangular format…just in case people won’t read the export requirements…and I saw that actually this error message should appear:

if (export_options$is_rectangular) {

stop("Your export is rectangular. It can not be loaded with this function.")

}

However, the function only checks for the formal things as far as I have seen- but it does not check for example for completeness of entries like that the tables have the same length as the originally exported ones. Or am I wrong?

I know you could probably only check this at random by for example giving the number of entries in a certain table and the user would need to re-check it manually…

The diff_secuTrial() function works well and I think is very useful as a quick check for differences. Maybe you could add in the description though that hidden (from the visit plan for example) forms/ variables fall into the category of “removed” as well.

I had some problems when running my own data with the functions

annual_recruitment, return_random_participants and links_secuTrial.

The first two were problems with finding columns

Error in [.data.frame(cn, , c("mnpvisstartdate", "mnpctrid")) :
undefined columns selected

Unfortunately I didn’t have time to find the reason for the problem- it might be related to the fact that I tried a couple of things with the dataset; however I did not do any transformations to the original dataset. These columns are present and I figure it might just be a separator problem- maybe on import. The other functions all work fine. I might give it another try at some point.

The links_secuTrial gave me following error:

Error in split.default(x, x$var) : first argument must be a vector

Probably a format problem as well…

I hope this helps…thanks again for starting the R pipeline!

@aghaynes
Copy link
Member

@suvi-subra, @OliviaEbnerIAS

Thanks both of you for pointing out the bug with links_secutrial. I'm looking at it now. It seems to be an issue with identifying the export settings which causes it to fail for exports without the audit trail.

@PatrickRWright
Copy link
Collaborator Author

More feedback via Email:

  • Die Vignette liest sich sehr gut und ich finde sie sehr gut gelungen. Wunderbar mal eine gut geschriebene Vignette durchzuarbeiten!
  • Auch habe ich viel davon profitiert, denn nun kenne ich deutlich mehr Optionen des Pakets als zuvor. Ich finde es sehr, sehr nütlich!
  • Sprachliches Detail: auf Seite 10 könnte der Satz "For a more participant id centred statistic you can perform the following." sprachlich noch verbessert werden. Ich musste ihn 2-3 mal lesen um ihn zu verstehen.
  • Bei der Funktion return_random_participants habe ich eine Warnung erhalten, die ich nicht verstehe:
Warning message:
In rm(list = cmd, envir = .tkplot.env) : object 'tkp.1' not found
  • Abschnitt Frequent warning messages: super, dass es den Abschnitt gibt! Die momentanen Erklärungen sind noch etwas knapp und sagen nicht viel mehr als man von der Warnung her erfährt. Ev. könntet ihr Beispiele machen mit einem Export der diese Probleme aufweist?

  • visit plan and form status: das gibt bei mir eine Fehlermeldung. Ev. weil der Visitenplan nicht fixiert ist? Dann könnte die Fehlermeldung informativer sein.

vs <- visit_structure(my_exp)
# * Error in `[.data.frame`(r, form_order, c("formname", vis_order)) :
# undefined columns selected
  • Form linkage: geht bei mir auch nicht
links_secuTrial(my_exp)
# * Error in split.default(x, x$var) : first argument must be a vector
  • Conversion to SPSS: geht bei mir nicht
write_secuTrial(my_exp, format = "sav", path = tdir)
# * Error: SPSS only supports levels with <= 120 characters
# Problems: `investigatorinfo_f` 

@aghaynes
Copy link
Member

aghaynes commented Mar 5, 2020

Regarding the email

Form linkage: geht bei mir auch nicht
links_secuTrial(my_exp)
Error in split.default(x, x$var) : first argument must be a vector

This should be solved now

visit plan and form status: das gibt bei mir eine Fehlermeldung. Ev. weil der Visitenplan nicht fixiert ist? Dann könnte die Fehlermeldung informativer sein.
vs <- visit_structure(my_exp)
Error in [.data.frame(r, form_order, c("formname", vis_order)) :
undefined columns selected

Yes, this is probably due to lack of a fixed the visit structure. Not sure how we can test for that though...

Conversion to SPSS: geht bei mir nicht
write_secuTrial(my_exp, format = "sav", path = tdir)
Error: SPSS only supports levels with <= 120 characters
Problems: investigatorinfo_f

I'm not sure... this is due to factor level labels being too long I assume? The error is reasonably informative, if you know the dataset then the user could reset the level labels... This type of thing is highly dependent on the data structure.

@PatrickRWright
Copy link
Collaborator Author

She tried to remove the at files manually and then use the linkage function, but it didnt work (as the class was wrong) - consider having a as.secuTrialdata function for such instances?

I'm not a fan of this because it seems rather untidy. The propper way to remove the audit trail would be through the ExportSearchTool (i.e. just reexport with the option off). Thus I don't think we need an as.secuTrialdata function.

@PatrickRWright
Copy link
Collaborator Author

* Finding changes/differences in exports -> I would rename this in "Finding
  changes/differences in eCRF implementations between two versions"

Fair point. I will make an appropriate adjustment.

  ... idea for a future extension of the package: an
  "export options diff function"

I don't think such a function would be too useful, so I will not track the suggestion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants