Feature request: Data cleaner and standardization functions #52

billdenney · 2022-09-29T20:53:56Z

nlmixr2 cleans the data before the model is sent to the integrator (I think mainly in rxode2), but I don't know where all of that cleaning happens. I also think that nlmixr2est has additional needs. It would be helpful to simplify, standardize, and centralize this data cleaning.

Related to #45


billdenney · 2022-09-29T21:00:36Z

My overall thought is that there would likely be two functions:

getStandardColumnNames <- function(data, cols) {
...
}

setStandardColumnNames <- function(data, cols) {
...
}

If cols is missing, the "get" version returns all of the columns that nlmixr2 knows about, with their standardized names (e.g. "CMT" instead of "CmT"), and the "set" version renames those columns to the standardized names. If cols is present, each function operates only on the requested columns.
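
A minimal sketch of how the two functions could behave, assuming a case-insensitive match against a fixed list of standard names. The `standardNames` vector is hypothetical and only for illustration; the list that nlmixr2 actually recognizes may differ.

```r
# Hypothetical list of standard column names (an assumption for this sketch;
# not taken from nlmixr2 or rxode2).
standardNames <- c("ID", "TIME", "DV", "AMT", "EVID", "CMT", "MDV")

# Return a named character vector: names are the standardized names,
# values are the matching column names actually present in `data`.
getStandardColumnNames <- function(data, cols) {
  if (missing(cols)) cols <- standardNames
  # Case-insensitive lookup; columns that are not found become NA
  found <- names(data)[match(toupper(cols), toupper(names(data)))]
  stats::setNames(found, toupper(cols))[!is.na(found)]
}

# Rename the matched columns to their standardized names.
setStandardColumnNames <- function(data, cols) {
  map <- getStandardColumnNames(data, cols)
  names(data)[match(map, names(data))] <- names(map)
  data
}

d <- data.frame(id = 1:2, CmT = c("central", "depot"), conc = c(1.5, 0.2))
getStandardColumnNames(d)                       # ID = "id", CMT = "CmT"
names(setStandardColumnNames(d, cols = "cmt"))  # "id" "CMT" "conc"
```

Returning the mapping from the "get" version, rather than only the new names, keeps the user's original column names available to the caller.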

mattfidler · 2022-09-29T21:59:18Z

I think babelmixr2 or even a separate package is a fine place to put something like this.

I am unsure what value this adds beyond what babelmixr2 already provides.

billdenney · 2022-09-30T19:00:24Z

Moved issue from rxode2 to babelmixr2

billdenney · 2022-09-30T19:06:20Z

@mattfidler, as I'm trying to implement this for the PKNCA linkage, I still want some of the original information that isn't available after the direct data conversion. As an example, if the user gives an ID column as "id", I'd like to know that and to be able to use that column for the subject identifier. The rationale is that I'd like to keep a text subject identifier (e.g. "Study-001-Site-002-Subject-0003") instead of a cleaned, numeric ID.

The PKNCA connection would still work to give initial estimates if I use the cleaned, numeric ID. But, if someone is wanting to track back to "why is this starting clearance 5 when I thought it would be 3", they would not have a link between the "ID" column that PKNCA used and the "ID" column that they originally gave.

So, I think that column name mapping function would still be helpful. I'll make a first pass, and please let me know if it should do something else to improve it.
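
To illustrate the traceability concern, here is a small sketch of a lookup table that links a cleaned numeric ID back to the original text identifier. It assumes numeric IDs are assigned in order of first appearance, which may not be how nlmixr2 assigns them; the data values and the `idMap` name are made up for the example.

```r
# Example data with text subject identifiers (values are invented).
dat <- data.frame(
  id = c("Study-001-Site-002-Subject-0003",
         "Study-001-Site-002-Subject-0003",
         "Study-001-Site-001-Subject-0001"),
  time = c(0, 1, 0),
  stringsAsFactors = FALSE
)

# Assumed cleaning step: numeric IDs in order of first appearance.
origId <- unique(dat$id)
dat$ID <- match(dat$id, origId)

# Lookup table linking the cleaned numeric ID to the original label.
idMap <- data.frame(ID = seq_along(origId), id = origId,
                    stringsAsFactors = FALSE)

# Translate a numeric ID reported downstream back to the user's identifier.
idMap$id[match(2, idMap$ID)]  # "Study-001-Site-001-Subject-0001"
```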

mattfidler · 2022-10-01T02:29:07Z

I think it could be easy enough to use the nlmixrRowNumber and a conversion function to help with this. It could possibly be a thin reference to the merged, standardized dataset.
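
A rough sketch of the row-number idea, assuming the cleaned dataset carries a column holding the index of the original row each record came from. The column is named `nlmixrRowNums` here, following the spelling used later in this thread; the `cleaned` data frame is a hand-built stand-in, not output from nlmixr2.

```r
# Original data as supplied by the user (invented values).
orig <- data.frame(id = c("A-01", "A-01", "B-02"),
                   time = c(0, 1, 0),
                   dv = c(NA, 2.1, NA),
                   stringsAsFactors = FALSE)

# Stand-in for a cleaned dataset: rows may be dropped or reordered, but each
# keeps the index of the original row it came from.
cleaned <- data.frame(ID = c(2L, 1L),
                      TIME = c(0, 1),
                      nlmixrRowNums = c(3L, 2L))

# Recover the original identifier for each cleaned row via the row index.
cleaned$origId <- orig$id[cleaned$nlmixrRowNums]
cleaned$origId  # "B-02" "A-01"
```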

mattfidler · 2022-10-01T02:47:03Z

I am still not clear what this provides. I am sure when you get to it I will understand 😄

billdenney · 2022-10-02T00:41:06Z

😄

I just linked to the function I'm thinking of. The value for the PKNCA link is:

1. The user provides the id columns with unique subject identifiers (like "STUDY1-001-1001"). For interpretability in NCA results, it would be best to have those identifiers reported in the NCA.
   - Or maybe, they give the name of the drug in the cmt column (realizing that it must start with a letter and cannot have dashes or anything that doesn't map to an R name in it).
2. The cleaned column name for id would be ID. PKNCA needs the subject identifier, so the nlmixr2-PKNCA link would go to the ID column and get numbers instead of unique subject identifiers.

Overall, I'm wanting the NCA results to be interpretable so that when they're printed out, the user can have them available for comparison to the final modeled results.

billdenney · 2022-10-18T21:29:58Z

I have an issue with the "nlmixrRowNums" column where all rows are set to 1. I'm having trouble finding where it's set. Can you please help me find where it's set?

mattfidler · 2022-10-18T22:04:39Z

https://github.com/nlmixr2/nlmixr2est/blob/0444e026dbe2d422d0a6cb56fc1f909e214cf025/R/focei.R#L1340

billdenney · 2022-10-19T01:03:54Z

Thanks. My issue was that my input was a tibble, so the length was 1. I made a PR to address that.
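
A short base R illustration of why a tibble can give a length of 1. It assumes the row-number column was built from the length of a single-column subset; the code at the linked line is not reproduced here. A tibble's `[` always returns a tibble, which base R mimics with `drop = FALSE`, and the length of a data frame is its number of columns.

```r
df <- data.frame(id = c("a", "b", "c"), time = c(0, 1, 2))

# Base data.frame: a single-column subset drops to a vector of 3 values.
length(df[, 1])                # 3

# drop = FALSE keeps a one-column data frame, as `[` on a tibble does;
# its length is the number of columns.
length(df[, 1, drop = FALSE])  # 1

# nrow() does not depend on the input class.
nrow(df)                       # 3

# A length-1 sequence is recycled, so every row ends up with the value 1.
df$rowNum <- seq_len(length(df[, 1, drop = FALSE]))
df$rowNum                      # 1 1 1
```

Using `nrow(data)` (or converting with `as.data.frame()` first) avoids the dependence on how `[` behaves for a given class.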

billdenney · 2022-10-23T02:15:06Z

This is handled now.

billdenney transferred this issue from nlmixr2/rxode2 Sep 30, 2022

billdenney mentioned this issue Oct 2, 2022

Provide initial parameter estimates using NCA via PKNCA #48

Merged

billdenney mentioned this issue Oct 19, 2022

Make .foceiPreProcessData() work with tibbles nlmixr2/nlmixr2est#262

Merged

billdenney closed this as completed Oct 23, 2022
