Proposal: method for retrieving initial parameter estimates #29

barrettk · 2024-01-11T16:54:37Z

The original motivation stems from here, though I thought it would be nice to have a neat way to return the initial parameter estimates before this step. Additionally, I think scientists might appreciate the ability to easily retrieve and format initial estimates in some reports, that could be fed into pmtables or some other formatting function.

Requirements

For Theta parameters:
- Must be able to retrieve initial estimates, upper and lower bounds (if any), and whether any of the values are FIXED
For Matrix-type parameters (OMEGA and SIGMA)
- Must be able to format all records as a combined matrix if multiple records are found (minus the old method of specifying priors)
- Must be able to return which indices are FIXED. In an offline discussion, it was noted that FIX in block matrices fixes the entire record, though specific parameters can be fixed for diagonal matrices. It was suggested that a logical matrix might be the way to go here.
Some additional requirements/notes can be found in this discussion.

Initial Prototype

Note that this prototype uses dependencies that would not make sense to utilize in this package; I just used them for nicer formatting in a bbr environment. See this comment for some examples.

Code

#' Retrieve and format the initial estimates
#'
#' @param .mod
#'
#' @export
get_param_inits <- function(.mod){

  test_nmrec_version(.min_version = "0.3.0")
  check_model_object(.mod, "bbi_nonmem_model")


  ctl <- nmrec::read_ctl(get_model_path(.mod))

  recs <- list(
    thetas = nmrec:::create_param_index(ctl, "theta"),
    omegas = nmrec:::create_param_index(ctl, "omega"),
    sigmas = nmrec:::create_param_index(ctl, "sigma")
  )

  recs_values <- list(
    theta = fmt_record_values(param_index = recs$thetas),
    omega = fmt_record_values(param_index = recs$omegas),
    sigma = fmt_record_values(param_index = recs$sigmas)
  )


  return(recs_values)
}


#' Format the initial estimates for each record
#'
#' Matrix type records will be formatted as a block diagonal matrix. Multiple
#' records of the same type will be concatenated diagonally.
#'
#' @param param_index object returned from `create_param_index`.
#'
#' @keywords internal
fmt_record_values <- function(param_index){

  details <- param_index$details
  records <- param_index$records
  name <- param_index$name

  # Only grab initial values for theta bounds
  val_recs <- purrr::map(details, function(detail){
    val_rec <- purrr::map_dfr(detail$popts, function(.x){
      vals <- purrr::keep(.x$values, function(x_vals){
        inherits(x_vals, "nmrec_option")
      })

      purrr::map_dfr(vals, function(val){
        tibble::tibble(name = val$name, value = as.character(val$value))
      }) %>% tidyr::pivot_wider()
    })

    if (!identical(name, "theta")) {
      len_expected <- detail$size
      val_rec <- vector_to_matrix_ltri(
        as.numeric(val_rec$init), n = len_expected, type = detail$type
      )
    }
    return(val_rec)
  })

  # TODO: filter out or error if prior records found before this step/function as a whole
  if (identical(name, "theta")) {
    val_recs_fmt <- val_recs %>% purrr::list_rbind()  %>% dplyr::mutate(index = 1:n())
  }else{
    val_recs_fmt <- list(
      matrices = val_recs,
      fixed = purrr::map(records, matrix_is_fixed)
    )

    # Create combined matrix if multiple records found
    if(length(val_recs) != 1){
      val_recs_fmt$full_matrix <- as.matrix(Matrix::bdiag(val_recs))
    }
  }

  return(val_recs_fmt)
}


#' Format vector as a lower diagonal matrix
#'
#' @param vec a vector in row-major order of values.
#' @param n matrix dimension.
#' @param type type of matrix. Either "diagonal" or "block".
#'
#' @keywords internal
vector_to_matrix_ltri <- function(vec, n, type = c("diagonal", "block")) {
  type <- match.arg(type)
  mat <- matrix(0, nrow = n, ncol = n)
  if(type == "diagonal"){
    diag(mat) <- vec
  }else{
    mat[upper.tri(mat, diag = TRUE)] <- vec
  }
  return(t(mat))
}


matrix_is_fixed <- function(.record){
  any_fixed <- purrr::some(.record$values, function(rec_opt){
    if(inherits(rec_opt, "nmrec_option_nested")){
      # Handling for diagonal matrices
      purrr::some(rec_opt$values, function(rec_opt){
        inherits(rec_opt, "nmrec_option_flag") && rec_opt$name == "fixed"
      })
    }else{
      # Handling for block matrices
      inherits(rec_opt, "nmrec_option_flag") && rec_opt$name == "fixed"
    }
  })
}

nmrec prototype

After putting together this prototype and discussing with @kyleam, he put together a prototype more in line with the nmrec code base. From my testing so far, the nmrec prototype Kyle put together seems to return everything I need to put together a bbr wrapper.

I think the only thing that's missing, is a method for separating priors. Im not really sure if prior records should count as initial values, but I dont think erroring out makes sense in this case (unlike in set_{theta,omega,sigma} functions). I wonder if those should be an attribute (probably not) or returned in some other fashion. Im not opposed to just filtering them out, and mentioning that in the docs though.

As Kyle noted in the TODO sections, I think we may have to figure out better names to avoid the overlap with bbr functions (get_{theta,omega,sigma} functions). Somewhat of a shame because I like the naming convention in both cases, but obviously that's a good point given that it's quite possible a user could have both packages loaded.

The text was updated successfully, but these errors were encountered:

barrettk · 2024-01-11T16:55:18Z

@kyleam
What are your thoughts on returning all three types (c("init", "low", "up")) in nmrec::get_theta()? I can see some motivation not to, but at the moment I would probably make a map call to pull all three out. I assume the motivation was just to keep it as simple/precise as possible, but just curious how strongly you felt about that.

Im currently doing something like this:

get_theta_inits <- function(ctl){
  # Tabulate initial values and bounds
  theta_inits <- purrr::map_dfc(c("init", "low", "up"), function(type){
    theta_inits <- nmrec::get_theta(ctl, type = type)
    tibble::tibble(!!rlang::sym(type) := theta_inits)
  })
  
  # Tabulate presence of FIXED parameters
  theta_inits_fixed <- nmrec::get_theta(ctl, mark_flags = "fix")
  theta_inits$fixed <- attr(theta_inits_fixed, "nmrec_flags")$fixed
}

> theta_inits
# A tibble: 12 × 4
    init   low    up fixed
   <dbl> <dbl> <dbl> <lgl>
 1   0.7    NA    NA TRUE 
 2   0.7    NA    NA FALSE
 3   2      NA    NA FALSE
 4   2      NA    NA FALSE
 5   0.7    NA    NA FALSE
 6   0.7    NA    NA FALSE
 7   2      NA    NA FALSE
 8   2      NA    NA FALSE
 9   0.7    NA    NA FALSE
10   0.7    NA    NA FALSE
11   0.3    NA    NA FALSE
12   4      NA    NA TRUE

FWIW im not proposing returning a dataframe, just showcasing that im executing the call a total of 4 times to get the information I need (I realize I could do it 3 times, but this seemed more organized/readable)

kyleam · 2024-01-11T17:52:36Z

What are your thoughts on returning all three types (c("init", "low", "up")) in get_theta()?

I'd prefer not to. I'd like to avoid arguments that change the structure of the return value. We could make get_theta always return a list of vectors for within-get_theta consistency, but that makes the single-type case more awkward and more importantly makes it inconsistent with get_omega and get_sigma.

at the moment I would probably make a map call to pull all three out

Hmm I'm surprised you want to map to a list result. To jitter, presumably you want access to all of those matrices individually, so I think it'd be more to the point and readable to do something like

theta_init <- nmrec::get_theta(..., mark_flags = "fixed", type = "init")
theta_low <- nmrec::get_theta(..., type = "low")
theta_up <- nmrec::get_theta(..., type = "up")

But anyway, that's of course just subjective and depends on context. In either case, my view is that the caller (so bbr in this case) should do the handling for getting multiple types.

kyleam · 2024-01-11T17:56:30Z

Must be able to format all records as a combined matrix if multiple records are found (minus the old method of specifying priors)

As with the set_* functions, I'm not convinced we should try to support the non-informative record names. We should of course clearly document that they are not supported. In the jitter context, you'd currently get to the error once you try to feed the result to set_*. That may be good enough.

barrettk · 2024-01-11T18:00:51Z

I had a thought about prior blocks btw. I think it could be worth looking at bbr::param_labels() + bbr::apply_indices. It seems like these functions don't discriminate between initial estimates and prior estimates. My intuition tells me we should have some synergy between bbr and nmrec here, but of course there are a few ways we could accomplish that (including amending the bbr function to not return priors). Curious what your thoughts are on this (and whether you think it even matters).

> MODEL_DIR_C <- system.file("model", "nonmem", "complex",   package = "bbr")
> .mod <- read_model(file.path(MODEL_DIR_C, "example2_saemimp"))
> .mod %>% param_labels() %>% apply_indices() %>% head(13)
# A tibble: 13 × 5
   parameter_names label unit   type  param_type
   <chr>           <chr> <chr>  <chr> <chr>     
 1 THETA1          ""    "LCLM" ""    THETA     
 2 THETA2          ""    "LCLF" ""    THETA     
 3 THETA3          ""    "CLAM" ""    THETA     
 4 THETA4          ""    "CLAF" ""    THETA     
 5 THETA5          ""    "LV1M" ""    THETA     
 6 THETA6          ""    "LV1F" ""    THETA     
 7 THETA7          ""    "V1AM" ""    THETA     
 8 THETA8          ""    "V1AF" ""    THETA     
 9 THETA9          ""    "MU_3" ""    THETA     
10 THETA10         ""    "MU_4" ""    THETA     
11 THETA11         ""    "SDSL" ""    THETA     
12 THETA12         ""    ""     ""    THETA     # this is a prior record
13 OMEGA(1,1)      ""    ""     "[p]" OMEGA

kyleam · 2024-01-11T18:35:44Z

Sorry, I don't understand what the specific question is. (I think you're saying something beyond this, but if part of it is that param_labels should rely on nmrec for parsing, I agree that that would be a good direction to go. With a quick glance, it probably runs into several of the parsing issues that nmrec is meant to help users avoid.)

barrettk · 2024-01-11T18:53:52Z

I was in part suggesting we could refactor them to use nmrec, though mostly just that I think they should both either A) return priors, or B) filter them out. i.e. that they should be consistent, whichever direction makes more sense. I would lean towards filtering them out.

As an aside, I was wondering if we could add the option to return the block type for matrices, which would return either "diagonal" or "block". The only way I see I can get that information currently, is via create_param_index, which I obviously cant rely on. I know the matrix type isn't actually a 'flag', but I was looking to be able to generate indices for the initial estimates.

nmrec::get_omega(ctl, mark_flags = c("fix", "type"))

kyleam · 2024-01-11T19:02:20Z

I was in part suggesting we could refactor them to use nmrec [...]

Again, I'm not keen on doing anything about the non-informative record names at the nmrec level. If you really want to do it, I'd suggest do it on the bbr side first making use of what nmrec will give you publicly. But, as I mentioned before, I don't think looking at the $PRIOR record is sufficient to cover all cases.

but I was looking to be able to generate indices for the initial estimates.

It's returning the full matrix, so don't you already know each index?

barrettk · 2024-01-11T19:10:23Z

Again, I'm not keen on doing anything at the nmrec level for this. If you really want to do it, I'd suggest do it on the bbr side first making use of what nmrec will give you publicly. But, as I mentioned before, I don't think looking at the $PRIOR record is sufficient to cover all cases.

Yeah I think modifying param_labels()/apply_indices makes the most sense.

It's returning the full matrix, so don't you already know the index?

Yes I could get the indices that way, but I was trying to take advantage of those two bbr helpers so I could bind the lower triangular matrix to the data returned from param_labels()/apply_indices, just so we could add some of that additional information. I currently have to pass c(block(4), block(4)) to get that data:

> param_labels(.mod) %>% apply_indices(.omega = c(block(4), block(4))) %>% 
dplyr::filter(param_type != "THETA") %>% head(15)
# A tibble: 15 × 5
   parameter_names label unit  type  param_type
   <chr>           <chr> <chr> <chr> <chr>     
 1 OMEGA(1,1)      ""    ""    [p]   OMEGA     
 2 OMEGA(2,1)      ""    ""    [f]   OMEGA     
 3 OMEGA(2,2)      ""    ""    [p]   OMEGA     
 4 OMEGA(3,1)      ""    ""    [f]   OMEGA     
 5 OMEGA(3,2)      ""    ""    [f]   OMEGA     
 6 OMEGA(3,3)      ""    ""    [p]   OMEGA     
 7 OMEGA(4,1)      ""    ""    [f]   OMEGA     
 8 OMEGA(4,2)      ""    ""    [f]   OMEGA     
 9 OMEGA(4,3)      ""    ""    [f]   OMEGA     
10 OMEGA(4,4)      ""    ""    [p]   OMEGA     
11 OMEGA(5,5)      ""    ""    [A]   OMEGA     
12 OMEGA(6,5)      ""    ""    [A]   OMEGA     
13 OMEGA(6,6)      ""    ""    [A]   OMEGA     
14 OMEGA(7,5)      ""    ""    [A]   OMEGA     
15 OMEGA(7,6)      ""    ""    [A]   OMEGA

Those functions only work with defined values, so the following would not work with the full OMEGA matrix:

> param_labels(.mod) %>% apply_indices(.omega = c(block(8))) %>% dplyr::filter(param_type != "THETA") %>% head(15)
Error in apply_indices(., .omega = c(block(8))) :
Found 20 parameters for OMEGA but `.omega` argument specifies 36

Perhaps this is a bit overkill and out of scope though

kyleam · 2024-01-11T19:33:56Z

I was trying to take advantage of those two bbr helpers so I could bind the lower triangular matrix to the data returned from param_labels()/apply_indices

Let's see what the options are on bbr's end for handling the full matrix before adding redundant info on nmrec's end. If we decide that using these helpers is really the cleanest/easiest way to go, then we can revisit tacking on additional info.

seth127 · 2024-01-11T19:50:03Z

@barrettk I haven't fully digested all of this, but I'll jump in to say that I would prefer not to rely on bbr's param_labels() or apply_indices() functions. Those are pretty old and were intended to help with parsing parameter metadata from an old convention that we don't actively use anymore. Unless we really need to, I don't think we should have newer functionality relying on them. Does that make sense?

As an addendum, I think converting these to use nmrec could be a good idea, but is probably low priority for the reasons I stated in the previous paragraph. These currently rely on this old raw string parsing function which, as @kyleam notes, likely hits some of the pitfalls that nmrec was designed to avoid.

bbr is adding a feature to jitter the initial estimates from a control stream. To do so, it could loop through the initial estimate options and jitter each one, but it's probably more convenient to be able to work with the values in their native form (i.e. a numeric vector for THETA or square matrix for OMEGA/SIGMA). Note that naming these new functions get_{theta,omega,sigma} would align nicely with the existing set_{theta,omega,sigma} functions, but the get_* names would collide with bbr functions if a user attaches both bbr and nmrec. Re: #29

kyleam mentioned this issue Jan 18, 2024

add functions for getting inits #30

Merged

kyleam closed this as completed in #30 Jan 18, 2024

barrettk mentioned this issue Jan 23, 2024

Get initial parameter estimates and tweak initial parameter estimates metrumresearchgroup/bbr#635

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: method for retrieving initial parameter estimates #29

Proposal: method for retrieving initial parameter estimates #29

barrettk commented Jan 11, 2024 •

edited

Loading

barrettk commented Jan 11, 2024 •

edited

Loading

kyleam commented Jan 11, 2024

kyleam commented Jan 11, 2024 •

edited

Loading

barrettk commented Jan 11, 2024 •

edited

Loading

kyleam commented Jan 11, 2024

barrettk commented Jan 11, 2024

kyleam commented Jan 11, 2024 •

edited

Loading

barrettk commented Jan 11, 2024 •

edited

Loading

kyleam commented Jan 11, 2024

seth127 commented Jan 11, 2024

Proposal: method for retrieving initial parameter estimates #29

Proposal: method for retrieving initial parameter estimates #29

Comments

barrettk commented Jan 11, 2024 • edited Loading

Requirements

Initial Prototype

nmrec prototype

barrettk commented Jan 11, 2024 • edited Loading

kyleam commented Jan 11, 2024

kyleam commented Jan 11, 2024 • edited Loading

barrettk commented Jan 11, 2024 • edited Loading

kyleam commented Jan 11, 2024

barrettk commented Jan 11, 2024

kyleam commented Jan 11, 2024 • edited Loading

barrettk commented Jan 11, 2024 • edited Loading

kyleam commented Jan 11, 2024

seth127 commented Jan 11, 2024

barrettk commented Jan 11, 2024 •

edited

Loading

barrettk commented Jan 11, 2024 •

edited

Loading

kyleam commented Jan 11, 2024 •

edited

Loading

barrettk commented Jan 11, 2024 •

edited

Loading

kyleam commented Jan 11, 2024 •

edited

Loading

barrettk commented Jan 11, 2024 •

edited

Loading