Skip to content

Latest commit

 

History

History
418 lines (322 loc) · 18.7 KB

TODO-freq-dep-fitness.org

File metadata and controls

418 lines (322 loc) · 18.7 KB

Miscell to-do in freq-dep-fitness branch

use LongTests for what is currently under manual?

rfitness: generate representable and local max fitness landscapes

  • as it says
  • can use the OncoSimul to Using_OncoSimulR_to_get_accessible_genotypes_trans_mats.pdf for some of it, as well as code used in Diaz-Uriarte 2018 and Diaz-Uriarte and Vasallo
    • See notes in : prstr/2022-2023/Notas-trabajo.org

samplePop: unif

  • what unif does is not necessarily intuitive. We might want uniform time between first time and last time.

Immediate and simple

  • [ ] plot of onocoSimulIndiv: default for show should be genotypes.
  • [ ] McFL or McFLD: give message if using McFL instead of McFLD.
  • [ ] initSizes: give clear error message if any initSize <= 0 and explain that initSizes not specified are taken as 0.
  • [ ] explain why this is not a problem. It seems it is (you cannot say the WT is 1000) but it is not (WT is not in the model, so it is assumed a fitness of 0). But clarify this in examples:
           r4 <- function(a, b, c, n, k, gt = c("AG", "INV", "GLY", "INV, GLY")){
      data.frame(Genotype = gt,
                 Fitness = c(paste0("(", b, " + ", a, ") / 2 + f_INV * (", b, " - ", a, ") / 2 + (f_GLY + f_INV_GLY) * ((", a, " / 2) - ", n, ")"),
                             paste0(b, " - ", c, " + f_INV * (", c, " / 2) + f_GLY * (2 / 3) * ", c, " + f_INV_GLY * (2 / 3) * ", c),
                             paste0("((",b, " / 2) - ", k, " + ", n, " + ", a, ") + f_INV * (((", b, " - ", a, ") / 2) - ", n, ") - f_GLY * (", n, " + (3 / 4) * ", a, ") + f_INV_GLY * (((", b, " - ", a, ") / 2) - ", n, ")"),
                             paste0("((", b, " / 2) - ", k, " + ", n, " + ", a, ") + f_INV * (((", b, " - ", a, ") / 2) - ", n, ") + f_GLY * (((", b, " - ", a, ") / 2) - (", c, " / 3) - ", n, ") + f_INV_GLY * (((", b," - ", a, ") / 2) - (", c, " / 6) - ", n, ")")),
                 stringsAsFactors = FALSE)
    }
    
           afe4 <- allFitnessEffects(genotFitness = r4(1, 1, 0.1, 0.02, 0.01), frequencyDependentFitness = TRUE, frequencyType = "rel")
           evalAllGenotypes(fitnessEffects = afe4, spPopSizes = c("WT" = 1000, "AG" = 1000, "INV" = 0, "GLY" = 0, "GLY, INV" = 0))
        
  • [ ] Using a constant for fitness should work. See this: the first fails, the second does not. And the second and third are correct.
df3x <- data.frame(Genotype = c("A", "B", "C", "A, C"),
                       Fitness = c("1",
                                   "1",
                                   "1.5",
                                   "2"))

afd3 <- allFitnessEffects(genotFitness = df3x,
                          frequencyDependentFitness = TRUE,
                          frequencyType = "abs")


df4x <- data.frame(Genotype = c("A", "B", "C", "A, C"),
                       Fitness = c("1 + 0 * n_A",
                                   "1",
                                   "1.5",
                                   "2"))

afd4 <- allFitnessEffects(genotFitness = df4x,
                          frequencyDependentFitness = TRUE,
                          frequencyType = "abs")

afd4$full_FDF_spec



df5x <- data.frame(Genotype = c("WT", "A", "B", "C", "A, C"),
                   Fitness = c(
                       "0*n_",
                       "1",
                       "1",
                       "1.5",
                       "2"))

afd5 <- allFitnessEffects(genotFitness = df5x,
                          frequencyDependentFitness = TRUE)

afd5$full_FDF_spec


  • [ ] Make sampleEvery = 0.25 whenever freq-dep-fitness
  • [ ] finish the README explaining nem
  • [ ] convert freq-dep into master
  • [ ] emphasize in a few places (man, vignette) that FDF expressions have to be valid exprTk expressions and give link to page of exprtk.

urgent

  • [ ] LOD with multiple initmutants
    • might not be defined
    • Why not? Just go back all the way.
  • [ ] and check POM too
    • yes, this is well defined

first

  • [ ] can we simulate a constant input rate (e.g., a toxin with a constant input flow?)
  • [ ] examples of stopping conditions with FDF: more sophisticated examples.
  • [ ] fast_peaks and peak_valley: turn into a user usable function
  • [ ] Allow to pass “f_WT” or “n_WT”? But how do you tell if a gene with that name? Well, you should not use a gene named WT
    • So: stop if any gene named WT
    • Probably do not allow for f_WT (besides, “f_” is shorter; but f_WT is more explicit)
  • [ ] make onlyCancer = FALSE the default. And ditto for detection drivers, etc.
  • [ ] Allow spPopSizes (evalAllGenotypes) to be named, and check if matches order.
  • [ ] Same for initSize.
  • [ ] Think about catching silly gene names: f_, n_, etc.
  • [ ] mutator and multispecies: think, since it would affect the species marker genes.
  • [ ] for FDF: any expression of things that do not exist do not make it bail or crash, but return a 0 for that symbol, with a warning given to the user. This can be done in R? In vignette <!– no expression on fitness zero in FDF: zz:FIXME:remove_restriction –>
  • [ ] use the full_fl_spec in fitness evaluations in FDF. That is the single thing that shows all mappings explicitly
  • [ ] FDF: specify fitness functions as f_C, n_WT, f_A,D, etc.
  • [ ] Add a warning about names of genes: avoid “”, “WT”, “Root”
  • [ ] Fix this in vignette: “Let us verify that we have specified what we think we have specified using evalAllGenotypes (we have done this repeatedly in this vignette, for example in 1.3.2 or 1.6 or 5.1. Because of the way the code works, we need to pass the populations sizes at which we want fitness evaluated in evalAllGenotypes (this we do now in allFitnessEffects, but this might be moved to evalAllGenotypes in the future).

Note that when calling allFitnessEffects we have to set the paramenter frequencyDependentFitness to TRUE. Since we are using relative frequencies, we also specify freqType = “rel” (the need to do this might be removed in the future). We will see below (10.3, 10.5, and 10.4) several examples with absolute numbers.”

  • [ ] More comprehensive testing of mutator with FDF? See test.mutatorFDF.R function
  • [ ] Why should fitness of WT be forced to 1? FDF allows to avoid that.
  • [ ] Check that changing rate of null mutation is irrelevant
    • And there is no need to add dummyMutationRate to all genomes
  • [ ] tests for new messages from mutationFromScratch (when that stabilizes)
  • [ ] in Algo2_st: if( (spP.mutation == 0.0) && !(spP.birth <= 0 && mutationPropGrowth) ) { Rcpp::Rcout << “\n Entered Algo2 with mutation rate = 0\n”; if( spP.numMutablePos != 0 ) throw std::range_error(“mutation = 0 with numMutable != 0?”);

}

probably substitute by something else: mutation = 0 should only be the consequence of a birth <= 0. In all other cases, mutation should never be 0, but dummyMutationRate. And even if birth <= 0 and mutator, not clear to me that even then we should not set a mutation=dummyMutationRate (see below).

I think we could have an exception if mutation == 0.0 (actually, less than dummyMutationRate). of course, change code where it should to have mutator produce, as smallest mutation, a dummyMutationRate. Think properly, find tests where we got these messages and exceptions, etc.

I think this would allow us to deal with stupid cases without throwing exceptions. Things like death >> birth, things with death > birth and no mutable positions left, etc, etc. (In fact, I guess Gibson and Bruck could deal with this gently, without excepction.) The simulation would just die, period.

Again, find tests where these messages happen, think them through, and then maybe change.

  • [ ] get coverage to work again. I get “undefined reference to `__gcov_init’” and similar. Look at file coverage.R and try to run it. I just played with it, googling randomly for possible options, but I have no idea what is happening. Seems related to the linker with FitnessLandscape code, though.
  • [ ] Is this code unreachable?
    • if(spP.mutation == 0) { //   spP.W <= -90.0) {
    • if( (spP.mutation == 0.0) && (both in bnb_common.cpp)
  • [X] binary files were changed: explain:
    • [X] OncoSimulR/data/examplesFitnessEffects.RData
    • [X] OncoSimulR/inst/testdata_fee.RData
    • [X] miscell-files/fee.RData
  • [X] test_mutator.R:
    • line 208: is it “genotype” or “Genotype”
  • [X] is stringr really needed? Leave it for now.
  • [X] is combn needed? (see NAMESPACE). Leave it, as it is from utils.
  • [ ] Allow to mix freq and relative freqs in expressions
  • [ ] Remove the need to specify “frequencyDependentFitness”
  • [ ] Check example under additional-examples-freq-dep-fitness. Is it worth adding?
  • [ ] stopping conditions: allow with detectionDrivers?
    • [ ] in general, verify stopping: but then, no reason it shouldn’t as that has not been touched.
      • Some code in fdf-stopping.
      • [ ] add to formal tests.
    • [ ] stop based on “stable results”?
  • [ ] all compilation warnings: fix.
  • [ ] explain in vignette why this was possible without touching OncoSimulR_init.c, RcppExports.cpp
  • [ ] double check from_genot_utils.R (why?)
  • [X] check under Windows: failing in appveyor. NO LONGER A PROBLEM

mingw issue: using Rtools35.exe. NOT AN ISSUE ANTMORE

  • The file to use in Windoze is, for system-wide packages, C:/Users/ramon/.R/Makevars.win [nope, do not go to users/whatever/Documents]
  • The variable is: CXX11FLAGS [using only CXXFLAGS was not overwriding the -O2]
  • Still, with only -O3 or with both -Wa,-mbig-obj -O3 I still get the same problems of too many sections
  • I try with “–no-multiarch” (R CMD INSTALL –no-multiarch) so it only tries to build the 64-bit version:
    • Only with “-O3”: fails with “too many section (52845)”
    • Only with “-Wa,-mbig-obj”:
    • With both “-Wa,-mbig-obj -O3 -“:

Rtools40: SOLVED

  • Several dependencies of OncoSimulR fail: igraph, new, lme4 , pbkrtest (for car), etc.
  • I give up after install one of the dependencies of igraph.
  • After all, this is still using gcc from mingw32

LLVM/clang? IS THIS RELEVANT NOW?

  • It should work, but I do not see how to use clang in windowze.
  • The report from ExpTrk’s autho indictes he can get ExprTk to work under Windowze with clang.
    • Using clang with R: these three would seem to suggest one can use clag:

virtualbox notes

  • the screen size, etc: do “view full screeen mode” and then “auto resize”. Seems to work (?)

second

  • [X] change frequencyType = unemployed by NULL.
  • [X] can we mix freq. with absolute? Yes, because any frequency can be expressed as a ratio of numbers.
  • [X] death rate: cannot become smaller than initial. That would be the default, basely one. Otherwise, it is not possible to get a collapse here, because death rate always adjusted.
    • In bnb_common.cpp, updateRatesFDFMcFarlandLog
    • fixed: McFLD as another model
  • [X] isn’t frequencyType redundant? Couldn’t we guess if from “f” or “n”? So no need for “frequencyType = ‘rel’”

to fix.

  • [X] is allMutator Effects working? Nope. It doesn’t. Now it is.

miscell

  • Add

R_forceSymbols(dll, TRUE);

in void R_init_OncoSimulR(DllInfo *info) {

in OncoSimulR_init.c

see: https://cran.r-project.org/doc/manuals/R-exts.html#Registering-native-routines

  • [X] many long tests will fail without v.1.
  • [X] test where FDF crashes when no WT and no initMutant.
  • [X] vignette: reduce running times
  • [X] The second predator-prey example sucks. In general, those examples I am not sure are properly parameterized.
  • [X] Allow for mutation = 0. Yes, exactly 0. See some of the comments below, but it should be possible. In addition to the “no positions left” we would be able to model in “ecological time”, not in evolutionary time (i.e., just ecological stuff without mutation) once we have arbitrary initial composition.

    Nope. It is not. Mutation of exactly 0 cannot work. See file ./miscell-files/BNB-mutation-0-null-mutation.org

    If I set mu = 0 directly, then I get:

pM = 1

pE = \upgamma/g

pB = 1

Now, plugin those into algorithm 2, there are two problems:

a) The binomial generation can only work if g > \upgamma (i.e., birth rate > death rate). But we should be able to sample even if death is larger than birth (extinction is not guaranteed for short periods even if death > birth).

b) The negative binomial cannot work, as it gets a 0 for the probability (actually, I think this was a typo in the paper or a terminology issue, as you want 1 - pB, not pB; your code does have 1 -pB : negbindev(m, 1.0-pB, iRand);)

  • [X] When mutation rate == dummyMutationRate, wouldn’t it make sense to shortcircuit ti_nextTime_tmax_2_st

so that we directly go to ti = tSample + epsilon; (or + 2 epsilon or whatever, something clearly larger, regardless of numerical issues, than tSample —maybe even largest float possible —but watch out in case we add something to it later; adding 10 or 20 or something of that size should be perfectly OK if the tSample are reasonably small; maybe use an epsilon that works for sure with the tSample, or even return 2*tSample, ensuring certainly larger than tSample

soemthing like: ti = 2 * tSample; if(ti <= tSample) throw_exception(“whatever”)

)

This avoids generating a random number and a pow and calling pM_f_st (sinh and cosh involved). But, especially when mutation == dummyMutationRate because numMutablePos == 0, we know this genotype will never mutate and should never mutate. Recall exchange with Mather about mutation = 0. [2015-04-08 “yet another question about your BNB algorithm ” and my question in https://stats.stackexchange.com/questions/145344/simulating-birth-death-process-with-random-numbers-from-negative-binomial]

NOPE: not a good idea. That prevents mutation to the “null” completely.

See also ./miscell-files/BNB-mutation-0-null-mutation.org

  • [X] more on mutations: I think mutationFromScratch should make sure the smallest value ever returned is dummyMutationRate.

    So all returns should be max(dummyMutationRate, whatever). NOPE! This is now properly done by returning dummyMutationRate where it should and messages of warning

    This would also mean that the

    “inline double pE_f_st(double& pM, const spParamsP& spP){ double pE = (spP.death * (1.0 - pM ) )/(spP.W - spP.death - spP.birth * pM ); if( !std::isfinite(pE) ) { DP2(spP.death); DP2(spP.birth); DP2(pM); DP2(spP.W); DP2(spP.mutation); std::string error_message = R”(pE.f: pE not finite. This is expected to happen when mutationPropGrowth = TRUE and you have have an initMutant with death >> birth, as that inevitably leads to net birth rate of 0 and mutation rate of 0)”; throw std::range_error(error_message); } return pE;

}”

would never give that exception.