---
title: "Home"
site: workflowr::wflow_site
output:
  workflowr::wflow_html:
    toc: false
---

## In Progress

Investigations 8 and 9 implement parallel backfitting updates.

* [Investigation 8.](parallel.html) Parallelizing the backfitting algorithm shows promise.
* [Investigation 9.](parallel2.html) An additional trick is needed to parallelize the backfitting updates performed in this [MASH v FLASH GTEx analysis](https://willwerscheid.github.io/MASHvFLASH/MASHvFLASHgtex3.html).

[Investigation 10.](squarem.html) SQUAREM does poorly on FLASH backfits. DAAREM (a more recent algorithm by one of the authors of SQUAREM) does better, but offers smaller performance gains than parallelization.

[Investigation 11.](random.html) The order in which factor/loading pairs are updated (during backfitting) makes some difference, but not much.

[Investigation 12.](arbitraryV.html) To fit a FLASH model with an arbitrary error covariance matrix, I follow up on a [suggestion](https://github.com/stephenslab/flashr/issues/17) by Matthew Stephens.

[Investigation 13.](nonnegative.html) I use nonnegative priors to obtain a factorization of the GTEx donation matrix.

[Investigation 14.](scalar_tau.html) Tests an implementation of changes to the way `tau` is stored, as discussed [here](https://github.com/stephenslab/flashr/issues/83).

## Still Relevant

Notes 1 and 2 and Investigation 4 describe a way to compute the FLASH objective directly (rather than using the indirect method implemented in `flashr`).

* [Note 1.](obj_notes.html) Notes on computing the FLASH objective function. I derive an explicit expression for the KL divergence between prior and posterior.
* [Note 2.](flash_em.html) An alternate algorithm for optimizing the FLASH objective, using the explicit expression derived in the previous note.
* [Investigation 4.](alt_alg.html) The alternate algorithm agrees with FLASH with respect to both the objective and fit obtained.

Investigations 5a-b and 13 attempt to determine the best default initialization function.

* [Investigation 5a.](init_fn.html) An argument for changing the default `init_fn` to `udv_si_svd` when there is missing data and `udv_svd` otherwise. Based on an analysis of GTEx data.
* [Investigation 5b.](init_fn2.html) More evidence supporting the recommendations in Investigation 5a.
* [Investigation 13.](init_fn3.html) A counterargument. Results in Investigations 5a-b probably depend on the fact that $n$ is small ($n = 44$). For large $n$, setting `init_fn` to `udv_si` is best.

## Archived

The bug causing the problem described in Investigations 1-3 was fixed in version 0.1-13 of package `ebnm`.

* [Investigation 1.](objective.html) The FLASH objective function can behave very erratically.
* [Investigation 2.](objective2.html) This problem only occurs when using `ebnm_pn`, not `ebnm_ash`.
* [Investigation 3.](objective3.html) The objective can continue to get worse as loadings are repeatedly updated. Nonetheless, convergence takes place (from above!).

Investigations 6 and 7 deal with warmstarts, which were implemented in version 0.5-14 of `flashr`.

* [Investigation 6.](warmstart.html) Poor `optim` results can produce large decreases in the objective function. We should use warmstarts when `ebnm_fn = ebnm_pn`.
* [Investigation 7.](warmstart2.html) The advantages of warmstarts are not nearly as compelling when `ebnm_fn = ebnm_ash`.