Frequently asked questions
==========================
Q: I found the right motifs, but the reported p-values are not significant.
What does this mean?
A: Unfortunately, this sometimes happens. It has been observed, for example,
for CLIP data of the HuR motif. In that case real biological sequences were
used as controls, and the right motif was reported. While the motif occurs more
frequently in the signal sequences than in the control sequences, the
difference between the relative occurrence frequencies is small. Apparently,
this is due to the low sequence complexity of the motif, which leads to an
elevated, and perhaps unusually high, occurrence frequency in the control
sequences. Because the difference in relative occurrence frequency is small,
the resulting p-value is not low enough to reject the hypothesis that
occurrence of the motif is independent of the signal and control conditions.
In this case the found motif is assumed to be the true motif, which suggests
that the p-values computed by Discrover may sometimes be overly strict.
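
To see how a small difference in relative occurrence frequencies can fail to
reach significance, here is a minimal sketch using a plain Pearson chi-square
test on a 2x2 table. This is not how Discrover computes its p-values; the test,
the function name, and all counts are hypothetical and chosen for illustration.

```python
import math


def chi2_pvalue_2x2(signal_hits, signal_total, control_hits, control_total):
    """Pearson chi-square test (1 degree of freedom) for a 2x2 table of
    sequences with / without a motif occurrence in signal vs. control.

    Illustrative only; not the p-value calculation used by Discrover.
    """
    a, b = signal_hits, signal_total - signal_hits
    c, d = control_hits, control_total - control_hits
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 d.o.f.
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts: a low-complexity motif that is also common in controls.
p_low_complexity = chi2_pvalue_2x2(90, 100, 84, 100)

# Hypothetical counts: a specific motif that is rare in controls.
p_specific = chi2_pvalue_2x2(60, 100, 10, 100)

print(p_low_complexity, p_specific)
```

With these made-up counts the first motif occurs in most signal sequences, yet
its p-value stays above the usual 0.05 level because the controls contain it
almost as often.
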
We recommend in cases like these (or actually: generally) to take into account
all available evidence, including prior knowledge; in the case of the HuR
motif, this is the fact that the motif is assumed to be the true motif. It may
not always be possible to automate this in a simple way, but Bayesian reasoning
should guide the way. Sometimes it is simply necessary to override the
conclusions of automated reasoning, such as the p-value calculation in our
tool, with a judgement call. In this regard, [1] is a recommended resource.
[1] Edwin Thompson Jaynes (2003). Probability Theory: The Logic of Science.
Cambridge University Press.
Q: Why is it that sometimes running the HMM parameter learning with identical
parameters and multiple threads does not yield identical results?
A: When using multiple threads, work is distributed dynamically to the CPU
cores. The expected statistics and the gradient are computed and accumulated
within each thread, and the assignment of chunks of work to the threads is not
constant between runs. Because floating-point addition is not associative, the
order of summation affects the rounding, which leads to numerically differing
results.
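
The effect of summation order can be reproduced in a few lines. The following
is a minimal sketch, not Discrover code: the values and the two chunk
assignments are made up, and each chunk stands in for the work handled by one
thread.

```python
def accumulate(chunk):
    """Add the values of one chunk left to right, as one worker might."""
    total = 0.0
    for value in chunk:
        total += value
    return total


def combine(chunks):
    """Sum each chunk separately, then add up the per-chunk partial sums."""
    total = 0.0
    for chunk in chunks:
        total += accumulate(chunk)
    return total


# The same four values, split into chunks in two different ways
# (hypothetical stand-ins for two different thread schedules).
assignment_a = [[1e16, 1.0], [-1e16, 1.0]]
assignment_b = [[1e16, -1e16], [1.0, 1.0]]

result_a = combine(assignment_a)
result_b = combine(assignment_b)

# Same inputs, different grouping, different floating-point result.
print(result_a, result_b, result_a == result_b)
```

The magnitudes here are exaggerated to make the discrepancy obvious; in
practice the differences are typically confined to the last few digits.
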
However, whenever the sequences contain clearly recognizable motifs, this
effect should be small, and the motifs should be discovered invariably, albeit
perhaps not identically.
Q: When using hybrid learning mode, why does it appear that the discriminative
objective function's value does not strictly increase in each iteration, even
though the reported relative increase may always be positive?
A: During hybrid learning, each iteration first applies a gradient step on the
discriminative parameters for the discriminative objective function, followed
by an expectation maximization step for the generative parameters. The
generative updates often modify the parameters so as to decrease the objective
function's value. The reported relative increase of the discriminative
objective function is measured across the gradient step only, i.e. relative to
the value immediately before that step is applied.
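
The interplay of the two steps can be illustrated with a toy example. This is
only a sketch under strong assumptions: the objective, the update rules, and
all names are hypothetical stand-ins and have nothing to do with Discrover's
actual objective functions or HMM parameters.

```python
def objective(x, y):
    """Toy 'discriminative' objective; larger is better."""
    return -(x - 1.0) ** 2 - y ** 2


def gradient_step(x, step_size=0.1):
    """Gradient ascent on x; d objective / dx = -2 (x - 1)."""
    return x + step_size * (-2.0 * (x - 1.0))


def generative_step(y, target=2.0):
    """Stand-in for an EM update: moves y toward the optimum of a
    different criterion, ignoring the toy objective above."""
    return y + 0.5 * (target - y)


x, y = 0.0, 0.0
reported_increases = []
values_after_iteration = [objective(x, y)]

for _ in range(5):
    before = objective(x, y)
    x = gradient_step(x)
    after_gradient = objective(x, y)
    # Relative increase measured across the gradient step only.
    reported_increases.append((after_gradient - before) / abs(before))
    y = generative_step(y)
    values_after_iteration.append(objective(x, y))

print(reported_increases)
print(values_after_iteration)
```

In this toy setting every reported relative increase is positive, while the
objective value at the end of each iteration still goes down because of the
second update.
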
Q: Why is it that the (intermediate and final) scores reported during
optimization of HMM parameters are often smaller than those resulting from
seeding?
A: The HMM parameters are optimized by maximizing a smooth function based on
expected numbers of sites (defined by the posterior occurrence probability of
the HMM motif), while seeding uses integer counts of sequences that either have
or do not have an occurrence of a given IUPAC regular expression. The posterior
probabilities that are summed into expected values for the HMM parameter
optimization do not separate occurrences in such a black and white manner as
the integer counts of occurrences of IUPAC regular expressions do.
The scores based on expected values that are reported during and after training
are thus typically smaller than those reported after seeding.
Note that, importantly, the scores computed for the learned HMM parameters from
the Viterbi parses of the sequences are in turn typically greater than those
reported after seeding.
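
The difference between expected counts and integer counts can be made concrete
with a small sketch. It assumes mutual information between condition and motif
occurrence as the score and a 0.5 threshold as a crude stand-in for a
clear-cut occurrence decision; both are assumptions made for illustration, and
the posterior probabilities are made up.

```python
import math


def mutual_information(signal_with, signal_total, control_with, control_total):
    """Mutual information (in bits) of a 2x2 table of condition vs. motif
    occurrence. Counts may be integers or expected (fractional) counts."""
    table = [
        [signal_with, signal_total - signal_with],
        [control_with, control_total - control_with],
    ]
    n = signal_total + control_total
    row = [table[0][0] + table[0][1], table[1][0] + table[1][1]]
    col = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    mi = 0.0
    for i in range(2):
        for j in range(2):
            if table[i][j] > 0:
                p = table[i][j] / n
                mi += p * math.log2(p * n * n / (row[i] * col[j]))
    return mi


# Hypothetical posterior occurrence probabilities, one per sequence.
signal_posteriors = [0.9, 0.8, 0.7, 0.9]
control_posteriors = [0.1, 0.2, 0.3, 0.1]

# Expected counts: sum the posteriors (soft decisions).
soft_score = mutual_information(
    sum(signal_posteriors), len(signal_posteriors),
    sum(control_posteriors), len(control_posteriors),
)

# Integer counts: each sequence either has an occurrence or not.
hard_score = mutual_information(
    sum(1 for p in signal_posteriors if p >= 0.5), len(signal_posteriors),
    sum(1 for p in control_posteriors if p >= 0.5), len(control_posteriors),
)

print(soft_score, hard_score)
```

Because the fractional counts blur the separation between the two conditions,
the score computed from them comes out lower than the one computed from the
yes/no counts for the same sequences.
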
Q: I observed that using Markov chain Monte Carlo (MCMC) sampling to learn HMM
parameters (with the `--sampling` command line switch) seems to yield higher
scores than when using gradient learning. Why is that?
A: Gradient learning uses the hybrid learning mode described in the publication.
In the hybrid mode only the emission parameters are optimized so as to increase
the objective function, while the transition parameters are optimized for the
likelihood.
The MCMC mode of Discrover, on the other hand, does not support such multiple-
objective learning modes, and instead optimizes all parameters for the chosen
objective function.
If you want to have a fair comparison, you can use the `--bglearn gradient`
command line switch with the gradient-based approaches to optimize all
parameters, including the transition parameters, for the chosen objective.