Skip to content

Commit

Permalink
Merge pull request #1284 from nextstrain/export-clock-cov-and-std-dev
Browse files Browse the repository at this point in the history
Export clock rate covariance matrix and std dev
  • Loading branch information
huddlej committed Aug 24, 2023
2 parents 5b9ab8d + c28535b commit 435bd29
Show file tree
Hide file tree
Showing 4 changed files with 60 additions and 0 deletions.
5 changes: 5 additions & 0 deletions CHANGES.md
Expand Up @@ -2,11 +2,16 @@

## __NEXT__

### Features

* refine: Export covariance matrix and standard deviation for clock rate regression in the node data JSON output when these values are calculated by TreeTime. These new values appear in the `clock` data structure of the JSON output as `cov` and `rate_std` keys, respectively. [#1284][] (@huddlej)

### Bug fixes

* distance: Improve documentation by describing how gaps get treated as indels and how users can ignore specific characters in distance calculations. [#1285][] (@huddlej)
* Fix help output compatibility with non-Unicode streams. [#1290][] (@victorlin)

[#1284]: https://github.com/nextstrain/augur/pull/1284
[#1285]: https://github.com/nextstrain/augur/pull/1285
[#1290]: https://github.com/nextstrain/augur/pull/1290

Expand Down
6 changes: 6 additions & 0 deletions augur/refine.py
Expand Up @@ -259,6 +259,12 @@ def run(args):
node_data['clock'] = {'rate': tt.date2dist.clock_rate,
'intercept': tt.date2dist.intercept,
'rtt_Tmrca': -tt.date2dist.intercept/tt.date2dist.clock_rate}
# Include the standard deviation of the clock rate, if the covariance
# matrix is available.
if hasattr(tt.date2dist, "cov") and tt.date2dist.cov is not None:
node_data["clock"]["cov"] = tt.date2dist.cov
node_data["clock"]["rate_std"] = np.sqrt(tt.date2dist.cov[0, 0])

if args.coalescent=='skyline':
try:
skyline, conf = tt.merger_model.skyline_inferred(gen=args.gen_per_year, confidence=2)
Expand Down
20 changes: 20 additions & 0 deletions tests/functional/refine/cram/timetree.t
Expand Up @@ -21,3 +21,23 @@ Confirm that TreeTime trees match expected topology and branch lengths.

$ python3 "$TESTDIR/../../../../scripts/diff_trees.py" "$TESTDIR/../data/tree.nwk" tree.nwk --significant-digits 2
{}

Confirm that JSON output includes details about the clock rate.

$ grep -A 15 '\"clock\"' branch_lengths.json
"clock": {
"cov": [
[
.*, (re)
.* (re)
],
[
.*, (re)
.* (re)
]
],
"intercept": .*, (re)
"rate": .*, (re)
"rate_std": .*, (re)
"rtt_Tmrca": .* (re)
},
29 changes: 29 additions & 0 deletions tests/functional/refine/cram/timetree_with_fixed_clock_rate.t
@@ -0,0 +1,29 @@
Setup

$ source "$TESTDIR"/_setup.sh

Try building a time tree with a fixed clock rate and clock std dev.

$ ${AUGUR} refine \
> --tree "$TESTDIR/../data/tree_raw.nwk" \
> --alignment "$TESTDIR/../data/aligned.fasta" \
> --metadata "$TESTDIR/../data/metadata.tsv" \
> --output-tree tree.nwk \
> --output-node-data branch_lengths.json \
> --timetree \
> --coalescent opt \
> --date-confidence \
> --date-inference marginal \
> --clock-rate 0.0012 \
> --clock-std-dev 0.0002 \
> --clock-filter-iqd 4 \
> --seed 314159 &> /dev/null

Confirm that JSON output does not include information about the clock rate std dev, since it was provided by the user.

$ grep -A 4 '\"clock\"' branch_lengths.json
"clock": {
"intercept": .*, (re)
"rate": .*, (re)
"rtt_Tmrca": .* (re)
},

0 comments on commit 435bd29

Please sign in to comment.