The diverse_seq apps are available for use within cogent3. They are obtained through cogent3's standard mechanism for getting apps.

In [1]:
from cogent3 import available_apps

The available apps can be listed using the `available_apps()` function with the `name_filter` argument set to `"dvs"`.

In [2]:
available_apps(name_filter="dvs")

| package | name | composable | doc | input type | output type |
|---|---|---|---|---|---|
| diverse_seq | dvs_max | True | select the maximally divergent seqs from a sequence collection | Alignment, ArrayAlignment, SequenceCollection | Alignment, ArrayAlignment, SequenceCollection |
| diverse_seq | dvs_nmost | True | select the n-most diverse seqs from a sequence collection | Alignment, ArrayAlignment, SequenceCollection | Alignment, ArrayAlignment, SequenceCollection |


You can get help on an app using the cogent3 `app_help()` function.

In [3]:
from cogent3.app import app_help

app_help("dvs_max")

Overview
--------
select the maximally divergent seqs from a sequence collection

Options for making the app
--------------------------
dvs_max_app = get_app(
    'dvs_max',
    min_size=5,
    max_size=30,
    stat='stdev',
    moltype='dna',
    include=None,
    k=6,
    seed=None,
)

Parameters
----------
min_size
    minimum size of the divergent set
max_size
    the maximum size of the divergent set
stat
    either stdev or cov, which represent the statistics
    std(delta_jsd) and cov(delta_jsd) respectively
moltype
    molecular type of the sequences
include
    sequence names to include in the final result
k
    k-mer size
seed
    random number seed

Notes
-----
If called with an alignment, the ungapped sequences are used.
The order of the sequences is randomised. If include is not None, the
named sequences are added to the final result.

Input type
----------
SequenceCollection, ArrayAlignment, Alignment

Output type
-----------
SequenceCollection, ArrayAlignment, Alignment
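The `stat` option above refers to `std(delta_jsd)` and `cov(delta_jsd)`. As a rough illustration only, the sketch below assumes that `delta_jsd` is the change in Jensen-Shannon divergence (JSD) of the k-mer frequencies when one sequence is left out of the set, and that `cov` means the coefficient of variation. Neither assumption is confirmed by the help text. It uses only the standard library and is not the diverse_seq implementation; the helper names (`kmer_freqs`, `entropy`, `jsd`, `delta_jsd`) are hypothetical.

```python
from collections import Counter
from math import log2
from statistics import mean, stdev


def kmer_freqs(seq, k):
    """Return relative frequencies of the overlapping k-mers in seq."""
    counts = Counter(seq[i : i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def entropy(freqs):
    """Shannon entropy (in bits) of a k-mer frequency mapping."""
    return -sum(p * log2(p) for p in freqs.values() if p > 0)


def jsd(freq_list):
    """Jensen-Shannon divergence of equally weighted distributions."""
    n = len(freq_list)
    mixture = Counter()
    for freqs in freq_list:
        for kmer, p in freqs.items():
            mixture[kmer] += p / n
    # entropy of the mixture minus the mean entropy of the members
    return entropy(mixture) - sum(entropy(f) for f in freq_list) / n


def delta_jsd(freq_list):
    """ASSUMED definition: JSD of the full set minus JSD without sequence i."""
    total = jsd(freq_list)
    return [
        total - jsd(freq_list[:i] + freq_list[i + 1 :])
        for i in range(len(freq_list))
    ]


# toy sequences with k=2 to keep the example small (the app default is k=6)
seqs = ["ACGTACGTAC", "ACGTACGTAA", "TTTTGGGGCC", "GGCCTTAAGG"]
freqs = [kmer_freqs(s, k=2) for s in seqs]
deltas = delta_jsd(freqs)

sd = stdev(deltas)  # analogue of stat="stdev"
avg = mean(deltas)
cov = sd / avg if avg else float("nan")  # ASSUMED: coefficient of variation
print(deltas, sd, cov)
```

Under these assumptions, a sequence whose removal lowers the set's JSD the most contributes the most diversity, and the two statistics summarise how unevenly those contributions are spread.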


You can copy the example call from the help display and adjust its arguments to create your app instance. Here `max_size` is raised to 40.

In [4]:
from cogent3.app import get_app

app = get_app(
    "dvs_max",
    min_size=5,
    max_size=40,
    stat="stdev",
    moltype="dna",
    k=6,
    seed=None,
)
app

dvs_max(min_size=5, max_size=40, stat='stdev', moltype='dna', include=None, k=6,
seed=None)

We now load a single alignment from the included sample data and apply the `dvs_max` app to it. As the help notes, the app uses the ungapped sequences when given an alignment. (This alignment was chosen because the estimated tree was not completely terrible!)

In [5]:
from cogent3 import get_app, open_data_store

in_data = open_data_store("mammals-aligned.zip", suffix="fa", mode="r")
loader = get_app("load_aligned", moltype="dna")
aln = loader(in_data[64])

> **Note**
> Successive calls to the app can return different results as the sequence order is randomised.
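The help output lists `seed` as the random number seed, so fixing it is presumably how to make runs repeatable. The sketch below illustrates that general idea with the standard library only. It does not call diverse_seq, and `shuffled_names` is a hypothetical helper, not part of either package.

```python
import random


def shuffled_names(names, seed=None):
    """Return names in a random order; a fixed seed fixes the order."""
    rng = random.Random(seed)  # seed=None draws from system entropy
    order = list(names)  # copy so the caller's list is left untouched
    rng.shuffle(order)
    return order


names = ["Platypus", "Sloth", "Shrew", "Pika", "Tenrec", "Wallaby"]
first = shuffled_names(names, seed=42)
second = shuffled_names(names, seed=42)
print(first == second)  # same seed, same order
```

Whether `get_app("dvs_max", seed=...)` yields identical selections across runs is an assumption worth checking against your installed version.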

In [6]:
selected = app(aln)
selected

| | 0 |
|---|---|
| Platypus | `---------------------------------ATGGCAGAGAATGGAAAAGATT---GT` |
| Sloth | `...................................................A........` |
| Shrew | `.................................----------------------...--` |
| Pika | `...................................................A........` |
| Tenrec | `..................................................GAG..GTC..` |
| Wallaby | `...................................................A........` |


To show how the sampled sequences are dispersed across the phylogeny, we first estimate a neighbour-joining (NJ) tree from paralinear distances, then colour the edges leading to the selected sequences red.

In [7]:
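import project_path  # local helper module for this project, not part of cogent3

write_pdf = project_path.pdf_writer()

# estimate an NJ tree from paralinear distances, root it using Platypus,
# and build a figure from it
dnd = (
    aln.quick_tree(calc="paralinear")
    .rooted_with_tip("Platypus")
    .get_figure(width=1400, height=1600)
)
dnd.tip_font = dict(size=32, family="Inconsolata")
dnd.label_pad = 0.003
dnd.line_width = 3
dnd.scale_bar = None  # hide the scale bar
# highlight the edges for the selected sequences in red
dnd.style_edges(edges=selected.names, line={"color": "red", "width": 3})
outpath = project_path.FIG_DIR / "selected_edges.pdf"
# dnd.show()
write_pdf(dnd.plotly_figure, outpath)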