# ETE4 Tutorial: Annotation and Visualization in Smartview

# Tree node annotation in ETE4

Adding properties to the nodes of a tree is called tree annotation. ETE stores the properties (annotations) of a node in a dictionary called props.

In a phylogenetic tree, the nodes (with their branches) often have names, branch lengths, and branch supports. ETE provides a shortcut for their corresponding properties **name**, **dist**, and **support**, so instead of writing ```n.props.get('name')```, you can write ```n.name```, and similarly for ```n.dist``` and ```n.support```.

The **```Tree.add_prop()```** and **```Tree.add_props()```** methods let you add extra properties (features, annotations) to any node. The first adds one property at a time, while the second adds several properties in a single call.

Similarly, **```Tree.del_prop()```** can be used to delete a property.
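
The relationship between the `props` dictionary, the shortcut attributes, and the three methods can be pictured with a small stand-in class. This is only a sketch under assumptions: `ToyNode` is a hypothetical name, it is not ETE's implementation, and it mirrors only the behavior described above (one shortcut attribute, add one property, add many, delete one).

```python
class ToyNode:
    """Hypothetical stand-in for a tree node; NOT the real ete4.Tree class."""

    def __init__(self, **props):
        self.props = dict(props)  # all annotations live in one dictionary

    @property
    def name(self):
        # Shortcut: n.name reads the same value as n.props.get('name').
        return self.props.get('name')

    @name.setter
    def name(self, value):
        self.props['name'] = value

    def add_prop(self, name, value):
        """Add (or overwrite) a single property."""
        self.props[name] = value

    def add_props(self, **props):
        """Add (or overwrite) several properties in one call."""
        self.props.update(props)

    def del_prop(self, name):
        """Delete a property; a missing name is ignored in this sketch."""
        self.props.pop(name, None)


n = ToyNode(name='A')
n.add_prop('vowel', True)                  # one property at a time
n.add_props(confidence=0.8, color='red')   # several properties at once
n.del_prop('color')                        # remove one again
print(n.name, n.props)
```

Keeping every annotation in a single dictionary is what makes it straightforward to list, export, or delete properties uniformly, while the shortcut attributes keep the most common ones convenient to read.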

Example using annotations when working on a tree:

In [None]:
from ete4 import Tree

t = Tree('((H:0.3,I:0.1),A:1,(B:0.4,(C:0.5,(J:1.3,(F:1.2,D:0.1)))));')

print(t.to_str())

In [None]:
# Reference some nodes (to use later).
A = t['A']  # by name
C = t['C']
H = t['H']
ancestor_JFC = t.common_ancestor(['J', 'F', 'C'])  # by common ancestor

# check out
print(ancestor_JFC.to_str())

In [6]:
# Let's now add some custom features to our nodes.
C.add_props(vowel=False, confidence=1.0)
A.add_props(vowel=True, confidence=0.8)

ancestor_JFC.name = "ancestor_JFC"
ancestor_JFC.add_props(nodetype='internal')

H.add_props(vowel=False, confidence=0.3)

for node in [A, C, H, ancestor_JFC]:
    print(f'Properties of {node.name}: {node.props}')

Properties of A: {'name': 'A', 'dist': 1.0, 'vowel': True, 'confidence': 0.8}
Properties of C: {'name': 'C', 'dist': 0.5, 'vowel': False, 'confidence': 1.0}
Properties of H: {'name': 'H', 'dist': 0.3, 'vowel': False, 'confidence': 0.3}
Properties of ancestor_JFC: {'nodetype': 'internal', 'name': 'ancestor_JFC'}


In [8]:
# Let's annotate by looping over all leaves.
# (Note that this overwrites the previous values.)
for leaf in t:
    is_vowel = leaf.name in 'AEIOU'
    leaf.add_props(vowel=is_vowel, confidence=1)

# Now we use this information to analyze the tree.
print('This tree has', sum(1 for n in t.search_nodes(vowel=True)), 'vowel nodes')
print('They are:', [leaf.name for leaf in t.leaves() if leaf.props['vowel']])


This tree has 2 vowel nodes
They are: ['I', 'A']
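
A note on the vowel test used above: with a string on the right-hand side, Python's `in` is a substring check, so it only behaves as intended for single-character names. If leaf names could be longer (or empty), testing against a set of allowed names is safer. A minimal sketch, where `VOWELS` and `is_vowel_name` are names introduced here for illustration only:

```python
VOWELS = {'A', 'E', 'I', 'O', 'U'}  # hypothetical helper, not part of ETE


def is_vowel_name(name):
    """Return True only if the whole name is exactly one vowel."""
    return name in VOWELS


# Substring check: also true for multi-letter runs and for the empty string.
print('AE' in 'AEIOU', '' in 'AEIOU')  # True True

# Set membership: true only for exact matches.
print(is_vowel_name('AE'), is_vowel_name(''), is_vowel_name('I'))  # False False True
```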


In [9]:
# But features may refer to any kind of data, not only simple values.
# For example, we can calculate some values and store them within nodes.
#
# Let's detect leaves under 'ancestor_JFC' with distance higher than 1.
# Note that it traverses a subtree which starts from 'ancestor_JFC'.
matches = [leaf for leaf in ancestor_JFC.leaves() if leaf.dist > 1.0]

# And save this pre-computed information into the ancestor node.
ancestor_JFC.add_props(long_branch_nodes=matches)

# Print the precomputed nodes.
print('These are the leaves under ancestor_JFC with long branches:',
      [n.name for n in ancestor_JFC.props['long_branch_nodes']])

These are the leaves under ancestor_JFC with long branches: ['J', 'F']


In [10]:
# We can also use the add_props() method to dynamically add new features.
value = input('Custom label value: ')
ancestor_JFC.add_props(label=value)
print(f'Ancestor now has the "label" property with value "{value}":')
print(ancestor_JFC.props)

Custom label value: able
Ancestor now has the "label" property with value "able":
{'nodetype': 'internal', 'name': 'ancestor_JFC', 'long_branch_nodes': [Tree 'J' (0x7f606bfc05d), Tree 'F' (0x7f606bfc065)], 'label': 'able'}


The original newick format did not support adding extra features to a tree. ETE includes support for the **New Hampshire eXtended format (NHX)**, which uses the original newick standard and adds the possibility of saving additional data related to each tree node.

Here is an example of an extended newick representation in which extra information is added to an internal node:

In [None]:
(A:0.3,(B:0.7,(D:0.6,G:0.1):0.6[&&NHX:conf=0.1:name=internal]):0.5);

As you can see, extra node features in the NHX format are enclosed in square brackets. ETE can read and write features in this format, but the encoded information is expected to be exportable as plain text.

The NHX format is automatically detected when reading a newick file, and the detected node properties are added. You can access the information by using ```node.props[prop_name]```.
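
To make the bracket syntax concrete, the sketch below pulls the `key=value` pairs out of each `[&&NHX:...]` comment with a regular expression. It illustrates the notation only and is not ETE's parser: `parse_nhx_comments` is a hypothetical helper, and it assumes that values contain no `:` or `]` characters. All values come back as strings, which reflects the plain-text nature of the format.

```python
import re

# One NHX comment: "[&&NHX:" followed by colon-separated key=value fields, then "]".
NHX_COMMENT = re.compile(r'\[&&NHX:([^\]]*)\]')


def parse_nhx_comments(newick):
    """Return one {key: value} dict per NHX comment, in order of appearance."""
    annotations = []
    for match in NHX_COMMENT.finditer(newick):
        fields = match.group(1).split(':')
        # Keep only well-formed "key=value" fields; split on the first "=".
        pairs = (field.split('=', 1) for field in fields if '=' in field)
        annotations.append({key: value for key, value in pairs})
    return annotations


newick = '(A:0.3,(B:0.7,(D:0.6,G:0.1):0.6[&&NHX:conf=0.1:name=internal]):0.5);'
print(parse_nhx_comments(newick))  # [{'conf': '0.1', 'name': 'internal'}]
```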

Similarly, properties added to a tree can be included within the normal newick representation using the NHX notation. For this, you can call the ```Tree.write()``` method using the props argument, which is expected to be a list with the feature names that you want to include in the newick string. Use ```(props=None)``` to include all the node’s data into the newick string.
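
The three cases of the props argument (a list of names, `None`, and an empty list) can be pictured with a small formatter that builds the NHX comment for a single node. This is a sketch under assumptions, not ETE's writer: `nhx_comment` is a hypothetical helper, and sorting the keys alphabetically is a choice made here for a stable output.

```python
def nhx_comment(props, names=None):
    """Build the "[&&NHX:...]" comment for one node's properties.

    names=None   -> include every property
    names=[...]  -> include only the listed properties (if present)
    names=[]     -> include nothing, so no comment is produced
    """
    if names is None:
        selected = dict(props)
    else:
        selected = {key: props[key] for key in names if key in props}
    if not selected:
        return ''
    fields = ':'.join(f'{key}={value}' for key, value in sorted(selected.items()))
    return f'[&&NHX:{fields}]'


leaf_props = {'vowel': False, 'confidence': 1}
print(nhx_comment(leaf_props, ['vowel']))  # [&&NHX:vowel=False]
print(nhx_comment(leaf_props))             # [&&NHX:confidence=1:vowel=False]
print(repr(nhx_comment(leaf_props, [])))   # ''
```

Every value is written with its plain-text representation, which is why properties holding complex objects do not round-trip through NHX.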

In [11]:
t = Tree('((H:0.3,I:0.1),A:1,(B:0.4,(C:0.5,(J:1.3,(F:1.2,D:0.1)))));')

print(t)

 ╭─┬╴H
─┤ ╰╴I
 ├╴A
 ╰─┬╴B
   ╰─┬╴C
     ╰─┬╴J
       ╰─┬╴F
         ╰╴D


In [13]:
# Add some more properties to leaves:
for leaf in t:
    is_vowel = leaf.name in 'AEIOU'
    leaf.add_props(vowel=is_vowel, confidence=1)

print('NHX notation including the vowel property:')
print(t.write(props=['vowel']))

NHX notation including the vowel property:
((H:0.3[&&NHX:vowel=False],I:0.1[&&NHX:vowel=True]),A:1[&&NHX:vowel=True],(B:0.4[&&NHX:vowel=False],(C:0.5[&&NHX:vowel=False],(J:1.3[&&NHX:vowel=False],(F:1.2[&&NHX:vowel=False],D:0.1[&&NHX:vowel=False])))));


In [14]:
print('NHX notation including all data in the nodes:')
print(t.write(props=None))

NHX notation including all data in the nodes:
((H:0.3[&&NHX:confidence=1:vowel=False],I:0.1[&&NHX:confidence=1:vowel=True]),A:1[&&NHX:confidence=1:vowel=True],(B:0.4[&&NHX:confidence=1:vowel=False],(C:0.5[&&NHX:confidence=1:vowel=False],(J:1.3[&&NHX:confidence=1:vowel=False],(F:1.2[&&NHX:confidence=1:vowel=False],D:0.1[&&NHX:confidence=1:vowel=False])))));


In [15]:
print('Exclude all NHX notation in the nodes:')
print(t.write(props=[]))

Exclude all NHX notation in the nodes:
((H:0.3,I:0.1),A:1,(B:0.4,(C:0.5,(J:1.3,(F:1.2,D:0.1)))));


To read NHX notation you can just read it as a normal newick:

In [23]:
# Load the NHX example from https://www.phylosoft.org/NHX/
nw = """
(((ADH2:0.1[&&NHX:S=human:E=1.1.1.1], ADH1:0.11[&&NHX:S=human:E=1.1.1.1])
:0.05[&&NHX:S=Primates:E=1.1.1.1:D=Y:B=100], ADHY:0.1[&&NHX:S=nematode:
E=1.1.1.1],ADHX:0.12[&&NHX:S=insect:E=1.1.1.1]):0.1[&&NHX:S=Metazoa:
E=1.1.1.1:D=N], (ADH4:0.09[&&NHX:S=yeast:E=1.1.1.1],ADH3:0.13[&&NHX:S=yeast:
E=1.1.1.1], ADH2:0.12[&&NHX:S=yeast:E=1.1.1.1],ADH1:0.11[&&NHX:S=yeast:E=1.1.1.1]):0.1
[&&NHX:S=Fungi])[&&NHX:E=1.1.1.1:D=N];
"""

t = Tree(nw)

print(t.to_str(props=['name', 'S'], compact=True))

                 ╭╴⊗,Primates╶┬╴ADH2,human
     ╭╴⊗,Metazoa╶┤            ╰╴ADH1,human
     │           ├╴ADHY,nematode
╴⊗,⊗╶┤           ╰╴ADHX,insect
     │         ╭╴ADH4,yeast
     ╰╴⊗,Fungi╶┼╴ADH3,yeast
               ├╴ADH2,yeast
               ╰╴ADH1,yeast


In [25]:
# And access the node's properties.
print('S property for the nodes that have it:')
for n in t.traverse():
    if 'S' in n.props:
        print('  %s: %s' % (n.name if n.name else n.id, n.props['S']))

S property for the nodes that have it:
  [0]: Metazoa
  [1]: Fungi
  [0, 0]: Primates
  ADHY: nematode
  ADHX: insect
  ADH4: yeast
  ADH3: yeast
  ADH2: yeast
  ADH1: yeast
  ADH2: human
  ADH1: human


# The Programmable Tree Drawing Engine in ETE4 smartview


Overview
--------

Before exploring the new features and enhancements introduced in ETE v4, it is essential to understand the foundational elements of ETE’s programmable tree drawing engine. Inherited from ETE v3, the following four components form an adaptable backbone for customizing and structuring visualizations:

a) ```TreeStyle``` is a class that can be used to create a custom set of options controlling the general aspect of the tree image. For example, users can modify the scale used to render tree branches, choose between circular and rectangular tree drawing, and customize general settings such as title, footer, and legend.

b) ```NodeStyle``` defines the specific aspect of each node (size, color, background, line type, etc.). A node style can be defined statically and attached to several nodes, or it can be set conditionally so that nodes meeting different conditions get different styles. A ```NodeStyle``` can even change on the fly to adapt to ETE4’s zooming algorithm; this is set through a ```TreeLayout```.

c) ```Face``` objects, also called node faces, are small pieces of extra graphical information that can be linked to nodes (text labels, images, graphs, etc.). ETE3 already provided several types of node faces, ranging from simple text (```TextFace```) and geometric shapes (```CircleFace```) to molecular sequence representations (```SequenceFace```). These faces have been upgraded in ETE4 to work with the large-tree drawing engine.

d) ```TreeLayout``` is a class that defines a layout for a tree: it sets specific styles for both the entire tree and individual nodes, acting as a pre-drawing hook. When a tree is about to be drawn, the elements above (```TreeStyle```, ```NodeStyle```, and the faces of each node) can be set up and modified on the fly and then returned to the drawing engine. A ```TreeLayout``` can therefore be understood as a set of rules describing the tree’s basic settings and how different nodes should be drawn.

Scheme of fundamental components in ETE4's programmable tree drawing engine
![image.png](https://github.com/dengzq1234/ete4_gallery/blob/master/smartview/fundamental_ete4.jpg?raw=true)
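
The idea of a layout as a pre-drawing hook can be illustrated without ETE at all. The sketch below is purely conceptual and makes several assumptions: `ToyLayout`, `prepare_nodes`, and `color_long_branches` are hypothetical names, nodes are plain dictionaries, and none of this reflects how the real `TreeLayout` class is implemented. It only shows the pattern of running each active layout's node function on every node just before drawing.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToyLayout:
    """Hypothetical stand-in for a layout: a name plus an optional per-node hook."""
    name: str
    ns: Optional[Callable[[dict], None]] = None  # called once per node before drawing
    active: bool = True


def prepare_nodes(nodes, layouts):
    """Run every active layout's node hook on every node, in order."""
    for layout in layouts:
        if not layout.active or layout.ns is None:
            continue
        for node in nodes:
            layout.ns(node)
    return nodes


def color_long_branches(node):
    # Rule: nodes with a long branch are drawn in red, the rest in gray.
    node['style'] = {'color': 'red' if node['dist'] > 1.0 else 'gray'}


nodes = [{'name': 'J', 'dist': 1.3}, {'name': 'D', 'dist': 0.1}]
layouts = [
    ToyLayout('long-branches', ns=color_long_branches),
    ToyLayout('disabled', ns=lambda node: node.clear(), active=False),  # skipped
]

for node in prepare_nodes(nodes, layouts):
    print(node['name'], node['style']['color'])
```

Because the rules live in functions rather than in the nodes themselves, the same tree can be drawn in different ways simply by switching layouts on or off.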

## Explore interactive visualization of trees 


ETE's tree drawing engine is fully integrated with a built-in
graphical user interface (GUI) that lets you explore and manipulate
node properties and tree topology. To start the visualization of a
node (tree or subtree), you can simply call the ```Tree.explore()```
method.

One of the advantages of this visualization is that you can use it to
interrupt a given program or analysis, explore the tree, manipulate it,
and continue with the execution. Note that **changes made using the
GUI will be kept after quitting the GUI**. This feature is especially
useful during Python sessions, and it can be used in various environments
by changing the ```keep_server``` argument, including standalone scripts and interactive
sessions such as IPython or Jupyter Notebooks. Below are examples demonstrating
the method's usage in each context.

### Standalone scripts

When running a standalone script, set the ```keep_server``` argument to **True** to keep
the server running.

In [None]:
#!/usr/bin/python3
from ete4 import Tree

def main():
    t = Tree('((a,b),c);')
    t.explore(name="tree1", keep_server=True)

if __name__ == '__main__':
    main()

### Interactive sessions

When running in interactive sessions such as IPython or Jupyter Notebooks,
leave ```keep_server``` at its default value, **False**.

In [7]:
from ete4 import Tree
t = Tree('((a,b),c);')
t.explore()

Added tree tree1 with id 0.


### Verbose mode

To run in verbose mode, set the ```quiet``` argument to **False**; every action
is then printed in the terminal.

In [2]:
from ete4 import Tree
t = Tree('((a,b),c);')
t.explore(quiet=False)

Added tree tree-1 with id 0.


Bottle v0.12.25 server starting up (using WSGIRefServer())...
Listening on http://localhost:5000/
Hit Ctrl-C to quit.

127.0.0.1 - - [03/Nov/2023 15:41:42] "GET / HTTP/1.1" 303 0
127.0.0.1 - - [03/Nov/2023 15:41:43] "GET /trees HTTP/1.1" 200 29
127.0.0.1 - - [03/Nov/2023 15:41:43] "GET /layouts HTTP/1.1" 200 106
127.0.0.1 - - [03/Nov/2023 15:41:43] "GET /trees/0/size HTTP/1.1" 200 29
127.0.0.1 - - [03/Nov/2023 15:41:43] "GET /trees/0/collapse_size HTTP/1.1" 200 2
127.0.0.1 - - [03/Nov/2023 15:41:43] "GET /drawers/RectFaces/0 HTTP/1.1" 200 30
127.0.0.1 - - [03/Nov/2023 15:41:43] "GET /trees/0/ultrametric HTTP/1.1" 200 5
127.0.0.1 - - [03/Nov/2023 15:41:43] "GET /trees/0/nodecount HTTP/1.1" 200 27
127.0.0.1 - - [03/Nov/2023 15:41:43] "GET /layouts/0 HTTP/1.1" 200 106
127.0.0.1 - - [03/Nov/2023 15:41:43] "GET /trees/0/searches HTTP/1.1" 200 16
127.0.0.1 - - [03/Nov/2023 15:41:43] "GET /trees/0/all_selections HTTP/1.1" 200 16
127.0.0.1 - - [03/Nov/2023 15:41:43] "GET /trees/0/all_active HT

### Show leaf node names, branch length and branch support

The ```explore()``` method can optionally show leaf node names, branch lengths, and
branch supports.

In [1]:
from ete4 import Tree
t = Tree()
t.populate(10, random_branches=True)
t.explore(
    show_leaf_name=True, 
    show_branch_length=True,
    show_branch_support=True)

Added tree tree-1 with id 0.


In [None]:
### Render and download 

## Customizing the aspect of trees

Visualization customization is performed through four main elements: ```TreeStyle```, ```NodeStyle```, ```Face```, and ```TreeLayout```.

### Tree Layout

As shown in the scheme of fundamental components in the previous section, a TreeLayout brings together the
tree style, node style, and faces. This makes TreeLayout the most important element of ETE4's drawing engine
for visualizing information beyond the bare tree topology. The ```TreeLayout``` class can be imported
from ```ete4.smartview```. It takes the following arguments:

- *name*: name of the TreeLayout object (required).
- *ts*: a function that sets the tree style.
- *ns*: a function that sets the node style.
- *aligned_faces*: whether to draw faces in aligned position; default *False*.
- *active*: whether the TreeLayout is active; default *True*.
- *legend*: whether to show the legend (which needs to be defined in the tree style function); default *False*.


In [None]:
from ete4 import Tree
from ete4.smartview import TreeLayout

t = Tree()
t.populate(20, random_branches=True)

# define a TreeLayout
tree_layout = TreeLayout(name="MyTreeLayout")

# add TreeLayout to layouts
layouts = []
layouts.append(tree_layout)

# explore tree
t.explore(keep_server=True, layouts=layouts)