# Phylo2Vec demo

Welcome to the Phylo2Vec demo! Here, we will quickly visit the main functions of Phylo2Vec, including:
* How to sample random tree topologies (cladograms) as Phylo2Vec vectors
* How to convert Phylo2Vec vectors to Newick format and vice versa
* How to sample random trees with branch lengths (phylograms) as Phylo2Vec matrices
* How to convert these matrices to Newick format and vice versa
* Other useful operations on Phylo2Vec vectors

Note that the current version of Phylo2Vec (as of 29/04/2025) relies on a core written in Rust, with bindings to Python and R. This brings significant speed-ups, allowing the manipulation of large trees (up to ~100,000 to 1 million leaves). To become more familiar with Rust, we recommend this [interactive book](https://rust-book.cs.brown.edu/experiment-intro.html).

## 1. Imports

### 1.1. Rust core

* Currently, most functions of Phylo2Vec are written in Rust. Rust is a good fit for this project because of its speed, type and memory safety, and portability to high-level languages like Python and R.
* Most functions written in Rust are available in Python (and soon in R) via [PyO3](https://pyo3.rs/v0.24.2/), which provides Rust bindings for Python (exposed here as a module called ```_phylo2vec_core```). Thus, most Python functions are thin wrappers around the Rust functions.

### 1.2. Other dependencies

* The Python side of Phylo2Vec has few dependencies apart from [NumPy](https://numpy.org/) and [numba](https://numba.pydata.org/) for array manipulations. Here, we will also use [ete](https://github.com/etetoolkit/ete), a useful Python toolkit for tree manipulation and visualisation.

In [1]:
import os

import numpy as np

from ete3 import Tree

# To run the notebook here, we need to change the working directory
# to py-phylo2vec, which is the parent directory of the Python package.
os.chdir("../py-phylo2vec")

import phylo2vec._phylo2vec_core as core

## 2. Core functions

### 2.1. Sampling a random tree

Use ```sample_vector``` to sample a random tree topology. This function takes two arguments:
* ```n_leaves```, the desired number of leaves/tips in the tree
* ```ordered```, a boolean to sample _ordered_ or _unordered_ trees. This notion was introduced in our [published article](https://doi.org/10.1093/sysbio/syae030) ([preprint](https://arxiv.org/abs/2304.12693)) - a brief summary is described below:

| Characteristic             | **Ordered**                   | **Unordered**                  |
|----------------------------|-------------------------------|--------------------------------|
| Constraint                 | ```v[i]``` $\in \{0, 1, \ldots, i\}$ | ```v[i]``` $\in \{0, 1, \ldots, 2i\}$ |
| Description                | Similar to birth processes    | Bijection of binary tree space |
| Meaning of ```v[i]```      | leaf that forms a cherry with leaf $i$ at iteration $i$  | branch that splits and yields leaf $i$ at iteration $i$ |


In [4]:
from phylo2vec import sample_vector

sample_vector?

Signature: sample_vector(n_leaves: int, ordered: bool = False) -> numpy.ndarray
Docstring:
Sample a random tree via Phylo2Vec, in vector form.

Parameters
----------
n_leaves : int
    Number of leaves
ordered : bool, optional
    If True, sample an ordered tree, by default False

    True:
    v_i in {0, 1, ..., i} for i in (0, n_leaves-1)

    False:
    v_i in {0, 1, ..., 2*i} for i in (0, n_leaves-1)

Returns
-------
numpy.ndarray
    Phylo2Vec vector
File:      ~/src/phylo2vec/public_fork2/py-phylo2vec/phylo2vec/utils/vector.py
Type:      function

In [3]:
v = sample_vector(n_leaves=7)

print(repr(v))

array([0, 0, 1, 5, 8, 5])


In [4]:
v_ordered = sample_vector(n_leaves=7, ordered=True)

print(repr(v_ordered))

array([0, 0, 0, 2, 2, 2])
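
The two constraints in the table above can be sketched in plain Python. This is only an illustration, not the library's Rust implementation: the helper names `sample_vector_sketch` and `count_vectors` are hypothetical, and drawing each entry uniformly and independently is an assumption.

```python
import random


def sample_vector_sketch(n_leaves, ordered=False, rng=None):
    """Draw v[i] from {0..i} (ordered) or {0..2i} (unordered).

    Assumption: entries are drawn uniformly and independently.
    """
    rng = rng or random.Random()
    # A tree with n_leaves leaves is encoded by n_leaves - 1 entries.
    return [
        rng.randint(0, i if ordered else 2 * i)  # randint is inclusive
        for i in range(n_leaves - 1)
    ]


def count_vectors(n_leaves, ordered=False):
    """Number of distinct vectors allowed by the constraint."""
    total = 1
    for i in range(n_leaves - 1):
        total *= (i + 1) if ordered else (2 * i + 1)
    return total


rng = random.Random(0)
v_sketch = sample_vector_sketch(7, rng=rng)
v_ordered_sketch = sample_vector_sketch(7, ordered=True, rng=rng)
print(v_sketch, v_ordered_sketch)
print(count_vectors(7), count_vectors(7, ordered=True))  # 10395 720
```

For 7 leaves, the unordered constraint allows 1·3·5·7·9·11 = 10395 vectors, which equals (2n-3)!!, the number of rooted binary trees on n labelled leaves. This is consistent with the "bijection" entry in the table.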


To check that a vector is valid according to the Phylo2Vec formulation, use ```check_vector```.

In [5]:
from phylo2vec.utils.vector import check_vector

check_vector?

Signature: check_vector(v: numpy.ndarray) -> None
Docstring:
Input validation of a Phylo2Vec vector

The input is checked to satisfy the Phylo2Vec constraints

Parameters
----------
v : numpy.ndarray
    Phylo2Vec vector
File:      ~/src/phylo2vec/public_fork2/py-phylo2vec/phylo2vec/utils/vector.py
Type:      function

In [6]:
check_vector(v) # returns None

v_awkward = v.copy()

v_awkward[5] = 11

check_vector(v_awkward) # raises PanicException (v[5] exceeds 2 * 5 = 10)

thread '<unnamed>' panicked at phylo2vec/src/utils.rs:140:5:
Validation failed: v[5] = 11 is out of bounds (max = 10)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


PanicException: Validation failed: v[5] = 11 is out of bounds (max = 10)
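
The bound check reported in the error above can be reproduced with a few lines of plain Python. This is a minimal sketch of the unordered constraint only; `check_vector_sketch` is a hypothetical helper, and it raises `ValueError` rather than the `PanicException` that comes from the Rust core.

```python
def check_vector_sketch(v):
    """Raise ValueError if any v[i] lies outside {0, ..., 2*i}."""
    for i, value in enumerate(v):
        upper = 2 * i
        if not 0 <= value <= upper:
            raise ValueError(f"v[{i}] = {value} is out of bounds (max = {upper})")


v_ok = [0, 0, 1, 5, 8, 5]
check_vector_sketch(v_ok)  # passes silently

v_bad = list(v_ok)
v_bad[5] = 11  # the largest allowed value at index 5 is 10
try:
    check_vector_sketch(v_bad)
    message = None
except ValueError as err:
    message = str(err)
print(message)  # v[5] = 11 is out of bounds (max = 10)
```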

### 2.2 Converting a vector to a Newick string

Use ```to_newick``` to convert a vector to a Newick string.

In [7]:
from phylo2vec import to_newick

to_newick?

Signature: to_newick(vector_or_matrix: numpy.ndarray) -> str
Docstring:
Convert a Phylo2Vec vector or matrix to Newick format

Parameters
----------
vector_or_matrix : numpy.array
    Phylo2Vec vector (ndim == 1)/matrix (ndim == 2)

Returns
-------
newick : str
    Newick tree
File:      ~/src/phylo2vec/public_fork2/py-phylo2vec/phylo2vec/base/newick.py
Type:      function

In [8]:
newick = to_newick(v)

newick

'((((0,2)9,4)10,(1,3)8)11,(5,6)7)12;'

Under the hood, ```to_newick``` performs two operations: ```get_pairs``` and ```build_newick```.

```get_pairs``` produces an ordered list of pairs of leaves from the vector (as a post-order traversal), making use of [AVL trees](https://en.wikipedia.org/wiki/AVL_tree). Each element corresponds to a cherry, given by its two children: (child1, child2).

In this example, we have that:
* leaves (1, 6) form a first cherry → 1 = min(1, 6) so leaf 1 is the representative of that cherry
* leaves (4, 5) form a second cherry → 4 = min(4, 5) so leaf 4 is the representative of that cherry
* leaves (2, 3) form a third cherry → 2 = min(2, 3) so leaf 2 is the representative of that cherry
* (0, 1): 1 was already visited in the first cherry, so leaf 0 forms a cherry with the parent of leaf 1
* (0, 4): both were already visited, so the parents of leaf 0 and leaf 4 form a cherry
* (0, 2): both were already visited, so the parents of leaf 0 and leaf 2 form a cherry

In [38]:
v_fixed = np.array([0, 2, 2, 5, 4, 1])

pairs = core.get_pairs(v_fixed)

pairs

[(1, 6), (4, 5), (2, 3), (0, 1), (0, 4), (0, 2)]

```build_newick``` takes the list of pairs and forms a Newick string with internal labels (the "parents").

In [39]:
newick_fixed = core.build_newick(pairs)

newick_fixed2 = to_newick(v_fixed)

assert newick_fixed == newick_fixed2  # should be the same

newick_fixed2

'(((0,(1,6)7)10,(4,5)8)11,(2,3)9)12;'
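
To make the second step concrete, here is a plain-Python sketch of how a Newick string can be assembled from the pair list. It is not the Rust implementation: `build_newick_sketch` is a hypothetical helper that follows the rules described above (the smaller label represents the merged cherry, and parents are numbered after the leaves in the order the cherries are formed).

```python
def build_newick_sketch(pairs):
    """Assemble a Newick string with parent labels from an ordered pair list."""
    n_leaves = len(pairs) + 1
    # Subtree string currently represented by a leaf label.
    # A label that is absent from the dict still stands for the bare leaf.
    subtree = {}
    for k, (a, b) in enumerate(pairs):
        parent = n_leaves + k  # internal nodes are numbered after the leaves
        left = subtree.pop(a, str(a))
        right = subtree.pop(b, str(b))
        # The smaller label stays on as the representative of the merged subtree.
        subtree[min(a, b)] = f"({left},{right}){parent}"
    (root,) = subtree.values()  # exactly one subtree remains: the full tree
    return root + ";"


pairs_fixed = [(1, 6), (4, 5), (2, 3), (0, 1), (0, 4), (0, 2)]
newick_sketch = build_newick_sketch(pairs_fixed)
print(newick_sketch)  # (((0,(1,6)7)10,(4,5)8)11,(2,3)9)12;
```

On this example the sketch reproduces the string returned by ```to_newick```.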

For visualisation purposes, we can plot the tree using ete3.

We observe the same "cherries" as described above: 
* (1, 6), which merges with 0
* (4, 5), which then merges with the subtree (0,(1,6))
* (2, 3), which then merges with the subtree ((0,(1,6)),(4,5))

In [40]:
def plot_tree(newick):
    print(Tree(newick))

plot_tree(newick_fixed)


         /-0
      /-|
     |  |   /-1
     |   \-|
   /-|      \-6
  |  |
  |  |   /-4
--|   \-|
  |      \-5
  |
  |   /-2
   \-|
      \-3


### 2.3 Converting a Newick to a vector

Use ```from_newick``` to convert a Newick string to a vector.

In [12]:
from phylo2vec import from_newick

from_newick?

Signature: from_newick(newick: str) -> numpy.ndarray
Docstring:
Convert a Newick string to a Phylo2Vec vector or matrix

Parameters
----------
newick : str
    Newick string for a tree

Returns
-------
numpy.ndarray
    Phylo2Vec matrix if branch lengths are present, otherwise a vector
File:      ~/src/phylo2vec/public_fork2/py-phylo2vec/phylo2vec/base/newick.py
Type:      function

In [13]:
# Let's generate a new v with 7 leaves using sample
v7 = sample_vector(7)
print(f"v (sampled): {repr(v7)}")

newick7 = to_newick(v7)
print(f"newick: {newick7}")

v7_new = from_newick(newick7)
print(f"v (convert): {repr(v7_new)}")

assert np.array_equal(v7, v7_new)  # should be the same

v (sampled): array([0, 2, 3, 4, 1, 8])
newick: (((((0,(1,5)7)8,4)9,6)10,3)11,2)12;
v (convert): array([0, 2, 3, 4, 1, 8])


We can also convert Newick strings without parent labels. Several functions are provided in ```phylo2vec.utils.newick``` to process Newick strings.

In [14]:
from phylo2vec.utils.newick import remove_parent_labels

newick7_no_parent = remove_parent_labels(newick7)

print(f"Newick with parent labels: {newick7}")
print(f"Newick without parent labels: {newick7_no_parent}")

v7_no_parent = from_newick(newick7_no_parent)
print(f"v (converted without parents): {repr(v7_no_parent)}")

assert np.array_equal(v7, v7_no_parent)  # should be the same

Newick with parent labels: (((((0,(1,5)7)8,4)9,6)10,3)11,2)12;
Newick without parent labels: (((((0,(1,5)),4),6),3),2);
v (converted without parents): array([0, 2, 3, 4, 1, 8])
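
What removing parent labels amounts to can be sketched with a regular expression. This is an illustration under simple assumptions (labels are unquoted and contain none of the characters `(),:;`). `strip_parent_labels` is a hypothetical helper, not the library's ```remove_parent_labels```.

```python
import re


def strip_parent_labels(newick):
    """Drop the label that directly follows each closing parenthesis."""
    # Internal-node labels sit right after ")" and run until the next
    # structural character; branch lengths (":...") are left untouched.
    return re.sub(r"\)[^(),:;]+", ")", newick)


labelled = "(((((0,(1,5)7)8,4)9,6)10,3)11,2)12;"
print(strip_parent_labels(labelled))  # (((((0,(1,5)),4),6),3),2);
```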


### 2.4 Matrix form

* Newick strings can also carry branch lengths, so it is desirable to store not only the topology (which the core Phylo2Vec vector does) but also the branch lengths.

In this setup:
 * The 1st column is v[i] (i.e., the Phylo2Vec vector)
 * The 2nd and 3rd columns are the branch lengths of the two children of each cherry in the ancestry matrix

Under the hood, ```from_newick``` checks whether the Newick string has branch lengths or not, and ```to_newick``` checks whether the input is a vector or a matrix, and performs the conversion. So we can use the same functions as before!

In [17]:
from phylo2vec import sample_matrix

In [16]:
# Let's sample a matrix

n_leaves = 5

m5 = sample_matrix(n_leaves)

print(f"m (sampled): {repr(m5)}")

m (sampled): array([[0.        , 0.27507809, 0.65492588],
       [0.        , 0.16556533, 0.07021012],
       [3.        , 0.9721505 , 0.06527178],
       [0.        , 0.39419627, 0.96055424]])


In [17]:
newick_with_bls = to_newick(m5)

print(newick_with_bls)

((((0:0.2750781,4:0.6549259)5:0.16556533,2:0.07021012)6:0.9721505,3:0.06527178)7:0.39419627,1:0.96055424)8;
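
The output above suggests that the k-th row of branch lengths is attached to the two children of the k-th cherry. The sketch below rebuilds a Newick string of the same shape under that assumption. `build_newick_with_lengths` is a hypothetical helper, the pair list was read off the Newick string above, and the branch lengths are rounded for readability.

```python
def build_newick_with_lengths(pairs, lengths):
    """Assemble a Newick string with parent labels and branch lengths.

    Assumption: lengths[k] holds the branch lengths of the two children
    of the k-th cherry in `pairs`, in the same order.
    """
    n_leaves = len(pairs) + 1
    subtree = {}  # subtree string currently represented by a leaf label
    for k, ((a, b), (len_a, len_b)) in enumerate(zip(pairs, lengths)):
        parent = n_leaves + k
        left = subtree.pop(a, str(a))
        right = subtree.pop(b, str(b))
        subtree[min(a, b)] = f"({left}:{len_a},{right}:{len_b}){parent}"
    (root,) = subtree.values()
    return root + ";"


pairs_m5 = [(0, 4), (0, 2), (0, 3), (0, 1)]
lengths_m5 = [(0.275, 0.655), (0.166, 0.07), (0.972, 0.065), (0.394, 0.961)]
newick_bl_sketch = build_newick_with_lengths(pairs_m5, lengths_m5)
print(newick_bl_sketch)
# ((((0:0.275,4:0.655)5:0.166,2:0.07)6:0.972,3:0.065)7:0.394,1:0.961)8;
```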


In [18]:
m5_other = from_newick(newick_with_bls)

assert np.array_equal(m5, m5_other)  # should be the same

## 3. Other utility functions

### 3.1. Metrics

We believe it is possible to implement a wide variety of metrics pertaining to trees using the Phylo2Vec format.

These can be metrics between trees (the Phylo2Vec paper mentions computing a Hamming distance between vectors), but also between leaves within a tree. An example of the latter is the [cophenetic distance](https://en.wikipedia.org/wiki/Cophenetic) (inspired by [ape](https://rdrr.io/cran/ape/man/cophenetic.phylo.html)).
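
For the between-tree case, a Hamming distance simply counts the positions at which two vectors differ. The helper below is a hypothetical plain-Python sketch, not a function exported by the library. It is applied to the two vectors sampled in Section 2.1.

```python
def hamming_distance(v1, v2):
    """Number of positions at which two Phylo2Vec vectors differ."""
    if len(v1) != len(v2):
        raise ValueError("vectors must encode trees with the same number of leaves")
    return sum(a != b for a, b in zip(v1, v2))


print(hamming_distance([0, 0, 1, 5, 8, 5], [0, 0, 0, 2, 2, 2]))  # 4
```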

In [5]:
from phylo2vec.metrics import pairwise_distances

In [15]:
np.set_printoptions(linewidth=1000)

v10 = sample_vector(n_leaves=10)

print(f"vector:\n{repr(v10)}")

print(f"Cophenetic distance matrix (topology):\n{pairwise_distances(v10, metric='cophenetic')}")

vector:
array([ 0,  2,  0,  0,  3, 10,  4,  7, 13])
Cophenetic distance matrix (topology):
[[0 5 6 4 3 4 7 4 4 4]
 [5 0 3 5 6 5 4 7 7 3]
 [6 3 0 6 7 6 3 8 8 4]
 [4 5 6 0 5 2 7 6 6 4]
 [3 6 7 5 0 5 8 3 3 5]
 [4 5 6 2 5 0 7 6 6 4]
 [7 4 3 7 8 7 0 9 9 5]
 [4 7 8 6 3 6 9 0 2 6]
 [4 7 8 6 3 6 9 2 0 6]
 [4 3 4 4 5 4 5 6 6 0]]
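
As a rough illustration of what a topological distance between leaves measures, the sketch below counts the edges on the path between each pair of leaves, starting from a list of cherries such as the one returned by ```get_pairs``` in Section 2.2. `topological_distances` is a hypothetical helper. Keeping the root as a node and counting every edge as length 1 are assumptions and may not match the library's conventions exactly.

```python
def topological_distances(pairs):
    """Edge-count distance between all leaves of the tree encoded by `pairs`."""
    n_leaves = len(pairs) + 1
    parent = {}   # node -> parent node
    node_of = {}  # representative leaf -> node currently at the top of its subtree
    for k, (a, b) in enumerate(pairs):
        p = n_leaves + k
        parent[node_of.get(a, a)] = p
        parent[node_of.get(b, b)] = p
        node_of[min(a, b)] = p

    def depths_to_root(node):
        """Map each ancestor of `node` (including itself) to its edge distance."""
        depths, steps = {}, 0
        while True:
            depths[node] = steps
            if node not in parent:
                return depths
            node, steps = parent[node], steps + 1

    up = [depths_to_root(leaf) for leaf in range(n_leaves)]
    dist = [[0] * n_leaves for _ in range(n_leaves)]
    for i in range(n_leaves):
        for j in range(i + 1, n_leaves):
            # The shared ancestor with the smallest total is the lowest common ancestor.
            d = min(up[i][node] + up[j][node] for node in up[i] if node in up[j])
            dist[i][j] = dist[j][i] = d
    return dist


dist_fixed = topological_distances([(1, 6), (4, 5), (2, 3), (0, 1), (0, 4), (0, 2)])
print(dist_fixed[0])  # [0, 3, 5, 5, 4, 4, 3]
```

Leaves in a cherry, such as 1 and 6, end up at distance 2. This agrees with the smallest off-diagonal entries in the matrix printed above.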


In [21]:
m10 = sample_matrix(n_leaves=10)

print(f"matrix:\n{repr(m10)}")

print(
    f"Cophenetic distance matrix (rounded):\n{pairwise_distances(m10, metric='cophenetic').round(3)}"
)

matrix:
array([[ 0.        ,  0.21036763,  0.58039254],
       [ 0.        ,  0.22874835,  0.5873965 ],
       [ 4.        ,  0.71927387,  0.23656006],
       [ 3.        ,  0.04383796,  0.31960505],
       [ 5.        ,  0.10017862,  0.5887714 ],
       [ 2.        ,  0.23395139,  0.5856393 ],
       [ 2.        ,  0.9120419 ,  0.13364407],
       [ 1.        ,  0.15698268,  0.4864966 ],
       [14.        ,  0.01806001,  0.77705199]])
Cophenetic distance matrix (rounded):
[[0.    2.    1.768 2.242 2.518 2.687 1.056 2.126 2.37  1.28 ]
 [2.    0.    3.3   1.636 1.912 2.081 2.588 3.658 0.791 0.987]
 [1.768 3.3   0.    3.542 3.818 3.987 1.185 0.816 3.67  2.579]
 [2.242 1.636 3.542 0.    0.363 0.733 2.83  3.9   2.006 1.23 ]
 [2.518 1.912 3.818 0.363 0.    1.009 3.106 4.176 2.282 1.506]
 [2.687 2.081 3.987 0.733 1.009 0.    3.275 4.345 2.451 1.675]
 [1.056 2.588 1.185 2.83  3.106 3.275 0.    1.543 2.958 1.868]
 [2.126 3.658 0.816 3.9   4.176 4.345 1.543 0.    4.028 2.938]
 [2.37  0.791 3.6

### 3.2 Optimisation

In the Phylo2Vec paper, we showcased a hill-climbing optimisation scheme to demonstrate the potential of phylo2vec for maximum likelihood-based phylogenetic inference.

These optimisation schemes (to be written in ```opt```) are not thoroughly maintained, as they are difficult to test. One notable goal is to integrate [GradME](https://github.com/Neclow/GradME) into phylo2vec.

### 3.3 Other utility functions

#### 3.3.1 Finding the number of leaves in a Newick

In [21]:
from phylo2vec.utils.newick import find_num_leaves

find_num_leaves?

Signature: find_num_leaves(newick: str) -> int
Docstring:
Calculate the number of leaves in a tree from its Newick

Parameters
----------
newick : str
    Newick representation of a tree

Returns
-------
int
    Number of leaves
File:      ~/src/phylo2vec/public_fork2/py-phylo2vec/phylo2vec/utils/newick.py
Type:      function

In [22]:
assert find_num_leaves(newick7) == 7
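
One simple way to see why the leaf count can be read directly from the string: an internal node with k children contributes k - 1 commas, so a Newick string contains one comma fewer than it has leaves. The sketch below relies on that fact. `count_leaves_sketch` is a hypothetical helper that assumes unquoted labels without commas, and it is not necessarily how ```find_num_leaves``` is implemented.

```python
def count_leaves_sketch(newick):
    """Count leaves as (number of commas) + 1."""
    return newick.count(",") + 1


print(count_leaves_sketch("(((((0,(1,5)7)8,4)9,6)10,3)11,2)12;"))  # 7
```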

#### 3.3.2 Removing and adding a leaf in a tree

One might want to prune or add nodes in an existing tree (a common example is the subtree-prune-and-regraft operation).

This is not a trivial operation, as the vector must be re-computed (the number of leaves in the tree will have changed).

In [23]:
from phylo2vec.utils.vector import remove_leaf

remove_leaf?

Signature: remove_leaf(v, leaf)
Docstring:
Remove a leaf from a Phylo2Vec v

Parameters
----------
v : numpy.ndarray
    Phylo2Vec vector
leaf : int
    A leaf node to remove

Returns
-------
v_sub : numpy.ndarray
    Phylo2Vec vector without `leaf`
sister : int
    Sister node of leaf
File:      ~/src/phylo2vec/public_fork2/py-phylo2vec/phylo2vec/utils/vector.py
Type:      function

In [24]:
leaf = 3

v6, sister_leaf = remove_leaf(v7, leaf=leaf)

In [25]:
plot_tree(newick7)
plot_tree(to_newick(v6))


               /-0
            /-|
           |  |   /-1
         /-|   \-|
        |  |      \-5
      /-|  |
     |  |   \-4
   /-|  |
  |  |   \-6
--|  |
  |   \-3
  |
   \-2

            /-0
         /-|
        |  |   /-1
      /-|   \-|
     |  |      \-4
   /-|  |
  |  |   \-3
--|  |
  |   \-5
  |
   \-2


In [26]:
from phylo2vec.utils.vector import add_leaf

add_leaf?

Signature: add_leaf(v, leaf, pos)
Docstring:
Add a leaf to a Phylo2Vec vector v

Parameters
----------
v : numpy.ndarray
    Phylo2Vec vector
leaf : int >= 0
    A leaf node to add
pos : int >= 0
    A branch from where the leaf will be added

Returns
-------
v_add : numpy.ndarray
    Phylo2Vec vector including the new leaf
File:      ~/src/phylo2vec/public_fork2/py-phylo2vec/phylo2vec/utils/vector.py
Type:      function

In [27]:
# due to re-labelling in remove_leaf, we have to decrement sister_leaf
if sister_leaf >= leaf:
    sister_leaf -= 1

v_add = add_leaf(v6, leaf=3, pos=sister_leaf)

np.array_equal(v_add, v7)

True

#### 3.3.3 Creating and applying an integer mapping from a Newick string

* Newick strings usually do not contain integers but real-life taxa (e.g., animal species, languages...). So it is important to provide another layer of conversion, where we can take in a Newick with string taxa, and convert it to a Newick with integer taxa, with a unique integer → taxon mapping.

In [28]:
n_leaves = 8

t = Tree()
t.populate(n_leaves)
nw_str = t.write(format=9)

print(nw_str)

print(t)

(((aaaaaaaaab,(aaaaaaaaac,aaaaaaaaad)),(aaaaaaaaae,aaaaaaaaaf)),((aaaaaaaaag,aaaaaaaaah),aaaaaaaaaa));

         /-aaaaaaaaab
      /-|
     |  |   /-aaaaaaaaac
     |   \-|
   /-|      \-aaaaaaaaad
  |  |
  |  |   /-aaaaaaaaae
  |   \-|
--|      \-aaaaaaaaaf
  |
  |      /-aaaaaaaaag
  |   /-|
   \-|   \-aaaaaaaaah
     |
      \-aaaaaaaaaa


In [29]:
from phylo2vec.utils.newick import create_label_mapping

create_label_mapping?

Signature: create_label_mapping(newick)
Docstring:
Create an integer-taxon label mapping (label_mapping)
from a string-based newick (where leaves are strings)
and produce a mapped integer-based newick (where leaves are integers)
this also remove annotations pertaining to parent nodes

Parameters
----------
newick : str
    Newick with string labels

Returns
-------
newick_int : str
    Newick with integer labels
label_mapping : Dict str --> str
    Mapping of leaf labels (integers converted to string) to taxa
File:      ~/src/phylo2vec/public_fork2/py-phylo2vec/phylo2vec/utils/newick.py
Type:      function

In [30]:
from pprint import pprint

nw_int, label_mapping = create_label_mapping(nw_str)

plot_tree(nw_int)

pprint(label_mapping)


         /-2
      /-|
     |  |   /-0
     |   \-|
   /-|      \-1
  |  |
  |  |   /-3
  |   \-|
--|      \-4
  |
  |      /-5
  |   /-|
   \-|   \-6
     |
      \-7
{'0': 'aaaaaaaaac',
 '1': 'aaaaaaaaad',
 '2': 'aaaaaaaaab',
 '3': 'aaaaaaaaae',
 '4': 'aaaaaaaaaf',
 '5': 'aaaaaaaaag',
 '6': 'aaaaaaaaah',
 '7': 'aaaaaaaaaa'}


* The reverse operation is ```apply_label_mapping```

In [31]:
from phylo2vec.utils.newick import apply_label_mapping

apply_label_mapping?

Signature: apply_label_mapping(newick, label_mapping)
Docstring:
Apply an integer-taxon label mapping (label_mapping)
from a string-based newick (where leaves are strings)
and produce a mapped integer-based newick (where leaves are integers)

Parameters
----------
newick : str
    Newick with integer labels
label_mapping : Dict str --> str
    Mapping of leaf labels (integers converted to string) to taxa

Returns
-------
newick : str
    Newick with string labels
File:      ~/src/phylo2vec/public_fork2/py-phylo2vec/phylo2vec/utils/newick.py
Type:      function

In [32]:
new_nw_str = apply_label_mapping(nw_int, label_mapping)

new_nw_str == nw_str

True
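
The round trip between taxon labels and integer labels can be sketched with a regular expression that matches leaf labels, i.e. labels that follow "(" or ",". The two helpers below are hypothetical and are not the library functions. They assume unique, unquoted labels and number the taxa in order of appearance, which differs from the ordering that ```create_label_mapping``` produced above. The taxon names are made up.

```python
import re

# A leaf label follows "(" or "," and runs until the next structural character.
LEAF_LABEL = re.compile(r"(?<=[(,])[^(),:;]+")


def create_label_mapping_sketch(newick):
    """Replace taxon labels with integers (order of appearance) and return the mapping."""
    mapping = {}

    def to_int(match):
        index = str(len(mapping))
        mapping[index] = match.group(0)
        return index

    return LEAF_LABEL.sub(to_int, newick), mapping


def apply_label_mapping_sketch(newick_int, mapping):
    """Replace integer labels with the taxa stored in the mapping."""
    return LEAF_LABEL.sub(lambda m: mapping[m.group(0)], newick_int)


nw_taxa = "((human,chimp),(mouse,rat));"
nw_int_sketch, mapping_sketch = create_label_mapping_sketch(nw_taxa)
print(nw_int_sketch)   # ((0,1),(2,3));
print(mapping_sketch)  # {'0': 'human', '1': 'chimp', '2': 'mouse', '3': 'rat'}
print(apply_label_mapping_sketch(nw_int_sketch, mapping_sketch) == nw_taxa)  # True
```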

## 4. I/O: saving and loading files

Phylo2Vec vectors, matrices and Newick strings can also be saved to and loaded from files. Several file formats are supported for both arrays and Newick strings.

### 4.1. Supported File Extensions

Phylo2Vec accepts the following file extensions for arrays and Newick strings:

- **Array file extensions** (for Phylo2Vec vector/matrices): `.csv`, `.txt`
- **Newick file extensions**: `.txt`, `.nwk`, `.newick`, `.tree`, `.treefile`

These extensions ensure compatibility with other programming languages as well as commonly used formats in phylogenetics and computational biology.

In [33]:
from glob import glob

import tempfile

from phylo2vec.io._validation import ACCEPTED_ARRAY_FILE_EXTENSIONS, ACCEPTED_NEWICK_FILE_EXTENSIONS

print(f"Accepted array file extensions: {ACCEPTED_ARRAY_FILE_EXTENSIONS}")
print(f"Accepted newick file extensions: {ACCEPTED_NEWICK_FILE_EXTENSIONS}")

Accepted array file extensions: ['.csv', '.txt']
Accepted newick file extensions: ['.txt', '.nwk', '.newick', '.tree', '.treefile']
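
An extension check of this kind could look like the sketch below, which uses `pathlib` and the two lists printed above. `check_extension_sketch` is a hypothetical helper. How the library validates file names internally (for example, whether the comparison is case-sensitive) is not shown in this notebook.

```python
from pathlib import Path

ARRAY_EXTENSIONS = [".csv", ".txt"]
NEWICK_EXTENSIONS = [".txt", ".nwk", ".newick", ".tree", ".treefile"]


def check_extension_sketch(filepath, accepted):
    """Return the file suffix if it is accepted, otherwise raise ValueError."""
    suffix = Path(filepath).suffix
    if suffix not in accepted:
        raise ValueError(
            f"Unsupported file extension {suffix!r}; expected one of {accepted}"
        )
    return suffix


print(check_extension_sketch("trees/example.nwk", NEWICK_EXTENSIONS))  # .nwk
print(check_extension_sketch("v5.txt", ARRAY_EXTENSIONS))  # .txt
```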


### 4.2. Reading/writing arrays

To read and write Phylo2Vec vectors or matrices, use ```load``` and ```save```, respectively.

In [34]:
from phylo2vec import load, save

with tempfile.TemporaryDirectory() as tmpdirname:
    print(f"Creating temporary directory {tmpdirname}...")
    tmpfname_vector = f"{tmpdirname}/v5.txt"
    v = sample_vector(5)
    print(f"Saving vector: {repr(v)}")
    save(v, tmpfname_vector)

    print(f"Files in {tmpdirname}: {glob(tmpdirname + '/*')}")

    v2 = load(tmpfname_vector)
    print(f"Loaded vector: {repr(v2)}")
    assert np.array_equal(v, v2)  # should be the same

    print()
    tmpfname_matrix = f"{tmpdirname}/m5.txt"

    m = sample_matrix(5)
    print(f"Saving matrix:\n{repr(m)}")
    save(m, tmpfname_matrix)

    print(f"Files in {tmpdirname}: {glob(tmpdirname + '/*')}")

    m2 = load(tmpfname_matrix)
    print(f"Loaded matrix:\n{repr(m2)}")

    assert np.array_equal(m, m2)  # should be the same

Creating temporary directory /tmp/tmpj8xz8wge...
Saving vector: array([0, 0, 1, 2])
Files in /tmp/tmpj8xz8wge: ['/tmp/tmpj8xz8wge/v5.txt']
Loaded vector: array([0, 0, 1, 2])

Saving matrix:
array([[0.        , 0.13019991, 0.7678588 ],
       [0.        , 0.05164948, 0.05071734],
       [2.        , 0.43557402, 0.60349464],
       [3.        , 0.53842747, 0.22101495]])
Files in /tmp/tmpj8xz8wge: ['/tmp/tmpj8xz8wge/m5.txt', '/tmp/tmpj8xz8wge/v5.txt']
Loaded matrix:
array([[0.        , 0.13019991, 0.7678588 ],
       [0.        , 0.05164948, 0.05071734],
       [2.        , 0.43557402, 0.60349464],
       [3.        , 0.53842747, 0.22101495]])


### 4.3. Reading/writing files containing Newick strings

To load Newick files as Phylo2Vec vectors or matrices, and to save vectors or matrices as Newick files, use ```load_newick``` and ```save_newick```, respectively.

In [35]:
from phylo2vec import load_newick, save_newick

with tempfile.TemporaryDirectory() as tmpdirname:
    print(f"Creating temporary directory {tmpdirname}...")
    tmpfname_vector = f"{tmpdirname}/v6.txt"
    v6 = sample_vector(6)

    print(f"Saving as newick: {repr(v6)}")
    save_newick(v6, tmpfname_vector)

    print(f"Files in {tmpdirname}: {glob(tmpdirname + '/*')}")

    v6_other = load_newick(tmpfname_vector)
    print(f"Loaded newick back to vector: {repr(v6_other)}")
    assert np.array_equal(v6_other, v6)  # should be the same

    print()
    tmpfname_matrix = f"{tmpdirname}/m6.txt"

    m6 = sample_matrix(5)
    print(f"Saving as newick:\n{repr(m6)}")
    save_newick(m6, tmpfname_matrix)

    print(f"Files in {tmpdirname}: {glob(tmpdirname + '/*')}")

    m6_other = load_newick(tmpfname_matrix)
    print(f"Loaded newick back to matrix:\n{repr(m6_other)}")

    assert np.array_equal(m6, m6_other)  # should be the same

Creating temporary directory /tmp/tmplll07dv0...
Saving as newick: array([0, 1, 4, 4, 8])
Files in /tmp/tmplll07dv0: ['/tmp/tmplll07dv0/v6.txt']
Loaded newick back to vector: array([0, 1, 4, 4, 8])

Saving as newick:
array([[0.        , 0.68952268, 0.29305875],
       [1.        , 0.94858885, 0.52901864],
       [3.        , 0.98526967, 0.5341301 ],
       [5.        , 0.39921775, 0.31769511]])
Files in /tmp/tmplll07dv0: ['/tmp/tmplll07dv0/v6.txt', '/tmp/tmplll07dv0/m6.txt']
Loaded newick back to matrix:
array([[0.        , 0.68952268, 0.29305875],
       [1.        , 0.94858885, 0.52901864],
       [3.        , 0.98526967, 0.5341301 ],
       [5.        , 0.39921775, 0.31769511]])
