# BMI/CS 576 - HW5
The objectives of this homework are to practice

* K-means clustering
* Gaussian mixture model-based clustering
* hierarchical clustering

## HW policies
Before starting this homework, please read over the [homework policies](https://canvas.wisc.edu/courses/167969/pages/hw-policies) for this course.  In particular, note that homeworks are to be completed *individually*.

You are welcome to use any code from the weekly notebooks in your solutions to the HW.

## PROBLEM 1: K-means algorithm (10 points)

Run the $k$-means algorithm (either by hand or with code) on the following set of one-dimensional points: $X = (x_1,x_2,x_3,x_4,x_5) = (2,4,5,9,10)$. Let $k = 2$ and the initial cluster centers be $f_1 = 2$ and $f_2 = 5$. After each iteration, show 

**(i)** the assignment of points to clusters and 

**(ii)** the updated cluster centers.

*If you run the algorithm via code, you must present the output in a nicely formatted manner.*

In [1]:
### BEGIN SOLUTION 
import kmeans
p1_profiles = [(2,), (4,), (5,), (9,), (10,)]
initial_f = [(2,), (5,)]
k = 2
list(kmeans.cluster_kmeans_iterator(p1_profiles, initial_f=initial_f, k=k))
### END SOLUTION

[([0, 1, 1, 1, 1], [(2.0,), (7.0,)]),
 ([0, 0, 1, 1, 1], [(3.0,), (8.0,)]),
 ([0, 0, 0, 1, 1], [(3.6666666666666665,), (9.5,)])]

### BEGIN SOLUTION TEMPLATE=solution to problem 1
![kmeans_solution](kmeans_solution.png)
### END SOLUTION
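
For readers without the course `kmeans` module, the iterations above can be reproduced with a minimal self-contained sketch. The function name `kmeans_1d` and its stopping rule (stop once the assignments no longer change) are assumptions made for illustration, not the course implementation; ties are broken in favor of the lower-numbered cluster.

```python
def kmeans_1d(points, centers):
    """Yield (assignments, centers) after each k-means iteration on 1-D points.

    Illustrative sketch: stops when the assignments stop changing.
    """
    previous = None
    while True:
        # Assignment step: each point goes to its nearest center (lowest index on ties).
        assignments = [min(range(len(centers)), key=lambda c: abs(x - centers[c]))
                       for x in points]
        if assignments == previous:
            return
        # Update step: each center becomes the mean of its assigned points.
        new_centers = []
        for c in range(len(centers)):
            members = [x for x, a in zip(points, assignments) if a == c]
            # An empty cluster keeps its old center (an assumption of this sketch).
            new_centers.append(sum(members) / len(members) if members else centers[c])
        centers = new_centers
        previous = assignments
        yield assignments, centers


for step, (assignments, centers) in enumerate(kmeans_1d([2, 4, 5, 9, 10], [2, 5]), start=1):
    print("iteration {}: assignments={} centers={}".format(
        step, assignments, [round(f, 3) for f in centers]))
```

The loop prints one line per iteration, which is one way to satisfy the formatting requirement for code-based solutions.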

## PROBLEM 2: Gaussian mixture model-based clustering (20 points)

Run the EM algorithm (either by hand or with code) for Gaussian mixture model-based
clustering on the set of points in Problem 1 for **three** iterations.
Let $k=2$, the initial cluster means be $\mu_1 = 2$ and $\mu_2 = 5$,
the initial cluster prior probabilities be $P_1 = P_2 = 0.5$, and the
variances be $\sigma^2_1 = \sigma^2_2 = 3$.  You should treat the variances as fixed
parameters that are not updated during EM.  After each iteration, show 

**(i)** the probabilities of each point being assigned to each cluster, 

**(ii)** the updated cluster means, and 

**(iii)** the updated cluster prior probabilities. 

*If you run the algorithm via code, you must present the output in a nicely formatted manner.*

**For your own understanding (not graded)**: compare and contrast your results from the EM algorithm with those from $k$-means in Problem 1.

### BEGIN SOLUTION TEMPLATE=solution to problem 2
![em_solution](em_solution.png)
### END SOLUTION
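
The EM procedure described in this problem can be sketched in a few lines of Python. This is a minimal illustration under the stated setup (shared, fixed variance; only means and priors are updated); the names `normal_pdf` and `em_fixed_variance` are hypothetical and do not come from the course notebooks.

```python
import math


def normal_pdf(x, mean, variance):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * variance)) / math.sqrt(2 * math.pi * variance)


def em_fixed_variance(points, means, priors, variance, iterations):
    """Run EM for a 1-D Gaussian mixture, keeping the variance fixed.

    Returns a list with one (responsibilities, means, priors) tuple per iteration.
    """
    history = []
    k = len(means)
    for _ in range(iterations):
        # E-step: posterior probability of each cluster for each point.
        resp = []
        for x in points:
            weighted = [p * normal_pdf(x, m, variance) for p, m in zip(priors, means)]
            total = sum(weighted)
            resp.append([w / total for w in weighted])
        # M-step: responsibility-weighted means and updated priors.
        cluster_weight = [sum(r[j] for r in resp) for j in range(k)]
        means = [sum(r[j] * x for r, x in zip(resp, points)) / cluster_weight[j]
                 for j in range(k)]
        priors = [w / len(points) for w in cluster_weight]
        history.append((resp, means, priors))
    return history


history = em_fixed_variance([2, 4, 5, 9, 10], [2.0, 5.0], [0.5, 0.5], 3.0, 3)
for step, (resp, means, priors) in enumerate(history, start=1):
    print("iteration", step)
    for x, row in zip([2, 4, 5, 9, 10], resp):
        print("  x={:>2}: {}".format(x, [round(p, 4) for p in row]))
    print("  means: ", [round(m, 4) for m in means])
    print("  priors:", [round(p, 4) for p in priors])
```

Unlike $k$-means, every point contributes to every cluster mean here, weighted by its posterior probability.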

## PROBLEM 3: Bottom-up hierarchical clustering (60 points)
Implement bottom-up hierarchical clustering with the function `cluster_bottom_up` below.  The function takes as input a list of profiles and a list of corresponding names (which will be the names for the leaves of the resulting tree).  In addition, the function will take as input a string specifying which linkage function (e.g., single) to use for the clustering as well as a distance function that computes the distance (e.g., euclidean distance) between a pair of profiles.  The function will output a `TreeNode` object, representing the root of the hierarchical clustering tree.

**Distances:** The tree should have branch lengths computed in the same way as for the UPGMA algorithm for phylogenetic trees.  Each node should have a "height", with the leaf nodes being at height zero and the root node being the highest node.  After merging two nodes (clusters), $i$ and $j$, into a new node $k$, the height of node $k$ should be half the distance from cluster $i$ to cluster $j$, i.e., $height(k) = d_{ij} / 2$.
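
As a small worked illustration of the height rule, the branch length from a new node to each child is the new node's height minus the child's height. The helper name `merge_heights` below is hypothetical; the numbers correspond to the "triple" dataset with Manhattan distance and complete linkage, where A and C merge at distance 7 and the resulting cluster merges with B at distance 15.

```python
def merge_heights(d_ij, height_i, height_j):
    """Return (height of new node, branch length to i, branch length to j)."""
    height_k = d_ij / 2
    return height_k, height_k - height_i, height_k - height_j


# First merge: leaves A and C (both at height 0) at distance 7.
print(merge_heights(7, 0, 0))     # (3.5, 3.5, 3.5)
# Second merge: the (A, C) node at height 3.5 with leaf B at distance 15.
print(merge_heights(15, 3.5, 0))  # (7.5, 4.0, 7.5)
```

These values match the Newick string `((A:3.5,C:3.5):4,B:7.5);` expected by the corresponding test.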

**Tie-breaking:** For tie-breaking purposes, you should keep track of an index for each node (cluster) in the tree.  The input profiles will correspond to the leaves of the tree, and should have indices 0 to $n-1$ where $n$ is the number of profiles.  Each successive node that is created should have the next available integer index (e.g., the very first merge of the algorithm should produce the node with index $n$ and the following merge should produce the node with index $n+1$).  When finding the next pair of clusters to merge, if two or more pairs have the same minimum distance, pick the pair with the lexicographically smallest pair of indices $(i, j)$.  For example, if the pairs of clusters (3, 8) and (5, 7) have the same minimum distance, you should choose the pair (3, 8) to merge next.
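
One convenient way to get this tie-breaking behavior in Python, shown here as a sketch, is to represent each candidate as a `(distance, i, j)` tuple with $i < j$. Tuples compare element by element, so `min` picks the smallest distance first and then the lexicographically smallest index pair. The candidate values below are made up for illustration.

```python
# Each candidate is (distance, i, j) with i < j.
candidates = [(1.0, 5, 7), (1.0, 3, 8), (2.5, 0, 1)]

# Tuple comparison breaks the distance tie using (i, j).
best = min(candidates)
print(best)  # (1.0, 3, 8)
```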

**Efficiency:** Your implementation should have runtime complexity no worse than $O(n^3)$.  You are welcome to implement the more efficient $O(n^2)$ (for single-link) and $O(n^2 \log n)$ (for complete and average-link) algorithms, but this is not required.

**Hierarchical clustering data structure:** You are to use objects of the `TreeNode` class as we did in notebook 22 to build your hierarchical clustering structure.

**Tests:** Visible and hidden tests for your function are found at the bottom of this notebook.

## Modules for this HW

In [2]:
import toytree                        # for working with trees
from toytree.TreeNode import TreeNode # make TreeNode directly available
import math

In [3]:
def euclidean_distance(p1, p2):
    """The Euclidean distance between two profiles."""
    return math.sqrt(sum((e1 - e2)**2 for e1, e2 in zip(p1, p2)))

def manhattan_distance(p1, p2):
    """The Manhattan distance between two profiles."""
    return sum(abs(e1 - e2) for e1, e2 in zip(p1, p2))

def cluster_bottom_up(profiles, profile_names, linkage="single", distance=euclidean_distance):
    """Performs a bottom-up hierarchical clustering of a list of profiles, returning
    a tree that has the given profile names labeling the leaves.
    
    Args:
        profiles: a list of profiles/points (each of which is represented as a tuple)
        profile_names: a list of the same length as profiles giving the names of the profiles
        linkage: a string indicating the linkage method to use, which should be one of 
            "single", "complete", and "average"
        distance: a function that takes as input two profiles and returns a number giving
                  the distance between the two profiles, i.e., distance(p1, p2) should return
                  the distance between profile p1 and profile p2.
    Returns:
        A TreeNode instance representing the root of the hierarchical clustering tree.
    """
    ### BEGIN SOLUTION
    dist_update = update_map[linkage]
    
    nodes = [TreeNode(name=name, dist=0) for name in profile_names]
    sizes = [1] * len(profiles)
    heights = [0] * len(profiles)
    next_node = len(nodes)
    c = set(range(next_node))

    # Make initial distance matrix
    max_nodes = 2 * len(profiles) - 1
    d = matrix(max_nodes, max_nodes)
    for i in c:
        for j in c:
            d[i][j] = d[j][i] = distance(profiles[i], profiles[j])
    
    while len(c) > 1:
        uv_dist, u, v = min_pair(d, c)
        height = uv_dist / 2
        node = TreeNode()
        node.add_child(nodes[u], dist=height - heights[u])
        node.add_child(nodes[v], dist=height - heights[v])
        nodes.append(node)
        sizes.append(sizes[u] + sizes[v])
        heights.append(height)
        c.remove(u)
        c.remove(v)
        j = next_node
        next_node += 1
        for i in c:
            d[i][j] = d[j][i] = dist_update(u, v, i, d, sizes)
        c.add(j)
    
    return nodes[c.pop()]
    ### END SOLUTION

### BEGIN SOLUTION TEMPLATE=
def min_pair(d, c):
    pair_dists = [(d[i][j], i, j) for i in c for j in c if i < j]
    return min(pair_dists)
    
def matrix(rows, cols):
    return [[None] * cols for i in range(rows)]
    
def single_link_update(u, v, k, d, s):
    return min(d[u][k], d[v][k])

def complete_link_update(u, v, k, d, s):
    return max(d[u][k], d[v][k])

def average_link_update(u, v, k, d, s):
    return (s[u] * d[u][k] + s[v] * d[v][k]) / (s[u] + s[v])

update_map = {
    "single": single_link_update,
    "complete": complete_link_update,
    "average": average_link_update
}
### END SOLUTION
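
To see how the three update rules differ, the sketch below restates them with explicit arguments and applies them to one made-up merge. The function names and numbers are illustrative only. Note that the average-link rule weights each distance by cluster size, so that the result remains the mean over all pairs of points between the merged cluster and cluster $k$.

```python
def single_link(d_uk, d_vk, size_u, size_v):
    """Distance from merged cluster (u, v) to k under single linkage."""
    return min(d_uk, d_vk)


def complete_link(d_uk, d_vk, size_u, size_v):
    """Distance from merged cluster (u, v) to k under complete linkage."""
    return max(d_uk, d_vk)


def average_link(d_uk, d_vk, size_u, size_v):
    """Distance from merged cluster (u, v) to k under average linkage."""
    return (size_u * d_uk + size_v * d_vk) / (size_u + size_v)


# Hypothetical merge: u has 3 members at distance 4.0 from k,
# and v has 1 member at distance 8.0 from k.
args = (4.0, 8.0, 3, 1)
print(single_link(*args))    # 4.0
print(complete_link(*args))  # 8.0
print(average_link(*args))   # 5.0, not the unweighted mean 6.0
```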

## PROBLEM 4: Evaluation of cell type clusterings (10 points)

In this problem we will revisit the cell type expression profile dataset that we used in the day 22 activity.  This dataset is read in via the cell below.  One major division between human cell types is between blood cell types and non-blood cell types.  The cell below also reads in a list of the cell types in the dataset that are blood cell types.  
In this problem you are to evaluate the results of bottom-up clusterings computed from this dataset by determining how well the blood cell types cluster within each tree.  For a given hierarchical clustering tree, we may compute how well the blood cell types cluster within the tree by finding the subtree that has the maximum Jaccard index with the set of blood cell types.  The Jaccard index is a measure of the similarity of two sets, $A$ and $B$, and is defined as:

$jaccard\_index(A, B) = \frac{|A \cap B|}{|A \cup B|}$

For the *six* possible combinations of distance measure (euclidean and manhattan) and linkage function (single, average, and complete), compute the bottom-up clustering of the cell type dataset, and then compute the maximum Jaccard index of a subtree of that tree with the blood cell types.  **Give the value of the maximum Jaccard index for each clustering and determine which distance measure and linkage function gives the best clustering with respect to this evaluation.**

*Implementation notes:* Each node in a tree corresponds to a subtree (the subtree rooted at that node), and thus one can iterate through all subtrees in a tree by iterating through its nodes.  You will likely find the `traverse` method of TreeNode objects helpful for iterating through nodes.  Also, the `get_leaf_names` method of TreeNode objects may be helpful in retrieving the names of the leaves in a subtree.
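
The evaluation can be illustrated on a toy tree without `toytree`. In the sketch below, a tree is a nested tuple of leaf names, which is an assumption made purely for illustration; the helper names `all_leaf_sets` and `max_subtree_jaccard` are hypothetical.

```python
def all_leaf_sets(tree):
    """Return (leaf set of this node, leaf sets of every node in its subtree).

    A tree is either a leaf name (str) or a tuple of child trees.
    """
    if isinstance(tree, str):
        leaves = frozenset([tree])
        return leaves, [leaves]
    leaves = frozenset()
    collected = []
    for child in tree:
        child_leaves, child_sets = all_leaf_sets(child)
        leaves |= child_leaves
        collected.extend(child_sets)
    collected.append(leaves)
    return leaves, collected


def max_subtree_jaccard(tree, reference):
    """Maximum Jaccard index between the reference set and any subtree's leaf set."""
    _, leaf_sets = all_leaf_sets(tree)
    return max(len(s & reference) / len(s | reference) for s in leaf_sets)


toy_tree = ((("a", "b"), "c"), ("d", "e"))
reference = {"a", "b", "d"}
# The best subtree is ("a", "b"): intersection 2, union 3.
print(max_subtree_jaccard(toy_tree, reference))
```

The root scores 3/5 here because it contains all reference members but also two extra leaves, so the smaller subtree `("a", "b")` wins with 2/3.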


In [4]:
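def read_gene_expression_profiles(filename):
    """Reads a tab-delimited expression file, returning (profiles, sample_names).

    The first row is taken as the sample names; each column of the remaining
    rows is converted to one profile (a tuple of floats).
    """
    # Use a context manager so the file is closed after reading
    with open(filename) as f:
        rows = [line.rstrip().split("\t") for line in f]
    sample_names = rows[0]
    columns = zip(*rows[1:])
    profiles = [tuple(map(float, column)) for column in columns]
    return profiles, sample_names

cell_type_profiles, cell_type_names = read_gene_expression_profiles("cell_type_expression.txt")

with open("blood_cell_types.txt") as f:
    blood_cell_type_names = [line.rstrip() for line in f]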

In [5]:
### BEGIN SOLUTION
def jaccard_index(s1, s2):
    return len(s1 & s2) / len(s1 | s2)

def max_clade_jaccard(tree, ref_set):
    return max(jaccard_index(set(node.get_leaf_names()), ref_set) for node in tree.traverse())

distance_map = {"euclidean": euclidean_distance,
                "manhattan": manhattan_distance}

bottom_up_trees = {(dist_name, update): cluster_bottom_up(cell_type_profiles, cell_type_names, update, distance)
                   for dist_name, distance in distance_map.items() for update in update_map}

jaccard_scores = {key: max_clade_jaccard(tree, set(blood_cell_type_names))
                  for key, tree in bottom_up_trees.items()}

for key, score in jaccard_scores.items():
    print(key, score)

max_score = max(jaccard_scores.values())
max_trees = [key for key, score in jaccard_scores.items() if score == max_score]

print("The max score of {} is obtained by {}".format(round(max_score, 3), max_trees))
### END SOLUTION

('euclidean', 'single') 0.6739130434782609
('euclidean', 'complete') 0.8043478260869565
('euclidean', 'average') 0.75
('manhattan', 'single') 0.782608695652174
('manhattan', 'complete') 0.75
('manhattan', 'average') 0.8
The max score of 0.804 is obtained by [('euclidean', 'complete')]


## Tests for PROBLEM 3

In [6]:
# we will generate a random dataset of 100-dimensional profiles from the vertices of a hypercube
import random
def random_hypercube_vertex(dims):
    return tuple(random.randint(0, 1) for i in range(dims))

random.seed(42)
dims = 100
num_profiles = 250
random_100_250_profiles = [random_hypercube_vertex(dims) for i in range(num_profiles)]
random_100_250_names = ["P{}".format(i) for i in range(num_profiles)]

# a dictionary of datasets for testing
# the key is the name of the test
# the value is a tuple (profiles, profile_names)
datasets = {
    "pair": ([(4, 2), (-2, -6)], 
             ["A", "B"]),
    "triple": ([(4, 2), (-2, -6), (1, 6)],
               ["A", "B", "C"]),
    "quintet": ([(0,), (6,), (8,), (11,), (15,)],
                ["A", "B", "C", "D", "E"]),
    "tiebreaker": ([(0,), (1,), (2,), (3,), (4,), (5,)],
                   ["A", "B", "C", "D", "E", "F"]),
    "cell_type_sample": (
        [(2.7, 0.0, 2.2, 0.0, 1.6, 2.2, 0.8, 2.6, 0.0, 0.0),
         (3.0, 0.0, 2.0, 0.0, 3.2, 1.6, 1.2, 2.6, 1.1, 1.1),
         (2.9, 0.0, 2.3, 0.0, 2.2, 1.9, 0.5, 2.3, 1.0, 0.3),
         (2.6, 0.0, 2.3, 0.0, 2.2, 2.0, 0.7, 2.0, 1.3, 0.8),
         (2.7, 0.0, 2.1, 0.0, 4.0, 2.1, 1.3, 2.6, 1.0, 1.0),
         (2.8, 0.0, 2.1, 0.0, 2.6, 2.2, 2.3, 2.4, 1.4, 1.1),
         (2.7, 0.0, 2.1, 0.0, 3.2, 2.8, 1.9, 1.9, 1.2, 0.8),
         (2.8, 0.0, 2.5, 1.3, 2.8, 2.1, 2.8, 2.0, 1.0, 0.7),
         (2.9, 0.0, 2.0, 0.8, 1.4, 2.2, 1.9, 2.6, 1.0, 0.7)],
        ["placental pericyte",
         "stromal cell",
         "pericyte cell",
         "skin fibroblast",
         "hematopoietic cell",
         "stromal cell of ovary",
         "calvarial osteoblast",
         "osteoblast",
         "astrocyte"]),
    "random_100_250": (random_100_250_profiles, random_100_250_names),
    "cell_type": (cell_type_profiles, cell_type_names)
}

# testing functions
def test_case_newick(name, linkage, dist):
    profiles, names = datasets[name]
    tree = cluster_bottom_up(profiles, names, linkage=linkage, distance=dist)
    tree.sort_descendants()
    return tree.write(format=1)    

def test_case(name, linkage, distance, correct_newick):
    output_newick = test_case_newick(name, linkage, distance)
    if output_newick != correct_newick:
        assert False, "Failed test\n Output: %s\nCorrect: %s" % (output_newick, correct_newick)
    else:
        print("SUCCESS: test passed")

In [7]:
# test pair_single_euclidean (4 points)
test_case("pair", "single", euclidean_distance, "(A:5,B:5);")

SUCCESS: test passed


In [8]:
# test pair_single_manhattan (4 points)
test_case("pair", "single", manhattan_distance, "(A:7,B:7);")

SUCCESS: test passed


In [9]:
# test triple_single_manhattan (3 points)
test_case("triple", "single", manhattan_distance, "((A:3.5,C:3.5):3.5,B:7);")

SUCCESS: test passed


In [10]:
# test triple_complete_manhattan (3 points)
test_case("triple", "complete", manhattan_distance, "((A:3.5,C:3.5):4,B:7.5);")

SUCCESS: test passed


In [11]:
# test triple_average_manhattan (4 points)
test_case("triple", "average", manhattan_distance, "((A:3.5,C:3.5):3.75,B:7.25);")

SUCCESS: test passed


In [12]:
# test quintet_single_manhattan (3 points)
test_case("quintet", "single", manhattan_distance, "(A:3,(((B:1,C:1):0.5,D:1.5):0.5,E:2):1);")

SUCCESS: test passed


In [13]:
# test quintet_complete_manhattan (3 points)
test_case("quintet", "complete", manhattan_distance, "((A:4,(B:1,C:1):3):3.5,(D:2,E:2):5.5);")

SUCCESS: test passed


In [14]:
# test quintet_average_manhattan (4 points)
test_case("quintet", "average", manhattan_distance, "(A:5,((B:1,C:1):2,(D:2,E:2):1):2);")

SUCCESS: test passed


In [15]:
# test tiebreaker_single_manhattan (5 points)
test_case("tiebreaker", "single", manhattan_distance, "(((A:0.5,B:0.5):0,(C:0.5,D:0.5):0):0,(E:0.5,F:0.5):0);")

SUCCESS: test passed


In [16]:
# test cell_type_sample_average_euclidean (5 points)
test_case("cell_type_sample", "average", euclidean_distance, 
          "((astrocyte:0.936623,((pericyte cell:0.377492,skin fibroblast:0.377492):0.3967,placental pericyte:0.774191):0.162432):0.172292,(((calvarial osteoblast:0.563471,stromal cell of ovary:0.563471):0.223098,(hematopoietic cell:0.504975,stromal cell:0.504975):0.281594):0.24476,osteoblast:1.03133):0.0775861);")

SUCCESS: test passed


In [17]:
# test cell_type_sample_single_chebyshev (5 points)
test_case("cell_type_sample", 
          "single", 
          lambda p1, p2: max(abs(e1 - e2) for e1, e2 in zip(p1, p2)), # Chebyshev distance
          '((astrocyte:0.55,(((calvarial osteoblast:0.3,stromal cell of ovary:0.3):0.1,(hematopoietic cell:0.4,stromal cell:0.4):0):0.1,((pericyte cell:0.25,skin fibroblast:0.25):0.25,placental pericyte:0.5):0):0.05):0.1,osteoblast:0.65);')

SUCCESS: test passed


In [18]:
# test random_100_250_runtime (10 points)
import timeit
random_100_250_newick = "((((((((P0:18,P75:18):5,(P178:18.5,P246:18.5):4.5):5,((P118:17,P33:17):7.5,((P125:19.5,P18:19.5):1,P235:20.5):4):3.5):2,(((P13:19,P133:19):3,(P162:17.5,P4:17.5):4.5):4,((P154:20,P93:20):3,(P73:19.5,P85:19.5):3.5):3):4):1.5,((((P10:18.5,P28:18.5):6,(P117:19,P37:19):5.5):3.5,(((P115:18.5,P48:18.5):4,(P138:17.5,P160:17.5):5):3,(P186:19.5,(P249:19,P47:19):0.5):6):2.5):2,((((P101:17.5,P137:17.5):6.5,(P108:17.5,P89:17.5):6.5):2,((P155:18.5,P54:18.5):3,P165:21.5):4.5):3,(((P119:20,P70:20):3.5,(P236:15,P61:15):8.5):3,((P187:18.5,P36:18.5):5,(P27:18,P80:18):5.5):3):2.5):1):1.5):1,(((((P100:19.5,P157:19.5):3,(P145:18.5,P19:18.5):4):3,((P179:17.5,P42:17.5):5,(P245:17,P52:17):5.5):3):3.5,(((P14:18.5,P144:18.5):5,((P17:20,P202:20):1,P46:21):2.5):3.5,((P159:18,P22:18):5.5,(P244:22,P51:22):1.5):3.5):2):2,((((P111:16,P128:16):7,(P20:16,P229:16):7):1,((P114:18,P168:18):3,P205:21):3):5.5,((((P123:14.5,P161:14.5):8.5,(P230:18,P88:18):5):4.5,((P129:19,P32:19):3.5,(P53:16.5,P6:16.5):6):5):1,(((P140:19.5,P141:19.5):1,P234:20.5):4.5,(P30:20,P49:20):5):3.5):1):1.5):1.5):1.5,((((((P1:19.5,P237:19.5):3,(P189:19.5,P228:19.5):3):4,((P122:18.5,P217:18.5):3,(P57:20.5,P84:20.5):1):5):2,((((P2:17,P226:17):1.5,P69:18.5):4.5,(P240:17.5,P241:17.5):5.5):3.5,(P81:20.5,(P90:18,P99:18):2.5):6):2):2,(((P105:17,P181:17):5.5,(P150:17,P77:17):5.5):4.5,((P188:18,P225:18):7,(P201:18.5,P95:18.5):6.5):2):3.5):1,(((((P11:19,P142:19):3.5,(P210:18.5,P34:18.5):4):3,(P164:16.5,P86:16.5):9):2,(((P158:20,P180:20):2,P182:22):2,(P247:18.5,P44:18.5):5.5):3.5):2,(((P120:19.5,P16:19.5):5.5,((P127:17.5,P64:17.5):2,P167:19.5):5.5):2.5,(((P131:18,P166:18):4,(P215:20.5,P222:20.5):1.5):2,(P172:19,P183:19):5):3.5):2):2):2.5):1,(((((((P102:18.5,P191:18.5):3,(P132:17.5,P63:17.5):4):2.5,(P106:19.5,(P194:16,P72:16):3.5):4.5):4,(((P112:17.5,P24:17.5):6,(P151:17.5,P71:17.5):6):0.5,(P146:19.5,P232:19.5):4.5):4):2,((((P110:18.5,P134:18.5):5.5,((P15:17.5,P203:17.5):4.5,(P238:17.5,P40:17.5):4.5):2):2.5,((P163:17.5,
P198:17.5):5.5,(P169:19,P26:19):4):3.5):2.5,(((P116:18,P25:18):4.5,(P62:18.5,P83:18.5):4):3,((P121:21,P45:21):3,(P184:18.5,P39:18.5):5.5):1.5):3.5):1):2.5,(((((P103:17,P218:17):4.5,(P200:18,P56:18):3.5):5,(((P147:18.5,P176:18.5):5,(P177:18,P66:18):5.5):2,((P209:19,P219:19):1,P223:20):5.5):1):2.5,((((P124:18.5,P96:18.5):1,P21:19.5):5,(P185:19,P211:19):5.5):2.5,((P152:18.5,P204:18.5):4,(P29:16,P60:16):6.5):4.5):2):1.5,((((P107:21,(P5:16.5,P67:16.5):4.5):5.5,((P199:18,P231:18):5.5,(P92:18.5,P94:18.5):5):3):1,((P213:19,P68:19):5,(P227:20.5,(P248:18,P3:18):2.5):3.5):3.5):1.5,(((P149:19.5,P197:19.5):2.5,(P192:19,P91:19):3):3,(P156:21,(P221:19,P74:19):2):4):4):1.5):2):1,((((((P104:18,P78:18):5,(P113:19.5,P87:19.5):3.5):3.5,((P208:18,P239:18):5.5,(P58:20,P98:20):3.5):3):2.5,(((P171:17,P243:17):5.5,(P216:17,P23:17):5.5):4,((P242:18,P31:18):4,(P59:18.5,P76:18.5):3.5):4.5):2.5):1.5,((((P12:17.5,P38:17.5):4.5,P65:22):2.5,((P135:20.5,P8:20.5):2,(P173:17,P174:17):5.5):2):4,(((P170:16,P206:16):6,(P233:18,P82:18):4):3.5,((P207:15,P50:15):6.5,P43:21.5):4):3):2):0.5,(((((P109:18.5,P136:18.5):5,(P143:18.5,P212:18.5):5):2,(P153:19.5,P214:19.5):6):1.5,((P126:19.5,(P130:18.5,P195:18.5):1):6,((P190:18.5,P9:18.5):5,(P193:18,P220:18):5.5):2):1.5):2,(((P139:20.5,P224:20.5):4,((P175:15.5,P35:15.5):5.5,P55:21):3.5):3.5,((P148:17.5,P196:17.5):7.5,((P41:18,P7:18):5.5,(P79:19,P97:19):4.5):1.5):3):1):2):2.5):1.5);"
test_statement = 'test_case("random_100_250", "complete", manhattan_distance, random_100_250_newick)'
random_100_250_runtime = timeit.timeit(test_statement, number=1, globals=globals())
assert random_100_250_runtime < 12, "your cluster_bottom_up implementation is too inefficient"
print("SUCCESS: random_100_250_runtime test passed")

SUCCESS: test passed
SUCCESS: random_100_250_runtime test passed


In [19]:
# hidden test 1
### BEGIN HIDDEN TESTS
cell_type_single_manhattan_newick = "(((((((((((((((((((((((((CD4-positive_ alpha-beta memory T cell:137.472,T-helper 17 cell:137.472):10.1451,activated CD4-positive_ alpha-beta T cell_ human:147.618):9.10096,memory T cell:156.718):14.1278,T follicular helper cell:170.846):18.0147,naive thymus-derived CD8-positive_ alpha-beta T cell:188.861):10.1172,innate effector T cell:198.978):10.2183,CD8-positive_ alpha-beta memory T cell:209.196):11.5892,(central memory CD8-positive_ alpha-beta T cell:146.053,(effector memory CD8-positive_ alpha-beta T cell:143.643,effector memory CD8-positive_ alpha-beta T cell_ terminally differentiated:143.643):2.40963):74.7329):20.5043,((((IgD-negative memory B cell:55.7833,IgG memory B cell:55.7833):124.669,memory B cell:180.453):13.9186,germinal center B cell:194.371):25.1883,plasmablast:219.56):21.7304):6.13775,plasma cell:247.428):11.3992,((((((common myeloid progenitor:127.742,hematopoietic multipotent progenitor cell:127.742):23.28,cord blood hematopoietic stem cell:151.022):43.7485,hematopoietic stem cell:194.77):14.7283,early lymphoid progenitor:209.498):21.2795,(myeloid lineage restricted progenitor cell:228.223,precursor B cell:228.223):2.55474):3.90671,(erythroid progenitor cell:226.19,megakaryocyte:226.19):8.49467):24.1423):2.79837,bone marrow cell:261.625):9.1234,plasmacytoid dendritic cell:270.749):1.33912,(((((alveolar macrophage:170.972,conidium:170.972):25.8514,myofibroblast cell:196.823):22.9957,(macrophage dendritic cell progenitor:205.448,pleural macrophage:205.448):14.3705):26.9025,myeloid dendritic cell:246.721):7.27309,pre-conventional dendritic cell:253.994):18.0935):1.2995,(common myeloid progenitor_ CD34-positive:251.79,(metamyelocyte:153.433,myelocyte:153.433):98.3573):21.5972):0.789438,lung secretory cell:274.177):3.05435,central nervous system macrophage:277.231):22.0074,((((((((((androgen binding protein secreting cell:236.877,(((((((astrocyte:169.82,(((((calvarial 
osteoblast:132.736,osteoblast:132.736):21.1437,((placental pericyte:143.718,stromal cell:143.718):9.17863,stromal cell of ovary:152.897):0.982707):2.68115,((embryonic cell:136.082,multinucleate cell:136.082):11.0144,muscle precursor cell:147.096):9.4645):1.22736,extracellular matrix secreting cell:157.788):10.3036,(pericyte cell:151.627,skin fibroblast:151.627):16.4644):1.72806):22.0251,kidney glomerular epithelial cell:191.845):4.44154,hematopoietic cell:196.286):10.6672,vascular associated smooth muscle cell:206.954):13.0746,((keratin accumulating cell:180.419,(mammary gland epithelial cell:148.46,myoepithelial cell:148.46):31.9591):25.7866,neuron associated cell _sensu Vertebrata_:206.206):13.8226):8.74591,kidney tubule cell:228.774):0.542252,epithelial cell of amnion:229.316):7.5608):2.76145,((endothelial cell of vascular tree:209.25,retinal blood vessel endothelial cell:209.25):22.7083,(endothelial cell:231.835,mesodermal cell:231.835):0.123168):7.68025):5.01196,trophoblast cell:244.651):5.81434,corneal endothelial cell:250.465):6.89159,brain microvascular endothelial cell:257.356):3.76148,(spermatid:138.414,spermatocyte:138.414):122.704):0.379572,pancreatic ductal cell:261.498):13.5956,fibroblast:275.093):8.85462,radial glial cell:283.948):1.53853,trophectodermal cell:285.486):13.7522):0.31133,(cumulus cell:237.874,granulosa cell:237.874):61.676):0.505705,(epithelial melanocyte:72.0404,melanocyte:72.0404):228.015):7.67266,(((((((CD8-positive_ alpha-beta thymocyte:4.22071,male germ cell:4.22071):106.454,electrically signaling cell:110.675):94.9289,glial cell:205.604):8.76218,((pancreatic D cell:190.679,pancreatic PP cell:190.679):12.4425,type A enterocrine cell:203.122):11.2444):30.5151,primordial germ cell:244.881):1.64413,(erythroid lineage cell:120.721,(orthochromatic erythroblast:90.2559,polychromatophilic erythroblast:90.2559):30.4647):125.805):37.6513,acinar cell:284.177):23.5516):9.54593,leukocyte:317.274):2.37291,respiratory epithelial 
cell:319.647):6.71374,anucleate cell:326.361):3.51559,metabolising cell:329.876);"
assert test_case_newick("cell_type", "single", manhattan_distance) == cell_type_single_manhattan_newick
### END HIDDEN TESTS

In [20]:
# hidden test 2
### BEGIN HIDDEN TESTS
cell_type_complete_manhattan_newick = "((((((((((((CD4-positive_ alpha-beta memory T cell:137.472,T-helper 17 cell:137.472):13.9488,activated CD4-positive_ alpha-beta T cell_ human:151.421):31.6023,memory T cell:183.024):14.3034,T follicular helper cell:197.327):128.142,(myeloid lineage restricted progenitor cell:228.223,precursor B cell:228.223):97.2457):45.9897,(bone marrow cell:285.651,(plasma cell:247.428,plasmablast:247.428):38.2231):85.8077):40.9352,(((((IgD-negative memory B cell:55.7833,IgG memory B cell:55.7833):136.733,memory B cell:192.517):46.3303,germinal center B cell:238.847):112.034,((myeloid dendritic cell:253.994,pre-conventional dendritic cell:253.994):42.3301,plasmacytoid dendritic cell:296.324):54.5567):30.67,(common myeloid progenitor_ CD34-positive:252.267,(metamyelocyte:153.433,myelocyte:153.433):98.8338):129.284):30.8427):23.6506,(((CD8-positive_ alpha-beta memory T cell:258.802,(innate effector T cell:198.978,naive thymus-derived CD8-positive_ alpha-beta T cell:198.978):59.8242):24.5277,(central memory CD8-positive_ alpha-beta T cell:164.93,(effector memory CD8-positive_ alpha-beta T cell:143.643,effector memory CD8-positive_ alpha-beta T cell_ terminally differentiated:143.643):21.287):118.4):102.734,((((common myeloid progenitor:127.742,hematopoietic multipotent progenitor cell:127.742):49.4948,cord blood hematopoietic stem cell:177.236):34.6848,hematopoietic stem cell:211.921):56.444,early lymphoid progenitor:268.365):117.699):49.9801):41.7707,(anucleate cell:365.51,(erythroid progenitor cell:226.19,megakaryocyte:226.19):139.321):112.305):109.675,(((((alveolar macrophage:170.972,conidium:170.972):28.0173,myofibroblast cell:198.989):74.7883,(macrophage dendritic cell progenitor:205.448,pleural macrophage:205.448):68.329):17.9095,central nervous system macrophage:291.687):201.563,(lung secretory cell:334.536,respiratory epithelial cell:334.536):158.713):94.2405):206.761,(((((((androgen binding protein secreting 
cell:276.151,(((((astrocyte:184.934,(extracellular matrix secreting cell:157.788,muscle precursor cell:157.788):27.146):4.22292,((calvarial osteoblast:132.736,osteoblast:132.736):33.0982,stromal cell of ovary:165.834):23.323):29.1306,hematopoietic cell:218.288):8.59538,((pericyte cell:151.627,skin fibroblast:151.627):41.8148,(placental pericyte:143.718,stromal cell:143.718):49.7239):33.4409):15.2868,(embryonic cell:136.082,multinucleate cell:136.082):106.088):33.9812):21.188,(trophoblast cell:253.392,vascular associated smooth muscle cell:253.392):43.9468):22.0959,(endothelial cell of vascular tree:209.25,retinal blood vessel endothelial cell:209.25):110.185):49.1504,((corneal endothelial cell:291.769,(endothelial cell:231.835,mesodermal cell:231.835):59.9336):48.5286,(epithelial melanocyte:72.0404,melanocyte:72.0404):268.257):28.2879):12.443,(((brain microvascular endothelial cell:261.498,pancreatic ductal cell:261.498):35.0921,(kidney glomerular epithelial cell:228.774,kidney tubule cell:228.774):67.8155):27.3405,(epithelial cell of amnion:261.803,((keratin accumulating cell:188.968,(mammary gland epithelial cell:148.46,myoepithelial cell:148.46):40.508):42.7183,neuron associated cell _sensu Vertebrata_:231.686):30.1172):62.1267):57.0981):41.5606,(cumulus cell:237.874,granulosa cell:237.874):184.715):109.129,(fibroblast:402.21,(radial glial cell:285.082,(spermatid:138.414,spermatocyte:138.414):146.668):117.128):129.508):262.534):46.6926,((((((((CD8-positive_ alpha-beta thymocyte:4.22071,male germ cell:4.22071):107.477,electrically signaling cell:111.697):102.761,glial cell:214.459):86.3794,((pancreatic D cell:190.679,pancreatic PP cell:190.679):41.0134,type A enterocrine cell:231.693):69.1455):63.7961,(primordial germ cell:307.728,trophectodermal cell:307.728):56.9061):38.8572,(erythroid lineage cell:146.737,(orthochromatic erythroblast:90.2559,polychromatophilic erythroblast:90.2559):56.4813):256.754):59.6671,(acinar cell:355.469,metabolising 
cell:355.469):107.69):82.4069,leukocyte:545.565):295.378);"
assert test_case_newick("cell_type", "complete", manhattan_distance) == cell_type_complete_manhattan_newick
### END HIDDEN TESTS

In [21]:
# hidden test 3
### BEGIN HIDDEN TESTS
cell_type_average_manhattan_newick = "(((((((((((CD4-positive_ alpha-beta memory T cell:137.472,T-helper 17 cell:137.472):12.0469,activated CD4-positive_ alpha-beta T cell_ human:149.519):18.3877,memory T cell:167.907):16.622,T follicular helper cell:184.529):92.4626,((CD8-positive_ alpha-beta memory T cell:233.999,(innate effector T cell:198.978,naive thymus-derived CD8-positive_ alpha-beta T cell:198.978):35.0212):24.948,(central memory CD8-positive_ alpha-beta T cell:155.491,(effector memory CD8-positive_ alpha-beta T cell:143.643,effector memory CD8-positive_ alpha-beta T cell_ terminally differentiated:143.643):11.8483):103.456):18.0444):47.2591,((((IgD-negative memory B cell:55.7833,IgG memory B cell:55.7833):130.701,memory B cell:186.485):34.558,germinal center B cell:221.043):77.3083,(bone marrow cell:273.638,(plasma cell:247.428,plasmablast:247.428):26.2103):24.7128):25.9):28.8571,(((((((common myeloid progenitor:127.742,hematopoietic multipotent progenitor cell:127.742):36.3874,cord blood hematopoietic stem cell:164.129):37.6669,hematopoietic stem cell:201.796):37.2145,early lymphoid progenitor:239.01):47.1529,(myeloid lineage restricted progenitor cell:228.223,precursor B cell:228.223):57.9402):32.9163,(erythroid progenitor cell:226.19,megakaryocyte:226.19):92.8898):23.7502,(common myeloid progenitor_ CD34-positive:252.028,(metamyelocyte:153.433,myelocyte:153.433):98.5955):90.8016):10.2781):34.7592,(((((alveolar macrophage:170.972,conidium:170.972):26.9344,myofibroblast cell:197.906):47.1271,(macrophage dendritic cell progenitor:205.448,pleural macrophage:205.448):39.5849):41.3201,central nervous system macrophage:286.353):27.5734,((myeloid dendritic cell:253.994,pre-conventional dendritic cell:253.994):36.933,plasmacytoid dendritic cell:290.927):22.9992):73.9406):15.7989,anucleate cell:403.666):109.193,((((((((CD8-positive_ alpha-beta thymocyte:4.22071,male germ cell:4.22071):106.965,electrically signaling cell:111.186):97.4825,glial cell:208.669):55.961,((pancreatic D cell:190.679,pancreatic PP cell:190.679):26.7279,type A enterocrine cell:217.407):47.2225):29.325,primordial germ cell:293.955):46.336,(erythroid lineage cell:133.729,(orthochromatic erythroblast:90.2559,polychromatophilic erythroblast:90.2559):43.473):206.562):54.8939,(acinar cell:355.469,metabolising cell:355.469):39.7161):47.7564,leukocyte:442.941):69.9182):93.1689,(((((((((androgen binding protein secreting cell:256.054,((((((astrocyte:182.179,(placental pericyte:143.718,stromal cell:143.718):38.4604):5.98971,(pericyte cell:151.627,skin fibroblast:151.627):36.5411):3.16962,(((calvarial osteoblast:132.736,osteoblast:132.736):27.121,stromal cell of ovary:159.857):16.9522,(extracellular matrix secreting cell:157.788,muscle precursor cell:157.788):19.021):14.529):9.66197,(embryonic cell:136.082,multinucleate cell:136.082):64.9181):10.8007,hematopoietic cell:211.801):22.8824,vascular associated smooth muscle cell:234.683):21.3711):20.5301,((kidney glomerular epithelial cell:228.774,kidney tubule cell:228.774):23.9593,trophoblast cell:252.733):23.8509):18.2197,corneal endothelial cell:294.804):4.89375,((endothelial cell of vascular tree:209.25,retinal blood vessel endothelial cell:209.25):65.5921,(endothelial cell:231.835,mesodermal cell:231.835):43.0069):24.8556):17.361,((brain microvascular endothelial cell:261.498,pancreatic ductal cell:261.498):26.3857,(epithelial cell of amnion:246.341,((keratin accumulating cell:184.693,(mammary gland epithelial cell:148.46,myoepithelial cell:148.46):36.2335):37.5143,neuron associated cell _sensu Vertebrata_:222.208):24.133):41.5426):29.1755):21.4903,(epithelial melanocyte:72.0404,melanocyte:72.0404):266.509):15.055,fibroblast:353.604):5.82409,(cumulus cell:237.874,granulosa cell:237.874):121.554):88.725,((lung secretory cell:334.536,respiratory epithelial cell:334.536):89.181,((radial glial cell:284.515,(spermatid:138.414,spermatocyte:138.414):146.101):43.118,trophectodermal cell:327.633):96.0844):24.4357):157.875);"
assert test_case_newick("cell_type", "average", manhattan_distance) == cell_type_average_manhattan_newick
### END HIDDEN TESTS

In [22]:
# hidden test 4
### BEGIN HIDDEN TESTS
cell_type_single_euclidean_newick = "(((((((((((((((((((((((((((CD4-positive_ alpha-beta memory T cell:6.8455,T-helper 17 cell:6.8455):0.566247,activated CD4-positive_ alpha-beta T cell_ human:7.41175):0.154683,memory T cell:7.56643):0.718436,naive thymus-derived CD8-positive_ alpha-beta T cell:8.28487):0.197892,T follicular helper cell:8.48276):0.814048,innate effector T cell:9.29681):1.03481,CD8-positive_ alpha-beta memory T cell:10.3316):0.108511,(central memory CD8-positive_ alpha-beta T cell:6.81632,(effector memory CD8-positive_ alpha-beta T cell:6.4802,effector memory CD8-positive_ alpha-beta T cell_ terminally differentiated:6.4802):0.336116):3.62381):0.942942,(((((IgD-negative memory B cell:3.0867,IgG memory B cell:3.0867):5.80826,plasmablast:8.89496):0.0527073,memory B cell:8.94767):0.663753,germinal center B cell:9.61142):0.381608,plasma cell:9.99303):1.39004):0.168946,bone marrow cell:11.552):0.0834478,(((((((common myeloid progenitor:5.38802,hematopoietic multipotent progenitor cell:5.38802):1.40673,cord blood hematopoietic stem cell:6.79474):1.59837,hematopoietic stem cell:8.39311):1.51569,early lymphoid progenitor:9.9088):0.277642,myeloid lineage restricted progenitor cell:10.1864):0.00826621,precursor B cell:10.1947):0.509947,(erythroid progenitor cell:10.104,megakaryocyte:10.104):0.600641):0.930809):0.40644,(((((alveolar macrophage:8.24664,conidium:8.24664):0.738597,myofibroblast cell:8.98524):1.15653,(macrophage dendritic cell progenitor:9.19201,pleural macrophage:9.19201):0.949762):0.808449,(myeloid dendritic cell:10.5609,pre-conventional dendritic cell:10.5609):0.389329):1.03081,lung secretory cell:11.981):0.0608731):0.281324,(((((((((((((((androgen binding protein secreting cell:9.58706,((((astrocyte:7.96197,((((calvarial osteoblast:6.57975,osteoblast:6.57975):1.08919,stromal cell of ovary:7.66894):0.0310536,(placental pericyte:6.96773,stromal cell:6.96773):0.732265):0.0361256,(((embryonic cell:6.355,multinucleate cell:6.355):0.532706,muscle precursor cell:6.8877):0.602027,extracellular matrix secreting cell:7.48973):0.246387):0.225855):0.192617,(pericyte cell:7.45064,skin fibroblast:7.45064):0.703952):0.658667,hematopoietic cell:8.81326):0.605937,kidney glomerular epithelial cell:9.41919):0.167866):0.306236,vascular associated smooth muscle cell:9.8933):0.675397,kidney tubule cell:10.5687):0.137824,((keratin accumulating cell:8.52016,(mammary gland epithelial cell:6.81773,myoepithelial cell:6.81773):1.70243):1.63861,neuron associated cell _sensu Vertebrata_:10.1588):0.54775):0.257138,epithelial cell of amnion:10.9637):0.0604816,(spermatid:5.44104,spermatocyte:5.44104):5.5831):0.102442,((endothelial cell of vascular tree:9.81061,retinal blood vessel endothelial cell:9.81061):1.02183,(endothelial cell:10.2956,mesodermal cell:10.2956):0.536874):0.294141):0.182273,corneal endothelial cell:11.3089):0.229379,trophoblast cell:11.5382):0.310105,radial glial cell:11.8483):0.0362006,brain microvascular endothelial cell:11.8845):0.047348,trophectodermal cell:11.9319):0.174104,pancreatic ductal cell:12.106):0.0998152,fibroblast:12.2058):0.0671628,(cumulus cell:10.5797,granulosa cell:10.5797):1.69322):0.0502617):0.0262937,central nervous system macrophage:12.3495):0.135404,plasmacytoid dendritic cell:12.4849):0.377832,(common myeloid progenitor_ CD34-positive:11.7005,(metamyelocyte:7.25891,myelocyte:7.25891):4.44155):1.1623):0.0816443,respiratory epithelial cell:12.9444):0.634632,(erythroid lineage cell:6.3294,(orthochromatic erythroblast:5.0486,polychromatophilic erythroblast:5.0486):1.28079):7.24964):0.0673064,(epithelial melanocyte:3.7254,melanocyte:3.7254):9.92095):0.12819,anucleate cell:13.7745):0.309553,primordial germ cell:14.0841):0.58375,metabolising cell:14.6678):0.495409,((CD8-positive_ alpha-beta thymocyte:3.07079,male germ cell:3.07079):9.21628,electrically signaling cell:12.2871):2.87617):0.229809,glial cell:15.3931):0.0810031,((pancreatic D cell:12.3885,pancreatic PP cell:12.3885):0.73155,type A enterocrine cell:13.12):2.35403):0.332348,acinar cell:15.8064):3.89529,leukocyte:19.7017);"
assert test_case_newick("cell_type", "single", euclidean_distance) == cell_type_single_euclidean_newick
### END HIDDEN TESTS

In [23]:
# hidden test 5
### BEGIN HIDDEN TESTS
cell_type_complete_euclidean_newick = "((((((((((((CD4-positive_ alpha-beta memory T cell:6.8455,T-helper 17 cell:6.8455):0.777379,activated CD4-positive_ alpha-beta T cell_ human:7.62288):1.69242,T follicular helper cell:9.3153):4.99633,(myeloid lineage restricted progenitor cell:10.1947,precursor B cell:10.1947):4.11693):1.41307,((CD8-positive_ alpha-beta memory T cell:10.3316,innate effector T cell:10.3316):2.49159,((central memory CD8-positive_ alpha-beta T cell:6.86562,(effector memory CD8-positive_ alpha-beta T cell:6.4802,effector memory CD8-positive_ alpha-beta T cell_ terminally differentiated:6.4802):0.385418):5.01591,(memory T cell:8.28487,naive thymus-derived CD8-positive_ alpha-beta T cell:8.28487):3.59666):0.941681):2.9015):1.76497,(common myeloid progenitor_ CD34-positive:11.7212,(metamyelocyte:7.25891,myelocyte:7.25891):4.46225):5.76852):0.548583,((((IgD-negative memory B cell:3.0867,IgG memory B cell:3.0867):6.18127,memory B cell:9.26797):2.26738,germinal center B cell:11.5354):3.88124,(bone marrow cell:11.7421,(plasma cell:9.99303,plasmablast:9.99303):1.7491):3.67447):2.62167):1.61223,((anucleate cell:15.3044,(erythroid progenitor cell:10.104,megakaryocyte:10.104):5.20035):1.83386,((((common myeloid progenitor:5.38802,hematopoietic multipotent progenitor cell:5.38802):2.47081,cord blood hematopoietic stem cell:7.85883):1.52692,hematopoietic stem cell:9.38575):2.67485,early lymphoid progenitor:12.0606):5.07762):2.51227):1.24235,(((((alveolar macrophage:8.24664,conidium:8.24664):1.05394,myofibroblast cell:9.30058):2.99074,(macrophage dendritic cell progenitor:9.19201,pleural macrophage:9.19201):3.0993):0.819879,central nervous system macrophage:13.1112):3.05899,((myeloid dendritic cell:10.5609,pre-conventional dendritic cell:10.5609):3.16509,plasmacytoid dendritic cell:13.726):2.4442):4.72266):3.16492,leukocyte:24.0578):3.35201,(((((((CD8-positive_ alpha-beta thymocyte:3.07079,male germ cell:3.07079):9.39063,electrically signaling cell:12.4614):3.36767,glial cell:15.8291):1.38244,(primordial germ cell:14.0841,trophectodermal cell:14.0841):3.12745):0.833289,((pancreatic D cell:12.3885,pancreatic PP cell:12.3885):1.76951,type A enterocrine cell:14.158):3.88683):1.31903,(erythroid lineage cell:7.53435,(orthochromatic erythroblast:5.0486,polychromatophilic erythroblast:5.0486):2.48574):11.8295):1.83624,(acinar cell:16.8891,metabolising cell:16.8891):4.311):6.20968):3.43327,((((((androgen binding protein secreting cell:11.0186,(embryonic cell:6.355,multinucleate cell:6.355):4.66359):0.335263,((((astrocyte:8.53306,(extracellular matrix secreting cell:7.48973,muscle precursor cell:7.48973):1.04333):0.454166,((calvarial osteoblast:6.57975,osteoblast:6.57975):1.29685,stromal cell of ovary:7.87659):1.11063):1.03544,(placental pericyte:6.96773,stromal cell:6.96773):3.05493):0.718524,(hematopoietic cell:9.26108,(pericyte cell:7.45064,skin fibroblast:7.45064):1.81044):1.48011):0.612661):2.24408,(trophoblast cell:11.8485,vascular associated smooth muscle cell:11.8485):1.7494):1.18662,fibroblast:14.7845):3.75667,((((brain microvascular endothelial cell:12.106,pancreatic ductal cell:12.106):1.2708,(kidney glomerular epithelial cell:10.5687,kidney tubule cell:10.5687):2.80809):1.91713,(epithelial cell of amnion:12.0466,((keratin accumulating cell:8.68809,(mammary gland epithelial cell:6.81773,myoepithelial cell:6.81773):1.87036):2.87332,neuron associated cell _sensu Vertebrata_:11.5614):0.485154):3.24736):1.64806,(((corneal endothelial cell:12.8519,(endothelial cell:10.2956,mesodermal cell:10.2956):2.55632):2.33469,(cumulus cell:10.5797,granulosa cell:10.5797):4.60682):1.47386,((endothelial cell of vascular tree:9.81061,retinal blood vessel endothelial cell:9.81061):5.48226,(epithelial melanocyte:3.7254,melanocyte:3.7254):11.5675):1.36756):0.28154):1.59925):2.13024,((lung secretory cell:13.5487,respiratory epithelial cell:13.5487):4.67286,(radial glial cell:11.9943,(spermatid:5.44104,spermatocyte:5.44104):6.55321):6.22731):2.44989):10.1716);"
assert test_case_newick("cell_type", "complete", euclidean_distance) == cell_type_complete_euclidean_newick
### END HIDDEN TESTS

In [24]:
# hidden test 6
### BEGIN HIDDEN TESTS
cell_type_average_euclidean_newick = "((((((((((((CD4-positive_ alpha-beta memory T cell:6.8455,T-helper 17 cell:6.8455):0.671813,activated CD4-positive_ alpha-beta T cell_ human:7.51732):0.605128,memory T cell:8.12244):0.772844,T follicular helper cell:8.89529):3.42183,((CD8-positive_ alpha-beta memory T cell:11.2104,(innate effector T cell:9.29681,naive thymus-derived CD8-positive_ alpha-beta T cell:9.29681):1.91357):0.654243,(central memory CD8-positive_ alpha-beta T cell:6.84097,(effector memory CD8-positive_ alpha-beta T cell:6.4802,effector memory CD8-positive_ alpha-beta T cell_ terminally differentiated:6.4802):0.360767):5.02366):0.452498):2.46252,((((IgD-negative memory B cell:3.0867,IgG memory B cell:3.0867):6.02112,memory B cell:9.10782):1.60384,germinal center B cell:10.7117):2.19296,(bone marrow cell:11.6471,(plasma cell:9.99303,plasmablast:9.99303):1.65404):1.25754):1.87502):0.68485,(((((((common myeloid progenitor:5.38802,hematopoietic multipotent progenitor cell:5.38802):1.93877,cord blood hematopoietic stem cell:7.32679):1.42268,hematopoietic stem cell:8.74947):2.07749,early lymphoid progenitor:10.827):1.53082,(myeloid lineage restricted progenitor cell:10.1947,precursor B cell:10.1947):2.16307):1.21665,(erythroid progenitor cell:10.104,megakaryocyte:10.104):3.47041):1.48517,(common myeloid progenitor_ CD34-positive:11.7108,(metamyelocyte:7.25891,myelocyte:7.25891):4.4519):3.34879):0.404888):1.37321,anucleate cell:16.8377):0.275686,((((((alveolar macrophage:8.24664,conidium:8.24664):0.896266,myofibroblast cell:9.14291):2.03077,(macrophage dendritic cell progenitor:9.19201,pleural macrophage:9.19201):1.98167):1.72366,central nervous system macrophage:12.8973):1.09428,((myeloid dendritic cell:10.5609,pre-conventional dendritic cell:10.5609):2.54456,plasmacytoid dendritic cell:13.1055):0.88617):2.55422,(lung secretory cell:13.5487,respiratory epithelial cell:13.5487):2.99714):0.567538):3.68531,((((((CD8-positive_ alpha-beta thymocyte:3.07079,male germ cell:3.07079):9.30345,electrically signaling cell:12.3742):3.30165,glial cell:15.6759):0.440918,primordial germ cell:16.1168):1.72155,(acinar cell:16.4871,((pancreatic D cell:12.3885,pancreatic PP cell:12.3885):1.25053,type A enterocrine cell:13.639):2.84808):1.35128):0.459171,((erythroid lineage cell:6.93187,(orthochromatic erythroblast:5.0486,polychromatophilic erythroblast:5.0486):1.88327):10.2874,(metabolising cell:15.5595,((radial glial cell:11.9213,(spermatid:5.44104,spermatocyte:5.44104):6.48026):1.36445,trophectodermal cell:13.2857):2.27375):1.65983):1.07822):2.50116):1.15813,leukocyte:21.9568):1.52273,(((((((((androgen binding protein secreting cell:10.4557,(((((astrocyte:7.96197,extracellular matrix secreting cell:7.96197):0.463237,((calvarial osteoblast:6.57975,osteoblast:6.57975):1.19302,stromal cell of ovary:7.77277):0.652445):0.563882,((embryonic cell:6.355,multinucleate cell:6.355):1.05257,muscle precursor cell:7.40757):1.58152):0.415031,((pericyte cell:7.45064,skin fibroblast:7.45064):1.60711,(placental pericyte:6.96773,stromal cell:6.96773):2.09002):0.346379):0.0601773,hematopoietic cell:9.4643):0.991402):0.58141,vascular associated smooth muscle cell:11.0371):1.80972,((kidney glomerular epithelial cell:10.5687,kidney tubule cell:10.5687):1.30766,trophoblast cell:11.8764):0.970474):0.402289,(corneal endothelial cell:12.3772,(endothelial cell:10.2956,mesodermal cell:10.2956):2.0816):0.871951):0.348169,(endothelial cell of vascular tree:9.81061,retinal blood vessel endothelial cell:9.81061):3.78668):0.913836,fibroblast:14.5111):0.176318,((brain microvascular endothelial cell:12.106,pancreatic ductal cell:12.106):1.34705,(epithelial cell of amnion:11.644,((keratin accumulating cell:8.60412,(mammary gland epithelial cell:6.81773,myoepithelial cell:6.81773):1.7864):2.21699,neuron associated cell _sensu Vertebrata_:10.8211):0.822924):1.80901):1.2344):0.459555,(cumulus cell:10.5797,granulosa cell:10.5797):4.56725):0.304201,(epithelial melanocyte:3.7254,melanocyte:3.7254):11.7258):8.02835);"
assert test_case_newick("cell_type", "average", euclidean_distance) == cell_type_average_euclidean_newick
### END HIDDEN TESTS
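
The hidden tests above compare Newick strings by exact equality, so a failing `assert` does not say where the two trees diverge. The following is a minimal sketch of a debugging helper, not part of the homework's test harness: the name `first_newick_difference` and its `context` parameter are hypothetical, and it compares the strings as plain text rather than parsing them as trees.

```python
def first_newick_difference(expected, actual, context=30):
    """Return None if the strings match, else details on the first mismatch.

    This is a plain character-by-character comparison; it does not parse
    the Newick trees, so reordered but equivalent subtrees count as different.
    """
    if expected == actual:
        return None
    limit = min(len(expected), len(actual))
    # Index of the first differing character, or the shorter length
    # when one string is a prefix of the other.
    index = next((i for i in range(limit) if expected[i] != actual[i]), limit)
    start = max(0, index - context)
    return {
        "index": index,
        "expected": expected[start:index + context],
        "actual": actual[start:index + context],
    }


# Small illustrative trees (made up for this example, not from the data set).
expected_tree = "((A:1,B:1):2,C:3);"
actual_tree = "((A:1,B:1):2.5,C:3);"
print(first_newick_difference(expected_tree, actual_tree))
```

Because branch lengths are embedded in the text, a mismatch reported deep inside the string usually points to a differing merge height rather than a differing topology.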