Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lineage creation error #47

Open
cmalinmayor opened this issue Jul 12, 2024 · 5 comments
Open

Lineage creation error #47

cmalinmayor opened this issue Jul 12, 2024 · 5 comments

Comments

@cmalinmayor
Copy link

Hi @georgeoshardo - during some random runs of our data generation script, we hit this error. I'm still not exactly sure what exp_node stands for - what do you think would cause this to happen? I can confirm that the node is actually not in the graph, based on the IDs that are printed, but I didn't paste it here because it is rather long.

  File "write_video.py", line 91, in get_lineage_graph
    lineage = Lineage(simulation)
  File "/groups/funke/home/malinmayorc/code/SyMBac/SyMBac/lineage.py", line 45, in __init__
    assert exp_node in self.temporal_lineage_graph.nodes, f"Node {exp_node} not in temporal lineage graph nodes {self.temporal_lineage_graph.nodes}"
@georgeoshardo
Copy link
Owner

georgeoshardo commented Jul 12, 2024

Hi Caroline. I know of this error and am planning to fix it in the branch I'm working on. I have made an incorrect assumption when building the lineage tree. A node should have at most two successors and one predecessor, but I am accidentally introducing grandmother and granddaughter connections (sometimes).

You have access to the dataframe with mother mask labels and time points. You might be able to reconstruct the graph properly compared to that very hastily thrown together loop, where the exception is thrown.

@georgeoshardo
Copy link
Owner

Actually my previous comment is partly wrong, it's not grandmother-granddaughter connections, but rather a connection between a daughter and a mother at the wrong time.

This is just adding every cell to the graph, with the node being (mask label, time), but crucially I add the SyMBac.cell.Cell as a node attribute:

        for cells in self.simulation.cell_timeseries:
            for cell in cells:
                self.temporal_lineage_graph.add_node((cell.mask_label, cell.t), cell = cell)

This is where I build the temporal connections, and is where where things start going wrong, exp_node is referring to the predecessor in time, we create a hypothetical node for a given cell's mask label, but in the previous time-step. If that node exists, we can add an edge between the node at t-1 and t.

        for node in self.temporal_lineage_graph.nodes:
            exp_node = (node[0], node[1]-1)
            if exp_node in self.temporal_lineage_graph.nodes:
                self.temporal_lineage_graph.add_edge(exp_node, node)

Here I'm trying to build the mother daughter connections, Iterating through all the nodes in the graph. I pull the SyMBac.cell.Cell object out of the node's attributes. If that cell has a mother, then I want to create a mother-daughter edge. I assume (incorrectly) that its mother node must be (cell.mother.mask_label, node[1]). I am not sure why I did not choose (cell.mother.mask_label, node[1]-1), but that still wouldn't fully fix the issue I think...

        for node in self.temporal_lineage_graph.nodes:
            cell = self.temporal_lineage_graph.nodes[node]["cell"]
            if cell.mother:
                exp_node = (cell.mother.mask_label, node[1])
                assert exp_node in self.temporal_lineage_graph.nodes
                self.temporal_lineage_graph.add_edge(exp_node, node)

I also discovered that I've broken this code in my current branch, but this shouldn't be an issue because I completely reworked the Cell attributes to be less bad, and make lineage construction easier (I realised my current scheme will not work well for 2D or 3D colonies, or for situations where the mother dies before the daughter due to lysis), but I appreciate you probably want a solution now, therefore if you have any pre-existing code to generate these graphs, I think the Lineage.all_cell_data_df should contain all the information you need?

@cmalinmayor
Copy link
Author

Thanks for the info. I was able to re-create the graph that I wanted (I think). Perhaps it might be best to remove the buggy code from this branch in the meantime, while you fix it in the new branch?

@georgeoshardo
Copy link
Owner

georgeoshardo commented Jul 15, 2024 via email

@cmalinmayor
Copy link
Author

cmalinmayor commented Jul 16, 2024

Sure, I essentially created the graph nodes from the generated masks (node_id = (mask_id, time) (https://github.com/cmalinmayor/SyMBac/blob/aaa9dd49c952fc63f2b50812227219bef689a4e7/scripts/write_video.py#L155). Then, I used the Lineage.family_tree_edgelist to make the edges (https://github.com/cmalinmayor/SyMBac/blob/aaa9dd49c952fc63f2b50812227219bef689a4e7/scripts/write_video.py#L60). Specifically, for each node in the graph at time>0, if the same mask_id was present in the previous frame, I linked (mask_id, time-1) -> (mask_id, time). If the same mask_id was not present in the previous frame, I looked in the family_tree_edgelist to find the parent_mask_id, and linked (parent_mask_id, time - 1) -> (mask_id, time). I then remapped all the node_ids (and masks) to unique integers throughout all time points, which is why the code is a bit more complex.

It's still not the most elegant, but generating the nodes in the graph from the masks rather than the simulation cells avoided issues caused by extra cells that were not visible in the mask but were in the simulation. This approach does break if I add lysis, with some parent cells (parent_mask_id, time - 1) not being in the graph. This shouldn't happen because lysed cells shouldn't come back in later frames or have their mask_ids reused, but I didn't have time to debug that yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants