## Interaction network template
Produces an interaction edgelist from a ```.myrmidon``` experiment file and saves it as a CSV, which can be analyzed further in the optional parts or exported to other software such as RStudio.  
This notebook walks through a sample usage of the following:
* the py-myrmidon library ([Documentation](https://formicidae-tracker.github.io/myrmidon/latest/))
* the facetnet library ([Documentation](https://c4science.ch/source/facet_unil/))


In [1]:
import py_fort_myrmidon as fm
import numpy as np  # Fundamental math library in python. Here used only for convenience: to save the csv.
from datetime import datetime, timedelta  # For convenient handling of time and date
import networkx as nx  # Optional: for general graph analysis and plotting
import facetnet  # Optional: for community analysis
import matplotlib.pyplot as plt  # Optional: for plotting
import pandas as pd

# Optional: makes plots interactive:
# %matplotlib widget

In [2]:
# Dino Col
f_myrmidon = "/media/ebiag/Ebi-9/Dinoponera_Col1/DinoponeraCol1.myrmidon"
t_start = datetime(2024, 7, 6, 0, 1).astimezone(
    tz=None
)  # <year, month, day, hour, minute>
t_end = datetime(2024, 7, 10, 23, 59).astimezone(tz=None)


exp = fm.Experiment.Open(f_myrmidon)

The following is an iterator that yields one fort-myrmidon time per day over a period. See the Ant metadata template for an explanation.

In [3]:
def fm_time_range(start_datetime, end_datetime):
    for n in range(int((end_datetime - start_datetime).days) + 1):
        yield fm.Time(start_datetime + timedelta(n))
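
To make the day stepping concrete, here is a minimal sketch that uses only plain `datetime` objects instead of `fm.Time`. The helper name `day_windows` is hypothetical and not part of py-myrmidon. It pairs each daily start with the 24 h end used later in the query loop, which shows that the last window can extend slightly past `t_end`.

```python
from datetime import datetime, timedelta


def day_windows(start_datetime, end_datetime):
    """Yield (begin, end) pairs, one 24 h window per day in the period."""
    n_days = (end_datetime - start_datetime).days + 1
    for n in range(n_days):
        begin = start_datetime + timedelta(days=n)
        yield begin, begin + timedelta(hours=24)


# Same dates as in the cell above, as naive datetimes for simplicity
windows = list(day_windows(datetime(2024, 7, 6, 0, 1), datetime(2024, 7, 10, 23, 59)))
print(len(windows))  # number of daily queries
print(windows[-1])   # last window; its end lies after 2024-07-10 23:59
```

If the extra two minutes matter, the end of each window could be clipped to `t_end` before querying.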

### Create and output edgelists

A matcher is used to filter interactions of a certain type ("body part 1 with body part 1") that were recorded during a user-defined period from ```t_start``` to ```t_end```. The ```for``` loop takes the ```ant_id``` of both individuals involved in each interaction and updates the edge weight in the count graph as well as in the interaction duration graph. The two edge lists are then saved to CSV with a name that consists of the experiment name and the dates. This naming only serves to avoid confusion; any string can serve as a file name.

In [4]:
m = fm.Matcher.InteractionType(1, 1)

G_counts = nx.Graph()
G_counts.add_nodes_from(exp.Ants.keys())
G_seconds = nx.Graph()
G_seconds.add_nodes_from(exp.Ants.keys())

for t_begin in fm_time_range(t_start, t_end):
    interactions = fm.Query.ComputeAntInteractions(
        exp, start=t_begin, end=t_begin.Add(fm.Duration.Parse("24h")), matcher=m
    )
    for ia in interactions[1]:
        if G_counts.has_edge(ia.IDs[0], ia.IDs[1]):
            G_counts[ia.IDs[0]][ia.IDs[1]]["weight"] += 1
        else:
            G_counts.add_edge(ia.IDs[0], ia.IDs[1], weight=1)

        if G_seconds.has_edge(ia.IDs[0], ia.IDs[1]):
            G_seconds[ia.IDs[0]][ia.IDs[1]]["weight"] += (ia.End - ia.Start).Seconds()
        else:
            G_seconds.add_edge(
                ia.IDs[0], ia.IDs[1], weight=(ia.End - ia.Start).Seconds()
            )

# f_edgelist_ct = "edgelist_interaction_counts_{}_{}_{}.csv".format(exp.Name, t_start, t_end)
# f_edgelist_sec = "edgelist_interaction_seconds_{}_{}_{}.csv".format(exp.Name, t_start, t_end)

# Alternate filenames to work with windows and for edgelists within a 24 hour period
day_exp_strt = t_start.strftime("%Y%m%d")
day_exp_end = t_end.strftime("%Y%m%d")
hr_strt = t_start.strftime("%H%M")
hr_end = t_end.strftime("%H%M")

f_edgelist_ct = "edgelist_interaction_counts_{}_{}-{}.csv".format(
    exp.Name, day_exp_strt, day_exp_end
)
f_edgelist_sec = "edgelist_interaction_seconds_{}_{}-{}.csv".format(
    exp.Name, day_exp_strt, day_exp_end
)

nx.write_edgelist(G_counts, f_edgelist_ct)
nx.write_edgelist(G_seconds, f_edgelist_sec)

Computing ant interactions: 100%|█▉| 1439/1440 [00:16<00:00, 89.40tracked min/s]
Computing ant interactions: 100%|█▉| 1439/1440 [00:19<00:00, 75.28tracked min/s]
Computing ant interactions: 100%|█▉| 1439/1440 [00:17<00:00, 82.41tracked min/s]
Computing ant interactions: 100%|█▉| 1439/1440 [00:17<00:00, 83.84tracked min/s]
Computing ant interactions: 100%|█▉| 1439/1440 [00:16<00:00, 85.15tracked min/s]
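
The accumulation logic of the loop can be tried out without an experiment file. The following is a minimal sketch in plain Python: the toy interaction tuples are made up, and dictionaries stand in for the two networkx graphs. Each pair of IDs is sorted so that both orderings map to the same undirected edge.

```python
from collections import defaultdict

# Hypothetical toy data: (ant_a, ant_b, start_second, end_second)
toy_interactions = [
    (1, 2, 0.0, 2.5),
    (2, 1, 10.0, 11.0),
    (1, 3, 4.0, 4.5),
]

counts = defaultdict(int)     # edge -> number of interactions
seconds = defaultdict(float)  # edge -> total interaction duration

for ant_a, ant_b, start, end in toy_interactions:
    edge = tuple(sorted((ant_a, ant_b)))  # (2, 1) and (1, 2) are the same edge
    counts[edge] += 1
    seconds[edge] += end - start

print(dict(counts))
print(dict(seconds))
```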


#### Remove outliers/misidentified ants

Often there are outliers caused by tags being misidentified in fort-studio. We can use the node degree to identify these ant IDs and remove them from the edgelist. The threshold used to decide whether an ant is misidentified depends on the duration of the experiment, or on the time range over which the edgelists are extracted, and needs to be adjusted accordingly.

In [None]:
# Code for removing outliers - ant tags that were misidentified and hence have very low collisions/interactions
# Get sorted list of degree per node to identify outliers to remove
degree_dict = {node: val for (node, val) in G_counts.degree()}
degree_dict_sort = {
    k: v for k, v in sorted(degree_dict.items(), key=lambda item: item[1])
}
remove = [node for node, degree in dict(G_counts.degree()).items() if degree < 50]
G_counts.remove_nodes_from(remove)
f_edgelist_ct = "edgelist_counts_removed_{}_{}_{}.csv".format(exp.Name, t_start, t_end)

nx.write_edgelist(G_counts, f_edgelist_ct)
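
Note that `G_counts.degree()` without a `weight` argument counts the number of distinct interaction partners, not the number of interactions. The sketch below, on made-up edge weights and with plain dictionaries instead of a graph, contrasts the two quantities. Whether the threshold should apply to partners or to summed interaction counts is a choice for the analysis, so the filter on summed weights is only an assumption about one possible intent.

```python
# Hypothetical toy edge weights: {(ant_a, ant_b): interaction count}
edge_counts = {(1, 2): 40, (1, 3): 35, (2, 3): 50, (3, 4): 1}

partners = {}  # unweighted degree: number of distinct partners
strength = {}  # weighted degree: summed interaction counts

for (ant_a, ant_b), weight in edge_counts.items():
    for node in (ant_a, ant_b):
        partners[node] = partners.get(node, 0) + 1
        strength[node] = strength.get(node, 0) + weight

threshold = 50  # illustrative value only; depends on the time range analyzed
low_strength = sorted(node for node, s in strength.items() if s < threshold)

print(partners)
print(strength)
print(low_strength)
```

In networkx the weighted variant corresponds to `G_counts.degree(weight="weight")`.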

#### Combining edgelists when data is corrupted

This works by first outputting multiple edgelists that do not overlap with the time range in which the data corruption happened. We then read in these edgelists and combine them.
If needed, we can skip writing the intermediate edgelists to CSV. In that case we create multiple graphs directly from the experimental data, combine them as below and output a single edgelist CSV.

In [None]:
# Code to combine edgelists. Only useful when there is data corruption leading to 2 or 3 edgelists that need to be extracted leaving out the corrupted data
import pandas as pd
import networkx as nx

G1_c = nx.read_edgelist(
    "edgelist_interaction_counts_Woundcare_cfell1_T2_2022-05-31 00:01:00+02:00_2022-06-02 23:59:00+02:00.csv",
    nodetype=int,
)
G2_c = nx.read_edgelist(
    "edgelist_interaction_counts_Woundcare_cfell1_T2_2022-06-04 06:01:00+02:00_2022-06-04 23:59:00+02:00.csv",
    nodetype=int,
)
G1_s = nx.read_edgelist(
    "edgelist_interaction_seconds_Woundcare_cfell1_T2_2022-05-31 00:01:00+02:00_2022-06-02 23:59:00+02:00.csv",
    nodetype=int,
)
G2_s = nx.read_edgelist(
    "edgelist_interaction_seconds_Woundcare_cfell1_T2_2022-06-04 06:01:00+02:00_2022-06-04 23:59:00+02:00.csv",
    nodetype=int,
)
n_G1 = list(G1_c)
n_G2 = list(G2_c)
missing_G1 = [
    item for item in n_G2 if item not in n_G1
]  # AntID 57 present in G2 and has only interacted with 26 twice and with no other ant
# Convert to pandas dataframes, and combine dataframes with summation of weights
pG1_c = nx.to_pandas_edgelist(G1_c)
pG2_c = nx.to_pandas_edgelist(G2_c)
pG_c = pd.concat([pG1_c, pG2_c]).groupby(["source", "target"]).sum().reset_index()
# Convert back to networkx graph
G_counts = nx.from_pandas_edgelist(pG_c, edge_attr=True)
# Removing antID 57 from the Graph as it is wrongly tagged in all likelihood
G_counts.remove_node(57)
# Convert to pandas dataframes, and combine dataframes with summation of weights
pG1_s = nx.to_pandas_edgelist(G1_s)
pG2_s = nx.to_pandas_edgelist(G2_s)
pG_s = pd.concat([pG1_s, pG2_s]).groupby(["source", "target"]).sum().reset_index()
# Convert back to networkx graph
G_seconds = nx.from_pandas_edgelist(pG_s, edge_attr=True)
# Removing antID 57 from the Graph as it is wrongly tagged in all likelihood
G_seconds.remove_node(57)
f_edgelist_ct = "edgelist_interaction_counts_{}_{}_{}.csv".format(
    exp.Name, t_start, t_end
)
f_edgelist_sec = "edgelist_interaction_seconds_{}_{}_{}.csv".format(
    exp.Name, t_start, t_end
)

nx.write_edgelist(G_counts, f_edgelist_ct, data=True)
nx.write_edgelist(G_seconds, f_edgelist_sec, data=True)
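
One detail worth checking when summing weights by `["source", "target"]`: the same undirected edge may be listed as `(1, 2)` in one file and `(2, 1)` in the other, in which case the two rows fall into different groups and are not added together. The sketch below, on made-up rows and with a `Counter` instead of pandas, shows one way to guard against this by sorting each pair before summing. It is an illustration of the idea, not the method used above.

```python
from collections import Counter

# Hypothetical toy edgelists as (source, target, weight) rows
edges_part1 = [(1, 2, 5), (2, 3, 1)]
edges_part2 = [(2, 1, 4), (3, 4, 2)]  # (2, 1) is the same undirected edge as (1, 2)

combined = Counter()
for source, target, weight in edges_part1 + edges_part2:
    edge = tuple(sorted((source, target)))  # canonical order for undirected edges
    combined[edge] += weight

print(dict(combined))
```

With pandas, the same effect could be obtained by rewriting the `source` and `target` columns so that the smaller ID always comes first before calling `groupby`.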

### Optional part 1: graph visualization using networkx library
The graphs built above are reused directly, after removing nodes without any interactions. Each graph is plotted using the spring model, a common way to visualize graphs.

In [None]:
G_counts.remove_nodes_from(
    list(nx.isolates(G_counts))
)  # Nodes without any interactions are removed
G_seconds.remove_nodes_from(list(nx.isolates(G_seconds)))
fig, ax = plt.subplots()
nx.draw_spring(G_counts)
fig, ax = plt.subplots()
nx.draw_spring(G_seconds)

### Optional part 2: community detection using facetnet
A fixed number of communities is assumed. Facetnet returns the soft modularity score of the community detection as well as the soft community membership of each individual for each community, which can be understood as a membership "percentage". The graph is then plotted again with node colors representing community membership. Facetnet can also be used as a command line tool to process the saved CSV directly.

In [None]:
np.random.seed(12345678)  # set seed for reproducibility
nb_communities = 2
# nx.to_numpy_matrix was removed in NetworkX 3.0; newer versions provide nx.to_numpy_array instead
wc = nx.to_numpy_matrix(G_counts)
idmap = G_counts.nodes
idmap_inv = {nid: i for i, nid in enumerate(idmap)}
dat_fn = facetnet.step(idmap, idmap_inv, wc, 0.7, nb_communities, show_plot=False)

# Plot resulting communities as colored node. Remove zero count individuals.
soft_comm = dat_fn[5]
# fig, ax = plt.subplots()
color_nodes = []
for i in range(len(soft_comm)):
    # red, green, blue value. Red means community 0, green means community 1
    color_nodes.append((soft_comm[i, 0], soft_comm[i, 1], 0))
# options = {"edgecolors": "tab:gray", "node_size": 120, "alpha": 0.25}
# options = {"edge_color": "gray", "node_size": 120, "alpha": 0.5}
# nx.draw_spring(G_counts, node_color=np.asarray(color_nodes), **options)
fig = plt.figure(figsize=(10, 8), dpi=300)
pos = nx.spring_layout(G_counts, seed=1234567)
nx.draw_networkx_nodes(
    G_counts, pos, node_color=np.asarray(color_nodes), alpha=0.75, node_size=120
)
nx.draw_networkx_edges(G_counts, pos, edge_color="tab:gray", alpha=0.2)
fig.savefig("Woundcare_Inf_Cfel13_baseline_SM.png")
print("soft modularity score: {}".format(dat_fn[4]))

#### Output Social Maturity CSV with known queen ID

Once the queen ID is known, we can determine which community has the highest membership score for the queen. The social maturity of an ant is its membership value in the other community (or the sum over the other communities). We then output this as a CSV with a few extra columns.

In [None]:
# Output social maturity scores based on known queen ID
import pandas as pd

queen_id = 33
i_q = list(idmap).index(queen_id)  # Row index of the queen in the facetnet output (same order as idmap)
# The community with the highest membership value for the queen is the queen community
queen_comm = np.argmax(dat_fn[5][i_q])
soc_mat = pd.DataFrame(dat_fn[5][:, queen_comm], columns=["soc_mat"])
# Social maturity = 1 - queen community membership, i.e. the forager community value.
# With more than 2 communities this is the sum of all non-queen community values.
soc_mat["soc_mat"] = 1 - soc_mat["soc_mat"]
soc_mat["antID"] = dat_fn[0]  # Add the antIDs to the community membership values
soc_mat["queen"] = np.where(soc_mat["antID"] == queen_id, 1, 0)
soc_mat.to_csv(
    "social_maturity_scores_{}_{}_{}.csv".format(exp.Name, t_start, t_end), index=False
)
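
The calculation can be followed on a small made-up membership table. This is a minimal sketch in plain Python: the ant IDs and membership values are hypothetical, and lists stand in for the facetnet output array.

```python
# Hypothetical soft membership rows, one per ant: [community 0, community 1]
ant_ids = [33, 4, 7]
membership = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8]]
queen_id = 33

# The queen community is the one with the highest membership value for the queen
queen_row = membership[ant_ids.index(queen_id)]
queen_comm = queen_row.index(max(queen_row))

# Social maturity is one minus the membership in the queen community
social_maturity = {
    ant: 1 - row[queen_comm] for ant, row in zip(ant_ids, membership)
}

for ant, value in social_maturity.items():
    print(ant, round(value, 3))
```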

### Additional code for removal experiments

For the removal experiments, in which ants are removed from the colony based on their social maturity values, some additional code is required.
We first get a list of ants with social maturity values below a threshold.
We can then obtain the list of nodes (antIDs) that were removed and recreate the pre-removal network without these individuals. This network can then be used to obtain the social maturity of the remaining individuals, giving us pre-removal social maturity values both with and without the removed individuals.


In [None]:
# Code to obtain list of antIDs below/above a certain social maturity value
# Filter first, then sort, so that the boolean mask and the dataframe share the same index
sm1 = soc_mat[soc_mat["soc_mat"] < 0.5].sort_values(by=["soc_mat"])
sm1.to_csv("social_maturity_RemovalTest2_AntstoRemove.csv", index=False)

In [None]:
# Code to obtain social maturity distribution of pre-removal networks without the ants which are removed during the experiment
# Load social maturity datasets
sm_pre = pd.read_csv("social_maturity_scores_RemovalTest2_Pre_20220727-20220731.csv")
sm_post = pd.read_csv("social_maturity_scores_RemovalTest2_Post_20220803-20220807.csv")
# Obtain list of nodes that were removed
removed = list(set(sm_pre["antID"].tolist()).difference(sm_post["antID"].tolist()))
# Load pre removal edgelist of interaction counts
G_counts_pre = nx.read_edgelist(
    "edgelist_interaction_counts_RemovalTest2_Pre_20220727-20220731.csv", nodetype=int
)
# Remove nodes from the graph that correspond to removed ants
G_counts_pre.remove_nodes_from(removed)
nx.write_edgelist(
    G_counts_pre,
    "edgelist_interaction_counts_RemovalTest2_PreModified_20220727-20220731.csv",
)
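
The two steps above, finding the removed ants by set difference and dropping them from the pre-removal network, can be sketched on made-up data. The IDs and weights below are hypothetical, and a dictionary of edges stands in for the networkx graph.

```python
# Hypothetical antIDs present before and after the removal
pre_ids = [1, 2, 3, 4, 5]
post_ids = [1, 2, 4]

# Ants present before but absent afterwards were removed
removed = sorted(set(pre_ids) - set(post_ids))

# Hypothetical pre-removal edge weights: {(ant_a, ant_b): interaction count}
pre_edges = {(1, 2): 10, (2, 3): 4, (4, 5): 7, (1, 4): 3}

# Keep only edges where neither endpoint was removed
kept = {
    edge: weight
    for edge, weight in pre_edges.items()
    if not set(edge) & set(removed)
}

print(removed)
print(kept)
```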

In [None]:
# Load pre removal edgelist of interaction seconds
G_seconds_pre = nx.read_edgelist(
    "edgelist_interaction_seconds_RemovalTest2_Pre_20220727-20220731.csv", nodetype=int
)
# Remove nodes from the graph that correspond to removed ants
G_seconds_pre.remove_nodes_from(removed)
nx.write_edgelist(
    G_seconds_pre,
    "edgelist_interaction_seconds_RemovalTest2_PreModified_20220727-20220731.csv",
)

In [None]:
# Plot modified pre-removal interaction graph
fig, ax = plt.subplots()
nx.draw_spring(G_counts_pre)

In [None]:
# Run facetnet algorithm over modified graph
np.random.seed(12345678)  # set seed for reproducibility
nb_communities = 2
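# nx.to_numpy_matrix was removed in NetworkX 3.0; newer versions provide nx.to_numpy_array instead
wc = nx.to_numpy_matrix(G_counts_pre)
idmap = G_counts_pre.nodes  # Must come from the same graph as wc so that rows and antIDs line up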
idmap_inv = {nid: i for i, nid in enumerate(idmap)}
dat_fn = facetnet.step(idmap, idmap_inv, wc, 0.7, nb_communities, show_plot=False)

# Plot resulting communities as colored node. Remove zero count individuals.
soft_comm = dat_fn[5]
fig, ax = plt.subplots()
color_nodes = []
for i in range(len(soft_comm)):
    # red, green, blue value. Red means community 0, green means community 1
    color_nodes.append((soft_comm[i, 0], soft_comm[i, 1], 0))
options = {"edgecolors": "tab:gray", "node_size": 120, "alpha": 0.6}
nx.draw_spring(G_counts_pre, node_color=np.asarray(color_nodes), **options)
print("soft modularity score: {}".format(dat_fn[4]))

In [None]:
# Output social maturity scores based on known queen ID
import pandas as pd

queen_id = 1
i_q = list(idmap).index(queen_id)  # Row index of the queen in the facetnet output (same order as idmap)
# The community with the highest membership value for the queen is the queen community
queen_comm = np.argmax(dat_fn[5][i_q])
soc_mat = pd.DataFrame(dat_fn[5][:, queen_comm], columns=["soc_mat"])
# Social maturity = 1 - queen community membership, i.e. the forager community value.
# With more than 2 communities this is the sum of all non-queen community values.
soc_mat["soc_mat"] = 1 - soc_mat["soc_mat"]
soc_mat["antID"] = dat_fn[0]  # Add the antIDs to the community membership values
soc_mat["queen"] = np.where(soc_mat["antID"] == queen_id, 1, 0)
soc_mat.to_csv(
    "social_maturity_scores_{}_PreModified_20220727-20220731.csv".format(exp.Name),
    index=False,
)

## Phase specific interaction networks

In [None]:
def extract_edgelists(exp, t_begin, t_end, matcher):
    """Function to extract edgelists for a given time window

    Args:
        exp (fort-myrmidon experiment): Opened fort-myrmidon experiment
        t_begin (datetime): Start time in local time with timezone set to None
        t_end (datetime): End time in local time with timezone set to None
        matcher (fort-myrmidon matcher): matcher object to filter interactions

    Returns:
        tuple: (G_counts, G_seconds), two networkx graphs whose edge weights are
            the number of interactions and the total interaction duration in seconds
    """
    # Initialize graphs
    G_counts = nx.Graph()
    G_counts.add_nodes_from(exp.Ants.keys())
    G_seconds = nx.Graph()
    G_seconds.add_nodes_from(exp.Ants.keys())
    # Obtain interactions within the time window
    t_hr = fm.Time(t_begin)
    t_st = fm.Time(t_end)
    interactions = fm.Query.ComputeAntInteractions(
        exp,
        start=t_hr,
        end=t_st,
        matcher=matcher,
        # reportProgress=False,
    )
    # Fill graph with edges based on interactions
    for ia in interactions[1]:
        if G_counts.has_edge(ia.IDs[0], ia.IDs[1]):
            G_counts[ia.IDs[0]][ia.IDs[1]]["weight"] += 1
        else:
            G_counts.add_edge(ia.IDs[0], ia.IDs[1], weight=1)

        if G_seconds.has_edge(ia.IDs[0], ia.IDs[1]):
            G_seconds[ia.IDs[0]][ia.IDs[1]]["weight"] += (ia.End - ia.Start).Seconds()
        else:
            G_seconds.add_edge(
                ia.IDs[0], ia.IDs[1], weight=(ia.End - ia.Start).Seconds()
            )
    return G_counts, G_seconds

In [None]:
f_myrmidon = "/media/ebiag/Ebi-9/Dinoponera_Col1/DinoponeraCol1.myrmidon"
t_start = datetime(2024, 7, 6, 0, 1).astimezone(
    tz=None
)  # <year, month, day, hour, minute>
t_end = datetime(2024, 7, 10, 23, 59).astimezone(tz=None)


exp = fm.Experiment.Open(f_myrmidon)
m = fm.Matcher.InteractionType(1, 1)


In [None]:
G_ct, G_sec = extract_edgelists(exp, t_start, t_end, m)
nx.write_edgelist(G_ct, "edgelist_interaction_counts_{}_{}_{}.csv".format(exp.Name, t_start, t_end))
nx.write_edgelist(G_sec, "edgelist_interaction_seconds_{}_{}_{}.csv".format(exp.Name, t_start, t_end))

In [None]:
def phase_edgelists(exp, phase, t_start, t_end, matcher):
    """Wrapper function to extract edgelists for each phase and save them to csv

    Args:
        exp (fort-myrmidon experiment): Opened fort-myrmidon experiment
        phase (str): Name of the phase
        t_start (datetime): Start time in local time with timezone set to None
        t_end (datetime): End time in local time with timezone set to None
        matcher (fort-myrmidon matcher): matcher object to filter interactions
    """
    start = datetime.now()
    G_counts, G_seconds = extract_edgelists(exp, t_start, t_end, matcher)
    if t_start.strftime("%Y%m%d") == t_end.strftime("%Y%m%d"):
        day_exp = t_start.strftime("%Y%m%d")
        hr_strt = t_start.strftime("%H%M")
        hr_end = t_end.strftime("%H%M")
        f_edgelist_ct = "edgelist_counts_{}_{}_{}_{}-{}.csv".format(
            exp.Name, phase, day_exp, hr_strt, hr_end
        )
        f_edgelist_sec = "edgelist_seconds_{}_{}_{}_{}-{}.csv".format(
            exp.Name, phase, day_exp, hr_strt, hr_end
        )
    elif t_start.strftime("%Y%m%d") != t_end.strftime("%Y%m%d"):
        day_strt = t_start.strftime("%Y%m%d")
        day_end = t_end.strftime("%Y%m%d")
        f_edgelist_ct = "edgelist_counts_{}_{}_{}-{}.csv".format(
            exp.Name, phase, day_strt, day_end
        )
        f_edgelist_sec = "edgelist_seconds_{}_{}_{}-{}.csv".format(
            exp.Name, phase, day_strt, day_end
        )
    nx.write_edgelist(G_counts, f_edgelist_ct)
    nx.write_edgelist(G_seconds, f_edgelist_sec)
    print(
        f"Time taken to extract edgelists for {phase} phase is {datetime.now() - start}"
    )
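
The file-naming rule inside `phase_edgelists` can be isolated and checked without running a query. The helper below is a hypothetical refactoring for illustration (the name `edgelist_filename` and the experiment name `ColA` are made up); it reproduces the same two name patterns, one for windows within a single day and one for windows spanning several days.

```python
from datetime import datetime


def edgelist_filename(kind, exp_name, phase, t_start, t_end):
    """Build an edgelist file name; kind is e.g. 'counts' or 'seconds'."""
    day_start = t_start.strftime("%Y%m%d")
    day_end = t_end.strftime("%Y%m%d")
    if day_start == day_end:
        # Same day: include start and end time of day
        span = "{}_{}-{}".format(
            day_start, t_start.strftime("%H%M"), t_end.strftime("%H%M")
        )
    else:
        # Several days: include only the start and end dates
        span = "{}-{}".format(day_start, day_end)
    return "edgelist_{}_{}_{}_{}.csv".format(kind, exp_name, phase, span)


same_day = edgelist_filename(
    "counts", "ColA", "R1", datetime(2022, 5, 2, 16, 3), datetime(2022, 5, 2, 22, 3)
)
multi_day = edgelist_filename(
    "counts", "ColA", "baseline", datetime(2022, 4, 27, 0, 0), datetime(2022, 5, 1, 23, 59)
)
print(same_day)
print(multi_day)
```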

In [None]:
phase_list = [
    "baseline",
    "Control",
    "R1",
    "R2",
    "R3",
    "R4",
    "R5",
    "PostC",
    "PostR1",
    "PostR2",
    "PostR3",
    "PostR4",
    "PostR5",
]
m = fm.Matcher.InteractionType(1, 1)
phase_list = phase_list[2:7]
phase_list = [phase_name + "Plus2" for phase_name in phase_list]

### Colony Cfel42

In [None]:
f_myrmidon = "/media/ebiag/Ebi-2/Woundcare Experiment1/Cfell_wound_col42.myrmidon"
exp = fm.Experiment.Open(f_myrmidon)
phase_starts_exp = [
    datetime(2022, 5, 1, 15, 54).astimezone(tz=None),
    datetime(2022, 5, 2, 16, 3).astimezone(tz=None),
    datetime(2022, 5, 3, 15, 53).astimezone(tz=None),
    datetime(2022, 5, 4, 15, 50).astimezone(tz=None),
    datetime(2022, 5, 5, 15, 50).astimezone(tz=None),
    datetime(2022, 5, 6, 15, 55).astimezone(tz=None),
]
phase_starts_post = [
    datetime(2022, 5, 2, 9, 0).astimezone(tz=None),
    datetime(2022, 5, 3, 9, 0).astimezone(tz=None),
    datetime(2022, 5, 4, 9, 0).astimezone(tz=None),
    datetime(2022, 5, 5, 9, 0).astimezone(tz=None),
    datetime(2022, 5, 6, 9, 0).astimezone(tz=None),
    datetime(2022, 5, 7, 9, 0).astimezone(tz=None),
]
phase_starts = phase_starts_exp + phase_starts_post
phase_ends = [(start_time + timedelta(hours=6)) for start_time in phase_starts]
baseline_start = datetime(2022, 4, 27, 0, 0).astimezone(tz=None)
baseline_end = datetime(2022, 5, 1, 23, 59).astimezone(tz=None)
phase_starts = [baseline_start] + phase_starts
phase_ends = [baseline_end] + phase_ends

In [None]:
# Run only if you need the edgelists for windows beginning 2 hours after the start of each treatment phase
phase_starts = phase_starts[2:7]
phase_starts = [(start_time + timedelta(hours=2)) for start_time in phase_starts]
phase_ends = [(start_time + timedelta(hours=6)) for start_time in phase_starts]
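
To see which window the shift produces, here is a small sketch with plain naive datetimes for the first entry kept by the `[2:7]` slice. Because the end is computed from the already shifted start, the window runs from 2 to 8 hours after the treatment start and can cross midnight.

```python
from datetime import datetime, timedelta

# First start time kept by the [2:7] slice above (naive datetime for simplicity)
treatment_start = datetime(2022, 5, 2, 16, 3)

shifted_start = treatment_start + timedelta(hours=2)
shifted_end = shifted_start + timedelta(hours=6)

print(shifted_start)
print(shifted_end)
print(shifted_end - treatment_start)
```

Since this window ends on the following day, `phase_edgelists` names the file with the start and end dates rather than with the hours.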

In [None]:
cfel42_phase_edgelists = [
    phase_edgelists(exp, phase, t_start, t_end, m)
    for phase, t_start, t_end in zip(phase_list, phase_starts, phase_ends)
]
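
`zip` stops silently at the shortest input. Since `phase_list` was reduced to the five "Plus2" names earlier, running this cell while `phase_starts` still holds all phases would pair those five names with the first five time windows. A minimal sketch of a guard is shown below; the helper `zip_phases` is hypothetical, and the numbers stand in for start and end times.

```python
def zip_phases(names, starts, ends):
    """Pair phase names with time windows, refusing mismatched lengths."""
    if not (len(names) == len(starts) == len(ends)):
        raise ValueError(
            "Length mismatch: {} names, {} starts, {} ends".format(
                len(names), len(starts), len(ends)
            )
        )
    return list(zip(names, starts, ends))


# Matching lengths: pairs are returned as usual
pairs = zip_phases(["R1Plus2", "R2Plus2"], [1, 2], [3, 4])
print(pairs)

# Mismatched lengths: an error is raised instead of silently truncating
try:
    zip_phases(["R1Plus2"], [1, 2], [3, 4])
except ValueError as err:
    print(err)
```

On Python 3.10 or newer, `zip(..., strict=True)` offers the same protection.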

### Colony Cfel1

In [None]:
f_myrmidon = "/media/ebiag/Ebi-2/Woundcare Experiment2/woundcare_cfell1_T2.myrmidon"
exp = fm.Experiment.Open(f_myrmidon)
phase_starts_exp = [
    datetime(2022, 6, 4, 14, 48).astimezone(tz=None),
    datetime(2022, 6, 5, 14, 57).astimezone(tz=None),
    datetime(2022, 6, 6, 14, 30).astimezone(tz=None),
    datetime(2022, 6, 7, 14, 49).astimezone(tz=None),
    datetime(2022, 6, 8, 14, 43).astimezone(tz=None),
    datetime(2022, 6, 9, 15, 5).astimezone(tz=None),
]
phase_starts_post = [
    datetime(2022, 6, 5, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 6, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 7, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 8, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 9, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 10, 8, 0).astimezone(tz=None),
]
phase_starts = phase_starts_exp + phase_starts_post
phase_ends = [(start_time + timedelta(hours=6)) for start_time in phase_starts]

In [None]:
# Run only if you need the edgelists for windows beginning 2 hours after the start of each treatment phase
phase_starts = phase_starts[1:6]
phase_starts = [(start_time + timedelta(hours=2)) for start_time in phase_starts]
phase_ends = [(start_time + timedelta(hours=6)) for start_time in phase_starts]

In [None]:
# To exclude baseline
cfel1_phase_edgelists = [
    phase_edgelists(exp, phase, t_start, t_end, m)
    for phase, t_start, t_end in zip(phase_list[1:], phase_starts, phase_ends)
]
# For Plus2 edgelists
# cfel1_phase_edgelists = [
#     phase_edgelists(exp, phase, t_start, t_end, m)
#     for phase, t_start, t_end in zip(phase_list, phase_starts, phase_ends)
# ]

Due to data corruption in Cfel1 on the 4th day of baseline, we need to ignore around 30 hours of data. We therefore create two baseline edgelists and combine them.

In [None]:
# Timings of two phases in baseline without corrupted data
baseline_start_1 = datetime(2022, 5, 31, 0, 0).astimezone(tz=None)
baseline_end_1 = datetime(2022, 6, 2, 23, 59).astimezone(tz=None)
baseline_start_2 = datetime(2022, 6, 4, 6, 0).astimezone(tz=None)
baseline_end_2 = datetime(2022, 6, 4, 23, 59).astimezone(tz=None)
# Extract edgelists
b1_counts, b1_seconds = extract_edgelists(exp, baseline_start_1, baseline_end_1, m)
b2_counts, b2_seconds = extract_edgelists(exp, baseline_start_2, baseline_end_2, m)

In [None]:
# Convert to pandas edgelist
b1_counts_df = nx.to_pandas_edgelist(b1_counts)
b2_counts_df = nx.to_pandas_edgelist(b2_counts)
b1_seconds_df = nx.to_pandas_edgelist(b1_seconds)
b2_seconds_df = nx.to_pandas_edgelist(b2_seconds)
# Combine dataframes
b_counts_df = (
    pd.concat([b1_counts_df, b2_counts_df])
    .groupby(["source", "target"])
    .sum()
    .reset_index()
)
b_seconds_df = (
    pd.concat([b1_seconds_df, b2_seconds_df])
    .groupby(["source", "target"])
    .sum()
    .reset_index()
)
# Convert back to networkx graph
b_counts = nx.from_pandas_edgelist(b_counts_df, edge_attr=True)
b_seconds = nx.from_pandas_edgelist(b_seconds_df, edge_attr=True)

In [None]:
f_edgelist_ct = "edgelist_counts_{}_baseline_20220531-20220604.csv".format(exp.Name)
f_edgelist_sec = "edgelist_seconds_{}_baseline_20220531-20220604.csv".format(exp.Name)
nx.write_edgelist(b_counts, f_edgelist_ct)
nx.write_edgelist(b_seconds, f_edgelist_sec)

### Colony Cfel54

In [None]:
f_myrmidon = "/media/ebiag/Ebi-2/Woundcare Experiment3/woundcare_cfell54_T3.myrmidon"
exp = fm.Experiment.Open(f_myrmidon)
phase_starts_exp = [
    datetime(2022, 6, 19, 14, 26).astimezone(tz=None),
    datetime(2022, 6, 20, 14, 35).astimezone(tz=None),
    datetime(2022, 6, 21, 14, 21).astimezone(tz=None),
    datetime(2022, 6, 22, 14, 28).astimezone(tz=None),
    datetime(2022, 6, 23, 14, 14).astimezone(tz=None),
    datetime(2022, 6, 24, 14, 31).astimezone(tz=None),
]
phase_starts_post = [
    datetime(2022, 6, 20, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 21, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 22, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 23, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 24, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 25, 8, 0).astimezone(tz=None),
]
phase_starts = phase_starts_exp + phase_starts_post
phase_ends = [(start_time + timedelta(hours=6)) for start_time in phase_starts]
baseline_start = datetime(2022, 6, 15, 0, 0).astimezone(tz=None)
baseline_end = datetime(2022, 6, 19, 23, 59).astimezone(tz=None)
phase_starts = [baseline_start] + phase_starts
phase_ends = [baseline_end] + phase_ends

In [None]:
# Run only if you need the edgelists for windows beginning 2 hours after the start of each treatment phase
phase_starts = phase_starts[2:7]
phase_starts = [(start_time + timedelta(hours=2)) for start_time in phase_starts]
phase_ends = [(start_time + timedelta(hours=6)) for start_time in phase_starts]

In [None]:
cfel54_phase_edgelists = [
    phase_edgelists(exp, phase, t_start, t_end, m)
    for phase, t_start, t_end in zip(phase_list, phase_starts, phase_ends)
]

### Colony Cfel55

In [None]:
f_myrmidon = "/media/ebiag/Ebi-3/InfectionExp_Cfel55/InfectionExpCol55.myrmidon"
exp = fm.Experiment.Open(f_myrmidon)
phase_starts_exp = [
    datetime(2023, 4, 18, 14, 40).astimezone(tz=None),
    datetime(2023, 4, 20, 15, 45).astimezone(tz=None),
    datetime(2023, 4, 21, 14, 48).astimezone(tz=None),
    datetime(2023, 4, 22, 14, 17).astimezone(tz=None),
    datetime(2023, 4, 23, 14, 0).astimezone(tz=None),
    datetime(2023, 4, 24, 14, 54).astimezone(tz=None),
]
phase_starts_post = [
    datetime(2023, 4, 20, 8, 0).astimezone(tz=None),
    datetime(2023, 4, 21, 8, 0).astimezone(tz=None),
    datetime(2023, 4, 22, 7, 30).astimezone(tz=None),
    datetime(2023, 4, 23, 7, 30).astimezone(tz=None),
    datetime(2023, 4, 24, 8, 0).astimezone(tz=None),
    datetime(2023, 4, 25, 8, 0).astimezone(tz=None),
]
phase_starts = phase_starts_exp + phase_starts_post
phase_ends = [(start_time + timedelta(hours=6)) for start_time in phase_starts]
baseline_start = datetime(2023, 4, 14, 0, 0).astimezone(tz=None)
baseline_end = datetime(2023, 4, 18, 23, 59).astimezone(tz=None)
phase_starts = [baseline_start] + phase_starts
phase_ends = [baseline_end] + phase_ends

In [None]:
phase_starts = phase_starts[2:7]
phase_starts = [(start_time + timedelta(hours=2)) for start_time in phase_starts]
phase_ends = [(start_time + timedelta(hours=6)) for start_time in phase_starts]

In [None]:
cfel55_phase_edgelists = [
    phase_edgelists(exp, phase, t_start, t_end, m)
    for phase, t_start, t_end in zip(phase_list, phase_starts, phase_ends)
]

### Colony Cfel13

In [None]:
f_myrmidon = "/media/ebiag/Ebi-3/InfectionExp_Cfel13/InfectionExp_Cfel13.myrmidon"
exp = fm.Experiment.Open(f_myrmidon)
phase_starts_exp = [
    datetime(2023, 4, 23, 15, 5).astimezone(tz=None),
    datetime(2023, 4, 24, 15, 29).astimezone(tz=None),
    datetime(2023, 4, 25, 14, 19).astimezone(tz=None),
    datetime(2023, 4, 26, 15, 3).astimezone(tz=None),
    datetime(2023, 4, 27, 16, 43).astimezone(tz=None),
    datetime(2023, 4, 28, 14, 27).astimezone(tz=None),
]
phase_starts_post = [
    datetime(2023, 4, 24, 8, 0).astimezone(tz=None),
    datetime(2023, 4, 25, 8, 0).astimezone(tz=None),
    datetime(2023, 4, 26, 8, 0).astimezone(tz=None),
    datetime(2023, 4, 27, 8, 0).astimezone(tz=None),
    datetime(2023, 4, 28, 8, 0).astimezone(tz=None),
    datetime(2023, 4, 29, 8, 0).astimezone(tz=None),
]
phase_starts = phase_starts_exp + phase_starts_post
phase_ends = [(start_time + timedelta(hours=6)) for start_time in phase_starts]
baseline_start = datetime(2023, 4, 19, 0, 0).astimezone(tz=None)
baseline_end = datetime(2023, 4, 23, 23, 59).astimezone(tz=None)
phase_starts = [baseline_start] + phase_starts
phase_ends = [baseline_end] + phase_ends

In [None]:
phase_starts = phase_starts[2:7]
phase_starts = [(start_time + timedelta(hours=2)) for start_time in phase_starts]
phase_ends = [(start_time + timedelta(hours=6)) for start_time in phase_starts]

In [None]:
cfel13_phase_edgelists = [
    phase_edgelists(exp, phase, t_start, t_end, m)
    for phase, t_start, t_end in zip(phase_list, phase_starts, phase_ends)
]

### Colony Cfel64

In [None]:
f_myrmidon = "/media/ebiag/Ebi-1/InfectionExp_Cfel64/InfectionExpCol64.myrmidon"
exp = fm.Experiment.Open(f_myrmidon)
phase_starts_exp = [
    datetime(2023, 5, 31, 15, 5).astimezone(tz=None),
    datetime(2023, 6, 1, 15, 51).astimezone(tz=None),
    datetime(2023, 6, 2, 14, 44).astimezone(tz=None),
    datetime(2023, 6, 3, 14, 50).astimezone(tz=None),
    datetime(2023, 6, 4, 14, 43).astimezone(tz=None),
    datetime(2023, 6, 5, 14, 52).astimezone(tz=None),
]
phase_starts_post = [
    datetime(2023, 6, 1, 8, 0).astimezone(tz=None),
    datetime(2023, 6, 2, 8, 0).astimezone(tz=None),
    datetime(2023, 6, 3, 8, 0).astimezone(tz=None),
    datetime(2023, 6, 4, 8, 0).astimezone(tz=None),
    datetime(2023, 6, 5, 8, 0).astimezone(tz=None),
    datetime(2023, 6, 6, 8, 0).astimezone(tz=None),
]
phase_starts = phase_starts_exp + phase_starts_post
phase_ends = [(start_time + timedelta(hours=6)) for start_time in phase_starts]
baseline_start = datetime(2023, 5, 27, 0, 0).astimezone(tz=None)
baseline_end = datetime(2023, 5, 31, 23, 59).astimezone(tz=None)
phase_starts = [baseline_start] + phase_starts
phase_ends = [baseline_end] + phase_ends

In [None]:
phase_starts = phase_starts[2:7]
phase_starts = [(start_time + timedelta(hours=2)) for start_time in phase_starts]
phase_ends = [(start_time + timedelta(hours=6)) for start_time in phase_starts]

In [None]:
cfel64_phase_edgelists = [
    phase_edgelists(exp, phase, t_start, t_end, m)
    for phase, t_start, t_end in zip(phase_list, phase_starts, phase_ends)
]

## Hourly Interaction Networks

In [None]:
def hourly_edgelists(exp, phase_day, t_start, n_times, matcher):
    """Function to create edgelists per hour given a start time and the number of hours

    Args:
        exp (fort-myrmidon experiment): Opened fort-myrmidon experiment
        phase_day (str): Name of the phase and replicate day within the experiment
        t_start (datetime): Start time in local time with timezone set to None
        n_times (int): Number of hours/time ranges to create edgelists for
        matcher (fort-myrmidon matcher): matcher object to filter interactions
    """
    start = datetime.now()
    for n in range(n_times):
        # Get starting and ending time per hour
        t_begin = t_start + timedelta(hours=n)
        # t_begin = t_start
        t_end = t_begin + timedelta(hours=1)
        print(t_begin, t_end)
        # Extract edgelist for the hour
        G_counts, G_seconds = extract_edgelists(exp, t_begin, t_end, matcher)
        # Get day and hour details for name of edgelists
        day_exp = t_start.strftime("%Y%m%d")
        hr_strt = t_begin.strftime("%H%M")
        hr_end = t_end.strftime("%H%M")
        # Create names for edgelists
        f_edgelist_ct = "edgelist_counts_{}_{}_{}_H{}_{}-{}.csv".format(
            exp.Name, phase_day, day_exp, n + 1, hr_strt, hr_end
        )
        f_edgelist_sec = "edgelist_seconds_{}_{}_{}_H{}_{}-{}.csv".format(
            exp.Name, phase_day, day_exp, n + 1, hr_strt, hr_end
        )
        nx.write_edgelist(G_counts, f_edgelist_ct)
        nx.write_edgelist(G_seconds, f_edgelist_sec)
    end = datetime.now()
    print(f"Hourly edgelists for {phase_day} output in {end - start}")
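
The window and file-name logic of `hourly_edgelists` can be checked without opening an experiment file. The following is a minimal sketch under that assumption: `hourly_windows` and `edgelist_name` are hypothetical helpers that are not part of the notebook, and `"Cfel42"` only stands in for whatever `exp.Name` returns.

```python
from datetime import datetime, timedelta


def hourly_windows(t_start, n_times):
    """Yield (hour_index, t_begin, t_end) for consecutive one-hour windows."""
    for n in range(n_times):
        t_begin = t_start + timedelta(hours=n)
        yield n + 1, t_begin, t_begin + timedelta(hours=1)


def edgelist_name(kind, exp_name, phase_day, t_start, hour_index, t_begin, t_end):
    """Build a file name following the pattern used in hourly_edgelists."""
    return "edgelist_{}_{}_{}_{}_H{}_{}-{}.csv".format(
        kind,
        exp_name,
        phase_day,
        t_start.strftime("%Y%m%d"),  # date of the first window, as in the function above
        hour_index,
        t_begin.strftime("%H%M"),
        t_end.strftime("%H%M"),
    )


# "Cfel42" is a placeholder for exp.Name; the real value depends on the experiment file.
t0 = datetime(2022, 5, 2, 16, 3)
names = [
    edgelist_name("counts", "Cfel42", "R1", t0, i, b, e)
    for i, b, e in hourly_windows(t0, 2)
]
print(names)
```

Because the date in the file name is taken from `t_start`, a window that crosses midnight keeps the date of the first hour.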

In [None]:
# Common list items across all colonies
phase_days = [
    "R1",
    "R2",
    "R3",
    "R4",
    "R5",
    "PreR1",
    "PostR1",
    "PostR2",
    "PostR3",
    "PostR4",
    "PostR5",
]
m = fm.Matcher.InteractionType(1, 1)

### Colony Cfel42

In [None]:
f_myrmidon = "/media/ebiag/Ebi-2/Woundcare Experiment1/Cfell_wound_col42.myrmidon"
exp = fm.Experiment.Open(f_myrmidon)
# List of start times for different phases
hr_starts_exp = [
    datetime(2022, 5, 2, 16, 3).astimezone(tz=None),
    datetime(2022, 5, 3, 15, 53).astimezone(tz=None),
    datetime(2022, 5, 4, 15, 50).astimezone(tz=None),
    datetime(2022, 5, 5, 15, 50).astimezone(tz=None),
    datetime(2022, 5, 6, 15, 55).astimezone(tz=None),
]
hr_starts_pre = [
    datetime(2022, 5, 2, 9, 0).astimezone(tz=None),
]
hr_starts_post = [
    datetime(2022, 5, 3, 9, 0).astimezone(tz=None),
    datetime(2022, 5, 4, 9, 0).astimezone(tz=None),
    datetime(2022, 5, 5, 9, 0).astimezone(tz=None),
    datetime(2022, 5, 6, 9, 0).astimezone(tz=None),
    datetime(2022, 5, 7, 9, 0).astimezone(tz=None),
]
# Combine all start times
hr_starts = hr_starts_exp + hr_starts_pre + hr_starts_post

# Combine to create one list
func_list = [
    (exp, phase_day, hr_start, 6, m)
    for phase_day, hr_start in zip(phase_days, hr_starts)
]
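
`zip` stops at the shorter of its inputs, so a missing start time would silently drop the last phase labels. A minimal sketch of a stricter pairing is shown here; `pair_phases` is a hypothetical helper, not part of the notebook.

```python
def pair_phases(phase_days, hr_starts):
    """Pair phase labels with start times, refusing lists of unequal length."""
    if len(phase_days) != len(hr_starts):
        raise ValueError(
            f"{len(phase_days)} phase labels but {len(hr_starts)} start times"
        )
    return list(zip(phase_days, hr_starts))


# Placeholder values; in the notebook these would be phase_days and hr_starts.
pairs = pair_phases(["R1", "R2"], ["start_1", "start_2"])
print(pairs)
```

The same check could be applied to each colony before building `func_list`, since every colony reuses the eleven labels in `phase_days`.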

In [None]:
cfel42_hourly_edgelists = [
    hourly_edgelists(exp, phase_day, t_start, n_hours, matcher)
    for exp, phase_day, t_start, n_hours, matcher in func_list
]

### Colony Cfel1

In [None]:
f_myrmidon = "/media/ebiag/Ebi-2/Woundcare Experiment2/woundcare_cfell1_T2.myrmidon"
exp = fm.Experiment.Open(f_myrmidon)
# List of start times for different phases
hr_starts_exp = [
    datetime(2022, 6, 5, 14, 57).astimezone(tz=None),
    datetime(2022, 6, 6, 14, 30).astimezone(tz=None),
    datetime(2022, 6, 7, 14, 49).astimezone(tz=None),
    datetime(2022, 6, 8, 14, 43).astimezone(tz=None),
    datetime(2022, 6, 9, 15, 5).astimezone(tz=None),
]
hr_starts_pre = [
    datetime(2022, 6, 5, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 6, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 7, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 8, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 9, 8, 0).astimezone(tz=None),
]
hr_starts_post = [datetime(2022, 6, 10, 8, 0).astimezone(tz=None)]
# Combine all start times
hr_starts = hr_starts_exp + hr_starts_pre + hr_starts_post

# Combine to create one list
func_list = [
    (exp, phase_day, hr_start, 6, m)
    for phase_day, hr_start in zip(phase_days, hr_starts)
]

In [None]:
cfel1_hourly_edgelists = [
    hourly_edgelists(exp, phase_day, t_start, n_hours, matcher)
    for exp, phase_day, t_start, n_hours, matcher in func_list
]

### Colony Cfel54

In [None]:
f_myrmidon = "/media/ebiag/Ebi-2/Woundcare Experiment3/woundcare_cfell54_T3.myrmidon"
exp = fm.Experiment.Open(f_myrmidon)
# List of start times for different phases
hr_starts_exp = [
    datetime(2022, 6, 20, 14, 35).astimezone(tz=None),
    datetime(2022, 6, 21, 14, 21).astimezone(tz=None),
    datetime(2022, 6, 22, 14, 28).astimezone(tz=None),
    datetime(2022, 6, 23, 14, 14).astimezone(tz=None),
    datetime(2022, 6, 24, 14, 31).astimezone(tz=None),
]
hr_starts_pre = [
    datetime(2022, 6, 20, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 21, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 22, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 23, 8, 0).astimezone(tz=None),
    datetime(2022, 6, 24, 8, 0).astimezone(tz=None),
]
hr_starts_post = [datetime(2022, 6, 25, 8, 0).astimezone(tz=None)]
# Combine all start times
hr_starts = hr_starts_exp + hr_starts_pre + hr_starts_post

# Combine to create one list
func_list = [
    (exp, phase_day, hr_start, 6, m)
    for phase_day, hr_start in zip(phase_days, hr_starts)
]

In [None]:
cfel54_hourly_edgelists = [
    hourly_edgelists(exp, phase_day, t_start, n_hours, matcher)
    for exp, phase_day, t_start, n_hours, matcher in func_list
]

### Colony Cfel55

In [None]:
f_myrmidon = "/media/ebiag/Ebi-3/InfectionExp_Cfel55/InfectionExpCol55.myrmidon"
exp = fm.Experiment.Open(f_myrmidon)
# List of start times for different phases
hr_starts_exp = [
    datetime(2023, 4, 20, 15, 45).astimezone(tz=None),
    datetime(2023, 4, 21, 14, 48).astimezone(tz=None),
    datetime(2023, 4, 22, 14, 17).astimezone(tz=None),
    datetime(2023, 4, 23, 14, 0).astimezone(tz=None),
    datetime(2023, 4, 24, 14, 54).astimezone(tz=None),
]
hr_starts_pre = [
    datetime(2023, 4, 20, 8, 0).astimezone(tz=None),
    datetime(2023, 4, 21, 8, 0).astimezone(tz=None),
    datetime(2023, 4, 22, 7, 30).astimezone(tz=None),
    datetime(2023, 4, 23, 7, 30).astimezone(tz=None),
    datetime(2023, 4, 24, 8, 0).astimezone(tz=None),
]
hr_starts_post = [datetime(2023, 4, 25, 8, 0).astimezone(tz=None)]
# Combine all start times
hr_starts = hr_starts_exp + hr_starts_pre + hr_starts_post

# Combine to create one list
func_list = [
    (exp, phase_day, hr_start, 6, m)
    for phase_day, hr_start in zip(phase_days, hr_starts)
]

In [None]:
cfel55_hourly_edgelists = [
    hourly_edgelists(exp, phase_day, t_start, n_hours, matcher)
    for exp, phase_day, t_start, n_hours, matcher in func_list
]

### Colony Cfel13

In [None]:
f_myrmidon = "/media/ebiag/Ebi-3/InfectionExp_Cfel13/InfectionExp_Cfel13.myrmidon"
exp = fm.Experiment.Open(f_myrmidon)
# List of start times for different phases
hr_starts_exp = [
    datetime(2023, 4, 24, 15, 29).astimezone(tz=None),
    datetime(2023, 4, 25, 14, 19).astimezone(tz=None),
    datetime(2023, 4, 26, 15, 3).astimezone(tz=None),
    datetime(2023, 4, 27, 16, 43).astimezone(tz=None),
    datetime(2023, 4, 28, 14, 27).astimezone(tz=None),
]
hr_starts_pre = [
    datetime(2023, 4, 24, 8, 0).astimezone(tz=None),
    datetime(2023, 4, 25, 8, 0).astimezone(tz=None),
    datetime(2023, 4, 26, 8, 0).astimezone(tz=None),
    datetime(2023, 4, 27, 8, 0).astimezone(tz=None),
    datetime(2023, 4, 28, 8, 0).astimezone(tz=None),
]
hr_starts_post = [datetime(2023, 4, 29, 8, 0).astimezone(tz=None)]
# Combine all start times
hr_starts = hr_starts_exp + hr_starts_pre + hr_starts_post

# Combine to create one list
func_list = [
    (exp, phase_day, hr_start, 6, m)
    for phase_day, hr_start in zip(phase_days, hr_starts)
]

In [None]:
cfel13_hourly_edgelists = [
    hourly_edgelists(exp, phase_day, t_start, n_hours, matcher)
    for exp, phase_day, t_start, n_hours, matcher in func_list
]

### Colony Cfel64

In [None]:
f_myrmidon = "/media/ebiag/Ebi-1/InfectionExp_Cfel64/InfectionExpCol64.myrmidon"
exp = fm.Experiment.Open(f_myrmidon)
# List of start times for different phases
hr_starts_exp = [
    datetime(2023, 6, 1, 15, 51).astimezone(tz=None),
    datetime(2023, 6, 2, 14, 44).astimezone(tz=None),
    datetime(2023, 6, 3, 14, 50).astimezone(tz=None),
    datetime(2023, 6, 4, 14, 43).astimezone(tz=None),
    datetime(2023, 6, 5, 14, 52).astimezone(tz=None),
]
hr_starts_pre = [
    datetime(2023, 6, 1, 8, 0).astimezone(tz=None),
    datetime(2023, 6, 2, 8, 0).astimezone(tz=None),
    datetime(2023, 6, 3, 8, 0).astimezone(tz=None),
    datetime(2023, 6, 4, 8, 0).astimezone(tz=None),
    datetime(2023, 6, 5, 8, 0).astimezone(tz=None),
]
hr_starts_post = [datetime(2023, 6, 6, 8, 0).astimezone(tz=None)]
# Combine all start times
hr_starts = hr_starts_exp + hr_starts_pre + hr_starts_post

# Combine to create one list
func_list = [
    (exp, phase_day, hr_start, 6, m)
    for phase_day, hr_start in zip(phase_days, hr_starts)
]

In [None]:
cfel64_hourly_edgelists = [
    hourly_edgelists(exp, phase_day, t_start, n_hours, matcher)
    for exp, phase_day, t_start, n_hours, matcher in func_list[4:5]
]