# Table of contents

1. [Theoretical foundations](#theoretical_foundations)
1. [Constants](#constants)
1. [Imports](#imports)
1. [```pandas``` options](#pandas_options)
1. [Functions definitions](#functions_definitions)
1. [Organisms list for phylogenetic profiles](#organisms-list-for-phylogenetic-profiles)
1. [Results data](#results_data)
    1. [Permutation tests results](#permutation_tests_results)
        1. [Get the networks organisms names from the results filenames](#get_the_networks_organisms_names_from_the_results_filenames)
        1. [Read](#prwlr-read-data)
            1. [Raw numbers](#raw_numbers)
            1. [p-values](#pvalue)
        1. [Transform and annotate](#transform_and_annotate)
            1. [Merging with numerical values](#merging_with_numerical_values)
            1. [Mark which direction the native values are tailed](#mark_which_direction_the_native_values_are_tailed)
            1. [Taxonomical annotations](#taxonomical_annotations)
            1. [Drop columns no longer necessary](#drop_columns_no_longer_neccessary)
            1. [Drop duplicates](#drop_duplicates)
            1. [Save annotated results to CSV files](#save_annotated_results_to_files)
    1. [Network Enrichment Analysis Test results](#network_enrichment_analysis_test_results)
        1. [Read](#r-neat-read)
    1. [Combined results of permutation tests and NEAT](#combined_results_of_permutatation_tests_and_neat)
        1. [Merge](#combined-merge)
        1. [Save](#combined-save)
1. [Plotting](#plotting)
    1. [Convert combined results to ```dot``` files](#combined-to-dot)
        1. [Read](#plotting-read)
        1. [Save](#plotting-save)
    1. [Convert ```dot``` to ```png```](#dot-to-png)

<a class=anchor id=theoretical_foundations></a>
# Theoretical foundations

This notebook extends the one named ```physical-interactions-no-ko-dups-removal```.

It focuses on visualization and interpretation of the results obtained in ```physical-interactions-no-ko-dups-removal```.

It does **NOT** replace the results visualization part of the stem notebook ```physical-interactions-no-ko-dups-removal```.

The evolutionarily *new* genes were called modificators by Simon.

<a class=anchor id=constants></a>
# Constants

**Directory paths must end with a trailing slash**
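
The trailing slash matters because file paths later in this notebook are built by plain string interpolation, e.g. ```f'{PERMUTATION_TESTS_RESULTS_DIRECTORY}*.{ext}'```. A minimal sketch of a guard follows; ```ensure_trailing_slash``` is a hypothetical helper that is not part of the original notebook, and the file name is made up.

```python
def ensure_trailing_slash(path):
    """Return path with a trailing slash appended if it is missing (hypothetical helper)."""
    return path if path.endswith('/') else path + '/'


# Without the slash, string interpolation glues directory and file name together.
broken = './r-neat' + 'PHYS-example-r-neat.csv'
fixed = ensure_trailing_slash('./r-neat') + 'PHYS-example-r-neat.csv'
print(broken)  # ./r-neatPHYS-example-r-neat.csv
print(fixed)   # ./r-neat/PHYS-example-r-neat.csv
```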

In [None]:
# Constants
sep='\t'
PERMUTATION_TESTS_RESULTS_DIRECTORY = './physical-interactions-no-ko-dups-removal/'
R_NEAT_RESULTS_PATH = './r-neat/'
R_NEAT_PRWLR_JOINED_PATH = './r-neat-prwlr-joined/'
R_NEAT_PRWLR_JOINED_PLOTS_PATH = './plots/'

<a class=anchor id=imports></a>
# Imports

In [None]:
# imports
import prwlr
import pandas as pd
from glob import glob
import networkx as nx  # required by df2dot
import pygraphviz
from functools import partial
import matplotlib.pyplot as plt
%matplotlib inline

<a class=anchor id=pandas_options></a>
# ```pandas``` options

In [None]:
# pandas options
pd.set_option('display.max_colwidth', 40)

<a class=anchor id=functions_definitions></a>
# Functions definitions

In [None]:
# assign_taxonomy
def assign_taxonomy(
    row,
    archea,
    bacteria,
    eukaryots,
):
    """
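    Assign the taxonomical domain depending on whether
    the organisms passed are present in
    the phylogenetic profile EXCLUSIVELY. In case of ambiguity
    it is assigned as <mixed>.

    Relies on the global IDs_names mapping (organism ID -> organism name).

    Parameters
    -------
    row: phylogenetic profile object exposing get_present() and get_absent()
        Suitable for the pandas.Series.apply method.
    archea, bacteria, eukaryots: list of str
        Names of the organisms belonging to each domain.

    Returns
    -------
    str, abbreviated name of the domain
        arch, bact, eukar, all, mixed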
    """
    if prwlr.utils.isiniterable(
        archea,
        [IDs_names[_] for _ in row.get_present()],
        all_present=False,
        ) and prwlr.utils.isiniterable(
            bacteria + eukaryots,
            [IDs_names[_] for _ in row.get_absent()],
            all_present=True,
        ):
        return 'arch'
    elif prwlr.utils.isiniterable(
        bacteria,
        [IDs_names[_] for _ in row.get_present()],
        all_present=False,
    ) and prwlr.utils.isiniterable(
        archea + eukaryots,
        [IDs_names[_] for _ in row.get_absent()],
        all_present=True,
    ):
        return 'bact'
    elif prwlr.utils.isiniterable(
        eukaryots,
        [IDs_names[_] for _ in row.get_present()],
        all_present=False,
    ) and prwlr.utils.isiniterable(
        archea + bacteria,
        [IDs_names[_] for _ in row.get_absent()],
        all_present=True,
    ):
        return 'eukar'
    elif prwlr.utils.isiniterable(
        archea + bacteria + eukaryots,
        [IDs_names[_] for _ in row.get_present()],
        all_present=True,
    ):
        return 'all'
    else:
        return 'mixed'

In [None]:
# mark_result_direction
def mark_result_direction(
    row,
    significance_threshold=0.001,
    native_lower_col='native_lower',
    native_higher_col='native_higher',
    p_value_col='p-value',
):
    if row[p_value_col] > significance_threshold:
        return 'insignificant'
    # Compare by value; an identity check (``is``) is fragile for values read from pandas rows.
    if row[native_lower_col] == row[native_higher_col]:
        return 'equal'
    elif row[native_lower_col] and not row[native_higher_col]:
        return 'lower'
    elif row[native_higher_col] and not row[native_lower_col]:
        return 'higher'
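
To illustrate how ```mark_result_direction``` labels rows, the sketch below restates its logic as a standalone function and applies it to a few made-up rows. Plain dictionaries stand in for DataFrame rows, and all values are invented for illustration only.

```python
def mark_result_direction_sketch(row, significance_threshold=0.001):
    """Standalone restatement of the labelling rule, for illustration only."""
    if row['p-value'] > significance_threshold:
        return 'insignificant'
    if row['native_lower'] == row['native_higher']:
        return 'equal'
    return 'lower' if row['native_lower'] else 'higher'


# Made-up rows; plain dicts stand in for pandas rows.
toy_rows = [
    {'p-value': 0.5, 'native_lower': True, 'native_higher': False},
    {'p-value': 0.0001, 'native_lower': True, 'native_higher': False},
    {'p-value': 0.0001, 'native_lower': False, 'native_higher': True},
    {'p-value': 0.0001, 'native_lower': False, 'native_higher': False},
]
labels = [mark_result_direction_sketch(r) for r in toy_rows]
print(labels)  # ['insignificant', 'lower', 'higher', 'equal']
```

Note that the ```significance_colors``` default of ```df2dot``` has no entry for ```equal```, so rows with that label would end up with a missing color after mapping.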

In [None]:
# df2dot
def df2dot(
    df,
    path,
    nodes_source='taxa_Q',
    nodes_target='taxa_A',
    taxa_Q_col='taxa_Q',
    taxa_A_col='taxa_A',
    result_direction_col='result_direction',
    taxonomy_common_col='taxonomy_common',
    inter_perc_col='inter_perc',
    empty_string_regex='^$',
    no_ortholog_value='no_ortholog',
    inter_perc_multiplier=100,
    taxonomy_common_style={
        True: 'solid',
        False: 'dashed',
    },
    significance_colors={
        'higher': 'green',
        'lower': 'red',
        'insignificant': 'grey',
    },
    dot_col_names={
        'inter_perc': 'penwidth',
        'result_direction': 'color',
        'taxonomy_common': 'style',
    }
):
    """
    Save dot file from pandas.DataFrame
    """
    df = df.copy()
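    # regex=True is needed because pandas >= 2.0 treats the pattern as a literal string by default
    df[taxa_Q_col] = df[taxa_Q_col].str.replace(empty_string_regex, no_ortholog_value, regex=True)
    df[taxa_A_col] = df[taxa_A_col].str.replace(empty_string_regex, no_ortholog_value, regex=True)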
    df[result_direction_col] = df[result_direction_col].map(significance_colors)
    df[taxonomy_common_col] = df[taxonomy_common_col].map(taxonomy_common_style)
    df[inter_perc_col] = df[inter_perc_col] * inter_perc_multiplier
    df.rename(columns=dot_col_names, inplace=True)
    G = nx.from_pandas_edgelist(
        df=df,
        source=nodes_source,
        target=nodes_target,
        edge_attr=True,
        create_using=nx.Graph(),
    )
    nx.drawing.nx_pydot.write_dot(G, path)

<a class='anchor' id='organisms-list-for-phylogenetic-profiles'></a>
## Organisms list for phylogenetic profiles

The presence of a given protein/gene ortholog in these organisms is marked as *+* or *-* (or ```True```/```False```) in the phylogenetic profiles.
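
The exclusive-domain rule applied by ```assign_taxonomy``` can be sketched with plain Python sets. The sketch does not call ```prwlr```: ```classify_domain``` and the shortened organism lists are hypothetical stand-ins, and the exact semantics of ```prwlr.utils.isiniterable``` are assumed here rather than verified.

```python
def classify_domain(present, domains):
    """
    Label a set of present organisms by domain (illustrative sketch).

    present: set of organism names for which an ortholog is present
    domains: dict mapping a domain label to the set of its organisms
    """
    all_organisms = set().union(*domains.values())
    for label, members in domains.items():
        others = all_organisms - members
        # At least one organism of this domain present, none from the other domains.
        if present & members and not present & others:
            return label
    if present >= all_organisms:
        return 'all'
    return 'mixed'


domains = {
    'arch': {'Aeropyrum pernix', 'Sulfolobus islandicus'},
    'bact': {'Escherichia coli', 'Bacillus subtilis'},
    'eukar': {'Homo sapiens', 'Saccharomyces cerevisiae'},
}
print(classify_domain({'Escherichia coli'}, domains))                  # bact
print(classify_domain({'Escherichia coli', 'Homo sapiens'}, domains))  # mixed
print(classify_domain(set().union(*domains.values()), domains))        # all
```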

In [None]:
# organisms
archea = [
    'Aeropyrum pernix',
    'Sulfolobus islandicus', 
]

bacteria = [
    'Agrobacterium fabrum',
    'Bacillus subtilis',
    'Chlamydophila felis',
    'Escherichia coli',
    'Staphylococcus aureus',
]

eukaryots = [
    'Arabidopsis thaliana',
    'Caenorhabditis elegans',
    'Dictyostelium discoideum',
    'Drosophila melanogaster',
    'Homo sapiens',
    'Plasmodium falciparum',
    'Saccharomyces cerevisiae',
    'Schizosaccharomyces pombe',
    'Tetrahymena thermophila',
    'Trypanosoma cruzi',
    'Volvox carteri',
]

In [None]:
# IDs names
try:
    IDs_names = prwlr.get_IDs_names(bacteria + archea + eukaryots)
except OSError:
    IDs_names = {
        'ape': 'Aeropyrum pernix',
        'sis': 'Sulfolobus islandicus',
        'atu': 'Agrobacterium fabrum',
        'bsu': 'Bacillus subtilis',
        'cfe': 'Chlamydophila felis',
        'eco': 'Escherichia coli',
        'sau': 'Staphylococcus aureus',
        'ath': 'Arabidopsis thaliana',
        'cel': 'Caenorhabditis elegans',
        'ddi': 'Dictyostelium discoideum',
        'dme': 'Drosophila melanogaster',
        'hsa': 'Homo sapiens',
        'pfa': 'Plasmodium falciparum',
        'sce': 'Saccharomyces cerevisiae',
        'spo': 'Schizosaccharomyces pombe',
        'tet': 'Tetrahymena thermophila',
        'tcr': 'Trypanosoma cruzi',
        'vcn': 'Volvox carteri',
    }

<a class=anchor id=results_data></a>
# Results data

The results are saved in multiple files derived from analyses carried out with more than one package. Below they are combined and annotated with additional information.

<a class=anchor id=permutation_tests_results></a>
## [Permutation tests results](https://bionas.ibb.waw.pl:5001/sharing/IIwMNhNmF)

Download the data manually by clicking the link above.

Download the ```zip``` file into the ```CWD``` of this notebook and unzip it. If the download path is different, adjust the ```PERMUTATION_TESTS_RESULTS_DIRECTORY``` constant accordingly (see section [Constants](#constants)).

**The manual (non-programmatic) download is intentional**

<a class=anchor id=get_the_networks_organisms_names_from_the_results_filenames></a>
### Get the networks organisms names from the results filenames

In [None]:
# organisms names
ext = 'csv'
_biogrid_orgs_lower = [
    ' '.join(i.split('/')[-1].split('.')[0].split('_')[-2:])
    for i in glob(f'{PERMUTATION_TESTS_RESULTS_DIRECTORY}*.{ext}')
]
biogrid_orgs = set([
    ' '.join((i.split()[0].capitalize(), i.split()[1]))
    for i in _biogrid_orgs_lower
])
del _biogrid_orgs_lower
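
The parsing above assumes that every results file name ends with ```_<genus>_<species>.csv```. A minimal sketch of the same rule on made-up paths follows; the ```./results/``` directory and the helper name ```organism_from_filename``` are illustrative and not part of the notebook.

```python
def organism_from_filename(path):
    """Extract 'Genus species' from a path ending in '_<genus>_<species>.<ext>'."""
    stem = path.split('/')[-1].split('.')[0]
    genus, species = stem.split('_')[-2:]
    return f'{genus.capitalize()} {species}'


example_paths = [
    './results/PHYS_native_prof_str_saccharomyces_cerevisiae.csv',
    './results/PHYS_native_prof_str_aff_saccharomyces_cerevisiae.csv',
    './results/PHYS_permuted_prof_str_hyb_homo_sapiens.csv',
]
# A set removes the repeats produced by several files per organism.
organisms = {organism_from_filename(p) for p in example_paths}
print(sorted(organisms))  # ['Homo sapiens', 'Saccharomyces cerevisiae']
```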

<a class=anchor id=prwlr-read-data></a>
### Read

<a class=anchor id=raw_numbers></a>
#### Raw numbers

**Note that this step allocates a lot of RAM: ~90 GB**

In [None]:
# raw numbers
sep = '\t'
ext = 'csv'
# Native values
PHYS_native_prof_str = {
    k: prwlr.read_network(
        f"{PERMUTATION_TESTS_RESULTS_DIRECTORY}PHYS_native_prof_str_{k.replace(' ', '_').lower()}.{ext}",
        index_col=[0],
        sep=sep,
    )
    for k in biogrid_orgs
}
PHYS_native_prof_str_aff = {
    k: prwlr.read_network(
        f"{PERMUTATION_TESTS_RESULTS_DIRECTORY}PHYS_native_prof_str_aff_{k.replace(' ', '_').lower()}.{ext}",
        index_col=[0],
        sep=sep,
    )
    for k in biogrid_orgs
}
PHYS_native_prof_str_hyb = {
    k: prwlr.read_network(
        f"{PERMUTATION_TESTS_RESULTS_DIRECTORY}PHYS_native_prof_str_hyb_{k.replace(' ', '_').lower()}.{ext}",
        index_col=[0],
        sep=sep,
    )
    for k in biogrid_orgs
}
# Permuted values
PHYS_permuted_prof_str = {
    k: prwlr.read_network(
        f"{PERMUTATION_TESTS_RESULTS_DIRECTORY}PHYS_permuted_prof_str_{k.replace(' ', '_').lower()}.{ext}",
        index_col=[0],
        sep=sep,
    )
    for k in biogrid_orgs
}
PHYS_permuted_prof_str_aff = {
    k: prwlr.read_network(
        f"{PERMUTATION_TESTS_RESULTS_DIRECTORY}PHYS_permuted_prof_str_aff_{k.replace(' ', '_').lower()}.{ext}",
        index_col=[0],
        sep=sep,
    )
    for k in biogrid_orgs
}
PHYS_permuted_prof_str_hyb = {
    k: prwlr.read_network(
        f"{PERMUTATION_TESTS_RESULTS_DIRECTORY}PHYS_permuted_prof_str_hyb_{k.replace(' ', '_').lower()}.{ext}",
        index_col=[0],
        sep=sep,
    )
    for k in biogrid_orgs
}

<a class=anchor id=pvalue></a>
#### p-values

In [None]:
# p-values
PHYS_native_permuted_prof_str_p_vals = {
    k: prwlr.read_network(
        f"{PERMUTATION_TESTS_RESULTS_DIRECTORY}PHYS_native_permuted_prof_str_p_vals_{k.replace(' ', '_').lower()}.{ext}",
        index_col=[0],
        sep=sep,
    )
    for k in biogrid_orgs
}
PHYS_native_permuted_prof_str_hyb_p_vals = {
    k: prwlr.read_network(
        f"{PERMUTATION_TESTS_RESULTS_DIRECTORY}PHYS_native_permuted_prof_str_hyb_p_vals_{k.replace(' ', '_').lower()}.{ext}",
        index_col=[0],
        sep=sep,
    )
    for k in biogrid_orgs
}
PHYS_native_permuted_prof_str_aff_p_vals = {
    k: prwlr.read_network(
        f"{PERMUTATION_TESTS_RESULTS_DIRECTORY}PHYS_native_permuted_prof_str_aff_p_vals_{k.replace(' ', '_').lower()}.{ext}",
        index_col=[0],
        sep=sep,
    )
    for k in biogrid_orgs
}

<a class=anchor id=transform_and_annotate></a>
### Transform and annotate

<a class=anchor id=merging_with_numerical_values></a>
#### Merging with numerical values

In [None]:
# Merging with numerical values
PHYS_native_permuted_prof_str_p_vals_num_vals = {
    k: pd.merge(
        left=v,
        right=PHYS_native_permuted_prof_str_p_vals[k],
        on=[prwlr.Columns.PROF_Q, prwlr.Columns.PROF_A]
    )
    for k, v in PHYS_native_prof_str.items()
}
PHYS_native_permuted_prof_str_aff_p_vals_num_vals = {
    k: pd.merge(
        left=v,
        right=PHYS_native_permuted_prof_str_aff_p_vals[k],
        on=[prwlr.Columns.PROF_Q, prwlr.Columns.PROF_A]
    )
    for k, v in PHYS_native_prof_str_aff.items()
}
PHYS_native_permuted_prof_str_hyb_p_vals_num_vals = {
    k: pd.merge(
        left=v,
        right=PHYS_native_permuted_prof_str_hyb_p_vals[k],
        on=[prwlr.Columns.PROF_Q, prwlr.Columns.PROF_A]
    )
    for k, v in PHYS_native_prof_str_hyb.items()
}

<a class=anchor id=mark_which_direction_the_native_values_are_tailed></a>
#### Mark which direction the native values are tailed

In [None]:
# Mark which direction the native values are tailed
for v in PHYS_native_permuted_prof_str_p_vals_num_vals.values():
    v['result_direction'] = v.apply(mark_result_direction, axis=1)
for v in PHYS_native_permuted_prof_str_aff_p_vals_num_vals.values():
    v['result_direction'] = v.apply(mark_result_direction, axis=1)
for v in PHYS_native_permuted_prof_str_hyb_p_vals_num_vals.values():
    v['result_direction'] = v.apply(mark_result_direction, axis=1)

<a class=anchor id=taxonomical_annotations></a>
#### Taxonomical annotations

In [None]:
# Taxonomical annotations

# Make the taxonomical bins fixed so it suits the apply method well and eliminates typo errors
assign_taxonomy_fixed_taxa = partial(assign_taxonomy, archea=archea, bacteria=bacteria, eukaryots=eukaryots)
# Deep copy the dataframes
PHYS_native_permuted_prof_str_p_vals_annotations = {
    k: v.copy()
    for k, v in PHYS_native_permuted_prof_str_p_vals_num_vals.items()
}

PHYS_native_permuted_prof_str_aff_p_vals_annotations = {
    k: v.copy()
    for k, v in PHYS_native_permuted_prof_str_aff_p_vals_num_vals.items()
}

PHYS_native_permuted_prof_str_hyb_p_vals_annotations = {
    k: v.copy()
    for k, v in PHYS_native_permuted_prof_str_hyb_p_vals_num_vals.items()
}
# Extract taxonomy data from the profiles and save it to another column
for v in PHYS_native_permuted_prof_str_p_vals_annotations.values():
    v['taxonomy_Q'] = v['PROF_Q'].apply(assign_taxonomy_fixed_taxa)
    v['taxonomy_A'] = v['PROF_A'].apply(assign_taxonomy_fixed_taxa)

for v in PHYS_native_permuted_prof_str_aff_p_vals_annotations.values():
    v['taxonomy_Q'] = v['PROF_Q'].apply(assign_taxonomy_fixed_taxa)
    v['taxonomy_A'] = v['PROF_A'].apply(assign_taxonomy_fixed_taxa)

for v in PHYS_native_permuted_prof_str_hyb_p_vals_annotations.values():
    v['taxonomy_Q'] = v['PROF_Q'].apply(assign_taxonomy_fixed_taxa)
    v['taxonomy_A'] = v['PROF_A'].apply(assign_taxonomy_fixed_taxa)
# Convert and transform taxonomy data into a single string
for v in PHYS_native_permuted_prof_str_p_vals_annotations.values():
    v['taxa_Q'] = v['PROF_Q'].apply(lambda x: '|'.join(x.get_present()))
    v['taxa_A'] = v['PROF_A'].apply(lambda x: '|'.join(x.get_present()))

for v in PHYS_native_permuted_prof_str_aff_p_vals_annotations.values():
    v['taxa_Q'] = v['PROF_Q'].apply(lambda x: '|'.join(x.get_present()))
    v['taxa_A'] = v['PROF_A'].apply(lambda x: '|'.join(x.get_present()))

for v in PHYS_native_permuted_prof_str_hyb_p_vals_annotations.values():
    v['taxa_Q'] = v['PROF_Q'].apply(lambda x: '|'.join(x.get_present()))
    v['taxa_A'] = v['PROF_A'].apply(lambda x: '|'.join(x.get_present()))

for v in PHYS_native_permuted_prof_str_p_vals_annotations.values():
    v['PROF_Q'] = v['PROF_Q'].apply(lambda x: x.to_string())
    v['PROF_A'] = v['PROF_A'].apply(lambda x: x.to_string())
    v['taxonomy_common'] = (v['taxonomy_Q'] == v['taxonomy_A'])

for v in PHYS_native_permuted_prof_str_aff_p_vals_annotations.values():
    v['PROF_Q'] = v['PROF_Q'].apply(lambda x: x.to_string())
    v['PROF_A'] = v['PROF_A'].apply(lambda x: x.to_string())
    v['taxonomy_common'] = (v['taxonomy_Q'] == v['taxonomy_A'])

for v in PHYS_native_permuted_prof_str_hyb_p_vals_annotations.values():
    v['PROF_Q'] = v['PROF_Q'].apply(lambda x: x.to_string())
    v['PROF_A'] = v['PROF_A'].apply(lambda x: x.to_string())
    v['taxonomy_common'] = (v['taxonomy_Q'] == v['taxonomy_A'])

<a class=anchor id=drop_columns_no_longer_neccessary></a>
#### Drop columns no longer necessary

In [None]:
# Drop columns no longer necessary
columns = [
    'INTERACTION',
    'native_higher',
    'native_lower',
    'inter_sum',
    'inter_number',
    'per_number',
    'p-value',
]
for v in PHYS_native_permuted_prof_str_p_vals_annotations.values():
    v.drop(columns=columns, inplace=True)
for v in PHYS_native_permuted_prof_str_aff_p_vals_annotations.values():
    v.drop(columns=columns, inplace=True)
for v in PHYS_native_permuted_prof_str_hyb_p_vals_annotations.values():
    v.drop(columns=columns, inplace=True)

<a class=anchor id=drop_duplicates></a>
#### Drop duplicates

In [None]:
# Drop duplicates
for v in PHYS_native_permuted_prof_str_p_vals_annotations.values():
    v.drop_duplicates(inplace=True)
for v in PHYS_native_permuted_prof_str_aff_p_vals_annotations.values():
    v.drop_duplicates(inplace=True)
for v in PHYS_native_permuted_prof_str_hyb_p_vals_annotations.values():
    v.drop_duplicates(inplace=True)

<a class=anchor id=save_annotated_results_to_files></a>
#### Save annotated results to CSV files

In [None]:
# Save to CSV
for k, v in PHYS_native_permuted_prof_str_p_vals_annotations.items():
    v.to_csv(f"./tmp/PHYS_native_permuted_prof_str_p_vals_annotations_{k.replace(' ', '_').lower()}.csv", sep='\t')
for k, v in PHYS_native_permuted_prof_str_aff_p_vals_annotations.items():
    v.to_csv(f"./tmp/PHYS_native_permuted_prof_str_aff_p_vals_annotations_{k.replace(' ', '_').lower()}.csv", sep='\t')
for k, v in PHYS_native_permuted_prof_str_hyb_p_vals_annotations.items():
    v.to_csv(f"./tmp/PHYS_native_permuted_prof_str_hyb_p_vals_annotations_{k.replace(' ', '_').lower()}.csv", sep='\t')

<a class=anchor id=network_enrichment_analysis_test_results></a>
## [Network Enrichment Analysis Test results (```r-neat``` package)](https://bionas.ibb.waw.pl:5001/sharing/KKJcCl3KI)

Download the data manually by clicking the link above.

Download the ```zip``` file into the ```CWD``` of this notebook and unzip it. If the download path is different, adjust the ```R_NEAT_RESULTS_PATH``` constant accordingly (see section [Constants](#constants)).

**The manual (non-programmatic) download is intentional**

<a class=anchor id=r-neat-read></a>
### Read

In [None]:
# Read r-neat results
# No selection
r_neat_results = {
    k: pd.read_csv(f"{R_NEAT_RESULTS_PATH}PHYS-{k}-r-neat.csv", sep=sep)
    for k in (i.replace(' ', '_') for i in biogrid_orgs)
}
# Affinity Binding
r_neat_results_aff = {
    k: pd.read_csv(f"{R_NEAT_RESULTS_PATH}PHYS-{k}_aff-r-neat.csv", sep=sep)
    for k in (i.replace(' ', '_') for i in biogrid_orgs)
}
# Two Hybrid
r_neat_results_hyb = {
    k: pd.read_csv(f"{R_NEAT_RESULTS_PATH}PHYS-{k}_hyb-r-neat.csv", sep=sep)
    for k in (i.replace(' ', '_') for i in biogrid_orgs)
}

<a class=anchor id=combined_results_of_permutatation_tests_and_neat></a>
## Combined results of permutation tests and NEAT

<a class=anchor id=combined-merge></a>
### Merge

In [None]:
# Merge r-neat results and prwlr results
# No interactions detection method filter
r_neat_results_prwlr_results = {
    k: pd.merge(
        left=PHYS_native_permuted_prof_str_p_vals_annotations[k.replace('_', ' ')],
        right=v,
        left_on=[prwlr.Columns.PROF_Q, prwlr.Columns.PROF_A],
        right_on=['A', 'B'],
    )
    for k, v in r_neat_results.items()
}
# Affinity Binding
r_neat_results_prwlr_results_aff = {
    k: pd.merge(
        left=PHYS_native_permuted_prof_str_aff_p_vals_annotations[k.replace('_', ' ')],
        right=v,
        left_on=[prwlr.Columns.PROF_Q, prwlr.Columns.PROF_A],
        right_on=['A', 'B'],
    )
    for k, v in r_neat_results_aff.items()
}
# Two hybrid
r_neat_results_prwlr_results_hyb = {
    k: pd.merge(
        left=PHYS_native_permuted_prof_str_hyb_p_vals_annotations[k.replace('_', ' ')],
        right=v,
        left_on=[prwlr.Columns.PROF_Q, prwlr.Columns.PROF_A],
        right_on=['A', 'B'],
    )
    for k, v in r_neat_results_hyb.items()
}
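
The merges above pair each annotated permutation-test row with the ```r-neat``` row whose ```A``` and ```B``` columns hold the same two profile strings. A minimal sketch on made-up data follows; the column names ```PROF_Q``` and ```PROF_A``` stand in for ```prwlr.Columns.PROF_Q``` and ```prwlr.Columns.PROF_A```, and all values are invented.

```python
import pandas as pd

prwlr_like = pd.DataFrame({
    'PROF_Q': ['++-', '+--', '---'],
    'PROF_A': ['+++', '+--', '++-'],
    'result_direction': ['higher', 'lower', 'insignificant'],
})
neat_like = pd.DataFrame({
    'A': ['++-', '+--'],
    'B': ['+++', '+--'],
    'pvalue': [0.01, 0.2],
})

# Default how='inner': only profile pairs found in both tables are kept.
merged = pd.merge(
    left=prwlr_like,
    right=neat_like,
    left_on=['PROF_Q', 'PROF_A'],
    right_on=['A', 'B'],
)
print(merged.shape)  # (2, 6)
```

Because the key columns have different names on each side, both pairs of key columns are kept in the result, and profile pairs missing from either table are dropped.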

<a class=anchor id=combined-save></a>
### Save

In [None]:
# Save
for k, v in r_neat_results_prwlr_results.items():
    v.to_csv(
        f"{R_NEAT_PRWLR_JOINED_PATH}r_neat_results_prwlr_results_{k}.csv",
        sep=sep,
    )

for k, v in r_neat_results_prwlr_results_aff.items():
    v.to_csv(
        f"{R_NEAT_PRWLR_JOINED_PATH}r_neat_results_prwlr_results_aff_{k}.csv",
        sep=sep,
    )

for k, v in r_neat_results_prwlr_results_hyb.items():
    v.to_csv(
        f"{R_NEAT_PRWLR_JOINED_PATH}r_neat_results_prwlr_results_hyb_{k}.csv",
        sep=sep,
    )

<a class=anchor id=plotting></a>
# Plotting

<a class=anchor id=combined-to-dot></a>
## Convert combined results to ```dot``` files

The attributes written to the ```dot``` file so that ```graphviz``` can draw the graph:

- edge width:
    - source column: ```inter_perc```
    - target attribute: ```penwidth```
    - values: multiplied by 100 (the default ```inter_perc_multiplier``` of ```df2dot```)
- edge color:
    - source column: ```result_direction```
    - target attribute: ```color```
    - values: ```green``` for ```higher```, ```red``` for ```lower```, ```grey``` for ```insignificant```
- edge style:
    - source column: ```taxonomy_common```
    - target attribute: ```style```
    - values: ```dashed``` for ```False```, ```solid``` for ```True```

- edge label:
    - significance reported by the ```r-neat``` package can be marked with ```*```

- nodes si

Modifications for avoiding syntax errors:

- replace empty strings with a placeholder, e.g. ```empty``` or ```no_ortholog```
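
How the mapped columns end up in the ```dot``` file can be sketched without any graph library. The snippet below writes one edge statement per row by hand; ```row_to_dot_edge``` is a hypothetical helper, the rows are made up, and the notebook itself delegates this step to ```networkx``` inside ```df2dot```.

```python
SIGNIFICANCE_COLORS = {'higher': 'green', 'lower': 'red', 'insignificant': 'grey'}
TAXONOMY_COMMON_STYLE = {True: 'solid', False: 'dashed'}


def row_to_dot_edge(row, multiplier=100):
    """Format one result row as an undirected DOT edge statement (illustrative)."""
    # Empty taxa strings are replaced with a placeholder, as described above.
    source = row['taxa_Q'] or 'no_ortholog'
    target = row['taxa_A'] or 'no_ortholog'
    attrs = {
        'penwidth': row['inter_perc'] * multiplier,
        'color': SIGNIFICANCE_COLORS[row['result_direction']],
        'style': TAXONOMY_COMMON_STYLE[row['taxonomy_common']],
    }
    attr_text = ', '.join(f'{key}="{value}"' for key, value in attrs.items())
    return f'"{source}" -- "{target}" [{attr_text}];'


# Made-up rows; plain dicts stand in for DataFrame rows.
rows = [
    {'taxa_Q': 'eco|bsu', 'taxa_A': 'hsa', 'inter_perc': 0.25,
     'result_direction': 'higher', 'taxonomy_common': False},
    {'taxa_Q': '', 'taxa_A': 'sce', 'inter_perc': 0.5,
     'result_direction': 'insignificant', 'taxonomy_common': True},
]
print('graph G {')
for row in rows:
    print('    ' + row_to_dot_edge(row))
print('}')
```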

<a class=anchor id=plotting-read></a>
### Read

In [None]:
# Read
# No interaction detection method selection
r_neat_results_prwlr_results = {
    k.replace(' ', '_'): pd.read_csv(
        f"{R_NEAT_PRWLR_JOINED_PATH}r_neat_results_prwlr_results_{k.replace(' ', '_')}.csv",
        sep=sep,
        index_col=[0],
    ).fillna({col: 'no-ortholog' for col in {'taxa_Q', 'taxa_A'}})
    for k in biogrid_orgs
}
# Affinity Binding
r_neat_results_prwlr_results_aff = {
    k.replace(' ', '_'): pd.read_csv(
        f"{R_NEAT_PRWLR_JOINED_PATH}r_neat_results_prwlr_results_aff_{k.replace(' ', '_')}.csv",
        sep=sep,
        index_col=[0],
    ).fillna({col: 'no-ortholog' for col in {'taxa_Q', 'taxa_A'}})
    for k in biogrid_orgs
}
# Two-hybrid selection
r_neat_results_prwlr_results_hyb = {
    k.replace(' ', '_'): pd.read_csv(
        f"{R_NEAT_PRWLR_JOINED_PATH}r_neat_results_prwlr_results_hyb_{k.replace(' ', '_')}.csv",
        sep=sep,
        index_col=[0],
    ).fillna({col: 'no-ortholog' for col in {'taxa_Q', 'taxa_A'}})
    for k in biogrid_orgs
}

<a class=anchor id=plotting-save></a>
### Save

In [None]:
# Convert dataframes to dot files
for k, v in r_neat_results_prwlr_results.items():
    df2dot(
        v.rename(columns={'pvalue': 'label'}),
        f"{R_NEAT_PRWLR_JOINED_PLOTS_PATH}r_neat_results_prwlr_results_{k}.dot",
    )

for k, v in r_neat_results_prwlr_results_aff.items():
    df2dot(
        v.rename(columns={'pvalue': 'label'}),
        f"{R_NEAT_PRWLR_JOINED_PLOTS_PATH}r_neat_results_prwlr_results_aff_{k}.dot",
    )

for k, v in r_neat_results_prwlr_results_hyb.items():
    df2dot(
        v.rename(columns={'pvalue': 'label'}),
        f"{R_NEAT_PRWLR_JOINED_PLOTS_PATH}r_neat_results_prwlr_results_hyb_{k}.dot",
    )

<a class=anchor id=dot-to-png></a>
## Convert ```dot``` to ```png```

**IMPORTANT NOTE**

The ```R_NEAT_PRWLR_JOINED_PLOTS_PATH``` variable defined in the ```Python``` namespace is not available in the ```Bash``` namespace. Thus, it must be defined separately within the cell containing the ```Bash``` call, and it must point to the directory where the ```dot``` files were saved.
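
If keeping two definitions of the path in sync is undesirable, the conversion could instead be driven from ```Python```, where the constant is available. This is an alternative sketch and not what the notebook does: ```dot_to_png_commands``` and ```convert_all``` are hypothetical helpers, and running the commands assumes that the ```dot``` executable of ```graphviz``` is on ```PATH```.

```python
from glob import glob
import subprocess


def dot_to_png_commands(directory):
    """Build one 'dot' command per *.dot file found in directory (illustrative)."""
    return [
        ['dot', '-Tpng', path, '-o', f'{path}.png']
        for path in sorted(glob(f'{directory}*.dot'))
    ]


def convert_all(directory):
    """Run the commands one by one; requires Graphviz to be installed."""
    for command in dot_to_png_commands(directory):
        subprocess.run(command, check=True)
```

Calling ```convert_all(R_NEAT_PRWLR_JOINED_PLOTS_PATH)``` would then write a ```png``` file next to each ```dot``` file, with the directory again expected to end with a trailing slash.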

In [None]:
%%bash

R_NEAT_PRWLR_JOINED_PLOTS_PATH='./presentation-main-data/physical-interactions-no-ko-dups-removal-plots/'

# Quote the paths so that names containing spaces are handled correctly
for i in "${R_NEAT_PRWLR_JOINED_PLOTS_PATH}"*.dot; do
    dot -T png "${i}" > "${i}.png"
done