# core

> Fill in a module description here

## Krackhardt's original approach

From Krackhardt's original paper:

The graph hierarchy condition states that in a digraph D, for each pair of points where one (Pi) can reach another (Pj), the second (Pj) cannot reach the first (Pi).
For example, in a formal organization chart a high-level employee can reach, through the chain of command, her subordinate's subordinate. If the formal organization is working "properly", this lower-level employee cannot simultaneously reach the high-level employee.
To measure the degree of hierarchy of digraph D, a new digraph Dr must be created. Dr is defined as the reachability digraph of D. Each point in D exists in Dr; moreover, the line (Pi,Pj) exists in Dr if and only if Pi can reach Pj in D. If D is graph hierarchic, then Dr will have no symmetric lines in it (i.e., if the line (Pi,Pj) exists in Dr then the line (Pj,Pi) does not).

The degree of hierarchy then is defined as:

    Graph Hierarchy = 1 - [V/MaxV]
    Where:
        V = Number of unordered pairs of points in Dr that are symmetrically linked
        MaxV = Number of unordered pairs of points in Dr where Pi is linked to Pj or vice versa.
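
The definition above can be sketched in plain Python. This is a minimal illustration, not Krackhardt's or any package's implementation: the function names `reachability` and `graph_hierarchy` are hypothetical, reachability is computed with a breadth-first search from every node, and returning NaN when no pair of points is linked is an assumption that the excerpt does not cover.

```python
from collections import deque
from itertools import combinations


def reachability(nodes, edges):
    """Map each node to the set of nodes it can reach via a path of length >= 1."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
    reach = {}
    for start in nodes:
        seen = set()
        queue = deque(adj[start])
        while queue:
            node = queue.popleft()
            if node not in seen:
                seen.add(node)
                queue.extend(adj[node])
        reach[start] = seen
    return reach


def graph_hierarchy(nodes, edges):
    """Return 1 - V / MaxV computed on the reachability relation of the digraph."""
    nodes = list(nodes)
    reach = reachability(nodes, edges)
    v = 0      # unordered pairs that are symmetrically linked in Dr
    max_v = 0  # unordered pairs linked in at least one direction in Dr
    for pi, pj in combinations(nodes, 2):
        forward = pj in reach[pi]
        backward = pi in reach[pj]
        if forward or backward:
            max_v += 1
            if forward and backward:
                v += 1
    if max_v == 0:
        return float("nan")  # assumption: undefined when nothing is linked
    return 1 - v / max_v


# Small digraph with one cycle (1 -> 2 -> 3 -> 1) feeding two chains into node 6.
nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (1, 5), (2, 4), (4, 6), (5, 6), (3, 1)]
print(graph_hierarchy(nodes, edges))  # V = 3, MaxV = 14, so 1 - 3/14
```

Only distinct pairs are examined, so a node that reaches itself through a cycle does not affect the score.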

**Definitions**:

- Hierarchy degree: This is a network-wide metric that determines how "hierarchical" a graph is.
From https://doi.org/10.1186/s13059-015-0624-2:
    The degree of hierarchy for a given network is not a well-defined concept. Ispolatov et al. [27] introduced the idea of dominant direction by minimizing the number of feedback links. While it is a proxy of hierarchical structure to a certain extent, the method does not provide a rigorous statistical confidence. Here, we define a metric to quantify the degree of hierarchy for a given hierarchical network, and then propose a new method called hierarchical score maximization (HSM) to infer the hierarchy of a directed network.

- Hierarchy metric (at node level): From the sna package in R.
    Hierarchy measures quantify the extent of asymmetry in a structure; the greater the extent of asymmetry, the more hierarchical the structure is said to be. (This should not be confused with how centralized the structure is, i.e., the extent to which centralities of vertex positions are highly concentrated.)
They discuss and offer two different approaches:

    - reciprocity: This setting returns one minus the dyadic reciprocity for each input graph (see grecip).
    - krackhardt: This setting returns the Krackhardt hierarchy score for each input graph. The Krackhardt hierarchy is defined as the fraction of non-null dyads in the reachability graph which are asymmetric. Thus, when no directed paths are reciprocated (e.g., in an in/outtree), Krackhardt hierarchy is equal to 1; when all such paths are reciprocated, by contrast (e.g., in a cycle or clique), the measure falls to 0.
        - Hierarchy is one of four measures (connectedness, efficiency, hierarchy, and lubness) suggested by Krackhardt for summarizing hierarchical structures. Each corresponds to one of four axioms which are necessary and sufficient for the structure in question to be an outtree; thus, the measures will be equal to 1 for a given graph iff that graph is an outtree. Deviations from unity can be interpreted in terms of failure to satisfy one or more of the outtree conditions, information which may be useful in classifying its structural properties.
        Note that hierarchy is inherently density-constrained: as densities climb above 0.5, the proportion of mutual dyads must (by the pigeonhole principle) increase rapidly, thereby reducing possibilities for asymmetry. Thus, the interpretation of hierarchy scores should take density into account, particularly if density is artifactual (e.g., due to a particular dichotomization procedure).
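
The density constraint in the note can be made concrete with a short calculation. The sketch below is an illustration only, with a hypothetical helper name: for a digraph with `n` nodes and `m` directed edges (no self-loops), it returns the largest share of non-null dyads that could possibly be asymmetric. It applies to whichever graph the dyads are counted on, which for the Krackhardt measure is the reachability graph.

```python
def max_asymmetric_fraction(n, m):
    """Upper bound on asymmetric / non-null dyads for n nodes and m directed edges."""
    dyads = n * (n - 1) // 2  # number of unordered node pairs
    if not 0 < m <= 2 * dyads:
        raise ValueError("m must be between 1 and n * (n - 1)")
    # Pigeonhole: once m exceeds the number of dyads, at least m - dyads
    # dyads must carry an edge in both directions.
    min_mutual = max(0, m - dyads)
    asymmetric = m - 2 * min_mutual
    non_null = asymmetric + min_mutual
    return asymmetric / non_null


n = 5
for m in (10, 15, 20):
    density = m / (n * (n - 1))
    print(density, max_asymmetric_fraction(n, m))
# 0.5 1.0
# 0.75 0.5
# 1.0 0.0
```

Below density 0.5 the bound stays at 1; above it, the bound falls linearly as 2 * (1 - density).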
    
- Flow Hierarchy Score: Described in XXXX (paper). Implementation from the NetworkX package.
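
NetworkX documents `flow_hierarchy` as the fraction of edges that do not participate in cycles. The sketch below is a plain-Python illustration of that idea for unweighted digraphs, not the NetworkX implementation: the helper names are hypothetical, duplicate edges are collapsed (an assumption), and an edge (u, v) is treated as lying on a cycle when v can reach u.

```python
from collections import deque


def reachable_from(adj, start):
    """Nodes reachable from start via a path of length >= 1."""
    seen = set()
    queue = deque(adj.get(start, ()))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(adj.get(node, ()))
    return seen


def flow_hierarchy_sketch(edges):
    """Share of directed edges that do not lie on any cycle."""
    edges = set(edges)  # assumption: parallel edges count once
    if not edges:
        raise ValueError("flow hierarchy is undefined for a graph without edges")
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
    reach = {}
    on_cycle = 0
    for u, v in edges:
        if v not in reach:
            reach[v] = reachable_from(adj, v)
        # The edge u -> v closes a cycle exactly when v can get back to u.
        if u in reach[v]:
            on_cycle += 1
    return 1 - on_cycle / len(edges)


edges = [(1, 2), (2, 3), (1, 5), (2, 4), (4, 6), (5, 6), (3, 1)]
# 3 of the 7 edges lie on the cycle 1 -> 2 -> 3 -> 1.
print(flow_hierarchy_sketch(edges))  # 1 - 3/7
```

Unlike the Krackhardt score, which counts node pairs in the reachability graph, this score counts edges of the original graph, so the two can differ on the same network.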

In [None]:
import os

import networkx
import numpy as np
import pandas as pd

import rpy2, rpy2.situation
from rpy2.robjects import r, pandas2ri
from rpy2.robjects.packages import importr

In [None]:
# G = networkx.DiGraph([(1, 2), (2, 3), (1, 5), (2, 4), (4, 6), (5, 6), (3, 1)])

In [None]:
# #Graph from OmniPath

# G = networkx.from_pandas_edgelist(pd.read_csv(f"{data_dir}/omnipath_curated_interactions.csv", index_col=0), source='source_genesymbol', target='target_genesymbol', create_using=networkx.DiGraph)
# G

In [None]:
# base = importr("base")
# sna = importr("sna")

# krackhardt_score = sna.hierarchy(networkx.to_numpy_array(G), measure="krackhardt")
# print(sna.hierarchy(networkx.to_numpy_array(networkx.DiGraph(G)), measure="krackhardt"))
# print(krackhardt_score)

In [None]:
# networkx.flow_hierarchy(G)

In [None]:
# diG = networkx.DiGraph(G)

# # First, compute the reachability graph.

# # # Manual computation is very slow (it calls has_path for every ordered node pair)
# # Dr = networkx.DiGraph()
# # Dr.add_nodes_from(diG.nodes())
# # # Iterate over node pairs from diG, skipping u == v to avoid self-loops
# # for u in diG.nodes():
# #     for v in diG.nodes():
# #         if u != v and networkx.has_path(diG, u, v):
# #             Dr.add_edge(u, v)

# # The transitive closure of a graph contains an edge (u, v) if there is a path from u to v in the original graph.
# # reflexive=None means that no self-loops are created.
# Dr = networkx.transitive_closure(diG, reflexive=None)

# # Count the non-null dyads in Dr and how many of them are asymmetric.
# # The loops visit ordered pairs, so each dyad is counted twice in both totals; the ratio is unaffected.
# asymmetric_dyads = 0
# non_null_dyads = 0

# for i in Dr.nodes():
#     for j in Dr.nodes():
#         if Dr.has_edge(i, j) or Dr.has_edge(j, i):
#             non_null_dyads += 1
#             if Dr.has_edge(i, j) != Dr.has_edge(j, i):
#                 asymmetric_dyads += 1

# # Compute the Krackhardt hierarchy score; it is undefined when no pair is connected.
# if non_null_dyads == 0:
#     print("No non-null dyads in Dr: the hierarchy score is undefined")
# else:
#     print(asymmetric_dyads / non_null_dyads)

In [None]:
#| default_exp core

In [None]:
#| hide
from nbdev.showdoc import *

In [None]:
#| export
def foo(): pass

In [None]:
#| hide
import nbdev; nbdev.nbdev_export()