In [None]:
#hide
#default_exp network
from nbdev.showdoc import show_doc
from IPython.display import HTML
%load_ext autoreload
%autoreload 2

# network

>Analyzing groups of glycans as networks (e.g., biosynthesis networks)

In [None]:
#export
from glycowork.network.biosynthesis import *

`network` contains functions to arrange and analyze glycans in the context of networks. In such a network, each node represents a glycan, and each edge represents a relationship between two glycans, for instance a single biosynthetic step. Because `glycowork` treats glycans as molecular graphs, these networks are hierarchical graphs: the network is one graph, and each node within it is itself a graph. `network` contains the following modules:
- `biosynthesis` contains functions to construct and analyze biosynthetic glycan networks
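
To make the "graph of graphs" idea concrete, here is a minimal sketch in plain Python. The dictionary layout and the names `disaccharide`, `trisaccharide`, and `network` are illustrative assumptions; `glycowork` itself works with networkx objects, and its internal node layout may differ.

```python
# Inner level: one glycan as a small molecular graph.
# In this hand-written layout, monosaccharides and linkages are both nodes.
disaccharide = {
    "nodes": ["Gal", "b1-4", "Glc"],
    "edges": [(0, 1), (1, 2)],
}
trisaccharide = {
    "nodes": ["GlcNAc", "b1-3", "Gal", "b1-4", "Glc"],
    "edges": [(0, 1), (1, 2), (2, 3), (3, 4)],
}

# Outer level: the biosynthetic network. Each node is a glycan (keyed by its
# IUPAC-condensed string) that carries its own molecular graph.
network = {
    "nodes": {
        "Gal(b1-4)Glc": disaccharide,
        "GlcNAc(b1-3)Gal(b1-4)Glc": trisaccharide,
    },
    "edges": [("Gal(b1-4)Glc", "GlcNAc(b1-3)Gal(b1-4)Glc")],
}

# Walk the outer graph and inspect the inner graphs at both ends of each edge.
for precursor, product in network["edges"]:
    added = len(network["nodes"][product]["nodes"]) - len(network["nodes"][precursor]["nodes"])
    print(f"{precursor} -> {product}: {added} inner nodes added")
```

One biosynthetic step here adds two inner nodes: one monosaccharide and its linkage.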

# biosynthesis
>constructing and analyzing biosynthetic glycan networks

In [None]:
show_doc(subgraph_to_string)

<h4 id="subgraph_to_string" class="doc_header"><code>subgraph_to_string</code><a href="https://github.com/BojarLab/glycowork/tree/master/glycowork/network/biosynthesis.py#L43" class="source_link" style="float:right">[source]</a></h4>

> <code>subgraph_to_string</code>(**`subgraph`**, **`libr`**=*`None`*)

converts glycan subgraph back to IUPAC-condensed format

| Arguments:
| :-
| subgraph (networkx object): subgraph of one monosaccharide and its linkage
| libr (list): library of monosaccharides; if you have one use it, otherwise a comprehensive lib will be used

| Returns:
| :-
| Returns glycan motif in IUPAC-condensed format (string)

In [None]:
show_doc(get_neighbors)

<h4 id="get_neighbors" class="doc_header"><code>get_neighbors</code><a href="https://github.com/BojarLab/glycowork/tree/master/glycowork/network/biosynthesis.py#L59" class="source_link" style="float:right">[source]</a></h4>

> <code>get_neighbors</code>(**`glycan`**, **`glycans`**, **`libr`**=*`None`*, **`graphs`**=*`None`*)

find (observed) biosynthetic precursors of a glycan

| Arguments:
| :-
| glycan (string): glycan in IUPAC-condensed format
| glycans (list): list of glycans in IUPAC-condensed format
| libr (list): library of monosaccharides; if you have one use it, otherwise a comprehensive lib will be used
| graphs (list): list of the glycans in `glycans` as graphs; optional, useful if you call get_neighbors often with the same list and want to provide the graphs precomputed

| Returns:
| :-
| (1) a list of direct glycan precursors in IUPAC-condensed format
| (2) a list of indices where each precursor from (1) can be found in glycans

In [None]:
get_neighbors('Gal(b1-3)GlcNAc(b1-3)[Gal(b1-4)GlcNAc(b1-6)]Gal(b1-4)Glc',
             ['Gal(b1-3)GlcNAc(b1-3)[GlcNAc(b1-6)]Gal(b1-4)Glc', 'Gal(b1-4)Glc'])

(['Gal(b1-3)GlcNAc(b1-3)[GlcNAc(b1-6)]Gal(b1-4)Glc'], [[0], [], [], []])

In [None]:
show_doc(create_adjacency_matrix)

<h4 id="create_adjacency_matrix" class="doc_header"><code>create_adjacency_matrix</code><a href="https://github.com/BojarLab/glycowork/tree/master/glycowork/network/biosynthesis.py#L84" class="source_link" style="float:right">[source]</a></h4>

> <code>create_adjacency_matrix</code>(**`glycans`**, **`libr`**=*`None`*, **`virtual_nodes`**=*`False`*, **`reducing_end`**=*`['Glc', 'GlcNAc']`*)

creates a biosynthetic adjacency matrix from a list of glycans

| Arguments:
| :-
| glycans (list): list of glycans in IUPAC-condensed format
| libr (list): library of monosaccharides; if you have one use it, otherwise a comprehensive lib will be used
| virtual_nodes (bool): whether to include virtual nodes in network; default:False
| reducing_end (list): monosaccharides at the reducing end that are allowed; default:['Glc','GlcNAc']

| Returns:
| :-
| (1) adjacency matrix (glycan X glycan) denoting whether two glycans are connected by one biosynthetic step
| (2) list of which nodes are virtual nodes (empty list if virtual_nodes is False)
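
As a rough illustration of what a glycan X glycan adjacency matrix looks like, the sketch below builds one from a hand-written precursor lookup. The helper `build_adjacency` and the `direct_precursors` dictionary are hypothetical and not part of `glycowork`; treating the matrix as symmetric is also an assumption made for this example.

```python
# Glycans that index the rows and columns of the matrix.
glycans = [
    "Gal(b1-4)Glc",
    "GlcNAc(b1-3)Gal(b1-4)Glc",
    "Gal(b1-4)GlcNAc(b1-3)Gal(b1-4)Glc",
]

# Hypothetical lookup: product -> glycans exactly one biosynthetic step before it.
# Written out by hand here; in practice this could come from get_neighbors.
direct_precursors = {
    "GlcNAc(b1-3)Gal(b1-4)Glc": ["Gal(b1-4)Glc"],
    "Gal(b1-4)GlcNAc(b1-3)Gal(b1-4)Glc": ["GlcNAc(b1-3)Gal(b1-4)Glc"],
}


def build_adjacency(glycans, direct_precursors):
    """Return a symmetric 0/1 matrix; entry [i][j] is 1 if i and j are one step apart."""
    index = {g: i for i, g in enumerate(glycans)}
    n = len(glycans)
    matrix = [[0] * n for _ in range(n)]
    for product, precursors in direct_precursors.items():
        for precursor in precursors:
            # Ignore glycans that are not part of the matrix.
            if product in index and precursor in index:
                i, j = index[product], index[precursor]
                matrix[i][j] = matrix[j][i] = 1
    return matrix


adjacency = build_adjacency(glycans, direct_precursors)
for row in adjacency:
    print(row)
```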

In [None]:
show_doc(find_diff)

<h4 id="find_diff" class="doc_header"><code>find_diff</code><a href="https://github.com/BojarLab/glycowork/tree/master/glycowork/network/biosynthesis.py#L142" class="source_link" style="float:right">[source]</a></h4>

> <code>find_diff</code>(**`glycan_a`**, **`glycan_b`**, **`libr`**=*`None`*)

finds and returns the subgraph that differs between two glycans; only works if the differing subgraph is connected

| Arguments:
| :-
| glycan_a (string): glycan in IUPAC-condensed format
| glycan_b (string): glycan in IUPAC-condensed format
| libr (list): library of monosaccharides; if you have one use it, otherwise a comprehensive lib will be used

| Returns:
| :-
| Returns difference between glycan_a and glycan_b in IUPAC-condensed format

In [None]:
find_diff('Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
         'Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc')

'Fuca1-6'
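
The idea of a "difference" between two glycans can be approximated without any graph machinery. The sketch below is a simplification and not how `find_diff` works: it splits each string into monosaccharide-plus-linkage tokens with a regular expression and compares token counts, so it ignores topology entirely. The names `TOKEN` and `token_diff` are hypothetical.

```python
import re
from collections import Counter

# Matches a monosaccharide with an optional trailing linkage,
# e.g. "Fuc(a1-6)" or the reducing-end "GlcNAc".
TOKEN = re.compile(r"[A-Za-z0-9]+(?:\([ab?][0-9?]-[0-9?]\))?")


def token_diff(glycan_a, glycan_b):
    """Return the tokens present in the larger glycan but missing from the smaller one."""
    a = Counter(TOKEN.findall(glycan_a))
    b = Counter(TOKEN.findall(glycan_b))
    larger, smaller = (a, b) if sum(a.values()) >= sum(b.values()) else (b, a)
    extra = larger - smaller  # multiset difference, keeps positive counts only
    # Drop the parentheses to mimic the 'Fuca1-6' style of the output above.
    return [t.replace("(", "").replace(")", "") for t in extra.elements()]


diff = token_diff(
    "Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc",
    "Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc",
)
print(diff)
```

Because this counts tokens rather than comparing graphs, it cannot tell apart isomers that contain the same building blocks in different positions.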

In [None]:
show_doc(construct_network)

<h4 id="construct_network" class="doc_header"><code>construct_network</code><a href="https://github.com/BojarLab/glycowork/tree/master/glycowork/network/biosynthesis.py#L168" class="source_link" style="float:right">[source]</a></h4>

> <code>construct_network</code>(**`glycans`**, **`add_virtual_nodes`**=*`'none'`*, **`libr`**=*`None`*, **`reducing_end`**=*`['Glc', 'GlcNAc']`*, **`limit`**=*`5`*)

constructs a biosynthetic network from a list of glycans

| Arguments:
| :-
| glycans (list): list of glycans in IUPAC-condensed format
| add_virtual_nodes (string): indicates whether no ('none'), proximal ('simple'), or all ('exhaustive') virtual nodes should be added; default:'none'
| libr (list): library of monosaccharides; if you have one use it, otherwise a comprehensive lib will be used
| reducing_end (list): monosaccharides at the reducing end that are allowed; default:['Glc','GlcNAc']
| limit (int): maximum number of virtual nodes between observed nodes; default:5

| Returns:
| :-
| Returns a networkx object of the network

In [None]:
show_doc(plot_network)

<h4 id="plot_network" class="doc_header"><code>plot_network</code><a href="https://github.com/BojarLab/glycowork/tree/master/glycowork/network/biosynthesis.py#L222" class="source_link" style="float:right">[source]</a></h4>

> <code>plot_network</code>(**`network`**, **`plot_format`**=*`'kamada_kawai'`*, **`edge_label_draw`**=*`True`*)

visualizes biosynthetic network

| Arguments:
| :-
| network (networkx object): biosynthetic network, returned from construct_network
| plot_format (string): how to layout network, either 'kamada_kawai' or 'spring'; default:'kamada_kawai'
| edge_label_draw (bool): draws edge labels if True; default:True

In [None]:
show_doc(get_virtual_nodes)

<h4 id="get_virtual_nodes" class="doc_header"><code>get_virtual_nodes</code><a href="https://github.com/BojarLab/glycowork/tree/master/glycowork/network/biosynthesis.py#L339" class="source_link" style="float:right">[source]</a></h4>

> <code>get_virtual_nodes</code>(**`glycan`**, **`libr`**=*`None`*, **`reducing_end`**=*`['Glc', 'GlcNAc']`*)

find unobserved biosynthetic precursors of a glycan

| Arguments:
| :-
| glycan (string): glycan in IUPAC-condensed format
| libr (list): library of monosaccharides; if you have one use it, otherwise a comprehensive lib will be used
| reducing_end (list): monosaccharides at the reducing end that are allowed; default:['Glc','GlcNAc']

| Returns:
| :-
| (1) list of virtual node graphs
| (2) list of virtual nodes in IUPAC-condensed format

In [None]:
get_virtual_nodes('Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc')

([<networkx.classes.graph.Graph at 0x246e2123520>,
  <networkx.classes.graph.Graph at 0x246e2123880>,
  <networkx.classes.graph.Graph at 0x246e2123bb0>],
 ['Man(a1-3)Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc',
  'Man(a1-6)Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc',
  'Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc'])

In [None]:
show_doc(find_shared_virtuals)

<h4 id="find_shared_virtuals" class="doc_header"><code>find_shared_virtuals</code><a href="https://github.com/BojarLab/glycowork/tree/master/glycowork/network/biosynthesis.py#L264" class="source_link" style="float:right">[source]</a></h4>

> <code>find_shared_virtuals</code>(**`glycan_a`**, **`glycan_b`**, **`libr`**=*`None`*, **`reducing_end`**=*`['Glc', 'GlcNAc']`*)

finds virtual nodes that are shared between two glycans (i.e., that connect these two glycans)

| Arguments:
| :-
| glycan_a (string): glycan in IUPAC-condensed format
| glycan_b (string): glycan in IUPAC-condensed format
| libr (list): library of monosaccharides; if you have one use it, otherwise a comprehensive lib will be used
| reducing_end (list): monosaccharides at the reducing end that are allowed; default:['Glc','GlcNAc']

| Returns:
| :-
| Returns list of edges between glycan and virtual node (if virtual node connects the two glycans)

In [None]:
find_shared_virtuals('GlcNAc(b1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
                    'GlcNAc(b1-2)Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc')

[('GlcNAc(b1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
  'Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc'),
 ('GlcNAc(b1-2)Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
  'Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc')]
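
Conceptually, a shared virtual node is a precursor that appears in the precursor lists of both glycans. The sketch below shows that intersection step in plain Python. The helper `shared_virtual_edges` is hypothetical, and the precursor lists are written by hand for illustration rather than computed by `glycowork`.

```python
def shared_virtual_edges(glycan_a, glycan_b, virtuals_a, virtuals_b):
    """Return (glycan, virtual_node) edges for every virtual node both glycans share."""
    in_b = set(virtuals_b)
    edges = []
    for virtual in virtuals_a:
        if virtual in in_b:
            # The shared node links to both observed glycans.
            edges.append((glycan_a, virtual))
            edges.append((glycan_b, virtual))
    return edges


glycan_a = "GlcNAc(b1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc"
glycan_b = "GlcNAc(b1-2)Man(a1-6)[Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc"

# Hand-written candidate precursors (each glycan minus one terminal residue).
virtuals_a = [
    "Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc",
    "GlcNAc(b1-2)Man(a1-3)Man(b1-4)GlcNAc(b1-4)GlcNAc",
]
virtuals_b = [
    "Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc",
    "GlcNAc(b1-2)Man(a1-6)Man(b1-4)GlcNAc(b1-4)GlcNAc",
]

shared_edges = shared_virtual_edges(glycan_a, glycan_b, virtuals_a, virtuals_b)
for edge in shared_edges:
    print(edge)
```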

In [None]:
show_doc(fill_with_virtuals)

<h4 id="fill_with_virtuals" class="doc_header"><code>fill_with_virtuals</code><a href="https://github.com/BojarLab/glycowork/tree/master/glycowork/network/biosynthesis.py#L296" class="source_link" style="float:right">[source]</a></h4>

> <code>fill_with_virtuals</code>(**`glycans`**, **`libr`**=*`None`*, **`reducing_end`**=*`['Glc', 'GlcNAc']`*)

for a list of glycans, identify virtual nodes connecting observed glycans and return their edges

| Arguments:
| :-
| glycans (list): list of glycans in IUPAC-condensed format
| libr (list): library of monosaccharides; if you have one use it, otherwise a comprehensive lib will be used
| reducing_end (list): monosaccharides at the reducing end that are allowed; default:['Glc','GlcNAc']

| Returns:
| :-
| Returns list of edges that connect observed glycans to virtual nodes

In [None]:
show_doc(find_path)

<h4 id="find_path" class="doc_header"><code>find_path</code><a href="https://github.com/BojarLab/glycowork/tree/master/glycowork/network/biosynthesis.py#L414" class="source_link" style="float:right">[source]</a></h4>

> <code>find_path</code>(**`glycan_a`**, **`glycan_b`**, **`libr`**=*`None`*, **`reducing_end`**=*`['Glc', 'GlcNAc']`*, **`limit`**=*`5`*)

find virtual node path between two glycans

| Arguments:
| :-
| glycan_a (string): glycan in IUPAC-condensed format
| glycan_b (string): glycan in IUPAC-condensed format
| libr (list): library of monosaccharides; if you have one use it, otherwise a comprehensive lib will be used
| reducing_end (list): monosaccharides at the reducing end that are allowed; default:['Glc','GlcNAc']
| limit (int): maximum number of virtual nodes between observed nodes; default:5

| Returns:
| :-
| (1) list of edges to connect glycan_a and glycan_b via virtual nodes
| (2) dictionary of edge labels detailing difference between two connected nodes

In [None]:
find_path('Gal(b1-4)GlcNAc(b1-3)Gal(b1-4)Glc','Gal(b1-4)Glc')

([('Gal(b1-4)GlcNAc(b1-3)Gal(b1-4)Glc', 'GlcNAc(b1-3)Gal(b1-4)Glc'),
  ('GlcNAc(b1-3)Gal(b1-4)Glc', 'Gal(b1-4)Glc')],
 {('Gal(b1-4)GlcNAc(b1-3)Gal(b1-4)Glc', 'GlcNAc(b1-3)Gal(b1-4)Glc'): 'Galb1-4',
  ('GlcNAc(b1-3)Gal(b1-4)Glc', 'Gal(b1-4)Glc'): 'GlcNAcb1-3'})
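
One way to picture the search and the role of `limit` is a breadth-first search over one-step precursors. The sketch below is an assumption about the general technique, not the implementation of `find_path`; the `precursors` map is written by hand and `bfs_path` is a hypothetical helper.

```python
from collections import deque

# Hypothetical one-step precursor map (product -> precursors), written by hand.
precursors = {
    "Gal(b1-4)GlcNAc(b1-3)Gal(b1-4)Glc": ["GlcNAc(b1-3)Gal(b1-4)Glc"],
    "GlcNAc(b1-3)Gal(b1-4)Glc": ["Gal(b1-4)Glc"],
    "Gal(b1-4)Glc": [],
}


def bfs_path(start, goal, precursors, limit=5):
    """Breadth-first search from the larger glycan down to the smaller one.

    Returns a list of (product, precursor) edges, or [] if the goal cannot be
    reached with at most `limit` intermediate nodes.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, edges = queue.popleft()
        if node == goal:
            return edges
        # A path with k edges has k - 1 intermediates, so a node reached by
        # more than `limit` edges must not be extended any further.
        if len(edges) > limit:
            continue
        for nxt in precursors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, edges + [(node, nxt)]))
    return []


path = bfs_path("Gal(b1-4)GlcNAc(b1-3)Gal(b1-4)Glc", "Gal(b1-4)Glc", precursors)
for edge in path:
    print(edge)
```

With `limit=0` the same call returns an empty list, because the only route needs one intermediate node.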

In [None]:
show_doc(find_shortest_path)

<h4 id="find_shortest_path" class="doc_header"><code>find_shortest_path</code><a href="https://github.com/BojarLab/glycowork/tree/master/glycowork/network/biosynthesis.py#L470" class="source_link" style="float:right">[source]</a></h4>

> <code>find_shortest_path</code>(**`goal_glycan`**, **`glycan_list`**, **`libr`**=*`None`*, **`reducing_end`**=*`['Glc', 'GlcNAc']`*, **`limit`**=*`5`*)

finds the glycan with the shortest path via virtual nodes to the goal glycan

| Arguments:
| :-
| goal_glycan (string): glycan in IUPAC-condensed format
| glycan_list (list): list of glycans in IUPAC-condensed format
| libr (list): library of monosaccharides; if you have one use it, otherwise a comprehensive lib will be used
| reducing_end (list): monosaccharides at the reducing end that are allowed; default:['Glc','GlcNAc']
| limit (int): maximum number of virtual nodes between observed nodes; default:5

| Returns:
| :-
| (1) list of edges of the shortest path connecting goal_glycan and the closest glycan in glycan_list via virtual nodes
| (2) dictionary of edge labels detailing difference between two connected nodes in shortest path

In [None]:
find_shortest_path('Gal(b1-4)GlcNAc(b1-3)Gal(b1-4)Glc',
                  ['Gal(b1-4)Glc', 'Gal(b1-4)GlcNAc'])

([('Gal(b1-4)GlcNAc(b1-3)Gal(b1-4)Glc', 'GlcNAc(b1-3)Gal(b1-4)Glc'),
  ('GlcNAc(b1-3)Gal(b1-4)Glc', 'Gal(b1-4)Glc')],
 {('Gal(b1-4)GlcNAc(b1-3)Gal(b1-4)Glc', 'GlcNAc(b1-3)Gal(b1-4)Glc'): 'Galb1-4',
  ('GlcNAc(b1-3)Gal(b1-4)Glc', 'Gal(b1-4)Glc'): 'GlcNAcb1-3'})
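
The selection step can be sketched as "compute a path per candidate, then keep the shortest one that exists". The helper `pick_shortest` below is hypothetical, and the candidate paths are written by hand; in particular, the empty path for `Gal(b1-4)GlcNAc` is an assumption made for illustration (its reducing end differs from that of the goal glycan).

```python
def pick_shortest(paths_by_start):
    """Return (start_glycan, edges) for the shortest non-empty path, or None."""
    reachable = {glycan: path for glycan, path in paths_by_start.items() if path}
    if not reachable:
        return None
    best = min(reachable, key=lambda glycan: len(reachable[glycan]))
    return best, reachable[best]


# Hand-written candidate paths to the goal 'Gal(b1-4)GlcNAc(b1-3)Gal(b1-4)Glc'.
candidate_paths = {
    "Gal(b1-4)Glc": [
        ("Gal(b1-4)GlcNAc(b1-3)Gal(b1-4)Glc", "GlcNAc(b1-3)Gal(b1-4)Glc"),
        ("GlcNAc(b1-3)Gal(b1-4)Glc", "Gal(b1-4)Glc"),
    ],
    "Gal(b1-4)GlcNAc": [],  # assumed unreachable in this example
}

closest = pick_shortest(candidate_paths)
print(closest[0], "reached in", len(closest[1]), "steps")
```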

In [None]:
show_doc(network_alignment)

<h4 id="network_alignment" class="doc_header"><code>network_alignment</code><a href="https://github.com/BojarLab/glycowork/tree/master/glycowork/network/biosynthesis.py#L521" class="source_link" style="float:right">[source]</a></h4>

> <code>network_alignment</code>(**`network_a`**, **`network_b`**)

combines two networks into a new network

| Arguments:
| :-
| network_a (networkx object): biosynthetic network from construct_network
| network_b (networkx object): biosynthetic network from construct_network

| Returns:
| :-
| Returns combined network as a networkx object
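
Combining two networks amounts to taking the union of their nodes and edges. The sketch below shows this with plain sets; the helper `combine_networks`, the dictionary layout, and the `"a"`/`"b"`/`"both"` origin labels are assumptions for illustration and do not describe what `network_alignment` stores on the returned networkx object.

```python
def combine_networks(net_a, net_b):
    """Union the nodes and edges of two networks and record where each node came from."""
    nodes = {}
    for glycan in net_a["nodes"] | net_b["nodes"]:
        in_a = glycan in net_a["nodes"]
        in_b = glycan in net_b["nodes"]
        nodes[glycan] = "both" if in_a and in_b else ("a" if in_a else "b")
    edges = net_a["edges"] | net_b["edges"]
    return {"nodes": nodes, "edges": edges}


# Two tiny hand-written networks that overlap in one glycan.
net_a = {
    "nodes": {"Gal(b1-4)Glc", "GlcNAc(b1-3)Gal(b1-4)Glc"},
    "edges": {("Gal(b1-4)Glc", "GlcNAc(b1-3)Gal(b1-4)Glc")},
}
net_b = {
    "nodes": {"Gal(b1-4)Glc", "Fuc(a1-2)Gal(b1-4)Glc"},
    "edges": {("Gal(b1-4)Glc", "Fuc(a1-2)Gal(b1-4)Glc")},
}

combined = combine_networks(net_a, net_b)
for glycan, origin in sorted(combined["nodes"].items()):
    print(origin, glycan)
```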

In [None]:
show_doc(infer_virtual_nodes)

<h4 id="infer_virtual_nodes" class="doc_header"><code>infer_virtual_nodes</code><a href="https://github.com/BojarLab/glycowork/tree/master/glycowork/network/biosynthesis.py#L542" class="source_link" style="float:right">[source]</a></h4>

> <code>infer_virtual_nodes</code>(**`network_a`**, **`network_b`**, **`combined`**=*`None`*)

identifies virtual nodes that have been observed in other species

| Arguments:
| :-
| network_a (networkx object): biosynthetic network from construct_network
| network_b (networkx object): biosynthetic network from construct_network
| combined (networkx object): merged network of network_a and network_b from network_alignment; default:None

| Returns:
| :-
| (1) tuple of (virtual nodes of network_a observed in network_b, virtual nodes occurring in both network_a and network_b)
| (2) tuple of (virtual nodes of network_b observed in network_a, virtual nodes occurring in both network_a and network_b)
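
The two parts of each returned tuple correspond to two set intersections. The sketch below illustrates them for one network; `split_virtuals` is a hypothetical helper and the node sets are written by hand, so this is an outline of the idea rather than the library's code.

```python
def split_virtuals(virtual_own, observed_other, virtual_other):
    """Split a network's virtual nodes by how the other network sees them.

    Returns (virtual nodes observed in the other network,
             virtual nodes that are virtual in both networks).
    """
    supported = sorted(virtual_own & observed_other)
    shared_virtual = sorted(virtual_own & virtual_other)
    return supported, shared_virtual


# Hand-written node sets for two species-specific networks.
virtual_a = {"GlcNAc(b1-3)Gal(b1-4)Glc", "Fuc(a1-2)Gal(b1-4)Glc"}
observed_b = {"Gal(b1-4)Glc", "GlcNAc(b1-3)Gal(b1-4)Glc"}
virtual_b = {"Fuc(a1-2)Gal(b1-4)Glc"}

inferred_a = split_virtuals(virtual_a, observed_b, virtual_b)
print(inferred_a)
```

Only the first list contains virtual nodes that gain experimental support from the other network; the second list stays unobserved in both.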

In [None]:
show_doc(infer_network)

<h4 id="infer_network" class="doc_header"><code>infer_network</code><a href="https://github.com/BojarLab/glycowork/tree/master/glycowork/network/biosynthesis.py#L580" class="source_link" style="float:right">[source]</a></h4>

> <code>infer_network</code>(**`network`**, **`network_species`**, **`species_list`**, **`filepath`**=*`None`*, **`df`**=*`None`*, **`add_virtual_nodes`**=*`'exhaustive'`*, **`libr`**=*`None`*, **`reducing_end`**=*`['Glc', 'GlcNAc']`*, **`limit`**=*`5`*)

replaces virtual nodes if they are observed in other species

| Arguments:
| :-
| network (networkx object): biosynthetic network that should be inferred
| network_species (string): species from which the network stems
| species_list (list): list of species to compare network to
| filepath (string): filepath to load biosynthetic networks from other species, if precalculated (definitely recommended, as calculation will take ~1.5 hours); default:None
| df (dataframe): dataframe containing species-specific glycans; only needed if filepath=None; default:None
| add_virtual_nodes (string): indicates whether no ('none'), proximal ('simple'), or all ('exhaustive') virtual nodes should be added; only needed if filepath=None; default:'exhaustive'
| libr (list): library of monosaccharides; if you have one use it, otherwise a comprehensive lib will be used; only needed if filepath=None
| reducing_end (list): monosaccharides at the reducing end that are allowed; only needed if filepath=None; default:['Glc','GlcNAc']
| limit (int): maximum number of virtual nodes between observed nodes; only needed if filepath=None; default:5

| Returns:
| :-
| Returns network with filled in virtual nodes

In [None]:
#hide
from nbdev.export import notebook2script; notebook2script()

