# Repeated and Mixed Product Embeddings
In this notebook we build on the 'simple_product_embeddings' notebook and explore the effect of repeatedly applying graph product operations to graphs before embedding them with cycle counting. We also experiment with mixing effective combinations of graph products to further reduce collisions and improve the quality of the embeddings.

In [1]:
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from tqdm import tqdm

from compare import compare_embeddings as compare
from cycle_counting import embed_graph_cycles as cycles

from product import PRODUCTS

As a first step, we will write code to facilitate easy repeated application of different graph products.

In [2]:
def apply_products_embed(G, products, factors, size):
    """
    Apply several products with factor graphs to a given graph. The resulting
    graph is embedded into a cycle graph of the given size.

    Parameters
    ----------
    G : networkx.Graph, list of networkx.Graph
        A NetworkX graph.
    products : list of functions (G, factor_graph) -> G
        A list of functions that take a graph and a factor graph and return a
        graph.
    factors : list of networkx.Graph, networkx.Graph
        Factors to be used for the products. Either a list of graphs with the
        same length as `products` or a single graph that is used for all products.
    size : int
        The size of the resulting cycle graph embedding.

    Returns
    -------
    embedding : np.ndarray
        The embedding of the resulting graph into a cycle graph of the given
        size.
    """
    if isinstance(G, list):
        return [apply_products_embed(g, products, factors, size) for g in G]

    if isinstance(factors, nx.Graph):
        factors = [factors] * len(products)
    elif len(factors) != len(products):
        raise ValueError("Number of factors does not match number of products.")

    for product, factor in zip(products, factors):
        G = product(G, factor)

    return cycles(G, size)

def evaluate_products(products_df):
    """
    Evaluate the results of a product experiment, by comparing the resulting
    graph embeddings. 

    Parameters
    ----------
    products_df : pandas.DataFrame
        A dataframe with the results of the product experiment. The columns
        contain the different products and the index contains the factor graphs
        used for the products. Each cell contains a list of graph embeddings.

    Returns
    -------
    results : pandas.DataFrame
        A dataframe with the results of the evaluation.
    """
    print("Comparing embeddings...")
    # DataFrame.map applies `compare` cell-wise (pandas >= 2.1; older versions use applymap)
    results = products_df.map(compare)
    return results
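
The product pipeline in `apply_products_embed` is a left fold over the (product, factor) pairs. Below is a minimal sketch of that step in isolation. It uses NetworkX's built-in `nx.strong_product` and `nx.tensor_product` as stand-ins for the entries of `PRODUCTS` (an assumption, since the `product` module is not shown here), and `fold_products` is a hypothetical helper name.

```python
import networkx as nx

def fold_products(G, products, factors):
    """Apply each product in order (hypothetical helper, for illustration only)."""
    for product, factor in zip(products, factors):
        G = product(G, factor)
    return G

G = nx.path_graph(4)
H = fold_products(
    G,
    [nx.strong_product, nx.tensor_product],      # stand-ins for PRODUCTS entries
    [nx.path_graph(3), nx.complete_graph(3)],
)

# Each product multiplies the node count by the size of its factor: 4 * 3 * 3.
node_count = H.number_of_nodes()
print(node_count)
```

Because node counts multiply at every step, the graph size grows exponentially with the pipeline depth, which is consistent with the steep growth of the runtimes reported in the progress bars of this notebook.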

To allow a comparison with the results of the 'simple_product_embeddings' notebook, we will use the same dataset, the graph atlas.

In [3]:
Gs = [G for G in nx.graph_atlas_g() if not nx.is_empty(G) and nx.is_connected(G)]
len(Gs)

995
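
To see what the filter keeps, the selected graphs can be grouped by node count. This is a small sketch using only NetworkX and the standard library; the variable names are illustrative. Note that `nx.is_empty` is checked first, so the edgeless graphs (including the null graph, on which `nx.is_connected` would raise) are dropped before connectivity is tested.

```python
from collections import Counter

import networkx as nx

atlas = nx.graph_atlas_g()
connected = [G for G in atlas if not nx.is_empty(G) and nx.is_connected(G)]

# Number of selected graphs per node count.
by_order = Counter(G.number_of_nodes() for G in connected)
for n in sorted(by_order):
    print(n, by_order[n])

total = sum(by_order.values())
print(total)
```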

To evaluate whether repeated graph products improve the embeddings, we focus on the Strong, Tensor and Modular products. For these products we will examine:
- Repeated application of the same product with the same factor graph
- Repeated application of the same product with factor graphs of increasing size
- Mixed application of different products with factor graphs found effective in the previous experiments

In [4]:
products = {k: v for k, v in PRODUCTS.items() if k in ["Strong", "Tensor", "Modular"]}
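
For orientation, the Tensor and Strong products can be checked against their standard edge-count identities. The sketch below uses NetworkX's built-in `nx.tensor_product` and `nx.strong_product`, on the assumption that they correspond to the matching entries in `PRODUCTS` (the `product` module is not shown here). The Modular product is left out because its edge count also depends on the non-adjacent pairs of both graphs.

```python
import networkx as nx

G = nx.cycle_graph(5)
H = nx.path_graph(3)
nG, mG = G.number_of_nodes(), G.number_of_edges()
nH, mH = H.number_of_nodes(), H.number_of_edges()

tensor = nx.tensor_product(G, H)
strong = nx.strong_product(G, H)

# Tensor product: every pair of edges (one from G, one from H) yields two edges.
tensor_edges = 2 * mG * mH
# Strong product: Cartesian-product edges plus tensor-product edges.
strong_edges = nG * mH + nH * mG + 2 * mG * mH

print(tensor.number_of_edges(), tensor_edges)
print(strong.number_of_edges(), strong_edges)
```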

## Repeated Application

In [5]:
factor_graphs = {
    'K3': nx.complete_graph(3),
    'K5': nx.complete_graph(5),
    'P3': nx.path_graph(3),
    'P5': nx.path_graph(5),
    # nx.star_graph(n) has n + 1 nodes (one hub plus n leaves)
    'S3': nx.star_graph(3),
    'S5': nx.star_graph(5),
}

In [6]:
def results(depth):
    graph_products = pd.DataFrame(index=factor_graphs.keys(), columns=products.keys())
    graph_products.index.name = "Factor Graph"
    graph_products.columns.name = "Graph Product"

    # embed_size = 7 * max(#nodes in factor graph) ** depth
    max_factor_size = max(len(factor_graph) for factor_graph in factor_graphs.values())
    embed_size = 7 * (max_factor_size ** depth)
    print(f"Embedding graphs (size: {embed_size})...", flush=True)

    progress = tqdm(total=len(factor_graphs) * len(products))
    for factor_name, factor_graph in factor_graphs.items():
        for graph_name, product in products.items():
            product_pipeline = [product] * depth
            graph_products.loc[factor_name, graph_name] = apply_products_embed(Gs, product_pipeline, factor_graph, embed_size)
            progress.update(1)
    progress.close()
    return graph_products
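
The embedding size grows exponentially with the depth, because each product multiplies the node count by at most the size of the largest factor graph. A quick check of the arithmetic, rebuilding the same factor graphs as above:

```python
import networkx as nx

factor_graphs = {
    'K3': nx.complete_graph(3),
    'K5': nx.complete_graph(5),
    'P3': nx.path_graph(3),
    'P5': nx.path_graph(5),
    'S3': nx.star_graph(3),
    'S5': nx.star_graph(5),  # largest factor: 6 nodes
}

max_factor_size = max(len(g) for g in factor_graphs.values())
embed_sizes = {depth: 7 * max_factor_size ** depth for depth in (1, 2, 3)}
print(embed_sizes)
```

These values match the sizes printed by `results(1)`, `results(2)` and `results(3)`.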

### Embeddings & Comparisons

In [7]:
d1 = results(1)
evaluate_products(d1)

Embedding graphs (size: 42)...

100%|██████████| 18/18 [00:07<00:00,  2.34it/s]

Comparing embeddings...

| Factor Graph \ Graph Product | Strong | Tensor | Modular |
|---|---|---|---|
| K3 | 11795 | 255 | 283 |
| K5 | 9086 | 108 | 123 |
| P3 | 787 | 7311 | 36 |
| P5 | 33 | 549 | 7 |
| S3 | 1987 | 7263 | 72 |
| S5 | 2570 | 7224 | 28 |
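
Each table cell holds one number produced by `compare_embeddings`, whose implementation is not shown in this notebook. As a rough mental model only, and purely as an assumption, such a score could count the pairs of graphs whose embeddings coincide. The hypothetical helper `count_colliding_pairs` below sketches that idea; it is not the `compare` module's actual code.

```python
from collections import Counter

import numpy as np

def count_colliding_pairs(embeddings):
    """Count unordered pairs of exactly equal embeddings (hypothetical metric)."""
    groups = Counter(tuple(np.asarray(e).ravel().tolist()) for e in embeddings)
    # A group of k identical embeddings contributes k * (k - 1) / 2 pairs.
    return sum(k * (k - 1) // 2 for k in groups.values())

toy = [
    np.array([1, 0, 2]),
    np.array([1, 0, 2]),
    np.array([0, 3, 1]),
    np.array([1, 0, 2]),
]
pairs = count_colliding_pairs(toy)
print(pairs)
```

Under this reading, lower values are better and 0 means that every graph in the dataset received a distinct embedding.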


In [8]:
d2 = results(2)
evaluate_products(d2)

Embedding graphs (size: 252)...

100%|██████████| 18/18 [03:37<00:00, 12.08s/it]

Comparing embeddings...

| Factor Graph \ Graph Product | Strong | Tensor | Modular |
|---|---|---|---|
| K3 | 10280 | 38 | 76 |
| K5 | 10240 | 13 | 9 |
| P3 | 140 | 4308 | 37 |
| P5 | 0 | 0 | 26 |
| S3 | 479 | 7483 | 11 |
| S5 | 477 | 7488 | 17 |


In [9]:
d3 = results(3)
evaluate_products(d3)

Embedding graphs (size: 1512)...

100%|██████████| 18/18 [1:54:43<00:00, 382.41s/it]

Comparing embeddings...

| Factor Graph \ Graph Product | Strong | Tensor | Modular |
|---|---|---|---|
| K3 | 9544 | 2 | 27 |
| K5 | 10241 | 3 | 12 |
| P3 | 14 | 3115 | 2 |
| P5 | 0 | 0 | 54 |
| S3 | 1 | 2837 | 13 |
| S5 | 3 | 3061 | 13 |


## Increasing Size

In [10]:
factor_graphs = {
    'Complete': nx.complete_graph,
    'Path': nx.path_graph,
    'Star': nx.star_graph,
}

In [11]:
def results(start, end, depth):
    factor_graph_sizes = np.linspace(start, end, depth, dtype=int)
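    # Join the sizes explicitly: str(tuple(...)) of NumPy integers renders as
    # "(np.int64(3),...)" under NumPy >= 2 instead of "(3,...)"
    repr_sizes = "(" + ",".join(str(size) for size in factor_graph_sizes) + ")"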

    factor_combinations = [k + repr_sizes for k in factor_graphs.keys()]
    graph_products = pd.DataFrame(index=factor_combinations, columns=products.keys())
    graph_products.index.name = "Factor Graph"
    graph_products.columns.name = "Graph Product"


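    # embed_size = 7 * product of (size + 1) over all factor sizes
    # (size + 1 is the node count of nx.star_graph(size), the largest factor type)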
    embed_size = 7 * np.prod(factor_graph_sizes + 1)
    print(f"Embedding graphs (size: {embed_size})...", flush=True)

    progress = tqdm(total=len(factor_graphs) * len(products))
    for factor_name, factor_graph_type in zip(factor_combinations, factor_graphs.values()):
        for graph_name, product in products.items():
            product_pipeline = [product] * depth
            factor_pipeline = [factor_graph_type(size) for size in factor_graph_sizes]
            graph_products.loc[factor_name, graph_name] = apply_products_embed(Gs, product_pipeline, factor_pipeline, embed_size)  
            progress.update(1)
    progress.close()
    return graph_products
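
The size schedule and the resulting embedding size can be checked in isolation. The sketch below repeats only the `np.linspace` and `np.prod` steps from `results(start, end, depth)` for the two configurations used in this section; the `schedule` dictionary is an illustrative name.

```python
import numpy as np

schedule = {}
for start, end, depth in [(3, 5, 2), (3, 5, 3)]:
    # Evenly spaced factor sizes from start to end, cast to integers.
    sizes = np.linspace(start, end, depth, dtype=int)
    embed_size = 7 * np.prod(sizes + 1)
    schedule[(start, end, depth)] = (sizes.tolist(), int(embed_size))
    print(sizes.tolist(), int(embed_size))
```

With `dtype=int`, intermediate sizes that are not whole numbers are rounded down to an integer, so `start`, `end` and `depth` should be chosen with that in mind.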

In [12]:
step2 = results(3, 5, 2)
evaluate_products(step2)

Embedding graphs (size: 168)...

100%|██████████| 9/9 [01:19<00:00,  8.80s/it]

Comparing embeddings...

| Factor Graph \ Graph Product | Strong | Tensor | Modular |
|---|---|---|---|
| Complete(3,5) | 10315 | 6 | 8 |
| Path(3,5) | 2 | 116 | 42 |
| Star(3,5) | 479 | 7483 | 11 |


In [13]:
step3 = results(3, 5, 3)
evaluate_products(step3)

Embedding graphs (size: 840)...

100%|██████████| 9/9 [29:59<00:00, 199.91s/it]

Comparing embeddings...

| Factor Graph \ Graph Product | Strong | Tensor | Modular |
|---|---|---|---|
| Complete(3,4,5) | 9544 | 2 | 27 |
| Path(3,4,5) | 0 | 2 | 19 |
| Star(3,4,5) | 415 | 2836 | 23 |


## Mixed Application
In the previous experiments, the Modular product produced surprisingly good results across all factor graphs. The Strong product paired with the Path factor graph was also quite effective, as was the Tensor product with the Complete and Path factor graphs. To pair each product with a different type of factor graph, we will combine:
- Modular product with Star factor graph
- Strong product with Path factor graph
- Tensor product with Complete factor graph

In [14]:
factor_graphs = {
    'Complete': nx.complete_graph,
    'Path': nx.path_graph,
    'Star': nx.star_graph,
}

In [15]:
def results(product_pipe, factor_pipe):
    
    embed_size = 7 * np.prod([len(factor_graph) for factor_graph in factor_pipe])
    print(f"Embedding graphs (size: {embed_size})...", flush=True)

    return apply_products_embed(Gs, product_pipe, factor_pipe, embed_size)

In [16]:
factor_matches = {
    'Modular': factor_graphs['Star'](5),
    'Strong': factor_graphs['Path'](5),
    'Tensor': factor_graphs['Complete'](5),
}

def combine(*product_factor_pairs):
    product_pipe = []
    factor_pipe = []
    for product, factor in product_factor_pairs:
        product_pipe.append(product)
        factor_pipe.append(factor)
    return product_pipe, factor_pipe
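
For the mixed pipelines, the embedding size is 7 times the product of the node counts of the factor graphs involved. The sketch below recomputes those sizes for all two-step combinations and for the three-step pipeline; `embed_size_for` is a hypothetical helper that mirrors the formula in `results(product_pipe, factor_pipe)`.

```python
import networkx as nx
import numpy as np

factor_matches = {
    'Modular': nx.star_graph(5),     # 6 nodes
    'Strong': nx.path_graph(5),      # 5 nodes
    'Tensor': nx.complete_graph(5),  # 5 nodes
}

def embed_size_for(names):
    """7 * product of factor node counts (hypothetical helper)."""
    return 7 * int(np.prod([len(factor_matches[name]) for name in names]))

pair_sizes = {
    (first, second): embed_size_for([first, second])
    for first in factor_matches
    for second in factor_matches
}
triple_size = embed_size_for(['Modular', 'Strong', 'Tensor'])

print(sorted(set(pair_sizes.values())), triple_size)
```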

In [17]:
all_2d_combinations = pd.DataFrame(index=products.keys(), columns=products.keys())
all_2d_combinations.index.name = "First Product"
all_2d_combinations.columns.name = "Second Product"

for first_product in products.keys():
    for second_product in products.keys():
        product_pipe, factor_pipe = combine(
            (products[first_product], factor_matches[first_product]),
            (products[second_product], factor_matches[second_product]),
        )
        all_2d_combinations.loc[first_product, second_product] = results(product_pipe, factor_pipe)

evaluate_products(all_2d_combinations)     

Embedding graphs (size: 175)...
Embedding graphs (size: 175)...
Embedding graphs (size: 210)...
Embedding graphs (size: 175)...
Embedding graphs (size: 175)...
Embedding graphs (size: 210)...
Embedding graphs (size: 210)...
Embedding graphs (size: 210)...
Embedding graphs (size: 252)...
Comparing embeddings...

| First Product \ Second Product | Strong | Tensor | Modular |
|---|---|---|---|
| Strong | 0 | 2 | 22 |
| Tensor | 6 | 13 | 4 |
| Modular | 0 | 0 | 17 |


In [18]:
p_pipe, f_pipe = combine(
    (products['Modular'], factor_matches['Modular']),
    (products['Strong'], factor_matches['Strong']),
    (products['Tensor'], factor_matches['Tensor']),
)
ms_sp_tc = pd.DataFrame({'Result': [results(p_pipe, f_pipe)]})
evaluate_products(ms_sp_tc)

Embedding graphs (size: 1050)...
Comparing embeddings...

| | Result |
|---|---|
| 0 | 0 |
