# GPU-Accelerated Group-in-a-Box Layout for Analyzing Large Graphs

Visualizing interactions within large, complex datasets can be challenging. The Group-in-a-Box layout simplifies this by arranging communities into grids, making it easier to analyze relationships within and between groups. It’s especially useful for deeper insights in social media analysis.

PyGraphistry extends the [Group-in-a-box Layout for Multifaceted Analysis of Communities](https://www.cs.umd.edu/users/ben/papers/Rodrigues2011Group.pdf) with high-speed performance, flexible customization, and integration into the PyGraphistry ecosystem.
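To make the "communities into grids" idea concrete, here is a minimal sketch in plain pandas. It is a simplification, not PyGraphistry's implementation: it places each partition in a cell of a uniform, near-square grid, whereas the original paper sizes boxes with a treemap. The helper name `grid_boxes`, the `community` column, and the `box_size` parameter are all hypothetical.

```python
import math

import numpy as np
import pandas as pd


def grid_boxes(nodes: pd.DataFrame, partition_key: str, box_size: float = 1.0) -> pd.DataFrame:
    """Assign every partition one cell of a near-square grid (illustrative only)."""
    # value_counts sorts by size, so the largest partitions get the first cells
    sizes = nodes[partition_key].value_counts()
    n_cols = math.ceil(math.sqrt(len(sizes)))
    order = np.arange(len(sizes))
    return pd.DataFrame({
        partition_key: sizes.index.to_numpy(),
        'size': sizes.to_numpy(),
        'col': order % n_cols,
        'row': order // n_cols,
        # Lower-left corner of each box in layout coordinates
        'x0': (order % n_cols) * box_size,
        'y0': (order // n_cols) * box_size,
    })


# Hypothetical toy data: 7 nodes in 4 communities
nodes = pd.DataFrame({
    'n': range(7),
    'community': ['a', 'a', 'a', 'b', 'b', 'c', 'd'],
})
boxes = grid_boxes(nodes, 'community')
print(boxes)
```

Each partition would then be laid out independently and translated into its box, which is what keeps within-group structure readable while edges between boxes show cross-group ties.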

## Key Benefits

- **Faster Insights**: GPU support accelerates commercial workloads, reducing runtime for a 3M+ edge social network from 18 minutes to just 26 seconds. CPU mode likewise benefits from algorithmic and vector optimizations. The result is rapid analysis iterations that unblock workflows and achieve previously out-of-reach results.

- **Customizable Layouts**:
  - **Flexible Partitioning**: Choose from built-in algorithms or custom keys to align partitions with data structures like regional clusters or demographics.
  - **Adaptive Community Layouts**: Focus on tightly connected groups or highlight outliers to uncover hidden patterns.

- **Clear Visualization of Isolated Nodes**: The PyGraphistry variant additionally arranges isolated nodes in circles around their connected counterparts to handle the common case of noisy nodes within a partition.
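The isolated-node handling can be pictured with a small sketch. This is an assumption about the general technique, not PyGraphistry's actual code: find the nodes with no edges, then space them evenly on a circle around the connected part of the partition. The helper `ring_positions` and its `center` and `radius` parameters are hypothetical.

```python
import numpy as np
import pandas as pd


def ring_positions(n_points: int, center=(0.0, 0.0), radius: float = 1.0) -> pd.DataFrame:
    """Evenly space n_points on a circle around center (illustrative only)."""
    angles = np.linspace(0.0, 2.0 * np.pi, num=n_points, endpoint=False)
    return pd.DataFrame({
        'x': center[0] + radius * np.cos(angles),
        'y': center[1] + radius * np.sin(angles),
    })


# Hypothetical partition: nodes 0-2 are connected, nodes 3 and 4 have no edges
nodes = pd.DataFrame({'n': [0, 1, 2, 3, 4]})
edges = pd.DataFrame({'s': [0, 1], 'd': [1, 2]})

# A node is isolated if it never appears as an edge endpoint
endpoints = pd.concat([edges['s'], edges['d']])
isolated = nodes[~nodes['n'].isin(endpoints)].reset_index(drop=True)

# Place the isolated nodes on a ring; a real layout would derive the
# center and radius from the extent of the connected nodes
ring = ring_positions(len(isolated), center=(0.0, 0.0), radius=2.0)
isolated = pd.concat([isolated, ring], axis=1)
print(isolated)
```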

## Tutorial

Follow this tutorial to master PyGraphistry's Group-in-a-Box layout:

- Load an 88,000-edge Facebook network.
- Lay it out in seconds on CPU or in under a second on GPU.
- Customize partitioning and group layouts.


In [1]:
import pandas as pd
import graphistry
graphistry.__version__

'0+unknown'

In [2]:
# API key page (free GPU account): https://hub.graphistry.com/users/personal/key/
# Replace each FILL_ME_IN placeholder with your own credential string before running
graphistry.register(
    api=3,
    personal_key_id=FILL_ME_IN,
    personal_key_secret=FILL_ME_IN
)

## Data


In [18]:
e_df = pd.read_csv(
    'https://raw.githubusercontent.com/graphistry/pygraphistry/refs/heads/master/demos/data/facebook_combined.txt',
    sep=' ',
    names=['s', 'd']
)

print(e_df.shape)
e_df.head()

(88234, 2)


|   | s | d |
|---|---|---|
| 0 | 0 | 1 |
| 1 | 0 | 2 |
| 2 | 0 | 3 |
| 3 | 0 | 4 |
| 4 | 0 | 5 |


In [42]:
g = graphistry.edges(e_df, 's', 'd').scene_settings(point_size=0.5)

### Regular layout

In [43]:
g.plot()

## Group-in-a-Box

Passing in a pandas dataframe defaults to the igraph `fr` (Fruchterman-Reingold) layout on CPU for each partition.

In [44]:
g2 = g.group_in_a_box_layout()

edge index g._edge not set so using edge index as ID; set g._edge via g.edges(), or change merge_if_existing to False
edge index g._edge __edge_index__ missing as attribute in ig; using ig edge order for IDs
Pandas engine detected. FA2 falling back to igraph fr
edge index g._edge not set so using edge index as ID; set g._edge via g.edges(), or change merge_if_existing to False
edge index g._edge __edge_index__ missing as attribute in ig; using ig edge order for IDs

In [45]:
g2.plot()

### GPU Mode

Switching the input to GPU dataframes automatically moves execution to the GPU.

In [46]:
g_gpu = g.to_cudf()

g2_gpu = g_gpu.group_in_a_box_layout()

Failed to run cugraph algorithm and src/dst columns are numeric, coercing to strings and retrying

In [47]:
g2_gpu.plot()
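Because the engine follows the dataframe type, a notebook that must run on both GPU and CPU machines can decide up front which dataframes to pass. The sketch below is one possible guard using only the standard library; the helper name `gpu_dataframes_available` is hypothetical, and finding the `cudf` package only shows that it is installed, not that a usable GPU is present.

```python
import importlib.util


def gpu_dataframes_available() -> bool:
    """Return True if the cudf package can be found (it is not imported here)."""
    return importlib.util.find_spec('cudf') is not None


engine = 'cudf' if gpu_dataframes_available() else 'pandas'
print(f'Using {engine} dataframes')

# With the tutorial's `g` in scope, the choice would be applied as:
#   g_in = g.to_cudf() if engine == 'cudf' else g
#   g2 = g_in.group_in_a_box_layout()
```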

## Configure: Precomputed partition and alternate layout

* Use an existing node attribute to predetermine box membership
* Control the layout algorithm and its parameters

In [48]:
g_louvain = g.to_cudf().compute_cugraph('louvain', directed=False)
assert 'louvain' in g_louvain._nodes

In [52]:
from graphistry.plugins.igraph import layout_algs as igraph_layouts
from graphistry.plugins.cugraph import layout_algs as cugraph_layouts

{
    'igraph_layout_algs': ', '.join(igraph_layouts),
    'cugraph_layout_algs': ', '.join(cugraph_layouts)
}

{'igraph_layout_algs': 'auto, automatic, bipartite, circle, circular, dh, davidson_harel, drl, drl_3d, fr, fruchterman_reingold, fr_3d, fr3d, fruchterman_reingold_3d, grid, grid_3d, graphopt, kk, kamada_kawai, kk_3d, kk3d, kamada_kawai_3d, lgl, large, large_graph, mds, random, random_3d, rt, tree, reingold_tilford, rt_circular, reingold_tilford_circular, sphere, spherical, circle_3d, circular_3d, star, sugiyama',
 'cugraph_layout_algs': 'force_atlas2'}

In [53]:
(g_louvain
     .group_in_a_box_layout(
         partition_key='louvain',
         layout_alg='force_atlas2',
         layout_params={
             'lin_log_mode': True
         }
     )
).plot()
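The partition key does not have to come from a community-detection run; any node column can serve. As a hedged sketch, the pandas snippet below builds a node table with a hand-made `degree_bucket` column from a toy edge list. The column name, the threshold of 3, and the toy data are all hypothetical choices for illustration.

```python
import pandas as pd

# Hypothetical toy edge list using the tutorial's column names
edges = pd.DataFrame({'s': [0, 0, 0, 1, 4], 'd': [1, 2, 3, 2, 5]})

# Degree = number of times a node appears as either endpoint
degree = pd.concat([edges['s'], edges['d']]).value_counts()

nodes = pd.DataFrame({
    'n': degree.index.to_numpy(),
    'degree': degree.to_numpy(),
})

# Custom partition: separate high-degree hubs from everything else
nodes['degree_bucket'] = nodes['degree'].map(lambda d: 'hub' if d >= 3 else 'leaf')
print(nodes)

# Binding the node table and laying out by the custom key would then look like:
#   g_custom = graphistry.edges(edges, 's', 'd').nodes(nodes, 'n')
#   g_custom.group_in_a_box_layout(partition_key='degree_bucket').plot()
```

The same pattern applies to columns that already exist in your data, such as a region or demographic field.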