Skip to content

bpames/DENSUB

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

DENSUB.py -- ADMM for the DENsest SUBmatrix problem

Introduction

This is a Python module implementing the algorithms developed in the paper Convex optimization for the densest subgraph and densest submatrix problems.

See also R-package and Matlab package.

The problem of identifying a dense submatrix is a fundamental problem in the analysis of matrix structure and complex networks. This code provides tools for identifying the densest submatrix of the fixed size in a given graph/matrix using first-order optimization methods.

See the tutorial below to get started.

Usage

The densub module contains the following:

  • plantedsubmatrix: function that generates binary matrix sampled from dense submatrix of particular size.
  • mat_shrink: function for evaluating the matrix soft-threholding operator applied to vector of singular values (used in the $X$-update step of densub.m)
  • DenSub: class of ADMM algorithms for our relaxation of the densest subgraph and submatrix problems.

Like other python modules, densub can be used after moving it to your file path and loading it:

from densub import *

Note we can also load densub with a prefix using one of the following.

import densub
import densub as ds

Examples

We illustrate use of this package on three different types of data:

  1. Using random matrices sampled from the planted dense $m \times n$ submatrix model;

  2. Using random adjacency matrices of random graphs sampled from the planted dense subgraph model;

  3. Using a real-world collaboration network.

Random Matrices

We generate a random matrix with noise obscuring the planted submatrix using the function plantedsubmatrix and then use the solver from the DenSub class to recover the planted submatrix.

# Dimensions of the full matrix.
M, N = 100, 200

# Dimensions of the dense block.
m, n = 50, 40

# In-group and noise density.
q, p = 0.85, 0.25

# Make binary matrix with planted mn-submatrix
[A, X0, Y0] = plantedsubmatrix(M,N,m,n,p,q)

After generating the random matrix with desired planted structure, we can visually represent the matrix and planted submatrix as two-tone images, where dark pixels correspond to nonzero entries, and light pixels correspond to zero entries.

$A$, $X_0$, and $Y_0$

The vizualization of the randomly generated matrix $A$ helps us to understand its structure. It is clear that it contains a dense 50 x 40 block (top left corner); this block is indexed by the matrix $X_0$. On the other hand, the nonzero entries of the matrix $Y_0$ are exactly those where $A$ is $0$ and $X_0$ is not.

We call the ADMM solver with default settings and visualize the output:

ds = DenSub(verbose=True)
[X,Y,Q,its] = ds.solve(A, m, n, 6/(np.sqrt(m*n)*(q-p)))

The ADMM solver returns the optimal solutions $X$ and $Y$:

$X$ and $Y$

It must be noted that matrices $X$ and $Y$ are identical to the actual structures of $X_0$ and $Y_0$. The planted submatrix is recovered.

Symmetric Matrices and Dense Subgraphs

DenSub can also be used to solve the densest subgraph problem without modification.

To illustrate this process, we first make a random graph containing $60$ nodes with a single dense subgraph with $40$ nodes. We use the same probabilities of edge creation from our previous example. We sample the random graph using the stochastic_block_model function from the Networkx package.

# Size of random matrix.
M, N = 60, 60 

# Size of planted block.
m = 40

# Connectivity probabilities.
p = [[0.85, 0.25], [0.25, 0.25]]

# Sample the graph.
G = nx.stochastic_block_model(sizes=[m, M - m], p=p, seed=42)

# Generate the adjacency matrix.
AG = nx.adjacency_matrix(G).toarray()

Graph Sampled from the Dense Subgraph Model

We can find this subgraph by calling DenSub. Here, we change the optimization parameters from their defaults for illustration purposes.

tau = 0.5
tol = 1e-5
maxits = 1000

m = 40

ds = DenSub(tau=tau, opt_tol=tol, maxiter=maxits, verbose=True)
[X,Y,Q,its] = ds.solve(AG, m, m, 8/m)

As before, the DenSub solver converges to the matrix representation of the planted dense subgraph highlighted in the figures below.

Nodes of the planted subgraph are highlighted as orange squares.

Reusing Class Instances

Our implementation of DenSub as a class of optimization algorithms, distinguished only by iteration parameters, allows us to solve different problem instances with these same settings.

For example, we can solve the original instance of the planted submatrix problem with the settings we used to recover the planted subgraph using the same solver.

[X,Y,Q,its] = ds.solve(A, m=50, n=40, gamma=6/np.sqrt(200*0.6))

Here, we converge to the same solution but in more iterations when using this choice of Augmented Lagrangian parameter $\tau$ and a more strict stopping tolerance.

The JAZZ Collaboration Data Set

The following is an example on how one could use the package to analyze the collaboration network found in the JAZZ dataset (see Community Structure in Jazz. Gleiser and Danon. 2003). This is a social/collaboration network where each pair of musicians are linked if they have performed together. The maximum clique in this network contains 30 musicians.

# Read the graph from adjacency list file.
Gjazz = nx.read_adjlist("DEMO/jazz.txt")
Gjazz = nx.convert_node_labels_to_integers(Gjazz, first_label=0, ordering='default', label_attribute=None)

# Generate the adjacency matrix.
Ajazz = nx.adjacency_matrix(Gjazz).toarray()

The JAZZ graph

We are now ready to try to identify the densest submatrix of size 30 in this adjacency matrix. Here, we set $m=n=30$ in our call to DenSub.

m = 30
ds = DenSub(tau=0.35, opt_tol=1e-4, maxiter=2000, verbose=True)
[X,Y,Q,its] = ds.solve(Ajazz, m, m, 12/m)

Our algorithm finds the maximum clique, corresponding to the group of musicians indexed by nonzero entries of $X$ visualized below.

Nodes of the maximum clique in the JAZZ graph are highlighted oranged squares

How to contribute

  • Fork, clone, edit, commit, push, create pull request.

Reporting bugs and other issues

If you encounter a clear bug, please file a minimal reproducible example on github.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published