# Community Detection Using the `NETWORK` Action Set in SAS Viya and Python

#### Imports

Our imports are broken out as follows:

| Module        | Description                                                                        |
|:--------------|:----------------------------------------------------------------------------------:|
| `os`          | Allows access to environment variables.                                            |
| `swat`        | SAS Python module that orchestrates communication with a CAS server.               |
| `pandas`      | Data management module we use for preparation of local data.                       |
| `networkx`    | Used to manage graph data structures when plotting.                                |
| `bokeh`       | Module used to generate interactive plots of graphs.                               |
| `python_demo` | Custom module written for these examples that handles datasets and visualizations. |

In [1]:
import os
import swat
import pandas as pd

import networkx as nx

from bokeh.io import output_notebook, show
from bokeh.layouts import gridplot
from bokeh.palettes import Spectral8

from python_demo.datasets.examples import community_graph_links, community_graph_nodes
from python_demo.visualization.bokeh_graphs import render_plot

The call to `output_notebook` is required by `bokeh` to render plots inside Jupyter Notebooks.

In [2]:
output_notebook()

In [3]:
host = os.environ['CAS_HOST']
port = int(os.environ['CAS_PORT'])
print(f"{host}:{port}")

rdcgrd113.unx.sas.com:23404
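
Indexing `os.environ` directly, as above, raises `KeyError` when either variable is unset. The following is a minimal sketch of a more forgiving lookup. The helper name `cas_endpoint` and the fallback values (`localhost`, `5570`) are assumptions made for illustration and are not part of the original notebook.

```python
import os


def cas_endpoint(default_host="localhost", default_port=5570):
    """Return (host, port) from CAS_HOST / CAS_PORT, using fallbacks if unset.

    The defaults are placeholders (assumptions); adjust them for your site.
    """
    host = os.environ.get("CAS_HOST", default_host)
    port = int(os.environ.get("CAS_PORT", default_port))
    return host, port


host, port = cas_endpoint()
print(f"{host}:{port}")
```

Converting the port with `int` keeps it in the same type that the original cell passes to `swat.CAS`.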


In [4]:
dfLinkSetIn = community_graph_links()
dfNodeSetIn = community_graph_nodes()

Let's start by looking at the basic network itself.

We create a `networkx` graph and pass it to our `bokeh` helper function to create the initial plot.

In [5]:
G_comm = nx.from_pandas_edgelist(dfLinkSetIn, 'from', 'to')

title = 'Sample Undirected Graph for Community Detection'
hover = [('Node', '@index')]
nodeSize = 25

plot = render_plot(G_comm, title, hover, nodeSize)
show(plot)

In [6]:
conn = swat.CAS(host, port)

In [7]:
_ = conn.loadactionset("network")

NOTE: Added action set 'network'.


In [8]:
conn.caslibinfo()

Unnamed: 0,Name,Type,Description,Path,Definition,Subdirs,Local,Active,Personal,Hidden,Transient
0,CASTestTmp,PATH,castest's test files,/bigdisk/lax/castest/,,1.0,0.0,0.0,0.0,0.0,0.0
1,CASUSER(daherr),PATH,Personal File System Caslib,/u/daherr/,,1.0,0.0,1.0,1.0,0.0,1.0
2,Formats,PATH,Format Caslib,/bigdisk/lax/formats/,,1.0,0.0,0.0,0.0,0.0,0.0


In [9]:
_ = conn.upload(dfLinkSetIn, casout=dict(name='LinkSetIn'))
_ = conn.upload(dfNodeSetIn, casout=dict(name='NodeSetIn'))

NOTE: Cloud Analytic Services made the uploaded file available as table LINKSETIN in caslib CASUSER(daherr).
NOTE: The table LINKSETIN has been created in caslib CASUSER(daherr) from binary data uploaded to Cloud Analytic Services.
NOTE: Cloud Analytic Services made the uploaded file available as table NODESETIN in caslib CASUSER(daherr).
NOTE: The table NODESETIN has been created in caslib CASUSER(daherr) from binary data uploaded to Cloud Analytic Services.


In [10]:
conn.network.community(links=dict(name='LinkSetIn'),
                       outnodes=dict(name='nodeSetOutA'),
                       outLevel=dict(name='CommLevelOut'),
                       outCommunity=dict(name='CommOut'),
                       outOverlap=dict(name='CommOverlapOut'),
                       outCommLinks=dict(name='CommLinksOut'),
                       resolutionList=[0.5, 1])

NOTE: The number of nodes in the input graph is 9.
NOTE: The number of links in the input graph is 11.
NOTE: Processing community detection using 1 threads across 1 machines.
NOTE: At resolution=1, the community algorithm found 3 communities with modularity=0.392562.
NOTE: At resolution=0.5, the community algorithm found 2 communities with modularity=0.342975.
NOTE: Processing community detection used 0.00 (cpu: 0.00) seconds.


Unnamed: 0,casLib,Name,Label,Rows,Columns,casTable
0,CASUSER(daherr),nodeSetOutA,,9,3,"CASTable('nodeSetOutA', caslib='CASUSER(daherr)')"
1,CASUSER(daherr),CommLinksOut,,3,5,"CASTable('CommLinksOut', caslib='CASUSER(daher..."
2,CASUSER(daherr),CommOut,,5,9,"CASTable('CommOut', caslib='CASUSER(daherr)')"
3,CASUSER(daherr),CommLevelOut,,2,4,"CASTable('CommLevelOut', caslib='CASUSER(daher..."
4,CASUSER(daherr),CommOverlapOut,,11,3,"CASTable('CommOverlapOut', caslib='CASUSER(dah..."

Unnamed: 0,Name1,Label1,cValue1,nValue1
0,numNodes,Number of Nodes,9,9.0
1,numLinks,Number of Links,11,11.0
2,graphDirection,Graph Direction,Undirected,

Unnamed: 0,Name1,Label1,cValue1,nValue1
0,problemType,Problem Type,Community Detection,
1,status,Solution Status,OK,
2,cpuTime,CPU Time,0.00,0.0
3,realTime,Real Time,0.00,0.00012
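
The log above reports a modularity value for each resolution. As a rough illustration of what that number measures, the sketch below computes the standard Newman modularity, with a resolution multiplier on the expected-links term, in plain Python. The toy graph is hypothetical (it is not the notebook's dataset), and whether the action set reports exactly this quantity is an assumption that should be checked against the SAS documentation.

```python
from collections import defaultdict


def modularity(edges, membership, resolution=1.0):
    """Modularity of a partition of an undirected, unweighted graph.

    Q = sum over communities c of  L_c / m  -  resolution * (d_c / (2 m))**2
    where m is the number of links, L_c the number of links inside c, and
    d_c the sum of the degrees of the nodes in c. Self-loops are not handled.
    """
    m = len(edges)
    internal = defaultdict(int)    # L_c: links with both ends in community c
    degree_sum = defaultdict(int)  # d_c: total degree of community c
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    return sum(internal[c] / m - resolution * (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hypothetical toy graph: two triangles joined by a single bridge link.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_groups = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
one_group = {n: 0 for n in range(6)}

print(round(modularity(edges, two_groups), 6))  # one community per triangle
print(round(modularity(edges, one_group), 6))   # everything in one community
```

Splitting the toy graph into its two triangles scores 5/14 (about 0.357), while lumping all nodes together scores 0, which is why a higher modularity is read as a better-separated partition.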


In [11]:
# pull the node set locally so we can plot
comm_nodes_cas = conn.CASTable('NodeSetOutA').to_dict(orient='index')

In [12]:
comm_nodes_0 = {v['node']:v['community_0'] for v in comm_nodes_cas.values()}
comm_nodes_1 = {v['node']:v['community_1'] for v in comm_nodes_cas.values()}
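
For readers unfamiliar with `to_dict(orient='index')`: it returns one dictionary per row, keyed by row index, and the comprehensions above reshape that into a node-to-community mapping. The sketch below shows the shape with three made-up rows. The values are hypothetical and the helper `node_attribute` is introduced here only for illustration.

```python
# Hypothetical rows in the shape produced by to_dict(orient='index'):
# {row_index: {column_name: value, ...}, ...}. The values are made up.
rows = {
    0: {'node': 'A', 'community_0': 0.0, 'community_1': 0.0},
    1: {'node': 'B', 'community_0': 1.0, 'community_1': 0.0},
    2: {'node': 'C', 'community_0': 2.0, 'community_1': 1.0},
}


def node_attribute(rows, column):
    """Build {node: community id} for one community column, cast to int."""
    return {row['node']: int(row[column]) for row in rows.values()}


print(node_attribute(rows, 'community_0'))
print(node_attribute(rows, 'community_1'))
```

Casting to `int` here is a design choice: the community columns come back as floats, and integer ids are easier to use later as list indices.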

In [13]:
nx.set_node_attributes(G_comm, comm_nodes_0, 'community_0')
nx.set_node_attributes(G_comm, comm_nodes_1, 'community_1')

In [14]:
for node in G_comm.nodes:
    G_comm.nodes[node]['highlight_0'] = Spectral8[int(G_comm.nodes[node]['community_0'])]
    G_comm.nodes[node]['highlight_1'] = Spectral8[int(G_comm.nodes[node]['community_1'])]
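
The loop above indexes an eight-color palette directly by community id, which works here because at most three communities were found. With more than eight communities the lookup would raise `IndexError`. A hedged sketch of a wrap-around lookup follows. The `PALETTE` list is a stand-in (an assumption) so that the example runs without `bokeh`, and `community_color` is a hypothetical helper.

```python
# Stand-in palette of eight hex colors (an assumption); in the notebook the
# palette comes from bokeh.palettes.Spectral8 instead.
PALETTE = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728',
           '#9467bd', '#8c564b', '#e377c2', '#7f7f7f']


def community_color(community_id, palette=PALETTE):
    """Map a community id (possibly a float such as 2.0) to a palette color.

    Ids beyond the palette length wrap around, so colors repeat.
    """
    return palette[int(community_id) % len(palette)]


print(community_color(2.0))
print(community_color(9))  # wraps around to the second color
```

Wrapping means that two distinct communities can share a color once there are more communities than palette entries, so a larger palette is preferable for bigger graphs.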

In [15]:
title = 'Community Detection Example 1: Resolution 1'
hover = [('Node', '@index'), ('Community', '@community_0')]
attr_for_highlight = 'highlight_0'
nodeSize = 25

plot = render_plot(G_comm, title, hover, nodeSize, attr_for_highlight)
show(plot)

In [16]:
title = 'Community Detection Example 2: Resolution 0.5'
hover = [('Node', '@index'), ('Community', '@community_1')]
attr_for_highlight = 'highlight_1'
nodeSize = 25

plot = render_plot(G_comm, title, hover, nodeSize, attr_for_highlight)
show(plot)

Now, let's perform community detection on fixed node groups.

The Python call in the next cell is equivalent to the following `PROC NETWORK` step in SAS, apart from the name of the output table:
```
proc network
   nodes             = mycas.NodeSetIn
   links             = mycas.LinkSetIn
   outNodes          = mycas.NodeSetOut;
   community
      resolutionList = 1.0
      fix            = fixGroup;
run;
```

In [17]:
conn.network.community(nodes=dict(name='NodeSetIn'),
                       links=dict(name='LinkSetIn'),
                       outnodes=dict(name='NodeSetOutB'),
                       resolutionList=[1.0],
                       fix='fixGroup')

NOTE: The number of nodes in the input graph is 9.
NOTE: The number of links in the input graph is 11.
NOTE: Processing community detection using 1 threads across 1 machines.
NOTE: At resolution=1, the community algorithm found 3 communities with modularity=0.342975.
NOTE: Processing community detection used 0.00 (cpu: 0.00) seconds.


Unnamed: 0,casLib,Name,Label,Rows,Columns,casTable
0,CASUSER(daherr),NodeSetOutB,,9,2,"CASTable('NodeSetOutB', caslib='CASUSER(daherr)')"

Unnamed: 0,Name1,Label1,cValue1,nValue1
0,numNodes,Number of Nodes,9,9.0
1,numLinks,Number of Links,11,11.0
2,graphDirection,Graph Direction,Undirected,

Unnamed: 0,Name1,Label1,cValue1,nValue1
0,problemType,Problem Type,Community Detection,
1,status,Solution Status,OK,
2,cpuTime,CPU Time,0.00,0.0
3,realTime,Real Time,0.00,0.000122


In [18]:
conn.fetch(table={'name': 'NodeSetOutB'},
           sortby=[{'name': 'community_0', 'order': 'ascending'}])

Unnamed: 0,node,community_0
0,A,0.0
1,F,0.0
2,B,0.0
3,E,0.0
4,G,1.0
5,I,1.0
6,H,1.0
7,C,2.0
8,D,2.0
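
One way to sanity-check a run with `fix` is to confirm that nodes sharing a fix-group value ended up in the same community. The sketch below assumes that this is what the option means (an assumption about its semantics; the SAS documentation is the authority). The `community` mapping is copied from the fetched rows above, while the `fix_group` values are hypothetical because the notebook does not show the contents of `NodeSetIn`.

```python
def fixed_groups_respected(fix_group, community):
    """Return True if all nodes sharing a fix-group value share a community.

    Nodes whose fix-group value is None are treated as not fixed.
    """
    seen = {}  # fix-group value -> community of the first node seen in it
    for node, group in fix_group.items():
        if group is None:
            continue
        if seen.setdefault(group, community[node]) != community[node]:
            return False
    return True


# Community assignments taken from the fetched NodeSetOutB rows above.
community = {'A': 0, 'F': 0, 'B': 0, 'E': 0,
             'G': 1, 'I': 1, 'H': 1,
             'C': 2, 'D': 2}

# Hypothetical fix groups (made up for illustration).
fix_group = {'A': 0, 'B': 0, 'C': 1, 'D': 1, 'E': None}

print(fixed_groups_respected(fix_group, community))
```

A `False` result would indicate either a mismatch between the assumed semantics and the action's behavior, or a node table that was not passed to the call.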


In [19]:
comm_fixed_nodes_cas = conn.CASTable('NodeSetOutB').to_dict(orient='index')

In [20]:
comm_fixed_nodes = {v['node']:v['community_0'] for v in comm_fixed_nodes_cas.values()}

In [21]:
nx.set_node_attributes(G_comm, comm_fixed_nodes, 'community')

In [22]:
for node in G_comm.nodes:
    G_comm.nodes[node]['highlight'] = Spectral8[int(G_comm.nodes[node]['community'])]

In [23]:
title = 'Community Detection with Fixed Nodes'
hover = [('Node', '@index'), ('Community', '@community')]
attr_for_highlight = 'highlight'
nodeSize = 25

plot = render_plot(G_comm, title, hover, nodeSize, attr_for_highlight)
show(plot)