<H1><center>Neural Architecture Search using MCGRAN</center></H1>

# Multi-conditional GRAN

We model conditional graph generation as an affine transformation driven by the given constraints.
We use $ \textrm{MLP}_{scale} $ and $ \textrm{MLP}_{shift} $ to geometrically transform the feature space of the Bernoulli mixture components.
The intent of this transformation is to map the different constraints (classes or categories) of graphs into distinct regions of the real space.

\begin{align}
	\alpha_{1},..., \alpha_{K} = \textrm{Softmax}(\sum_{i \in \boldsymbol{b}_{t},1 \leq j \leq i} \textrm{MLP}_{\alpha} (h_{i}^{R} - h_{j}^{R}) \otimes \textrm{MLP}_{scale}(c_{i}) + \textrm{MLP}_{shift}(c_{i}))
\end{align}

\begin{align}
	\theta_{1,i,j},..., \theta_{K,i,j} = \textrm{Sigmoid}(\textrm{MLP}_{\theta} (h_{i}^{R} - h_{j}^{R})  \otimes \textrm{MLP}_{scale}(c_{i}) + \textrm{MLP}_{shift}(c_{i}))
\end{align}

In these equations, $ M $ denotes the number of constraints enforced on each node $ i $; $ c_{i} \in \mathbb{R}^{M} $ is the constraint vector associated with node $ i $; $ \textrm{MLP}_{scale} \in  \mathbb{R}^{K \times H}$ is a ReLU-based hidden layer that captures features for the scaling factor; and $ \textrm{MLP}_{shift} \in  \mathbb{R}^{K \times H} $ is a ReLU-based hidden layer that captures features for the shift factor. $ K $ denotes the number of mixture components and $ H $ denotes the hidden dimension size.
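
The two equations above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the MCGRAN implementation: $\otimes$ is read as an element-wise product, the shift term is taken inside the sum, every MLP is a randomly initialised one-hidden-layer ReLU network, and the names `make_mlp` and `mixture_params` as well as the sizes of `H`, `M`, and `K` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
H, M, K = 8, 3, 4  # illustrative sizes: hidden dim, constraints per node, mixture components

def make_mlp(in_dim, out_dim, hidden=16):
    """One-hidden-layer ReLU MLP with random weights (stand-in for a trained network)."""
    w1 = rng.normal(scale=0.1, size=(in_dim, hidden))
    w2 = rng.normal(scale=0.1, size=(hidden, out_dim))
    return lambda x: np.maximum(x @ w1, 0.0) @ w2

mlp_alpha, mlp_theta = make_mlp(H, K), make_mlp(H, K)
mlp_scale, mlp_shift = make_mlp(M, K), make_mlp(M, K)

def mixture_params(h, c, block):
    """h: (N, H) node states h^R, c: (N, M) constraint vectors, block: indices in b_t."""
    # Pairs (i, j) with i in the current block and j <= i (0-based indices).
    pairs = [(i, j) for i in block for j in range(i + 1)]
    i_idx = np.array([i for i, _ in pairs])
    j_idx = np.array([j for _, j in pairs])
    diff = h[i_idx] - h[j_idx]        # (P, H) differences h_i - h_j
    scale = mlp_scale(c[i_idx])       # (P, K) constraint-dependent scaling
    shift = mlp_shift(c[i_idx])       # (P, K) constraint-dependent shift
    # Mixture weights: softmax over the summed, affinely transformed features.
    logits = (mlp_alpha(diff) * scale + shift).sum(axis=0)
    logits = logits - logits.max()    # numerical stability
    alpha = np.exp(logits) / np.exp(logits).sum()
    # Per-pair Bernoulli parameters for each mixture component.
    theta = 1.0 / (1.0 + np.exp(-(mlp_theta(diff) * scale + shift)))
    return alpha, theta, pairs

h = rng.normal(size=(5, H))
c = rng.normal(size=(5, M))
alpha, theta, pairs = mixture_params(h, c, block=[3, 4])
```

Here `alpha` has one entry per mixture component and `theta` has one row per candidate edge $(i, j)$, so changing the constraint vectors `c` rescales and shifts both sets of parameters without touching the node states.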

# Graph-based Auto-regressive Affine Transformations

We use the structure of the graph, that is, the adjacency matrix $ A \in \mathbb{R}^{|V| \times |V|} $ of each graph $G = (V,E)$, to create node-level features with 1D convolutions. A block of three stacked 1D convolutional hidden layers captures features for each vertex $ v $, where $ \textrm{CNN}(A): \mathbb{R}^{|V| \times |V|} \rightarrow \mathbb{R}^{|V| \times H}$. Each channel of the convolution captures the information specific to a node.

Then, we apply an auto-regressive affine transformation to create node labels. In graph-based auto-regressive affine transformations, each node label prediction draws on three sources of information. The first source is the node features captured by the CNN. We denote the features of a node $v\in V$ as $q_v\in\mathbb{R}^{H}$, with $H$ being the feature dimension.

The second source is the features of the connected nodes, which act as neighbor constraints when predicting the current node label. For a node $v\in V$ of a graph $G = (V,E)$, the set of neighboring features is given as
\[
Neigh(v) = \{ q_s ~|~ (s,v) \in E \}
\]
We also define the aggregation $NAgg(v): \mathbb{R}^{|Neigh(v)| \times H} \rightarrow \mathbb{R}^{H}$ as the sum of the neighboring features:

\begin{align}
	NAgg(v) = \sum\limits_{q\in\{ q_s ~|~ (s,v) \in E \}} q
\end{align}
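
As a small worked example of the neighbor aggregation, the sketch below sums the features $q_s$ over all edges $(s, v)$ for every node $v$. The function name `neighbor_aggregate`, the toy edge list, and the feature values are hypothetical and chosen only for illustration; sum aggregation is assumed.

```python
import numpy as np

def neighbor_aggregate(edges, Q):
    """Return NAgg for every node: row v is the sum of Q[s] over edges (s, v).

    edges: iterable of (s, v) index pairs; Q: (|V|, H) node feature matrix.
    """
    agg = np.zeros_like(Q)
    for s, v in edges:
        agg[v] += Q[s]
    return agg

# Toy graph with three nodes and H = 2 features per node.
Q = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [3.0, 3.0]])
edges = [(0, 2), (1, 2), (0, 1)]
nagg = neighbor_aggregate(edges, Q)
```

Node 0 has no incoming edges, so its aggregate is the zero vector; node 2 receives the features of nodes 0 and 1. With a dense adjacency matrix `A` where `A[s, v] = 1` for each edge, the same result is `A.T @ Q`.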

For each node $ v $, we apply an affine transformation based on the node's neighbors, with one scaling factor $ \textrm{MLP}_{scale}(NAgg(v)) $ and one shift factor $ \textrm{MLP}_{shift}(NAgg(v)) $.

The third source is the graph-level constraint. We apply an affine transformation based on the graph constraints, with one scaling factor $ \textrm{MLP}_{scale}(c_{v}) $ and one shift factor $ \textrm{MLP}_{shift}(c_{v}) $.

Here $ c_{v} \in \mathbb{R}^{M} $ is the constraint vector associated with node $ v $, $ M $ denotes the number of constraints enforced on each node $ v $, $ \textrm{MLP}_{scale} \in  \mathbb{R}^{H}$ is a ReLU-based hidden layer that captures features for the scaling factor, and $ \textrm{MLP}_{shift} \in  \mathbb{R}^{ H} $ is a ReLU-based hidden layer that captures features for the shift factor. $ H $ denotes the hidden dimension size.

%From a social network perspective, each person (node) uses the information he or she knows, then collects information from his or her friends, and finally collects information other sources (books or internet) to make a decision.
%The same process is applied here.

We combine all three sources of information to predict the label of each node $ v $, as shown in equation~\ref{eq:mcgran_node_label_equation}.

\begin{align}
	v = \textrm{Softmax}(\textrm{CNN}(v) \otimes \textrm{MLP}_{scale}(NAgg(v)) \otimes \textrm{MLP}_{scale}(c_{v})   + \textrm{MLP}_{shift}(NAgg(v)) + \textrm{MLP}_{shift}(c_{v})) \label{eq:mcgran_node_label_equation}
\end{align}
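
The node label equation can be illustrated with the NumPy sketch below. It rests on several assumptions that are not confirmed by the text: $\otimes$ is an element-wise product, the neighbor and constraint branches use separate scale and shift networks (their inputs have different sizes, $H$ and $M$), the softmax is taken directly over the $H$-dimensional result, and random features stand in for the CNN output. All function names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
V, H, M = 4, 5, 3  # illustrative sizes: nodes, hidden dim, constraints per node

def make_relu_layer(in_dim, out_dim):
    """Single ReLU layer with random weights (stand-in for a trained layer)."""
    w = rng.normal(scale=0.1, size=(in_dim, out_dim))
    b = np.zeros(out_dim)
    return lambda x: np.maximum(x @ w + b, 0.0)

# Separate scale/shift layers for the neighbor branch (n) and constraint branch (c).
scale_n, shift_n = make_relu_layer(H, H), make_relu_layer(H, H)
scale_c, shift_c = make_relu_layer(M, H), make_relu_layer(M, H)

def row_softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def node_label_probs(Q, A, C):
    """Q: (V, H) node features, A: (V, V) adjacency with A[s, v] = 1 for edge (s, v),
    C: (V, M) constraint vectors. Returns (V, H) label probabilities."""
    nagg = A.T @ Q  # row v sums q_s over all edges (s, v)
    logits = Q * scale_n(nagg) * scale_c(C) + shift_n(nagg) + shift_c(C)
    return row_softmax(logits)

A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, 0, 0, 0]], dtype=float)
Q = rng.normal(size=(V, H))  # random stand-in for CNN(A)
C = rng.normal(size=(V, M))
probs = node_label_probs(Q, A, C)
labels = probs.argmax(axis=1)
```

Each row of `probs` is a distribution over candidate labels for one node, and `labels` picks the most likely one. Scaling is multiplicative across the two branches while the shifts add, which mirrors the structure of the equation.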

In [1]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


In [9]:
!pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/, https://download.pytorch.org/whl/cu113
Collecting torch==1.11.0+cu113
  Downloading https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp38-cp38-linux_x86_64.whl (1637.0 MB)

In [2]:
!pip install tensorboardX

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting tensorboardX
  Downloading tensorboardX-2.5.1-py2.py3-none-any.whl (125 kB)
Installing collected packages: tensorboardX
Successfully installed tensorboardX-2.5.1


In [3]:
!curl -O https://storage.googleapis.com/nasbench/nasbench_only108.tfrecord

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  498M  100  498M    0     0   152M      0  0:00:03  0:00:03 --:--:--  152M


In [4]:
#!g++ -O2 -std=c++11 -o drive/MyDrive/Research-NAS/MCGRAN/utils/orca/orca drive/MyDrive/Research-NAS/MCGRAN/utils/orca/orca.cpp

In [5]:
#!chmod +x drive/MyDrive/Research-NAS/MCGRAN/utils/orca/orca

## Configuration Settings

All configurable settings for the GRAN model are specified in the YAML files located in the config folder. The configuration files contain five sections of parameters:

1.   General experimental parameters
2.   Dataset parameters
3.   Model parameters
4.   Training parameters
5.   Test parameters

### General experimental parameters

Name of the experiment
> **exp_name**: *MCGRAN*

Name of the experiment directory, which is created if not already present. This folder will contain the training and evaluation metrics.

> **exp_dir**: *exp/MCGRAN/* 

Name of the runner class defined in the gran_runner_*.py file
> **runner**: *GranRunner_Evaluation*

Distributed training of the model across multiple machines. Always set to false; we did not test with true.
> **use_horovod**: *false*

GPU-related setting. Always set to true; we did not test with false.
> **use_gpu**: *true*      

Cuda device id
> **device**: *cuda:0*  

Number of GPUs
> **gpus**: *[0]*

Random seed for reproducing the experiments
> **seed**: *78123456*
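
Collected together, the general experimental parameters listed above might appear in the configuration file roughly as follows. The values are those given in this section; placing the keys at the top level of the YAML file is an assumption.

```yaml
# General experimental parameters (values as listed above; key layout assumed)
exp_name: MCGRAN
exp_dir: exp/MCGRAN/
runner: GranRunner_Evaluation
use_horovod: false   # distributed training, untested with true
use_gpu: true        # untested with false
device: cuda:0
gpus: [0]
seed: 78123456
```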

In [None]:
!python drive/MyDrive/Research-NAS/MCGRAN/run_exp.py -c drive/MyDrive/Research-NAS/MCGRAN/config/mcgran.yaml -t

Loading dataset from file... This may take a few minutes...
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
Loaded dataset in 44 seconds
INFO  | 2022-12-08 12:21:14,435 | run_exp.py                | line 31   : Writing log file to /content/drive/MyDrive/Research-NAS/MCGRAN/exp/MCGRAN/503/log_exp_503.txt
INFO  | 2022-12-08 12:21:14,437 | run_exp.py                | line 32   : Exp instance id = 503
INFO  | 2022-12-08 12:21:14,438 | run_exp.py                | line 33   : Exp comment = None
INFO  | 2022-12-08 12:21:14,438 | run_exp.py                | line 34   : Config =
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
{'dataset': {'data_path': './',
             'dev_ratio': 0.2,
             'has_node_feat': False,
             'is_overwrite_precompute': False,
             'is_sample_subgraph': True,
             'is_save_split': Tr

# Evaluation