In [None]:
!pwd

## Set up the Python environment

1) Create an isolated Python environment named `gnn` with [conda](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands). 

\[Optional\] Create a configuration file for `conda`, `~/.condarc`, 
and specify the directory where environments (and the Python modules installed in them) are stored. 
This directory grows quickly, so I suggest using a project directory.
```yaml
envs_dirs:
  - /global/cfs/cdirs/atlas/xju/conda/envs
report_errors: true
```

1.1) The following commands create an environment named `gnn` and register it as a Jupyter kernel. 
```bash
module load python
conda create -n gnn python=3.8 ipykernel
source $(which conda | sed -e s#bin/conda#bin/activate#)  gnn
python -m ipykernel install --user --name gnn --display-name a-Gnn
```

It will install a kernel file at `~/.local/share/jupyter/kernels/gnn/kernel.json`. 

1.2) Create `~/.local/share/jupyter/kernels/gnn/setup.sh` with the following contents:
```bash
#!/bin/bash
module load python
source $(which conda | sed -e s#bin/conda#bin/activate#)  gnn
python -m ipykernel_launcher "$@"
```
and make it executable with `chmod +x ~/.local/share/jupyter/kernels/gnn/setup.sh`.

Get its absolute path with `readlink -f ~/.local/share/jupyter/kernels/gnn/setup.sh`.

1.3) Update `~/.local/share/jupyter/kernels/gnn/kernel.json` as follows. 
Note that the path to `setup.sh` must be the absolute path.
```json
{
 "argv": [
  "/global/u1/x/xju/.local/share/jupyter/kernels/gnn/setup.sh",
  "-f",
  "{connection_file}"
 ],
 "display_name": "a-Gnn",
 "language": "python"
}
```
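
Steps 1.2 and 1.3 can also be scripted. The following is a minimal sketch, assuming the kernel directory layout shown above; the helper name `write_kernel_spec` is hypothetical and is not part of Jupyter or `root_gnn`. It resolves the absolute path of `setup.sh` and writes the same JSON as above.

```python
import json
from pathlib import Path


def write_kernel_spec(kernel_dir, display_name="a-Gnn"):
    """Write kernel.json in `kernel_dir`, pointing at its setup.sh.

    Hypothetical helper: it only reproduces the JSON shown above.
    """
    kernel_dir = Path(kernel_dir).expanduser()
    # resolve() follows symlinks, similar to `readlink -f`
    setup_script = (kernel_dir / "setup.sh").resolve()
    spec = {
        "argv": [str(setup_script), "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }
    kernel_json = kernel_dir / "kernel.json"
    kernel_json.write_text(json.dumps(spec, indent=1))
    return kernel_json


# Usage (uncomment to overwrite the kernel file created in step 1.1):
# write_kernel_spec("~/.local/share/jupyter/kernels/gnn")
```

Writing the file from Python avoids typing the absolute path by hand, which is the part of step 1.3 that is easiest to get wrong.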

In [None]:
!which python

In [None]:
!which pip

In [None]:
!pip install tensorflow

Install the Python package [root_gnn](https://github.com/xju2/root_gnn/tree/tf2) from its `tf2` branch. 

In [None]:
!pip install -e ..

In [None]:
filename = '/global/homes/a/ading/atlas/data/top-tagger/test.h5'

Setting up graphs for training, validation, and testing

1. Creating training graphs

```bash
create_tfrecord /global/homes/x/xju/atlas/data/top-tagger/train.h5 tfrec/train \
  --evts-per-record 100 --max-evts 1000 \
  --type TopTaggerDataset --num-workers 2
```


2. Creating validation graphs

```bash
create_tfrecord /global/homes/x/xju/atlas/data/top-tagger/val.h5 tfrec/val \
  --evts-per-record 100 --max-evts 1000 \
  --type TopTaggerDataset --num-workers 2
```


3. Creating testing graphs

```bash
create_tfrecord /global/homes/x/xju/atlas/data/top-tagger/test.h5 tfrec/test \
  --evts-per-record 100 --max-evts 1000 \
  --type TopTaggerDataset --num-workers 2
```


### Creating graphs using networkx

[networkx](https://networkx.org/documentation/stable/tutorial.html) is a Python package for the study of graphs.


In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import networkx as nx

from graph_nets import utils_np
from graph_nets import utils_tf
from graph_nets import graphs

In [None]:
g = nx.DiGraph()

# add nodes, each with a one-element feature vector
for idx in range(4):
    g.add_node(idx, features=np.array([1.*idx]))

# add edges, using |i - j| as the edge feature
edge_lists = [(0, 1), (1, 2), (2, 3), (3, 0)]
for i, j in edge_lists:
    g.add_edge(i, j, features=np.array([abs(i-j)]))

In [None]:
plt.figure(figsize=(4, 4))
pos = nx.spring_layout(g)
nx.draw(g, pos, node_size=400, alpha=0.85, node_color="#1f78b4", with_labels=True)

Obtain the adjacency matrix:

In [None]:
# to_numpy_matrix was removed in networkx 3.0; to_numpy_array returns an ndarray directly
adj = nx.to_numpy_array(g)
adj
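
The adjacency matrix and the edge list carry the same information. Here is a minimal sketch, using only NumPy and assuming the directed four-node cycle defined above. Entry `[i, j]` is 1 when there is an edge from node `i` to node `j`, so the non-zero positions recover the sender and receiver indices. The variable names are chosen for this example only.

```python
import numpy as np

# Same directed 4-cycle as above: 0 -> 1 -> 2 -> 3 -> 0
edge_list = [(0, 1), (1, 2), (2, 3), (3, 0)]
n_nodes = 4

# Build the adjacency matrix by hand: row = sender, column = receiver.
adj_manual = np.zeros((n_nodes, n_nodes), dtype=np.float32)
for i, j in edge_list:
    adj_manual[i, j] = 1.0

# Recover the edge list from the matrix.
senders_from_adj, receivers_from_adj = np.nonzero(adj_manual)
print(senders_from_adj)    # [0 1 2 3]
print(receivers_from_adj)  # [1 2 3 0]
```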

In [None]:
g.edges()

In [None]:
g_tuple = utils_np.networkxs_to_graphs_tuple([g])

In [None]:
g_tuple

In [None]:
def print_graphs_tuple(g, data=True):
    for field_name in graphs.ALL_FIELDS:
        per_replica_sample = getattr(g, field_name)
        if per_replica_sample is None:
            print(field_name, "EMPTY")
        else:
            print(field_name, "is with shape", per_replica_sample.shape)
            if data and field_name != "edges":
                print(per_replica_sample)

In [None]:
print_graphs_tuple(g_tuple)

### Create GraphsTuple using data-dict \[recommended\]

In [None]:
n_node = 4
n_node_features = 1
n_edge = 4
n_edge_features = 1
nodes = np.random.rand(n_node, n_node_features).astype(np.float32)
edges = np.random.rand(n_edge, n_edge_features).astype(np.float32)
receivers = np.array([1, 2, 3, 0])
senders = np.array([0, 1, 2, 3])
datadict = {
    "n_node": n_node,
    "n_edge": n_edge,
    "nodes": nodes,
    "edges": edges,
    "senders": senders,
    "receivers": receivers,
    "globals": np.array([0], dtype=np.float32)
}

In [None]:
g_tuple2 = utils_tf.data_dicts_to_graphs_tuple([datadict])

In [None]:
print_graphs_tuple(g_tuple2)
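
Because the data dict is assembled by hand, mismatched shapes or out-of-range indices are easy to introduce. The following is a minimal sketch of a consistency check, assuming the same keys as the dict above; `check_datadict` is a hypothetical helper, not part of `graph_nets`.

```python
import numpy as np


def check_datadict(d):
    """Sanity-check a graph data dict before converting it (hypothetical helper)."""
    assert d["nodes"].shape[0] == d["n_node"], "one row of node features per node"
    assert d["edges"].shape[0] == d["n_edge"], "one row of edge features per edge"
    assert len(d["senders"]) == len(d["receivers"]) == d["n_edge"]
    if d["n_edge"] > 0:
        # Every edge endpoint must be a valid node index.
        assert d["senders"].min() >= 0 and d["senders"].max() < d["n_node"]
        assert d["receivers"].min() >= 0 and d["receivers"].max() < d["n_node"]
    return True


example = {
    "n_node": 4,
    "n_edge": 4,
    "nodes": np.zeros((4, 1), dtype=np.float32),
    "edges": np.zeros((4, 1), dtype=np.float32),
    "senders": np.array([0, 1, 2, 3]),
    "receivers": np.array([1, 2, 3, 0]),
    "globals": np.array([0], dtype=np.float32),
}
check_datadict(example)
```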

### Can you finish implementing the following function?

In [None]:
def fully_connected_edges(n_nodes: int):
    """For a given number of nodes, 
    return the senders and receivers for a fully-connected graph.
    """
    
    receivers = senders = n_edge = None
    
    return {"receivers": receivers, "senders": senders, "n_edge": n_edge}
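
One possible completion is sketched here under a different name so that it does not replace the exercise. It assumes that "fully connected" means every ordered pair of distinct nodes, so each pair is connected in both directions and there are no self-loops. That choice is an assumption: `make_graph` further down uses `itertools.combinations`, which keeps only one edge per unordered pair.

```python
import numpy as np


def fully_connected_edges_sketch(n_nodes: int):
    """Senders and receivers for a fully connected directed graph without self-loops."""
    idx = np.arange(n_nodes)
    # All ordered pairs (i, j); with indexing="ij", senders[i, j] = i and receivers[i, j] = j.
    senders, receivers = np.meshgrid(idx, idx, indexing="ij")
    mask = senders != receivers  # drop the diagonal i == j
    senders, receivers = senders[mask], receivers[mask]
    return {"receivers": receivers, "senders": senders, "n_edge": len(senders)}


out = fully_connected_edges_sketch(3)
print(out["senders"])    # [0 0 1 1 2 2]
print(out["receivers"])  # [1 2 0 2 0 1]
print(out["n_edge"])     # 6
```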

### Convert an event to a fully-connected graph

In [None]:
filename = '/global/homes/a/ading/atlas/data/top-tagger/test.h5'

In [None]:
with pd.HDFStore(filename, mode='r') as store:
    df = store['table']

In [None]:
df.head()

In [None]:
df[df['is_signal_new'] == 1].head()

In [None]:
event = df.iloc[0]
event

In [None]:
import itertools
from typing import Optional

features = ['E', 'PX', 'PY', 'PZ']
scale = 0.001
solution = 'is_signal_new'

def make_graph(event, debug: Optional[bool] = False):
    n_max_nodes = 200
    n_nodes = 0
    nodes = []
    for inode in range(n_max_nodes):
        E_name = 'E_{}'.format(inode)
        if event[E_name] < 0.1:
            continue

        f_keynames = ['{}_{}'.format(x, inode) for x in features]
        n_nodes += 1
        nodes.append(event[f_keynames].values*scale)
    nodes = np.array(nodes, dtype=np.float32)
    # print(n_nodes, "nodes")
    # print("node features:", nodes.shape)

    # edges 1) fully connected, 2) objects nearby in eta/phi are connected
    # TODO: implement 2). <xju>
    all_edges = list(itertools.combinations(range(n_nodes), 2))
    senders = np.array([x[0] for x in all_edges])
    receivers = np.array([x[1] for x in all_edges])
    n_edges = len(all_edges)
    edges = np.expand_dims(np.array([0.0]*n_edges, dtype=np.float32), axis=1)
    # print(n_edges, "edges")
    # print("senders:", senders)
    # print("receivers:", receivers)

    input_datadict = {
        "n_node": n_nodes,
        "n_edge": n_edges,
        "nodes": nodes,
        "edges": edges,
        "senders": senders,
        "receivers": receivers,
        "globals": np.array([n_nodes], dtype=np.float32)
    }
    target_datadict = {
        "n_node": n_nodes,
        "n_edge": n_edges,
        "nodes": nodes,
        "edges": edges,
        "senders": senders,
        "receivers": receivers,
        "globals": np.array([event[solution]], dtype=np.float32)
    }
    input_graph = utils_tf.data_dicts_to_graphs_tuple([input_datadict])
    target_graph = utils_tf.data_dicts_to_graphs_tuple([target_datadict])
    return [(input_graph, target_graph)]
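
The TODO in `make_graph` mentions a second edge scheme that connects objects that are close in eta/phi. The following is a minimal sketch of that idea, not the author's implementation. It assumes node features in the order `(E, PX, PY, PZ)` as in `make_graph`, non-zero transverse momentum for every node, and an illustrative threshold of 0.4; the helper name `delta_r_edges` is hypothetical.

```python
import numpy as np


def delta_r_edges(nodes, max_delta_r=0.4):
    """Connect nodes whose eta/phi separation is below `max_delta_r`.

    `nodes` has columns (E, PX, PY, PZ). The default threshold is an
    illustrative assumption, not a value taken from root_gnn.
    """
    px, py, pz = nodes[:, 1], nodes[:, 2], nodes[:, 3]
    pt = np.hypot(px, py)      # transverse momentum, assumed > 0
    eta = np.arcsinh(pz / pt)  # pseudorapidity
    phi = np.arctan2(py, px)   # azimuthal angle

    deta = eta[:, None] - eta[None, :]
    dphi = phi[:, None] - phi[None, :]
    dphi = (dphi + np.pi) % (2 * np.pi) - np.pi  # wrap into [-pi, pi)
    delta_r = np.hypot(deta, dphi)

    # Keep one edge per unordered pair (i < j), like itertools.combinations.
    upper = np.triu(np.ones_like(delta_r, dtype=bool), k=1)
    senders, receivers = np.nonzero(upper & (delta_r < max_delta_r))
    return senders, receivers


example_nodes = np.array([
    [10.0, 1.0, 0.0, 0.0],   # eta = 0, phi = 0
    [10.0, 1.0, 0.1, 0.0],   # close to the first node
    [10.0, -1.0, 0.0, 0.0],  # opposite side in phi
], dtype=np.float32)
s, r = delta_r_edges(example_nodes)
print(s, r)  # [0] [1]
```

The overall scale factor applied to the features in `make_graph` does not affect eta or phi, since both depend only on ratios of momentum components.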

In [None]:
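# Use a name other than `graphs`, which would shadow the imported
# `graph_nets.graphs` module that `print_graphs_tuple` relies on.
evt_graphs = make_graph(event)

In [None]:
g_evt_input, g_evt_target = evt_graphs[0]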

In [None]:
print_graphs_tuple(g_evt_input, data=False)

In [None]:
17*16//2
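
The cell above is a check on the edge count: a graph built with `itertools.combinations` over `n` nodes has `n(n-1)/2` edges. Here is a short sketch of the same check, assuming the event has 17 nodes, as the numbers in the cell imply.

```python
import itertools

n_nodes_example = 17  # assumed node count, taken from the numbers in the cell above
pairs = list(itertools.combinations(range(n_nodes_example), 2))
n_edges_example = len(pairs)
print(n_edges_example)  # 136
assert n_edges_example == n_nodes_example * (n_nodes_example - 1) // 2
```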

In [None]:
g_evt_target.globals

## Training

The main script, train_classifier, can be invoked with the following bash command with default arguments:

```bash
train_classifier
```

or with the following arguments specifying I/O and hyperparameters:
```bash
train_classifier --input-dir tfrec --output-dir trained \
  --batch-size 25 --num-epochs 10 --num-iters 10 --lr 0.002
```
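
To show how such flags typically end up as configuration keys, here is a minimal `argparse` sketch. It is a hypothetical parser that mirrors only the flags used in the command above and is not the source of `train_classifier`. The default values are placeholders copied from the example command, not the script's real defaults.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used above; not the real script."""
    parser = argparse.ArgumentParser(description="sketch of train_classifier options")
    parser.add_argument("--input-dir", default="tfrec")
    parser.add_argument("--output-dir", default="trained")
    parser.add_argument("--batch-size", type=int, default=25)
    parser.add_argument("--num-epochs", type=int, default=10)
    parser.add_argument("--num-iters", type=int, default=10)
    parser.add_argument("--lr", type=float, default=0.002)
    return parser


# Parse an explicit list so the sketch also runs inside a notebook.
args = build_parser().parse_args(["--batch-size", "50", "--lr", "0.001"])
config_sketch = vars(args)  # dashes become underscores: "batch_size", "num_epochs", ...
print(config_sketch["batch_size"], config_sketch["lr"])  # 50 0.001
```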

You can also specify other models and loss functions defined in ```root_gnn/model.py``` and ```root_gnn/losses.py```.

Let's examine what ```train_classifier``` is doing under the hood:

In [None]:
import tensorflow as tf

import os
import sys
import argparse

import re
import time
import random
import functools
import six

import numpy as np
import sklearn.metrics


from graph_nets import utils_tf
from graph_nets import utils_np
import sonnet as snt

from root_gnn import model as all_models
from root_gnn import losses
from root_gnn.src.datasets import graph
from root_gnn.utils import load_yaml

from root_gnn import trainer 

In [None]:
train_filename = "../tfrec/train/*.tfrec"


model = getattr(all_models, "GlobalClassifierNoEdgeInfo")()
loss_config = "GlobalLoss,1,1".split(',')
loss_fcn = getattr(losses, loss_config[0])(*[float(x) for x in loss_config[1:]])
config = {
    "input_dir": "../tfrec",
    "output_dir": "../trained",
    "batch_size": 50,
    "num_epochs": 5,
    "num_iters": 10,
    "shuffle_size": 1,
    "model": model,
    "loss_name": loss_config[0],
    "loss_fcn": loss_fcn,
    "lr": 0.001,
    "metric_mode": "clf",
    "early_stop": "auc",
    "max_attempts": 1
}
trnr = trainer.TrainerBase(**config)

The ```TrainerBase()``` constructor initializes a base trainer object by unpacking the ```config``` dict.
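
Two Python idioms appear in the cell above: looking up a class by name with `getattr`, and unpacking a dict into keyword arguments with `**`. The following is a minimal sketch of both using stand-in classes. `LossStandIn`, `LossesStandIn`, `build_loss`, and `trainer_standin` are hypothetical, and the argument names `weight_a` and `weight_b` are placeholders; none of them reflects the actual `root_gnn` classes.

```python
class LossStandIn:
    """Stand-in for a loss class; hypothetical, not root_gnn.losses.GlobalLoss."""

    def __init__(self, weight_a, weight_b):
        self.weight_a = weight_a
        self.weight_b = weight_b


class LossesStandIn:
    """Plays the role of the `losses` module in this sketch."""
    GlobalLoss = LossStandIn


def build_loss(spec, namespace):
    """Turn a string such as 'GlobalLoss,1,1' into an instance."""
    name, *raw_args = spec.split(",")
    loss_cls = getattr(namespace, name)             # look the class up by name
    return loss_cls(*[float(x) for x in raw_args])  # remaining items become float arguments


def trainer_standin(batch_size, lr, **other):
    """Shows how `**config` maps dict keys onto keyword arguments."""
    return batch_size, lr, sorted(other)


loss = build_loss("GlobalLoss,1,1", LossesStandIn)
print(loss.weight_a, loss.weight_b)  # 1.0 1.0
print(trainer_standin(**{"batch_size": 50, "lr": 0.001, "num_epochs": 5}))
# (50, 0.001, ['num_epochs'])
```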

Next, the user can call functions for loading training, validation, and testing data. The files read from ```input_dir``` must be in the ```.tfrec``` format produced by ```create_tfrecord```.

In [None]:
#TODO: treat not as config, but as a standalone part

train_data, _ = trnr.load_training_data(filenames="../tfrec/train*.tfrec", shuffle=True)
val_data, _ = trnr.load_validating_data(filenames="../tfrec/val*.tfrec", shuffle=True)
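
Since the loaders take glob patterns, it can help to confirm that a pattern matches any files before loading. Here is a minimal sketch using the standard library; `list_record_files` is a hypothetical helper and not part of `root_gnn`.

```python
import glob


def list_record_files(pattern):
    """Return the sorted files matching `pattern`, failing early if there are none."""
    files = sorted(glob.glob(pattern))
    if not files:
        raise FileNotFoundError(f"no files match {pattern!r}")
    return files


# Usage, with the patterns from the cell above:
# list_record_files("../tfrec/train*.tfrec")
# list_record_files("../tfrec/val*.tfrec")
```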

The ```train``` function of ```TrainerBase``` performs training given the specified configurations and hyperparameters. The function can be called in two main ways:

The first way to call ```train``` is to pass a model, a loss function, or training data explicitly. The training data must have the format of the first element of the tuple returned by ```load_training_data```.

In [None]:
#trnr.train(model, loss_fcn, train_data)

The second way is the default call, which uses the model and loss from the configuration and the training data from the last call to ```load_training_data```.

In [None]:
#trnr.train()

For this next part, we will be using TensorBoard with the ```nersc_tensorboard_helper```. At this point, it is recommended to switch the notebook kernel away from "a-Gnn" to "tensorflow-v2.0.0-cpu" to access the tensorboard helper.

In [None]:
import nersc_tensorboard_helper
%load_ext tensorboard

In [None]:
%tensorboard --logdir /global/homes/a/ading/root_gnn/trained/noedge_fullevts/logs --port 0

In [None]:
nersc_tensorboard_helper.tb_address()

You can now access the link above to view the TensorBoard for your training.