# **Create Graph Datasets and Train the Model**


By running this notebook, one can create the datasets required for training and evaluation of the model, train a new model, or resume the training of an existing model.

More precisely, the behavior of this notebook depends on several Boolean flags defined in ```mobgraphgen/args.py```:

* If ```args.produce_graphs = True```, this code creates the augmented set of
graphs from user check-ins and saves the resulting graphs.

* If ```args.produce_min_dfscodes = True```, this code creates the minimum dfscodes from the graphs.

*  If ```args.produce_min_dfscode_tensors = True```, this code creates the minimum dfscode tensors from the minimum dfscodes.

*  If ```args.load_model = False```, a new model is initialized and trained. If ```args.load_model = True```, training resumes from the existing model whose path is given by ```args.load_model_path```. Note that when ```args.load_model = True```, the three Booleans ```args.produce_graphs```, ```args.produce_min_dfscodes```, and ```args.produce_min_dfscode_tensors``` are automatically set to ```False```.
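The interaction between these flags can be sketched as follows. This is a minimal illustration, not the actual contents of ```mobgraphgen/args.py```: the class ```SketchArgs```, its default values, and the placeholder checkpoint path are assumptions made for the example. Only the flag names and the rule that ```load_model = True``` disables the three ```produce_*``` flags come from the description above.

```python
from dataclasses import dataclass


@dataclass
class SketchArgs:
    """Hypothetical stand-in for the real Args class (defaults are assumed)."""
    produce_graphs: bool = True
    produce_min_dfscodes: bool = True
    produce_min_dfscode_tensors: bool = True
    load_model: bool = False
    load_model_path: str = ""

    def update_args(self):
        # Resuming training reuses existing data, so every dataset-creation
        # step is switched off when a model is loaded.
        if self.load_model:
            self.produce_graphs = False
            self.produce_min_dfscodes = False
            self.produce_min_dfscode_tensors = False
        return self


# Fresh training run: the dataset-creation flags keep their values.
fresh = SketchArgs().update_args()

# Resumed run: the placeholder path below is illustrative only.
resumed = SketchArgs(load_model=True, load_model_path="path/to/checkpoint").update_args()

print(fresh.produce_graphs, resumed.produce_graphs)
```

Returning the instance from ```update_args``` mirrors the ```args=args.update_args()``` call pattern used later in this notebook.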


In [None]:
import torch
torch.cuda.is_available()

In [None]:
from google.colab import drive
drive.mount("/content/drive", force_remount=True)

Run the following cell if the tar files for the minimum dfscode tensors already exist. It decompresses the tar files onto the VM disk. Reading files from the VM disk is much faster than reading subdirectory contents from Google Drive mounted in Colab.

In [None]:
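# If the script was saved with Windows line endings, uncomment the next line to strip the carriage returns first.
#!sed -i 's/\r//' drive/MyDrive/mobgraphgen/untarscript.sh
!bash drive/MyDrive/mobgraphgen/untarscript.sh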
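The contents of ```untarscript.sh``` are not shown in this notebook. As a rough Python equivalent of the idea (extracting archives from a slow mounted directory onto fast local disk), one could write something like the sketch below. The function name ```extract_tars``` and the demo file names are hypothetical; the demo uses a temporary directory so that it runs anywhere.

```python
import tarfile
import tempfile
from pathlib import Path


def extract_tars(src_dir, dest_dir):
    """Extract every *.tar archive found in src_dir into dest_dir.

    Returns the names of the archives that were extracted.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(src_dir).glob("*.tar")):
        with tarfile.open(archive) as tar:
            tar.extractall(dest)
        extracted.append(archive.name)
    return extracted


# Self-contained demo: build a tiny archive, then extract it elsewhere.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "drive"   # stands in for the mounted Drive folder
    dst = Path(tmp) / "local"   # stands in for the VM disk
    src.mkdir()
    (src / "tensor0.dat").write_text("demo")
    with tarfile.open(src / "part0.tar", "w") as tar:
        tar.add(src / "tensor0.dat", arcname="tensor0.dat")

    names = extract_tars(src, dst)
    restored = (dst / "tensor0.dat").read_text()

print(names, restored)
```

Only extract archives you created yourself or otherwise trust, since tar members can carry unexpected paths.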

In [None]:
%cd drive/MyDrive/mobgraphgen
!ls

In [None]:
from args import Args

args=Args()
args=args.update_args()
print("args.dropout = ",args.dropout)
print("args.num_layers = ",args.num_layers)
print("args.lr = ",args.lr)
print("args.batch_size = ",args.batch_size)
print("args.loss_type = ",args.loss_type)
print("args.subdir_size = ",args.subdir_size)
print("args.num_workers = ",args.num_workers)
print("args.load_model = ",args.load_model)
print("args.graphs_created = ",args.graphs_created)
print("args.produce_graphs = ",args.produce_graphs)
print("args.produce_min_dfscodes = ",args.produce_min_dfscodes)
print("args.produce_min_dfscode_tensors = ",args.produce_min_dfscode_tensors)
print("args.milestones = ",args.milestones)
print("args.chunk_size = ",args.chunk_size)
print("args.second_sample_desired_size = ",args.second_sample_desired_size)
print("args.augmented_numbers = ",args.augmented_numbers)
eval_mode=False

In [None]:
from build_real_mob_graph import remove_folder, remove_file
import os
import shutil

folders_to_be_removed=["datasets/grid/graph_samples","datasets/grid/graphs","datasets/grid/min_dfscode_tensors","datasets/grid/min_dfscodes"]
files_to_be_removed=["graphs/reals/real_graphs_list.dat","datasets/grid/min_dfscodes_digest.dat",
                     "datasets/grid/min_dfscodes_to_tensors_digest.dat","datasets/grid/graphs_digest.dat","datasets/grid/map_dict.dat"]

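# Remove previously generated data only when the graphs must be rebuilt outside evaluation mode.
if args.produce_graphs and not eval_mode and not args.graphs_created:
  print("Removing previously generated graph data...")
  for foldername in folders_to_be_removed:
    remove_folder(foldername)
  for filename in files_to_be_removed:
    remove_file(filename)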
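The helpers ```remove_folder``` and ```remove_file``` are imported from ```build_real_mob_graph```, and their implementations are not shown here. A plausible minimal version, assuming they simply delete the target if it exists and do nothing otherwise, might look like the sketch below. The ```_sketch``` names are hypothetical and the demo paths live in a temporary directory.

```python
import os
import shutil
import tempfile


def remove_folder_sketch(path):
    """Delete a directory tree if it exists; otherwise do nothing."""
    if os.path.isdir(path):
        shutil.rmtree(path)


def remove_file_sketch(path):
    """Delete a regular file if it exists; otherwise do nothing."""
    if os.path.isfile(path):
        os.remove(path)


# Self-contained demo on throwaway paths.
with tempfile.TemporaryDirectory() as tmp:
    folder = os.path.join(tmp, "graphs")
    file_path = os.path.join(tmp, "graphs_digest.dat")
    os.makedirs(folder)
    with open(file_path, "w") as fh:
        fh.write("demo")

    remove_folder_sketch(folder)
    remove_file_sketch(file_path)
    # Calling again on a missing path is a no-op, so the cleanup can be rerun safely.
    remove_folder_sketch(folder)
    remove_file_sketch(file_path)

    folder_gone = not os.path.exists(folder)
    file_gone = not os.path.exists(file_path)

print(folder_gone, file_gone)
```

Checking for existence before deleting makes the cleanup cell safe to run on a fresh checkout where none of the listed folders or files exist yet.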


In [None]:
!pip install networkx
!pip install numpy
!pip install scikit-learn
!pip install scipy
!pip install pyemd
!pip install tensorboard
!pip install tqdm
!pip install matplotlib
!pip install haversine
!pip install dgl
!pip install dill


In [None]:
!chmod 755 build.sh
!./build.sh

In [None]:
# Build the real mobility graphs only if they are requested and not already created.
if args.produce_graphs and not args.graphs_created:
  !python build_real_mob_graph.py

In [None]:
!python3 main.py

In [None]:
from google.colab import drive

drive.flush_and_unmount()
