# Random Search of Parameters

This method takes a dictionary mapping each parameter to a list of candidate values, and produces `n` parameter sets through random sampling.

To sample at least one configuration from the top fraction $a$ of the search space with probability $1-a$, you need to run $n > \frac{\log(a)}{\log(1-a)}$ trials [[ref]](https://stats.stackexchange.com/questions/160479/practical-hyperparameter-optimization-random-vs-grid-search).

For $a = 0.1$ (a top-10% configuration with 90% probability) we need to run $n > 21.85$ trials, i.e. at least 22.
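
The bound can be checked numerically. This is a minimal sketch, assuming $a$ is given as a fraction (0.1 for 10%); the helper name `min_trials` is illustrative and not part of the notebook.

```python
import math

def min_trials(a):
    """Return (bound, n): the value log(a)/log(1-a) and the smallest integer n above it."""
    if not 0 < a < 1:
        raise ValueError("a must be a fraction strictly between 0 and 1")
    bound = math.log(a) / math.log(1 - a)
    # smallest integer strictly greater than the bound
    n = math.floor(bound) + 1
    return bound, n

bound, n = min_trials(0.1)
print(f"bound = {bound:.2f}, trials needed = {n}")
```

For `a = 0.1` the bound is roughly 21.85, which rounds up to the 22 trials used later in this notebook.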

In [1]:
import numpy as np
import itertools
import datetime


## 1. Get parameters

In [71]:
# 1) SMG only
params = {
    "lr": [0.001, 0.01, 0.1],
    "neg_adv": [True, False],
    "hidden_dim": [100, 250, 400],
    "regularization_coef": [2e-6, 2e-8],
    "gamma": [1, 5, 10, 25],
    # for RotatE, -de doubles the entity dim as there are Re and Im parts. RotatE is listed twice so that it is sampled as often as the two TransE variants combined
    "model": ["TransE_l1", "TransE_l2", "RotatE -de", "RotatE -de"],
    # Hinge loss is pairwise, and we try 3 different margins. NOTE: loss config not available in latest release
#     "loss_genre": ["'Logsigmoid'", "'Hinge' -pw -m 1.5", "'Hinge' -pw -m 4.5", "'Hinge' -pw -m 7.5"], 
    "num_negs_per_pos": [10, 25, 50],
    "batch_size": [1000, 2000, 4000, 8000],
}

# 2) SMG and V&A. NOTE: this assignment overwrites the params from 1), so only this grid is sampled.
# We exclude TransE this time, as 1) showed that RotatE almost always outperforms TransE
params = {
    "lr": [0.001, 0.01, 0.1],
    "neg_adv": [True, False],
    "hidden_dim": [250, 400, 600, 800],
    "regularization_coef": [2e-6, 2e-8],
    "gamma": [1, 5, 10, 25],
    # for RotatE we add double entity dim as there are Re and Im parts.
    "model": ["RotatE -de"],
    "num_negs_per_pos": [10, 25, 50, 75],
    "batch_size": [4000, 8000, 12000], # 8000 was optimal for SMG only
}
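
To put the number of trials in context, it helps to know how many configurations the full grid contains. The sketch below counts them for the second grid above; `grid_size` and `params_smg_va` are illustrative names, and `math.prod` assumes Python 3.8 or later.

```python
from math import prod

def grid_size(params):
    """Number of distinct configurations in the full Cartesian grid."""
    return prod(len(values) for values in params.values())

# copy of the second ("SMG and V&A") grid from the cell above
params_smg_va = {
    "lr": [0.001, 0.01, 0.1],
    "neg_adv": [True, False],
    "hidden_dim": [250, 400, 600, 800],
    "regularization_coef": [2e-6, 2e-8],
    "gamma": [1, 5, 10, 25],
    "model": ["RotatE -de"],
    "num_negs_per_pos": [10, 25, 50, 75],
    "batch_size": [4000, 8000, 12000],
}

print(grid_size(params_smg_va))
```

This grid has 2304 configurations, so 22 random trials cover just under 1% of it.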

In [60]:
def get_random_samples(params, n, replacement=False, seed=42):
    """Return `n` parameter sets drawn uniformly at random from the full grid of `params`."""
    all_keys = list(params.keys())
    # every combination of values, with each tuple ordered like all_keys
    combinations = list(itertools.product(*params.values()))

    # seeded generator, so the same samples are drawn on every run
    rnd = np.random.RandomState(seed)
    # with replacement=False, n must not exceed len(combinations)
    chosen = rnd.choice(len(combinations), n, replace=replacement)

    return [dict(zip(all_keys, combinations[i])) for i in chosen]
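
`get_random_samples` builds the full list of combinations before sampling, which is fine for grids of a few thousand entries. For much larger grids, one possible alternative is to sample flat indices and decode each one into a per-parameter position. The sketch below is an assumption-laden variant, not the notebook's method: `sample_without_product` is a hypothetical name, and it relies on `np.unravel_index` using the same row-major ordering as `itertools.product`.

```python
import numpy as np

def sample_without_product(params, n, seed=42):
    """Draw n distinct parameter sets without materialising the full grid."""
    keys = list(params)
    sizes = tuple(len(params[k]) for k in keys)
    total = int(np.prod(sizes))

    rnd = np.random.RandomState(seed)
    # n distinct positions in the (virtual) flattened grid
    flat = rnd.choice(total, n, replace=False)
    # one array of per-parameter positions for each key
    positions = np.unravel_index(flat, sizes)

    return [
        {k: params[k][positions[j][i]] for j, k in enumerate(keys)}
        for i in range(n)
    ]

# small illustrative grid (values are placeholders, not tuned settings)
demo = {"lr": [0.001, 0.01, 0.1], "gamma": [1, 5, 10, 25]}
for sample in sample_without_product(demo, 3):
    print(sample)
```

Because the indices are drawn without replacement, the returned parameter sets are distinct as long as the candidate values within each list are distinct.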

In [61]:
n = 22
samples = get_random_samples(params, n)
samples[0:3]

[{'lr': 0.001,
  'neg_adv': False,
  'hidden_dim': 400,
  'regularization_coef': 2e-08,
  'gamma': 10,
  'model': 'RotatE -de',
  'num_negs_per_pos': 25,
  'batch_size': 4000},
 {'lr': 0.001,
  'neg_adv': True,
  'hidden_dim': 400,
  'regularization_coef': 2e-08,
  'gamma': 1,
  'model': 'RotatE -de',
  'num_negs_per_pos': 50,
  'batch_size': 4000},
 {'lr': 0.1,
  'neg_adv': False,
  'hidden_dim': 250,
  'regularization_coef': 2e-08,
  'gamma': 10,
  'model': 'TransE_l2',
  'num_negs_per_pos': 50,
  'batch_size': 4000}]

## 2. Run DGL-KE on each of the parameter sets

In [72]:
# fixed params
DATA_PATH="../data/interim/train_test_split/"
SAVE_PATH="./dglke_results"
LOGS_PATH="./dglke_logs"
DATASET="heritageconnector"
FORMAT="raw_udd_hrt"

LOG_INTERVAL=10000
BATCH_SIZE_EVAL=16
NEG_SAMPLE_SIZE_EVAL=1000
N_EPOCHS=800
N_TRIPLES=2625643 # 09.07; 3% test and 3% val
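
The training loop below converts `N_EPOCHS` into a step count for `--max_step` using `int(N_TRIPLES / batch_size * N_EPOCHS)`. As a rough check on run length, the sketch below evaluates that expression for the batch sizes in the second grid; `epochs_to_steps` is an illustrative helper name.

```python
def epochs_to_steps(n_triples, batch_size, n_epochs):
    """Mirror the notebook's expression int(N_TRIPLES / batch_size * N_EPOCHS)."""
    return int(n_triples / batch_size * n_epochs)

N_EPOCHS = 800
N_TRIPLES = 2625643

for batch_size in (4000, 8000, 12000):
    print(batch_size, epochs_to_steps(N_TRIPLES, batch_size, N_EPOCHS))
```

With a batch size of 4000 this gives 525128 steps, and the count shrinks in proportion as the batch size grows.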


In [75]:
# delete old results and logs folders
! rm -rf {SAVE_PATH}
! rm -rf {LOGS_PATH}

In [74]:
# run experiment

!mkdir -p {LOGS_PATH}

"""
Explanation for (some) parameters:
- max_step: we convert from n_epochs to n_steps by computing n_epochs*(n_triples/batch_size)
- no_eval_filter: this speeds up testing by not filtering true triples out of the sampled negatives, i.e. it assumes that all
    sampled triples are negative. It'll lead to an underestimation in performance but is useful for hyperparameter tuning.
    See this issue: https://github.com/awslabs/dgl-ke/issues/84
- neg_sample_size_eval: this is the number of negative edges used in evaluation. We cap it at NEG_SAMPLE_SIZE_EVAL; without
    setting it we're likely to get a CUDA OutOfMemoryError.
"""

for idx, s in enumerate(samples):
    print(f"---TEST {idx+1}---")
    
    filename = f"{LOGS_PATH}/run_{idx+1}.txt"
    neg_adv_flag = '-adv' if s['neg_adv'] else ''

    !DGLBACKEND=pytorch dglke_train --model_name {s['model']} --data_path {DATA_PATH} --save_path {SAVE_PATH}  --dataset {DATASET}  --format {FORMAT} \
--data_files train.csv val.csv test.csv --delimiter '	' --max_step {int(N_TRIPLES/s['batch_size']*N_EPOCHS)} \
--log_interval {LOG_INTERVAL} --batch_size {s['batch_size']} --batch_size_eval {BATCH_SIZE_EVAL} --neg_sample_size {s['num_negs_per_pos']} \
--lr {s['lr']} {neg_adv_flag} --hidden_dim {s['hidden_dim']} -rc {s['regularization_coef']} -g {s['gamma']} \
--gpu 0 --test --mix_cpu_gpu --async_update --no_eval_filter --neg_sample_size_eval {NEG_SAMPLE_SIZE_EVAL} |& tee {filename}


---TEST 1---
^C
---TEST 2---
^C
---TEST 3---
^C
---TEST 4---
^C
---TEST 5---


UnboundLocalError: local variable 'child' referenced before assignment