[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ant-research/EasyTemporalPointProcess/blob/main/notebooks/easytpp_2_tfb_wb.ipynb)


# Tutorial 2: TensorBoard and Weights & Biases in EasyTPP

EasyTPP has built-in support for both TensorBoard and Weights & Biases (W&B), so you can track and visualize model training: monitor metrics, compare experiments, and debug your models.


## Example of using TensorBoard

In [4]:
# As an illustrative example, we write the YAML content to a file
yaml_content = """
pipeline_config_id: runner_config

data:
    taxi:
        data_format: json
        train_dir: easytpp/taxi  # ./data/taxi/train.json
        valid_dir: easytpp/taxi  # ./data/taxi/dev.json
        test_dir: easytpp/taxi   # ./data/taxi/test.json
        data_specs:
          num_event_types: 10
          pad_token_id: 10
          padding_side: right


NHP_train:
  base_config:
    stage: train
    backend: torch
    dataset_id: taxi
    runner_id: std_tpp
    model_id: NHP # model name
    base_dir: './checkpoints/'
  trainer_config:
    batch_size: 256
    max_epoch: 2
    shuffle: False
    optimizer: adam
    learning_rate: 1.e-3
    valid_freq: 1
    use_tfb: True
    metrics: [ 'acc', 'rmse' ]
    seed: 2019
    gpu: -1
  model_config:
    hidden_size: 32
    loss_integral_num_sample_per_step: 20
    thinning:
      num_seq: 10
      num_sample: 1
      num_exp: 500 # number of i.i.d. Exp(intensity_bound) draws at one time in thinning algorithm
      look_ahead_time: 10
      patience_counter: 5 # the maximum iteration used in adaptive thinning
      over_sample_rate: 5
      num_samples_boundary: 5
      dtime_max: 5
      num_step_gen: 1
"""

# Save the content to a file named config.yaml
with open("config.yaml", "w") as file:
    file.write(yaml_content)

Next, we load the `NHP_train` experiment from the config file, build the runner, and train the model:

In [5]:
from easy_tpp.config_factory import Config
from easy_tpp.runner import Runner

config = Config.build_from_yaml_file('./config.yaml', experiment_id='NHP_train')

model_runner = Runner.build_from_config(config)

model_runner.run()

2025-02-03 10:32:32,085 - config.py[pid:91053;line:34:build_from_yaml_file] - CRITICAL: Load pipeline config class RunnerConfig
2025-02-03 10:32:32,089 - runner_config.py[pid:91053;line:161:update_config] - CRITICAL: train model NHP using CPU with torch backend
2025-02-03 10:32:32,098 - runner_config.py[pid:91053;line:36:__init__] - INFO: Save the config to ./checkpoints/91053_8345177088_250203-103232/NHP_train_output.yaml
2025-02-03 10:32:32,099 - base_runner.py[pid:91053;line:176:save_log] - INFO: Save the log to ./checkpoints/91053_8345177088_250203-103232/log




0.2244252199397379 0.29228809611195583
min_dt: 0.000277777777777
max_dt: 5.721388888888889
2025-02-03 10:32:38,267 - tpp_runner.py[pid:91053;line:60:_init_model] - INFO: Num of model parameters 15252
2025-02-03 10:32:45,909 - base_runner.py[pid:91053;line:98:train] - INFO: Data 'taxi' loaded...
2025-02-03 10:32:45,910 - base_runner.py[pid:91053;line:103:train] - INFO: Start NHP training...
2025-02-03 10:32:46,425 - tpp_runner.py[pid:91053;line:96:_train_model] - INFO: [ Epoch 0 (train) ]: train loglike is -1.7553733776992408, num_events is 50454
2025-02-03 10:32:47,128 - tpp_runner.py[pid:91053;line:107:_train_model] - INFO: [ Epoch 0 (valid) ]:  valid loglike is -1.6691416010202664, num_events is 7204, acc is 0.4414214325374792, rmse is 0.3327808472052436
2025-02-03 10:32:48,150 - tpp_runner.py[pid:91053;line:122:_train_model] - INFO: [ Epoch 0 (test) ]: test loglike is -1.6577474861303745, num_events is 14420, acc is

After training finishes, the TensorBoard event files are written to the run directory under `./checkpoints/`, in the `tfb_train` and `tfb_valid` subdirectories:

In [9]:
!ls -R

checkpoints             easytpp_1_dataset.ipynb
config.yaml             easytpp_2_tfb_wb.ipynb

./checkpoints:
91053_8345177088_250203-103232

./checkpoints/91053_8345177088_250203-103232:
NHP_train_output.yaml models                tfb_valid
log                   tfb_train

./checkpoints/91053_8345177088_250203-103232/models:
saved_model

./checkpoints/91053_8345177088_250203-103232/tfb_train:
events.out.tfevents.1738549958.siqiaodeMacBook-Pro.local.91053.0

./checkpoints/91053_8345177088_250203-103232/tfb_valid:
events.out.tfevents.1738549958.siqiaodeMacBook-Pro.local.91053.1
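
Each run gets its own timestamped directory, so the path to the event files changes from run to run. The helper below is a minimal sketch for locating the newest run and its TensorBoard folders. It assumes the layout shown in the listing above (a run directory containing `tfb_*` subdirectories). `latest_run_dir` and `tfb_dirs` are hypothetical names used for illustration; they are not part of EasyTPP.

```python
from pathlib import Path


def latest_run_dir(base_dir="./checkpoints"):
    """Return the most recently modified run directory under base_dir, or None."""
    base = Path(base_dir)
    if not base.is_dir():
        return None
    runs = [p for p in base.iterdir() if p.is_dir()]
    # default=None covers the case of an empty checkpoints directory
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)


def tfb_dirs(run_dir):
    """Map each tfb_* subdirectory name to its sorted list of event file names."""
    return {
        d.name: sorted(f.name for f in d.glob("events.out.tfevents.*"))
        for d in sorted(Path(run_dir).glob("tfb_*"))
        if d.is_dir()
    }


if __name__ == "__main__":
    run = latest_run_dir()
    if run is not None:
        print(run, tfb_dirs(run))
```

The returned run directory can be passed to `tensorboard --logdir` in place of a hard-coded path.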


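We can then launch TensorBoard on the training log directory to visualize the training process. To view the training and validation curves together, point `--logdir` at the parent run directory instead; TensorBoard scans subdirectories for event files and shows each one as a separate run.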

In [None]:
! tensorboard --logdir "./checkpoints/91053_8345177088_250203-103232/tfb_train/"

TensorFlow installation not found - running with reduced feature set.
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.17.1 at http://localhost:6006/ (Press CTRL+C to quit)
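
The event files can also be read programmatically, which is useful for exporting curves without the web UI. The sketch below uses TensorBoard's `EventAccumulator`. It assumes the `tensorboard` package is installed and that EasyTPP logs its metrics as scalar summaries, which this tutorial does not confirm. `to_pairs` and `read_scalars` are hypothetical helper names.

```python
def to_pairs(events):
    """Convert scalar events (objects with .step and .value) to (step, value) tuples."""
    return [(e.step, e.value) for e in events]


def read_scalars(logdir):
    """Return {tag: [(step, value), ...]} for every scalar tag found in logdir."""
    # Imported lazily so this code loads even where tensorboard is not installed.
    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

    acc = EventAccumulator(str(logdir))
    acc.Reload()  # parse the event files currently on disk
    return {tag: to_pairs(acc.Scalars(tag)) for tag in acc.Tags()["scalars"]}
```

For example, `read_scalars("./checkpoints/91053_8345177088_250203-103232/tfb_valid")` would return whatever scalar tags the validation writer recorded; the tag names depend on how EasyTPP labels its metrics.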
