# EasyTPP - Getting Started Guide

This notebook presents the main features of the **EasyTPP** (Easy Temporal Point Processes) library with practical examples.

## 🎯 Notebook Objectives

- Understand the basic concepts of temporal point processes
- Learn to configure and train models
- Explore the different types of data and available models
- Visualize and analyze results

## 📚 Table of Contents

1. [Environment Setup](#1-configuration)
2. [Basic Concepts](#2-concepts)
3. [Data Loading and Preparation](#3-data-loading-and-preparation)
4. [Model Configuration and Training](#4-model-configuration-and-training)
5. [Evaluation and Metrics](#5-evaluation-and-metrics)
6. [Distribution analysis](#6-advanced-examples)

## 1. Environment Setup {#1-configuration}

Let's start by importing the necessary modules and setting up the environment.

In [None]:
from pathlib import Path

# Add the project root directory to PYTHONPATH
ROOT = Path().absolute().parent

CONFIGS = ROOT / "configs" / "test_runner_config.yaml"

# EasyTPP imports (use the new builders and runner manager)
from easy_tpp.configs.config_builder import RunnerConfigBuilder, DataConfigBuilder
from easy_tpp.runners import RunnerManager
from easy_tpp.configs.config_factory import config_factory, ConfigType

print("✅ EasyTPP imported successfully!")
print(f"📁 Project directory: {ROOT}")

✅ EasyTPP imported successfully!
📁 Project directory: c:\Users\enzo.cAo\Documents\Projects\finance\projet_recherche\New_LTPP


## 2. Basic Concepts {#2-concepts}

### What is a Temporal Point Process?

A **Temporal Point Process** (TPP) is a sequence of events that occur over time. Each event is characterized by:

- **Occurrence time**: When the event happens
- **Event type**: What category of event (optional)

### Application examples:

- 🏥 **Medical**: Patient arrivals at a hospital
- 💰 **Finance**: Stock market transactions
- 🌍 **Geophysics**: Earthquakes
- 📱 **Social Networks**: User posts

### Models available in EasyTPP:

- **NHP** (Neural Hawkes Process): Hawkes processes with neural networks
- **THP** (Transformer Hawkes Process): Based on Transformer architecture
- **RMTPP** (Recurrent Marked Temporal Point Process): Based on RNNs
- **AttNHP** (Attentive Neural Hawkes Process): With attention mechanism

## 3. Data Loading and Preparation {#3-donnees}

EasyTPP supports multiple data formats. Let's see how to load and prepare data.

In [None]:
from easy_tpp.data.preprocess import TPPDataModule

In [None]:
# Data configuration with proper nested structure using builders
builder = DataConfigBuilder()
# If you have a YAML, use builder.load_from_yaml(yaml_path, data_config_path)
# Here we set fields programmatically to keep the example self-contained
builder.set_field("train_dir", "NzoCs/test_dataset")
builder.set_field("valid_dir", "NzoCs/test_dataset")
builder.set_field("test_dir", "NzoCs/test_dataset")
builder.set_field("dataset_id", "test")
builder.set_field("data_format", "json")
# nested specs and loading specs can be plain dicts or config instances
builder.set_field("data_loading_specs", {"batch_size": 32, "num_workers": 1, "shuffle": True})
builder.set_field("data_specs", {"num_event_types": 2, "padding_side": "left", "truncation_side": "left"})

data_config = builder.get_config_dict()

print("📊 Data configuration created via builder:")
print(f"   Dataset: {data_config['dataset_id']}")
print(f"   Format: {data_config['data_format']}")
print(f"   Event types: {data_config['data_specs']['num_event_types']}")
print(f"   Batch size: {data_config['data_loading_specs']['batch_size']}")
print(f"   Number of workers: {data_config['data_loading_specs']['num_workers']}")
print(f"   Padding side: {data_config['data_specs']['padding_side']}")

📊 Data configuration created:
   Dataset: test
   Format: json
   Event types: 2
   Batch size: 32
   Number of workers: 1
   Padding side: left


In [None]:
# Alternative: Create DataConfig using DataConfigBuilder.from_dict or .build()
from easy_tpp.configs import DataConfig

data_config_dict = {
    "train_dir": "NzoCs/test_dataset",
    "valid_dir": "NzoCs/test_dataset",
    "test_dir": "NzoCs/test_dataset",
    "dataset_id": "test",
    "data_format": "json",
    "data_loading_specs": {
        "batch_size": 32,
        "num_workers": 1,
        "shuffle": True
    },
    "data_specs": {
        "num_event_types": 2,
        "padding_side": "left",
        "truncation_side": "left"
    }
}

# Use DataConfig.from_dict for direct class creation, or builders + build() to get Config instances
# Example using builder.build():
builder = DataConfigBuilder()
builder.from_dict(data_config_dict, "dummy")  # "dummy" path not used when passing dict
# build returns a DataConfig instance
try:
    data_config_instance = builder.build()
    print("📊 DataConfig instance created via builder.build()")
    print(f"   Dataset: {data_config_instance.dataset_id}")
except Exception:
    # Fallback to direct from_dict creation
    data_config_instance = DataConfig.from_dict(data_config_dict)
    print("📊 DataConfig instance created via DataConfig.from_dict() (fallback)")

print(f"   Event types: {data_config_instance.data_specs.num_event_types}")
print(f"   Batch size: {data_config_instance.data_loading_specs.batch_size}")

📊 Alternative DataConfig created from dictionary:
   Dataset: test
   Format: json
   Event types: 2
   Batch size: 32


In [8]:
# Create data module
datamodule = TPPDataModule(data_config)
datamodule.setup(stage='fit')  # Setup for training and validation

# Get data loaders
train_loader = datamodule.train_dataloader()
val_loader = datamodule.val_dataloader()

print("✅ Data loaders created successfully!")
print(f"   📈 Train loader: {len(train_loader)} batches")
print(f"   📊 Validation loader: {len(val_loader)} batches")

[38;20m2025-09-18 12:58:08,205 - data_loader.py[pid:16224;line:140:setup] - INFO: Setting up data for stage: fit[0m


  from .autonotebook import tqdm as notebook_tqdm


[38;20m2025-09-18 12:58:17,257 - data_loader.py[pid:16224;line:149:setup] - INFO: Train dataset created with 6 sequences[0m
[38;20m2025-09-18 12:58:20,528 - data_loader.py[pid:16224;line:158:setup] - INFO: Validation dataset created with 2 sequences[0m
✅ Data loaders created successfully!
   📈 Train loader: 1 batches
   📊 Validation loader: 1 batches


### Data Inspection

Let's use the Visualizer to analyze the data distribution.

In [9]:
from easy_tpp.data.preprocess.visualizer import Visualizer

# Create the visualizer
visualizer = Visualizer(
    data_module=datamodule,
    split="train",
    save_dir="./analysis_plots"
)

# Generate visualizations
visualizer.show_all_distributions()
visualizer.delta_times_distribution()
visualizer.event_type_distribution()

print("📈 Analysis plots generated!")
print("   Check the './analysis_plots' folder for saved graphs")

Generating visualization plots...
Inter-event time distribution plot saved to ./analysis_plots\inter_event_time_dist.png
Event type distribution plot saved to ./analysis_plots\event_type_dist.png
Sequence length distribution plot saved to ./analysis_plots\sequence_length_dist.png
All plots generated successfully!
Inter-event time distribution plot saved to ./analysis_plots\inter_event_time_dist.png
Event type distribution plot saved to ./analysis_plots\event_type_dist.png
📈 Analysis plots generated!
   Check the './analysis_plots' folder for saved graphs


## 4. Model Configuration and Training {#4-entrainement}

Now, let's configure and train a Neural Hawkes Process (NHP) model.

In [None]:
# Build runner configuration from YAML using RunnerConfigBuilder
runner_builder = RunnerConfigBuilder()
runner_builder.load_from_yaml(
    yaml_file_path=str(CONFIGS),
    training_config_path="training_configs.quick_test",
    model_config_path="model_configs.NHP",
    data_config_path="data_configs.test",
)

config_dict = runner_builder.get_config_dict()
# Use the global factory to create a RunnerConfig (note: model_id passed as extra arg)
runner_config = config_factory.create_config(ConfigType.RUNNER, config_dict, model_id="NHP")

print("⚙️ Runner configuration built via factory + builder")
print(f"   🧠 Model: NHP")
print(f"   📊 Dataset: test")

⚙️ Model configuration:
   🧠 Model: NHP
   📊 Dataset: test


In [None]:
# Create runner manager and start training
runner = RunnerManager(config=runner_config, output_dir="./training_results")

print("🚀 Starting training...")
print("   This may take a few minutes depending on your configuration.")

# Train the model
runner.run(phase="train")

print("✅ Training completed!")

[31;1m2025-09-18 12:58:49,286 - runner.py[pid:16224;line:39:__init__] - CRITICAL: Runner initialized for model: NHP on dataset: test[0m
🚀 Starting training...
   This may take a few minutes depending on your configuration.
[38;20m2025-09-18 12:58:49,289 - runner.py[pid:16224;line:129:run] - INFO: Runner executing phases: ['train'][0m
[38;20m2025-09-18 12:58:49,290 - runner.py[pid:16224;line:72:train] - INFO: === TRAINING PHASE ===[0m
[38;20m2025-09-18 12:58:49,298 - model_runner.py[pid:16224;line:116:__init__] - INFO: No valid checkpoint found. Starting from scratch.[0m
[38;20m2025-09-18 12:58:49,299 - model_runner.py[pid:16224;line:221:train] - INFO: --- Starting Training for Model : NHP on dataset : test ---[0m


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


[38;20m2025-09-18 12:58:49,343 - data_loader.py[pid:16224;line:140:setup] - INFO: Setting up data for stage: fit[0m
[38;20m2025-09-18 12:58:53,179 - data_loader.py[pid:16224;line:149:setup] - INFO: Train dataset created with 6 sequences[0m
[38;20m2025-09-18 12:58:56,321 - data_loader.py[pid:16224;line:158:setup] - INFO: Validation dataset created with 2 sequences[0m



  | Name            | Type             | Params | Mode 
-------------------------------------------------------------
0 | layer_type_emb  | Embedding        | 192    | train
1 | rnn_cell        | ContTimeLSTMCell | 57.8 K | train
2 | layer_intensity | Sequential       | 132    | train
-------------------------------------------------------------
58.1 K    Trainable params
0         Non-trainable params
58.1 K    Total params
0.232     Total estimated model params size (MB)
7         Modules in train mode
0         Modules in eval mode


Sanity Checking: |          | 0/? [00:00<?, ?it/s]

c:\Users\enzo.cAo\Documents\Projects\finance\projet_recherche\New_LTPP\.venv\Lib\site-packages\pytorch_lightning\trainer\connectors\data_connector.py:433: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=15` in the `DataLoader` to improve performance.


                                                                           

c:\Users\enzo.cAo\Documents\Projects\finance\projet_recherche\New_LTPP\.venv\Lib\site-packages\pytorch_lightning\trainer\connectors\data_connector.py:433: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=15` in the `DataLoader` to improve performance.
c:\Users\enzo.cAo\Documents\Projects\finance\projet_recherche\New_LTPP\.venv\Lib\site-packages\pytorch_lightning\loops\fit_loop.py:310: The number of training batches (1) is smaller than the logging interval Trainer(log_every_n_steps=5). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.


Epoch 4: 100%|██████████| 1/1 [00:00<00:00,  2.61it/s, v_num=0, train_loss=2.310]

`Trainer.fit` stopped: `max_epochs=5` reached.


Epoch 4: 100%|██████████| 1/1 [00:00<00:00,  2.49it/s, v_num=0, train_loss=2.310]
✅ Training completed!


## 5. Evaluation and Metrics {#5-evaluation}

Let's now evaluate the performance of the trained model.

In [12]:
# Evaluation on test dataset
print("🧪 Evaluating model on test dataset...")

test_results = runner.run(phase="test")

print("📊 Evaluation results:")
if hasattr(runner, 'test_metrics'):
    for metric_name, value in runner.test_metrics.items():
        print(f"   {metric_name}: {value:.4f}")
else:
    print("✅ Evaluation completed - check logs for detailed metrics")

🧪 Evaluating model on test dataset...
[38;20m2025-09-18 13:00:38,842 - runner.py[pid:16224;line:129:run] - INFO: Runner executing phases: ['test'][0m
[31;1m2025-09-18 13:00:38,859 - runner.py[pid:16224;line:85:test] - CRITICAL: === TESTING PHASE ===[0m
[38;20m2025-09-18 13:00:38,921 - model_runner.py[pid:16224;line:103:__init__] - INFO: Checkpoint found: loading from ./training_results\last.ckpt[0m
[38;20m2025-09-18 13:00:38,928 - model_runner.py[pid:16224;line:114:__init__] - INFO: Loading model from checkpoint: ./training_results\last.ckpt.[0m
[38;20m2025-09-18 13:00:38,939 - model_runner.py[pid:16224;line:245:test] - INFO: --- Starting Testing for Model : NHP on dataset : test ---[0m


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


[38;20m2025-09-18 13:00:39,393 - data_loader.py[pid:16224;line:140:setup] - INFO: Setting up data for stage: test[0m


Restoring states from the checkpoint path at ./training_results\last.ckpt
Loaded model weights from the checkpoint at ./training_results\last.ckpt
c:\Users\enzo.cAo\Documents\Projects\finance\projet_recherche\New_LTPP\.venv\Lib\site-packages\pytorch_lightning\trainer\connectors\data_connector.py:433: The 'test_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=15` in the `DataLoader` to improve performance.


Testing DataLoader 0: 100%|██████████| 1/1 [00:00<00:00,  7.82it/s]


[38;20m2025-09-18 13:01:26,730 - model_runner.py[pid:16224;line:269:test] - INFO: Test results saved to ./training_results\test_results.json[0m
📊 Evaluation results:
✅ Evaluation completed - check logs for detailed metrics
📊 Evaluation results:
✅ Evaluation completed - check logs for detailed metrics


### Comparison with Baselines

Let's compare our model with simple baselines.

In [None]:
from easy_tpp.evaluation.benchmarks.mean_bench import MeanInterTimeBenchmark
from easy_tpp.evaluation.benchmarks.sample_distrib_mark_bench import MarkDistributionBenchmark

# Baseline benchmark: mean prediction
mean_benchmark = MeanInterTimeBenchmark(
    data_config=data_config,
    experiment_id="mean_baseline",
    save_dir="./benchmark_results"
)

print("📊 Baseline benchmark (mean):")
mean_results = mean_benchmark.evaluate()
print(f"   Results: {mean_results}")

# Type distribution benchmark
mark_benchmark = MarkDistributionBenchmark(
    data_config=data_config,
    dataset_name="mark_baseline",
    save_dir="./benchmark_results"
)

print("\n📊 Type distribution benchmark:")
mark_results = mark_benchmark.evaluate()
print(f"   Results: {mark_results}")

[38;20m2025-09-18 13:01:34,021 - data_loader.py[pid:16224;line:140:setup] - INFO: Setting up data for stage: test[0m
📊 Baseline benchmark (mean):
[38;20m2025-09-18 13:01:37,589 - base_bench.py[pid:16224;line:147:evaluate] - INFO: Starting mean_inter_time benchmark evaluation...[0m
[38;20m2025-09-18 13:01:37,590 - mean_bench.py[pid:16224;line:51:_prepare_benchmark] - INFO: Computing mean inter-time from training data...[0m
[38;20m2025-09-18 13:01:51,480 - mean_bench.py[pid:16224;line:74:_prepare_benchmark] - INFO: Computed mean inter-time: 1.506245[0m
[38;20m2025-09-18 13:02:08,127 - base_bench.py[pid:16224;line:366:_save_results] - INFO: Results saved to: ./benchmark_results\mean_baseline\mean_inter_time_results.json[0m
[38;20m2025-09-18 13:02:08,128 - base_bench.py[pid:16224;line:375:_log_summary] - INFO: mean_inter_time benchmark completed successfully![0m
[38;20m2025-09-18 13:02:08,129 - base_bench.py[pid:16224;line:381:_log_summary] - INFO: Time RMSE: 2.772637[0m
[38

KeyboardInterrupt: 

## 6. Advanced Examples {#6-avances}

### Synthetic Data Generation

EasyTPP allows generating synthetic data to test models.

In [14]:
from easy_tpp.data.generation import HawkesSimulator

# Hawkes process configuration
params = {
    "mu": [0.1, 0.2],                    # Base intensities
    "alpha": [[0.3, 0.1], [0.2, 0.4]],  # Excitation matrix
    "beta": [[2, 1], [1.5, 3]]          # Decay matrix
}

# Create simulator
simulator = HawkesSimulator(
    mu=params["mu"],
    alpha=params["alpha"],
    beta=params["beta"],
    dim_process=2,
    start_time=0,
    end_time=100
)

print("🎲 Generating synthetic data...")

# Generate and save
simulator.generate_and_save(
    output_dir='./synthetic_data',
    num_simulations=10,
    splits={'train': 0.6, 'test': 0.2, 'dev': 0.2}
)

print("✅ Synthetic data generated in './synthetic_data'")

🎲 Generating synthetic data...
Génération de 10 simulations 2D...


Simulation de 10 processus:   0%|          | 0/10 [00:00<?, ?it/s]

Simulation de 10 processus: 100%|██████████| 10/10 [00:00<00:00, 73.65it/s]

Division des données en ensembles train/test/dev...
Sauvegarde des données...
Toutes les données ont été sauvegardées dans ./synthetic_data
✅ Synthetic data generated in './synthetic_data'





### Multiple Model Comparison

Let's compare the performance of different models on the same dataset.

In [None]:
# Multiple model comparison using RunnerConfigBuilder and RunnerManager
models_to_compare = ['NHP', 'THP', 'RMTPP']
results_comparison = {}

for model_name in models_to_compare:
    print(f"\n🧠 Training model {model_name}...")
    try:
        # Use builder to load and build runner config for each model
        rb = RunnerConfigBuilder()
        rb.load_from_yaml(
            yaml_file_path=str(CONFIGS),
            training_config_path="training_configs.quick_test",
            model_config_path=f"model_configs.{model_name}",
            data_config_path="data_configs.test",
        )
        config_dict = rb.get_config_dict()
        config = config_factory.create_config(ConfigType.RUNNER, config_dict, model_id=model_name)

        runner = RunnerManager(config=config, output_dir=f"./comparison_results/{model_name}")

        # Quick training (fewer epochs for demo)
        runner.run(phase="train")
        test_results = runner.run(phase="test")

        results_comparison[model_name] = "✅ Success"
        print(f"   ✅ {model_name} trained successfully")

    except Exception as e:
        results_comparison[model_name] = f"❌ Error: {str(e)[:50]}..."
        print(f"   ❌ Error with {model_name}: {str(e)[:50]}...")

print("\n📊 Comparison summary:")
for model, result in results_comparison.items():
    print(f"   {model}: {result}")


🧠 Training model NHP...
[31;1m2025-09-18 03:19:45,632 - runner.py[pid:7980;line:39:__init__] - CRITICAL: Runner initialized for model: NHP on dataset: test[0m
[38;20m2025-09-18 03:19:45,633 - runner.py[pid:7980;line:129:run] - INFO: Runner executing phases: ['train'][0m
[38;20m2025-09-18 03:19:45,634 - runner.py[pid:7980;line:72:train] - INFO: === TRAINING PHASE ===[0m
[38;20m2025-09-18 03:19:45,637 - model_runner.py[pid:7980;line:116:__init__] - INFO: No valid checkpoint found. Starting from scratch.[0m
[38;20m2025-09-18 03:19:45,638 - model_runner.py[pid:7980;line:221:train] - INFO: --- Starting Training for Model : NHP on dataset : test ---[0m


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


[38;20m2025-09-18 03:19:45,665 - data_loader.py[pid:7980;line:140:setup] - INFO: Setting up data for stage: fit[0m
[38;20m2025-09-18 03:19:49,379 - data_loader.py[pid:7980;line:149:setup] - INFO: Train dataset created with 6 sequences[0m
[38;20m2025-09-18 03:19:54,366 - data_loader.py[pid:7980;line:158:setup] - INFO: Validation dataset created with 2 sequences[0m



  | Name            | Type             | Params | Mode 
-------------------------------------------------------------
0 | layer_type_emb  | Embedding        | 192    | train
1 | rnn_cell        | ContTimeLSTMCell | 57.8 K | train
2 | layer_intensity | Sequential       | 132    | train
-------------------------------------------------------------
58.1 K    Trainable params
0         Non-trainable params
58.1 K    Total params
0.232     Total estimated model params size (MB)
7         Modules in train mode
0         Modules in eval mode


Sanity Checking: |          | 0/? [00:00<?, ?it/s]


Detected KeyboardInterrupt, attempting graceful shutdown ...


SystemExit: 1

  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)


## 6. Prediction Phase and Distribution Analysis

**Why the prediction phase is crucial:**

Temporal Point Process (TPP) models don't just serve to calculate performance metrics - their true value lies in their ability to **predict and simulate** new events. These predictions enable:

1. **Distribution comparisons** - Analyze whether the model captures temporal patterns well
2. **Realistic benchmarks** - Compare model simulations to real data  
3. **Qualitative validation** - Visualize differences between predictions and reality
4. **Practical applications** - Generate future scenarios for decision-making

### 6.1 Complete Pipeline with Predictions

In [None]:
# Complete example: train → test → predict using RunnerConfigBuilder and RunnerManager
print("🔄 Complete pipeline with predictions...")

# Build runner config
rb = RunnerConfigBuilder()
rb.load_from_yaml(
    yaml_file_path=str(CONFIGS),
    training_config_path="training_configs.quick_test",
    model_config_path="model_configs.NHP",
    data_config_path="data_configs.test",
)
config_dict = rb.get_config_dict()
config = config_factory.create_config(ConfigType.RUNNER, config_dict, model_id="NHP")

# Runner manager
runner = RunnerManager(config=config, output_dir="./prediction_analysis")

print("🔮 Generating predictions and distribution comparisons...")
runner.run(phase="predict")

print("✅ Complete pipeline finished!")
print("📊 Results available in:")
print("   - Performance metrics")
print("   - Model simulations") 
print("   - Distribution comparisons")
print("   - Analysis graphs")

🔄 Complete pipeline with predictions...
[31;1m2025-09-18 13:07:41,698 - runner.py[pid:16224;line:39:__init__] - CRITICAL: Runner initialized for model: NHP on dataset: test[0m
🔮 Generating predictions and distribution comparisons...
[38;20m2025-09-18 13:07:41,699 - runner.py[pid:16224;line:129:run] - INFO: Runner executing phases: ['predict'][0m
[31;1m2025-09-18 13:07:41,699 - runner.py[pid:16224;line:98:predict] - CRITICAL: === PREDICTION PHASE ===[0m
[38;20m2025-09-18 13:07:41,703 - model_runner.py[pid:16224;line:116:__init__] - INFO: No valid checkpoint found. Starting from scratch.[0m
[38;20m2025-09-18 13:07:41,704 - model_runner.py[pid:16224;line:275:predict] - INFO: --- Starting Prediction for Model : NHP on dataset : test ---[0m


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


[38;20m2025-09-18 13:07:41,734 - data_loader.py[pid:16224;line:140:setup] - INFO: Setting up data for stage: predict[0m


c:\Users\enzo.cAo\Documents\Projects\finance\projet_recherche\New_LTPP\.venv\Lib\site-packages\pytorch_lightning\trainer\connectors\data_connector.py:433: The 'predict_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=15` in the `DataLoader` to improve performance.


Predicting DataLoader 0:   0%|          | 0/1 [00:00<?, ?it/s]


[A
[A
[A
[A

Predicting DataLoader 0: 100%|██████████| 1/1 [00:04<00:00,  0.21it/s]



Formatting sequences: 100%|██████████| 2/2 [00:00<00:00, 2975.74it/s]

Data successfully saved to .\distributions_comparisons\simulations.json
[38;20m2025-09-18 13:07:59,801 - comparator.py[pid:16224;line:141:create_comparator] - INFO: Using TPPDatasetExtractor for optimized data extraction[0m
[38;20m2025-09-18 13:07:59,803 - comparator.py[pid:16224;line:68:run_comprehensive_evaluation] - INFO: Starting comprehensive temporal point process evaluation...[0m
[38;20m2025-09-18 13:07:59,804 - data_extractors.py[pid:16224;line:69:_extract_all_data] - INFO: Extracting ground truth data from TPPDataset with 2 sequences...[0m
[38;20m2025-09-18 13:07:59,810 - data_extractors.py[pid:16224;line:118:_extract_all_data] - INFO: Successfully processed 2/2 sequences, extracted 56 events[0m
[38;20m2025-09-18 13:07:59,812 - data_extractors.py[pid:16224;line:277:_extract_all_data] - INFO: Processing simulation data...[0m





[38;20m2025-09-18 13:08:00,663 - distribution_analyzer.py[pid:16224;line:119:plot_density_comparison] - INFO: Density comparison plot successfully saved to .\distributions_comparisons\comparison_inter_event_time_dist.png[0m
[38;20m2025-09-18 13:08:01,206 - plot_generators.py[pid:16224;line:165:generate_plot] - INFO: Event type distribution comparison plot saved to .\distributions_comparisons\comparison_event_type_dist.png[0m
[38;20m2025-09-18 13:08:01,750 - plot_generators.py[pid:16224;line:257:generate_plot] - INFO: Sequence length distribution comparison plot saved to .\distributions_comparisons\comparison_sequence_length_dist.png[0m
[38;20m2025-09-18 13:08:02,474 - plot_generators.py[pid:16224;line:342:generate_plot] - INFO: Cross-correlation comparison plot saved to .\distributions_comparisons\comparison_cross_correlation_moments.png[0m
[38;20m2025-09-18 13:08:02,475 - comparator.py[pid:16224;line:96:run_comprehensive_evaluation] - INFO: Comprehensive evaluation completed 




[38;20m2025-09-18 13:08:08,150 - distribution_analyzer.py[pid:16224;line:119:plot_density_comparison] - INFO: Density comparison plot successfully saved to .\distributions_comparisons\comparison_inter_event_time_dist.png[0m
[38;20m2025-09-18 13:08:08,402 - plot_generators.py[pid:16224;line:165:generate_plot] - INFO: Event type distribution comparison plot saved to .\distributions_comparisons\comparison_event_type_dist.png[0m
[38;20m2025-09-18 13:08:08,671 - plot_generators.py[pid:16224;line:257:generate_plot] - INFO: Sequence length distribution comparison plot saved to .\distributions_comparisons\comparison_sequence_length_dist.png[0m
[38;20m2025-09-18 13:08:09,348 - plot_generators.py[pid:16224;line:342:generate_plot] - INFO: Cross-correlation comparison plot saved to .\distributions_comparisons\comparison_cross_correlation_moments.png[0m
[38;20m2025-09-18 13:08:09,349 - comparator.py[pid:16224;line:96:run_comprehensive_evaluation] - INFO: Comprehensive evaluation completed

RuntimeError: Inference tensors cannot be saved for backward. To work around you can make a clone to get a normal tensor and use it in autograd.

### 6.2 Simplified Alternative: Single Command

If you want the complete pipeline all at once:

In [None]:
# Ultra-simple version: everything in one command
# Use RunnerManager (runner variable) already created above
runner.run(phase="all")

print("🎉 Complete pipeline executed with phase='all'!")
print("💡 This command is equivalent to the 3 separate phases above")

**🎯 Main objective:** Verify that the model has learned the correct temporal distributions.

**📊 What the `predict` phase generates:**
- **Event simulations** based on the trained model
- **Visual comparisons** between real and simulated data
- **Statistical analyses** of temporal distributions
- **Prediction quality metrics**

**⚠️ Crucial point:** Without the prediction phase, you only have numerical metrics. With predictions, you can **see** if your model truly understands the temporal dynamics of your data.

In [21]:
import shutil
import os

folders_to_remove = [
    "analysis_plots",
    "training_results",
    "benchmark_results",
    "synthetic_data",
    "comparison_results",
    "prediction_analysis",
    "complete_pipeline",
    "lightning_logs",
    "checkpoints",
    "distributions_comparisons"
]

for folder in folders_to_remove:
    if os.path.exists(folder):
        shutil.rmtree(folder)
        print(f"Deleted: {folder}")
    else:
        print(f"Not found: {folder}")

Not found: analysis_plots
Not found: training_results
Not found: benchmark_results
Not found: synthetic_data
Not found: comparison_results
Not found: prediction_analysis
Not found: complete_pipeline
Deleted: lightning_logs
Deleted: checkpoints
Deleted: distributions_comparisons
