# FIBAD Visualization

In this demonstration we train a model on an example HSC dataset, run inference, reduce the resulting latent space with UMAP, and then visualize the results.

In [1]:
import pooch
import fibad

# Train the model

First we download the sample dataset, point FIBAD at the data directory, select the dataset class, and run training for 10 epochs.

In [2]:
file_path = pooch.retrieve(
    # DOI for Example HSC dataset
    url="doi:10.5281/zenodo.14498536/hsc_demo_data.zip",
    known_hash="md5:1be05a6b49505054de441a7262a09671",
    fname="example_hsc_new.zip",
    path="../../data",
    processor=pooch.Unzip(extract_dir="."),
)

# Point FIBAD at the downloaded data and select the HSC dataset class
f = fibad.Fibad()
f.config["general"]["data_dir"] = "../../data/hsc_8asec_1000"
f.config["data_set"]["name"] = "HSCDataSet"
# Train for 10 epochs
f.config["train"]["epochs"] = 10
f.train()

[2025-02-24 16:32:57,701 fibad:INFO] Runtime Config read from: /Users/mtauraso/src/fibad/src/fibad/fibad_default_config.toml
[2025-02-24 16:33:00,011 fibad.data_sets.hsc_data_set:INFO] Processed 993 objects for pruning
[2025-02-24 16:33:00,012 fibad.data_sets.hsc_data_set:INFO] Checking file dimensions to determine standard cutout size...
[2025-02-24 16:33:00,014 fibad.data_sets.hsc_data_set:INFO] HSC Data set loader has 993 objects
[2025-02-24 16:33:00,019 fibad.models.model_registry:INFO] Using criterion: torch.nn.CrossEntropyLoss with default arguments.
2025-02-24 16:33:00,027 ignite.distributed.auto.auto_dataloader INFO: Use data loader kwargs for dataset '<fibad.data_sets.hsc': 
	{'sampler': <torch.utils.data.sampler.SubsetRandomSampler object at 0x14b5cd7e0>, 'batch_size': 512, 'num_workers': 0, 'pin_memory': False}
2025-02-24 16:33:00,028 ignite.distributed.auto.auto_dataloader INFO: Use data loader kwargs for dataset

 50%|#####     | 1/2 [00:00<?, ?it/s]

[2025-02-24 16:33:06,484 fibad.pytorch_ignite:INFO] Total training time: 6.36[s]
[2025-02-24 16:33:06,485 fibad.pytorch_ignite:INFO] Latest checkpoint saved as: /Users/mtauraso/src/fibad/docs/pre_executed/results/20250224-163259-train-qGk8/checkpoint_epoch_10.pt
[2025-02-24 16:33:06,485 fibad.pytorch_ignite:INFO] Best metric checkpoint saved as: /Users/mtauraso/src/fibad/docs/pre_executed/results/20250224-163259-train-qGk8/checkpoint_9_loss=-494.0631.pt
2025/02/24 16:33:06 INFO mlflow.system_metrics.system_metrics_monitor: Stopping system metrics monitoring...
2025/02/24 16:33:06 INFO mlflow.system_metrics.system_metrics_monitor: Successfully terminated system metrics monitoring!
[2025-02-24 16:33:06,504 fibad.train:INFO] Finished Training
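
The log lines above show the training run writing its checkpoints to a timestamped directory such as `results/20250224-163259-train-qGk8`. As a minimal sketch, assuming that `<YYYYMMDD-HHMMSS>-<verb>-<suffix>` naming holds in general (this is inferred only from the paths printed in this notebook, not from FIBAD documentation), the hypothetical helper `latest_results_dir` below picks the newest directory for a given verb using only the standard library.

```python
from pathlib import Path
from typing import Optional


def latest_results_dir(results_root, verb: str) -> Optional[Path]:
    """Return the newest '<timestamp>-<verb>-<suffix>' directory, or None.

    Hypothetical helper, not part of FIBAD. It assumes directory names start
    with a sortable YYYYMMDD-HHMMSS timestamp, as in the log output above.
    """
    root = Path(results_root)
    if not root.is_dir():
        return None

    def matches(path: Path) -> bool:
        # Expected name parts: date, time, verb, random suffix.
        parts = path.name.split("-")
        return path.is_dir() and len(parts) >= 4 and parts[2] == verb

    candidates = [p for p in root.iterdir() if matches(p)]
    # Lexicographic order matches chronological order for this timestamp format.
    return max(candidates, key=lambda p: p.name, default=None)
```

Calling `latest_results_dir("./results", "train")` from the notebook's working directory would then return the most recent training directory, or `None` if no directory matches; the `./results` location is itself an assumption based on the paths in the log.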


# Inference

We then run inference and use UMAP to reduce the resulting latent space to a lower-dimensional representation.

In [3]:
f.infer()

[2025-02-24 16:33:20,073 fibad.data_sets.hsc_data_set:INFO] Processed 993 objects for pruning
[2025-02-24 16:33:20,073 fibad.data_sets.hsc_data_set:INFO] Checking file dimensions to determine standard cutout size...
[2025-02-24 16:33:20,075 fibad.data_sets.hsc_data_set:INFO] HSC Data set loader has 993 objects
[2025-02-24 16:33:20,079 fibad.models.model_registry:INFO] Using criterion: torch.nn.CrossEntropyLoss with default arguments.
[2025-02-24 16:33:20,080 fibad.infer:INFO] data set has length 993
2025-02-24 16:33:20,080 ignite.distributed.auto.auto_dataloader INFO: Use data loader kwargs for dataset '<fibad.data_sets.hsc': 
	{'sampler': None, 'batch_size': 512, 'num_workers': 0, 'pin_memory': False}
[2025-02-24 16:33:20,212 fibad.pytorch_ignite:INFO] Evaluating model on device: mps
[2025-02-24 16:33:20,213 fibad.pytorch_ignite:INFO] Total epochs: 1
[2025-02-24 16:33:22,906 fibad.pytorch_ignite:INFO] Total evaluation time: 2.69[s]
[2025-02-24 16:33:22,907 fibad.infer:INFO] Inference 

In [4]:
f.umap()

[2025-02-24 16:33:27,427 fibad.verbs.umap:INFO] Saving UMAP results to /Users/mtauraso/src/fibad/docs/pre_executed/results/20250224-163327-umap-dNRl
[2025-02-24 16:33:27,428 fibad.data_sets.inference_dataset:INFO] Using most recent results dir /Users/mtauraso/src/fibad/docs/pre_executed/results/20250224-163320-infer-ET1e for lookup. Use the [results] inference_dir config to set a directory or pass it to this verb.
OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.


Creating Lower Dimensional Representation using UMAP:   0%|          | 0/2 [00:00<?, ?it/s]
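
The log message above names a `[results]` `inference_dir` setting for choosing a results directory explicitly instead of relying on the most recent one. Below is a minimal sketch of setting that key, assuming the config behaves like the nested mapping used earlier (`f.config["general"]["data_dir"]`). The helper name `pin_inference_dir` is hypothetical, the directory path is illustrative, and a plain dict stands in for `f.config` so the snippet runs without FIBAD installed.

```python
def pin_inference_dir(config, results_dir):
    """Set config["results"]["inference_dir"] to an explicit directory.

    Hypothetical helper, not part of FIBAD. The key names come from the log
    message above; `config` is assumed to be a nested, dict-like mapping.
    """
    if "results" not in config:
        config["results"] = {}
    config["results"]["inference_dir"] = str(results_dir)
    return config


# A plain dict stands in for f.config here (assumption for illustration).
config = {"general": {"data_dir": "../../data/hsc_8asec_1000"}}
pin_inference_dir(config, "results/20250224-163320-infer-ET1e")
```

With FIBAD itself, the equivalent would presumably be assigning `f.config["results"]["inference_dir"]` before calling the verb; that is inferred from the log message and has not been checked against the library.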



# Visualize

Run the visualize command to see the UMAP projection of the latent space. Once the visualization has rendered, use the lasso, box select, and tap tools in the Bokeh interface below to select points; the selection populates the table view.

In [5]:
f.visualize(width=400, height=400)

[2025-02-24 16:33:40,789 fibad.data_sets.inference_dataset:INFO] Using most recent results dir /Users/mtauraso/src/fibad/docs/pre_executed/results/20250224-163327-umap-dNRl for lookup. Use the [results] inference_dir config to set a directory or pass it to this verb.


BokehModel(combine_events=True, render_bundle={'docs_json': {'811f1826-94a9-41dc-855e-ef06098672f0': {'version…