Dev peter by renierts · Pull Request #75 · PlasmaControl/FusionAIHub

renierts · 2026-05-07T15:46:21Z

No description provided.

Changed default hyperparameters in the models. Added demo for profile reconstruction. Added script for dataset standardization (has to be run once before model training to store normalization coefficients).

…, the wrong configuration was used to find the correct signal name. Also, removed warning for duplicated tensor conversion.

…for debugging purposes.

…the dataset class.

…s and opening an H5 file prior to distributing the dataset across all workers. Significant updates in the Fast time series baseline and actuator reconstruction classes.
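A common way to avoid this kind of multi-worker failure is to open the file lazily inside each worker instead of in the constructor. The following is only a minimal sketch of that pattern, not the repository's dataset class: `LazyFileDataset` is a hypothetical name, and the built-in `open` stands in for `h5py.File` so the snippet runs without extra dependencies.

```python
import os
import pickle
import tempfile


class LazyFileDataset:
    """Hypothetical dataset that opens its file on first access, not in __init__."""

    def __init__(self, path):
        self.path = path
        self._handle = None  # opened lazily, once per process

    def _file(self):
        # Assumption: with HDF5 this would be h5py.File(self.path, "r").
        if self._handle is None:
            self._handle = open(self.path, "rb")
        return self._handle

    def __getstate__(self):
        # Drop the open handle when the dataset is pickled for a worker process.
        state = self.__dict__.copy()
        state["_handle"] = None
        return state

    def read(self, offset, length):
        f = self._file()
        f.seek(offset)
        return f.read(length)

    def close(self):
        if self._handle is not None:
            self._handle.close()
            self._handle = None


# Small demo file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")

dataset = LazyFileDataset(tmp.name)
first = dataset.read(2, 3)                     # opens the file in this process
clone = pickle.loads(pickle.dumps(dataset))    # roughly what a worker receives
clone_has_handle = clone._handle is not None   # the copy starts without a handle
second = clone.read(5, 2)                      # the copy opens its own handle
dataset.close()
clone.close()
os.remove(tmp.name)
```

The point of the design is that no open file handle is ever shared between processes; each copy of the dataset opens its own handle the first time it reads.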

The basic encoders are now all working. Examples are in scripts.

- Model
- Optimizer state
- Scheduler state
- Current loss
- Current epoch

For the sake of continual training.

…got to remove unused modalities. This follows the standard getitem function now.

…ted!!!

Quick fix for the data standardization. Invalid values have to be ignored. Fix in the function to create H5 files. bolo data does not have to be flipped anymore as the data is now stored in the correct format.

* Nathan fm (#53)
* chore: Update `pyproject.toml` to reorder authors, enhance README with environment setup instructions, and add validation notes in `validation.txt`. Refactor `dummy_model_2.py` for improved modality configuration and introduce `TextEncoder` enhancements in `text_baseline.py`.
* Refactor demo scripts to utilize new `Prediction4FusionModel` and `DictMSELoss`. Update `run_demo_2.py` and `run_demo_3.py` for improved model initialization and data handling. Enhance `TokamakH5Dataset` to handle degenerate signals and improve data extraction logic. Remove unused `latent_space.py` and integrate new modality fusion models in `modality_fusion.py`.
* Remove unused shot list configuration files and refactor trainer class to introduce MultimodalTrainer and UnimodalTrainer for improved training structure.
* Refactor modality models and trainer classes for improved structure and functionality. Removed unused TimeSeriesEncoder and Decoder, introduced FastTimeSeriesEncoder and SpectrogramAutoEncoder. Updated UnimodalTrainer to support logging and checkpoint management. Enhanced TokamakH5Dataset for better data handling and added checkpoint loading functionality in spectrogram reconstruction script.
* Add padding collate function and update training script for unimodal autoencoder
  - Introduced `collate_fn_pad` to handle variable-length tensors in batches.
  - Updated `train_unimodal_autoencoder.py` to use the new collate function.
  - Modified `train_unimodal.sh` to include additional signal modalities for training.
  - Added new autoencoder classes for fast time series and spatial profile modalities, ensuring output shape consistency with adaptive pooling.
  - Enhanced video autoencoder implementation for better reconstruction quality.
* Remove spectrogram reconstruction script and refactor modality models
  - Deleted `spectrogram_reconstruction.py` as part of the restructuring.
  - Refactored modality models to introduce baseline versions for actuator, slow time series, fast time series, spatial profile, spectrogram, and video.
  - Updated model registry and signal-to-model mappings to reflect new baseline architecture.
  - Enhanced `TokamakH5Dataset` to support additional parameters for FFT and hop length.
  - Improved training script for unimodal autoencoders to utilize new baseline models and added support for variable-length tensors.
* Update .gitignore to include pixi environments and add link to HSI-compression-benchmark in SpectrogramBaselineAutoEncoder docstring
* Remove unused shot list files and delete deprecated scripts for training and data handling
* Remove deprecated training scripts for CO2, ECE, MHR, and unimodal training
* Dev peter (#48)
* Removed the argument "batch_size" from the trainers. Changed default hyperparameters in the models. Added demo for profile reconstruction. Added script for dataset standardization (has to be run once before model training to store normalization coefficients).
* Bugfix in the dataset class. When iterating over movie configurations, the wrong configuration was used to find the correct signal name. Also, removed warning for duplicated tensor conversion.
* Added base script for video reconstruction. Copied from Aza's branch for debugging purposes.
* Minor changes in the example scripts. More preprocessing options for the dataset class.
* Fixed a bug where the dataset class failed when using multiple workers and opening an H5 file prior to distributing the dataset across all workers. Significant updates in the Fast time series baseline and actuator reconstruction classes.
* Lots of bugfixes in the dataset, trainer, and models. The basic encoders are now all working. Examples are in scripts.
* Dev peter (#50)
* Removed the argument "batch_size" from the trainers. Changed default hyperparameters in the models. Added demo for profile reconstruction. Added script for dataset standardization (has to be run once before model training to store normalization coefficients).
* Bugfix in the dataset class. When iterating over movie configurations, the wrong configuration was used to find the correct signal name. Also, removed warning for duplicated tensor conversion.
* Added base script for video reconstruction. Copied from Aza's branch for debugging purposes.
* Minor changes in the example scripts. More preprocessing options for the dataset class.
* Fixed a bug where the dataset class failed when using multiple workers and opening an H5 file prior to distributing the dataset across all workers. Significant updates in the Fast time series baseline and actuator reconstruction classes.
* Lots of bugfixes in the dataset, trainer, and models. The basic encoders are now all working. Examples are in scripts.
* Extended checkpointing - the trainer now stores:
  - Model
  - Optimizer state
  - Scheduler state
  - Current loss
  - Current epoch
  For the sake of continual training.
* Adapted the other reconstruction scripts to match the new API.
* Bugfix in the dataset class. When splitting inputs and targets, I forgot to remove unused modalities. This follows the standard getitem function now.
* Prepared an option to preprocess movies. This has to be fully integrated!!!

---------
Co-authored-by: Peter Steiner <61472983+renierts@users.noreply.github.com>

* Dev peter (#55)
* Removed the argument "batch_size" from the trainers. Changed default hyperparameters in the models. Added demo for profile reconstruction. Added script for dataset standardization (has to be run once before model training to store normalization coefficients).
* Bugfix in the dataset class. When iterating over movie configurations, the wrong configuration was used to find the correct signal name. Also, removed warning for duplicated tensor conversion.
* Added base script for video reconstruction. Copied from Aza's branch for debugging purposes.
* Minor changes in the example scripts. More preprocessing options for the dataset class.
* Fixed a bug where the dataset class failed when using multiple workers and opening an H5 file prior to distributing the dataset across all workers. Significant updates in the Fast time series baseline and actuator reconstruction classes.
* Lots of bugfixes in the dataset, trainer, and models. The basic encoders are now all working. Examples are in scripts.
* Extended checkpointing - the trainer now stores:
  - Model
  - Optimizer state
  - Scheduler state
  - Current loss
  - Current epoch
  For the sake of continual training.
* Adapted the other reconstruction scripts to match the new API.
* Bugfix in the dataset class. When splitting inputs and targets, I forgot to remove unused modalities. This follows the standard getitem function now.
* Prepared an option to preprocess movies. This has to be fully integrated!!!
* Added a baseline fusion transformer for latent space prediction. Quick fix for the data standardization. Invalid values have to be ignored. Fix in the function to create H5 files. bolo data does not have to be flipped anymore as the data is now stored in the correct format.

---------
Co-authored-by: Nathaniel Chen <nathanchen1101@gmail.com>
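The `collate_fn_pad` entry above describes padding variable-length tensors into one batch. As a rough illustration of the idea only (the repository's function works on tensors and may differ), here is a dependency-free sketch on plain lists; `pad_batch` and its return values are hypothetical names. In PyTorch, `torch.nn.utils.rnn.pad_sequence` is the usual building block for this step.

```python
def pad_batch(sequences, pad_value=0.0):
    """Right-pad variable-length sequences to a common length.

    Returns the padded batch, a boolean validity mask, and the original lengths.
    """
    if not sequences:
        return [], [], []
    lengths = [len(seq) for seq in sequences]
    max_len = max(lengths)
    padded = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]
    # True marks real samples, False marks padding added by this function.
    mask = [[i < n for i in range(max_len)] for n in lengths]
    return padded, mask, lengths


batch = [[1.0, 2.0, 3.0], [4.0], [5.0, 6.0]]
padded, mask, lengths = pad_batch(batch)
```

Returning the mask alongside the padded batch lets later stages (loss, plotting) ignore the padded positions.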

…eparation for moving to Stellar.

…5 files. Implemented calculating incremental statistics. Corrected values in the modality configuration. Removed redundant script standardize_dataset.py

TODO: Write documentation.

…simple file transfer.

- Added information on how to use all the scripts for data fetching.

Updated read_mds.sh
- Added a switch for Globus file transfer. This simply stores the H5 files on Omega and we can add more data later.

Moved prepare_data.py to scripts, added a batch script to do this on compute nodes. Added more point names to the data fetching scripts for Omega. Added docstring to the WelfordTensor class. Updated modalities.yaml with the new point names added.
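The name `WelfordTensor` suggests Welford's online algorithm for the incremental statistics mentioned earlier. The sketch below shows that algorithm for a scalar stream, assuming (as the standardization fix notes) that invalid values are skipped. `RunningStats` is a hypothetical class for illustration and is not the repository's implementation.

```python
import math


class RunningStats:
    """Welford's online algorithm for the mean and variance of a scalar stream."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        # Assumption: NaN and infinite samples are ignored rather than propagated.
        if not math.isfinite(value):
            return
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self):
        # Population variance; NaN until at least one valid sample was seen.
        return self.m2 / self.count if self.count > 0 else float("nan")

    @property
    def std(self):
        return math.sqrt(self.variance)


stats = RunningStats()
# Data arrives in chunks, e.g. one chunk per file; nothing is kept in memory.
for chunk in ([2.0, 4.0, float("nan")], [4.0, 4.0, 5.0], [5.0, 7.0, 9.0]):
    for value in chunk:
        stats.update(value)
```

Because only `count`, `mean`, and `m2` are stored, the statistics can be accumulated over an arbitrarily large dataset in a single pass.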

…_preprocessing_stats more transparent. Bugfix in modalities.yaml - Channels were missing in ECE.

…tats. This is still not efficient enough and causes memory issues.

Bugfixes in the trainer. Cosmetic changes in tracking.py

- PEP-8 corrections
- Support plots of time signals and videos

Train-val-test split in fast_time_series_reconstruction.py

- Channels were not handled properly (if selecting slices of a signal).
- Drawing: Restrict plotting to valid signals (not the padded sections after the actual signal).
- Introduced masked loss for fast time series reconstruction.
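A masked loss averages the error only over valid samples, so the padded sections after the actual signal do not contribute. A minimal sketch on plain lists follows; `masked_mse` is a hypothetical name and the repository's loss functions may be defined differently.

```python
def masked_mse(prediction, target, mask):
    """Mean squared error over the positions where mask is True."""
    total = 0.0
    count = 0
    for p, t, m in zip(prediction, target, mask):
        if m:
            total += (p - t) ** 2
            count += 1
    # With no valid positions there is nothing to penalize.
    return total / count if count else 0.0


prediction = [1.0, 2.0, 3.0, 0.5]
target = [1.0, 4.0, 3.0, 0.0]
mask = [True, True, True, False]  # the last sample is padding

loss = masked_mse(prediction, target, mask)
unmasked = masked_mse(prediction, target, [True] * 4)
```

Note that the divisor is the number of valid positions, not the padded length; dividing by the full length would shrink the loss for short signals.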

…_series_baseline.py to filterscope_baseline.py). Updates in the dataset class: Clipping for log transform can go down to -0.99 (sufficient because we subtract 1.0). Updates in drawing.py: We can now draw all kinds of different plots (except for profiles for now). Added functionality to draw correlation plots, which is important for finding feature distributions. Added masked loss functions to not consider out-of-range time slices for training.

…ted for both linear and log10 scale. Working on more accurate autoencoders for time-series and profiles.

* Removed the argument "batch_size" from the trainers. Changed default hyperparameters in the models. Added demo for profile reconstruction. Added script for dataset standardization (has to be run once before model training to store normalization coefficients).
* Bugfix in the dataset class. When iterating over movie configurations, the wrong configuration was used to find the correct signal name. Also, removed warning for duplicated tensor conversion.
* Added base script for video reconstruction. Copied from Aza's branch for debugging purposes.
* Minor changes in the example scripts. More preprocessing options for the dataset class.
* Fixed a bug where the dataset class failed when using multiple workers and opening an H5 file prior to distributing the dataset across all workers. Significant updates in the Fast time series baseline and actuator reconstruction classes.
* Lots of bugfixes in the dataset, trainer, and models. The basic encoders are now all working. Examples are in scripts.
* Extended checkpointing - the trainer now stores:
  - Model
  - Optimizer state
  - Scheduler state
  - Current loss
  - Current epoch
  For the sake of continual training.
* Adapted the other reconstruction scripts to match the new API.
* Bugfix in the dataset class. When splitting inputs and targets, I forgot to remove unused modalities. This follows the standard getitem function now.
* Prepared an option to preprocess movies. This has to be fully integrated!!!
* Added a baseline fusion transformer for latent space prediction. Quick fix for the data standardization. Invalid values have to be ignored. Fix in the function to create H5 files. bolo data does not have to be flipped anymore as the data is now stored in the correct format.
* Foundation model (#56)
* Nathan fm (#53)
* chore: Update `pyproject.toml` to reorder authors, enhance README with environment setup instructions, and add validation notes in `validation.txt`. Refactor `dummy_model_2.py` for improved modality configuration and introduce `TextEncoder` enhancements in `text_baseline.py`.
* Refactor demo scripts to utilize new `Prediction4FusionModel` and `DictMSELoss`. Update `run_demo_2.py` and `run_demo_3.py` for improved model initialization and data handling. Enhance `TokamakH5Dataset` to handle degenerate signals and improve data extraction logic. Remove unused `latent_space.py` and integrate new modality fusion models in `modality_fusion.py`.
* Remove unused shot list configuration files and refactor trainer class to introduce MultimodalTrainer and UnimodalTrainer for improved training structure.
* Refactor modality models and trainer classes for improved structure and functionality. Removed unused TimeSeriesEncoder and Decoder, introduced FastTimeSeriesEncoder and SpectrogramAutoEncoder. Updated UnimodalTrainer to support logging and checkpoint management. Enhanced TokamakH5Dataset for better data handling and added checkpoint loading functionality in spectrogram reconstruction script.
* Add padding collate function and update training script for unimodal autoencoder
  - Introduced `collate_fn_pad` to handle variable-length tensors in batches.
  - Updated `train_unimodal_autoencoder.py` to use the new collate function.
  - Modified `train_unimodal.sh` to include additional signal modalities for training.
  - Added new autoencoder classes for fast time series and spatial profile modalities, ensuring output shape consistency with adaptive pooling.
  - Enhanced video autoencoder implementation for better reconstruction quality.
* Remove spectrogram reconstruction script and refactor modality models
  - Deleted `spectrogram_reconstruction.py` as part of the restructuring.
  - Refactored modality models to introduce baseline versions for actuator, slow time series, fast time series, spatial profile, spectrogram, and video.
  - Updated model registry and signal-to-model mappings to reflect new baseline architecture.
  - Enhanced `TokamakH5Dataset` to support additional parameters for FFT and hop length.
  - Improved training script for unimodal autoencoders to utilize new baseline models and added support for variable-length tensors.
* Update .gitignore to include pixi environments and add link to HSI-compression-benchmark in SpectrogramBaselineAutoEncoder docstring
* Remove unused shot list files and delete deprecated scripts for training and data handling
* Remove deprecated training scripts for CO2, ECE, MHR, and unimodal training
* Dev peter (#48)
* Removed the argument "batch_size" from the trainers. Changed default hyperparameters in the models. Added demo for profile reconstruction. Added script for dataset standardization (has to be run once before model training to store normalization coefficients).
* Bugfix in the dataset class. When iterating over movie configurations, the wrong configuration was used to find the correct signal name. Also, removed warning for duplicated tensor conversion.
* Added base script for video reconstruction. Copied from Aza's branch for debugging purposes.
* Minor changes in the example scripts. More preprocessing options for the dataset class.
* Fixed a bug where the dataset class failed when using multiple workers and opening an H5 file prior to distributing the dataset across all workers. Significant updates in the Fast time series baseline and actuator reconstruction classes.
* Lots of bugfixes in the dataset, trainer, and models. The basic encoders are now all working. Examples are in scripts.
* Dev peter (#50)
* Removed the argument "batch_size" from the trainers. Changed default hyperparameters in the models. Added demo for profile reconstruction. Added script for dataset standardization (has to be run once before model training to store normalization coefficients).
* Bugfix in the dataset class. When iterating over movie configurations, the wrong configuration was used to find the correct signal name. Also, removed warning for duplicated tensor conversion.
* Added base script for video reconstruction. Copied from Aza's branch for debugging purposes.
* Minor changes in the example scripts. More preprocessing options for the dataset class.
* Fixed a bug where the dataset class failed when using multiple workers and opening an H5 file prior to distributing the dataset across all workers. Significant updates in the Fast time series baseline and actuator reconstruction classes.
* Lots of bugfixes in the dataset, trainer, and models. The basic encoders are now all working. Examples are in scripts.
* Extended checkpointing - the trainer now stores:
  - Model
  - Optimizer state
  - Scheduler state
  - Current loss
  - Current epoch
  For the sake of continual training.
* Adapted the other reconstruction scripts to match the new API.
* Bugfix in the dataset class. When splitting inputs and targets, I forgot to remove unused modalities. This follows the standard getitem function now.
* Prepared an option to preprocess movies. This has to be fully integrated!!!

---------

* Dev peter (#55)
* Removed the argument "batch_size" from the trainers. Changed default hyperparameters in the models. Added demo for profile reconstruction. Added script for dataset standardization (has to be run once before model training to store normalization coefficients).
* Bugfix in the dataset class. When iterating over movie configurations, the wrong configuration was used to find the correct signal name. Also, removed warning for duplicated tensor conversion.
* Added base script for video reconstruction. Copied from Aza's branch for debugging purposes.
* Minor changes in the example scripts. More preprocessing options for the dataset class.
* Fixed a bug where the dataset class failed when using multiple workers and opening an H5 file prior to distributing the dataset across all workers. Significant updates in the Fast time series baseline and actuator reconstruction classes.
* Lots of bugfixes in the dataset, trainer, and models. The basic encoders are now all working. Examples are in scripts.
* Extended checkpointing - the trainer now stores:
  - Model
  - Optimizer state
  - Scheduler state
  - Current loss
  - Current epoch
  For the sake of continual training.
* Adapted the other reconstruction scripts to match the new API.
* Bugfix in the dataset class. When splitting inputs and targets, I forgot to remove unused modalities. This follows the standard getitem function now.
* Prepared an option to preprocess movies. This has to be fully integrated!!!
* Added a baseline fusion transformer for latent space prediction. Quick fix for the data standardization. Invalid values have to be ignored. Fix in the function to create H5 files. bolo data does not have to be flipped anymore as the data is now stored in the correct format.

---------

* Moved some remaining scripts to the correct subdirectories.
* Still working on preparing the dataset. This is not ready to push. Preparation for moving to Stellar.
* Updated the data loader. Bugfix for loading the correct slices from H5 files. Implemented calculating incremental statistics. Corrected values in the modality configuration. Removed redundant script standardize_dataset.py
* Added scripts for data fetching in Omega. TODO: Write documentation.
* Added documentation for setting up Globus CLI on Omega and starting a simple file transfer.
* Updated README.md:
  - Added information on how to use all the scripts for data fetching.
  Updated read_mds.sh
  - Added a switch for Globus file transfer. This simply stores the H5 files on Omega and we can add more data later.
* More PTData to fetch.
* PEP-8 compatible code. Moved prepare_data.py to scripts, added a batch script to do this on compute nodes. Added more point names to the data fetching scripts for Omega. Added docstring to the WelfordTensor class. Updated modalities.yaml with the new point names added.
* Generalized make_preprocessing_stats.py and made the function compute_preprocessing_stats more transparent. Bugfix in modalities.yaml - Channels were missing in ECE.
* A lot of bugfixes in the dataloader and prepare_data.py
* Many bugfixes in the dataset class and for computing preprocessing stats. This is still not efficient enough and causes memory issues.
* Speed-ups in data_loader.py.
* Speed-ups in the dataloader. Bugfixes in the trainer. Cosmetic changes in tracking.py
* drawing.py:
  - PEP-8 corrections
  - Support plots of time signals and videos
  Train-val-test split in fast_time_series_reconstruction.py
* Bugfix in processing methods of the dataloader:
  - Channels were not handled properly (if selecting slices of a signal).
  - Drawing: Restrict plotting to valid signals (not the padded sections after the actual signal).
  - Introduced masked loss for fast time series reconstruction.
* Added a separate baseline encoder for filterscopes (renamed fast_time_series_baseline.py to filterscope_baseline.py). Updates in the dataset class: Clipping for log transform can go down to -0.99 (sufficient because we subtract 1.0). Updates in drawing.py: We can now draw all kinds of different plots (except for profiles for now). Added functionality to draw correlation plots, which is important for finding feature distributions. Added masked loss functions to not consider out-of-range time slices for training.
* Added a weighted loss to penalize target distributions. Corrected the R2 score calculation in the drawer. Renamed profile_reconstruction.py to mse_profile_reconstruction.py. Added ts_core_density_profile_reconstruction.py.
* Modified the default parameters of some profile and time-series signals in data_loader.py. Added more loss functions in loss.py. Switched to HuberLoss in filterscopes_reconstruction.py, in mse_profile_reconstruction.py. Updated model_factory.py to completed signal encoders/decoders. Moved profile_baseline.py into modality. Added training scripts for Thomson scattering profiles.
* Added CER related info to the dataset class and to the model factory.
* Added dummy perceiver stuff. Be careful - this is not structured nicely yet. Only work in progress.
* Added more RMP point names to the data fetching script. Restarted work on the latent feature space.
* Updated all scripts according to the increased set of diagnostics and actuators we are using.
* Updated preprocessing_stats. Here, the statistics are now pre-calculated for both linear and log10 scale. Working on more accurate autoencoders for time-series and profiles.

---------
Co-authored-by: Nathaniel Chen <nathanchen1101@gmail.com>
Co-authored-by: renierts <ps9551@princeton.edu>
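For reference when reading the R2 fix mentioned above, the standard coefficient of determination is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below implements that textbook definition on plain lists; it is not the drawer's code, and `r2_score` here is a local, hypothetical helper.

```python
def r2_score(target, prediction):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_target = sum(target) / len(target)
    ss_res = sum((t - p) ** 2 for t, p in zip(target, prediction))
    ss_tot = sum((t - mean_target) ** 2 for t in target)
    if ss_tot == 0.0:
        # A constant target has no variance to explain; R2 is undefined.
        return float("nan")
    return 1.0 - ss_res / ss_tot


target = [1.0, 2.0, 3.0, 4.0]
perfect = r2_score(target, target)        # prediction equals the target
mean_only = r2_score(target, [2.5] * 4)   # always predicting the target mean
```

A perfect prediction gives 1, predicting the target mean gives 0, and worse predictions give negative values, which is a quick sanity check for any implementation.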

…re space is more compact now. Added foundation model utilities. This is under development!!!

into dev-peter

Too many changes to comment on individually. Mainly, the old foundation model has been moved to archive so that it can be restored at any point. The new training scripts are train_e2e*. Adapted dataset functionalities to be compatible with the new training approach.

…GPU).

…n 50fps. So, adapted it.

…r features) Resolved 12 conflicted files.

Policy:
* Trainer files (stage1, stage2, stage2_delta, stage2_extended, stage3): keep foundation_model's DDP plumbing (DistributedManager, DistributedSampler, _core(), dm.is_main rank guards, dm.barrier, dm.wrap, train_step_module DDP wrappers) AND dev-peter's feature additions (video/spectrogram modalities, --use_video/--use_spectro flags, freeze_*_steps, --tf_anneal_steps, teacher-forcing scheduled sampling, _worker_init for OMP threads, load_state_dict_explicit warm-start helper).
* Trainer model.state_dict() switched to _core(model).state_dict() so DDP-wrapped checkpoints save the inner module.
* dev-peter SLURM step counts kept where divergent (curriculum_steps, block_steps, max_steps, val_every, grad_checkpoint_every).
* Model/output_heads/rollout/data_loader: video + spectrogram tokenizers, output heads, _raw_to_frame_mask helper, mask-aware tokenize/rollout, teacher-forcing rollout loop: all preserved.
* tangtv MovieConfig: dev-peter's filtered config (2-channel 120x360) over foundation_model's raw 7-channel 240x720.

Two known follow-ups (intentional, deferred):
* .gitignore auto-merged with unanchored data/ and runs/, which match src/tokamak_foundation_model/data/. Should be /data/ and /runs/.
* multi_file_dataset.py still has the per-worker profiling counters and [w-pid...] aggregate print. data_loader.py's matching timers were removed during the merge, so the counters now stay at 0.

renierts and others added 30 commits March 17, 2026 15:30

Removed the argument "batch_size" from the trainers.

28b15d3

Changed default hyperparameters in the models. Added demo for profile reconstruction. Added script for dataset standardization (has to be run once before model training to store normalization coefficients).

Bugfix in the dataset class. When iterating over movie configurations…

305c7e2

…, the wrong configuration was used to find the correct signal name. Also, removed warning for duplicated tensor conversion.

Added base script for video reconstruction. Copied from Aza's branch …

3243412

…for debugging purposes.

Added base script for video reconstruction. Copied from Aza's branch …

dfc63ee

…for debugging purposes.

Minor changes in the example scripts. More preprocessing options for …

65f48fc

…the dataset class.

Fixed a bug where the dataset class failed when using multiple worker…

746f7ba

…s and opening an H5 file prior to distributing the dataset across all workers. Significant updates in the Fast time series baseline and actuator reconstruction classes.

Lots of bugfixes in the dataset, trainer, and models.

f053586

The basic encoders are now all working. Examples are in scripts.

Extended checkpointing - the trainer now stores:

300a4b3

- Model
- Optimizer state
- Scheduler state
- Current loss
- Current epoch

For the sake of continual training.

Extended checkpointing - the trainer now stores:

939360c

- Model
- Optimizer state
- Scheduler state
- Current loss
- Current epoch

For the sake of continual training.

Adapted the other reconstruction scripts to match the new API.

d359e07

Bugfix in the dataset class. When splitting inputs and targets, I for…

9d5bee1

…got to remove unused modalities. This follows the standard getitem function now.

Prepared an option to preprocess movies. This has to be fully integra…

9e79a91

…ted!!!

Added a baseline fusion transformer for latent space prediction.

029b685

Quick fix for the data standardization. Invalid values have to be ignored. Fix in the function to create H5 files. bolo data does not have to be flipped anymore as the data is now stored in the correct format.

Moved some remaining scripts to the correct subdirectories.

7f20db2

Still working on preparing the dataset. This is not ready to push. Pr…

fc95315

…eparation to moving to Stellar.

Updated the data loader. Bugfix for loading the correct slices from H…

5437224

…5 files. Implemented calculating incremental statistics. Corrected values in the modality configuration. Removed redundant script standardize_dataset.py

Added scripts for data fetching in Omega.

354e643

TODO: Write documentation.

Added documentation for setting up Globus CLI on Omega and starting a …

f4ff282

…simple file transfer.

Updated README.md:

39cfaea

- Added information on how to use all the scripts for data fetching.

Updated read_mds.sh:

- Added a switch for Globus file transfer. This simply stores the H5 files on Omega, and we can add more data later.

More PTData to fetch.

605fc68

PEP-8 compatible code.

9f436ec

Moved prepare_data.py to scripts, added a batch script to do this on compute nodes. Added more point names to the data fetching scripts for Omega. Added docstring to the WelfordTensor class. Updated modalities.yaml with the new point names added.

Generalized make_preprocessing_stats.py and made the function compute…

80ba381

…_preprocessing_stats more transparent. Bugfix in modalities.yaml - Channels were missing in ECE.

A lot of bugfixes in the dataloader and prepare_data.py

5d2c032

Many bugfixes in the dataset class and for computing preprocessing s…

ffa2c29

…tats. This is still not efficient enough and causes memory issues.

Speed-ups in data_loader.py.

33db368

Speed-ups in the dataloader.

345a3d5

Bugfixes in the trainer. Cosmetic changes in tracking.py

drawing.py:

06a9065

- PEP-8 corrections
- Support plots of time signals and videos

Train-val-test split in fast_time_series_reconstruction.py

Bugfix in processing methods of the dataloader:

857f75a

- Channels were not handled properly (when selecting slices of a signal).
- Drawing: Restrict plotting to valid signals (not the padded sections after the actual signal).
- Introduced masked loss for fast time series reconstruction.
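A masked loss of the kind mentioned in this commit can be sketched as follows. This is a minimal illustration using plain Python lists instead of tensors; the function name `masked_mse` and the convention that `True` marks a valid (non-padded) time step are assumptions, not the repository's API.

```python
def masked_mse(pred, target, mask):
    """Mean squared error over positions where mask is True (hypothetical helper)."""
    total = 0.0
    n_valid = 0
    for p, t, m in zip(pred, target, mask):
        if m:  # padded positions are skipped entirely
            total += (p - t) ** 2
            n_valid += 1
    # Assumption: a fully padded sequence contributes zero loss.
    return total / n_valid if n_valid else 0.0


# The last two entries are padding after the actual signal.
pred = [1.0, 2.0, 3.0, 0.0, 0.0]
target = [1.0, 4.0, 3.0, 9.0, 9.0]
mask = [True, True, True, False, False]

loss = masked_mse(pred, target, mask)
print(loss)
```

Normalizing by the number of valid positions rather than by the padded length keeps short signals from being under-weighted in a batch of variable-length sequences.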

renierts and others added 29 commits April 13, 2026 13:47

Added base script for video reconstruction. Copied from Aza's branch …

5dc6c7c

…for debugging purposes.

Minor changes in the example scripts. More preprocessing options for …

b0c1ce7

…the dataset class.

Fixed a bug where the dataset class failed when using multiple worker…

36fd17f

…s and opening an H5 file prior to distributing the dataset across all workers. Significant updates in the Fast time series baseline and actuator reconstruction classes.

Lots of bugfixes in the dataset, trainer, and models.

e84fae4

The basic encoders are now all working. Examples are in scripts.

Adapted the other reconstruction scripts to match the new API.

897697c

Moved some remaining scripts to the correct subdirectories.

7e0c537

Updated the data loader. Bugfix for loading the correct slices from H…

d18375a

…5 files. Implemented calculating incremental statistics. Corrected values in the modality configuration. Removed redundant script standardize_dataset.py

Added scripts for data fetching in Omega.

1fb3a69

TODO: Write documentation.

Added documentation for setting up Globus CLI on Omega and starting a …

fe43bb2

…simple file transfer.

Updated README.md:

09691fc

- Added information on how to use all the scripts for data fetching.

Updated read_mds.sh:

- Added a switch for Globus file transfer. This simply stores the H5 files on Omega, and we can add more data later.

More PTData to fetch.

a46d97b

PEP-8 compatible code.

bb50ad2

Moved prepare_data.py to scripts, added a batch script to do this on compute nodes. Added more point names to the data fetching scripts for Omega. Added docstring to the WelfordTensor class. Updated modalities.yaml with the new point names added.

A lot of bugfixes in the dataloader and prepare_data.py

9cdca1a

Many bugfixes in the dataset class and for computing preprocessing s…

7a1a9a4

…tats. This is still not efficient enough and causes memory issues.

Speed-ups in data_loader.py.

0ef276d

Speed-ups in the dataloader.

946b5f7

Bugfixes in the trainer. Cosmetic changes in tracking.py

Updated preprocessing_stats. Here, the statistics are now pre-calcula…

cc77bec

…ted for both, linear and log10 scale. Working on more accurate autoencoders for time-series and profiles.

TS profiles are now slow time series instead of profiles.

cf4b51e

Had to update all the profiles and slow time-series. The latent featu…

6cf8981

…re space is more compact now. Added foundation model utilities. This is under development!!!

Merge branch 'dev-peter' of https://github.com/PlasmaControl/FusionAIHub

4f68b7c

into dev-peter

Much better GPU utilization of the e2e pipeline now (98% on a single …

739084a

…GPU).

Prepared for video data. 100fps works better with the 50ms chunks tha…

4ec7075

…n 50fps. So, adapted it.
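One plausible reading of this note is that 50 ms chunks contain a whole number of frames at 100 fps but not at 50 fps. The commit does not state the reason, so this is an assumption; the arithmetic itself can be checked with a short sketch (the helper name `frames_per_chunk` is hypothetical).

```python
from fractions import Fraction


def frames_per_chunk(fps, chunk_ms):
    """Exact number of video frames that fall into one chunk of chunk_ms milliseconds."""
    return Fraction(fps * chunk_ms, 1000)


for fps in (50, 100):
    n = frames_per_chunk(fps, 50)
    # denominator == 1 means the chunk holds a whole number of frames
    print(f"{fps} fps: {n} frames per 50 ms chunk, integer: {n.denominator == 1}")
```

At 100 fps each 50 ms chunk holds exactly 5 frames, while at 50 fps it would hold 2.5 frames, so chunk boundaries would not line up with frame boundaries.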

Stage 2 is ready for video support.

da616d5

Prepared for real multimodal foundation model. TS+Video+Spectrograms.

f9d6fcc

renierts merged commit 5a53521 into foundation_model May 7, 2026

renierts merged 73 commits into foundation_model from dev-peter
