Initial FAITH commit by nathanchenseanwalter · Pull Request #29 · PlasmaControl/FusionAIHub

nathanchenseanwalter · 2025-08-01T19:30:22Z

No description provided.

- Added .gitignore
- Added a directory for documentation.
- Added a directory for unit tests.

Includes scripts for fetching and transferring data, and a work-in-progress object that reads and unifies the dataset.

Initial publication

Working time-series full pipeline: takes a suffix and outputs an aligned dict with a window.

Minor improvement to finding the closest index.
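The commit message does not show how the closest index is found; a later commit in this PR mentions a binary search fix, so a minimal sketch under that assumption might look like the following. The function name `closest_index` and the tie-breaking rule (prefer the earlier sample) are hypothetical and not taken from the repository. The input is assumed to be a sorted, non-empty time base.

```python
from bisect import bisect_left
from typing import Sequence


def closest_index(times: Sequence[float], target: float) -> int:
    """Return the index of the entry in sorted `times` nearest to `target`.

    Hypothetical helper: assumes `times` is sorted in ascending order.
    Ties are resolved in favor of the earlier index.
    """
    if len(times) == 0:
        raise ValueError("times must be non-empty")

    # bisect_left gives the first position whose value is >= target.
    pos = bisect_left(times, target)

    # Clamp to the ends when the target lies outside the time base.
    if pos == 0:
        return 0
    if pos == len(times):
        return len(times) - 1

    # Otherwise compare the two neighbors that bracket the target.
    before, after = times[pos - 1], times[pos]
    return pos if (after - target) < (target - before) else pos - 1
```

Binary search keeps each lookup at O(log n), which matters when many window edges are aligned against long signal time bases.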

- Added spectrogram utilities.
- Further cosmetic changes in Max's code

# Conflicts:
#	examples/Data_fetching/fetch_GAdata.py
#	examples/Data_fetching/fetch_toksearch.py

Added functionalities to resample a time-series and an empty module for time-series interpolation.

Co-authored-by: Alvin Garcia <alvin-garcia@users.noreply.github.com>

…IHub into dev-nathan

…rom "fusion_ai_hub" to "fusionaihub". Removed several unused modules and files related to data loading, processing, and visualization. Added new dependencies for enhanced functionality, including ipykernel, ipywidgets, scikit-learn, torch, tables, and pyyaml.

…r notebooks to reset execution counts, streamline data loading, and enhance visualization. Refactor dataset preparation scripts to improve logging, configuration handling, and remove deprecated modules.

…nvironment and installing necessary packages.

…uctions. Remove deprecated logging configuration file and update dataset preparation README for improved clarity and modular structure.

…ormats from .pkl to .joblib and .csv for dataset indexing, enhancing clarity on the output structure.

Added instructions on how to use

- Increased job execution time in `prepare_data.sh` from 10 hours to 15 hours.
- Changed the data processing configuration in `accessing_data.ipynb` to use the `magnetics` dataset instead of `signals`.
- Enabled debug mode in `spectrogram.yaml` for improved troubleshooting.
- Updated STFT transformation parameters in `processing_v0.py` to utilize configuration settings.
- Enhanced type hints in dataset classes for better clarity and type safety in `base.py`, `file_based.py`, and other related files.

- Added missing newlines at the end of files in `__main__.py` and `align.py`.
- Improved formatting and consistency in `base.py`, `file_based.py`, and other dataset classes by adjusting indentation and line breaks for better readability.

- Added `black[jupyter]` to development dependencies in `pyproject.toml` for improved code formatting.
- Introduced new ignore rule for `black` in `pyproject.toml` to handle specific annotation warnings.
- Updated function signatures in various files to include type hints for better clarity and type safety.
- Improved formatting in several Python files for consistency and readability.
- Added notes in `README.md` regarding package reinstallation after changes.
- Removed obsolete Jupyter notebooks from the `hackathon` directory to clean up the project.

Big Beautiful Bill: 1300 linting error fixes + Ruff + Black reformatting + fixes to dataloader (can now read joblib dataset)

This commit also contains some updates to the pinned dependencies in the `uv.lock` file. It might be a good idea to re-visit the dependencies once we have made more progress on the project.

Dev nathan

Implement draft spectrogram-based model `specfmv0`

review-notebook-app · 2025-08-01T19:30:28Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

Copilot

Pull Request Overview

This is an initial commit introducing the FAITH (Fusion Autoencoder for Interpretable Token-based Hierarchical representations) framework for training block-based autoencoders. The PR includes a complete implementation of modular autoencoder architectures with support for masked autoencoders (MAE), hyperparameter tuning via Ray Tune, PyTorch Lightning integration, and flexible file-based datasets.

Key changes:

- Modular autoencoder architecture with configurable encoder/decoder blocks
- Masked Autoencoder (MAE) implementation with various masking strategies
- Ray Tune integration for hyperparameter optimization with multiple search algorithms
- PyTorch Lightning training framework with automatic mixed precision and scheduling
- Flexible dataset loading supporting joblib, HDF5, and NumPy file formats
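To make the masking idea concrete, here is a minimal sketch of one common MAE strategy, uniform random patch masking, written with the standard library only. It is an illustration under stated assumptions, not the code in `src/faith/train/models/mae.py`: the names `random_patch_mask` and `apply_mask`, the `mask_ratio` parameter, and the choice to fill masked patches with a constant are all hypothetical.

```python
import random
from typing import List, Optional, Sequence, Tuple


def random_patch_mask(
    num_patches: int, mask_ratio: float, seed: Optional[int] = None
) -> Tuple[List[int], List[int]]:
    """Split patch indices into (visible, masked) lists for one sample.

    Hypothetical helper: a fraction `mask_ratio` of the patches is hidden,
    chosen uniformly at random without replacement.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")

    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)

    # Shuffle all indices, then take the first num_masked as the hidden set.
    order = list(range(num_patches))
    rng.shuffle(order)
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


def apply_mask(
    patches: Sequence[float], masked: Sequence[int], fill: float = 0.0
) -> List[float]:
    """Replace masked patches with `fill`, leaving visible patches untouched."""
    masked_set = set(masked)
    return [fill if i in masked_set else p for i, p in enumerate(patches)]
```

In a typical MAE setup the encoder sees only the visible patches and the reconstruction loss is computed on the masked ones. Other strategies (block or structured masking, for example) would change only how `masked` is chosen.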

Reviewed Changes

Copilot reviewed 101 out of 129 changed files in this pull request and generated 8 comments.


| File | Description |
| --- | --- |
| `tests/train/test_train_blocks_base.py` | Example usage and testing of base autoencoder blocks |
| `tests/train/test_autoencoder.py` | Comprehensive testing of BlockBasedAutoencoder functionality |
| `test_config.yaml` | YAML configuration example for autoencoder models |
| `src/faith/train/tuning/search_spaces.py` | Predefined hyperparameter search spaces for Ray Tune |
| `src/faith/train/tuning/ray_tuner.py` | Ray Tune integration for hyperparameter optimization |
| `src/faith/train/tuning/__init__.py` | Tuning module initialization with graceful Ray import handling |
| `src/faith/train/training/lightning_trainer.py` | PyTorch Lightning wrapper for autoencoder training |
| `src/faith/train/training/__init__.py` | Training module exports |
| `src/faith/train/models/utils.py` | Utility functions for model analysis and memory estimation |
| `src/faith/train/models/mae.py` | Masked Autoencoder implementation with flexible masking strategies |
| `src/faith/train/models/configs.py` | Configuration management with preset and custom model configurations |
| `src/faith/train/models/autoencoder.py` | Core BlockBasedAutoencoder implementation |
| `src/faith/train/models/__init__.py` | Model module exports and public API |
| `src/faith/train/data/loaders/factory.py` | DataLoader factory with worker initialization for lazy datasets |
| `src/faith/train/data/datasets/file_based.py` | File-based dataset implementations supporting multiple formats |
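As an illustration of how a file-based dataset might support joblib, HDF5, and NumPy files, the sketch below dispatches on the file suffix and imports each backend lazily. It is a guess at the general pattern, not the contents of `file_based.py`: `SUFFIX_TO_FORMAT`, `detect_format`, and `load_file` are hypothetical names, and the HDF5 branch assumes a flat file whose top-level entries are all datasets.

```python
from pathlib import Path
from typing import Any

# Hypothetical suffix-to-format table; the PR's actual mapping is not shown.
SUFFIX_TO_FORMAT = {
    ".joblib": "joblib",
    ".h5": "hdf5",
    ".hdf5": "hdf5",
    ".npy": "numpy",
    ".npz": "numpy",
}


def detect_format(path: str) -> str:
    """Map a file path to a format name based on its (case-insensitive) suffix."""
    suffix = Path(path).suffix.lower()
    try:
        return SUFFIX_TO_FORMAT[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix!r}") from None


def load_file(path: str) -> Any:
    """Load one file with the matching backend, imported only when needed."""
    fmt = detect_format(path)
    if fmt == "joblib":
        import joblib

        return joblib.load(path)
    if fmt == "numpy":
        import numpy as np

        return np.load(path)
    # HDF5: assumes every top-level entry is a dataset (no nested groups).
    import h5py

    with h5py.File(path, "r") as f:
        return {name: f[name][()] for name in f}
```

Keeping the imports inside `load_file` means a user who only has joblib files does not need `h5py` installed, which mirrors the graceful-import handling the review notes for the tuning module.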

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

renierts and others added 30 commits April 18, 2024 08:59

- Prepared a very basic package structure.

f2f662e

- Added .gitignore
- Added a directory for documentation.
- Added a directory for unit tests.

Initial publication

d04ed07

Includes scripts for fetching and transferring data, and a work-in-progress object that reads and unifies the dataset.

Merge pull request #1 from PlasmaControl/dev-max

c91c90a

Initial publication

Cosmetic changes of Max's code.

2124c02

Added normalization and flat top finder

12f6c8b

Time series full pipeline

1ba5ff6

Working time-series full pipeline: takes a suffix and outputs an aligned dict with a window.

Closest index improvement

9e67a55

Minor improvement to finding the closest index.

Fixed padding and binary search bug

60109ee

Change default setting, indent error, and plotting error

da9597a

- Added time domain filters

a22af0c

- Added spectrogram utilities.
- Further cosmetic changes in Max's code

Merge remote-tracking branch 'origin/dev-peter' into dev-peter

c931d61

# Conflicts:
#	examples/Data_fetching/fetch_GAdata.py
#	examples/Data_fetching/fetch_toksearch.py

Added docstrings for STFT and spectrogram.

21f6df9

Added functionalities to resample a time-series and an empty module for time-series interpolation.

Added linear interpolation to fill missing values in a time-series.

1107a66

Update data_prep_obj.py

8ccf8c3

Added functions

e0b3eca

Updates to syntax

1a784e7

file.py combines some base class functions

1f87285

load and scaling updates

6574656

init updates

4d82e93

small edits

d8afde8

Co-authored-by: Josh Josephy-Zack <Jayzee77@users.noreply.github.com>

552254b

Co-authored-by: Alvin Garcia <alvin-garcia@users.noreply.github.com>

stuff

4c3d872

GPU compatibility

f1d8aa0

most recent with debugging

30f014c

Merge branch 'dev-nathan' of https://github.com/PlasmaControl/FusionA…

d1357d3

…IHub into dev-nathan

Update .gitignore to exclude data and logs directories. Modify Jupyte…

6ccc075

…r notebooks to reset execution counts, streamline data loading, and enhance visualization. Refactor dataset preparation scripts to improve logging, configuration handling, and remove deprecated modules.

Update README.md to include instructions for activating the virtual e…

2f2d046

…nvironment and installing necessary packages.

Enhance README.md with project purpose, team members, and setup instr…

9c8a50c

…uctions. Remove deprecated logging configuration file and update dataset preparation README for improved clarity and modular structure.

Update dataset preparation README to reflect changes in output file f…

5aaff62

…ormats from .pkl to .joblib and .csv for dataset indexing, enhancing clarity on the output structure.

nathanchenseanwalter and others added 19 commits July 19, 2025 10:08

Update README.md

0e90fd7

Update README.md

7da20ca

Update README.md

61ce14f

Update README.md

46cb7b8

Update README.md

695c0d7

Merge pull request #22 from PlasmaControl/dev-nathan

a43dd49

Added instructions on how to use

Refactor code for consistency and clarity

fb9f9e6

- Added missing newlines at the end of files in `__main__.py` and `align.py`.
- Improved formatting and consistency in `base.py`, `file_based.py`, and other dataset classes by adjusting indentation and line breaks for better readability.

Removed ruff.toml

f1b8597

Merge pull request #24 from PlasmaControl/dev-nathan

ad73b4e

Big Beautiful Bill: 1300 linting error fixes + Ruff + Black reformatting + fixes to dataloader (can now read joblib dataset)

Started refactoring the ResidualBlock.

c89e044

Auto-format pyproject.toml

447a7fb

This commit also contains some updates to the pinned dependencies in the `uv.lock` file. It might be a good idea to re-visit the dependencies once we have made more progress on the project.

Refactor code structure for improved readability and maintainability

5c089eb

Remove black dependency and related configuration from pyproject.toml

88acb2e

Implement code changes to enhance functionality and improve performance

8496d19

Merge pull request #28 from PlasmaControl/dev-nathan

454494d

Dev nathan

Merge branch 'foundation25' into specfmv0

4b53e52

Merge pull request #27 from PlasmaControl/specfmv0

ed09d98

Implement draft spectrogram-based model `specfmv0`

nathanchenseanwalter requested a review from Copilot August 1, 2025 19:30

Merge branch 'main' into foundation25

4daaa30

Copilot AI reviewed Aug 1, 2025


nathanchenseanwalter and others added 7 commits August 1, 2025 15:32

Update src/faith/train/tuning/ray_tuner.py

63b12f3

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update src/faith/train/tuning/ray_tuner.py

1d10f84

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update src/faith/train/data/datasets/file_based.py

52dede7

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update tests/train/test_autoencoder.py

9c46ba5

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update src/faith/train/models/autoencoder.py

b7ae0a0

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update src/faith/train/data/datasets/file_based.py

89e4134

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update src/faith/train/models/mae.py

fdafbf7

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>


Initial FAITH commit #29: nathanchenseanwalter wants to merge 120 commits into main from foundation25


5 participants
