# A Mutual Information Perspective on Multiple Latent Variable Generative Models for Positive View Generation

This is the official GitHub repository for the paper *A Mutual Information Perspective on Multiple Latent Variable Generative Models for Positive View Generation*, accepted at TMLR.
## Dependencies Install

```shell
conda env create --file environment.yml
conda activate mi_ml_gen

# package install (in development mode)
conda develop ./mi_ml_gen
```
## Pretrained Generators

**BigBiGAN**
- original paper: Large Scale Adversarial Representation Learning
- pretrained model (PyTorch): https://github.com/lukemelas/pytorch-pretrained-gans

**StyleGAN2**
- paper: Analyzing and Improving the Image Quality of StyleGAN
- official repository (TensorFlow): https://github.com/NVlabs/stylegan2
- pretrained model (official PyTorch): https://github.com/NVlabs/stylegan2-ada-pytorch
## Datasets

For training the "real-data" encoders, we use datasets in the FFCV (`.beton`) format.
The precomputed files for ImageNet-1K and LSUN Cars can be downloaded at:
https://huggingface.co/SerezD/mi_ml_gen/tree/main/datasets

Datasets for the downstream tasks can be generated with the script:
`mi_ml_gen/data/create_image_beton_file.py`
## View Generation

If you are only interested in using the repository for view generation, simply run:

```shell
python mi_ml_gen/src/scripts/view_generation.py --configuration ./conf.yaml --save_folder ./tmp/
```

Examples of valid configurations are:
- `mi_ml_gen/configurations/view_generation/bigbigan.yaml`
- `mi_ml_gen/configurations/view_generation/stylegan.yaml`

These configurations generate positive views of images sampled from the corresponding generator.
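Conceptually, a positive view comes from perturbing a subset of the generator's latent chunks while leaving the others fixed. The sketch below is stdlib-only and purely illustrative (the helper name, chunk layout, and noise scale are assumptions, not the repo's API, which is driven by the YAML configuration):

```python
import random

def perturb_chunks(z, chunk_size, chunks_to_perturb, sigma=0.5, rng=None):
    """Return a copy of latent vector z with Gaussian noise added
    only to the selected chunks (hypothetical helper)."""
    rng = rng or random.Random(0)
    view = list(z)
    for c in chunks_to_perturb:
        for i in range(c * chunk_size, (c + 1) * chunk_size):
            view[i] += rng.gauss(0.0, sigma)
    return view

z = [0.0] * 8  # toy latent: 4 chunks of size 2
view = perturb_chunks(z, chunk_size=2, chunks_to_perturb=[1, 3])
# chunks 0 and 2 (indices 0-1 and 4-5) stay identical to z
```

Decoding both `z` and `view` through the same generator then yields two semantically related images that can serve as a positive pair.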
## Training the Walkers

To train the walkers, which decide which latent chunks to perturb, see the script at `mi_ml_gen/src/noise_maker/cop_gen_training/train_navigator.py`.

For example, to train a walker on all BigBiGAN latents except the first, run:

```shell
python train_navigator.py --generator bigbigan --g_path ./runs/BigBiGAN_x1.pth --chunks 1_6
```

Note: the learning rates for the walkers and the InfoNCE loss may vary depending on the selected chunks, generator, and batch size. A rule of thumb is to keep them low (on the order of 1e-5), allowing smooth learning of the walkers.
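For reference, the InfoNCE objective scores the positive pair against a set of negatives via a temperature-scaled softmax over cosine similarities. A minimal stdlib-only sketch of the per-anchor loss (function names and the toy vectors are illustrative, not the repo's implementation):

```python
import math

def cosine(u, v):
    # cosine similarity between two (non-zero) vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: cross-entropy of the softmax
    over similarities, with the positive as the correct class."""
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    logits = [s / temperature for s in sims]
    m = max(logits)  # subtract max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[0]

loss = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
```

The loss shrinks as the positive moves closer to the anchor relative to the negatives, which is what the walker is optimizing through the generator.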
The pre-trained walkers used in the experiments are available at: https://huggingface.co/SerezD/mi_ml_gen/tree/main/runs/walkers/
## Monte Carlo Simulation (Table 1)

To reproduce Table 1 in the paper (Monte Carlo simulation), run the script at `mi_ml_gen/src/noise_maker/delta_estimation/monte_carlo_simulation.py`.
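The exact quantities being estimated are defined in the script above; as a generic illustration of the technique, a Monte Carlo simulation approximates an expectation by averaging a function over many i.i.d. samples (all names below are illustrative):

```python
import random

def monte_carlo_estimate(f, sample, n=100_000, seed=0):
    """Estimate E[f(X)] as the average of f over n i.i.d. draws
    produced by sample(rng)."""
    rng = random.Random(seed)
    return sum(f(sample(rng)) for _ in range(n)) / n

# sanity check: E[Z^2] = 1 for Z ~ N(0, 1)
est = monte_carlo_estimate(lambda z: z * z, lambda rng: rng.gauss(0.0, 1.0))
```

By the law of large numbers the estimate converges at a rate of roughly 1/sqrt(n), so increasing `n` tightens the Table 1 figures at the cost of runtime.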
## Encoder Training

For the SimCLR results, we follow the experimental procedure of previous work, available at: https://github.com/LiYinqi/COP-Gen/tree/master

Therefore, this repo contains only the code for training SimSiam and BYOL models, generating data with continuous sampling.
The configuration `.yaml` file for each model, containing the training hyperparameters, can be found under `mi_ml_gen/configurations/encoders`.

To train a new encoder from scratch, run the script `mi_ml_gen/src/multiview_encoders/train_encoder.py`. For example, to train a SimSiam encoder on the ImageNet-1K dataset (real data), run:

```shell
python mi_ml_gen/src/multiview_encoders/train_encoder.py --seed 0 --encoder simsiam --conf simsiam_bigbigan/encoder_imagenet_baseline_real --data_path ./datasets/imagenet/ffcv/ --logging
```

Note: please check the script file for a description of each argument.
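As background, SimSiam minimizes a symmetrized negative cosine similarity between each view's prediction and the other view's representation, where the representation branch receives no gradient (stop-gradient). A stdlib-only sketch of the loss (names are illustrative; the repo's training loop is in the script above):

```python
import math

def neg_cosine(p, z):
    """Negative cosine similarity; in SimSiam, z comes from the
    stop-gradient branch, so no gradient flows through it."""
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)

def simsiam_loss(p1, z2, p2, z1):
    # symmetrized over the two augmented views
    return 0.5 * neg_cosine(p1, z2) + 0.5 * neg_cosine(p2, z1)

loss = simsiam_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0])
# perfectly aligned predictions give the minimum value -1
```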
The pre-trained encoders used in the experiments are available at: https://huggingface.co/SerezD/mi_ml_gen/tree/main/runs/encoders/
## Downstream Classification

After an encoder has been trained, a small MLP network can be trained for classification on several downstream datasets.

The configuration `.yaml` files, containing the training hyperparameters, can be found under `mi_ml_gen/configurations/classifiers`.
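At this stage the encoder is frozen and only a small head is trained on its features. The sketch below shows the forward pass of such a head, reduced to a single softmax linear layer for brevity (all names and numbers are illustrative; the actual head and hyperparameters come from the classifier configurations):

```python
import math

def linear_probe(features, weights, biases):
    """Softmax linear classifier on frozen encoder features
    (hypothetical helper for illustration only)."""
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]  # class probabilities

probs = linear_probe([1.0, 2.0], [[0.5, -0.2], [0.1, 0.3]], [0.0, 0.0])
# probabilities over the downstream classes, summing to 1
```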
To train the classifiers, run the script at `mi_ml_gen/src/evaluations/classification/train_classifier.py`. For example, to train a classifier on top of a BYOL pre-trained encoder for the StanfordCars downstream task:

```shell
python mi_ml_gen/src/evaluations/classification/train_classifier.py --encoder_path './runs/encoder_lsun_baseline_real/last.ckpt' --data_path './datasets/StanfordCars/ffcv' --conf 'classifier_lsun' --dataset 'StanfordCars' --run_name 'tmp' --seed 0 --logging
```
Note: please check the script file for a description of each argument.
Pre-trained linear classifiers are available at: https://huggingface.co/SerezD/mi_ml_gen/tree/main/runs/linear_classifiers/
## Evaluation

As a last step, evaluation on the test set can be run with the script `mi_ml_gen/src/evaluations/classification/eval_classifier.py`. For example:

```shell
python mi_ml_gen/src/evaluations/classification/eval_classifier.py --lin_cls_path './runs/LinCls-StanfordCars-encoder_lsun_chunks_learned_classifier/last.ckpt' --data_path './datasets/StanfordCars/ffcv/' --dataset 'StanfordCars' --batch_size 16 --out_log_file tmp
```
## Training Speed Comparison (Figure 5)

To compare training speed and reproduce the results of Figure 5, run:

```shell
python mi_ml_gen/src/test_online_learning/train_offline.py --loader [ffcv, torch] --dataset_path ./datasets/imagenet/
```

where `./datasets/imagenet/` contains `train/*.png` files and `ffcv/train.beton`.

Then run:

```shell
python mi_ml_gen/src/test_online_learning/train_online.py --generator_path ./runs/BigBiGAN_x1.pth
```
## Citation

```bibtex
@article{
serez2025a,
title={A Mutual Information Perspective on Multiple Latent Variable Generative Models for Positive View Generation},
author={Dario Serez and Marco Cristani and Alessio Del Bue and Vittorio Murino and Pietro Morerio},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
year={2025},
url={https://openreview.net/forum?id=uaj8ZL2PtK},
note={}
}
```
