# A Mutual Information Perspective on Multiple Latent Variable Generative Models for Positive View Generation

This is the official GitHub repository for the paper *A Mutual Information Perspective on Multiple Latent Variable Generative Models for Positive View Generation*, accepted at TMLR.
## Dependencies Install

```shell
conda env create --file environment.yml
conda activate mi_ml_gen

# package install (in development mode)
conda develop ./mi_ml_gen
```
## Pretrained Generators

**BigBiGAN**
- original paper: Large Scale Adversarial Representation Learning
- pretrained model (PyTorch): https://github.com/lukemelas/pytorch-pretrained-gans

**StyleGAN2**
- paper: Analyzing and Improving the Image Quality of StyleGAN
- official repository (TensorFlow): https://github.com/NVlabs/stylegan2
- pretrained model (official PyTorch): https://github.com/NVlabs/stylegan2-ada-pytorch
## Datasets

For training the "real-data" encoders, we use datasets in the FFCV (`.beton`) format.
The precomputed files for ImageNet-1K and LSUN Cars can be downloaded at:
https://huggingface.co/SerezD/mi_ml_gen/tree/main/datasets

Datasets for the downstream tasks can be generated with the script:
`mi_ml_gen/data/create_image_beton_file.py`
## View Generation

If you are only interested in using the repository for view generation, simply run:

```shell
python mi_ml_gen/src/scripts/view_generation.py --configuration ./conf.yaml --save_folder ./tmp/
```

Examples of valid configurations are:
- `mi_ml_gen/configurations/view_generation/bigbigan.yaml`
- `mi_ml_gen/configurations/view_generation/stylegan.yaml`

These configurations generate positive views of images sampled from the corresponding generator.
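Conceptually, a positive view comes from perturbing a subset of the generator's latent chunks while leaving the others fixed. The sketch below is stdlib-only and purely illustrative (the helper name, chunk layout, and noise scale are assumptions, not the repo's API, which is driven by the YAML configuration):

```python
import random

def perturb_chunks(z, chunk_size, chunks_to_perturb, sigma=0.5, rng=None):
    """Return a copy of latent vector z with Gaussian noise added
    only to the selected chunks (hypothetical helper)."""
    rng = rng or random.Random(0)
    view = list(z)
    for c in chunks_to_perturb:
        for i in range(c * chunk_size, (c + 1) * chunk_size):
            view[i] += rng.gauss(0.0, sigma)
    return view

z = [0.0] * 8  # toy latent: 4 chunks of size 2
view = perturb_chunks(z, chunk_size=2, chunks_to_perturb=[1, 3])
# chunks 0 and 2 (indices 0-1 and 4-5) stay identical to z
```

Decoding both `z` and `view` through the same generator then yields two semantically related images that can serve as a positive pair.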
## Training the Walkers

To train the walkers, which decide which latent chunks to perturb, see the script at `mi_ml_gen/src/noise_maker/cop_gen_training/train_navigator.py`.

For example, to train a walker on all BigBiGAN latents except the first, run:

```shell
python train_navigator.py --generator bigbigan --g_path ./runs/BigBiGAN_x1.pth --chunks 1_6
```

Note: the learning rates for the walkers and the InfoNCE loss may vary depending on the selected chunks, generator, and batch size. A rule of thumb is to keep them low (on the order of 1e-5), allowing smooth learning of the walkers.
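For reference, the InfoNCE objective scores the positive pair against a set of negatives via a temperature-scaled softmax over cosine similarities. A minimal stdlib-only sketch of the per-anchor loss (function names and the toy vectors are illustrative, not the repo's implementation):

```python
import math

def cosine(u, v):
    # cosine similarity between two (non-zero) vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: cross-entropy of the softmax
    over similarities, with the positive as the correct class."""
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    logits = [s / temperature for s in sims]
    m = max(logits)  # subtract max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[0]

loss = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
```

The loss shrinks as the positive moves closer to the anchor relative to the negatives, which is what the walker is optimizing through the generator.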
The pre-trained walkers used in the experiments are available at: https://huggingface.co/SerezD/mi_ml_gen/tree/main/runs/walkers/
## Monte Carlo Simulation (Table 1)

To reproduce Table 1 in the paper (Monte Carlo simulation), run the script at `mi_ml_gen/src/noise_maker/delta_estimation/monte_carlo_simulation.py`.
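The exact quantities being estimated are defined in the script above; as a generic illustration of the technique, a Monte Carlo simulation approximates an expectation by averaging a function over many i.i.d. samples (all names below are illustrative):

```python
import random

def monte_carlo_estimate(f, sample, n=100_000, seed=0):
    """Estimate E[f(X)] as the average of f over n i.i.d. draws
    produced by sample(rng)."""
    rng = random.Random(seed)
    return sum(f(sample(rng)) for _ in range(n)) / n

# sanity check: E[Z^2] = 1 for Z ~ N(0, 1)
est = monte_carlo_estimate(lambda z: z * z, lambda rng: rng.gauss(0.0, 1.0))
```

By the law of large numbers the estimate converges at a rate of roughly 1/sqrt(n), so increasing `n` tightens the Table 1 figures at the cost of runtime.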
## Encoder Training

For the SimCLR results, we follow the experimental procedure of previous work, available at: https://github.com/LiYinqi/COP-Gen/tree/master

Therefore, this repo contains only the code for training SimSiam and BYOL models, generating data with continuous sampling.
The configuration `.yaml` file for each model, containing the training hyperparameters, can be found under `mi_ml_gen/configurations/encoders`.

To train a new encoder from scratch, run the script `mi_ml_gen/src/multiview_encoders/train_encoder.py`. For example, to train a SimSiam encoder on the ImageNet-1K dataset (real data), run:

```shell
python mi_ml_gen/src/multiview_encoders/train_encoder.py --seed 0 --encoder simsiam --conf simsiam_bigbigan/encoder_imagenet_baseline_real --data_path ./datasets/imagenet/ffcv/ --logging
```

Note: please check the script file for a description of each argument.
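As background, SimSiam minimizes a symmetrized negative cosine similarity between each view's prediction and the other view's representation, where the representation branch receives no gradient (stop-gradient). A stdlib-only sketch of the loss (names are illustrative; the repo's training loop is in the script above):

```python
import math

def neg_cosine(p, z):
    """Negative cosine similarity; in SimSiam, z comes from the
    stop-gradient branch, so no gradient flows through it."""
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)

def simsiam_loss(p1, z2, p2, z1):
    # symmetrized over the two augmented views
    return 0.5 * neg_cosine(p1, z2) + 0.5 * neg_cosine(p2, z1)

loss = simsiam_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0])
# perfectly aligned predictions give the minimum value -1
```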
The pre-trained encoders used in the experiments are available at: https://huggingface.co/SerezD/mi_ml_gen/tree/main/runs/encoders/
## Downstream Classification

After an encoder has been trained, a small MLP network can be trained for classification on several downstream datasets.

The configuration `.yaml` files, containing the training hyperparameters, can be found under `mi_ml_gen/configurations/classifiers`.
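At this stage the encoder is frozen and only a small head is trained on its features. The sketch below shows the forward pass of such a head, reduced to a single softmax linear layer for brevity (all names and numbers are illustrative; the actual head and hyperparameters come from the classifier configurations):

```python
import math

def linear_probe(features, weights, biases):
    """Softmax linear classifier on frozen encoder features
    (hypothetical helper for illustration only)."""
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]  # class probabilities

probs = linear_probe([1.0, 2.0], [[0.5, -0.2], [0.1, 0.3]], [0.0, 0.0])
# probabilities over the downstream classes, summing to 1
```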
To train the classifiers, run the script at `mi_ml_gen/src/evaluations/classification/train_classifier.py`. For example, to train a classifier on top of a BYOL pre-trained encoder for the StanfordCars downstream task:

```shell
python mi_ml_gen/src/evaluations/classification/train_classifier.py --encoder_path './runs/encoder_lsun_baseline_real/last.ckpt' --data_path './datasets/StanfordCars/ffcv' --conf 'classifier_lsun' --dataset 'StanfordCars' --run_name 'tmp' --seed 0 --logging
```
Note: please check the script file for a description of each argument.
Pre-trained linear classifiers are available at: https://huggingface.co/SerezD/mi_ml_gen/tree/main/runs/linear_classifiers/
## Evaluation

As a last step, evaluation on the test set can be run with the script `mi_ml_gen/src/evaluations/classification/eval_classifier.py`. For example:

```shell
python mi_ml_gen/src/evaluations/classification/eval_classifier.py --lin_cls_path './runs/LinCls-StanfordCars-encoder_lsun_chunks_learned_classifier/last.ckpt' --data_path './datasets/StanfordCars/ffcv/' --dataset 'StanfordCars' --batch_size 16 --out_log_file tmp
```
## Training Speed Comparison (Figure 5)

To compare training speed and reproduce the results of Figure 5, run:

```shell
python mi_ml_gen/src/test_online_learning/train_offline.py --loader [ffcv, torch] --dataset_path ./datasets/imagenet/
```

where `./datasets/imagenet/` contains `train/*.png` files and `ffcv/train.beton`.

Then run:

```shell
python mi_ml_gen/src/test_online_learning/train_online.py --generator_path ./runs/BigBiGAN_x1.pth
```
## Citation

```bibtex
@article{
serez2025a,
title={A Mutual Information Perspective on Multiple Latent Variable Generative Models for Positive View Generation},
author={Dario Serez and Marco Cristani and Alessio Del Bue and Vittorio Murino and Pietro Morerio},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
year={2025},
url={https://openreview.net/forum?id=uaj8ZL2PtK},
note={}
}
```
