aorliche/demo-vae

Demographic-Conditioned and Decorrelated Variational Autoencoder (DemoVAE)

A variational autoencoder that generates synthetic subject functional connectivity (FC) and other fixed-length fMRI modalities. Uses include distribution sampling, confound removal, demographic change, and site harmonization.

Installable pip package

Run

pip install demovae

to install demovae along with basic dependencies (numpy, scikit-learn, and torch).

Check the file Pip3TestSample.ipynb for a basic idea of how to run the code.

Site harmonization may be achieved by fitting the model and then transforming the original data with changed "demographics" (i.e., site codes).
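The idea behind harmonization (encode under the true site code, then decode under a common one) can be illustrated with a toy stand-in. The sketch below is not DemoVAE and does not use its API: `ToySiteModel`, `encode`, and `decode` are hypothetical names, and the "model" is just a per-site mean offset on scalar data, chosen so the effect of swapping the site code is easy to see.

```python
from collections import defaultdict


class ToySiteModel:
    """Toy stand-in for a conditional model: each site adds a constant offset.

    Hypothetical class for illustration only; not part of demovae.
    """

    def fit(self, x, sites):
        # Estimate one mean per site from the training data.
        sums, counts = defaultdict(float), defaultdict(int)
        for value, site in zip(x, sites):
            sums[site] += value
            counts[site] += 1
        self.site_mean_ = {s: sums[s] / counts[s] for s in sums}
        return self

    def encode(self, x, sites):
        # The "latent" is the value with the site effect removed.
        return [v - self.site_mean_[s] for v, s in zip(x, sites)]

    def decode(self, z, sites):
        # Reconstruct under whatever site code is supplied.
        return [v + self.site_mean_[s] for v, s in zip(z, sites)]


# Two sites measuring the same kind of subjects with different offsets.
x = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
sites = ["A", "A", "A", "B", "B", "B"]

model = ToySiteModel().fit(x, sites)
z = model.encode(x, sites)                     # site-free representation
harmonized = model.decode(z, ["A"] * len(x))   # every subject decoded as site A
print(harmonized)
```

After the swap, the site B subjects keep their individual variation but lose the site offset. With DemoVAE the same role would be played by the fitted model and its transform step, with site codes passed as the demographic variables.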

Check this file in the pip subdirectory to see all of the configuration parameters you can set, e.g.:

    @staticmethod
    def get_default_params():
        return dict(latent_dim=60,      # Latent dimension
                use_cuda=True,          # GPU acceleration
                nepochs=3000,           # Training epochs
                pperiod=100,            # Epochs between printing updates 
                bsize=1000,             # Batch size
                loss_C_mult=1,          # Covariance loss (KL div)
                loss_mu_mult=1,         # Mean loss (KL div)
                loss_rec_mult=100,      # Reconstruction loss
                loss_decor_mult=10,     # Latent-demographic decorrelation loss
                loss_pred_mult=0.001,   # Classifier/regressor guidance loss
                alpha=100,              # Regularization for continuous guidance models
                LR_C=100,               # Regularization for categorical guidance models
                lr=1e-4,                # Learning rate
                weight_decay=0,         # L2 regularization for VAE model
                )
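One way to work with these defaults is to copy them and override only the values you need. The sketch below re-declares the defaults from the listing above as a standalone function so it runs without the package installed; `make_params` is a hypothetical helper, not part of demovae, and how the resulting dictionary is passed to `DemoVAE` should be checked against the package source.

```python
def get_default_params():
    """Defaults copied from the listing above (standalone copy for illustration)."""
    return dict(latent_dim=60, use_cuda=True, nepochs=3000, pperiod=100,
                bsize=1000, loss_C_mult=1, loss_mu_mult=1, loss_rec_mult=100,
                loss_decor_mult=10, loss_pred_mult=0.001, alpha=100,
                LR_C=100, lr=1e-4, weight_decay=0)


def make_params(**overrides):
    """Hypothetical helper: merge overrides into the defaults, rejecting typos."""
    params = get_default_params()
    unknown = set(overrides) - set(params)
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    params.update(overrides)
    return params


# Example: smaller latent space, CPU only, shorter training run.
params = make_params(latent_dim=30, use_cuda=False, nepochs=500)
print(params["latent_dim"], params["use_cuda"], params["nepochs"])
```

Rejecting unknown keys is a design choice of this sketch: a misspelled name such as `latent_dims` fails loudly instead of being silently ignored.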

API

The DemoVAE class follows scikit-learn API conventions. The constructor and the following methods are available:

DemoVAE
fit
transform
fit_transform
get_latents
save
load

Check the pip/src/demovae directory for how to use them.

Features

We condition a variational autoencoder on demographic data and, at the same time, train it to decorrelate its latent features from these demographics.
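To make "decorrelate" concrete, here is a minimal sketch of one possible penalty: the mean squared Pearson correlation between each latent dimension and a demographic variable. This is an assumption for illustration, written in plain Python; the loss DemoVAE actually uses may be defined differently, and `pearson` and `decorrelation_penalty` are hypothetical names.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def decorrelation_penalty(latents, demo):
    """Mean squared correlation between each latent dimension and one demographic.

    latents: list of rows (subjects x latent dims); demo: one value per subject.
    """
    n_dims = len(latents[0])
    total = 0.0
    for j in range(n_dims):
        column = [row[j] for row in latents]
        total += pearson(column, demo) ** 2
    return total / n_dims


age = [8.0, 10.0, 12.0, 14.0, 16.0]
leaky = [[a * 0.5, 1.0] for a in age]  # dim 0 tracks age exactly, dim 1 is constant
clean = [[1.0, -1.0], [-1.0, 1.0], [0.0, 0.0], [-1.0, 1.0], [1.0, -1.0]]

print(decorrelation_penalty(leaky, age))  # large: one dimension encodes age
print(decorrelation_penalty(clean, age))  # near zero: no linear relation to age
```

A penalty of this kind is driven toward zero during training, so that the latents carry as little linear information about the demographic as possible while the decoder still receives the demographic as an explicit input.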

This is important because many datasets are skewed with respect to demographics.

A model that predicts from fMRI may be detecting the phenotype of interest, or it may be detecting a demographic (age, sex, race, etc.) and inferring the phenotype from the demographic bias inherent in the dataset.

In some cases, decorrelating latents from demographic information destroys predictive ability.

We find that most clinical and computerized battery fields in the PNC and BSNIP datasets are biased by demographics. Once the latent features are decorrelated by DemoVAE, these fields are no longer significantly correlated with the fMRI data.

The exceptions are antipsychotic medication use in the BSNIP dataset and PANSS scores for the severity of schizophrenia symptoms, both of which remain significantly correlated after demographic decorrelation.

We find that DemoVAE creates high-quality fMRI functional connectivity (FC) samples that represent the full distribution of subject fMRI.


Additionally, it correctly captures the group differences of the demographic features it is trained on.

Prediction using models trained on synthetic DemoVAE data is almost as good as prediction using models trained on real data.

Preprint manuscript available at: arXiv.
Submitted to IEEE journal.

Personal website: https://aorliche.github.io/
Lab website: https://www2.tulane.edu/~wyp/
Email me
