Yasheng Sun, Bohan Li, Mingchen Zhuge, Deng-Ping Fan, Salman Khan, Fahad Shahbaz Khan, Hideki Koike
We aim to develop a straightforward framework that uses other modalities, such as natural language, to translate the original “dreamland”. We present DreamConnect, employing a dual-stream diffusion framework to manipulate visually stimulated brain signals. By integrating an asynchronous diffusion strategy, our framework establishes an effective interface with human “dreams”, progressively refining their final imagery synthesis.
- [2024/08]: Paper is on arXiv.
- [2024/12]: Paper is accepted by Visual Intelligence.
a. Create a conda virtual environment and activate it. Python >= 3.7 is required as the base environment.
conda create -n sssp python=3.7 -y
conda activate sssp
b. Install PyTorch and torchvision following the official instructions.
conda install pytorch==1.10.0 torchvision==0.8.2 -c pytorch -c conda-forge
c. Install other dependencies. We simply froze our environment; other environments might also work. A requirements.txt file is provided for reference.
pip install -r requirements.txt
Note that transformers==1.19.2 is strictly required.
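Because the pins above are described as strict, a quick check of the active environment can catch a mismatch before any model is loaded. The following is a minimal sketch, not part of the repository: the helper name `check_pins` is hypothetical, the version strings are copied from the commands above, and it assumes the conda `pytorch` package exposes its metadata under the distribution name `torch`.

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python >= 3.8
except ImportError:
    # Python 3.7 has no importlib.metadata; fall back to the backport package.
    from importlib_metadata import PackageNotFoundError, version

# Pins copied from the installation steps above.
PINNED = {"torch": "1.10.0", "torchvision": "0.8.2", "transformers": "1.19.2"}


def check_pins(pins, get_version=version):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None  # the package is not installed at all
        # Local build tags such as "+cu113" are ignored for the comparison.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

Run inside the activated `sssp` environment, the script prints one line per package whose installed version differs from the pin, and prints nothing when everything matches.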
- Agree to the Natural Scenes Dataset's Terms and Conditions and fill out the NSD Data Access form.
- Our Customized Dataset. The editing instructions are located in the third_party/StableDiffusionReconstruction/codes/utils/misc directory. The images obtained after applying the instructions can be downloaded from nsd_coco_output.tar.
- Download the pretrained model and put it at logs/train_res_inject_idback_train_res_inject_idback_2024-04-22/checkpoints/ckpt_epoch_50/mp_rank_00_model_states.pt accordingly.
Open the configuration file located at configs/test/test_res_value_inject_idback_css15.yaml and update the checkpoint path to match the paths of the downloaded models.
- Download the pre-trained language-based instruction model provided by InstructDiffusion.
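The README does not name the configuration keys that hold checkpoint paths, so it can help to list the candidate lines before editing. The sketch below is an illustration only: the helper `find_checkpoint_lines` and its marker strings are assumptions rather than repository code, and it treats the YAML file as plain text instead of parsing it.

```python
from pathlib import Path


def find_checkpoint_lines(config_text, markers=("ckpt", ".pt", "checkpoint")):
    """Return (line_number, stripped_line) pairs that mention any marker."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        if any(marker in line for marker in markers):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Config path taken from the step above; run from the repository root.
    config = Path("configs/test/test_res_value_inject_idback_css15.yaml")
    if config.is_file():
        for number, line in find_checkpoint_lines(config.read_text()):
            print(f"{config}:{number}: {line}")
    else:
        print(f"{config} not found; run this from the repository root")
```

Each printed line is a candidate to update by hand; the markers are deliberately broad, so some hits may be unrelated to model checkpoints.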
For ease of use, we provide the following pre-aligned features:
- fMRI-aligned VAE features (fmri_vae.zip)
- Aligned image features (img_clip.zip)
- Aligned text features (text_clip.zip)
If you are interested in training an alignment model yourself, please follow the overall procedure of fMRI-reconstruction-NSD.
We provide our trained alignment models for img_clip and text_clip. Download them and place them in train_logs/latent_diffusion_image_fp32_resume/ and train_logs/latent_diffusion_text_fp32_resume2/, respectively.
Then run the commands below to reproduce the img_clip and text_clip files provided above.
bash experiments/diffusion_test.sh image
bash experiments/diffusion_test.sh text
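Before launching the scripts, it may be worth confirming that the downloaded files sit where this README says they should. The sketch below assumes it is run from the repository root; `missing_paths` is a hypothetical helper, and the path list is copied from the steps above rather than read from the scripts themselves.

```python
from pathlib import Path

# Paths quoted in this README, relative to the repository root.
EXPECTED = [
    "configs/test/test_res_value_inject_idback_css15.yaml",
    "logs/train_res_inject_idback_train_res_inject_idback_2024-04-22/"
    "checkpoints/ckpt_epoch_50/mp_rank_00_model_states.pt",
    "train_logs/latent_diffusion_image_fp32_resume",
    "train_logs/latent_diffusion_text_fp32_resume2",
]


def missing_paths(root, expected=EXPECTED):
    """Return the entries of `expected` that do not exist under `root`."""
    root = Path(root)
    return [path for path in expected if not (root / path).exists()]


if __name__ == "__main__":
    for path in missing_paths("."):
        print(f"missing: {path}")
```

An empty output means every listed file or directory was found; it does not verify that the files are complete or that the configuration points at them.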
Once the paths are updated, you can test the model by running the following command:
bash experiments/test_language_control.sh
Many thanks to these excellent open source projects:
- [InstructPix2Pix](https://github.com/timothybrooks/instruct-pix2pix)
- [fMRI-reconstruction-NSD](https://github.com/MedARC-AI/fMRI-reconstruction-NSD)
- [Versatile-Diffusion](https://github.com/SHI-Labs/Versatile-Diffusion)
- [Stable-Diffusion](https://github.com/CompVis/stable-diffusion)
- [InstructDiffusion](https://github.com/cientgu/InstructDiffusion)
If you find our paper and code useful for your research, please consider citing:
@misc{sun2024connectingdreamsvisualbrainstorming,
title={Connecting Dreams with Visual Brainstorming Instruction},
author={Yasheng Sun and Bohan Li and Mingchen Zhuge and Deng-Ping Fan and Salman Khan and Fahad Shahbaz Khan and Hideki Koike},
year={2025},
journal={Visual Intelligence}
}
