🧠 Visual Decoding from EEG

Author

Faculty Guide: Prof. Arnav Bhaskar


📝 Overview

This project explores decoding EEG (electroencephalogram) signals to reconstruct the visual stimuli experienced by the brain, using text and image generation models. It combines deep learning with multi-modal alignment to generate semantically faithful image reconstructions from brain signals.

This work opens new pathways in brain-computer interfaces, neuroscience, and thought-driven AI systems.


🎯 Objectives

  • EEG-based Textual Encoding: Extract meaningful embeddings from EEG data.
  • Image Reconstruction: Use generated captions (BLIP-2) and diffusion-based image synthesis (Stable Diffusion) to reconstruct what the subject saw.
  • Direct Thought-to-Image: Create an end-to-end pipeline from EEG → Text → Image.

🧠 Dataset

  • EEG Signals: 16,740 EEG samples (17 channels, 100 timepoints each).
  • Images: the 16,740 corresponding stimulus images, shown to 10 subjects.
  • Labels: class labels used for supervised and alignment training (a shape-check sketch follows below).
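
The exact on-disk data layout is not documented here, so the snippet below is only a minimal sanity check, assuming the EEG windows and labels have been exported to NumPy arrays with hypothetical file names; it simply verifies the shapes listed above.

```python
import numpy as np

# Hypothetical file names; the repository's actual data layout may differ.
eeg = np.load("eeg_samples.npy")    # expected shape: (16740, 17, 100)
labels = np.load("labels.npy")      # expected shape: (16740,)

assert eeg.shape == (16740, 17, 100), f"unexpected EEG shape {eeg.shape}"
assert labels.shape[0] == eeg.shape[0], "expected one label per EEG sample"

print(f"{eeg.shape[0]} samples, {eeg.shape[1]} channels, {eeg.shape[2]} timepoints")
```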

🔧 Methodology

🔹 Step 1: EEG Embedding (VAE)

  • A variational autoencoder (VAE) trained on the DEAP dataset extracts EEG embeddings.
  • This yields a compact, meaningful representation of the raw signal (a minimal encoder sketch follows below).
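
The README does not specify the VAE architecture, so the following is a minimal sketch, assuming a fully connected encoder/decoder over flattened 17 × 100 EEG windows and an illustrative 128-dimensional latent; the layer sizes are assumptions, not the project's actual configuration.

```python
import torch
import torch.nn as nn

class EEGVAE(nn.Module):
    """Minimal VAE for (17 channels x 100 timepoints) EEG windows."""

    def __init__(self, in_dim=17 * 100, hidden=512, latent=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(in_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.decoder = nn.Sequential(
            nn.Linear(latent, hidden), nn.ReLU(), nn.Linear(hidden, in_dim)
        )

    def forward(self, x):                                # x: (batch, 17, 100)
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterisation trick
        recon = self.decoder(z).view(x.shape)
        return recon, mu, logvar

def vae_loss(recon, x, mu, logvar, beta=1.0):
    """Reconstruction term plus KL divergence to the standard normal prior."""
    rec = nn.functional.mse_loss(recon, x)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + beta * kld
```

At inference time the mean vector `mu` can serve as the compact EEG embedding consumed by the later stages.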

🔹 Step 2: Caption Generation (BLIP-2)

  • BLIP-2 generates a caption for the original stimulus image (a captioning sketch follows the example below).

🧾 "A small armadillo walking on the dirt"

🔹 Step 3: Cross-Modal Alignment (CLIP / Masked CLIP)

  • Align EEG and text embeddings with a CLIP-style contrastive objective.
  • Training pulls matching EEG-text pairs into a common latent space (a loss sketch follows below).
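
The alignment objective is the standard symmetric CLIP-style contrastive loss; this sketch assumes EEG and text embeddings that have already been projected to a shared dimension.

```python
import torch
import torch.nn.functional as F

def clip_alignment_loss(eeg_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss: matching EEG/text pairs lie on the diagonal."""
    eeg = F.normalize(eeg_emb, dim=-1)                  # (batch, d)
    txt = F.normalize(text_emb, dim=-1)                 # (batch, d)
    logits = eeg @ txt.t() / temperature                # (batch, batch) similarity matrix
    targets = torch.arange(eeg.size(0), device=eeg.device)
    loss_e2t = F.cross_entropy(logits, targets)         # EEG -> text direction
    loss_t2e = F.cross_entropy(logits.t(), targets)     # text -> EEG direction
    return 0.5 * (loss_e2t + loss_t2e)
```

The "strong diagonal" reported in the results corresponds to the diagonal of this similarity matrix, where matching EEG-text pairs should score highest after training.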

🔹 Step 4: Text Generation (GPT-2)

  • GPT-2 decodes EEG embeddings into text via autoregressive generation (a decoding sketch follows the example below).

🧠 ➡️ GPT-2 ➡️ "A baby armadillo in its enclosure at the zoo"
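
The README does not describe how the EEG embedding conditions GPT-2, so this is a sketch of one common scheme, assuming the aligned 128-dimensional EEG latent is projected into GPT-2's embedding space and used as a prefix for greedy decoding; `eeg_to_prefix` is a hypothetical module that would be trained (with teacher forcing against the BLIP-2 captions) before it produces meaningful text.

```python
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
gpt2 = GPT2LMHeadModel.from_pretrained("gpt2").eval()

# Hypothetical projection from the 128-d aligned EEG latent into GPT-2's embedding space.
eeg_to_prefix = nn.Linear(128, gpt2.config.n_embd)

@torch.no_grad()
def decode_from_eeg(eeg_latent, max_new_tokens=20):
    prefix = eeg_to_prefix(eeg_latent).unsqueeze(1)      # (1, 1, n_embd) prefix "token"
    embeds = prefix
    generated = []
    for _ in range(max_new_tokens):
        logits = gpt2(inputs_embeds=embeds).logits[:, -1, :]
        next_id = logits.argmax(dim=-1)                  # greedy decoding
        if next_id.item() == tokenizer.eos_token_id:
            break
        generated.append(next_id.item())
        next_embed = gpt2.transformer.wte(next_id).unsqueeze(1)
        embeds = torch.cat([embeds, next_embed], dim=1)
    return tokenizer.decode(generated)

print(decode_from_eeg(torch.randn(1, 128)))              # untrained projection -> nonsense text
```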

🔹 Step 5: Depth Estimation (GCNN/GAT)

  • A graph CNN (or graph attention network) over the EEG channels captures spatial relations between electrodes and predicts depth features for the image (see the sketch below).
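
A minimal sketch of the idea, assuming the 17 electrodes are graph nodes connected by a hypothetical adjacency matrix (electrode neighbourhood) and a coarse 8 × 8 depth-map head; the project may instead use attention-based GAT layers and a finer depth resolution.

```python
import torch
import torch.nn as nn

class SimpleGCNLayer(nn.Module):
    """One graph-convolution step: average each electrode's neighbours, then project."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):                  # x: (batch, 17, in_dim), adj: (17, 17)
        deg = adj.sum(-1, keepdim=True).clamp(min=1)
        h = adj @ x / deg                       # mean aggregation over neighbouring channels
        return torch.relu(self.lin(h))

class EEGDepthNet(nn.Module):
    """Maps per-channel EEG features to a coarse depth map (8 x 8 here, an assumption)."""

    def __init__(self, timepoints=100, hidden=64, depth_hw=8):
        super().__init__()
        self.gcn1 = SimpleGCNLayer(timepoints, hidden)
        self.gcn2 = SimpleGCNLayer(hidden, hidden)
        self.head = nn.Linear(17 * hidden, depth_hw * depth_hw)
        self.depth_hw = depth_hw

    def forward(self, eeg, adj):                # eeg: (batch, 17, 100)
        h = self.gcn2(self.gcn1(eeg, adj), adj)
        depth = self.head(h.flatten(1))
        return depth.view(-1, 1, self.depth_hw, self.depth_hw)
```

The predicted depth map would be upsampled and paired with the GPT-2 caption in the final diffusion step.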

🔹 Step 6: Image Reconstruction (Stable Diffusion)

  • The generated prompt and depth map are fed to Stable Diffusion (v2.1 base) to synthesize the visual output (see the sketch below).
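
A sketch of the synthesis step with the diffusers library and the v2.1 base checkpoint named above, using only the text prompt; conditioning on the predicted depth map would additionally require a depth-aware pipeline (for example the stable-diffusion-2-depth checkpoint), which is an assumption this README does not confirm.

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base", torch_dtype=torch.float16
).to("cuda")

prompt = "a baby armadillo in its enclosure at the zoo"   # GPT-2 text decoded from EEG
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("reconstruction.png")
```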

🧩 Model Architecture

[Model architecture diagram]


📊 Results

✅ Caption Alignment Results

| EEG Caption (GPT-2) | BLIP Caption | ROUGE Score |
|---|---|---|
| "a man holding an accordion..." | "a person playing an accordion..." | 0.44 |
| "a floral air mattress..." | "an air mattress with a floral pattern..." | 0.52 |

✅ Image Reconstruction Results

| Original Image Caption (BLIP) | Generated Text (EEG → GPT-2) | SSIM |
|---|---|---|
| "a small armadillo walking in the dirt" | "a baby armadillo enclosure at the zoo" | 11.02% |
| "a group of people riding on a boat" | "a group of people in an airboat" | 14.32% |

(The EEG signal, original image, and reconstructed image columns are image figures not reproduced here.)

🔬 Quantitative Analysis

  • CLIP Loss: dropped from 3.48 to 0.12 over 30 training epochs.
  • Cosine Similarity Matrix: strong diagonal, indicating high EEG-text alignment.
  • ROUGE Scores: ROUGE-1 between 0.44 and 0.52.
  • SSIM: pixel-level similarity remains low (~10–15%), but the reconstructions are semantically accurate (a metric sketch follows below).
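
A sketch of how the two reported metrics can be computed, assuming the rouge-score and scikit-image packages; the project's actual evaluation scripts may differ.

```python
import numpy as np
from rouge_score import rouge_scorer
from skimage.metrics import structural_similarity as ssim

# ROUGE-1 between the EEG-decoded caption and the BLIP reference caption.
scorer = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True)
rouge1 = scorer.score(
    "a person playing an accordion",          # reference (BLIP caption, shortened)
    "a man holding an accordion",             # candidate (EEG -> GPT-2 caption, shortened)
)["rouge1"].fmeasure
print(f"ROUGE-1 F1: {rouge1:.2f}")

# SSIM between the original stimulus and the reconstruction (grayscale, same size).
original = np.random.rand(256, 256)           # placeholders for the real image arrays
reconstruction = np.random.rand(256, 256)
print(f"SSIM: {ssim(original, reconstruction, data_range=1.0):.4f}")
```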

📚 References


🙏 Acknowledgements

Special thanks to our guide, Prof. Arnav Bhaskar, for the constant support and insights.


About

Visual Decoding from EEG explores reconstructing images from EEG signals using models such as BLIP, MiDaS, VAE, CLIP, and Stable Diffusion. This multimodal pipeline aligns EEG, textual, and depth features to decode visual stimuli, advancing brain-computer interfaces and assistive technology.
