A script that embeds all of the text and image data with CLIP and saves the embeddings in the Drive folder.
Requires the image data to be unzipped and all .ini files to be deleted.
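
The .ini cleanup mentioned above could be scripted rather than done by hand. The following is only a sketch: the helper name `remove_ini_files` is hypothetical, and it assumes that every `*.ini` file under the given folder is safe to delete.

```python
from pathlib import Path


def remove_ini_files(root: Path) -> list[Path]:
    """Delete every *.ini file under `root` (recursively) and return the removed paths.

    Hypothetical helper, not part of the original notebook.
    """
    removed = []
    for ini_path in sorted(Path(root).rglob("*.ini")):
        if ini_path.is_file():
            ini_path.unlink()          # permanently deletes the file
            removed.append(ini_path)
    return removed
```

For example, `remove_ini_files(Path("data") / "experiment-images")` would clean the image folder used below. Printing the returned list is an easy way to see what was removed.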

### Setup

In [None]:
%pip install git+https://github.com/openai/CLIP.git

Defaulting to user installation because normal site-packages is not writeable
Collecting git+https://github.com/openai/CLIP.git
  Cloning https://github.com/openai/CLIP.git to c:\users\talgo\appdata\local\temp\pip-req-build-1e2s8d6x
  Resolved https://github.com/openai/CLIP.git to commit dcba3cb2e2827b402d2701e7e1c7d9fed8a20ef1
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'done'
Note: you may need to restart the kernel to use updated packages.


  Running command git clone --filter=blob:none --quiet https://github.com/openai/CLIP.git 'C:\Users\talgo\AppData\Local\Temp\pip-req-build-1e2s8d6x'

[notice] A new release of pip is available: 25.0.1 -> 25.2
[notice] To update, run: C:\Users\talgo\AppData\Local\Microsoft\WindowsApps\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\python.exe -m pip install --upgrade pip


In [None]:
%pip install python-dotenv

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.




In [17]:
# ----- stdlib & deps ------------------------------------------------------
import os

import numpy as np
import torch
from pathlib import Path
from PIL import Image
from tqdm import tqdm
import clip
from dotenv import load_dotenv
# -------------------------------------------------------------------------

# ----- get project dir from env -----
load_dotenv()  # this loads variables from .env into os.environ

BRAIN_DECODER_DIR = os.environ.get("BRAIN_DECODER_DIR")
if not BRAIN_DECODER_DIR:
    raise EnvironmentError(
        "BRAIN_DECODER_DIR is not set in your .env file.\n"
        "Please add it, e.g.:\nBRAIN_DECODER_DIR=G:/My Drive/brain-decoder-files"
    )

# ----- paths --------------------------------------------------------------
PROJ_DIR = Path.cwd()
DATA_DIR = PROJ_DIR / "data"
DRIVE_DATA_FOLDER = Path(BRAIN_DECODER_DIR).expanduser().resolve()
if not DRIVE_DATA_FOLDER.exists():
    raise FileNotFoundError(f"Drive data folder not found: {DRIVE_DATA_FOLDER}")

text_npz   = DRIVE_DATA_FOLDER / "clip_text_embeddings.npz"
image_npz  = DRIVE_DATA_FOLDER / "clip_image_embeddings.npz"   # new: raw vectors
images_dir = DATA_DIR / "experiment-images"
assert images_dir.is_dir(), f"{images_dir} not found!"

# ----- load CLIP model ----------------------------------------------------
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-L/14@336px", device=device)  # returns 768-d vectors
model.eval()

# ----- concept labels -----------------------------------------------------
concepts = np.genfromtxt(DRIVE_DATA_FOLDER / "concepts.txt", dtype=str)
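
`np.genfromtxt` splits fields on whitespace by default, so a label containing a space may not come back as a single string. If `concepts.txt` holds one concept per line (an assumption, since the file format is not stated here), a plain line reader is a possible alternative. The helper name `load_concepts` is hypothetical.

```python
from pathlib import Path


def load_concepts(path: Path) -> list[str]:
    """Return one concept per non-empty line, with surrounding whitespace stripped.

    Hypothetical helper; assumes a UTF-8 file with one concept per line.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```

Note that this returns a Python list rather than a NumPy array, which is enough for the iteration done in the cells below.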


### Embed textual data

In [None]:
with torch.no_grad():
    toks = clip.tokenize([f"A photo of {c}" for c in concepts]).to(device)
    txt  = model.encode_text(toks).float()  # cast to fp32 before normalizing, as in the image cell
    txt  = txt / txt.norm(dim=-1, keepdim=True)

np.savez_compressed(text_npz, data=txt.cpu().numpy().astype(np.float32))
print("Text vectors saved to", text_npz)
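
Since the vectors are divided by their norm before saving, a quick sanity check on the saved array is that every row has length close to 1. The sketch below assumes NumPy only. The helper name `is_unit_norm` and the tolerance are illustrative choices, not part of the original notebook.

```python
import numpy as np


def is_unit_norm(vecs: np.ndarray, atol: float = 1e-3) -> bool:
    """Return True if every row of `vecs` has an L2 norm within `atol` of 1."""
    norms = np.linalg.norm(vecs.astype(np.float32), axis=-1)
    return bool(np.allclose(norms, 1.0, atol=atol))
```

For the file written above, the call would look like `is_unit_norm(np.load(text_npz)["data"])`. The tolerance is deliberately loose because half-precision arithmetic on a GPU can leave small deviations.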

### Embed image data

In [None]:
# sort each folder so the row order of the saved embeddings is reproducible
img_paths = [(con, p) for con in concepts for p in sorted((images_dir / con).glob("*.jpg"))]

BATCH = 64
all_vecs, all_cons, all_files = [], [], []

for i in tqdm(range(0, len(img_paths), BATCH), desc="encoding images"):
    batch_paths = img_paths[i : i + BATCH]

    imgs = [preprocess(Image.open(p).convert("RGB")) for _, p in batch_paths]
    imgs = torch.stack(imgs).to(device)

    with torch.no_grad():
        vecs = model.encode_image(imgs).float()
    vecs = vecs / vecs.norm(dim=-1, keepdim=True)

    all_vecs.append(vecs.cpu())
    all_cons.extend([con for con, _ in batch_paths])
    all_files.extend([p.name for _, p in batch_paths])

embeddings = torch.cat(all_vecs).numpy().astype(np.float32)

np.savez_compressed(
    image_npz,
    embeddings=embeddings,
    concepts=np.array(all_cons),
    filenames=np.array(all_files),
)
print("Image vectors saved to", image_npz)


encoding images: 100%|██████████| 17/17 [36:01<00:00, 127.12s/it]

Image vectors saved to  G:\.shortcut-targets-by-id\1CwmFOsYFnq6t33KAzpvw0gaOTQXbcozs\brain-decoder-files\clip_image_embeddings.npz
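
As a rough illustration of how the two archives could be read back and compared, the sketch below loads them with the keys used in the cells above (`data` for text; `embeddings`, `concepts`, `filenames` for images) and picks the closest text vector for each image. The function names are hypothetical. It assumes that row `i` of the text array corresponds to the `i`-th entry of the concept list and that all rows are unit-length, so a dot product equals cosine similarity.

```python
from pathlib import Path

import numpy as np


def load_embeddings(text_npz: Path, image_npz: Path):
    """Read both archives using the keys written by the notebook cells."""
    with np.load(text_npz) as t:
        text_vecs = t["data"]
    with np.load(image_npz) as im:
        return text_vecs, im["embeddings"], im["concepts"], im["filenames"]


def nearest_concept(image_vecs, text_vecs, concepts):
    """For each image row, return the concept whose text vector scores highest."""
    scores = image_vecs @ text_vecs.T   # cosine similarity for unit-length rows
    return [concepts[j] for j in scores.argmax(axis=1)]
```

Comparing the output of `nearest_concept` with the `concepts` array stored alongside the image embeddings would give a simple zero-shot accuracy estimate.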



