# SEARLE: Step 1 - Image Concepts Association on Kaggle

This notebook runs Step 1 of the SEARLE pipeline, image concepts association, on the FashionIQ validation split using the CLIP `ViT-B/32` model.

## Prerequisites
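1. Upload the `SEARLE` folder (it must contain the `src` directory) as a Dataset to Kaggle.
2. Attach the dataset to this notebook.
3. Attach the FashionIQ dataset as well. The run cell expects it at `/kaggle/input/fashioniq/FashionIQ`; adjust that path if yours differs.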


In [None]:
# 1. Install Dependencies
!pip install git+https://github.com/openai/CLIP.git
!pip install transformers pandas tqdm comet-ml

In [None]:
# 2. Setup Workspace
import shutil
import os
from pathlib import Path

# Define paths
INPUT_PATH = Path('/kaggle/input') 
WORKING_PATH = Path('/kaggle/working/SEARLE')

# Find SEARLE directory in input
searle_input = None
for p in INPUT_PATH.rglob('SEARLE'):
    if p.is_dir() and (p / 'src').exists():
        searle_input = p
        break

if searle_input:
    print(f"Found SEARLE at {searle_input}")
    # Copy to working directory to allow writing
    if not WORKING_PATH.exists():
        shutil.copytree(searle_input, WORKING_PATH)
        print("Copied SEARLE to working directory.")
    else:
        print("SEARLE already exists in working directory.")
else:
    # Stop here so later cells do not run against a missing directory
    raise FileNotFoundError("Could not find SEARLE directory in inputs! Please check dataset structure.")
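
The search in the cell above can be factored into a small helper, which makes it easier to try outside Kaggle. This is a minimal sketch, not part of SEARLE: `find_project_dir` and its parameters are hypothetical names. It sorts the candidates so the result is deterministic, whereas the cell above takes whichever match `rglob` yields first.

```python
from pathlib import Path
from typing import Optional


def find_project_dir(root: Path, name: str = "SEARLE", marker: str = "src") -> Optional[Path]:
    """Return the first directory called `name` under `root` that contains `marker`.

    Hypothetical helper: mirrors the loop in the setup cell, but returns None
    instead of printing when nothing matches.
    """
    for candidate in sorted(root.rglob(name)):
        # Only accept directories that actually hold the source tree
        if candidate.is_dir() and (candidate / marker).is_dir():
            return candidate
    return None
```

On Kaggle this could be called as `find_project_dir(Path('/kaggle/input'))`; a `None` result means the dataset is not attached or is laid out differently.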

In [None]:
# 3. Run Image Concepts Association
import os

os.chdir(WORKING_PATH)
print(f"Current working directory: {os.getcwd()}")

# Run the script
# NOTE: Make sure the dataset path matches your Kaggle input path for FashionIQ
!python src/image_concepts_association.py --clip-model-name "ViT-B/32" --dataset fashioniq --dataset-path "/kaggle/input/fashioniq/FashionIQ" --split val
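
If the dataset path is wrong, the script fails only after it has started. A small pre-check can catch this earlier. The sketch below assembles the same command line as the cell above and verifies that the dataset directory exists first. `build_step1_command` is a hypothetical helper; the flags are copied from the command above and no others are assumed.

```python
import sys
from pathlib import Path
from typing import List


def build_step1_command(dataset_path: str, split: str = "val",
                        clip_model: str = "ViT-B/32",
                        dataset: str = "fashioniq") -> List[str]:
    """Assemble the argument list for src/image_concepts_association.py.

    Hypothetical helper: raises early if the dataset directory is missing.
    """
    if not Path(dataset_path).is_dir():
        raise FileNotFoundError(f"Dataset path not found: {dataset_path}")
    return [
        sys.executable, "src/image_concepts_association.py",
        "--clip-model-name", clip_model,
        "--dataset", dataset,
        "--dataset-path", dataset_path,
        "--split", split,
    ]
```

Passing the result to `subprocess.run(cmd, check=True)` raises `CalledProcessError` on a non-zero exit code, so a failed run stops the notebook instead of silently continuing to the zip step.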

In [None]:
# 4. Zip Results for Download
!zip -r similar_concepts.zip data/similar_concepts

print("Done! You can now download similar_concepts.zip from the Output section.")
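
Before downloading, it can be worth confirming that the archive is readable and not empty. The following is a minimal sketch using the standard-library `zipfile` module; `list_archive_files` is a hypothetical name and makes no assumption about which files Step 1 writes.

```python
import zipfile
from typing import List


def list_archive_files(archive_path: str) -> List[str]:
    """Return the file entries in a zip archive after a CRC integrity check."""
    with zipfile.ZipFile(archive_path) as zf:
        bad_member = zf.testzip()  # name of the first corrupt member, or None
        if bad_member is not None:
            raise zipfile.BadZipFile(f"Corrupt member: {bad_member}")
        # Skip directory entries, which end with a slash
        return [name for name in zf.namelist() if not name.endswith("/")]
```

For example, `len(list_archive_files("similar_concepts.zip"))` being zero would suggest that `data/similar_concepts` was empty when the archive was created.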