# Concept Space Traversal

Concept space traversal means navigating the high-dimensional vector space in which data points (such as words, images, or other entities) are embedded according to their semantic relationships. Moving through this space lets you explore and manipulate the relationships that the embeddings capture, which enables applications like concept generation, similarity search, and analogy discovery.

Key points about concept space traversal:

1. **Embedding space:** Data points are represented as vectors in a high-dimensional space, where similar concepts are positioned close to each other based on their semantic relationships. This space is often referred to as the embedding space or concept space.

2. **Similarity measures:** The proximity or similarity between data points in the embedding space is typically measured using cosine similarity or Euclidean distance. Points that are closer together are considered more semantically similar.

3. **Traversal methods:** Concept space traversal involves moving from one point to another within the embedding space. This can be done through various methods, such as:

   - Linear interpolation: Creating intermediate points between two concepts by taking weighted averages of their vector representations. 

   - Vector arithmetic: Performing operations like addition or subtraction on concept vectors to find analogies or explore relationships.

   - Nearest neighbour search: Finding the closest points to a given concept vector to discover related concepts.
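
The three traversal methods above can be sketched in a few lines. This is a minimal illustration, not how any particular library implements them: the 3-dimensional "embeddings" and the helper names (`cosine_similarity`, `interpolate`, `subtract`, `nearest_neighbors`) are made up for this example, and it uses only the standard library so it runs anywhere. Real embeddings have hundreds of dimensions and would normally be handled with an array library.

```python
import math

# Toy 3-dimensional "embeddings" (hypothetical values, for illustration only)
embeddings = {
    "red":    [1.0, 0.0, 0.0],
    "orange": [0.8, 0.6, 0.0],
    "yellow": [0.0, 1.0, 0.0],
    "blue":   [0.0, 0.0, 1.0],
}

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def interpolate(a, b, t):
    """Linear interpolation: t=0 returns a, t=1 returns b."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

def subtract(a, b):
    """Vector arithmetic: element-wise a - b."""
    return [x - y for x, y in zip(a, b)]

def nearest_neighbors(query, k=2):
    """Names of the k embeddings most similar to the query vector."""
    ranked = sorted(
        embeddings,
        key=lambda name: cosine_similarity(query, embeddings[name]),
        reverse=True,
    )
    return ranked[:k]

# Linear interpolation: the point halfway between "red" and "yellow"
midpoint = interpolate(embeddings["red"], embeddings["yellow"], 0.5)
print(nearest_neighbors(midpoint, k=1))  # ['orange']

# Vector arithmetic: remove the "red" component from "orange"
shifted = subtract(embeddings["orange"], embeddings["red"])
print(nearest_neighbors(shifted, k=1))  # ['yellow']
```

In both cases the traversal produces a new vector that does not belong to any stored concept, and the nearest neighbour search maps it back to the closest concept that does exist.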


In [None]:
import os

import fiftyone as fo
import fiftyone.brain as fob
import fiftyone.zoo as foz

For this notebook, you'll need to manually download the ColorSwap dataset. You can find the dataset [here](https://github.com/Top34051/colorswap?tab=readme-ov-file).

Once the dataset is downloaded, you can get it into FiftyOne format.

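First, you will need to "unpack" the JSON annotation files that come with the dataset. Each entry describes a pair of images and their captions, so we split it into one record per image: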

In [None]:
import json
import os

images_path = "./colorswap"

# Load train.json
with open(os.path.join(images_path, "train.json"), "r") as f:
    train_data = json.load(f)

# Load test.json (read from the same download directory as train.json)
with open(os.path.join(images_path, "test.json"), "r") as f:
    test_data = json.load(f)

# Combine the two datasets
packed_annotations = train_data + test_data

unpacked_annotations = []

for item in packed_annotations:
    unpacked_annotations.append({
        "image_path": os.path.join(images_path, item["image_1"]),
        "caption": item["caption_1"],
        "image_source": item["image_source"]
    })
    unpacked_annotations.append({
        "image_path": os.path.join(images_path, item["image_2"]),
        "caption": item["caption_2"],
        "image_source": item["image_source"]
    })

Now that we have our custom labels, we can write helper functions that will define the schema for the FiftyOne dataset object:

In [None]:
import fiftyone as fo
import fiftyone.core.fields as fof
import os

def create_colorswap_dataset(name) -> fo.Dataset:
    """
    Creates an empty, persistent FiftyOne dataset and declares its custom fields.

    Note: any existing dataset with the same name is overwritten.
    """
    dataset = fo.Dataset(name=name, persistent=True, overwrite=True)

    dataset.add_sample_field(
        'prompt',
        fof.StringField,
        description='Prompt that generated image'
    )

    dataset.add_sample_field(
        'image_source',
        fof.StringField,
        description='Model that generated image'
    )

    return dataset


def create_fo_sample(image: dict) -> fo.Sample:
    """
    Creates a FiftyOne Sample from a given image entry with metadata and custom fields.

    Args:
        image (dict): A dictionary containing image data including the path and other properties.

    Returns:
        fo.Sample: The FiftyOne Sample object with the image and its metadata,
        or None if the entry has no image path.
    """
    
    filepath = image.get('image_path')
    
    if not filepath:
        return None

    prompt = image.get('caption')
    image_source = image.get('image_source')

    sample = fo.Sample(
        filepath=filepath,
        prompt=prompt,
        image_source=image_source,
    )

    return sample

def add_samples_to_fiftyone_dataset(
    dataset: fo.Dataset,
    samples: list
):
    """
    Adds a list of samples to an existing FiftyOne dataset.

    Args:
        dataset (fo.Dataset): The dataset to add the samples to.
        samples (list): The fo.Sample objects to add.
    """
    # dynamic=True declares any dynamic attributes of embedded document fields
    dataset.add_samples(samples, dynamic=True)
    dataset.add_dynamic_sample_fields()

Now let's load it into FiftyOne format, like so:

In [None]:
dataset = create_colorswap_dataset("colorswap")

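# create_fo_sample returns None for entries without an image path, so skip those
samples = [
    sample
    for sample in (create_fo_sample(image) for image in unpacked_annotations)
    if sample is not None
]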

add_samples_to_fiftyone_dataset(dataset, samples)

In [None]:
import fiftyone as fo

color_swap = fo.load_dataset("colorswap")

Note that this dataset is also available in FiftyOne format on the Hugging Face Hub:

In [None]:
import fiftyone.utils.huggingface as fouh

color_swap = fouh.load_from_hub(
    "Voxel51/ColorSwap",
    name="colorswap_full",
)


The Concept Traversal Plugin for FiftyOne allows users to navigate the space of concepts in their dataset using both text and images. 

Key points:

- You select a starting image from your dataset, then iteratively add text concepts with relative strengths to move around the multimodal embedding space.

- Behind the scenes, it generates embedding vectors for the text prompts, combines them with the starting image vector, and performs a similarity search on the dataset.

- Using the plugin requires first generating a multimodal similarity index (e.g. using a CLIP model) on the dataset.
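
The "combine, then search" step described above can be sketched as follows. This is only an illustration under stated assumptions, not the plugin's actual implementation: the weighted-sum combination rule, the helper names (`normalize`, `traverse`, `top_k`), and the 2-dimensional toy vectors are all hypothetical. In practice the vectors would come from a multimodal model such as CLIP and the search would run against the dataset's similarity index.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def traverse(start_vec, concepts):
    """
    Move away from a starting image vector by adding text concept vectors.

    concepts is a list of (vector, strength) pairs. Assumption: each concept
    is unit-normalized and added with its strength as a weight.
    """
    combined = normalize(start_vec)
    for vec, strength in concepts:
        unit = normalize(vec)
        combined = [c + strength * u for c, u in zip(combined, unit)]
    return normalize(combined)

def top_k(query, index, k=1):
    """Keys of the k index entries with the highest cosine similarity to query."""
    scores = {
        key: sum(q * x for q, x in zip(query, normalize(vec)))
        for key, vec in index.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical image vectors; the axes loosely mean (redness, blueness)
image_index = {
    "red_car":    [1.0, 0.1],
    "purple_car": [1.0, 1.0],
    "blue_car":   [0.1, 1.0],
}
blue_text = [0.0, 1.0]  # stand-in for the embedding of the prompt "blue"

# A weak push towards "blue" lands on the in-between image ...
weak = traverse(image_index["red_car"], [(blue_text, 0.5)])
print(top_k(weak, image_index))  # ['purple_car']

# ... while a strong push lands on the blue one
strong = traverse(image_index["red_car"], [(blue_text, 3.0)])
print(top_k(strong, image_index))  # ['blue_car']
```

The strength attached to each concept controls how far the query moves from the starting image, which is why changing it changes the images the search returns.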

To use the plugin, the dataset must have a similarity index that supports prompts (i.e., one that embeds both text and images).

You can create one with the `compute_similarity` function of the `fiftyone.brain` module, which takes the dataset, a `brain_key`, the `model` (e.g., `clip-vit-base32-torch`), and the `metric` (e.g., `cosine`) as arguments.

The plugin can be installed by running the command `fiftyone plugins download https://github.com/jacobmarks/concept-interpolation`.

The plugin provides two main operators:

1. `open_interpolation_panel`: Opens the interpolation panel when clicked, but is only activated when the dataset has a similarity index.

2. `interpolator`: Runs the actual interpolation between the two text prompts.

In summary, this FiftyOne plugin enables users to explore the latent space between two text concepts by interpolating between their embeddings and visualizing the results, providing an interactive way to understand the relationships between different text prompts.


In [9]:
import os 

sim_index = fob.compute_similarity(
    color_swap,
    brain_key="concept_embeddings",
    embeddings="clip_embeddings",
    model="clip-vit-base32-torch",
    metric="cosine",
    )

Computing embeddings...
 100% |███████████████████| 38/38 [2.5s elapsed, 0s remaining, 15.7 samples/s]      


In [10]:
fob.compute_visualization(
    color_swap,
    embeddings="clip_embeddings",
    method="umap",
    brain_key="umap_2d_clip",
    num_dims=2,
    num_workers=os.cpu_count(),
    progress=True,
)

Generating visualization...
UMAP( verbose=True)
Thu Aug 22 13:29:20 2024 Construct fuzzy simplicial set
Thu Aug 22 13:29:20 2024 Finding Nearest Neighbors
Thu Aug 22 13:29:21 2024 Finished Nearest Neighbor Search
Thu Aug 22 13:29:22 2024 Construct embedding


Epochs completed:   0%|            0/500 [00:00]

	completed  0  /  500 epochs
	completed  50  /  500 epochs
	completed  100  /  500 epochs
	completed  150  /  500 epochs
	completed  200  /  500 epochs
	completed  250  /  500 epochs
	completed  300  /  500 epochs
	completed  350  /  500 epochs
	completed  400  /  500 epochs
	completed  450  /  500 epochs
Thu Aug 22 13:29:23 2024 Finished embedding


<fiftyone.brain.visualization.VisualizationResults at 0x3e7b5b520>

In [None]:
session = fo.launch_app(color_swap)