# Visual Layer SDK Explorer

This notebook shows you how to use the Visual Layer SDK to upload, explore, and filter datasets.

**Prerequisite**

You need a valid Visual Layer account to interact with the Visual Layer service.
Once you have an account, obtain your `VISUAL_LAYER_API_KEY` and `VISUAL_LAYER_API_SECRET` and set them as environment variables (or put them in a `.env` file, which this notebook loads with `load_dotenv()`).
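
Before creating the client, it can help to see exactly which credential is missing. The helper below is a minimal sketch using only the standard library; the function name `missing_env_vars` is hypothetical and not part of the SDK.

```python
import os

# The two variable names come from the prerequisite above.
REQUIRED_VARS = ("VISUAL_LAYER_API_KEY", "VISUAL_LAYER_API_SECRET")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment.

    `environ` defaults to os.environ; pass a plain dict to check other mappings.
    """
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Example with an explicit mapping instead of the real environment
example_env = {"VISUAL_LAYER_API_KEY": "my-key"}
print(missing_env_vars(environ=example_env))  # ['VISUAL_LAYER_API_SECRET']
```

Calling `missing_env_vars()` with no arguments checks the real process environment, so it should be called after `load_dotenv()` if the credentials live in a `.env` file.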

In [1]:
# Uncomment these lines to install the required packages
#!pip install matplotlib python-dotenv
#!pip install visual-layer-sdk -i https://test.pypi.org/simple/


In [2]:
pip install -i https://test.pypi.org/simple/ visual-layer-sdk==0.1.24

Looking in indexes: https://test.pypi.org/simple/
Note: you may need to restart the kernel to use updated packages.


In [3]:
import os
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from IPython.display import Image, display
from visual_layer_sdk import VisualLayerClient
from visual_layer_sdk.dataset import IssueType, SearchOperator, SemanticRelevance
from dotenv import load_dotenv
load_dotenv()
VL_API_KEY = os.getenv('VISUAL_LAYER_API_KEY')
VL_API_SECRET = os.getenv('VISUAL_LAYER_API_SECRET')

if VL_API_KEY is None or VL_API_SECRET is None:
    raise EnvironmentError("Please configure the VISUAL_LAYER_API_KEY and VISUAL_LAYER_API_SECRET environment variables.")

# Initialize the client and do a quick health check
client = VisualLayerClient(api_key=VL_API_KEY, api_secret=VL_API_SECRET)
print(f"client status: {client.healthcheck()}")

client status: {'status': 'Green', 'service-health-message': 'none', 'version': '2.42.190'}


In [4]:
# Utility function to display the images in a dataset for the given media IDs
def display_images(ds, media_ids):
    for media_id in media_ids:
        # Look up the image's URL by its media ID, then render it inline
        image_url = ds.get_image_info(media_id)['image_uri']
        display(Image(url=image_url))

# Basic Dataset Exploration APIs

In [5]:
# Get all datasets
datasets = client.get_all_datasets()
print(f"Found {len(datasets)} datasets")

Found 40 datasets


In [6]:
# Get a specific dataset by dataset ID
ds_id = datasets.iloc[0]['id']
ds = client.get_dataset_object(ds_id)
details = ds.get_details()
print(f"Dataset details: {details}")

Dataset details: {'id': 'b66848f6-56d1-11f0-9bb7-7e21cd95c256', 'created_by': 'd92cf865-4110-404d-8250-97711aaaf3ae', 'source_dataset_id': None, 'owned_by': 'VL', 'display_name': 'Dataset 11', 'description': '', 'preview_uri': '', 'source_type': 'VL', 'source_uri': '', 'created_at': '2025-07-01T23:18:35.728673', 'updated_at': None, 'filename': None, 'sample': None, 'status': 'INITIALIZING'}


In [7]:
# Get a specific dataset by its ID. 'Beans Enriched' is a public dataset you can explore.
LEAF_DATASET_ID = 'bc41491e-78ae-11ef-ba4b-8a774758b536'
leaf_ds = client.get_dataset_object(LEAF_DATASET_ID)

# export all images in the dataset to a dataframe
leaf_df = leaf_ds.export_to_dataframe()

# display first 2 images
display_images(leaf_ds, leaf_df.iloc[:2]['media_id'])

# 5 Search Methods in Visual Layer SDK

The SDK provides 5 main search methods:
1. **Label Search** - Find images by object/classification labels
2. **Caption Search** - Semantic search using AI-generated captions
3. **Issue Search** - Filter by quality issues (blur, brightness, etc.)
4. **Visual Similarity Search** - Find visually similar images
5. **Semantic Search** - Natural language search with relevance control
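
In this notebook, every search result is handled as a DataFrame with a `media_id` column, so results from two different search methods can be combined by ID. The sketch below is an illustration with plain Python lists standing in for those columns; the helper name `intersect_ids` is hypothetical and not part of the SDK.

```python
def intersect_ids(first, second):
    """Return the IDs present in both inputs, keeping the order of `first`."""
    second_set = set(second)
    seen = set()
    result = []
    for media_id in first:
        # Keep each ID once, and only if the second result also contains it
        if media_id in second_set and media_id not in seen:
            seen.add(media_id)
            result.append(media_id)
    return result


# Stand-ins for the 'media_id' columns of two search results
label_ids = ["a1", "b2", "c3", "d4"]
caption_ids = ["c3", "a1", "z9"]
print(intersect_ids(label_ids, caption_ids))  # ['a1', 'c3']
```

With real results, the two arguments would be the `media_id` columns of the DataFrames, for example `intersect_ids(labels_df['media_id'], captions_df['media_id'])`, where `labels_df` and `captions_df` are placeholder names.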

## 1. Label Search

Search for images containing specific labels or classifications.

In [8]:
# Search the dataset by labels
# Find leaves labeled angular_leaf_spot or bean_rust
filtered_df = leaf_ds.search_by_labels(labels=['angular_leaf_spot', 'bean_rust'])

print(f"Found {len(filtered_df)} images with angular_leaf_spot or bean_rust")

# display first 2 images
if not filtered_df.empty:
    display_images(leaf_ds, filtered_df.iloc[:2]['media_id'])

Found 868 images with angular_leaf_spot or bean_rust


In [9]:
# Search for a single label
healthy_leaf_df = leaf_ds.search_by_labels(labels='healthy')
print(f"Found {len(healthy_leaf_df)} images with healthy leaves")

if not healthy_leaf_df.empty:
    display_images(leaf_ds, healthy_leaf_df.iloc[:2]['media_id'])

Found 427 images with healthy leaves


## 2. Caption Search

Search using natural language descriptions generated by AI.

In [10]:
# Search the dataset by captions
# Find images of a person holding a leaf
filtered_df = leaf_ds.search_by_captions(captions=['a person holding leaf'])

print(f"Found {len(filtered_df)} images matching 'a person holding leaf'")

if not filtered_df.empty:
    display_images(leaf_ds, filtered_df.iloc[:2]['media_id'])

Found 463 images matching 'a person holding leaf'


In [11]:
# Search with a single caption string
leaf_caption_df = leaf_ds.search_by_captions(captions='leaf')
print(f"Found {len(leaf_caption_df)} images matching 'leaf'")

if not leaf_caption_df.empty:
    display_images(leaf_ds, leaf_caption_df.iloc[:2]['media_id'])

Found 1047 images matching 'leaf'


## 3. Issue Search

Filter images by quality issues like blur, brightness, etc.

In [12]:
# Search the dataset by issue type
PET_DATASET_ID = '3972b3fc-1809-11ef-bb76-064432e0d220'
pet_dataset = client.get_dataset_object(PET_DATASET_ID)

# Find images that are blurry
filtered_df = pet_dataset.search_by_issues(issue_type=IssueType.BLUR)
print(f"Found {len(filtered_df)} blurry images")

if not filtered_df.empty:
    display_images(pet_dataset, filtered_df.iloc[:2]['media_id'])

Found 90 blurry images


In [13]:
# issue_type can also be passed as a list of issue types; this list holds only OUTLIERS
quality_issues_df = pet_dataset.search_by_issues(
    issue_type=[IssueType.OUTLIERS]
)
print(f"Found {len(quality_issues_df)} images with quality issues")

if not quality_issues_df.empty:
    display_images(pet_dataset, quality_issues_df.iloc[1:3]['media_id'])

Found 8 images with quality issues


## 4. Visual Similarity Search

Find images visually similar to a reference image.
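
The reference image in this notebook is read from a local path (`dog.jpeg` in the working directory), so a missing file is an easy mistake to make. The helper below is a small sketch that checks the path before it is passed to the search; `resolve_image_path` is a hypothetical name and not part of the SDK.

```python
from pathlib import Path


def resolve_image_path(path):
    """Return the absolute path as a string, or raise if the file is missing."""
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"Reference image not found: {resolved}")
    return str(resolved)


# Usage in this notebook (left commented out because it needs a local file):
# local_image = resolve_image_path("dog.jpeg")
```

Failing early with the resolved absolute path in the message makes it clear which directory the notebook kernel is actually running in.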

In [20]:
# Similarity search using an image file from your local folder
local_image = os.path.join(os.getcwd(), 'dog.jpeg')

similar_images_df = pet_dataset.search_by_visual_similarity(image_path=local_image, threshold=0.4)
print(f"Found {len(similar_images_df)} visually similar images")

if not similar_images_df.empty:
    display_images(pet_dataset, similar_images_df.iloc[:2]['media_id'])

Found 476 visually similar images


In [15]:
# Search with multiple reference images
# similar_images_df = pet_dataset.search_by_visual_similarity(
#     image_path=[local_image, 'path/to/another/image.jpg']
# )
# print(f"Found {len(similar_images_df)} images similar to reference images")

# if not similar_images_df.empty:
#     display_images(pet_dataset, similar_images_df.iloc[:2]['media_id'])

## 5. Semantic Search

Natural language search with relevance control.

In [16]:
# Semantic search with default relevance
SEARCH_DATASET_ID = '83e13af6-5b67-11f0-ae55-825b18749830'
search_dataset = client.get_dataset_object(SEARCH_DATASET_ID)

semantic_df = search_dataset.search_by_semantic(text="vehicle")
print(f"Found {len(semantic_df)} images semantically matching 'vehicle'")

if not semantic_df.empty:
    # Display from the dataset that was searched (search_dataset, not pet_dataset)
    display_images(search_dataset, semantic_df.iloc[:2]['media_id'])

Found 608 images semantically matching 'vehicle'


In [17]:
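# Semantic search with different relevance levels
# Note: semantic search is not enabled for pet_dataset, so both calls below
# return 0 results (see the warnings in the output). Use a dataset with
# semantic search enabled, such as search_dataset, to get matches.
# High relevance (stricter matching)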
high_relevance_df = pet_dataset.search_by_semantic(
    text="dog playing", 
    relevance=SemanticRelevance.HIGH_RELEVANCE
)
print(f"Found {len(high_relevance_df)} images with high relevance to 'dog playing'")

# Low relevance (more lenient matching)
low_relevance_df = pet_dataset.search_by_semantic(
    text="dog playing", 
    relevance=SemanticRelevance.LOW_RELEVANCE
)
print(f"Found {len(low_relevance_df)} images with low relevance to 'dog playing'")

if not high_relevance_df.empty:
    print("\nHigh relevance results:")
    display_images(pet_dataset, high_relevance_df.iloc[:2]['media_id'])

if not low_relevance_df.empty:
    print("\nLow relevance results:")
    display_images(pet_dataset, low_relevance_df.iloc[:2]['media_id'])

⚠️  Semantic search is not enabled for this dataset.
Found 0 images with high relevance to 'dog playing'
⚠️  Semantic search is not enabled for this dataset.
Found 0 images with low relevance to 'dog playing'


## Search Operators

Different search methods support various operators for more precise filtering:

In [18]:
# Label search with different operators
# IS_ONE_OF (default) - find images with any of the specified labels
any_label_df = leaf_ds.search_by_labels(
    labels=['angular_leaf_spot', 'bean_rust'], 
    search_operator=SearchOperator.IS_ONE_OF
)
print(f"Found {len(any_label_df)} images with any of the specified labels")

# IS - find images with all specified labels (when multiple labels)
# IS_NOT - find images that don't match the labels
# IS_NOT_ONE_OF - find images that don't have any of the specified labels

# Caption search with different operators
# IS (default) - find images matching the combined caption text
combined_caption_df = leaf_ds.search_by_captions(
    captions=['person', 'holding', 'leaf'], 
    search_operator=SearchOperator.IS
)
print(f"Found {len(combined_caption_df)} images matching combined caption text")

Found 868 images with any of the specified labels
Found 463 images matching combined caption text
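
To make the label operator descriptions in the comments above concrete, the sketch below applies the same rules to a small in-memory table. It is a local illustration of the described semantics only, not the SDK's implementation; the `matches` and `filter_ids` helpers and the toy image IDs are hypothetical, and `IS_NOT` is left out because its exact behavior with multiple labels is not spelled out here.

```python
def matches(image_labels, wanted, operator):
    """Apply one operator, as described in the notebook comments, to one image."""
    have, want = set(image_labels), set(wanted)
    if operator == "IS_ONE_OF":      # at least one wanted label is present
        return bool(have & want)
    if operator == "IS":             # every wanted label is present
        return want <= have
    if operator == "IS_NOT_ONE_OF":  # none of the wanted labels is present
        return not (have & want)
    raise ValueError(f"Unsupported operator: {operator}")


def filter_ids(images, wanted, operator):
    """Return the IDs of the images whose labels satisfy the operator."""
    return [
        image_id
        for image_id, labels in images.items()
        if matches(labels, wanted, operator)
    ]


# Toy data: image ID -> labels (made up for illustration)
toy_images = {
    "img1": ["angular_leaf_spot"],
    "img2": ["bean_rust"],
    "img3": ["healthy"],
    "img4": ["angular_leaf_spot", "bean_rust"],
}
wanted = ["angular_leaf_spot", "bean_rust"]

print(filter_ids(toy_images, wanted, "IS_ONE_OF"))      # ['img1', 'img2', 'img4']
print(filter_ids(toy_images, wanted, "IS"))             # ['img4']
print(filter_ids(toy_images, wanted, "IS_NOT_ONE_OF"))  # ['img3']
```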
