Feature: Query Search Images in a Dataset #388

Open: Ashp116 wants to merge 4 commits into main

@Ashp116 Ashp116 commented Jun 15, 2025

Description

This PR adds query() and query_all() methods to enable query search across images in a dataset using Roboflow's /search/v1 endpoint. These methods support flexible search queries (e.g., by filename, tags, or project name) and allow users to:

  • Retrieve specific metadata fields (e.g., tags, filename, dimensions).
  • Control pagination via pageSize and continuationToken.
  • Stream results in batches (query_all) or retrieve a single page (query).
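The continuation-token pattern described above can be sketched independently of the SDK. This is a minimal illustration, not the PR's implementation: `iter_pages`, `make_fake_backend`, and the response keys `results` and `continuationToken` are assumptions made for the example, and the backend is an in-memory stand-in rather than the real /search/v1 endpoint.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Assumed page shape for this sketch: {"results": [...], "continuationToken": str or None}
Page = Dict[str, Any]


def iter_pages(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[List[dict]]:
    """Yield one batch of results per page until no continuation token remains."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        results = page.get("results") or []
        if results:
            yield results
        token = page.get("continuationToken")
        if not token:
            break


def make_fake_backend(items: List[dict], page_size: int) -> Callable[[Optional[str]], Page]:
    """In-memory stand-in for a search endpoint; the token is simply the next offset."""

    def fetch_page(token: Optional[str]) -> Page:
        start = int(token) if token else 0
        end = start + page_size
        next_token = str(end) if end < len(items) else None
        return {"results": items[start:end], "continuationToken": next_token}

    return fetch_page


images = [{"filename": f"4B-1K_{i}.jpg"} for i in range(25)]
fetch = make_fake_backend(images, page_size=10)

# Taking only the first batch is analogous to a single-page query.
first_page = next(iter_pages(fetch))
# Draining the generator is analogous to streaming every page.
batches = list(iter_pages(fetch))
all_results = [image for batch in batches for image in batch]
print(len(first_page), len(all_results))
```

Keeping the page loop in a generator means a caller only pays for the pages it actually consumes, which is the practical difference between a single-page call and a streaming one.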

This enhancement improves dataset navigation and filtering by allowing more expressive, programmatic image searches within a project.

It addresses the issue discussed in #360 by providing a querying interface for image datasets.

List any dependencies that are required for this change:

This change does not introduce any new dependencies; it uses only existing libraries already required by the project.

Type of change

  • [ ✅ ] Bug fix (non-breaking change which fixes an issue)
  • [ ✅ ] New feature (non-breaking change which adds functionality)
  • [ ➖ ] This change requires a documentation update: Not really sure

How has this change been tested, please provide a testcase or example of how you tested the change?

The change was tested with the following example script, which searches a dataset for images by filename using the new query and query_all methods. It loads the API key, workspace, and project ID from a .env file, fetches a single page with query, then streams all matching images page by page with query_all, collects the results, and prints the total count along with sample entries:

import os
from dotenv import load_dotenv
import roboflow

load_dotenv(".env")

API_KEY = os.getenv("API_KEY")
WORKSPACE = os.getenv("WORKSPACE")
PROJECT_ID = os.getenv("PROJECT_ID")

rf = roboflow.Roboflow(api_key=API_KEY)
workspace = rf.workspace(WORKSPACE)
project = workspace.project(PROJECT_ID)

filename = "4B-1K"

# Test query() - single page of results
single_page_results = project.query(query_str=f'filename:"{filename}"', page_size=10)
print(f"Single page results count: {len(single_page_results)}")
for image in single_page_results[:5]:
    print(image)

print("\n---\n")

# Test query_all() - all pages, streamed
all_results = []
for page in project.query_all(query_str=f'filename:"{filename}"', page_size=10):
    all_results.extend(page)

print(f"Total results from query_all: {len(all_results)}")
for image in all_results[:5]:
    print(image)
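Because query_all yields pages rather than a full list, a caller that only needs a few sample entries can stop early instead of collecting everything first. The sketch below shows that pattern under stated assumptions: `take_results` is a hypothetical helper, and `fake_query_all` is an in-memory stand-in for any generator that yields lists of image records, not the method added in this PR.

```python
from itertools import chain, islice
from typing import Iterable, Iterator, List


def take_results(pages: Iterable[List[dict]], limit: int) -> List[dict]:
    """Flatten an iterable of pages lazily and keep at most `limit` items."""
    return list(islice(chain.from_iterable(pages), limit))


def fake_query_all(total: int, page_size: int) -> Iterator[List[dict]]:
    """Hypothetical page generator that counts how many pages it has produced."""
    for start in range(0, total, page_size):
        fake_query_all.pages_served += 1
        stop = min(start + page_size, total)
        yield [{"filename": f"4B-1K_{i}.jpg"} for i in range(start, stop)]


fake_query_all.pages_served = 0

# Only the first page is generated, because islice stops after five items.
sample = take_results(fake_query_all(total=100, page_size=10), limit=5)
print(len(sample), fake_query_all.pages_served)
```

With a real paginated backend, the same pattern would avoid requesting pages whose results are never used.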

Any specific deployment considerations

N/A

Docs

  • [ ❌ ] Docs updated? What were the changes: (query and query_all functions were added)

@Ashp116 Ashp116 changed the title from "Query Search Images in a Dataset" to "Feature: Query Search Images in a Dataset" on Jun 16, 2025

CLAassistant commented Jun 20, 2025

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ Ashp116
❌ pre-commit-ci[bot]
You have signed the CLA already but the status is still pending? Let us recheck it.

2 participants