## Search API

This API lets you search for featuregroups, featureviews, trainingdatasets and features.

## Prerequisite

Before you run this notebook, you need to define a tag schema.

![create tag](../images/create_tag.png)

This tag schema will be used to attach tags to a featuregroup, featureview and trainingdataset. To learn more about tag schemas, see [Define a tag schema](https://docs.hopsworks.ai/latest/user_guides/fs/tags/tags/#step-1-define-a-tag-schema).
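
The tags attached later in this notebook carry a `name` and an `id`, so the schema behind `iris_tag` presumably declares a string field and an integer field. The sketch below shows that shape and a local check using only the standard library. The schema contents and the `validate_tag` helper are assumptions for illustration; they are not part of the Hopsworks client, and the real schema is whatever you defined in the UI.

```python
# Hypothetical schema for the "iris_tag" tag, written as a JSON Schema fragment.
# The field names mirror the tag values attached later in this notebook.
IRIS_TAG_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "id": {"type": "integer"},
    },
    "required": ["name", "id"],
}

_PYTHON_TYPES = {"string": str, "integer": int}


def validate_tag(value: dict, schema: dict = IRIS_TAG_SCHEMA) -> list[str]:
    """Return a list of problems; an empty list means the value fits the schema."""
    problems = []
    for field in schema.get("required", []):
        if field not in value:
            problems.append(f"missing required field: {field}")
    for field, spec in schema["properties"].items():
        if field in value:
            expected = _PYTHON_TYPES[spec["type"]]
            # bool is a subclass of int in Python, so reject it explicitly
            if isinstance(value[field], bool) or not isinstance(value[field], expected):
                problems.append(f"{field} should be of type {spec['type']}")
    return problems


print(validate_tag({"name": "iris_tag_fg", "id": 1}))    # fits the schema
print(validate_tag({"name": "iris_tag_fg", "id": "1"}))  # id has the wrong type
```

Checking a value locally like this can catch a mismatched field before `add_tag` is called, but the server-side schema remains the authority.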

## Scope

* Create featuregroup, featureview and trainingdataset
* Attach tags and keywords
* Search for featuregroup, featureview and trainingdataset
* Search for featuregroup, featureview and trainingdataset by tag
* Search for featuregroup, featureview and trainingdataset by keyword
* Get entity from search result

In [1]:
import hopsworks

In [None]:
# Connect to your cluster. Use this form when running in Jupyter or in a job inside the cluster.
project = hopsworks.login()
# Uncomment when connecting to the cluster from an external environment.
#project = hopsworks.login(host="hopsworks.ai.local", api_key_file='api_key')

2025-07-02 09:59:50,424 INFO: Closing external client and cleaning up certificates.
Connection closed.
2025-07-02 09:59:50,443 INFO: Initializing external client
2025-07-02 09:59:50,443 INFO: Base URL: https://hopsworks.ai.local:443
2025-07-02 09:59:51,509 INFO: Python Engine initialized.

Logged in to project, explore it here https://hopsworks.ai.local:443/p/119


### Create featuregroup, featureview and trainingdataset

In [5]:
import pandas as pd

# Data loading
original_iris_df = pd.read_csv(
    "https://repo.hops.works/master/hopsworks-tutorials/data/iris.csv"
).reset_index()

fs = project.get_feature_store()
# Feature Group
iris_fg = fs.get_or_create_feature_group(
    name="iris_model",
    version=1,
    primary_key=["index"],
    description="Iris flower dataset",
)

iris_fg.insert(original_iris_df, write_options={"wait_for_job": True})

iris_fv = fs.get_or_create_feature_view(
    name="iris_model",
    version=1,
    description="Read from Iris flower dataset",
    labels=["variety"],
    query=iris_fg.select_all(),
)
td_version, _ = iris_fv.create_training_data(data_format="parquet")

iris_td = fs.get_training_dataset("iris_model_1", td_version)


Feature Group created successfully, explore it at 
https://hopsworks.ai.local:443/p/119/fs/67/fg/13


Uploading Dataframe: 100.00% |██████████| Rows 150/150 | Elapsed Time: 00:00 | Remaining Time: 00:00


Launching job: iris_model_1_offline_fg_materialization
Job started successfully, you can follow the progress at 
https://hopsworks.ai.local:443/p/119/jobs/named/iris_model_1_offline_fg_materialization/executions
2025-07-02 10:00:08,877 INFO: Waiting for execution to finish. Current state: SUBMITTED. Final status: UNDEFINED
2025-07-02 10:00:12,003 INFO: Waiting for execution to finish. Current state: RUNNING. Final status: UNDEFINED
2025-07-02 10:01:45,471 INFO: Waiting for execution to finish. Current state: AGGREGATING_LOGS. Final status: SUCCEEDED
2025-07-02 10:01:45,640 INFO: Waiting for log aggregation to finish.
2025-07-02 10:01:54,105 INFO: Execution finished successfully.
Feature view created successfully, explore it at 
https://hopsworks.ai.local:443/p/119/fs/67/fv/iris_model/version/1
Finished: Materializing data to Hopsworks, using Hopsworks Feature Query Service (1.92s) 




#### Attach a tag

In [7]:
tag_name = "iris_tag"
iris_fg.add_tag(tag_name, {"name": "iris_tag_fg", "id": 1})
iris_fv.add_tag(tag_name, {"name": "iris_tag_fv", "id": 2})
iris_td.add_tag(tag_name, {"name": "iris_tag_td", "id": 3})

#### Attach keywords

In [8]:
import json
import time
from hopsworks import client


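def add_keywords(projectId: int, featurestoreId: int, keywords: list[str], params=None):
    """Attach keywords to the entity addressed by the extra path segments in `params`."""
    _client = client.get_instance()

    # Build the REST path: project/<id>/featurestores/<id>/<entity segments>/keywords
    path_params = [
        "project",
        projectId,
        "featurestores",
        featurestoreId,
    ]
    # `params` defaults to None rather than [] to avoid a shared mutable default
    path_params.extend(params or [])
    path_params.append("keywords")
    headers = {"content-type": "application/json"}
    data = {"keywords": keywords}

    _client._send_request("POST", path_params, data=json.dumps(data), headers=headers)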


# Add keywords to the feature group
add_keywords(
    projectId=project.id,
    featurestoreId=fs.id,
    keywords=["iris_fg", "flower_fg"],
    params=["featuregroups", str(iris_fg.id)],
)
# Add keywords to the feature view
add_keywords(
    projectId=project.id,
    featurestoreId=fs.id,
    keywords=["iris_fv", "flower_fv"],
    params=["featureview", iris_fv.name, "version", str(iris_fv.version)],
)
# Add keywords to the training dataset
add_keywords(
    projectId=project.id,
    featurestoreId=fs.id,
    keywords=["iris_td", "flower_td"],
    params=[
        "featureview",
        iris_fv.name,
        "version",
        str(iris_fv.version),
        "trainingdatasets",
        "version",
        str(iris_td.version),
    ],
)

time.sleep(20)  # give it time to index
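
A fixed `time.sleep(20)` either waits longer than needed or not long enough, depending on how busy the indexer is. One alternative is to poll until the search returns a hit. The sketch below is generic: `wait_until` is a hypothetical helper, and `fake_search` is a stand-in for a real search call so that the block runs without a cluster.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Call `check` repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for the search call: reports a hit from the third attempt on.
attempts = {"count": 0}


def fake_search() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


indexed = wait_until(fake_search, timeout=5.0, interval=0.01)
print(indexed, attempts["count"])
```

Against a real cluster, the check could be a lambda that runs a keyword search and tests whether `featuregroups_total` in the `to_dict()` output is greater than zero. That assumes the search API object has already been obtained with `project.get_search_api()`.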

### Search for featuregroup, featureview and trainingdataset

In [9]:
search_api = project.get_search_api()

In [10]:
featurestore_result = search_api.featurestore_search("iris_model")
featurestore_result.to_dict()

{'featuregroups': [{'name': 'iris_model',
   'version': 1,
   'description': 'Iris flower dataset',
   'featurestore_id': 67,
   'created': '2025-07-02T07:59:56.000Z',
   'parent_project_id': 119,
   'parent_project_name': 'project1',
   'access_projects': {'119': 'project1'},
   'highlights': {'name': '<em>iris_model</em>',
    'description': None,
    'features': [],
    'tags': [],
    'other_xattrs': None},
   'creator': {'username': 'meb10000',
    'firstname': 'Admin',
    'lastname': 'Admin',
    'email': 'admin@hopsworks.ai'}}],
 'feature_views': [{'name': 'iris_model',
   'version': 1,
   'description': 'Read from Iris flower dataset',
   'featurestore_id': 67,
   'created': '2025-07-02T08:01:55.000Z',
   'parent_project_id': 119,
   'parent_project_name': 'project1',
   'access_projects': {'119': 'project1'},
   'highlights': {'name': '<em>iris_model</em>',
    'description': None,
    'features': [],
    'tags': [],
    'other_xattrs': {'xattr.featurestore.fv_features.name':

### Search for featuregroup, featureview and trainingdataset by tag

In [11]:
# search by any tag name, key or value
featurestore_result = search_api.featurestore_search("iris_tag", "tag")
featurestore_result.to_dict()

{'featuregroups': [{'name': 'iris_model',
   'version': 1,
   'description': 'Iris flower dataset',
   'featurestore_id': 67,
   'created': '2025-07-02T07:59:56.000Z',
   'parent_project_id': 119,
   'parent_project_name': 'project1',
   'access_projects': {'119': 'project1'},
   'highlights': {'name': None,
    'description': None,
    'features': [],
    'tags': [{'key': '<em>iris_tag</em>', 'value': None},
     {'key': None, 'value': '{"name": "<em>iris_tag_fg</em>", "id": 1}'}],
    'other_xattrs': None},
   'creator': {'username': 'meb10000',
    'firstname': 'Admin',
    'lastname': 'Admin',
    'email': 'admin@hopsworks.ai'}}],
 'feature_views': [{'name': 'iris_model',
   'version': 1,
   'description': 'Read from Iris flower dataset',
   'featurestore_id': 67,
   'created': '2025-07-02T08:01:55.000Z',
   'parent_project_id': 119,
   'parent_project_name': 'project1',
   'access_projects': {'119': 'project1'},
   'highlights': {'name': None,
    'description': None,
    'feature

In [12]:
# search by any tag name
featurestore_result = search_api.featurestore_search("iris_tag", "tag_name")
featurestore_result.to_dict()

{'featuregroups': [{'name': 'iris_model',
   'version': 1,
   'description': 'Iris flower dataset',
   'featurestore_id': 67,
   'created': '2025-07-02T07:59:56.000Z',
   'parent_project_id': 119,
   'parent_project_name': 'project1',
   'access_projects': {'119': 'project1'},
   'highlights': {'name': None,
    'description': None,
    'features': [],
    'tags': [{'key': '<em>iris_tag</em>', 'value': None},
     {'key': None, 'value': '{"name": "<em>iris_tag_fg</em>", "id": 1}'}],
    'other_xattrs': None},
   'creator': {'username': 'meb10000',
    'firstname': 'Admin',
    'lastname': 'Admin',
    'email': 'admin@hopsworks.ai'}}],
 'feature_views': [{'name': 'iris_model',
   'version': 1,
   'description': 'Read from Iris flower dataset',
   'featurestore_id': 67,
   'created': '2025-07-02T08:01:55.000Z',
   'parent_project_id': 119,
   'parent_project_name': 'project1',
   'access_projects': {'119': 'project1'},
   'highlights': {'name': None,
    'description': None,
    'feature

In [14]:
# search by any tag key
featurestore_result = search_api.featurestore_search("name", "tag_key")
featurestore_result.to_dict()

{'featuregroups': [{'name': 'iris_model',
   'version': 1,
   'description': 'Iris flower dataset',
   'featurestore_id': 67,
   'created': '2025-07-02T07:59:56.000Z',
   'parent_project_id': 119,
   'parent_project_name': 'project1',
   'access_projects': {'119': 'project1'},
   'highlights': {'name': None,
    'description': None,
    'features': [],
    'tags': [{'key': None,
      'value': '{"<em>name</em>": "iris_tag_fg", "id": 1}'}],
    'other_xattrs': None},
   'creator': {'username': 'meb10000',
    'firstname': 'Admin',
    'lastname': 'Admin',
    'email': 'admin@hopsworks.ai'}}],
 'feature_views': [{'name': 'iris_model',
   'version': 1,
   'description': 'Read from Iris flower dataset',
   'featurestore_id': 67,
   'created': '2025-07-02T08:01:55.000Z',
   'parent_project_id': 119,
   'parent_project_name': 'project1',
   'access_projects': {'119': 'project1'},
   'highlights': {'name': None,
    'description': None,
    'features': [],
    'tags': [{'key': None,
      'va

In [15]:
# search by any tag value
featurestore_result = search_api.featurestore_search("iris_tag_fg", "tag_value")
featurestore_result.to_dict()

{'featuregroups': [{'name': 'iris_model',
   'version': 1,
   'description': 'Iris flower dataset',
   'featurestore_id': 67,
   'created': '2025-07-02T07:59:56.000Z',
   'parent_project_id': 119,
   'parent_project_name': 'project1',
   'access_projects': {'119': 'project1'},
   'highlights': {'name': None,
    'description': None,
    'features': [],
    'tags': [{'key': None,
      'value': '{"name": "<em>iris_tag_fg</em>", "id": 1}'}],
    'other_xattrs': None},
   'creator': {'username': 'meb10000',
    'firstname': 'Admin',
    'lastname': 'Admin',
    'email': 'admin@hopsworks.ai'}}],
 'feature_views': [],
 'trainingdatasets': [],
 'features': [],
 'featuregroups_from': 0,
 'featuregroups_total': 1,
 'feature_views_from': 0,
 'feature_views_total': 0,
 'trainingdatasets_from': 0,
 'trainingdatasets_total': 0,
 'features_from': 0,
 'features_total': 0}
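
The dictionary returned by `to_dict()` is verbose, so it can help to reduce it to the parts you care about. The sketch below works on a hand-copied excerpt of the output above rather than on a live result. `summarize_totals` and `clean_tag_highlights` are hypothetical helpers, not part of the Hopsworks client. They collect the `*_total` counters and strip the `<em>` highlight markers from tag values before parsing them as JSON.

```python
import json
import re

# Excerpt of the `to_dict()` output for the "tag_value" search above.
result = {
    "featuregroups": [
        {
            "name": "iris_model",
            "version": 1,
            "highlights": {
                "tags": [
                    {"key": None, "value": '{"name": "<em>iris_tag_fg</em>", "id": 1}'}
                ]
            },
        }
    ],
    "feature_views": [],
    "trainingdatasets": [],
    "features": [],
    "featuregroups_total": 1,
    "feature_views_total": 0,
    "trainingdatasets_total": 0,
    "features_total": 0,
}

_EM_TAG = re.compile(r"</?em>")


def summarize_totals(result: dict) -> dict:
    """Map each entity type to its hit count, using the `*_total` keys."""
    suffix = "_total"
    return {k[: -len(suffix)]: v for k, v in result.items() if k.endswith(suffix)}


def clean_tag_highlights(entity: dict) -> list:
    """Strip <em> markers from highlighted tag values and parse them as JSON."""
    cleaned = []
    for tag in entity["highlights"]["tags"]:
        if tag["value"] is not None:
            cleaned.append(json.loads(_EM_TAG.sub("", tag["value"])))
    return cleaned


print(summarize_totals(result))
print(clean_tag_highlights(result["featuregroups"][0]))
```

Note that stripping `<em>` this way would also remove the marker from a tag value that legitimately contains it, so treat the cleaned value as a convenience for display rather than as the stored tag.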

### Search for featuregroup, featureview and trainingdataset by keyword

In [17]:
featurestore_result = search_api.featurestore_search("iris", "keyword")
featurestore_result.to_dict()

{'featuregroups': [{'name': 'iris_model',
   'version': 1,
   'description': 'Iris flower dataset',
   'featurestore_id': 67,
   'created': '2025-07-02T07:59:56.000Z',
   'parent_project_id': 119,
   'parent_project_name': 'project1',
   'access_projects': {'119': 'project1'},
   'highlights': {'name': '<em>iris_model</em>',
    'description': '<em>Iris</em> flower dataset',
    'features': [],
    'tags': [{'key': '<em>iris_tag</em>', 'value': None},
     {'key': None, 'value': '{"name": "<em>iris_tag_fg</em>", "id": 1}'}],
    'other_xattrs': {'xattr.keywords': '<em>iris_fg</em>'}},
   'creator': {'username': 'meb10000',
    'firstname': 'Admin',
    'lastname': 'Admin',
    'email': 'admin@hopsworks.ai'}}],
 'feature_views': [{'name': 'iris_model',
   'version': 1,
   'description': 'Read from Iris flower dataset',
   'featurestore_id': 67,
   'created': '2025-07-02T08:01:55.000Z',
   'parent_project_id': 119,
   'parent_project_name': 'project1',
   'access_projects': {'119': 'proj

### Get entity from search result

In [18]:
featuregroup = featurestore_result.featuregroups[0].get_feature_group()
featuregroup.to_dict()

{'id': 13,
 'name': 'iris_model',
 'version': 1,
 'description': 'Iris flower dataset',
 'onlineEnabled': False,
 'timeTravelFormat': 'HUDI',
 'features': [Feature('index', 'bigint', None, True, False, False, None, None, 13),
  Feature('sepal_length', 'double', None, False, False, False, None, None, 13),
  Feature('sepal_width', 'double', None, False, False, False, None, None, 13),
  Feature('petal_length', 'double', None, False, False, False, None, None, 13),
  Feature('petal_width', 'double', None, False, False, False, None, None, 13),
  Feature('variety', 'string', None, False, False, False, None, None, 13)],
 'featurestoreId': 67,
 'type': 'streamFeatureGroupDTO',
 'statisticsConfig': StatisticsConfig(True, False, False, False, []),
 'eventTime': None,
 'expectationSuite': None,
 'parents': None,
 'topicName': None,
 'notificationTopicName': None,
 'deprecated': False,
 'transformationFunctions': [],
 'dataSource': {'query': None,
  'database': None,
  'group': None,
  'table': Non

In [19]:
featuregroup.show(5)

Finished: Reading data from Hopsworks, using Hopsworks Feature Query Service (0.34s) 


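|   | index | sepal_length | sepal_width | petal_length | petal_width | variety |
|---|-------|--------------|-------------|--------------|-------------|---------|
| 0 | 104 | 6.5 | 3.0 | 5.8 | 2.2 | Virginica |
| 1 | 26 | 5.0 | 3.4 | 1.6 | 0.4 | Setosa |
| 2 | 1 | 4.9 | 3.0 | 1.4 | 0.2 | Setosa |
| 3 | 131 | 7.9 | 3.8 | 6.4 | 2.0 | Virginica |
| 4 | 21 | 5.1 | 3.7 | 1.5 | 0.4 | Setosa |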
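
Indexing a result list with `[0]`, as in the "Get entity from search result" step, raises `IndexError` when the search returns no hits for that entity type. This was the case for feature views in the `tag_value` search. A small guard avoids the exception. `first_or_none` is a hypothetical helper, and `SimpleNamespace` stands in for the search result so the sketch runs offline.

```python
from types import SimpleNamespace
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def first_or_none(items: Optional[Sequence[T]]) -> Optional[T]:
    """Return the first element, or None if the sequence is empty or missing."""
    return items[0] if items else None


# Stand-in for a search result with one feature group hit and no feature views.
fake_result = SimpleNamespace(featuregroups=["iris_model_v1"], feature_views=[])

print(first_or_none(fake_result.featuregroups))
print(first_or_none(fake_result.feature_views))
```

With a live result, you would call `.get_feature_group()` only when the helper returns something other than `None`.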
