# Filtered searches

This tutorial walks through how to run filtered searches via the API. Filtered searches are useful when you want to find specific parts of the data, especially if the dataset is large. Here, I will be testing on a small dataset for the purposes of this tutorial. The dataset comes from the deleting entity sets tutorial.

## Configurations

In [6]:
import openlattice
import olpy

In [12]:
baseurl = 'https://api.openlattice.com'

configuration = openlattice.Configuration()
configuration.host = baseurl
configuration.access_token = olpy.get_jwt(base_url = configuration.host)

edmAPI = openlattice.EdmApi(openlattice.ApiClient(configuration))
entitySetsAPI = openlattice.EntitySetsApi(openlattice.ApiClient(configuration))
permissionsAPI = openlattice.PermissionsApi(openlattice.ApiClient(configuration))
dataAPI = openlattice.DataApi(openlattice.ApiClient(configuration))
integrationAPI = openlattice.DataIntegrationsApi(openlattice.ApiClient(configuration))
orgAPI = openlattice.OrganizationsApi(openlattice.ApiClient(configuration))
searchAPI = openlattice.SearchApi(openlattice.ApiClient(configuration))

In [10]:
entitysetid = '0f898abd-edfe-4212-8642-74db49efa474'
data = dataAPI.load_entity_set_data(entitysetid)
data

[{'nc.SubjectIdentification': ['Child1'],
  'openlattice.@id': ['24220000-0000-0000-8000-00000001f3aa']},
 {'nc.SubjectIdentification': ['Woman2'],
  'openlattice.@id': ['260a0000-0000-0000-8000-00000001e2a1']},
 {'nc.SubjectIdentification': ['Child2'],
  'openlattice.@id': ['26140000-0000-0000-8000-00000001f19f']},
 {'nc.SubjectIdentification': ['Woman1'],
  'openlattice.@id': ['25fc0000-0000-0000-8000-00000001e3e2']}]

This dataset is small enough to pull all at once, but for the purposes of this exercise, let's filter the dataset on `Woman1` _without_ first loading everything. To do so, you will use search constraints, as shown in the code below.

Note: There is a way to do fuzzy searches, but unfortunately I haven't been able to figure it out yet. The exact search shown here is mostly useful for finding entities with a specific value (category) of a property.

In [32]:
propertytypeid = edmAPI.get_property_type_id(namespace = "nc", name = "SubjectIdentification")
condition = "Woman1"

search_str = f"entity.{propertytypeid}: '{condition}'"

constraint1 = openlattice.Constraint(
    type="simple",
    search_term=search_str,
    fuzzy=False
)

constraints = [constraint1]

constraintgroup = openlattice.ConstraintGroup(
    min = len(constraints),
    constraints = constraints
)


constraints = openlattice.SearchConstraints(
    entity_set_ids = [entitysetid],
    start = 0,
    max_hits = 10000,
    constraints = [constraintgroup]
)

results = searchAPI.search_entity_set_data(constraints)

results.hits

[{'nc.SubjectIdentification': ['Woman1'],
  'openlattice.@id': ['25fc0000-0000-0000-8000-00000001e3e2'],
  'openlattice.@lastWrite': ['2021-01-18T17:03:58.737172Z']}]

You can also search between certain dates. For example, one useful piece of metadata to filter on is the last write time. (Note: the following code will return the whole dataset, since I wrote all the entities at once.)

In [33]:
import dateutil.parser  # the parser submodule has to be imported explicitly

starttime = dateutil.parser.parse("2021-01-15T16:00:00-08:00")
endtime = dateutil.parser.parse("2021-01-19T16:00:00-08:00")

constraint2 = openlattice.Constraint(
    type = 'writeDateTimeFilter',
    start = starttime.isoformat(),
    end = endtime.isoformat())

constraints = [constraint2]

constraintgroup = openlattice.ConstraintGroup(
    min = len(constraints),
    constraints = constraints
)


constraints = openlattice.SearchConstraints(
    entity_set_ids = [entitysetid],
    start = 0,
    max_hits = 10000,
    constraints = [constraintgroup]
)

results = searchAPI.search_entity_set_data(constraints)

results.hits

[{'nc.SubjectIdentification': ['Woman1'],
  'openlattice.@id': ['25fc0000-0000-0000-8000-00000001e3e2'],
  'openlattice.@lastWrite': ['2021-01-18T17:03:58.737172Z']},
 {'nc.SubjectIdentification': ['Child1'],
  'openlattice.@id': ['24220000-0000-0000-8000-00000001f3aa'],
  'openlattice.@lastWrite': ['2021-01-18T17:03:58.738098Z']},
 {'nc.SubjectIdentification': ['Woman2'],
  'openlattice.@id': ['260a0000-0000-0000-8000-00000001e2a1'],
  'openlattice.@lastWrite': ['2021-01-18T17:03:58.732095Z']},
 {'nc.SubjectIdentification': ['Child2'],
  'openlattice.@id': ['26140000-0000-0000-8000-00000001f19f'],
  'openlattice.@lastWrite': ['2021-01-18T17:03:58.733039Z']}]
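
To double-check a date filter like the one above, the returned `openlattice.@lastWrite` values can be compared against the window on the client side. The following is only a minimal sketch: `parse_utc` and `in_window` are hypothetical helper names (not part of `openlattice` or `olpy`), the two hits are copied from the output above, and it assumes each hit carries exactly one last-write timestamp.

```python
from datetime import datetime


def parse_utc(timestamp):
    """Parse an ISO 8601 string; a trailing 'Z' is rewritten so that
    datetime.fromisoformat also accepts it on Python versions before 3.11."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def in_window(hit, start, end):
    """Return True if the hit's last write time lies within [start, end]."""
    written = parse_utc(hit["openlattice.@lastWrite"][0])
    return start <= written <= end


# Two of the hits shown in the output above.
hits = [
    {"nc.SubjectIdentification": ["Woman1"],
     "openlattice.@id": ["25fc0000-0000-0000-8000-00000001e3e2"],
     "openlattice.@lastWrite": ["2021-01-18T17:03:58.737172Z"]},
    {"nc.SubjectIdentification": ["Child1"],
     "openlattice.@id": ["24220000-0000-0000-8000-00000001f3aa"],
     "openlattice.@lastWrite": ["2021-01-18T17:03:58.738098Z"]},
]

# The same window that the search above uses.
window_start = datetime.fromisoformat("2021-01-15T16:00:00-08:00")
window_end = datetime.fromisoformat("2021-01-19T16:00:00-08:00")

all_in_window = all(in_window(hit, window_start, window_end) for hit in hits)
print(all_in_window)
```

Because both the window and the timestamps are timezone-aware, the comparison works even though the window is given in UTC-8 and the write times in UTC.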

You can apply both constraints at once. With `min` set to the number of constraints, a hit has to satisfy all of them:

In [34]:
constraints = [constraint1, constraint2]

constraintgroup = openlattice.ConstraintGroup(
    min = len(constraints),
    constraints = constraints
)


constraints = openlattice.SearchConstraints(
    entity_set_ids = [entitysetid],
    start = 0,
    max_hits = 10000,
    constraints = [constraintgroup]
)

results = searchAPI.search_entity_set_data(constraints)

results.hits

[{'nc.SubjectIdentification': ['Woman1'],
  'openlattice.@id': ['25fc0000-0000-0000-8000-00000001e3e2'],
  'openlattice.@lastWrite': ['2021-01-18T17:03:58.737172Z']}]

Note: our API can only handle 10,000 hits at a time. To search through more, you can run the search repeatedly in a while loop. Our [Chronicle job](https://github.com/openlattice/chronicle-processor/blob/master/utils/chronicle_processor.py#L10) searches via a while loop over dates. Alternatively, if the dataset is too large for that, [assembled views](https://help.openlattice.com/article/95-integrated-data-viewing-and-joining-tables) may be a better method.
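
A loop over dates could look roughly like the sketch below. It is not the Chronicle implementation: `date_windows` and `search_by_window` are hypothetical helpers, and `search_window` stands for any function you write that runs one date-filtered search (for example with a `writeDateTimeFilter` constraint, as in the cell above) and returns its list of hits. The sketch assumes each window is narrow enough to stay under the hit limit, and it de-duplicates on `openlattice.@id` in case a record on a window boundary is returned twice.

```python
from datetime import datetime, timedelta, timezone


def date_windows(start, end, step):
    """Yield consecutive (window_start, window_end) pairs covering start..end."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


def search_by_window(search_window, start, end, step):
    """Call search_window(window_start, window_end) for every window and merge the hits.

    search_window is supplied by the caller and must return a list of hits
    shaped like the dictionaries in results.hits.
    """
    seen = {}
    for window_start, window_end in date_windows(start, end, step):
        for hit in search_window(window_start, window_end):
            entity_id = hit["openlattice.@id"][0]
            # Keep the first copy of each entity; later duplicates are ignored.
            seen.setdefault(entity_id, hit)
    return list(seen.values())


# Example: split four days into one-day windows.
windows = list(date_windows(
    datetime(2021, 1, 15, tzinfo=timezone.utc),
    datetime(2021, 1, 19, tzinfo=timezone.utc),
    timedelta(days=1),
))
print(len(windows))
```

In practice, `search_window` would build the `SearchConstraints` for one window, call `searchAPI.search_entity_set_data`, and return `results.hits`. If a single window still comes back with 10,000 hits, a smaller `step` is probably needed.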