The AppSearch client can either be configured directly:
# Use the AppSearch client directly:
from elastic_enterprise_search import AppSearch

app_search = AppSearch(
    "http://localhost:3002",
    http_auth="private-..."
)

# Now call API methods
app_search.search(...)
…or can be used via a configured EnterpriseSearch.app_search instance:
from elastic_enterprise_search import EnterpriseSearch

ent_search = EnterpriseSearch("http://localhost:3002")

# Configure authentication of the AppSearch instance
ent_search.app_search.http_auth = "private-..."

# Now call API methods
ent_search.app_search.search(...)
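If you authenticate with an Enterprise Search user instead of an App Search API key, http_auth also accepts a (username, password) tuple; a minimal sketch (the credentials are placeholders):

from elastic_enterprise_search import AppSearch

# Basic authentication with a username/password pair
# (placeholder credentials):
app_search = AppSearch(
    "http://localhost:3002",
    http_auth=("enterprise_search", "<password>")
)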
Engines index documents and perform search functions. To use App Search you must first create an Engine.
Let’s create an Engine named national-parks that uses English ("en") as its language:
# Request:
app_search.create_engine(
    engine_name="national-parks",
    language="en",
)

# Response:
{
  "name": "national-parks",
  "type": "default",
  "language": "en"
}
Once we’ve created an Engine we can look at it:
# Request:
app_search.get_engine(engine_name="national-parks")

# Response:
{
  "document_count": 0,
  "language": "en",
  "name": "national-parks",
  "type": "default"
}
We can see all our Engines in the App Search instance:
# Request:
app_search.list_engines()

# Response:
{
  "meta": {
    "page": {
      "current": 1,
      "size": 25,
      "total_pages": 1,
      "total_results": 1
    }
  },
  "results": [
    {
      "document_count": 0,
      "language": "en",
      "name": "national-parks",
      "type": "default"
    }
  ]
}
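Listing endpoints are paginated, as the meta.page object above shows. Assuming the client’s current_page and page_size parameters, a sketch:

# Fetch the second page of Engines, 10 per page.
# The current_page / page_size parameter names are assumptions here.
app_search.list_engines(current_page=2, page_size=10)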
Once you’ve created an Engine you can start adding documents with the index_documents() method:
# Request:
app_search.index_documents(
    engine_name="national-parks",
    documents=[{
        "id": "park_rocky-mountain",
        "title": "Rocky Mountain",
        "nps_link": "https://www.nps.gov/romo/index.htm",
        "states": ["Colorado"],
        "visitors": 4517585,
        "world_heritage_site": False,
        "location": "40.4,-105.58",
        "acres": 265795.2,
        "date_established": "1915-01-26T06:00:00Z"
    }, {
        "id": "park_saguaro",
        "title": "Saguaro",
        "nps_link": "https://www.nps.gov/sagu/index.htm",
        "states": ["Arizona"],
        "visitors": 820426,
        "world_heritage_site": False,
        "location": "32.25,-110.5",
        "acres": 91715.72,
        "date_established": "1994-10-14T05:00:00Z"
    }]
)

# Response:
[
  {
    "errors": [],
    "id": "park_rocky-mountain"
  },
  {
    "errors": [],
    "id": "park_saguaro"
  }
]
Both of our new documents indexed without errors.
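Rather than eyeballing the response, ingestion code can check each entry’s errors list programmatically; a minimal sketch over the response shape shown above (documents stands in for the list from the request):

resp = app_search.index_documents(
    engine_name="national-parks",
    documents=documents  # hypothetical: the document list from above
)

# Each response entry carries the document id and any indexing errors
failed = [entry for entry in resp if entry["errors"]]
if failed:
    print("Some documents failed to index:", failed)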
Now we can look at our indexed documents in the engine:
# Request:
app_search.list_documents(engine_name="national-parks")

# Response:
{
  "meta": {
    "page": {
      "current": 1,
      "size": 100,
      "total_pages": 1,
      "total_results": 2
    }
  },
  "results": [
    {
      "acres": "91715.72",
      "date_established": "1994-10-14T05:00:00Z",
      "id": "park_saguaro",
      "location": "32.25,-110.5",
      "nps_link": "https://www.nps.gov/sagu/index.htm",
      "states": ["Arizona"],
      "title": "Saguaro",
      "visitors": "820426",
      "world_heritage_site": "false"
    },
    {
      "acres": "265795.2",
      "date_established": "1915-01-26T06:00:00Z",
      "id": "park_rocky-mountain",
      "location": "40.4,-105.58",
      "nps_link": "https://www.nps.gov/romo/index.htm",
      "states": ["Colorado"],
      "title": "Rocky Mountain",
      "visitors": "4517585",
      "world_heritage_site": "false"
    }
  ]
}
You can also retrieve a set of documents by their id with the get_documents() method:
# Request:
app_search.get_documents(
    engine_name="national-parks",
    document_ids=["park_rocky-mountain"]
)

# Response:
[
  {
    "acres": "265795.2",
    "date_established": "1915-01-26T06:00:00Z",
    "id": "park_rocky-mountain",
    "location": "40.4,-105.58",
    "nps_link": "https://www.nps.gov/romo/index.htm",
    "states": ["Colorado"],
    "title": "Rocky Mountain",
    "visitors": "4517585",
    "world_heritage_site": "false"
  }
]
You can update documents with the put_documents() method:
# Request:
resp = app_search.put_documents(
    engine_name="national-parks",
    documents=[{
        "id": "park_rocky-mountain",
        "visitors": 10000000
    }]
)

# Response:
[
  {
    "errors": [],
    "id": "park_rocky-mountain"
  }
]
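There is also a delete_documents() method for removing documents by id; a sketch (not run here, since the rest of the walkthrough uses both documents):

# Delete documents by id; the response mirrors the index/put shape
app_search.delete_documents(
    engine_name="national-parks",
    document_ids=["park_saguaro"]
)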
Now that we’ve indexed some data we should take a look at the way the data is being indexed by our Engine.
First take a look at the existing Schema inferred from our data:
# Request:
resp = app_search.get_schema(
    engine_name="national-parks"
)

# Response:
{
  "acres": "text",
  "date_established": "text",
  "location": "text",
  "nps_link": "text",
  "states": "text",
  "title": "text",
  "visitors": "text",
  "world_heritage_site": "text"
}
It looks like the date_established field wasn’t inferred as a date as desired, and acres, location, and visitors were likewise inferred as text. Update the field types with put_schema():
# Request:
resp = app_search.put_schema(
    engine_name="national-parks",
    schema={
        "acres": "number",
        "date_established": "date",
        "location": "geolocation",
        "visitors": "number"
    }
)

# Response:
{
  "acres": "number",
  "date_established": "date",  # Type has been updated!
  "location": "geolocation",
  "nps_link": "text",
  "states": "text",
  "title": "text",
  "visitors": "number",
  "world_heritage_site": "text"
}
Once documents are ingested and the Schema is set properly you can use the search() method to search an Engine for matching documents. The Search API has many options; see the Search API documentation for the full list.
# Request:
resp = app_search.search(
    engine_name="national-parks",
    body={
        "query": "rock"
    }
)

# Response:
{
  "meta": {
    "alerts": [],
    "engine": {
      "name": "national-parks",
      "type": "default"
    },
    "page": {
      "current": 1,
      "size": 10,
      "total_pages": 1,
      "total_results": 1
    },
    "request_id": "6266df8b-8b19-4ff0-b1ca-3877d867eb7d",
    "warnings": []
  },
  "results": [
    {
      "_meta": {
        "engine": "national-parks",
        "id": "park_rocky-mountain",
        "score": 6776379.0
      },
      "acres": {
        "raw": 265795.2
      },
      "date_established": {
        "raw": "1915-01-26T06:00:00+00:00"
      },
      "id": {
        "raw": "park_rocky-mountain"
      },
      "location": {
        "raw": "40.4,-105.58"
      },
      "nps_link": {
        "raw": "https://www.nps.gov/romo/index.htm"
      },
      "states": {
        "raw": ["Colorado"]
      },
      "title": {
        "raw": "Rocky Mountain"
      },
      "visitors": {
        "raw": 10000000.0
      },
      "world_heritage_site": {
        "raw": "false"
      }
    }
  ]
}
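As an illustration, the request body can combine a query with standard Search API options such as filters, sort, and page; a sketch (the values are examples):

# Query with a filter, an explicit sort on the now-typed date field,
# and pagination. filters/sort/page are standard Search API options.
resp = app_search.search(
    engine_name="national-parks",
    body={
        "query": "park",
        "filters": {"states": "Colorado"},
        "sort": [{"date_established": "desc"}],
        "page": {"size": 5, "current": 1}
    }
)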
Multiple searches can be executed at the same time with the multi_search() method:
# Request:
resp = app_search.multi_search(
    engine_name="national-parks",
    body={
        "queries": [
            {"query": "rock"},
            {"query": "lake"}
        ]
    }
)

# Response:
[
  {
    "meta": {
      "alerts": [],
      "engine": {
        "name": "national-parks",
        "type": "default"
      },
      "page": {
        "current": 1,
        "size": 10,
        "total_pages": 1,
        "total_results": 1
      },
      "warnings": []
    },
    "results": [
      {
        "_meta": {
          "engine": "national-parks",
          "id": "park_rocky-mountain",
          "score": 6776379.0
        },
        "acres": {
          "raw": 265795.2
        },
        "date_established": {
          "raw": "1915-01-26T06:00:00+00:00"
        },
        "id": {
          "raw": "park_rocky-mountain"
        },
        "location": {
          "raw": "40.4,-105.58"
        },
        "nps_link": {
          "raw": "https://www.nps.gov/romo/index.htm"
        },
        "states": {
          "raw": ["Colorado"]
        },
        "title": {
          "raw": "Rocky Mountain"
        },
        "visitors": {
          "raw": 10000000.0
        },
        "world_heritage_site": {
          "raw": "false"
        }
      }
    ]
  },
  ...
]
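Each entry in the response list corresponds to one query, in request order; a short sketch:

resp = app_search.multi_search(
    engine_name="national-parks",
    body={"queries": [{"query": "rock"}, {"query": "lake"}]}
)

# One result page per query, in the same order as the request
for query, page in zip(["rock", "lake"], resp):
    print(query, page["meta"]["page"]["total_results"])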
Curations hide or promote result content for pre-defined search queries.
# Request:
resp = app_search.create_curation(
    engine_name="national-parks",
    queries=["rocks", "rock", "hills"],
    promoted_doc_ids=["park_rocky-mountain"],
    hidden_doc_ids=["park_saguaro"]
)

# Response:
{
  "id": "cur-6011f5b57cef06e6c883814a"
}
# Request:
resp = app_search.get_curation(
    engine_name="national-parks",
    curation_id="cur-6011f5b57cef06e6c883814a"
)

# Response:
{
  "hidden": [
    "park_saguaro"
  ],
  "id": "cur-6011f5b57cef06e6c883814a",
  "promoted": [
    "park_rocky-mountain"
  ],
  "queries": [
    "rocks",
    "rock",
    "hills"
  ]
}
# Request:
app_search.put_curation(
    engine_name="national-parks",
    curation_id="cur-6011f5b57cef06e6c883814a",
    queries=["foo", "bar"],
    promoted_doc_ids=["doc-1", "doc-2"],
    hidden_doc_ids=["doc-3"]
)

# Response:
{
  "id": "cur-6011f5b57cef06e6c883814a"
}
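A curation can be removed with the corresponding delete_curation() method; a sketch:

# Remove the curation by its id
app_search.delete_curation(
    engine_name="national-parks",
    curation_id="cur-6011f5b57cef06e6c883814a"
)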
A Meta Engine is an Engine that has no documents of its own; instead it combines multiple other Engines so that they can be searched together as if they were a single Engine.
The Engines that comprise a Meta Engine are referred to as "Source Engines".
To create a Meta Engine, use the create_engine() method and set the type parameter to "meta":
# Request:
app_search.create_engine(
    engine_name="meta-engine",
    type="meta",
    source_engines=["national-parks"]
)

# Response:
{
  "document_count": 1,
  "name": "meta-engine",
  "source_engines": [
    "national-parks"
  ],
  "type": "meta"
}
# Request:
app_search.search(
    engine_name="meta-engine",
    body={
        "query": "rock"
    }
)

# Response:
{
  "meta": {
    "alerts": [],
    "engine": {
      "name": "meta-engine",
      "type": "meta"
    },
    "page": {
      "current": 1,
      "size": 10,
      "total_pages": 1,
      "total_results": 1
    },
    "request_id": "aef3d3d3-331c-4dab-8e77-f42e4f46789c",
    "warnings": []
  },
  "results": [
    {
      "_meta": {
        "engine": "national-parks",
        "id": "park_black-canyon-of-the-gunnison",
        "score": 2.43862
      },
      "id": {
        "raw": "national-parks|park_black-canyon-of-the-gunnison"
      },
      "nps_link": {
        "raw": "https://www.nps.gov/blca/index.htm"
      },
      "square_km": {
        "raw": 124.4
      },
      "states": {
        "raw": ["Colorado"]
      },
      "title": {
        "raw": "Black Canyon of the Gunnison"
      },
      "world_heritage_site": {
        "raw": "false"
      }
    }
  ]
}
Notice how the id of the result we receive (national-parks|park_black-canyon-of-the-gunnison) is prefixed with the name of the Source Engine it came from. The prefix distinguishes results that have the same id but come from different Source Engines within a single search result.
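If you need the plain document id back, the prefix splits off cleanly; a sketch over the id format shown above:

raw_id = "national-parks|park_black-canyon-of-the-gunnison"

# Split "<source-engine>|<document-id>" into its two parts
source_engine, _, doc_id = raw_id.partition("|")
print(source_engine)  # national-parks
print(doc_id)         # park_black-canyon-of-the-gunnison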
If we have an existing Meta Engine named meta-engine we can add additional Source Engines to it with the add_meta_engine_source() method. Here we add the state-parks Engine:
# Request:
app_search.add_meta_engine_source(
    engine_name="meta-engine",
    source_engines=["state-parks"]
)

# Response:
{
  "document_count": 1,
  "name": "meta-engine",
  "source_engines": [
    "national-parks",
    "state-parks"
  ],
  "type": "meta"
}
If we change our mind about state-parks being a Source Engine for meta-engine we can use the delete_meta_engine_source() method:
# Request:
app_search.delete_meta_engine_source(
    engine_name="meta-engine",
    source_engines=["state-parks"]
)

# Response:
{
  "document_count": 1,
  "name": "meta-engine",
  "source_engines": [
    "national-parks"
  ],
  "type": "meta"
}
The Web Crawler APIs manage an Engine’s crawler configuration: domains, entry points, crawl rules, sitemaps, and crawls. The examples below assume a crawler-enabled Engine named crawler-engine.

# Create a domain
resp = app_search.create_crawler_domain(
    engine_name="crawler-engine",
    body={
        "name": "https://example.com"
    }
)
domain_id = resp["id"]

# Get a domain
app_search.get_crawler_domain(
    engine_name="crawler-engine",
    domain_id=domain_id
)

# Update a domain
app_search.put_crawler_domain(
    engine_name="crawler-engine",
    domain_id=domain_id,
    body={
        ...
    }
)

# Delete a domain
app_search.delete_crawler_domain(
    engine_name="crawler-engine",
    domain_id=domain_id
)
# Validate a domain
app_search.get_crawler_domain_validation_result(
    body={
        "url": "https://example.com",
        "checks": [
            "dns",
            "robots_txt",
            "tcp",
            "url",
            "url_content",
            "url_request"
        ]
    }
)

# Extract content from a URL
app_search.get_crawler_url_extraction_result(
    engine_name="crawler-engine",
    body={
        "url": "https://example.com"
    }
)

# Trace a URL
app_search.get_crawler_url_tracing_result(
    engine_name="crawler-engine",
    body={
        "url": "https://example.com"
    }
)

# Get the active crawl
app_search.get_crawler_active_crawl_request(
    engine_name="crawler-engine"
)

# Start a crawl
app_search.create_crawler_crawl_request(
    engine_name="crawler-engine"
)

# Cancel the active crawl
app_search.delete_crawler_active_crawl_request(
    engine_name="crawler-engine"
)
# Create an entry point
resp = app_search.create_crawler_entry_point(
    engine_name="crawler-engine",
    body={
        "value": "/blog"
    }
)
entry_point_id = resp["id"]

# Delete an entry point
app_search.delete_crawler_entry_point(
    engine_name="crawler-engine",
    entry_point_id=entry_point_id
)

# Create a crawl rule
resp = app_search.create_crawler_crawl_rule(
    engine_name="crawler-engine",
    domain_id=domain_id,
    body={
        "policy": "deny",
        "rule": "ends",
        "pattern": "/dont-crawl"
    }
)
crawl_rule_id = resp["id"]

# Delete a crawl rule
app_search.delete_crawler_crawl_rule(
    engine_name="crawler-engine",
    domain_id=domain_id,
    crawl_rule_id=crawl_rule_id
)

# Create a sitemap
resp = app_search.create_crawler_sitemap(
    engine_name="crawler-engine",
    domain_id=domain_id,
    url="https://example.com/sitemap.xml"
)
sitemap_id = resp["id"]

# Delete a sitemap
app_search.delete_crawler_sitemap(
    engine_name="crawler-engine",
    domain_id=domain_id,
    sitemap_id=sitemap_id
)
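Putting a few of these calls together, a typical crawler bootstrap registers a domain, adds an entry point, and starts a crawl; a sketch reusing the calls above exactly as shown:

# Hypothetical bootstrap for a crawler-enabled Engine
resp = app_search.create_crawler_domain(
    engine_name="crawler-engine",
    body={"name": "https://example.com"}
)
domain_id = resp["id"]

app_search.create_crawler_entry_point(
    engine_name="crawler-engine",
    body={"value": "/blog"}
)

app_search.create_crawler_crawl_request(engine_name="crawler-engine")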
The Adaptive Relevance APIs read and update an Engine’s adaptive relevance settings and curation suggestions.

# Get adaptive relevance settings for an Engine
app_search.get_adaptive_relevance_settings(
    engine_name="adaptive-engine"
)

# Response:
{
  "curation": {
    "enabled": True,
    "mode": "manual",
    "timeframe": 7,
    "max_size": 3,
    "min_clicks": 20,
    "schedule_frequency": 1,
    "schedule_unit": "day"
  }
}
# Enable automatic adaptive relevance
app_search.put_adaptive_relevance_settings(
    engine_name="adaptive-engine",
    curation={
        "mode": "automatic"
    }
)

# List all adaptive relevance suggestions for an engine
app_search.list_adaptive_relevance_suggestions(
    engine_name="adaptive-engine"
)
# Response:
{
  "meta": {
    "page": {
      "current": 1,
      "total_pages": 1,
      "total_results": 2,
      "size": 25
    }
  },
  "results": [
    {
      "query": "forest",
      "type": "curation",
      "status": "pending",
      "updated_at": "2021-09-02T07:22:23Z",
      "created_at": "2021-09-02T07:22:23Z",
      "promoted": [
        "park_everglades",
        "park_american-samoa",
        "park_arches"
      ],
      "operation": "create"
    },
    {
      "query": "park",
      "type": "curation",
      "status": "pending",
      "updated_at": "2021-10-22T07:34:12Z",
      "created_at": "2021-10-22T07:34:54Z",
      "promoted": [
        "park_yellowstone"
      ],
      "operation": "create",
      "override_manual_curation": True
    }
  ]
}
# Get adaptive relevance suggestions for a query
app_search.get_adaptive_relevance_suggestions(
    engine_name="adaptive-engine",
    query="forest"
)

# Response:
{
  "meta": {
    "page": {
      "current": 1,
      "total_pages": 1,
      "total_results": 1,
      "size": 25
    }
  },
  "results": [
    {
      "query": "forest",
      "type": "curation",
      "status": "pending",
      "updated_at": "2021-09-02T07:22:23Z",
      "created_at": "2021-09-02T07:22:23Z",
      "promoted": [
        "park_everglades",
        "park_american-samoa",
        "park_arches"
      ],
      "operation": "create"
    }
  ]
}
# Update status of adaptive relevance suggestions
app_search.put_adaptive_relevance_suggestions(
    engine_name="adaptive-engine",
    suggestions=[
        {"query": "forest", "type": "curation", "status": "applied"},
        {"query": "mountain", "type": "curation", "status": "rejected"}
    ]
)
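Combining the list and update calls, you could apply every pending curation suggestion in one pass; a sketch over the response shape shown above:

resp = app_search.list_adaptive_relevance_suggestions(
    engine_name="adaptive-engine"
)

# Promote every pending curation suggestion to "applied"
pending = [
    {"query": s["query"], "type": s["type"], "status": "applied"}
    for s in resp["results"]
    if s["status"] == "pending"
]
if pending:
    app_search.put_adaptive_relevance_suggestions(
        engine_name="adaptive-engine",
        suggestions=pending
    )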