# Elasticsearch REST API

## What is REST?
Roy Fielding introduced REST in his doctoral dissertation in 2000 as an architectural style for distributed hypermedia systems. It later became popular as a lightweight alternative to SOAP-based web services.

A Web API (or Web Service) conforming to the REST architectural style is a REST API.

REST APIs are very popular these days and have largely replaced SOAP-based web services on the web.

Unlike SOAP-based web services, there is no "official" standard for RESTful web APIs. 

While SOAP used XML for messages, REST generally uses JSON.

JSON (JavaScript Object Notation) is an open standard file format derived from the JavaScript language (the ubiquitous web programming language originally developed at Netscape in 1995). Its syntax closely resembles that of a Python dict literal, with `true`, `false` and `null` in place of `True`, `False` and `None`.

This is an example of a JSON object:

```json
{
    "name": "Ye",
    "surname": "Wenjie",
    "children": ["Yang", "Dong"],
    "age": 85,
    "height": 175.5,
    "married": true
}
```
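As a small illustration of the similarity with a Python dict, the sketch below parses the object above with Python's standard `json` module. The variable names are only illustrative.

```python
import json

# The JSON object from the text, as it would travel in an HTTP body.
raw = """
{
    "name": "Ye",
    "surname": "Wenjie",
    "children": ["Yang", "Dong"],
    "age": 85,
    "height": 175.5,
    "married": true
}
"""

person = json.loads(raw)          # JSON text -> Python dict
print(type(person).__name__)      # dict
print(person["children"][0])      # Yang
print(person["married"])          # True (JSON true becomes Python True)

# Serializing back turns Python's True into JSON's true again.
print(json.dumps({"married": person["married"]}))  # {"married": true}
```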

In RESTful APIs we have resources and we can manipulate those resources using CRUD operations.

**Resources** are REST's main abstraction. A resource is essentially a name for a class of objects: we can think of it as a collection of objects of the same type that we can manipulate.

Resources are identified by URIs (in practice, URLs).

**CRUD** operations:
- Create
- Read
- Update
- Delete

REST is **stateless**: no session information is retained by the server.

Semantics: REST implements operations using the HTTP methods:
- GET: read
- POST: create
- PUT: update
- DELETE: delete

GET, PUT and DELETE are idempotent: repeating the same request leaves the resource in the same state. POST is not.
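To make the idea of idempotency concrete, here is a minimal sketch using a toy in-memory collection. The `MovieStore` class and its method names are hypothetical and only mimic the semantics of the HTTP methods; no real HTTP is involved.

```python
import itertools


class MovieStore:
    """Toy in-memory resource collection (hypothetical, for illustration)."""

    def __init__(self):
        self.items = {}
        self._ids = itertools.count(1)

    def post(self, doc):
        # POST: the server picks a new id on every call, so it is not idempotent.
        new_id = str(next(self._ids))
        self.items[new_id] = dict(doc)
        return new_id

    def put(self, item_id, doc):
        # PUT: the client names the resource, so repeating the call is harmless.
        self.items[item_id] = dict(doc)

    def delete(self, item_id):
        # DELETE: deleting something that is already gone leaves the same state.
        self.items.pop(item_id, None)


store = MovieStore()
store.put("a", {"title": "The Arrival"})
store.put("a", {"title": "The Arrival"})
print(len(store.items))   # 1

store.post({"title": "The Arrival"})
store.post({"title": "The Arrival"})
print(len(store.items))   # 3
```

Repeating the PUT leaves a single resource, while each POST adds a new one.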

The response and error codes are also based on the standard HTTP response status codes:
- 200 OK
- 201 Created
- 400 Bad Request
- 401 Unauthorized
- 404 Not Found

Using HTTP for communication allows REST APIs to be accessed all over the internet.


References:
- [What is REST](https://restfulapi.net/)
- [Roy Fielding's dissertation](https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm)
- [HTTP response status codes](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status)



## Elasticsearch REST API
OpenSearch is a fork of Elasticsearch 7.10.2, so the Elasticsearch 7.10 REST API reference is still a useful guide:

- [REST API Reference ElasticSearch 7.10](https://www.elastic.co/guide/en/elasticsearch/reference/7.10/rest-apis.html)

```
curl --insecure --user admin:admin 'https://10.38.28.237:9200/twitter-cesga/_search?pretty'
http --verify no --auth admin:admin 'https://opensearch-2:9200/_cat/health?v&pretty'
```

### Inserting data
```
PUT /:index/_doc/:id
{
    "genre": ["IMAX", "Sci-Fi"],
    "title": "The Arrival",
    "year": 2013
}
```

Let's insert a document into an index called `testing` that does not exist yet, assigning it the `id` 1:

In [28]:
%%bash

# First we will delete the testing index if it exists
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X DELETE \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing"

# And now we insert the data using automatic index creation and dynamic mapping
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X POST -H "Content-Type: application/json" \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing/_doc/1" -d '
{
    "genre": ["IMAX", "Sci-Fi"],
    "title": "The Arrival",
    "year": 2013
}'

{"acknowledged":true}{"_index":"testing","_type":"_doc","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}

### A note on mappings
Defining the schema of a document is referred to as **mapping**: the mappings determine how a document and its fields are stored and indexed in Elasticsearch.

When, as above, we insert a document into an index that has not been created beforehand and therefore has no predefined mappings, Elasticsearch automatically creates the mappings for us using a feature called **dynamic mapping**.

Elasticsearch will do its best to infer the type of each field from its value.

If we prefer, instead of dynamic mapping we can use **explicit mapping** and define in advance how our data will be represented.
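The following is a rough sketch of how type inference from values could look. It is a toy approximation written for this tutorial, not Elasticsearch's actual implementation: it ignores date detection, numeric detection in strings and dynamic templates, and the function names are hypothetical.

```python
def infer_field_mapping(value):
    """Rough imitation of dynamic mapping for one JSON value (toy sketch)."""
    if isinstance(value, list):
        # Arrays have no type of their own: the first non-null element decides.
        for item in value:
            if item is not None:
                return infer_field_mapping(item)
        return None
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, str):
        return {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        }
    if isinstance(value, dict):
        return {"properties": infer_mappings(value)}
    return None  # null values add no mapping


def infer_mappings(document):
    """Infer a mapping for every field of a document."""
    mappings = {}
    for name, value in document.items():
        mapping = infer_field_mapping(value)
        if mapping is not None:
            mappings[name] = mapping
    return mappings


doc = {"genre": ["IMAX", "Sci-Fi"], "title": "The Arrival", "year": 2013}
for field, mapping in infer_mappings(doc).items():
    print(field, "->", mapping.get("type", "object"))
```

For the sample document this prints `text` for `genre` and `title` and `long` for `year`.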

### Index creation
```
PUT /:index
```
- Shards & Replicas
```json
    "settings" : {
        "index" : {
            "number_of_shards" : "3",
            "number_of_replicas" : "2"
        }
    }
```
- Mappings: tell Elasticsearch that we want the `year` field mapped as type `date` (dynamic mapping would infer `long` from a numeric value)
```
PUT /:index
{
    "mappings": {
        "properties": {
            "year": {"type": "date"}
        }
    }
}
```
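The settings and mappings fragments above go together in a single request body. As a sketch, the hypothetical helper `build_index_body` below assembles that body as a Python dict and serializes it to JSON. It passes the shard and replica counts as integers, assuming both integers and numeric strings are accepted for these settings.

```python
import json


def build_index_body(shards, replicas, field_types):
    """Assemble the JSON body for `PUT /:index` (hypothetical helper)."""
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        },
        "mappings": {
            "properties": {
                name: {"type": field_type}
                for name, field_type in field_types.items()
            }
        },
    }


body = build_index_body(3, 2, {"year": "date"})
print(json.dumps(body, indent=4))
```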

Let's see the settings and mappings that were automatically created in our `testing` index:

In [29]:
%%bash

curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X GET \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing?pretty"

{
  "testing" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "genre" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "title" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "year" : {
          "type" : "long"
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1665594745824",
        "number_of_shards" : "1",
        "number_of_replicas" : "1",
        "uuid" : "n4NXZ95yRrW2W_zT1e6G9Q",
        "version" : {
          "created" : "135248227"
        },
        "provided_name" : "testing"
      }
    }
  }
}


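We see that the `year` field has been assigned the `long` type, but we know that it actually holds a date, so we can fine-tune it by using explicit mappings instead of dynamic mappings. Keep in mind that, with the default `date` format, a bare number such as `2013` is interpreted as milliseconds since the epoch rather than as a year, so a real index would probably also need an explicit `format` for this field.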

Let's now delete the `testing` index and recreate it using the appropriate settings and mappings:

In [30]:
%%bash

curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X DELETE \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing"

{"acknowledged":true}

And now we can create it with the desired settings (3 primary shards, 2 replicas) and mappings (year as date):

In [31]:
%%bash

curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X PUT -H "Content-Type: application/json" \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing" -d '
{
    "settings" : {
        "index" : {
            "number_of_shards" : "3",
            "number_of_replicas" : "2"
        }
    },
    "mappings": {
        "properties": {
            "year": {"type": "date"}
        }
    }
}'

{"acknowledged":true,"shards_acknowledged":true,"index":"testing"}

In [34]:
%%bash

curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X GET \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing?pretty"

{
  "testing" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "year" : {
          "type" : "date"
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1665594765615",
        "number_of_shards" : "3",
        "number_of_replicas" : "2",
        "uuid" : "FItyDRNBTvehHHGTclgifA",
        "version" : {
          "created" : "135248227"
        },
        "provided_name" : "testing"
      }
    }
  }
}


As we can see, only the mapping for the `year` field is there. The other mappings will be generated automatically by Elasticsearch when documents are added.

So, let's insert our document again and see how the other mappings are created:

In [35]:
%%bash

curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X POST -H "Content-Type: application/json" \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing/_doc/1" -d '
{
    "genre": ["IMAX", "Sci-Fi"],
    "title": "The Arrival",
    "year": 2013
}'

{"_index":"testing","_type":"_doc","_id":"1","_version":1,"result":"created","_shards":{"total":3,"successful":3,"failed":0},"_seq_no":0,"_primary_term":1}

And now we can see all the mappings generated for the other fields:

In [36]:
%%bash

curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X GET \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing?pretty"

{
  "testing" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "genre" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "title" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "year" : {
          "type" : "date"
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1665594765615",
        "number_of_shards" : "3",
        "number_of_replicas" : "2",
        "uuid" : "FItyDRNBTvehHHGTclgifA",
        "version" : {
          "created" : "135248227"
        },
        "provided_name" : "testing"
      }
    }
  }
}


## Mapping: Field data types
Common data types that we can use when defining mappings:
- **text**: the traditional field type for full-text content such as the body of an email or the description of a product. Used for **full-text search**; it replaces the old **analyzed** string fields.
- **keyword**: used for structured content such as IDs, email addresses, hostnames, status codes, zip codes, or tags. Used for **verbatim** (exact-match) search; it replaces the old **not_analyzed** string fields.
- **date**: JSON doesn’t have a date data type, so dates in Elasticsearch can either be:
  - strings containing formatted dates, e.g. "2015-01-01" or "2015/01/01 12:10:30"
  - a number representing milliseconds-since-the-epoch
  - a number representing seconds-since-the-epoch (this requires configuring the `epoch_second` format)
- Numeric field types: **long**, integer, **double**, float, ...
- binary: binary value encoded as a Base64 string
- boolean: true and false values
- object: a JSON object
- ip: IPv4 and IPv6 addresses
- flattened:  An entire JSON object as a single field value. (not available in OpenSearch)
- Range types: long_range, double_range, date_range, ip_range
- geo_point: Latitude and longitude points
- geo_shape: Complex shapes, such as polygons
- Arrays
- Multi-fields
- ...

You can check all the available field data types in:

- [Field data types](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html)

## Controlling full-text search
When we want exact matches we have to use the **keyword** type instead of **text**. This prevents Elasticsearch from analyzing the text in the field (previously this was done with the **not_analyzed** option), so the value is indexed verbatim and queries have to match it exactly.

For **text** fields, Elasticsearch passes the text through the configured **analyzer** (character filters, a tokenizer and token filters) before indexing it. This is what makes full-text search on the field possible.
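The difference between the two types can be sketched with a toy matcher. The `analyze` function below is a crude stand-in for a real analyzer (it only splits on non-word characters and lowercases), and in this sketch a text hit simply means that at least one query term is present. All names are hypothetical.

```python
import re


def analyze(text):
    # Crude stand-in for an analyzer: split on non-word characters, lowercase.
    return [token.lower() for token in re.findall(r"\w+", text)]


def matches_text(field_value, query):
    # text field: both the stored value and the query are analyzed into terms.
    terms = set(analyze(field_value))
    return any(term in terms for term in analyze(query))


def matches_keyword(field_value, query):
    # keyword field: the whole value is a single term, compared verbatim.
    return field_value == query


title = "The Arrival"
print(matches_text(title, "arrival"))         # True
print(matches_keyword(title, "arrival"))      # False
print(matches_keyword(title, "The Arrival"))  # True
```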

## Preventing mapping explosion
There is a situation we should avoid called **mapping explosion**: it happens when the number of fields in an index grows so large that it causes out-of-memory errors and other problems.

This can happen, for example, when using dynamic mapping and every new document introduces new fields, which causes Elasticsearch to create a new mapping for each of them.

For example, imagine a document like the following, generated by Filebeat from the logs of a server (more on that later):
```json
{
  "@timestamp" : "2022-10-11T23:03:18.494Z",
  "input" : {
    "type" : "log"
  },
  "agent" : {
    "version" : "7.12.1",
    "hostname" : "c27-32",
    "ephemeral_id" : "0cd90f91-0c99-4346-a747-0516d3e2613f",
    "id" : "dfdee6c4-68f4-41b5-94a1-b0fe87c72209",
    "name" : "c27-32",
    "type" : "filebeat"
  },
  "message" : """2022-10-12 01:03:18.283 23 INFO neutron.wsgi [req-114a9dd4-be1b-49d3-b85a-4c2de4138e0d 1e16a103f9f44d9591cd4b4046050d36 77619eace356487892f47ff36c3e45c4 - default default] 10.108.27.24,10.108.27.35 "GET /v2.0/networks/4381458d-1edc-4ad5-bb46-b54272123f58?fields=segments HTTP/1.1" status: 200  len: 188 time: 0.0732589""",
  "tags" : [
    "cloud",
    "openstack-prod"
  ]
}
```

Now imagine that a new document arrives with additional fields that extend the information about the agent running on the server, for example the `uptime` of the agent and the `memory` it uses:
```json
...
  "agent" : {
    "version" : "7.12.1",
    "hostname" : "c27-32",
    "ephemeral_id" : "0cd90f91-0c99-4346-a747-0516d3e2613f",
    "id" : "dfdee6c4-68f4-41b5-94a1-b0fe87c72209",
    "name" : "c27-32",
    "type" : "filebeat",
    "uptime": 3600,
    "memory": 100
  },
...
```

This way the number of fields will keep growing.
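A small sketch of this growth: the hypothetical helper `field_paths` collects the dotted path of every leaf field, and the set of mapped fields is the union over all documents seen so far. The two sample documents are trimmed versions of the ones above.

```python
def field_paths(document, prefix=""):
    """Return the dotted paths of all leaf fields in a nested document."""
    paths = set()
    for name, value in document.items():
        path = f"{prefix}{name}"
        if isinstance(value, dict):
            paths |= field_paths(value, prefix=f"{path}.")
        else:
            paths.add(path)
    return paths


first = {"agent": {"version": "7.12.1", "hostname": "c27-32", "type": "filebeat"}}
second = {"agent": {"version": "7.12.1", "hostname": "c27-32", "type": "filebeat",
                    "uptime": 3600, "memory": 100}}

mapped = set()
for doc in (first, second):
    # With dynamic mapping, every unseen path becomes a new mapped field.
    mapped |= field_paths(doc)
    print(len(mapped))   # 3, then 5
```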

To prevent this we can define the `agent` field as a **flattened data type**:
```json
  "properties": {
    "agent": {
      "type": "flattened"
    }
  }
```

NOTE: The flattened data type is not available in OpenSearch or in the OSS distribution of Elasticsearch; you need a distribution with at least the basic license.

Additionally, as a safety measure, we can configure mapping limit settings such as `index.mapping.total_fields.limit` (the maximum number of fields in an index, 1000 by default) or `index.mapping.nested_fields.limit`.

References:
- [Flattened field type](https://www.elastic.co/guide/en/elasticsearch/reference/current/flattened.html)
- [Mapping limit settings](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-settings-limit.html)


## Exercise
- Lab: [Preventing mapping explosion](mapping_explosion.ipynb)

### Bulk import
```json
PUT /_bulk
{"create": {"_index": "testing", "_id": "1"}}
{"id": "1", "genre": ["IMAX", "Sci-Fi"], "title": "The Arrival", "year": 2013}
{"create": {"_index": "testing", "_id": "2"}}
{"id": "2", "genre": ["IMAX", "Sci-Fi"], "title": "The Expanse", "year": 2010}
...
```

We create a JSON file with two lines for each document we want to insert (an action line followed by the document itself), ending with a final newline, and then we run:
```bash
curl --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X PUT -H "Content-Type: application/json" \
    --data-binary @movies-bulk.json \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/_bulk"
```
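The bulk file can also be generated programmatically. The sketch below builds the newline-delimited body for the two sample movies; `build_bulk_body` is a hypothetical helper, and it assumes every document carries an `id` field to be reused as `_id`.

```python
import json


def build_bulk_body(index, documents):
    """Return a newline-delimited JSON string for the _bulk endpoint."""
    lines = []
    for doc in documents:
        # Action line: create the document, reusing its own "id" as _id.
        lines.append(json.dumps({"create": {"_index": index, "_id": doc["id"]}}))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


movies = [
    {"id": "1", "genre": ["IMAX", "Sci-Fi"], "title": "The Arrival", "year": 2013},
    {"id": "2", "genre": ["IMAX", "Sci-Fi"], "title": "The Expanse", "year": 2010},
]

bulk_body = build_bulk_body("testing", movies)
print(bulk_body, end="")
```

The resulting string could be written to a file such as `movies-bulk.json` and sent with the `curl` command above.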

### Exercise
Let's create an index named `movies` with the MovieLens dataset:

- Lab: [Bulk import: Movielens dataset](exercises/bulk_import_movielens.ipynb)

### Updating Data
When we update a document we must take into account that every document has a **_version** field (we can see it in the response of a `_doc` request or by adding `version=true` to a `_search` request).

As in the case of the RDDs in Spark, **documents are immutable**. This means that when we update a document what actually happens is that a new document is created with an incremented `_version` number and the old one is marked for deletion.

In [38]:
%%bash
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X GET -H "Content-Type: application/json" \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing/_doc/1?pretty"

{
  "_index" : "testing",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "genre" : [
      "IMAX",
      "Sci-Fi"
    ],
    "title" : "The Arrival",
    "year" : 2013
  }
}


There are two types of updates:

**Full update**: we provide all fields in the body of the request (just as when we do an insert)
```
PUT /:index/_doc/:id
```

For example, we have discovered that `The Arrival` release year was 2015 and not 2013:

In [50]:
%%bash
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X PUT -H "Content-Type: application/json" \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing/_doc/1" -d '
{
    "genre": ["IMAX", "Sci-Fi"],
    "title": "The Arrival",
    "year": 2015
}'

{"_index":"testing","_type":"_doc","_id":"1","_version":5,"result":"updated","_shards":{"total":3,"successful":3,"failed":0},"_seq_no":4,"_primary_term":1}

In [51]:
%%bash
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X GET -H "Content-Type: application/json" \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing/_doc/1?pretty"

{
  "_index" : "testing",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 5,
  "_seq_no" : 4,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "genre" : [
      "IMAX",
      "Sci-Fi"
    ],
    "title" : "The Arrival",
    "year" : 2015
  }
}


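**Partial update**: in this case only the fields to change are given in the body of the request, wrapped in a `doc` object
```
POST /:index/_update/:id
{
    "doc": {"title": "The Expanse Updated"}
}
```

The typed form `POST /:index/_doc/:id/_update`, used in the example below, is deprecated since Elasticsearch 7.0 but is still accepted by the cluster used here.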

Oops, we were wrong: the release year of The Arrival was 2016. We can fix it with a partial update:

In [52]:
%%bash
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X POST -H "Content-Type: application/json" \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing/_doc/1/_update" -d '
{
    "doc": {"year": 2016}
}'

{"_index":"testing","_type":"_doc","_id":"1","_version":6,"result":"updated","_shards":{"total":3,"successful":3,"failed":0},"_seq_no":5,"_primary_term":1}

In [53]:
%%bash
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X GET -H "Content-Type: application/json" \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing/_doc/1?pretty"

{
  "_index" : "testing",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 6,
  "_seq_no" : 5,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "genre" : [
      "IMAX",
      "Sci-Fi"
    ],
    "title" : "The Arrival",
    "year" : 2016
  }
}


We can see that each time we do an update the `_version` field is incremented.
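The versioning behaviour can be modelled with a toy in-memory store. This is only a sketch of the semantics described above (full replacement versus merge, with a version bump each time); the `VersionedStore` class is hypothetical and says nothing about how the real storage engine works.

```python
class VersionedStore:
    """Toy model of document versioning (hypothetical, not the real engine)."""

    def __init__(self):
        self.docs = {}  # id -> (version, source)

    def index(self, doc_id, source):
        # Full update: the new source replaces the old one entirely.
        version = self.docs.get(doc_id, (0, None))[0] + 1
        self.docs[doc_id] = (version, dict(source))
        return version

    def update(self, doc_id, partial):
        # Partial update: merge the changes into a copy of the old source
        # and store the result as a new version; the old dict is not mutated.
        _, source = self.docs[doc_id]
        return self.index(doc_id, {**source, **partial})


store = VersionedStore()
movie = {"genre": ["IMAX", "Sci-Fi"], "title": "The Arrival", "year": 2013}
print(store.index("1", movie))                  # 1
print(store.index("1", {**movie, "year": 2015}))  # 2
print(store.update("1", {"year": 2016}))        # 3
print(store.docs["1"][1]["title"])              # The Arrival
```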

### List all documents in index
```
GET /:index/_search
```

In [54]:
%%bash
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X GET -H "Content-Type: application/json" \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing/_search?pretty"

{
  "took" : 421,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "testing",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "genre" : [
            "IMAX",
            "Sci-Fi"
          ],
          "title" : "The Arrival",
          "year" : 2016
        }
      }
    ]
  }
}


### List documents in index with the given word
```
GET /:index/_search?q=Arrival
```

In [55]:
%%bash
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X GET -H "Content-Type: application/json" \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing/_search?q=Arrival&pretty"

{
  "took" : 631,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.08701137,
    "hits" : [
      {
        "_index" : "testing",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.08701137,
        "_source" : {
          "genre" : [
            "IMAX",
            "Sci-Fi"
          ],
          "title" : "The Arrival",
          "year" : 2016
        }
      }
    ]
  }
}
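When the search term contains spaces or special characters, it has to be URL-encoded before being placed in the `q` parameter. A minimal sketch using Python's standard library follows; the `search_url` helper and the `localhost` address are assumptions made for the example.

```python
from urllib.parse import urlencode


def search_url(base_url, index, query=None, pretty=False):
    """Build a /:index/_search URL with an optional URL-encoded q parameter."""
    params = {}
    if query is not None:
        params["q"] = query
    if pretty:
        params["pretty"] = "true"
    url = f"{base_url}/{index}/_search"
    return f"{url}?{urlencode(params)}" if params else url


print(search_url("https://localhost:9200", "testing", "Arrival", pretty=True))
# https://localhost:9200/testing/_search?q=Arrival&pretty=true
print(search_url("https://localhost:9200", "testing", "The Arrival"))
# https://localhost:9200/testing/_search?q=The+Arrival
```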


### Delete a document
```
DELETE /:index/_doc/:id
```

In [56]:
%%bash
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X DELETE \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing/_doc/1"

{"_index":"testing","_type":"_doc","_id":"1","_version":7,"result":"deleted","_shards":{"total":3,"successful":3,"failed":0},"_seq_no":6,"_primary_term":1}

In [57]:
%%bash
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X GET -H "Content-Type: application/json" \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing/_search?pretty"

{
  "took" : 803,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}


### Show index mappings
```
GET /:index
```

In [58]:
%%bash
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X GET \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing/?pretty"

{
  "testing" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "genre" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "title" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "year" : {
          "type" : "date"
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1665594765615",
        "number_of_shards" : "3",
        "number_of_replicas" : "2",
        "uuid" : "FItyDRNBTvehHHGTclgifA",
        "version" : {
          "created" : "135248227"
        },
        "provided_name" : "testing"
      }
    }
  }
}


### Compact and aligned text (CAT) APIs
- List available indices
    ```
    GET /_cat/indices?v
    ```

In [59]:
%%bash
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X GET \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/_cat/indices?v&pretty"

health status index                        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   twitter-putin                efIgy7-FTcqvbglatsM24Q   3   2    1991395            0      2.4gb        840.2mb
green  open   security-auditlog-2022.08.20 1PFGTppvTVSYKfJY5o4f6A   1   1          8            0    177.9kb        147.2kb
green  open   security-auditlog-2022.08.22 I-jrwQh5Txqs61zaoMsAZw   1   1         23            0     66.1kb           33kb
green  open   security-auditlog-2022.08.21 QL2M8iGPR8CaTLSW8tmheg   1   1         15            0    238.5kb        119.2kb
green  open   .opendistro_security         1A3g-op6R3yqr8GyfLxHAw   1   2          9            0    179.5kb         60.9kb
green  open   security-auditlog-2022.08.24 05hR8rw6QLm0RI1YXl7r7A   1   1         13            0    125.8kb         62.9kb
green  open   security-auditlog-2022.08.23 FOjGL_vLRs6gjzrmpCbkpA   1   1         15            0    238.7kb        119.3kb
green  o

- Delete an index:
```
DELETE /<index-name>
```

In [60]:
%%bash
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X DELETE \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/testing"

{"acknowledged":true}

- List the nodes in the cluster
    ```
    GET /_cat/nodes
    ```

In [62]:
%%bash
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X GET \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/_cat/nodes?pretty"

10.38.28.8   60 99 0 0.51 0.27 0.20 dimr * opensearch-3
10.38.28.237 10 74 0 0.02 0.02 0.00 dimr - opensearch-1
10.38.27.170 41 99 0 0.19 0.07 0.04 dimr - opensearch-2


### Cluster APIs
- Cluster health
    ```
    GET /_cat/health?v
    ```

In [64]:
%%bash
curl --silent --insecure -u ${OPENSEARCH_USER}:${OPENSEARCH_PASSWD} \
    -X GET \
    "https://${OPENSEARCH_HOST}:${OPENSEARCH_PORT}/_cat/health?v"

epoch      timestamp cluster              status node.total node.data discovered_master shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1665596187 17:36:27  opensearch-cesga-dev green           3         3              true    247 120    0    0        0             0                  -                100.0%
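The CAT output is meant for humans, but with the `v` header it is easy to turn into records for a script. The sketch below parses the health output shown above; `parse_cat_table` is a hypothetical helper that assumes no cell is empty or contains spaces, which holds for this sample but not necessarily for every CAT endpoint.

```python
def parse_cat_table(text):
    """Parse `_cat` output requested with `?v` into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    # Assumes no cell is empty or contains spaces (true for this sample).
    return [dict(zip(header, line.split())) for line in lines[1:]]


sample = """\
epoch      timestamp cluster              status node.total node.data discovered_master shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1665596187 17:36:27  opensearch-cesga-dev green           3         3              true    247 120    0    0        0             0                  -                100.0%
"""

health = parse_cat_table(sample)[0]
print(health["status"])       # green
print(health["node.total"])   # 3
```

The CAT APIs can also return JSON directly through the `format=json` query parameter, which is usually more robust than parsing the text table.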


### Advanced: Full custom analyzer declaration
If you want to fine-tune the analyzer used for text fields, you can define a custom one in the index settings:

```json
PUT /testing
{
  "settings": {
    "analysis": {
      "filter": {
        "english_stop": {
          "type":       "stop",
          "stopwords":  "_english_"
        },
        "english_stemmer": {
          "type":       "stemmer",
          "language":   "english"
        }
      },
      "analyzer": {
        "my_custom_english_analyzer": {
          "char_filter":  ["html_strip"],
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "english_stop",
            "english_stemmer"
          ]
        }
      }
    }
  }
}
```
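To give an intuition for what such an analyzer does, the sketch below chains the same stages in plain Python: character filter, tokenizer, then token filters. Every stage is a crude approximation written for illustration; in particular the stop word list is a tiny sample and `naive_stem` is a deliberately simplistic suffix stripper, not the stemmer Elasticsearch uses.

```python
import re

STOPWORDS = {"a", "an", "and", "are", "is", "of", "the", "to"}  # tiny sample list


def html_strip(text):
    # char_filter: drop anything that looks like an HTML tag.
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text):
    # tokenizer: crude approximation of "standard", keeps runs of word characters.
    return re.findall(r"\w+", text)


def naive_stem(token):
    # Deliberately simplistic suffix stripping, NOT a real English stemmer.
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def analyze(text):
    tokens = tokenize(html_strip(text))
    tokens = [t.lower() for t in tokens]                 # lowercase filter
    tokens = [t for t in tokens if t not in STOPWORDS]   # stop word filter
    return [naive_stem(t) for t in tokens]               # stemming filter


print(analyze("<b>The Foxes</b> are jumping"))  # ['fox', 'jump']
```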

## References
- [Elasticsearch Cheat Sheet for developers](https://elasticsearch-cheatsheet.jolicode.com/)