# Course 4, Module 3: Introduction to Elasticsearch and its REST API

This notebook provides a foundational introduction to Elasticsearch. We will explore what it is, its core data organization concepts, and how to perform basic Create, Read, Update, and Delete (CRUD) operations using its native **REST API**.

All commands and outputs in this notebook are pre-rendered for reference, so no live Elasticsearch instance is required.

--- 
## 1. What is Elasticsearch?

**Elasticsearch** is a distributed, RESTful search and analytics engine built on top of the Apache Lucene library. While it can be used as a NoSQL, document-oriented database, its primary strength is its powerful full-text search capability. It's designed for speed and scalability, allowing you to search through massive amounts of data in near real-time.

**Analogy:** If PostgreSQL is a bank vault, built to hold your authoritative, transactional records safely, Elasticsearch is a searchable library index over your unstructured and semi-structured data. It is usually not where you store your primary, transactional records, but it is an excellent tool for finding things quickly.

--- 
## 2. Core Concepts (SQL vs. Elasticsearch)

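To understand Elasticsearch, it's helpful to map its concepts to the relational world you already know. The mapping is approximate (a cluster, for example, is closer to a group of database servers than to a single database), but it is a useful starting point for understanding how the data is organized.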

| Relational Database (SQL) | Elasticsearch (NoSQL) |
|:---|:---|
| **Database** | **Cluster** (A collection of one or more servers/nodes) |
| **Table** | **Index** (A collection of documents) |
| **Row** | **Document** (A self-contained JSON object) |
| **Column** | **Field** (A key inside the JSON document) |
| **Schema** | **Mapping** (Rules defining the fields and their data types) |
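
To make the Row-to-Document and Column-to-Field rows of the table concrete, here is a minimal Python sketch that turns a relational-style row into the kind of JSON object Elasticsearch stores. The column names and values are illustrative assumptions, and nothing here talks to Elasticsearch.

```python
import json

# Hypothetical relational row: column names plus one tuple of values.
columns = ("title", "author", "year")
row = ("The Great Gatsby", "F. Scott Fitzgerald", 1925)

# Row -> Document: each column becomes a field (a key in the JSON object).
document = dict(zip(columns, row))

# Unlike a flat row, a document can also hold multi-valued or nested fields.
document["tags"] = ["classic", "novel"]

print(json.dumps(document, indent=2))
```

The resulting JSON text is what would be sent as the body of an indexing request.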

--- 
## 3. Interacting with the REST API using `curl`

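All interactions with Elasticsearch happen over a standard HTTP REST API. The following examples show the `curl` commands and the expected JSON responses. They assume a single local node listening on `localhost:9200` over plain HTTP with security disabled; a default Elasticsearch 8.x installation enables HTTPS and authentication, so against such a node you would also need to supply credentials and use `https://`.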
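
Because the API is plain HTTP, any HTTP client can stand in for `curl`. As a rough sketch, the Python standard library can build the same requests. The helper name `build_request` and the `BASE_URL` constant are assumptions made for this illustration, and the request is only constructed, not sent, so no running node is needed.

```python
import json
import urllib.request

# Assumption: a local node reachable over plain HTTP, as in the curl examples.
BASE_URL = "http://localhost:9200"


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the Elasticsearch REST API."""
    data = None
    headers = {}
    if body is not None:
        # Same role as curl's -d flag plus the Content-Type header.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{BASE_URL}{path}", data=data, headers=headers, method=method
    )


# The health endpoint used in the first example below.
req = build_request("GET", "/_cat/health?v")
print(req.get_method(), req.full_url)
# prints: GET http://localhost:9200/_cat/health?v
```

With a live node, passing `req` to `urllib.request.urlopen` would actually send it.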

### Health Check

First, let's check if our Elasticsearch cluster is running and healthy.

**Command:**
```bash
curl -X GET "localhost:9200/_cat/health?v"
```

**Expected Output:**
```
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1664558219 12:06:59  elasticsearch green           1         1      0   0    0    0        0             0                  -                100.0%
```
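
The `?v` parameter adds the header row, which makes the output easy to read as column/value pairs. The sketch below parses the sample output shown above into a dictionary. The function name `parse_cat_health` is hypothetical, and the status descriptions are a short summary rather than the full official definitions.

```python
# Sample response copied from the expected output above.
raw = """\
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1664558219 12:06:59  elasticsearch green           1         1      0   0    0    0        0             0                  -                100.0%
"""

# Short summary of what each status colour means.
STATUS_MEANING = {
    "green": "all primary and replica shards are allocated",
    "yellow": "all primaries are allocated, but some replicas are not",
    "red": "at least one primary shard is not allocated",
}


def parse_cat_health(text):
    """Pair each column name in the header row with its value in the data row."""
    header, values = (line.split() for line in text.strip().splitlines()[:2])
    return dict(zip(header, values))


health = parse_cat_health(raw)
print(health["status"], "->", STATUS_MEANING[health["status"]])
```

Here the cluster is `green` with zero shards, which is expected for a fresh node that holds no indices yet.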

### Create an Index

An index is like a database table. Let's create one called `books` using an HTTP `PUT` request.

**Command:**
```bash
curl -X PUT "localhost:9200/books"
```

**Expected Output:**
```json
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "books"
}
```

### Index a Document (Create/Update)

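Now let's add a document (like a row) to our `books` index. We'll give it an ID of `1`. We use a `POST` request (a `PUT` to the same URL also works when the ID is given) and provide the JSON data using the `-d` flag. If a document with that ID already exists, the request replaces it and increments its `_version`, which is why the same operation covers both create and update.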

**Command:**
```bash
curl -X POST "localhost:9200/books/_doc/1" -H 'Content-Type: application/json' -d'{
  "title": "The Great Gatsby",
  "author": "F. Scott Fitzgerald",
  "year": 1925,
  "tags": ["classic", "novel"]
}'
```

**Expected Output:**
```json
{
  "_index": "books",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
```
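
The fields in this response are worth reading programmatically. The sketch below, with the hypothetical helper `summarize_write`, checks the `result` and `_shards` fields of the response shown above. In `_shards`, `total` counts the shard copies the write was meant to reach (primary plus replicas) and `successful` counts the copies that accepted it. On a single-node cluster, `1` of `2` is most likely because the replica copy has no second node to live on, not because anything failed.

```python
import json

# The indexing response shown above.
response_text = """
{
  "_index": "books",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {"total": 2, "successful": 1, "failed": 0},
  "_seq_no": 0,
  "_primary_term": 1
}
"""


def summarize_write(resp):
    """Reduce a write response to the fields that indicate success."""
    shards = resp["_shards"]
    return {
        # The write counts as OK if no shard copy reported a failure.
        "ok": shards["failed"] == 0 and resp["result"] in ("created", "updated"),
        "result": resp["result"],
        "version": resp["_version"],
        "copies_written": f'{shards["successful"]}/{shards["total"]}',
    }


summary = summarize_write(json.loads(response_text))
print(summary)
```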

### Retrieve a Document

To get the document back, we use an HTTP `GET` request with the document's ID. Using `?pretty` formats the JSON output.

**Command:**
```bash
curl -X GET "localhost:9200/books/_doc/1?pretty"
```

**Expected Output:**
```json
{
  "_index" : "books",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "title" : "The Great Gatsby",
    "author" : "F. Scott Fitzgerald",
    "year" : 1925,
    "tags" : [ "classic", "novel" ]
  }
}
```
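
The original document lives under `_source`; the other top-level keys are metadata. A client usually wants just the source, and should also handle the case where the ID does not exist. The sketch below uses a hypothetical helper `extract_source`. The `missing_body` string is an assumption modelled on the shape Elasticsearch typically returns (with HTTP status 404) for an unknown ID.

```python
import json

# Body of the successful GET shown above (compacted onto fewer lines).
found_body = """
{
  "_index": "books", "_id": "1", "_version": 1,
  "_seq_no": 0, "_primary_term": 1, "found": true,
  "_source": {
    "title": "The Great Gatsby",
    "author": "F. Scott Fitzgerald",
    "year": 1925,
    "tags": ["classic", "novel"]
  }
}
"""

# Assumed shape of the body for an ID that does not exist.
missing_body = '{"_index": "books", "_id": "2", "found": false}'


def extract_source(body):
    """Return the stored document, or None if the ID was not found."""
    resp = json.loads(body)
    return resp["_source"] if resp.get("found") else None


book = extract_source(found_body)
print(book["title"], book["year"])
print(extract_source(missing_body))
```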

### Delete a Document

To remove the document, we use a `DELETE` request.

**Command:**
```bash
curl -X DELETE "localhost:9200/books/_doc/1"
```

**Expected Output:**
```json
{
  "_index": "books",
  "_id": "1",
  "_version": 2,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 1,
  "_primary_term": 1
}
```
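
Notice that `_version` is `2` in the delete response: every write to a document, including its deletion, bumps the version. The toy class below is an in-memory illustration of that bookkeeping only. `ToyIndex` is a hypothetical name, it is not how Elasticsearch is implemented, and it ignores shards, sequence numbers, and persistence.

```python
class ToyIndex:
    """In-memory toy that mimics how _version and result change per write."""

    def __init__(self):
        self._docs = {}      # id -> current source
        self._versions = {}  # id -> last version assigned to that id

    def index(self, doc_id, source):
        # Indexing an existing ID replaces the document ("updated").
        result = "updated" if doc_id in self._docs else "created"
        version = self._versions.get(doc_id, 0) + 1
        self._docs[doc_id] = source
        self._versions[doc_id] = version
        return {"_id": doc_id, "_version": version, "result": result}

    def delete(self, doc_id):
        if doc_id not in self._docs:
            return {"_id": doc_id, "result": "not_found"}
        del self._docs[doc_id]
        # A delete is also a write, so the version increases again.
        self._versions[doc_id] += 1
        return {
            "_id": doc_id,
            "_version": self._versions[doc_id],
            "result": "deleted",
        }


books = ToyIndex()
created = books.index("1", {"title": "The Great Gatsby"})
deleted = books.delete("1")
print(created["result"], created["_version"])  # created 1
print(deleted["result"], deleted["_version"])  # deleted 2
```

This reproduces the `created`/version 1 and `deleted`/version 2 pair seen in the two responses in this notebook.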

### Delete an Index

Finally, to remove the entire index and all its documents, we send a `DELETE` request to the index name.

**Command:**
```bash
curl -X DELETE "localhost:9200/books"
```

**Expected Output:**
```json
{
  "acknowledged": true
}
```

--- 
## Conclusion

This notebook covered the absolute basics of Elasticsearch. We learned that:
- It's a document-oriented search engine with concepts that map to the relational world.
- All interactions happen over a standard **HTTP REST API**.
- We can perform basic CRUD operations (Create and Update by indexing a document, Read with `GET`, Delete with `DELETE`) using simple tools like `curl`.

In the next notebook, we will dive into Elasticsearch's main feature: **powerful search**.