
Interview Questions

ignacio-alorre edited this page Nov 3, 2019 · 2 revisions

How does Elasticsearch work? Elasticsearch works by indexing documents into a repository. During an indexing operation, it converts raw data such as log files or message files into internal documents and stores them in a basic data structure similar to a JSON object. To index a document, send an HTTP POST request whose body is the document, expressed as a simple JSON object.
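As a rough sketch of the HTTP POST described above, the snippet below builds (but does not send) an indexing request using only the Python standard library. The host `http://localhost:9200`, the index name `logs`, and the sample document are assumptions made for illustration; the `/<index>/_doc` path is the document-indexing endpoint in recent Elasticsearch versions.

```python
import json
import urllib.request


def build_index_request(base_url, index, document):
    """Build an HTTP POST that would index `document` into `index`."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_doc",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, index name, and document, chosen for illustration only.
request = build_index_request(
    "http://localhost:9200",
    "logs",
    {"level": "INFO", "message": "service started"},
)
print(request.get_method(), request.full_url)

# To actually send it to a running cluster:
# urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect what would go over the wire before pointing the code at a real cluster.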

What is Kibana? Kibana is an open source data visualization plugin for Elasticsearch. It provides visualization capabilities on top of the content indexed on an Elasticsearch cluster. Users can create bar, line and scatter plots, or pie charts and maps on top of large volumes of data.

Why is NRT applied to Elasticsearch? NRT stands for Near Real Time. Elasticsearch is a near real-time search platform, which means there is a slight latency (normally one second) between the time you index a document and the time it becomes searchable.
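The delay between indexing and searchability can be pictured with a toy model. The class below is not how Elasticsearch is implemented; it is a minimal sketch, with hypothetical names, in which newly indexed documents sit in a buffer and only become visible to search after a "refresh" step, which Elasticsearch runs periodically (about once per second by default).

```python
class ToyNearRealTimeIndex:
    """Toy model: documents become searchable only after a refresh."""

    def __init__(self):
        self._buffer = []      # indexed, but not yet searchable
        self._searchable = []  # visible to search

    def index(self, document):
        # New documents land in the buffer first.
        self._buffer.append(document)

    def refresh(self):
        # Move buffered documents to the searchable set.
        self._searchable.extend(self._buffer)
        self._buffer.clear()

    def search(self, field, value):
        return [doc for doc in self._searchable if doc.get(field) == value]


toy = ToyNearRealTimeIndex()
toy.index({"user": "alice"})
before = toy.search("user", "alice")  # nothing visible yet
toy.refresh()
after = toy.search("user", "alice")   # the document is now visible
print(len(before), len(after))        # prints "0 1"
```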

What is a Cluster in Elasticsearch? A cluster is a collection of one or more nodes (servers) that together hold your entire data and provide federated indexing and search capabilities across all nodes. A cluster is identified by a unique name, which by default is “elasticsearch”. This name is important because a node can only be part of a cluster if the node is set up to join the cluster by its name.

What is a Node in Elasticsearch? A node is a single server that is part of your cluster, stores your data, and participates in the cluster’s indexing and search capabilities. Just like a cluster, a node is identified by a name. In older releases the default name is derived from a random Universally Unique IDentifier (UUID) assigned to the node at startup; since version 7.0 the default is the machine's hostname.

What is an Index in Elasticsearch?

  • An index is a collection of documents that have somewhat similar characteristics. For example, you can have an index for customer data, another index for a product catalog, and yet another index for order data.

  • An index is identified by a name (that must be all lowercase) and this name is used to refer to the index when performing search, update, and delete operations against the documents in it.
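To illustrate how the index name is used to address operations, here is a small sketch that builds REST URLs for search, update, and delete. The base address `http://localhost:9200`, the index name `customers`, the document id `1`, and the helper `index_url` are all hypothetical; the lowercase check covers only the rule mentioned above, not every naming restriction Elasticsearch enforces.

```python
def index_url(base_url, index, suffix):
    """Return the URL for an operation on `index`; reject non-lowercase names."""
    if index != index.lower():
        raise ValueError(f"index name must be lowercase: {index!r}")
    return f"{base_url}/{index}/{suffix}"


BASE = "http://localhost:9200"  # assumed local cluster address

search_url = index_url(BASE, "customers", "_search")    # search the index
update_url = index_url(BASE, "customers", "_update/1")  # update document with id 1
delete_url = index_url(BASE, "customers", "_doc/1")     # used with HTTP DELETE

print(search_url)
print(update_url)
print(delete_url)
```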

What is a Document in Elasticsearch? A document is a basic unit of information that can be indexed. For example, you can have a document for a single customer, another document for a single product, and yet another for a single order. A document is expressed in JSON (JavaScript Object Notation), a ubiquitous internet data interchange format.

What are Shards in Elasticsearch? Explain the concept.

An index can potentially store a large amount of data that can exceed the hardware limits of a single node. For example, a single index of a billion documents taking up 1TB of disk space may not fit on the disk of a single node or may be too slow to serve search requests from a single node alone.

To solve this problem, Elasticsearch provides the ability to subdivide your index into multiple pieces called shards. When you create an index, you can simply define the number of shards that you want. Each shard is in itself a fully-functional and independent “index” that can be hosted on any node in the cluster.
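Defining the shard count at index creation can be sketched as follows. The snippet builds (but does not send) a create-index request whose body sets `number_of_shards`; the host `http://localhost:9200`, the index name `orders`, and the shard count of 3 are assumptions for illustration only.

```python
import json
import urllib.request


def build_create_index_request(base_url, index, number_of_shards):
    """Build an HTTP PUT that would create `index` with the given shard count."""
    body = json.dumps({"settings": {"number_of_shards": number_of_shards}})
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical host, index name, and shard count.
create_request = build_create_index_request("http://localhost:9200", "orders", 3)
print(create_request.get_method(), create_request.full_url)
print(create_request.data.decode("utf-8"))

# To actually send it to a running cluster:
# urllib.request.urlopen(create_request)
```

The number of primary shards is chosen when the index is created, so it is worth deciding on it before loading data.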

