DOCS-208 - Connect Elastic Quickstart not working (#166)
* DOCS-208 - Connect Elastic Quickstart not working

* Feedback
joel-hamill committed Feb 1, 2018
1 parent 703b141 commit f02ec4d
Showing 2 changed files with 127 additions and 120 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -2,3 +2,4 @@ target
 docs/_build
 .idea
 *.iml
+*.DS_Store
246 changes: 126 additions & 120 deletions docs/elasticsearch_connector.rst
.. _elasticsearch-overview:

Elasticsearch Connector
=======================
The Elasticsearch connector allows moving data from Kafka to Elasticsearch. It writes data from
a topic in Kafka to an `index <https://www.elastic.co/guide/en/elasticsearch/reference/current/_basic_concepts.html#_index>`_
in Elasticsearch and all data for a topic have the same
[…] the
connector provides a feature to infer mapping from the schemas of Kafka messages.
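
For example, once the quick start below has indexed its sample records, you can inspect the mapping that was inferred by querying the Elasticsearch ``_mapping`` API. This is a sketch using the quick start defaults for the index name and address:

.. sourcecode:: bash

# index and address are the quick start defaults; adjust if yours differ
$ curl -XGET 'http://localhost:9200/test-elasticsearch-sink/_mapping?pretty'

For the quick start records, the response should contain a single field ``f1`` whose Elasticsearch type was inferred from the Avro ``string`` schema.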

.. _elasticsearch-quickstart:

Quick Start
-----------
This quick start uses the Elasticsearch connector to export data produced by the Avro console
producer to Elasticsearch.

**Prerequisites:**

- :ref:`Confluent Platform <installation>` is installed and its services are running. This quick start assumes that you are using the Confluent CLI, although standalone installations are also supported. By default, the ``confluent start`` command starts ZooKeeper, Kafka, Schema Registry, the Kafka Connect REST API, and Kafka Connect. For more information, see :ref:`installation_archive`.
- Elasticsearch 5.x is installed and running.

.. important:: Elasticsearch 6.x is not supported at this time due to a known issue.
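
Before you continue, you can confirm that a compatible Elasticsearch is reachable by querying its root endpoint (this assumes the default address ``http://localhost:9200``):

.. sourcecode:: bash

# default Elasticsearch address assumed; adjust if yours differs
$ curl -XGET 'http://localhost:9200'

The JSON response includes a ``version.number`` field, which should report a 5.x release.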

-------------------------
Add a Record to the Topic
-------------------------

Start the Avro console producer to import a few records to Kafka:

.. sourcecode:: bash

<path-to-confluent>/bin/kafka-avro-console-producer --broker-list localhost:9092 --topic test-elasticsearch-sink \
--property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}'

Then in the console producer, enter:

.. sourcecode:: bash

{"f1": "value1"}
{"f1": "value2"}
{"f1": "value3"}

The three records entered are published to the Kafka topic ``test-elasticsearch-sink`` in Avro format.
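
Optionally, you can verify that the records reached the topic by reading them back with the Avro console consumer. This is a sketch under the same assumptions as the producer command above; flags can vary by Confluent Platform version:

.. sourcecode:: bash

# --bootstrap-server and --from-beginning assume a recent Confluent Platform release
<path-to-confluent>/bin/kafka-avro-console-consumer --bootstrap-server localhost:9092 \
--topic test-elasticsearch-sink --from-beginning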

--------------------------------
Load the Elasticsearch Connector
--------------------------------

Load the predefined Elasticsearch connector.

.. tip:: Before starting the connector, you can verify that the configurations in ``etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties`` are properly set (e.g. ``connection.url`` points to the correct HTTP address).
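
For reference, that properties file expresses the same settings as the ``confluent load`` output shown below, in Java properties form (the values here are the quick start defaults):

.. sourcecode:: properties

# assumed to match etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties defaults
name=elasticsearch-sink
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
tasks.max=1
topics=test-elasticsearch-sink
key.ignore=true
connection.url=http://localhost:9200
type.name=kafka-connect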

#. Optional: View the available predefined connectors with this command:

.. sourcecode:: bash

confluent list connectors

Your output should resemble:

.. sourcecode:: bash

Bundled Predefined Connectors (edit configuration under etc/):
elasticsearch-sink
file-source
file-sink
jdbc-source
jdbc-sink
hdfs-sink
s3-sink

#. Load the ``elasticsearch-sink`` connector:

.. sourcecode:: bash

confluent load elasticsearch-sink

Your output should resemble:

.. sourcecode:: bash

{
"name": "elasticsearch-sink",
"config": {
"connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
"tasks.max": "1",
"topics": "test-elasticsearch-sink",
"key.ignore": "true",
"connection.url": "http://localhost:9200",
"type.name": "kafka-connect",
"name": "elasticsearch-sink"
},
"tasks": [],
"type": null
}
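
To check that the connector started successfully, view the Connect worker's log:

.. sourcecode:: bash

# tails the Connect worker log managed by the Confluent CLI
confluent log connect

Toward the end of the log you should see the connector start, log a few messages, and then begin exporting data from Kafka to Elasticsearch.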

.. tip:: For non-CLI users, you can load the Elasticsearch connector by running Connect in standalone mode with this command:

.. sourcecode:: bash

$ ./bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties \
etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties
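
You can also confirm that the connector is loaded and running. This assumes your CLI version provides the ``status connectors`` subcommand:

.. sourcecode:: bash

# assumed subcommand; check `confluent help` if it is unavailable in your version
confluent status connectors

The output should include ``elasticsearch-sink`` among the loaded connectors.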


#. After the connector finishes ingesting data, check that the data is available in Elasticsearch:

.. sourcecode:: bash

$ curl -XGET 'http://localhost:9200/test-elasticsearch-sink/_search?pretty'


Your output should resemble:

.. sourcecode:: bash

{
"took" : 39,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 1.0,
"hits" : [
{
"_index" : "test-elasticsearch-sink",
"_type" : "kafka-connect",
"_id" : "test-elasticsearch-sink+0+0",
"_score" : 1.0,
"_source" : {
"f1" : "value1"
}
},
{
"_index" : "test-elasticsearch-sink",
"_type" : "kafka-connect",
"_id" : "test-elasticsearch-sink+0+2",
"_score" : 1.0,
"_source" : {
"f1" : "value3"
}
},
{
"_index" : "test-elasticsearch-sink",
"_type" : "kafka-connect",
"_id" : "test-elasticsearch-sink+0+1",
"_score" : 1.0,
"_source" : {
"f1" : "value2"
}
}
]
}
}
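
If you want to rerun the quick start from a clean index, you can delete the index with the standard Elasticsearch delete index API (this permanently removes the indexed documents):

.. sourcecode:: bash

# irreversible: removes the quick start index and all of its documents
$ curl -XDELETE 'http://localhost:9200/test-elasticsearch-sink'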

#. When you are finished, stop all of the Confluent services and wipe out any data generated during this quick start:

.. sourcecode:: bash

$ confluent destroy
Stopping connect
connect is [DOWN]
Stopping kafka-rest
kafka-rest is [DOWN]
Stopping schema-registry
schema-registry is [DOWN]
Stopping kafka
kafka is [DOWN]
Stopping zookeeper
zookeeper is [DOWN]
Deleting: /tmp/confluent.w1CpYsaI

Features
--------
