Event structure example #3
Hi,
Do you have an example of what structure the Kafka events should have in order to work with this river?

I'm receiving JSON from Kafka, and I have configured a mapping in ES to describe the messages and configured the river to use this type. I see the messages coming in from Kafka in the ES logs, but no docs are put into ES.

Any clues?

Thanks / Jonas
Hi @luckyswede, Currently the plugin supports reading string messages from Kafka (the DefaultEncoder is used as serializer.class) and puts them into ES as a value property. Support for JSON Kafka messages is planned as well, but if you need it asap, you are welcome to add the support and send a pull request.

Regarding not seeing docs in ES: the cause might be that you did not specify the bulk.size property while creating the index, in which case the default value of 100 is used. If so, you would need to consume at least 100 messages before any docs are inserted into ES, because the plugin uses the Bulk API to insert messages in bulk. Please let me know if that resolves your problem, or if I can help you with anything else.
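For example (a hypothetical river definition for testing, not taken from this thread; the river name, ZooKeeper address, and topic are placeholders), setting bulk.size to 1 would flush every consumed message to ES immediately:

```
curl -XPUT 'localhost:9200/_river/test-kafka-river/_meta' -d '
{
    "type" : "kafka",
    "kafka" : {
        "zookeeper.connect" : "localhost:2181",
        "topic" : "test-topic"
    },
    "index" : {
        "index" : "kafka-index",
        "bulk.size" : 1
    }
}'
```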
Ok, I see. / Jonas
I am not sure which type configuration parameter you are referring to: the one at the top level when creating the index, or the one under index?

```
curl -XPUT 'localhost:9200/_river/<river-name>/_meta' -d '
{
    "type" : "kafka",
    "kafka" : {
        "zookeeper.connect" : <zookeeper.connect>,
        "zookeeper.connection.timeout.ms" : <zookeeper.connection.timeout.ms>,
        "topic" : <topic-name>
    },
    "index" : {
        "index" : <index-name>,
        "type" : <mapping-type-name>,
        "bulk.size" : <bulk.size>,
        "concurrent.requests" : <concurrent.requests>
    }
}'
```

The top-level one is a static value "kafka" and cannot be changed by the client.
I'm having the same problem. What's strange is that it worked for a little while; then I deleted the river and re-created it, and now I can't get it to work. I see the data in the Elasticsearch logs:

But the search only shows the river metadata, no data:

Here's the result:

(Not showing the kibana entry.) Here's the PUT to create the river:
On a related note, you mentioned:

How do I tell Elasticsearch to interpret the messages as JSON for the elasticsearch index (in my case, the one created by the PUT above)? Thanks
I can't yet tell for sure, but it seems like consumerIterator.hasNext() never returns false, and so the bulk insert never happens.

I.e., this block never exits:

Could there be conditions where the line consumerIterator.hasNext() never goes false?
Hi @rberger, Regarding your first comment ("I'm having the same problem. What's strange is it worked for a little while, then I deleted the river and re-created it and can't get it to work"): I just tested the same behaviour on my local machine. When I delete the river and create it again, without restarting the Elasticsearch server, I can see the new messages being inserted from Kafka into Elasticsearch (I was using plain string messages). I am using the following command to retrieve all the data inserted into ES:

GET /kafka-index/_search?pretty=1

Could you try the above command and see if it shows any results for you? Regards,
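The same request in curl form:

```
curl -XGET 'localhost:9200/kafka-index/_search?pretty=1'
```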
I'm pretty sure that for some reason, maybe due to the way the data from my Kafka topic is coming in, the Elasticsearch bulk add is never getting called. I see the logging from the line inside the read loop, and I am wondering if I have a stream of data coming from Kafka that always makes consumerIterator.hasNext() return true.

My Java fu is weak and I'm just learning Elasticsearch and Kafka, but I will try to do some experiments. I will also try another Kafka topic that I update manually, so I can see if it's related to how much continuous data is in the Kafka topic. Any help is appreciated. If you have the interest, please drop me an email at rberger@mistsys.com and I could give you access to our Kafka consumer to see if it's a usage-pattern problem or if I'm just doing something wrong. In any case, I appreciate you creating this tool and your help!

If I do the same command as above (GET /kafka-index/_search?pretty=1), I get:
Hi @rberger, After examining your use case again (an infinite flow of data, without stopping), I now think there is a bug in my code, inside the while loop. Because there is always data to read, consumerIterator.hasNext() keeps reading all the time and never gets out of the loop, so the messages can never be added to the ES BulkProcessor. It only gets out of the loop when there is no data to read and the consumer times out.

I will fix this issue tomorrow and will update you to test it again. Locally, you can verify this by producing messages by hand (typing them into the Kafka console producer) and then stopping. This should work because, when you stop typing, the consumer will time out after 15 ms, the while loop will terminate, and the messages will be added to the ES bulk processor.

Sorry that this caused trouble for your application; I will fix it asap and let you know. Cheers,
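For context, the problematic pattern looks roughly like the sketch below. This is a simplified reconstruction based on the discussion in this thread, not the plugin's actual source; the class name and method signature are invented for illustration:

```java
import kafka.consumer.ConsumerIterator;
import kafka.message.MessageAndMetadata;
import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.action.index.IndexRequest;

class KafkaConsumeLoopSketch {
    // With a continuous stream, hasNext() keeps returning true (it only
    // gives up control once consumer.timeout.ms elapses with no new data),
    // so this loop never exits and the BulkProcessor never gets to flush.
    static void consume(ConsumerIterator<byte[], byte[]> consumerIterator,
                        BulkProcessor bulkProcessor, String index, String type) {
        while (consumerIterator.hasNext()) {
            MessageAndMetadata<byte[], byte[]> message = consumerIterator.next();
            // The plugin stores the raw message under a "value" property.
            bulkProcessor.add(new IndexRequest(index, type)
                    .source("value", new String(message.message())));
        }
        // Only reached when the topic goes quiet and the consumer times out.
    }
}
```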
Ok, I think I proved that the issue is that the while loop never exits. I had the river running, and in the Elasticsearch log I was seeing continuous output from inside the loop:

But never any log messages showing that the bulk insert executed. Then I actually shut down the Kafka brokers, assuming that would make consumerIterator.hasNext() stop returning true.

Besides the error messages saying it couldn't connect to the Kafka broker, for the first time it said:

And then a flurry of:

And now when I say:

I get:

So we need to do something that causes the loop to exit, so the bulk processor gets a chance to run.
I presume we could just make the while loop have another condition that breaks out every bulk.size messages. That would cause it to exit the loop and let the bulk processor do its inserts.

I'll probably take a stab at trying to fix it. It's good for me to learn more about this :-) If you have any hints as to where you think the fix should go and what it would be, let me know and I'll give it a try and send a pull request.
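In code, the proposed change might look roughly like this. It is a sketch under the same assumptions as the earlier snippet (invented names, simplified signature), not the actual patch that later shipped:

```java
import kafka.consumer.ConsumerIterator;
import kafka.message.MessageAndMetadata;
import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.action.index.IndexRequest;

class KafkaConsumeLoopFixSketch {
    // Cap each pass at bulkSize messages so control regularly returns to
    // the caller, letting the BulkProcessor flush even under constant load.
    static void consume(ConsumerIterator<byte[], byte[]> consumerIterator,
                        BulkProcessor bulkProcessor, String index, String type,
                        int bulkSize) {
        int count = 0;
        while (count < bulkSize && consumerIterator.hasNext()) {
            MessageAndMetadata<byte[], byte[]> message = consumerIterator.next();
            bulkProcessor.add(new IndexRequest(index, type)
                    .source("value", new String(message.message())));
            count++;
        }
        // The caller can now flush the pending bulk request and then
        // invoke consume() again, alternating reads and inserts.
    }
}
```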
Hi @rberger, I actually already fixed the problem, and will deploy the latest version in a couple of minutes :) Cheers,
Cool! Thanks!
Hi @rberger, The new fix is available in the latest release, version 1.1.1. See here: https://github.com/mariamhakobyan/elasticsearch-river-kafka/releases/tag/v1.1.1. Cheers,
Yay! It works! Now I can get back to figuring out how to have the messages from Kafka be interpreted as JSON and not text. I will look into what changes are needed to make that happen; I think I learnt enough from this exercise to maybe address that issue. Thanks again!
Great to hear that! In the current version, if you receive JSON messages from Kafka, those messages will be inserted into ES as one value (not extracted per JSON property). This is the default behaviour. Could we consider this issue resolved? Regards,
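To illustrate (a hypothetical message, not data from this thread): if Kafka delivers {"user": "jonas", "count": 5}, the indexed document currently stores the whole JSON string under the single value property rather than as separate user and count fields:

```json
{
  "value" : "{\"user\": \"jonas\", \"count\": 5}"
}
```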
Yes, please consider this issue resolved. Thanks again!