MultiLangDaemon Throws NullPointerException Going From One Shard To Two When Multiple Daemons Are Running #29
Comments
Thanks for reporting. I'll try to reproduce this.
kevincdeng (Contributor) commented Jun 29, 2015
Hi eyesoftime,
I'm not able to reproduce this. Can you provide the steps which you took to produce the problem?
What language are you using to process the records? Are you using one of the official multilang KCLs?
Thanks
eyesoftime commented Jun 29, 2015
I was loading records into the stream, each one about 130 KB of arbitrary data with indexes applied for tracking. Initially I started out with one shard. After about 2500 records I split the shard into two, and after another 2500 records I merged the new shards. So there are four shards altogether. There were no consumers running at the time (but that doesn't really change the outcome).
Then I started one daemon with the Python sample application (with additional logging added, again for tracking). While it was consuming records from the first shard, I started the second daemon, which then idled because the first shard hadn't been fully consumed yet and the second and third shards are children of the first. When the consumer of the first shard reached the end, the first daemon died with the NPE. It happened repeatedly, whether running the daemons on the same EC2 instance or one in parallel on my local machine. The same thing happened when the test was done 2-1-2 with shards, i.e. merging then splitting; in that case it also died when going from one shard to two.
Hope it helps you.
kevincdeng (Contributor) commented Jun 29, 2015
Thanks for the information. I did not do the shard merge in my own test, so that might be the problem. I will do so to see if that reproduces the problem.
kevincdeng (Contributor) commented Jun 29, 2015
I have reproduced the problem. The problem isn't with MultiLangRecordProcessor per se, but rather with the Worker implementation.
A Worker will sometimes call shutdown on an IRecordProcessor even if initialize has not been called on the same instance. Since MultiLangRecordProcessor uses its initialize method to construct certain fields, and its shutdown method assumes that those fields have been initialized, an NPE occurs.
Once again thank you for reporting the problem. It will be fixed in a future release.
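The failure mode kevincdeng describes can be sketched in a few lines. This is a hypothetical Python analog (the real classes are the Java `Worker` and `MultiLangRecordProcessor`; `RecordProcessor`, its fields, and the return strings here are invented for illustration): `shutdown` touches state that only `initialize` creates, so a worker that calls `shutdown` on a never-initialized processor crashes, which is Python's `AttributeError` where Java raises the NPE.

```python
class RecordProcessor:
    """Toy stand-in for MultiLangRecordProcessor's lifecycle contract."""

    def initialize(self, shard_id):
        # Fields are constructed only here, as described in the comment above.
        self.shard_id = shard_id

    def shutdown(self, reason):
        # Assumes initialize() already ran; crashes otherwise.
        return "drained %s (%s)" % (self.shard_id, reason)


processor = RecordProcessor()
try:
    processor.shutdown("TERMINATE")  # worker skipped initialize()
    failed = False
except AttributeError:               # Python's analog of the Java NPE
    failed = True
```

The normal path (initialize first, then shutdown) works fine; only the skipped-initialize ordering blows up.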
kevincdeng added the fix in progress label Jul 1, 2015
pboocock commented Jul 28, 2015
I'm using the Python wrapper for this package, and I seem to be running into the same issue. Is there a workaround you can recommend to guarantee that initialize always gets called?
kevincdeng (Contributor) commented Jul 28, 2015
If your code doesn't need the shard id, you might be able to place it in the constructor of the class instead. Do you absolutely need initialize to be called? If it's just to ensure proper functioning of the shutdown method, adding a flag to check whether initialization has happened might be sufficient.
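The flag-based guard suggested above can look like this in a Python processor. A minimal sketch with invented names (`SafeRecordProcessor` is not part of any library): shutdown becomes a no-op unless initialize actually ran.

```python
class SafeRecordProcessor:
    """Record processor that tolerates shutdown() without initialize()."""

    def __init__(self):
        # Anything that doesn't need the shard id can be set up here instead.
        self._initialized = False

    def initialize(self, shard_id):
        self.shard_id = shard_id
        self._initialized = True

    def shutdown(self, reason):
        if not self._initialized:
            # Never initialized, so there is nothing to checkpoint or clean up.
            return None
        return "checkpointing %s before %s" % (self.shard_id, reason)
```

The guard only protects your own code; it does not stop the Worker from calling shutdown on an uninitialized instance, it just makes that call harmless.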
kevinrivers commented Nov 16, 2015
Was this fixed in a recent version?
soulcutter commented Jan 28, 2016
This remains unfixed. MultiLangRecordProcessor has not been changed since Oct 2014: https://github.com/awslabs/amazon-kinesis-client/blob/73ac2c0e25a25776cbc88f2c685223fb049e6757/src/main/java/com/amazonaws/services/kinesis/multilang/MultiLangRecordProcessor.java
I was able to reproduce this issue on 1.6.1 (the current latest version)
findchris commented Mar 3, 2016
@kevincdeng ETA here?
pboocock commented Mar 21, 2016
@kevincdeng @findchris FWIW, I've found success by setting the failoverTimeMillis property in the .properties file to a high number (e.g. 100 s).
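For reference, the workaround above is one line in the KCL .properties file. A sketch (100000 ms is the value the commenter reports; tune to your own lease-handoff tolerance, since a higher value also delays failover when a worker genuinely dies):

```properties
# Lease failover time in milliseconds. Raising it from the default reduces
# how aggressively another worker takes over a lease, which is the handoff
# path where the uninitialized-shutdown NPE was being hit.
failoverTimeMillis = 100000
```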
Kahn commented Apr 4, 2016
This apparently shipped in https://github.com/awslabs/amazon-kinesis-client#release-162-march-23-2016. @manango can you close or merge this please? It's confusing to leave it open.
The issue has been resolved in the 1.6.2 release. Closing the issue.
manango closed this Apr 5, 2016
manango removed the fix in progress label Apr 5, 2016
prashantalhat commented Dec 19, 2016
I am facing a similar issue in https://github.com/awslabs/amazon-kinesis-client-net. Can someone please help?
eyesoftime commented Jun 29, 2015 (original issue description)
If a stream with one shard is split into two shards and two daemons are run starting from the trim horizon of the shards, the daemon processing the parent shard dies with a NullPointerException when it reaches the end of that shard. The second daemon takes over processing of the child shards, but one of the daemons has exited. This happens with 1.2.0 as well as 1.4.0.