Internal Service Error caused subsequent kinesis requests to fail #30

Open
akshaykailaje opened this issue Oct 23, 2015 · 3 comments

Hi

We use the Kinesis Producer Library (v0.10.1) to push events to our Kinesis stream. On 10/20/2015 at 22:32 Pacific Time, we saw a steep rise in errors from the Kinesis Producer Library. The errors appear to have been triggered by an "Internal Service Error" on the Kinesis side, but we didn't see any outages on the AWS service health dashboard.
The errors cleared only after we restarted our Tomcat server.

We also saw a spike in PutRecord latency at the same time, which seems to have been caused by the errors.
[Screenshot: PutRecord latency spike]

Can you please give us insight into why these errors would occur?

Initial error:

Error while logging Kinesis Record - attempts=5, attemptDetails={"errorMessage":"Internal service failure.","duration":5571,"errorCode":"InternalFailure","successful":false,"delay":10013},{"errorMessage":"Internal service failure.","duration":2213,"errorCode":"InternalFailure","successful":false,"delay":4425},{"errorMessage":"Internal service failure.","duration":41,"errorCode":"InternalFailure","successful":false,"delay":5002},{"errorMessage":"Expired while waiting in HttpClient queue","duration":55568920,"errorCode":"Exception","successful":false,"delay":-55566188},{"errorMessage":"Record has reached expiration","duration":0,"errorCode":"Expired","successful":false,"delay":0}

Subsequent Errors:
Error while logging Kinesis Record - attempts=6, attemptDetails={"errorMessage":"Internal service failure.","duration":5571,"errorCode":"InternalFailure","successful":false,"delay":9602},{"errorMessage":"Internal service failure.","duration":2213,"errorCode":"InternalFailure","successful":false,"delay":4425},{"errorMessage":"Internal service failure.","duration":41,"errorCode":"InternalFailure","successful":false,"delay":5002},{"errorMessage":"Expired while waiting in HttpClient queue","duration":55568920,"errorCode":"Exception","successful":false,"delay":-55566188},{"errorMessage":"Expired while waiting in HttpClient queue","duration":55569332,"errorCode":"Exception","successful":false,"delay":-55568920},{"errorMessage":"Record has reached expiration","duration":0,"errorCode":"Expired","successful":false,"delay":0}
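
For reference, a minimal sketch of how these per-attempt details can be read from the KPL result future, assuming the standard amazon-kinesis-producer Java API (the wrapper class below is hypothetical, not our actual logging code):

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ExecutionException;

import com.amazonaws.services.kinesis.producer.Attempt;
import com.amazonaws.services.kinesis.producer.KinesisProducer;
import com.amazonaws.services.kinesis.producer.UserRecordFailedException;
import com.amazonaws.services.kinesis.producer.UserRecordResult;

public class AttemptLogger {
    public static void putAndLog(KinesisProducer producer, String stream,
                                 String partitionKey, byte[] payload) {
        try {
            // Blocking on the future keeps the sketch short; real code would
            // attach a callback to the returned ListenableFuture instead.
            UserRecordResult result =
                producer.addUserRecord(stream, partitionKey, ByteBuffer.wrap(payload)).get();
            System.out.println("put succeeded, sequence number " + result.getSequenceNumber());
        } catch (ExecutionException e) {
            if (e.getCause() instanceof UserRecordFailedException) {
                UserRecordResult failed = ((UserRecordFailedException) e.getCause()).getResult();
                // Each Attempt carries the same fields as the attemptDetails above:
                // error code, error message, duration and delay for every retry.
                for (Attempt a : failed.getAttempts()) {
                    System.err.printf("errorCode=%s errorMessage=%s duration=%d delay=%d%n",
                        a.getErrorCode(), a.getErrorMessage(), a.getDuration(), a.getDelay());
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```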


perryn commented Jul 14, 2016

Hi @akshaykailaje,

We are seeing similar behaviour - did you ever get to the bottom of this?

Cheers
Perryn


pfifer commented Feb 15, 2017

Internal service errors indicate that something has gone wrong inside the Kinesis service. There is always a chance you will see them; normally they recover automatically. Looking at the errors you provided, it looks like there may have been an issue that was causing requests to take longer than expected. The long request times may have caused the internal service failures and the subsequent record expiration.

Are you seeing consistent levels of Internal Service Failures?
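
If record expiration is the main symptom, one setting worth checking is the record TTL and request timeout on the producer configuration, since records that cannot be sent before the TTL elapses are dropped as "Expired". A minimal sketch, assuming the KPL Java configuration API; the values are illustrative, not recommendations:

```java
import com.amazonaws.services.kinesis.producer.KinesisProducer;
import com.amazonaws.services.kinesis.producer.KinesisProducerConfiguration;

public class ProducerFactory {
    public static KinesisProducer create() {
        KinesisProducerConfiguration config = new KinesisProducerConfiguration()
            // Region of the target stream (placeholder value).
            .setRegion("us-east-1")
            // How long a record may wait inside the KPL before it is dropped as
            // "Expired". The default is 30000 ms; a larger value gives retries more
            // room to ride out a transient slowdown, at the cost of staler data.
            .setRecordTtl(120000)
            // Per-request timeout in ms; requests exceeding this are retried.
            .setRequestTimeout(6000);
        return new KinesisProducer(config);
    }
}
```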


rakhu commented Sep 1, 2018

We are getting these failures frequently with the 0.12.9 jar on Windows 2012 R2 Standard (VMware, Intel 2 GHz, 4 processors, 6 GB RAM).
During these failures the CPU goes to 100% and brings down the Windows server. As a workaround I call destroy() in the onFailure callback (flushSync() followed by destroy() didn't help), but that way I lose all of the outstanding records. Can you help me find a way to solve this? At the very least, I need a way to get the failed messages back so I can reprocess them.

Note: all default Kinesis configuration values are used.
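
A minimal sketch of one way to keep failed payloads for reprocessing instead of destroying the producer, assuming the KPL Java API and Guava futures (the class and retry queue here are hypothetical):

```java
import java.nio.ByteBuffer;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

import com.amazonaws.services.kinesis.producer.KinesisProducer;
import com.amazonaws.services.kinesis.producer.UserRecordResult;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.ListenableFuture;
import com.google.common.util.concurrent.MoreExecutors;

public class RetryingPutter {
    // Failed payloads land here and can be re-submitted by a separate thread.
    private final BlockingQueue<byte[]> retryQueue = new LinkedBlockingQueue<>();
    private final KinesisProducer producer;
    private final String streamName;

    public RetryingPutter(KinesisProducer producer, String streamName) {
        this.producer = producer;
        this.streamName = streamName;
    }

    public void put(final String partitionKey, final byte[] payload) {
        ListenableFuture<UserRecordResult> f =
            producer.addUserRecord(streamName, partitionKey, ByteBuffer.wrap(payload));
        Futures.addCallback(f, new FutureCallback<UserRecordResult>() {
            @Override
            public void onSuccess(UserRecordResult result) {
                // Record was delivered; nothing to do.
            }

            @Override
            public void onFailure(Throwable t) {
                // Keep the original bytes instead of destroying the producer,
                // so the record can be re-submitted later.
                retryQueue.offer(payload);
            }
        }, MoreExecutors.directExecutor()); // older Guava versions have a two-argument overload
    }

    public byte[] takeFailedRecord() throws InterruptedException {
        return retryQueue.take();
    }
}
```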
