After recording for about 5 minutes, the Aeron Client check timeout exception #373
Comments
Stat is below:
Hi,
@nitsanw, I am using the master trunk as of 2017.6.8. I wrote a recording control that reads the list of streams and channels to record, then calls ArchiverConductor's startRecording.
@nitsanw, I have about 8 streams to record, and the publishers are spread over several machines.
@jordanxlj ArchiverConductor is an internal API, so no direct interaction with it is intended. You should launch an ArchivingMediaDriver/LightweightArchivingMediaDriver and interact via the control stream.
Can you update your version of Aeron too? You are a month behind.
I will think about updating, but in the meantime I will also try to locate the bug. I will collect more debug information from the media driver.
Is the cause that I have recorded too much data?
I think you can save yourself some time by updating (to avoid any bugs we may have fixed) and switching to the intended usage (to avoid bugs resulting from unplanned interaction).
192.168.0.11 debug log. I started the recording on this host.
192.168.0.10 debug log.
192.168.0.12 debug log.
I use the multicast address 225.0.1.3 to send data and 224.110.110.111 to send topology information.
I send various messages, ranging in length from 200 bytes to 1.3 MB. Different messages are sent at different frequencies.
My code is shared below.
@mjpt777, could you help look at this problem? Thanks.
@jordanxlj Have you updated to the latest version of Aeron from master?
No, I haven't updated to the latest master; I plan to update after the next release comes out. I thought the problem was unrelated to the version. When an offer fails, I retry up to 5 times. I found that the heavier the retrying, the more likely the problem is to occur.
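For readers following along, a bounded retry of this kind could look roughly like the sketch below. It is not the poster's code, which is not shown here. The `LongSupplier` stands in for a call such as `Publication.offer(...)`, which returns a negative value when the offer fails; the class name, method name, and the pause between attempts are all assumptions made for illustration.

```java
import java.util.concurrent.locks.LockSupport;
import java.util.function.LongSupplier;

// Hypothetical helper for illustration; not part of Aeron.
final class BoundedOfferRetry
{
    /**
     * Invokes offerAttempt up to maxAttempts times, pausing between failed attempts.
     * Returns the first non-negative result, or the last negative result if every attempt fails.
     */
    static long offerWithRetry(final LongSupplier offerAttempt, final int maxAttempts, final long parkNanos)
    {
        long result = -1L;
        for (int i = 0; i < maxAttempts; i++)
        {
            result = offerAttempt.getAsLong();
            if (result >= 0)
            {
                return result; // success: a non-negative value, standing in for the new position
            }
            if (i + 1 < maxAttempts)
            {
                // Pause rather than spin, so other threads get a chance to run (assumed policy).
                LockSupport.parkNanos(parkNanos);
            }
        }
        return result;
    }
}
```

With a real publication the call might be written as `offerWithRetry(() -> publication.offer(buffer, 0, length), 5, 1_000L)`, where `publication`, `buffer`, and `length` are supplied by the caller.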
I will switch to unreliable mode to see whether the problem still occurs.
In the 192.168.0.11 debug log, at line 52088, the timeout check fires and the archiver is closed. I see that CMD_IN_KEEPALIVE_CLIENT is always logged; how can I find which client corresponds to the recorder?
This could happen due to a large GC pause whereby your client was not able to perform the duty cycle for over 5 seconds. It could also happen due to resource starvation in the client, or the archive conductor thread being blocked waiting on some action such as IO. You could increase the Given that you are using development code in a non-standard way and not keeping up with the latest build, we cannot offer support without a support contract.
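To make the failure mode concrete, here is a minimal sketch of a check that fails when the gap between two duty-cycle calls exceeds a timeout. It only illustrates the idea described above and is not Aeron's `ClientConductor` code; the class name, method name, and the use of `IllegalStateException` are assumptions.

```java
// Illustration only; not Aeron's implementation.
final class ServiceIntervalCheck
{
    private final long timeoutNs;
    private long lastServiceNs;

    ServiceIntervalCheck(final long timeoutNs, final long startNs)
    {
        this.timeoutNs = timeoutNs;
        this.lastServiceNs = startNs;
    }

    /**
     * Called once per duty cycle with the current time in nanoseconds.
     * Throws if the gap since the previous call exceeds the timeout.
     */
    void onDutyCycle(final long nowNs)
    {
        final long gapNs = nowNs - lastServiceNs;
        lastServiceNs = nowNs; // record this call even when the check fails
        if (gapNs > timeoutNs)
        {
            throw new IllegalStateException("Timeout between service calls over " + timeoutNs + "ns");
        }
    }
}
```

Anything that keeps the calling thread away from `onDutyCycle` for longer than the timeout, such as a long GC pause or a blocking IO call on the same thread, would trip a check like this on the next call.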
After recording for about 5 minutes, the Aeron client's timeout check throws an exception. The exception is below:
io.aeron.exceptions.ConductorServiceTimeoutException: Timeout between service calls over 5000000000ns
at io.aeron.ClientConductor.onCheckTimeouts(ClientConductor.java:508)
at io.aeron.ClientConductor.doWork(ClientConductor.java:431)
at io.aeron.ClientConductor.doWork(ClientConductor.java:143)
at org.agrona.concurrent.AgentInvoker.invoke(AgentInvoker.java:88)
at io.aeron.archiver.ArchiveConductor.doWork(ArchiveConductor.java:116)
at org.agrona.concurrent.AgentRunner.run(AgentRunner.java:140)
at java.lang.Thread.run(Thread.java:745)