Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Serial-Transport] NettyChannelFactory.closeEventLoopForChannel never returns #646

Closed
ottlukas opened this issue Nov 9, 2022 · 15 comments
Closed
Labels
bug integration-kafka-connect https://plc4x.apache.org/users/integrations/apache-kafka.html java Pull requests that update Java code MAJOR Modbus https://plc4x.apache.org/users/protocols/modbus.html

Comments

@ottlukas
Copy link
Contributor

ottlukas commented Nov 9, 2022

I'm polling a modbus device over a RS485 serial connection:


... connectionString":"modbus-rtu:serial:///dev/ttyAMA2?BAUD_RATE=19200&STOP_BITS=1&PARITY=EVEN&unit-identifier=1"

...
"sources. ... .pollReturnInterval": 5000, 
"jobs. ... .interval": 2000,

It works for a few minutes and then:


[2022-10-28 17:20:50,239] WARN [conveyor12source|task-0] Request finished with exception. Reporting
Connection as Broken (org.apache.plc4x.java.utils.connectionpool2.CachedPlcConnection:203)
java.util.concurrent.CancellationException
 
      at java.base/java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2396)
     
  at org.apache.plc4x.java.utils.connectionpool2.CachedPlcConnection.lambda$wrapReadWithTimeout$1(CachedPlcConnection.java:105)
 
      at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
       
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
 
      at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
 
      at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
 
      at java.base/java.lang.Thread.run(Thread.java:829)
[2022-10-28 17:20:50,603] INFO [conveyor12source|task-0]
Channel is closed, closing worker Group also (org.apache.plc4x.java.spi.connection.NettyChannelFactory:150)
[2022-10-28
17:20:50,614] INFO [conveyor12source|task-0] Worker Group was closed successfully! (org.apache.plc4x.java.spi.connection.NettyChannelFactory:155)
[2022-10-28
17:20:50,625] WARN [conveyor12source|task-0] Selector.select() returned prematurely 512 times in a row;
rebuilding Selector org.apache.plc4x.java.transport.serial.SerialPollingSelector@1a6dc4ab. (io.netty.channel.nio.NioEventLoop:628)
[2022-10-28
17:20:50,626] INFO [conveyor12source|task-0] Migrated 1 channel(s) to the new Selector. (io.netty.channel.nio.NioEventLoop:499)
[2022-10-28
17:20:50,637] WARN [conveyor12source|task-0] Selector.select() returned prematurely 512 times in a row;
rebuilding Selector org.apache.plc4x.java.transport.serial.SerialPollingSelector@2107b3db. (io.netty.channel.nio.NioEventLoop:628)

The Migrated 1 channel(s) and Selector.select() log entires continues endless.

I've found that the call in NettyChannelFactory to:


@Override    
public void closeEventLoopForChannel(Channel channel) {
        if (eventLoops.containsKey(channel))
{
            logger.info("Channel is closed, closing worker Group also");
            EventLoopGroup
eventExecutors = eventLoops.get(channel);
            eventLoops.remove(channel);
            eventExecutors.shutdownGracefully().awaitUninterruptibly();
 
          logger.info("Worker Group was closed successfully!");
        } else {
            logger.warn("Trying
to remove EventLoop for Channel {} but have none stored", channel);
        }
    } 

eventExecutors.shutdownGracefully() never returns even though the above log indicates it did. I've changed the call to:


eventExecutors.shutdownGracefully().addListener(e->{ LoggerFactory.getLogger(NettyChannelFactory.class).info("closeEventLoopForChannel:
Worker group finally shut down"); } ); 

in an attempt to eliminate any possible deadlocking.

Any suggestions would be much appreciated.

 

Imported from Jira PLC4X-358. Original Jira may contain additional context.
Reported by: Kielsholm.

@ottlukas ottlukas added bug driver-modbus integration-kafka-connect https://plc4x.apache.org/users/integrations/apache-kafka.html MAJOR labels Nov 9, 2022
@heyoulin
Copy link

try my fork: https://github.com/spnettec/plc4x

@chrisdutz
Copy link
Contributor

Anything in that fork, that you would like to contribute back to the project?

@chrisdutz chrisdutz changed the title NettyChannelFactory.closeEventLoopForChannel never returns [Serial-Transport] NettyChannelFactory.closeEventLoopForChannel never returns Feb 10, 2024
@BioComputerSience
Copy link
Contributor

BioComputerSience commented Jun 20, 2024

We have the same issue for Modbus TCP.
Is there something new for a fix of the bug ?

We added a timeout: workerGroup.shutdownGracefully().awaitUninterruptibly(2000);

@ottlukas
Copy link
Contributor Author

Which version of PLC4J are you running on? and if possible can you provide a PCAP ?

@BioComputerSience
Copy link
Contributor

We use PLC4J 0.11.0. Can't provide a PCAP.

@splatch splatch added java Pull requests that update Java code Modbus https://plc4x.apache.org/users/protocols/modbus.html and removed driver-modbus labels Jul 5, 2024
@markus-franke
Copy link
Contributor

@chrisdutz Does something speak against adding a timeout for the graceful shutdown of the worker groups as suggested by @BioComputerSience ?

@chrisdutz
Copy link
Contributor

No it doesn't ... feel free to whip up a PR.

@BioComputerSience
Copy link
Contributor

The PR is open. When will it be released?

@chrisdutz
Copy link
Contributor

Can't find a PR ... would you mind posting a link to this?

@chrisdutz
Copy link
Contributor

But regarding timelines ... I no longer have much time on PLC4X (More on that here https://github.com/chrisdutz/blog/blob/main/plc4x/throwing-the-towel.adoc) ... so I guess there won't be a big release any time soon, the way I currently see it unless someone from the community does it.

@BioComputerSience
Copy link
Contributor

#1701

@chrisdutz
Copy link
Contributor

Well that is already merged ... so it's in ... you just need to build it yourself.

@markus-franke
Copy link
Contributor

@chrisdutz As far as I can see the PR was just closed instead of merged. I don't see the relevant changes on "develop" branch.

@chrisdutz
Copy link
Contributor

You are absolutely right ... I wanted to merge it, not sure why I clicked "Close" ... thanks for pointing that out ... and possibly explains why I claimed to have merged it ;-)

@markus-franke
Copy link
Contributor

@chrisdutz No problem. Thanks for the merge ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug integration-kafka-connect https://plc4x.apache.org/users/integrations/apache-kafka.html java Pull requests that update Java code MAJOR Modbus https://plc4x.apache.org/users/protocols/modbus.html
Projects
None yet
Development

No branches or pull requests

6 participants