Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AFUNIXSocketChannel selector not working with jetty #108

Closed
usertree opened this issue Jun 24, 2022 · 26 comments
Closed

AFUNIXSocketChannel selector not working with jetty #108

usertree opened this issue Jun 24, 2022 · 26 comments
Assignees
Labels
bug The issue describes a bug in the code verify The issue is considered fixed/done, and reassigned to the originator to verify.
Milestone

Comments

@usertree
Copy link

Are AFUNIXSocketChannel and AbstractSelector available? Can you provide an example?

@kohlschuetter
Copy link
Member

kohlschuetter commented Jun 24, 2022

Yes, they're available. Make sure to use junixsocket 2.5.0 or newer since a few bugs have been fixed there.

Here's some code to get started:

AFUNIXSelectorProvider provider = AFUNIXSelectorProvider.provider();
AFUNIXSocketChannel sc = provider.openSocketChannel();
sc.connect(AFUNIXSocketAddress.of(new File("/tmp/test.sock")));
// (work with SocketChannel API)

AFUNIXSocket socket = sc.socket(); // always succeeds
// (work with Socket API)
    
AFUNIXSocket sock = ...;
sock.getChannel(); // always succeeds as well

AbstractSelector selector = provider.openSelector();
// work with Selector API

Also see the ThroughputTest unit tests.

@kohlschuetter kohlschuetter added question The issue is a question rather than a bug report verify The issue is considered fixed/done, and reassigned to the originator to verify. labels Jun 24, 2022
@usertree
Copy link
Author

When JunixSocket is used through the Jetty extension, AFSelector.select() appears to be invalid and the corresponding key cannot be obtained.

@kohlschuetter
Copy link
Member

Can you provide some test code that shows the bug? What's the error message / stack trace?

@usertree
Copy link
Author

package com.xinba.jetty;

import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.client.HttpDestination;
import org.eclipse.jetty.client.http.HttpClientTransportOverHTTP;
import org.eclipse.jetty.io.EndPoint;
import org.eclipse.jetty.io.ManagedSelector;
import org.eclipse.jetty.io.SelectorManager;
import org.newsclub.net.unix.AFUNIXSelectorProvider;
import org.newsclub.net.unix.AFUNIXSocketAddress;
import org.newsclub.net.unix.AFUNIXSocketChannel;

import java.io.IOException;
import java.net.ConnectException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.Charset;
import java.util.Map;

/**
 * uds transport by junixsocket
 *
 * @since 2022-06-06
 */
public class SOPTransportUnixSocket extends HttpClientTransportOverHTTP {

    private String socketFile;

    private static final AFUNIXSelectorProvider PROVIDER = AFUNIXSelectorProvider.getInstance();;

    /**
     * construct
     *
     * @param unixSocketPath unix socket path
     */
    public SOPTransportUnixSocket(String unixSocketPath) {
        System.out.println("SOPTransportUnixSocket 1111");
        this.socketFile = unixSocketPath;
    }

    @Override
    protected SelectorManager newSelectorManager(HttpClient client) {
        return new SOPUnixSelectorManager(client, getSelectors());
    }

    @Override
    public void connect(InetSocketAddress address, Map<String, Object> context) {
        AFUNIXSocketChannel channel = null;
        try {
            InetAddress inet = address.getAddress();
            if (!inet.isLoopbackAddress() && !inet.isLinkLocalAddress() && !inet.isSiteLocalAddress()) {
                System.out.println("unix socket cannot connect to {}" + address.getHostString());
                throw new ConnectException("unix socket cannot connect to " + address.getHostString());
            }
            AFUNIXSocketAddress afunixSocketAddress = AFUNIXSocketAddress.of(socketFile.getBytes(Charset.defaultCharset()));
            System.out.println("accept opening");
            channel = PROVIDER.openSocketChannel();
            // channel = AFUNIXSocketChannel.open();
            boolean connected = channel.connect(afunixSocketAddress);
            // channel = AFUNIXSelectorProvider.getInstance().openSocketChannel(afunixSocketAddress);
            HttpDestination destination = (HttpDestination)context.get(HTTP_DESTINATION_CONTEXT_KEY);
            HttpClient client = destination.getHttpClient();
            System.out.println("accept starting");
            configure(client, channel);
            channel.configureBlocking(false);
            getSelectorManager().accept(channel, context);
            System.out.println("accept finished");
        } catch (Throwable throwable) {
            System.out.println("connect failed, " + throwable);
            if (throwable.getClass() == SocketException.class) {
                throwable = new SocketException("Could not connect to " + address).initCause(throwable);
            }
            if (channel != null) {
                try {
                    channel.close();
                } catch (IOException ioException) {
                    System.out.println("channel close catch ioException.");
                } finally {
                    connectFailed(context, throwable);
                }
            }
        }
    }

    class SOPUnixSelectorManager extends ClientSelectorManager {

        protected SOPUnixSelectorManager(HttpClient client, int selectors) {
            super(client, selectors);
        }

        @Override
        protected Selector newSelector() throws IOException {
            return PROVIDER.openSelector();
        }

        @Override
        protected EndPoint newEndPoint(SelectableChannel channel, ManagedSelector selector, SelectionKey key) {
            SOPUnixEndPoint unixEndPoint = new SOPUnixEndPoint((AFUNIXSocketChannel) channel, selector, key, getScheduler());
            unixEndPoint.setIdleTimeout(getHttpClient().getIdleTimeout());
            return unixEndPoint;
        }
    }
}

package com.xinba.jetty;

import org.eclipse.jetty.io.ChannelEndPoint;
import org.eclipse.jetty.io.ManagedSelector;
import org.eclipse.jetty.util.thread.Scheduler;
import org.newsclub.net.unix.AFUNIXSocketChannel;

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;

/**
 * unix endPoint
 *
 * @since 2022-06-06
 */
class SOPUnixEndPoint extends ChannelEndPoint {

    /**
     * construct
     *
     * @param channel channel
     * @param selector selector
     * @param key key
     * @param scheduler scheduler
     */
    SOPUnixEndPoint(AFUNIXSocketChannel channel, ManagedSelector selector, SelectionKey key, Scheduler scheduler) {
        super(channel, selector, key, scheduler);
    }

    @Override
    public AFUNIXSocketChannel getChannel() {
        return (AFUNIXSocketChannel) super.getChannel();
    }

    @Override
    public InetSocketAddress getLocalAddress() {
        return null;
    }

    @Override
    public InetSocketAddress getRemoteAddress() {
        return null;
    }

    @Override
    protected void doShutdownOutput() {
        try {
            getChannel().shutdownOutput();
            super.doShutdownOutput();
        } catch (IOException exception) {
            System.out.println("unix channel shutdown failed, " + exception);
        }
    }
}


@usertree
Copy link
Author

Debug finds that the code is blocked and no response is returned.
image

@kohlschuetter kohlschuetter changed the title Are AFUNIXSocketChannel and AbstractSelector available? Can you provide an example? AFUNIXSocketChannel selector not working with jetty Jun 24, 2022
@kohlschuetter kohlschuetter added bug The issue describes a bug in the code and removed question The issue is a question rather than a bug report verify The issue is considered fixed/done, and reassigned to the originator to verify. labels Jun 24, 2022
kohlschuetter added a commit that referenced this issue Jun 24, 2022
Previously, we weren't properly updating the interestOps data structure
in PollFd when a SelectionKey's interestOps were changed after the fact.

This prevented Jetty to work with junixsocket.

#108
@usertree
Copy link
Author

image
Jetty updates the read mode internally, but this interface is not adapted.

@kohlschuetter kohlschuetter added this to the 2.5.1 milestone Jun 24, 2022
@kohlschuetter
Copy link
Member

Please verify with the latest junixsocket 2.5.1-SNAPSHOT.

You may need to add the following statements to your project's Maven POM:

    <repositories>
        <repository>
            <id>snapshots-repo</id>
            <url>https://oss.sonatype.org/content/repositories/snapshots</url>
            <releases>
                <enabled>false</enabled>
            </releases>
            <snapshots>
                <enabled>true</enabled>
            </snapshots>
        </repository>
    </repositories>

@kohlschuetter kohlschuetter added the verify The issue is considered fixed/done, and reassigned to the originator to verify. label Jun 24, 2022
@usertree
Copy link
Author

OK,I'll try. thank you very much

kohlschuetter added a commit that referenced this issue Jun 24, 2022
This artifact provides junixsocket-compatible jetty connectors.

The server connector should be compatible with jetty >= 9.4.12.
The client connector should be compatible with jetty >= 10.0.8.

Related to #108
@kohlschuetter
Copy link
Member

Please also see the most recent addition to junixsocket (also available in the snapshots repository): junixsocket-jetty.

This artifact provides both a server connector (jetty >= 9.4.12) and a client connector (jetty >= 10.0.8). Please give it a try and let me know if that works for you.

@usertree
Copy link
Author

Due to historical reasons, we cannot upgrade JDK11 and Jetty10. Can we provide solutions based on JDK8 and Jetty9?

@usertree
Copy link
Author

During the verification, it is found that if the number of seletors exceeds 1, requests cannot be sent to the server. Is it possible that JunixSocket does not support concurrent requests?
image

@kohlschuetter
Copy link
Member

A great way to find more bugs, thank you!

Please try again with the latest snapshot (>= junixsocket 2.5.1-20220625.155644) or anything with commit 6d324a6 or later.

@joakime
Copy link

joakime commented Jun 26, 2022

Debug finds that the code is blocked and no response is returned.
image

Of special note, Jetty considers a blocking select() which returns 0 to be a broken NIO implementation, and is the cause of many 100% CPU utilization side effects.
Jetty will access the selector from any number of threads, often concurrently, so that select() call should never return 0.
This is basic NIO requirements, and many custom NIO implementations get this wrong.
The screenshot hints at the return 0 bug and some awkward locking.
If this is the case, you have a way to go before this will work reliably on Jetty.

@kohlschuetter
Copy link
Member

@joakime Did you try with the latest build I mentioned? Can you reproduce your issue?

I threw a bunch of parallel requests at it, and it seems to work for me, also with multiple selectors. Are you seeing a concrete issue with junixsocket's handling of Selector#select, or is this a hypothetical concern?

Regarding select() not allowed to return 0: According to the Java API documentation, it is perfectly OK to return 0:

"The number of keys, possibly zero, whose ready-operation sets were updated",

So I think that is not the NIO requirement you mention. Is there additional documentation around this point?

By the way, there was an indeed a bug in the client connector side of things (missing Selector implementation override). I've now added (a slightly modified) org.eclipse.jetty.unixdomain.server.UnixDomainTest to junixsocket-jetty's test suite, and all these jetty tests seem to pass now.

I wonder what else is missing?

Please test with the latest snapshot (20220626.130729-5 or newer), or commit d366394 or newer from git.

@gregw
Copy link

gregw commented Jun 26, 2022

@joakime I'm not so worried by that locking... if/when we do call select() from multiple threads, we expect the implementation to internally mutually exclude and one to eventually to respond and the next to then take over and block waiting for further events. So that locking is fine. Also, I think with our latest code, where the selector is always called from our producer, we should no longer do concurrent calls to select anyway. But each select is highly likely to be done by a different thread as we prefer to allow any thread woken up in a select call to proceed to handle those selections rather than hand off to another thread (likely on different CPU with cold cache etc.).

However, we definitely do expect a blocking call to select to not immediately return with 0, else we end up in a busy loop. Previously we made efforts to detect such loops as it was decided it was better to die and allow a restart rather than to continue to hammer on a non functional busy loop selector. But our currect code no longer detects such a case, rather it only has the option to try a selectNow instead before looping. So if select() does return 0 without pause, you will be in a busy loop.

@kohlschuetter
Copy link
Member

@gregw AFSelector#select should not return immediately (unless an exception is thrown, e.g., when the socket is closed).

@usertree Now it would be a great time to verify that everything works for you with the latest code. Since it includes some bug fixes, I'd like to release 2.5.1 soon. If I don't hear from you until Thursday (June 30), I will assume everything works now (but I might be slightly sad that you didn't reply). Cheers!

@usertree
Copy link
Author

Which version can I use, currently using 2.5.1 - SNAPSHOT still has seletor concurrency issues.

@kohlschuetter
Copy link
Member

kohlschuetter commented Jun 29, 2022

@usertree
When using snapshot dependencies, there is a chance that you're not on the most recent one.

Please verify that you test with 2.5.1-20220628.145818-6.

<dependencies>
  <dependency>
      <groupId>com.kohlschutter.junixsocket</groupId>
      <artifactId>junixsocket-common</artifactId>
      <version>2.5.1-20220628.145818-6</version>
  </dependency>
  <dependency>
      <groupId>com.kohlschutter.junixsocket</groupId>
      <artifactId>junixsocket-jetty</artifactId>
      <version>2.5.1-20220628.145818-6</version>
  </dependency>
</dependencies>
<repositories>
  <repository>
      <id>snapshots-repo</id>
      <url>https://oss.sonatype.org/content/repositories/snapshots</url>
      <releases>
          <enabled>false</enabled>
      </releases>
      <snapshots>
          <enabled>true</enabled>
      </snapshots>
  </repository>
</repositories>

@usertree
Copy link
Author

Sorry I can't rely on 2.5.1-20220628.145818-6 from https://oss.sonatype.org/content/repositories/snapshots/. In addition, the list of snapshots cannot be viewed.

@kohlschuetter
Copy link
Member

I'm not sure I understand what "can't rely" means in this context. I assume you're having problems accessing that snapshot?
You can browse the artifacts here: https://oss.sonatype.org/content/repositories/snapshots/com/kohlschutter/junixsocket/

If you run your tests on your own machine, try deleting your m2 repository cache for junixsocket, e.g.,, delete the contents in ~/.m2/repository/com/kohlschutter/junixsocket

If you're still having issues to get the right snapshot, I'd recommend trying to build junixsocket from source.

I'm pretty confident that we've captured the select() bugs. To rule out bugs in your implementation vs junixsocket's jetty code, try using AFSocketServerConnector from junixsocket-jetty. Instructions are here: https://github.com/kohlschutter/junixsocket/blob/master/junixsocket-jetty/README.md

@kohlschuetter
Copy link
Member

Please verify with 2.5.1; just released. Thanks!

@usertree
Copy link
Author

usertree commented Jul 4, 2022

No problem is found in the local verification with 2.5.1. The verification will be performed in the environment later.

@kohlschuetter
Copy link
Member

Closing resolved ticket due to inactivity. Please re-open if you find any problems. Thanks for reporting!

@kevink-sq
Copy link
Contributor

kevink-sq commented Jul 11, 2023

👋 requesting to reopen this issue because we ran into response problems when using the jetty org.newsclub.net.unix.jetty.AFSocketServerConnector in version 2.6.2 in production. We were looking for alternatives to address the jnr-unixsocket load issue and tried junixsocket.

symptoms:

  • when the jetty service using the AFSocketServerConnector returned a response greater than 32KB, the same first 32KB (exactly every 0x8000 bytes) was repeated. For example, if the response was 100KB, then the same first 32KB was repeated until 100KB.
  • server did not log a warning or error and the request looked like a 200 response on the server side. Only when client attempted to parse the response was there an error (especially if it was supposed to be structured data).
  • OS socket SO_SNDBUF was verified to be 212MB.
  • Swapping out the connector to Jetty 10's UnixDomainServerConnector fixed the issue (note swapping the connector was the only diff in the server, no other changes/updates were present)

@kohlschuetter
Copy link
Member

@kevink-sq Thanks for the update! This looks like a different bug. Can you please file a new issue with some code to replicate this? Cheers!

@kohlschuetter
Copy link
Member

@kevink-sq Following up with #135

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The issue describes a bug in the code verify The issue is considered fixed/done, and reassigned to the originator to verify.
Projects
None yet
Development

No branches or pull requests

5 participants