Can't communicate with kernel SCTP stack program. #603

Closed
asterwyx opened this issue Jul 25, 2021 · 10 comments

@asterwyx

I've written a simple SCTP client using the socket interface provided by Linux. Part of my code is shown below:

#include <arpa/inet.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <netinet/sctp.h>

static const uint16_t CONN_PORT   = 7780;
static const int      BUF_LEN     = 4096;
static const char     CONN_ADDR[] = "127.0.0.1";

int main(int argc, char *argv[])
{
    int sock_fd;
    int error;
    char msg_buf[BUF_LEN];
    struct sctp_sndrcvinfo info;
    struct sctp_event_subscribe sub;
    memset(&sub, 0, sizeof(struct sctp_event_subscribe));

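    sock_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
    if (sock_fd < 0)
    {
        fprintf(stderr, "socket: errno: %d, %s\n", errno, strerror(errno));
        exit(EXIT_FAILURE);
    }

    /* Subscribe to data I/O and association change notifications. */
    sub.sctp_data_io_event = 1;
    sub.sctp_association_event = 1;
    error = setsockopt(sock_fd, SOL_SCTP, SCTP_EVENTS, (char *)&sub, sizeof(sub));
    if (0 != error)
    {
        /* setsockopt() returns -1 on failure; the actual cause is in errno. */
        fprintf(stderr, "SCTP_EVENTS: errno: %d, %s\n", errno, strerror(errno));
    }
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = inet_addr(CONN_ADDR);
    addr.sin_port = htons(CONN_PORT);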
    error = connect(sock_fd, (struct sockaddr *)&addr, sizeof(addr));
    if (0 != error)
    {
        fprintf(stderr, "Can't connect to %s, port %d, errno: %d, %s\n", inet_ntoa(addr.sin_addr), ntohs(addr.sin_port), errno, strerror(errno));
        close(sock_fd);
        exit(EXIT_FAILURE);
    }
......
}

I was confused by what happened when I ran the echo_server provided by usrsctp together with this simple SCTP client. The connection failed with this message:

Can't connect to 127.0.0.1, port 7780, errno: 111, Connection refused

Then I captured the packets to see what happened. I started Wireshark on the loopback interface and got these records:

7406 1526.417963540    127.0.0.1 -> 127.0.0.1    SCTP 122 INIT
7407 1526.417984216    127.0.0.1 -> 127.0.0.1    SCTP 50 ABORT

Why did this happen? I've also written a simple server using Linux sockets, and I've verified that my server and client can communicate with each other normally. They complete the INIT -> INIT_ACK -> COOKIE_ECHO -> COOKIE_ACK handshake.
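
For readers following the captures in this thread, here is a minimal sketch of how the first chunk of an SCTP packet can be identified from raw bytes. It assumes the IP header has already been stripped and relies only on the layout and chunk type values from RFC 4960. The names `first_chunk_type` and `chunk_name` are made up for illustration and are not part of usrsctp or lksctp.

```c
#include <stddef.h>
#include <stdint.h>

/* Chunk type values from RFC 4960, section 3.2 (only those seen in this thread). */
enum {
    CHUNK_INIT        = 1,
    CHUNK_INIT_ACK    = 2,
    CHUNK_ABORT       = 6,
    CHUNK_COOKIE_ECHO = 10,
    CHUNK_COOKIE_ACK  = 11
};

/* The SCTP common header is 12 bytes: source port (2), destination port (2),
 * verification tag (4), checksum (4). The first chunk header follows it and
 * is 4 bytes: type (1), flags (1), length (2). */
#define SCTP_COMMON_HEADER_LEN 12
#define SCTP_CHUNK_HEADER_LEN  4

/* Return the type of the first chunk in an SCTP packet, or -1 if the buffer
 * is too short to contain a common header plus one chunk header. */
int first_chunk_type(const uint8_t *pkt, size_t len)
{
    if (pkt == NULL || len < SCTP_COMMON_HEADER_LEN + SCTP_CHUNK_HEADER_LEN) {
        return -1;
    }
    return pkt[SCTP_COMMON_HEADER_LEN];
}

/* Map a chunk type to the name a packet analyzer would typically show. */
const char *chunk_name(int type)
{
    switch (type) {
    case CHUNK_INIT:        return "INIT";
    case CHUNK_INIT_ACK:    return "INIT_ACK";
    case CHUNK_ABORT:       return "ABORT";
    case CHUNK_COOKIE_ECHO: return "COOKIE_ECHO";
    case CHUNK_COOKIE_ACK:  return "COOKIE_ACK";
    default:                return "OTHER";
    }
}
```

Calling `chunk_name(first_chunk_type(buf, n))` on the two captured packets above would be expected to report INIT for the first and ABORT for the second, which is the sequence Wireshark shows.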

@tuexen
Member

tuexen commented Jul 25, 2021

You can not run the kernel stack and the userland stack at the same time using SCTP/IP. The packet containing the ABORT chunk comes from the kernel stack handling the packet containing the INIT chunk.

Either use two different hosts (and no kernel stack on the host running the program with the userland stack) or use UDP encapsulation, but I think the Linux kernel does not support this yet.

@asterwyx
Author

Why does the ABORT packet coming from the kernel stack handle the packet containing the INIT chunk? I tried running the usrsctp programs on two different hosts. This is what I did:

  1. Run echo_server on a host whose IP address is 192.168.131.1:
     ./echo_server
  2. Run client on another host whose IP address is 192.168.131.2:
     ./client 192.168.131.1 7780

Meanwhile, I ran tshark on both the NIC with IP address 192.168.131.1 and the NIC with IP address 192.168.131.2 to monitor the network traffic. I got the following records:

306 14018.662563313 192.168.131.2 -> 192.168.131.1 SCTP 186 INIT
307 14018.662722705 192.168.131.1 -> 192.168.131.2 SCTP 60 ABORT
308 14018.662999216 192.168.32.1 -> 192.168.131.2 SCTP 682 INIT_ACK 

This confused me: 192.168.32.1 belongs to a virtual NIC, and there is no route between it and 192.168.131.2. Why did echo_server switch to a different source address to send the INIT_ACK packet, and why did it even send an ABORT packet from the INIT packet's destination address to its source address?
PS: I have changed echo_server's listening port to 7780.

@tuexen
Member

tuexen commented Jul 29, 2021

When you have a kernel stack, packets get delivered to that stack. Since the kernel stack does not have states for associations handled in userland, it considers them as out of the blue and replies with a packet containing an ABORT chunk.

Two issues:

  1. Either a second SCTP stack is active on the host owning 192.168.131.1 and sends the packet with the ABORT chunk, or a middlebox between 192.168.131.2 and 192.168.131.1 sends the packet with the ABORT chunk.

  2. The server chooses the first address it thinks it can use. This is a limitation of the userland code. It doesn't know the kernel's routing table...

@asterwyx
Author

What does "have a kernel stack" mean? I installed lksctp-tools and lksctp-tools-devel on my server. Does this mean that I have installed a kernel stack? I neither ran any binary after the installation nor inserted any module into the kernel. I guess the Linux kernel supports SCTP by default? In other words, how can I remove the kernel SCTP stack? By simply uninstalling lksctp-tools and lksctp-tools-devel?

Regarding the two issues:

  1. As you say, I guess that the Linux kernel SCTP stack received the INIT packet and replied with an ABORT packet. The question then is how I can prove this.
  2. What does "it thinks it can use" mean? Why does the server choose the source address of the INIT packet as the destination address of the INIT_ACK packet?
    Thanks!

@tuexen
Member

tuexen commented Jul 29, 2021

If you are using Linux, lsmod lists the kernel modules that are loaded. Does a module with the name sctp show up?
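
The same check can be done from a program. The following is only a sketch: it assumes a Linux system where /proc/modules is readable and where the module name is the first whitespace-separated field of each line. The helper names `module_listed` and `sctp_module_loaded` are hypothetical. A kernel with SCTP compiled in rather than built as a module would not show up in this list, so a negative result is not proof that no kernel stack exists.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if a module called `name` appears as the first field of any line
 * in a /proc/modules-style stream, 0 otherwise. */
int module_listed(FILE *modules, const char *name)
{
    char line[512];
    char first[64];
    int at_line_start = 1;

    while (fgets(line, sizeof(line), modules) != NULL) {
        /* Only inspect the first field of a line, not the continuation of an
         * over-long line that fgets() returned in several pieces. */
        if (at_line_start &&
            sscanf(line, "%63s", first) == 1 &&
            strcmp(first, name) == 0) {
            return 1;
        }
        at_line_start = (strchr(line, '\n') != NULL);
    }
    return 0;
}

/* Hypothetical convenience wrapper: 1 if the sctp module is listed,
 * 0 if it is not, -1 if /proc/modules could not be opened. */
int sctp_module_loaded(void)
{
    FILE *f = fopen("/proc/modules", "r");
    int found;

    if (f == NULL) {
        return -1;
    }
    found = module_listed(f, "sctp");
    fclose(f);
    return found;
}
```

Matching only the first field matters here: in typical lsmod output, sctp also appears in the dependency column of other modules such as libcrc32c, and that alone should not count as the module being loaded.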

@asterwyx
Author

asterwyx commented Jul 29, 2021

Yes, I've just run this command:

lsmod | grep sctp

I got the following output:

sctp                  279238  2
libcrc32c              12644  4 xfs,sctp,nf_nat,nf_conntrack

So disabling the kernel SCTP stack means removing the sctp module?
I tried that just now but got this error:

rmmod: ERROR: Module sctp is in use

I've checked that my kernel client and server weren't running at the time.

@tuexen
Member

tuexen commented Jul 29, 2021

I don't have much experience with Linux, but I think you can't unload the sctp module once it is loaded. At least this was true in the past...

@asterwyx
Author

Thanks. I've looked up some references and found that removing the sctp module requires the -f option. That didn't work on my server either, so I rebooted it.

@asterwyx
Author

asterwyx commented Aug 2, 2021

Why don't we just use the destination address of the INIT packet as the source address of the INIT_ACK packet? I've looked through the code and found that the INIT_ACK packet's source address is taken from the INIT packet's destination address only in loopback scope. In all other cases, usrsctp chooses a new source address for the INIT_ACK packet. I don't understand why this is better.
My problem still exists: I can't run echo_server and client on two separate servers. They can't even set up an association. I modified the code myself, in usrsctplib\netinet\sctp_output.c, lines 6673 to 6677:

if (stc.loopback_scope) {
	over_addr = (union sctp_sockstore *)dst;
} else {
	over_addr = NULL;
}

I changed it to assign dst to over_addr directly, like this:

over_addr = (union sctp_sockstore *)dst;

I rebuilt everything and tested it again. Now the COOKIE_ECHO packet is sent and the COOKIE_ACK packet is sent too, but no data is exchanged between the two hosts. I saw consecutive COOKIE_ECHO packets being sent. It seems that the client didn't recognize the COOKIE_ACK packet and kept sending COOKIE_ECHO until it reached the maximum number of retries. What might the root cause be? I guess my modification is incomplete. Does this have something to do with the state cookie? Thanks!

@tuexen
Member

tuexen commented Aug 2, 2021

The FreeBSD kernel stack uses the IP layer to determine the source address (based on the routing table). The userland stack does not have this functionality.
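
To illustrate what a routing-table-based choice looks like from an application, here is a sketch of a common technique: connect a UDP socket to the peer and read back the local address the kernel picked. This is not how usrsctp selects addresses and not part of its API. The function name `kernel_source_addr` is hypothetical, and the sketch covers IPv4 only. An application could, as an assumption, use the result to decide which local address to bind its userland SCTP socket to.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ask the kernel which local IPv4 address it would use to reach dst_ip.
 * On success, writes the dotted-quad address into out (which should hold at
 * least INET_ADDRSTRLEN bytes) and returns 0. Returns -1 on any failure. */
int kernel_source_addr(const char *dst_ip, char *out, socklen_t out_len)
{
    struct sockaddr_in dst;
    struct sockaddr_in local;
    socklen_t local_len = sizeof(local);
    int fd;
    int rc = -1;

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(9); /* arbitrary port; no packet is sent */
    if (inet_pton(AF_INET, dst_ip, &dst.sin_addr) != 1) {
        return -1;
    }

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        return -1;
    }

    /* connect() on a UDP socket only records the peer and makes the kernel
     * do a route lookup; getsockname() then reveals the chosen source. */
    memset(&local, 0, sizeof(local));
    if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) == 0 &&
        getsockname(fd, (struct sockaddr *)&local, &local_len) == 0 &&
        inet_ntop(AF_INET, &local.sin_addr, out, out_len) != NULL) {
        rc = 0;
    }

    close(fd);
    return rc;
}
```

On the server in this thread, calling `kernel_source_addr("192.168.131.2", buf, sizeof(buf))` would presumably return the address of the interface that actually routes to the client, rather than the first usable address in the interface list.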

Regarding the COOKIE-ACK: Can you enable the debug output to get an idea of why it is not accepted? Which IP addresses are used for the handshake?
