Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with Auto-IP and Multicast/socket connection #53587

Closed
bonetor opened this issue Jan 8, 2023 · 15 comments · Fixed by #54608
Closed

Issue with Auto-IP and Multicast/socket connection #53587

bonetor opened this issue Jan 8, 2023 · 15 comments · Fixed by #54608
Assignees
Labels
area: Networking bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug

Comments

@bonetor
Copy link

bonetor commented Jan 8, 2023

Hello together.

I am using the STM32H747-DISCO board with Zephyr v3.1 .

I configured a socket to receive Multicasts send via UDP by a client software running on a PC and I send back UDP packages to the client software as an answer to the received packages.

The socket is configured in the following way:

int32_t init()
{
    int32_t ret;
    struct sockaddr_in addr4;

    addr4.sin_family = AF_INET;
    addr4.sin_port = htons(MULTICAST_LISTEN_PORT);
    addr4.sin_addr.s_addr = INADDR_ANY;

    ret = configSocket((struct sockaddr *)&addr4, sizeof(addr4));
    ...
}

static int32_t configSocket(struct sockaddr *bindAddr, socklen_t bindAddrLen)
{
    struct zsock_addrinfo hints;
    struct zsock_addrinfo* pRetAddrInfo = NULL;
    const struct in_addr* multiAddr = NULL;
    struct net_if_mcast_addr* mcAddr = NULL;
	
    /* Create socket */
    udpData.sock = zsock_socket(bindAddr->sa_family, SOCK_DGRAM, IPPROTO_UDP);
    ...

    /* Bind socket */
    ret = zsock_bind(udpData.sock, bindAddr, bindAddrLen);

    /* Resolve network address */
    memset(&hints, 0, sizeof(struct zsock_addrinfo));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_DGRAM;
    hints.ai_flags = AI_PASSIVE;
    hints.ai_protocol = IPPROTO_UDP;

    ret = zsock_getaddrinfo(MULTICAST_LISTEN_ADDRESS_STR, MULTICAST_LISTEN_PORT_STR,
                             &hints, &pRetAddrInfo);
    ...

    /* Get IPv4 multicast address */
    multiAddr = &net_sin(pRetAddrInfo->ai_addr)->sin_addr;
    ....

    /* Get network interface */
    struct net_if* ptrNetIf = net_if_get_first_by_type(&NET_L2_GET_NAME(ETHERNET));
    ....

    /* Add IPv4 multicast address to network interface */
    mcAddr = net_if_ipv4_maddr_add(ptrNetIf, multiAddr);
    ....
    net_if_ipv4_maddr_join(mcAddr);
    ...
}

The socket is polled within a thread to receive the UDP packages and send back answers to the received packages via:

void thread()
{
    received = zsock_recvfrom(udpData.sock, udpData.recvBuffer,
			sizeof(udpData.recvBuffer), 0,
			&clientAddr, &clientAddrLen);
    ...
    ret = zsock_sendto(udpData.sock, udpData.sendBuffer, sizeof(struct KistlerDeviceCfg), 0,
		(struct sockaddr *)&serverAddr, serverAddrLen);
    ....
}

If a static IP is configured, zsock_recvfrom receives the Multicast and zsock_sendto sends an answer to the client software.

If the IP settings are changed during running my application from static to DHCP and a DHCP is available zsock_recvfrom receives the Multicast and zsock_sendto sends an answer to the client software.

If the IP settings are changed during running the application from static to DHCP and a DHCP is not available, Auto-IP is used and Zephyr sets an IP automatically. In that case and zsock_recvfrom receives the Multicast but zsock_sendto does not send an answer to the client software and returns an error, -EINVAL.

When debugging the issue I found out that in the function net_if_ipv4_select_src_addr in line 3215 of zephyr/subsys/net/ip/net_if.c the variable src is not set and stays NULL.

In case of Auto-IP the function call to net_ipv4_is_ll_addr(dst) in the if statement in line 3223 returns true as the Auto-IP address is a local-link address. Therefore the else statement in line 3254 is executed. Within that statement src is not set. That's why the if statement in line 3258 is executed. Within this statement net_if_ipv4_get_global_addr returns NULL and net_ipv4_unspecified_address returns NULL. That's why at the end zsock_sendto fails.

I do not understand why src is not set to a valid value in case of switching from static IP to Auto-IP. Why is src not set and zsock_sendto fails at the end?

In case of static IP the function call to net_ipv4_is_ll_addr(dst) in the if statement in line 3223 returns false and the if statement is executed. In that case line 3227 with the function call net_if_ipv4_get_best_match is executed and src is set to the static IP address.

@rlubos
Copy link
Contributor

rlubos commented Jan 10, 2023

I do not understand why src is not set to a valid value in case of switching from static IP to Auto-IP

It's not clear to me either. The logic behind address selection in case dst address is a LL address is pretty simple - use LL address as a src. The only reasonable explanation to me is that there is no LL address on an interface - are you 100% sure this isn't the case? Can you check for example with shell net iface command if the LL address is configured on the interface before trying to send a packet? Note, that IPv4 autoconf address is configured after a total delay of ~10 seconds.

I can see however another potential issue with IPv4 autoconf - it does not seem to set the subnet mask on the interface when the address is configured (/16 or autoconf address). So if the statically configured address has used a larger subnet mask, it can cause problems for example with ARP, which will not recognize the LL peer address as a part of the local subnet, and will try to request GW address instead. This could lead to errnoneous behaviour or the following error in logs in case GW address is not set:

net_arp: Gateway not set for iface 0x24004bd8

Have you seen those?

The subnet issue should likely be fixed in the IPv4 autoconf module itself, i. e. when autoconf address is set, the subnet mask on the interface should also be updated, @jukkar do you agree? I see the limitation though, that it's not possible to have IPv4 addresses configured with different sub masks on a single interface, not sure if that's a big problem though.

@rlubos
Copy link
Contributor

rlubos commented Jan 10, 2023

@bonetor Please give #53666 a try to see if it helps.

@laurenmurphyx64 laurenmurphyx64 added bug The issue is a bug, or the PR is fixing a bug platform: STM32 ST Micro STM32 priority: low Low impact/importance bug labels Jan 17, 2023
@carlescufi
Copy link
Member

@bonetor can you please see if #53666 fixes this issue?

@bonetor
Copy link
Author

bonetor commented Jan 18, 2023

@rlubos : I did see net_arp: Gateway not set for iface ..... After I applied #53666 I did not see it again.

Interface 0x24002558 (Ethernet) [1]
===================================
Link addr : 00:03:05:00:00:01
MTU       : 1500
Flags     : NO_AUTO_START,IPv4
Ethernet capabilities supported:
        10 Mbits
        100 Mbits
IPv4 unicast addresses (max 2):
        169.254.21.55 autoconf preferred infinite
IPv4 multicast addresses (max 1):
        239.253.253.239
IPv4 gateway : 0.0.0.0
IPv4 netmask : 255.255.0.0
DHCPv4 lease time : 0
DHCPv4 renew time : 0
DHCPv4 server     : 0.0.0.0
DHCPv4 requested  : 0.0.0.0
DHCPv4 state      : selecting
DHCPv4 attempts   : 3

@rlubos , @carlescufi : #53666 does not fix the issue. src is still not set to a valid value in net_if_ipv4_select_src_addr.

@rlubos
Copy link
Contributor

rlubos commented Jan 18, 2023

@bonetor Just to clarify, what is the destination address in this scenario?

@bonetor
Copy link
Author

bonetor commented Jan 18, 2023

The serverAddr in zsock_sendto is set to the Multicast address

const struct in_addr xMcast = { .s4_addr = {239, 253, 253, 239}};
struct sockaddr_in serverAddr = { .sin_family = AF_INET,
                                .sin_addr.s_addr = htonl(xMcast.s_addr) ,
                                .sin_port = htons(MULTICAST_LISTEN_PORT)};

The client software running on a PC sending and receiving Multicasts via UDP is in a network with IP starting with 10.x.x.x.

@rlubos
Copy link
Contributor

rlubos commented Jan 19, 2023

Ok, then I've got confused by your previous statement, which suggested that the destination address is also IPv4 link-local:

In case of Auto-IP the function call to net_ipv4_is_ll_addr(dst) in the if statement in line 3223 returns true

But it seems that you want to send a packet to a non-link-local destination address, having only a link-local one, and this is simply not supported by Zephyr. In such case, the link-local source addresses are explicitly ignored:
https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/net/ip/net_if.c#L3129

I think this approach is justifiable in simple implementations, given that LL addresses are not routable and should only be considered for link-local communication. RFC3927 however does mention a possibility of sending a packet to a host that does not have a LL address:

A host with an IPv4 Link-Local address may send to a destination
which does not have an IPv4 Link-Local address.

@jukkar What's your opinion, shall we introduce such a possibility? I think it should be possible to introduce a final fall-back, in case there's really no other address, to try to choose LL address here (only if autoconf is enabled):
https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/net/ip/net_if.c#L3268

It'd still be needed however to register a valid gateway address (peer address in such case), otherwise ARP module will get confused, as from its perspective the IP packet is within different subnet, so it'll try to send it via gateway.

@rlubos
Copy link
Contributor

rlubos commented Feb 1, 2023

@jukkar Any opinion on the above?

@erwango erwango removed the platform: STM32 ST Micro STM32 label Feb 1, 2023
@jukkar
Copy link
Member

jukkar commented Feb 1, 2023

A host with an IPv4 Link-Local address may send to a destination
which does not have an IPv4 Link-Local address.

Typically auto link local addresses are meant to point-to-point connection to a host with similar auto ip address, and have limited functionality (no routing etc). I would say that what we have implemented now is ok and fine, but I have no big objections if someone implements it and proposes patch, the review will then tell whether it should be accepted or not.

@bonetor
Copy link
Author

bonetor commented Feb 2, 2023

@rlubos In regard to your comment

Ok, then I've got confused by your previous statement, which suggested that the destination address is also IPv4 link-local:

The statement

In case of Auto-IP the function call to net_ipv4_is_ll_addr(dst) in the if statement in line 3223 returns true as the Auto-IP address is a local-link address.

actually says, that the destination address is a link-local address. But in the mentioned case we want to send the package from the device with a link-local address to the destination address which is the Multicast IP address.

const struct in_addr xMcast = { .s4_addr = {239, 253, 253, 239}};  // destination address
struct sockaddr_in serverAddr = { .sin_family = AF_INET,
                                .sin_addr.s_addr = htonl(xMcast.s_addr) ,
                                .sin_port = htons(MULTICAST_LISTEN_PORT)};

net_ipv4_is_ll_addr returns true, if the address is a link-local address, beginning with 169.254.

static inline bool net_ipv4_is_ll_addr(const struct in_addr *addr)
{
    return (ntohl(UNALIGNED_GET(&addr->s_addr)) & 0xA9FE0000) == 0xA9FE0000;
} 

But why does the function returns true if zsock_sendto with serverAddr.sin_addr.s_addr = htonl(239.253.253.239) is called?

@rlubos
Copy link
Contributor

rlubos commented Feb 2, 2023

But why does the function returns true if zsock_sendto with serverAddr.sin_addr.s_addr = htonl(239.253.253.239) is called?

I've now noticed that net_ipv4_is_ll_addr() is wrong, it should use a proper subnet mask (0xFFFF0000) when filtering the first two bytes of the address - otherwise, it can lead to unexpected results, for instance with multicast address, if only specific bits match.

Anyway, fixing this doesn't fix the issue, as LL source would still be ignored for multicast destinations. Please give this branch a try, where I've implemented a fallback to LL address in case no better match is found:
https://github.com/rlubos/zephyr/commits/net/use-ll-addr-as-last-resort

I've tested it briefly with the autoconf sample, and I was able to sucessfully send packets to mutlicast address or non-LL peers on the same local network. If you can confirm that this helps in your case as well, I can send a PR out of it for further review.

@bonetor
Copy link
Author

bonetor commented Feb 2, 2023

I did a first test and with the changes on your branch it works also on my side. I will so some further tests with my setup and provide you more feedback.

@rlubos
Copy link
Contributor

rlubos commented Feb 8, 2023

@bonetor Did you have a chance to do some more tests?

@bonetor
Copy link
Author

bonetor commented Feb 8, 2023

Sorry for the late feedback. The described behavior is solved with the implementations on https://github.com/rlubos/zephyr/commits/net/use-ll-addr-as-last-resort.

This issue can be closed.

@rlubos
Copy link
Contributor

rlubos commented Feb 8, 2023

We still need to open a PR :) I'll send one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Networking bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants