Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Problem]: MAC address portion of raop service name changes to all zeros, shairport is then unselectable as output destination #1657

Closed
1 task done
SoarVermont opened this issue Mar 31, 2023 · 15 comments

Comments

@SoarVermont
Copy link

What happened?

After a while random shairport nodes become unavailable on the LAN as audio output destinations. It appears that the unavailable nodes have all zeros for the MAC address portion of the _raop._tcp. service name. If the shairport-sync service is restarted the MAC address in the name is repopulated correctly and the node becomes available again. The LAN is a mesh network with multiple Eero base stations, so I'm guessing that the nodes are probably hopping between them when the MAC address changes to all zeros.

I'm not certain that this is the root cause for why certain nodes become unavailable, but whether or not they have the full MAC address in the service name seems to map to whether they're selectable as an audio output destination or not. You can otherwise communicate with them normally.

The _airplay._tcp. service seems to be unaffected by this problem - it looks good whether the node is available or not. Both services contain the correct IP addresses in their text records.

Relevant log output

No response

Configuration Information.

$ shairport-sync --displayConfig

Display Config Start.

From "uname -a":
Linux Globe1 5.10.162-ti-r58 #1bullseye SMP PREEMPT Sat Feb 25 19:31:31 UTC 2023 armv7l GNU/Linux

From /etc/os-release:
Debian GNU/Linux 11 (bullseye)

From /sys/firmware/devicetree/base/model:
TI AM335x BeagleBone Green Wireless

Shairport Sync Version String:
4.1-rc0-286-ga1c9387c-AirPlay2-OpenSSL-Avahi-ALSA-stdout-soxr-metadata-dbus-sysconfdir:/etc

Command Line:
shairport-sync --displayConfig

Configuration File:
/etc/shairport-sync.conf

Configuration File Settings:
general :
{
name = "Speaker 1";
};
alsa :
{
output_device = "plughw:0,0,0";
output_rate = 44100;
output_format = "S16_LE";
};

Display Config End.
Goodbye!

How did you install Shairport Sync?

Built from source

Check previous issues

  • Confirm
@SoarVermont
Copy link
Author

I think the shairport service is getting restarted, and then the return value of the call to get_device_id in shairport.c isn't checked for success. I replaced the call with the following snippet and so far I haven't seen the problem recur. I'll continue monitoring.

  // get a device id -- the first non-local MAC address
  while (get_device_id((uint8_t *)&config.hw_addr, 6) < 0) {
      sleep(1);
  }

@mikebrady mikebrady added bug and removed new issue labels Apr 2, 2023
@mikebrady
Copy link
Owner

mikebrady commented Apr 2, 2023

Fantastic work! The value returned by get-device_id -- either -1 or 0 -- is coming from a call to getifaddrs.

I never imagined that it would change dynamically! So I'll make some changes so that Shairport Sync will wait at startup -- in a loop like you have used above -- to get the information necessary to generate a device ID and then store it for later use.

What do you think?

@SoarVermont
Copy link
Author

Thanks! I had this happen again though last night, even with the loop until get_device_id returns a success code. I'm now thinking there might be an edge case where getifaddrs is returning all zeroes for the address. I'll investigate this further today. I had multiple power failures last night, and after the reboots the MAC address portions of the service names were all zeroes again. I do think that we need a loop in the code until a valid address is returned.

@mikebrady
Copy link
Owner

Thanks, yeah -- more information on this would be very useful. It seems to me that we only need it to be successful once, at the start, and so a loop there would be okay. But I think it would be necessary to limit the looping to something sensible so that an error message could be generated if the problem persisted for a long time.

@SoarVermont
Copy link
Author

Here's what I found: It looks like immediately after a reboot getifaddrs returns only records with sa_family=AF_INET or AF_INET6. After a few moments subsequent calls also return a record with sa_family=AF_PACKET. It seems like the simple fix will be to change common.c:get_device_id to return a failure code if it doesn't find a match and then alter the code in shairport.c:main to repeat the call to get_device_id for a reasonable number of tries before failing.

@SoarVermont
Copy link
Author

LMK if you'd like for me to submit a pull request.

@mikebrady
Copy link
Owner

Thanks for that. Can I just clarify that, as soon as getifaddrs is successful, there is a valid MAC address coming back (i.e. irrespective of the sa_family value)? If that was the case, would it not follow that all that was necessary was to loop until getifaddrs was okay (or timeout of course) and then grab the MAC address?

@SoarVermont
Copy link
Author

In the case I described, getifaddrs returned a zero status and there were valid AF_INET and AF_INET6 records but no AF_PACKET records. In that case it is insufficient to just check the return status of getifaddrs. My understanding is that you need an AF_PACKET or AF_LINK record to retrieve the MAC address, but perhaps I'm mistaken. I thought that the AF_INET and AF_INET6 records contain the IP addresses but not the MAC address.

@SoarVermont
Copy link
Author

This is how I changed the code:

int get_device_id(uint8_t *id, int int_length) {
  int response = -1;
  struct ifaddrs *ifaddr = NULL;
  struct ifaddrs *ifa = NULL;
  int i = 0;
  uint8_t *t = id;
  for (i = 0; i < int_length; i++) {
    *t++ = 0;
  }

  if (getifaddrs(&ifaddr) == 0) {
    t = id;
    int found = 0;

    for (ifa = ifaddr; ((ifa != NULL) && (found == 0)); ifa = ifa->ifa_next) {
#ifdef AF_PACKET
      if ((ifa->ifa_addr) && (ifa->ifa_addr->sa_family == AF_PACKET)) {
        struct sockaddr_ll *s = (struct sockaddr_ll *)ifa->ifa_addr;
        if ((strcmp(ifa->ifa_name, "lo") != 0)) {
          found = 1;
          response = 0;
          for (i = 0; ((i < s->sll_halen) && (i < int_length)); i++) {
            *t++ = s->sll_addr[i];
          }
        }
      }
#else
#ifdef AF_LINK
      struct sockaddr_dl *sdl = (struct sockaddr_dl *)ifa->ifa_addr;
      if ((sdl) && (sdl->sdl_family == AF_LINK)) {
        if (sdl->sdl_type == IFT_ETHER) {
          found = 1;
          response = 0;
          uint8_t *s = (uint8_t *)LLADDR(sdl);
          for (i = 0; ((i < sdl->sdl_alen) && (i < int_length)); i++) {
            *t++ = *s++;
          }
        }
      }
#endif
#endif
    }
    freeifaddrs(ifaddr);
  }
  return response;
}

@SoarVermont
Copy link
Author

I checked again and need to update the description of the behavior slightly. On the first call there was also an AF_PACKET record, but it was associated with the loopback and was ignored (ifa_name = "lo"). The wlan0 records appeared in subsequent calls. I think my altered code is still appropriate.

@mikebrady
Copy link
Owner

Thanks for this really interesting information! Let me digest it for a day or two, please.

@mikebrady
Copy link
Owner

mikebrady commented Apr 3, 2023

Thanks for the code above, which is perfect. I agree about the need to see an AF_PACKET or AF_LINK. I was going to put your code inside a ten second wait loop, as follows:

int get_device_id(uint8_t *id, int int_length) {

  uint64_t wait_time = 10000000000L; // wait up to this (ns) long to get a MAC address

  int response = -1;
  struct ifaddrs *ifaddr = NULL;
  struct ifaddrs *ifa = NULL;

  int i = 0;
  uint8_t *t = id;
  for (i = 0; i < int_length; i++) {
    *t++ = 0;
  }

  uint64_t wait_until = get_absolute_time_in_ns();
  wait_until = wait_until + wait_time;

  int64_t time_to_wait;
  do {
    if (getifaddrs(&ifaddr) == 0) {
      t = id;
      int found = 0;

      for (ifa = ifaddr; ((ifa != NULL) && (found == 0)); ifa = ifa->ifa_next) {
#ifdef AF_PACKET
        if ((ifa->ifa_addr) && (ifa->ifa_addr->sa_family == AF_PACKET)) {
          struct sockaddr_ll *s = (struct sockaddr_ll *)ifa->ifa_addr;
          if ((strcmp(ifa->ifa_name, "lo") != 0)) {
            found = 1;
            response = 0;
            for (i = 0; ((i < s->sll_halen) && (i < int_length)); i++) {
              *t++ = s->sll_addr[i];
            }
          }
        }
#else
#ifdef AF_LINK
        struct sockaddr_dl *sdl = (struct sockaddr_dl *)ifa->ifa_addr;
        if ((sdl) && (sdl->sdl_family == AF_LINK)) {
          if (sdl->sdl_type == IFT_ETHER) {
            found = 1;
            response = 0;
            uint8_t *s = (uint8_t *)LLADDR(sdl);
            for (i = 0; ((i < sdl->sdl_alen) && (i < int_length)); i++) {
              *t++ = *s++;
            }
          }
        }
#endif
#endif
      }
      freeifaddrs(ifaddr);
    }
    time_to_wait = wait_until - get_absolute_time_in_ns();
  } while ((response != 0) && (time_to_wait > 0));
  if (response != 0)
    warn("Can't create a device ID -- no valid MAC address can be found.");
  return response;
}

What would you think? (It works for me, but that's just a normal situation...)

@SoarVermont
Copy link
Author

I just tested that code on my beaglebone after a reboot. It called getifaddrs nearly 13K times before it was successful. Perhaps put a usleep call between attempts? I was originally thinking a one second delay made sense given that this only happens on startup, but I'm sure a 100 millisecond delay is also reasonable. Otherwise I think we're hitting the hardware too hard.

@mikebrady
Copy link
Owner

mikebrady commented Apr 3, 2023

Yikes, you are completely right! I’ll go with 100 ms.

int get_device_id(uint8_t *id, int int_length) {

  uint64_t wait_time = 10000000000L; // wait up to this (ns) long to get a MAC address

  int response = -1;
  struct ifaddrs *ifaddr = NULL;
  struct ifaddrs *ifa = NULL;

  int i = 0;
  uint8_t *t = id;
  for (i = 0; i < int_length; i++) {
    *t++ = 0;
  }

  uint64_t wait_until = get_absolute_time_in_ns();
  wait_until = wait_until + wait_time;

  int64_t time_to_wait;
  do {
    if (getifaddrs(&ifaddr) == 0) {
      t = id;
      int found = 0;

      for (ifa = ifaddr; ((ifa != NULL) && (found == 0)); ifa = ifa->ifa_next) {
#ifdef AF_PACKET
        if ((ifa->ifa_addr) && (ifa->ifa_addr->sa_family == AF_PACKET)) {
          struct sockaddr_ll *s = (struct sockaddr_ll *)ifa->ifa_addr;
          if ((strcmp(ifa->ifa_name, "lo") != 0)) {
            found = 1;
            response = 0;
            for (i = 0; ((i < s->sll_halen) && (i < int_length)); i++) {
              *t++ = s->sll_addr[i];
            }
          }
        }
#else
#ifdef AF_LINK
        struct sockaddr_dl *sdl = (struct sockaddr_dl *)ifa->ifa_addr;
        if ((sdl) && (sdl->sdl_family == AF_LINK)) {
          if (sdl->sdl_type == IFT_ETHER) {
            found = 1;
            response = 0;
            uint8_t *s = (uint8_t *)LLADDR(sdl);
            for (i = 0; ((i < sdl->sdl_alen) && (i < int_length)); i++) {
              *t++ = *s++;
            }
          }
        }
#endif
#endif
      }
      freeifaddrs(ifaddr);
    }
    time_to_wait = wait_until - get_absolute_time_in_ns();
    if (response != 0)
      usleep(100000);
  } while ((response != 0) && (time_to_wait > 0));
  if (response != 0)
    warn("Can't create a device ID -- no valid MAC address can be found.");
  return response;
}

@github-actions
Copy link

This issue has been inactive for 45 days so will be closed 7 days from now. To prevent this, please remove the "stale" label or post a comment.

@github-actions github-actions bot added the Stale label Jul 30, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants