Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase number of LWIP timers for MDNS (fixes #7333) #7589

Merged
merged 1 commit into from
Feb 16, 2023

Conversation

gneverov
Copy link

The proximate cause of this issue here, where this memory allocation fails.

timeout = (struct sys_timeo *)memp_malloc(MEMP_SYS_TIMEOUT);

This raises two questions:

  1. Why did the memory allocation fail?
  2. Why wasn't the failure more fatal (e.g., crash instead of continue as usual)?

Why did the memory allocation fail?

LWIP uses statically sized arrays from which it allocates each type of struct. In this case we exhausted the static array for the timeout struct. The size of the array is determined by MEMP_NUM_SYS_TIMEOUT which has some preprocessor magic to calculate the maximum number of allocations needed, so we should never run out. However this macro only knows about stuff in lwip/core; it doesn't know about lwip/apps.

From lwip/apps, we use mdns. In the code for mdns, there is this well hidden comment:

You need to increase MEMP_NUM_SYS_TIMEOUT by one if you use MDNS!

So we need to manually increase the value of MEMP_NUM_SYS_TIMEOUT in lwip/core from 7 to 8 and unfortunately we can't use the magic macro calculation anymore.

But it doesn't end there! A different mdns file, allocates a further 3 timers per IP version. And unlike the first file, there doesn't appear to be a comment to increase MEMP_NUM_SYS_TIMEOUT to compensate.

So that is how the new value of MEMP_NUM_SYS_TIMEOUT is calculated and it is sufficient to fix the issue.

Why wasn't the failure more fatal?

When the memory allocation fails, the code calls LWIP_ASSERT, which in turn calls LWIP_PLATFORM_ASSERT, which is defined here:

#define LWIP_PLATFORM_ASSERT(x) do { if(!(x)) while(1); } while(0)

The problem is that the parameter, x, to LWIP_PLATFORM_ASSERT is not the assert predicate, but the assert message. The message string constant is statically non-null, so this whole macro simplifies into a no-op instead of an infinite loop. This means the assert after the memory allocation failure does nothing and the program continues normally instead of halting, which would be appropriate for this non-recoverable error.

The fix for this is of course trivial; it's not necessary to fix #7333, but it is useful to catch similar bugs earlier. Currently Adafruit does not maintain its own fork of pico-sdk, so I will submit the PR to Raspberry Pi. Unless someone thinks it is worthwhile to create the fork.

Copy link
Collaborator

@DavePutz DavePutz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested and confirmed that this fixes issue #7333

@dhalbert
Copy link
Collaborator

@gneverov Could you change the base to 8.0.x and I will get this in the next 8.0.x release? Thanks!

@gneverov gneverov changed the base branch from main to 8.0.x February 15, 2023 23:30
@gneverov gneverov changed the base branch from 8.0.x to main February 16, 2023 00:43
@gneverov gneverov force-pushed the issue_7333 branch 2 times, most recently from b6ca7c7 to fd1c3ca Compare February 16, 2023 00:57
@gneverov gneverov changed the base branch from main to 8.0.x February 16, 2023 00:58
Copy link
Member

@tannewt tannewt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@gneverov
Copy link
Author

BTW, it turns out the LWIP_PLATFORM_ASSERT secondary issue is already fixed in the upstream pico-sdk. Perhaps you want to update the circuitpython submodule.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

raspberrypi 8.0.0 socket exceptions when web workflow is enabled
4 participants