iPXE nic_drv - memory allocation failed in alloc_memblock #593
Comments
From your log output, I see that dde_ipxe selects the 82579lm driver for your platform. On qemu, where we regularly test nic_drv, the message states Generally, for solving such problems, I would grep for the error message ("memory allocation failed in alloc_memblock") in the |
I get the same message when I try to flood l4linux. After this message, l4linux hangs in qemu. |
Increasing the quota for nic_drv does not help. |
Maybe @chelmuth has a good idea of how to go about debugging this issue? (he is the original author of dde_ipxe) |
Please give the following patch a try - maybe the allocator is not able to fulfill alignment requirements?
diff --git a/dde_ipxe/src/lib/dde_ipxe/dde_support.cc b/dde_ipxe/src/lib/dde_ipxe/dde_support.cc
index 33ccecc..a315e0f 100644
--- a/dde_ipxe/src/lib/dde_ipxe/dde_support.cc
+++ b/dde_ipxe/src/lib/dde_ipxe/dde_support.cc
@@ -60,7 +60,8 @@ extern "C" void *alloc_memblock(size_t size, size_t align)
 {
     void *ptr;
     if (allocator()->alloc_aligned(size, &ptr, log2(align)).is_error()) {
-        PERR("memory allocation failed in alloc_memblock");
+        PERR("memory allocation failed in alloc_memblock (size=%zd, align=%zx)",
+             size, align);
         return 0;
     };
     return ptr; |
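The patch passes `log2(align)` to `alloc_aligned`, and a base-2 logarithm only describes the requested alignment exactly when `align` is a power of two. As a rough illustration of that conversion, here is a minimal standalone sketch. The helper names `is_power_of_two` and `log2_floor` are hypothetical and are not taken from dde_ipxe or Genode.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of dde_ipxe or Genode):
// true if 'align' is a non-zero power of two.
inline bool is_power_of_two(std::size_t align)
{
    return align != 0 && (align & (align - 1)) == 0;
}

// Hypothetical helper: integer base-2 logarithm, rounded down.
// For a value that is not a power of two, the result describes a
// weaker alignment than requested, so callers should check first.
inline unsigned log2_floor(std::size_t value)
{
    assert(value != 0);  // log2(0) is undefined
    unsigned result = 0;
    while (value > 1) {
        value >>= 1;
        ++result;
    }
    return result;
}
```

For example, an alignment of 4096 bytes maps to 12, while a value such as 24 would round down to 4 (16 bytes), which is why printing `align` in the error message helps to spot unexpected requests.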
As iPXE header files are not C++ compatible, the implementation missed proper include directives. For example, alloc_memblock() had a wrong signature, which was not detected. Now, C wrapper functions are implemented using a local API to the C++ backend. Related to #593.
@chelmuth The last patch doesn't solve the issue. If I try to flood the network, I get an endless loop of messages again:
|
The driver works fine with the lwip stack, independent of L4Linux. Tested on the same real machine. |
I now think this has something to do with LWIP calling LWIP_PLATFORM_DIAG internally. I don't think this gets connected to printf or equivalent. |
@iloskutov Thanks for posting the output. The align values are clearly a problem. In |
Am I missing something or isn't this done in the following line?
|
Uh, you are right. Sorry for the wrong track.. :-/ Looking at the file again, I see that the backing store used is always 1M. Would it make sense to make it depend on the available RAM quota instead? With the current implementation, my initial proposal to @dwaddington to increase the quota of the nic_drv has no effect on the amount of usable backing store. |
Grmpf, so I missed that drawback of the implementation. I would go for a dynamically growing backing store until the quota limit is reached, given my current knowledge about DDE iPXE: It uses the DDE kit allocator besides |
That sounds like a good way to go. :-) |
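To make the proposal concrete, the following is a minimal sketch of a backing store that grows chunk by chunk until a quota limit is reached. It is not the code from the branch and does not use the Genode or DDE kit API. The class name `Growing_backing_store`, the chunk size, and the use of `std::malloc` as a stand-in for RAM allocation are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Illustrative only: hypothetical class, not the Genode/DDE kit allocator.
// Memory is handed out bump-style from the most recent chunk; a new chunk
// is acquired on demand as long as the total stays within the quota.
class Growing_backing_store
{
    std::size_t const   _chunk_size;        // size of each acquired chunk
    std::size_t const   _quota;             // upper bound on total capacity
    std::size_t         _used_in_last = 0;  // bytes used in the newest chunk
    std::vector<void *> _chunks;            // all chunks acquired so far

public:
    Growing_backing_store(std::size_t chunk_size, std::size_t quota)
    : _chunk_size(chunk_size), _quota(quota) { }

    Growing_backing_store(Growing_backing_store const &) = delete;
    Growing_backing_store &operator=(Growing_backing_store const &) = delete;

    ~Growing_backing_store()
    {
        for (void *chunk : _chunks)
            std::free(chunk);
    }

    // Total bytes of backing store acquired so far.
    std::size_t capacity() const { return _chunks.size() * _chunk_size; }

    // Returns a pointer to 'size' bytes, or nullptr if the request cannot
    // be met without exceeding the quota.
    void *alloc(std::size_t size)
    {
        if (size == 0 || size > _chunk_size)
            return nullptr;

        // First try the remaining space of the newest chunk.
        if (!_chunks.empty() && _chunk_size - _used_in_last >= size) {
            void *ptr = static_cast<char *>(_chunks.back()) + _used_in_last;
            _used_in_last += size;
            return ptr;
        }

        // Otherwise grow, but only while the quota permits it.
        if (capacity() + _chunk_size > _quota)
            return nullptr;

        void *chunk = std::malloc(_chunk_size);
        if (!chunk)
            return nullptr;

        _chunks.push_back(chunk);
        _used_in_last = size;
        return chunk;
    }
};
```

With such a scheme, raising the quota of the component directly raises the amount of usable backing store, which a fixed 1M pool does not. The sketch ignores freeing and alignment, both of which a real driver allocator needs.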
@iloskutov and @dwaddington could you please give the branch above a try? |
@chelmuth I'll be back at work in a few days and will try it then. |
@chelmuth I have tested on the Genode master branch with your patch. My run script is https://gist.github.com/4546722
After I finish flooding l4linux, I see that memory continues to grow in your new allocator and l4linux hangs. I tested it on qemu and on real hardware. Log: https://gist.github.com/4546771 |
That's really strange. Your log shows that the allocator grows above 320K. Does it stop growing eventually? |
No, it doesn't stop. I didn't wait for it to finish. |
So, I have to give your run script and stress test a try, which will not happen before Monday unfortunately. |
Thank you. |
I did several tests during the last 3 hours. Here are my results:
I also tried to incorporate upstream fixes from the iPXE repository, which did not help. Unfortunately, I currently have no idea how to tackle this. One possible next step could be to build an iPXE ROM for Qemu and enter the command line on startup to interrupt the normal boot. Then, the original implementation could be stress-tested like our port. |
@alex-ab could you please have a look at my rebased branch above? I tested it only on OKL4, where it fails. |
@chelmuth please see commit message above |
With the commit c34bbe2caa86edc5ca61f9f0d92eb35a47f96892, I can't trigger the messages anymore. Of course, the root cause is not solved. A bad nic_session client can still cause the driver to fail if it just doesn't consume any packets ... Does the commit work for you, @dwaddington and @iloskutov? |
@alex-ab Thanks for investigating. I am wondering, what will a bad NIC client be able to do besides cutting off its own network connection? For NIC bridge this would be fatal but is this really an issue for the driver? |
What about a NIC driver acting in promiscuous mode and serving more than one nic_session client? I assumed the nic_session interface doesn't restrict the number of clients. |
Indeed, the number of clients should not be restricted by the interface (I haven't yet understood how it would restrict the number of clients to a single one). The NIC bridge is a particular example for a NIC service with multiple clients. With my statement I referred to the NIC driver, which supports only one client anyway. With "root cause", are you pointing at the NIC driver implementation or the NIC session interface? |
I'm referring to the NIC driver implementation (dde_ipxe), where the number of nic_session clients is not restricted. |
dde_ipxe uses |
Ok, I see, thanks for clarification. |
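The concern raised above is that a client which never consumes packets can exhaust the driver's memory. One common way to contain this is to bound the number of queued packets and drop new ones once the bound is reached. The sketch below illustrates that general technique only. It is not what commit c34bbe2caa86edc5ca61f9f0d92eb35a47f96892 does, and the class `Bounded_rx_queue` and its methods are hypothetical names that do not exist in dde_ipxe or the nic_session interface.

```cpp
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// Illustrative only: hypothetical receive queue with a hard packet limit.
// When the client stops consuming, further packets are dropped instead of
// consuming more and more driver memory.
class Bounded_rx_queue
{
    using Packet = std::vector<unsigned char>;

    std::size_t const  _max_packets;   // upper bound on queued packets
    std::size_t        _dropped = 0;   // number of packets dropped so far
    std::deque<Packet> _packets;       // packets waiting for the client

public:
    explicit Bounded_rx_queue(std::size_t max_packets)
    : _max_packets(max_packets) { }

    // Called on the driver side for each received packet.
    // Returns false if the packet was dropped because the queue is full.
    bool submit(Packet packet)
    {
        if (_packets.size() >= _max_packets) {
            ++_dropped;
            return false;
        }
        _packets.push_back(std::move(packet));
        return true;
    }

    // Called on the client side. Returns false if no packet is pending.
    bool consume(Packet &out)
    {
        if (_packets.empty())
            return false;
        out = std::move(_packets.front());
        _packets.pop_front();
        return true;
    }

    std::size_t pending() const { return _packets.size(); }
    std::size_t dropped() const { return _dropped; }
};
```

With one such queue per client, a stalled client only loses its own packets, which matches the expectation that a bad NIC client should at most cut off its own network connection.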
@chelmuth: With the last three commits, the 'memory allocation failed in alloc_memblock' messages don't appear anymore. Do we want to mark this issue as fixed, or should I create a new issue, since you still have the rewrite of the memory allocation of dde_ipxe in the pipeline? |
IMO all relevant parts are in your version and, no, I've no work in progress regarding this issue. |
I am trying to use the iPXE NIC driver on a Dell Optiplex 990 workstation with an Intel 82579LM Gigabit NIC. I am running the l4linux test on 32-bit Fiasco.OC.
It seems to start but then goes into an endless cycle of reporting "memory allocation failed in alloc_memblock" (see below).
Any thoughts on how to go about debugging this?
[init -> nic_drv] --- init iPXE NIC
[init -> nic_drv] scan_pci(): Found: 00:19.0 8086:1502 (rev 04) IRQ 05
[init -> nic_drv] probe_pci_device(): using driver 82579lm
[init -> nic_drv] adjust_pci_device(): PCI device 00:19.0 latency timer is unreasonabl.
[init -> nic_drv] ioremap(): bus_addr = e1a00000 len = 20000
[init -> nic_drv] snprintf not implemented
[init -> nic_drv] number of devices: 1
[init -> nic_drv] --- init rx_callbacks
[init -> nic_drv] --- get MAC address
[init -> nic_drv] 18:03:73:28:fffffffa:62
Quota exceeded! amount=4096, size=4096, consumed=4096
[init -> l4linux] upgrade quota donation for SIGNAL session
[init -> nic_drv] memory allocation failed in alloc_memblock
[init -> nic_drv] memory allocation failed in alloc_memblock
[init -> nic_drv] memory allocation failed in alloc_memblock
[init -> nic_drv] memory allocation failed in alloc_memblock
[init -> nic_drv] memory allocation failed in alloc_memblock
[init -> nic_drv] memory allocation failed in alloc_memblock
[init -> nic_drv] memory allocation failed in alloc_memblock
[init -> nic_drv] memory allocation failed in alloc_memblock