Can we detect how much stack space and heap memory is used by a sketch? #54
Comments
Hi, @KurtE I think it would be better to deal with this issue here than elsewhere. It would be nice to be able to measure the stack and heap usage of sketches. In an IDE such as TrueStudio, the memory area is measured and displayed in the GUI. I do not have experience with this, so I think we should try to think together. |
Hi @OpusK @routiful @kijongGil , For what it is worth, I have hacked up my own AX 12_test_OpenCM sketch to try it out. Here is my current sketch: Note: I have also checked in the changes into my github project: https://github.com/KurtE/Open_CM_CR_Arduino_Sketches
There is an init function and a print-memory-usage function that I am trying out. The init function fills memory with 0xff, and the print function then scans for the first place that is not 0xff: Run with current develop code (after last night's merge)...
Now if you run it with my updated Dynamixel SDK code, which I was talking about on a different thread, and then do a similar run, where I just enter my command: H
Let me know what you think... Still probably needs some more sophistication, but maybe at least gives some hints... |
@KurtE ,

void printMemoryUsage()
{
  // current top of the heap, as reported by _sbrk(0)
  uint8_t *current_heap_ptr = (uint8_t*)_sbrk(0);
  Serial.printf("Heap ptr: %x Usage: %d\n", (uint32_t)current_heap_ptr,
                (uint32_t)current_heap_ptr - g_start_heap_pointer);
  // stack info
  uint8_t *sp_minus = stack_ptr - 10; // leave a little slop
  uint8_t *p = current_heap_ptr;
  // try to find out how far the stack has been used
  while ((p < sp_minus) && (*p == 0xff)) p++;
  // cast the pointer so it matches the %x format
  Serial.printf("Stack Max: %x, usage: %d\n", (uint32_t)p, g_end_stack_pointer - (uint32_t)p);
  Serial.printf("Estimated unused memory: %d\n", (uint32_t)(p - current_heap_ptr));
}

The function you created seems to be very useful.
I think this result is very meaningful. There seem to be quite different memory management methods in environments with plenty of resources and in those with very few. However, if the API is not changed, it will not affect existing users, so it will be possible to optimize the internal code (even if it is separated out with #define handling) |
Hi @OpusK, Thanks.

For example, if I compile the example b_Blink_LED it says global variables use 9700 bytes of dynamic memory (almost half). The question is, can we cut that down more? For example, Serial1, Serial2 and Serial3 all have static TX and RX buffers of 128 bytes, so maybe 768 bytes there... I know on the Teensy the code was set up such that each SerialX object was in its own source file, so that it would only be included by the linker if the user used something of that Serial object. But I know there are tricks that need to be done to make it work, as the SerialEvent code would bring it in...

As a comparison, if I compile blink on a Teensy LC, global variables take up 2048 out of 8K (M0+ processor), or if I compile for the T3.2 it takes up 3436 out of 64K (M4 processor).

As for my test case in the previous post: yes, I totally understand this is more for @kijongGil and part of my other issue, which I put up against the library. I did not change any of the APIs in this comparison. What I did do was again change the underlying code, such that if I read or write 1, 2 or 4 bytes, I do not malloc the TX buffer, but instead build it on the stack (passing in a pointer to the buffer to use). I know from the sizes that for 1 or 2 bytes there is no need for space for stuffing; for 4 bytes, I added an extra byte to the buffer in case of stuffing. Currently I have separate code that does the stuffing here, while it is directly outputting the data to the buffer. Now that I have a more optimal stuffing function, I may simply want to call it. However, in those cases I would like it to not realloc the buffer, but know that enough space was allocated in the first place.

Also wondering if it would be too slimy to have the addStuffing call not call realloc if the memory did not come from the heap... That is, test whether it is in the range &end to _sbrk(0).

Again, more for the other issue. But again, I would like to get rid of realloc altogether, as it is playing with fire.
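To make the stack-buffer idea a bit more concrete, here is a rough sketch of stuffing into a buffer the caller provides, so nothing is ever realloc'ed. This is not the SDK's code: stuffInto and maxStuffedSize are made-up names, and the rule is simplified to "insert an extra 0xFD after every FF FF FD found inside the bytes passed in", ignoring any bytes that come before them in the real packet. Under that simplification each insertion needs its own three-byte pattern, so n bytes grow by at most n / 3, which matches no slack for 1 or 2 bytes and one extra byte for 4.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdint.h>

// Worst-case size after stuffing n bytes (simplified rule, see note above):
// every inserted 0xFD needs its own FF FF FD pattern, so at most n / 3 extras.
constexpr size_t maxStuffedSize(size_t n) { return n + n / 3; }

// Hypothetical helper: copy src into dst, inserting an extra 0xFD after every
// FF FF FD sequence. Returns the stuffed length, or -1 if dst (capacity cap)
// is too small. It never allocates or reallocates.
int stuffInto(const uint8_t *src, size_t n, uint8_t *dst, size_t cap) {
  assert(src != nullptr && dst != nullptr);
  size_t out = 0;
  for (size_t i = 0; i < n; i++) {
    if (out >= cap) return -1;           // no room for the data byte
    dst[out++] = src[i];
    bool pattern = i >= 2 && src[i] == 0xFD &&
                   src[i - 1] == 0xFF && src[i - 2] == 0xFF;
    if (pattern) {
      if (out >= cap) return -1;         // no room for the stuffing byte
      dst[out++] = 0xFD;
    }
  }
  return static_cast<int>(out);
}
```

A caller writing 4 bytes could then declare `uint8_t tx[maxStuffedSize(4)];` on the stack and treat a -1 return as a programming error rather than a reason to grow the buffer.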
Example of the realloc problem: suppose you call malloc(32) and get a txBuffer pointer, then later call realloc for 38 bytes, and some other things were allocated in the space right after txBuffer. realloc will allocate memory at a different location in the heap, copy your 32 bytes into it, return the new pointer to you AND release the old memory buffer. But then you return from this call, and the caller continues to use the OLD location and tries to write that data out, and then it frees the old memory, which typically will cause the heap to be corrupted. But again, more for the other issue. |
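If realloc has to stay for now, one way to avoid the stale-pointer problem described above is to have the growing function update the caller's pointer itself. This is only a sketch of that pattern; growBuffer is a hypothetical name and not something in the SDK.

```cpp
#include <cassert>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: grow *buf to newSize bytes. The caller's pointer is
// updated in place, so it cannot be left pointing at a block that realloc()
// has already released. Returns false (and leaves *buf valid and unchanged)
// if the allocation fails.
bool growBuffer(uint8_t **buf, size_t newSize) {
  assert(buf != nullptr && newSize > 0);
  void *moved = realloc(*buf, newSize);
  if (moved == nullptr) return false;    // old block is still owned by caller
  *buf = static_cast<uint8_t *>(moved);  // old address must not be used again
  return true;
}
```

The caller passes `&txBuffer` instead of `txBuffer`, and afterwards writes out and frees whatever `txBuffer` now points to.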
Not sure who in this list, but thought I would do a quick test to get an idea of where all of the space in the data section is... So I again compiled the example blink. This time I used the default Arduino one, only edited to change which pin number. This one does not output text to Serial. The compiler still said:
So I found the objdump program and dumped the binary with it. I then edited the file, extracted the areas above and below the data section, and used grep to find lines with <. Then I edited again to remove a few bad lines, changed spaces to commas, and imported it into Excel, where I used a function to convert the hex values to decimal and then took the difference between each line and the line below it, to get an idea of which items are taking space...
Then to show the data here, I exported again to a CSV file, changed the commas into | and ...
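The spreadsheet step above (difference between neighbouring symbol addresses) can also be done in a few lines of code. A rough sketch, assuming you already have address/name pairs pulled out of the dump; the struct and function names are made up for illustration, and the gap between two symbols also includes any alignment padding.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdint.h>
#include <algorithm>
#include <string>
#include <vector>

struct Symbol { uint32_t address; std::string name; };
struct SymbolSize { std::string name; uint32_t size; };

// Sort by address, then estimate each symbol's size as the gap to the next
// symbol. The last symbol is measured against sectionEnd.
std::vector<SymbolSize> estimateSizes(std::vector<Symbol> syms, uint32_t sectionEnd) {
  std::sort(syms.begin(), syms.end(),
            [](const Symbol &a, const Symbol &b) { return a.address < b.address; });
  std::vector<SymbolSize> out;
  for (size_t i = 0; i < syms.size(); i++) {
    uint32_t next = (i + 1 < syms.size()) ? syms[i + 1].address : sectionEnd;
    assert(next >= syms[i].address);
    out.push_back({syms[i].name, next - syms[i].address});
  }
  return out;
}
```

Depending on the toolchain, `nm` with `--print-size --size-sort` may give similar numbers directly for symbols that carry size information.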
|
Follow on to above - I started to look for some low-hanging fruit, especially larger entries, like: VirtAddVarTab (1024 bytes)
So it is simply initialized as 0, 1, 2, ... 0x1ff. Two solutions. Option 1: could change it like:
Which I tried (#ifdef the for loop I showed above). It compiles fine and sure enough data usage drops by 1024 bytes, but code size increases by the size of the static data. Option 2:
Could make this conditional, where as an option you could still define the variable, and then change the 9 places in this file, like:
to
This would remove the data and probably not grow the code... Maybe even shrink it... Looks like USB is using at least 3.5K of memory. Each Serial1/2/3 uses at least a 256 byte RX buffer, a 128 byte TX buffer and a 48 byte object... Of course there are more radical changes, where hopefully you can set up the code such that if a sketch does not use any of some main feature, then the code and data objects associated with it are not brought into the binary. Probably more than I want to chew ;) |
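For the second VirtAddVarTab option above, something along these lines is what I have in mind. It is only a sketch: VIRT_ADDR, VIRT_ADDR_COUNT and USE_VIRT_ADDR_TABLE are made-up names, and the real call sites in the EEPROM code may look different.

```cpp
#include <cassert>
#include <stdint.h>

#define VIRT_ADDR_COUNT 512  // entries 0..0x1ff, as described above

// Hypothetical build switch: define USE_VIRT_ADDR_TABLE to keep a real table.
#ifdef USE_VIRT_ADDR_TABLE
extern uint16_t VirtAddVarTab[VIRT_ADDR_COUNT];
#define VIRT_ADDR(i) (VirtAddVarTab[(i)])
#else
// The table would only ever hold its own index, so compute it instead of
// spending 1024 bytes of RAM (or flash) on storing it.
#define VIRT_ADDR(i) (static_cast<uint16_t>(i))
#endif

// Example of a call site that used to read VirtAddVarTab[idx].
uint16_t virtualAddressOf(uint16_t idx) {
  assert(idx < VIRT_ADDR_COUNT);
  return VIRT_ADDR(idx);
}
```

With the switch undefined, the compiler can usually fold VIRT_ADDR(i) away entirely, which is why this might even shrink the code.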
Changing the buffer size or building only the necessary Serials (including SerialEvent processing) can affect existing examples. First, regarding the buffer size, I think we can reduce the size if there is no problem with DXL communication. Second, as far as I can see, Arduino's official boards have created all of their Serial classes globally, so we seem to have done so too. If we need this, it will be enough to add developer options.
In this regard, I agree. |
As you mentioned, having Serial1, 2, 3 optionally is a little more work and can be a bit tricky. The issue is how to do it without impacting user programs.

With a build where all of the compiled objects are put into a library (archive), during the link only those objects which are referenced are included in the binary. That is, an object is included if anything within that compiled unit is referenced... So since Serial1, Serial2 and Serial3, as well as their buffers, are defined in variant.cpp, and obviously there are things referenced in this file, all of these objects are brought in.

So for example on Teensy builds, each of the SerialX objects is defined in its own source file. Note: on Teensy, there is actually underlying C code for each of these objects, so there is Serial1.c, Serial2.c... and later there are wrapper classes, and each of these wrappers is again in its own source file. But that is beside the point. So for example if nothing calls anything within Serial2, then in theory the Serial2 object, and currently its TX buffer, would not be brought in... The RX buffer is a 2 dimensional array as part of a lower-level object... But this could change as well. However there is the complication of the function:
Which calls all of these... And brings in all of the objects... So there are some hacks that you could do like:
Actually in this case I would probably always init the [0] entry to the Serial object, or not make that change on that line. For the others, you might then change functions like Serial1.begin() to update this object list to point to its object... Again, this can be tricky. Especially if you also want to try to get all of the internal objects DRV... |
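Roughly, the object-list hack above could look like the sketch below. It uses a host-side stand-in class so it can be tried on a PC; FakeSerial, g_activePorts and runSerialEvents are made-up names, not the core's. The point is that the dispatcher never names Serial1, Serial2 or Serial3, so it does not by itself force the linker to pull them in.

```cpp
#include <cassert>
#include <stddef.h>

// Host-side stand-in for a hardware serial class (hypothetical).
class FakeSerial {
 public:
  void begin(unsigned long baud);        // registers this port in the list
  int available() const { return pending_; }
  void inject(int n) { pending_ += n; }  // test helper: pretend bytes arrived
  int eventsSeen = 0;
 private:
  int pending_ = 0;
};

const size_t kMaxPorts = 4;
FakeSerial *g_activePorts[kMaxPorts] = {nullptr};

void FakeSerial::begin(unsigned long /*baud*/) {
  // Find this port's existing slot, or the first free one.
  size_t i = 0;
  while (i < kMaxPorts && g_activePorts[i] != nullptr && g_activePorts[i] != this) i++;
  assert(i < kMaxPorts);  // registry full
  g_activePorts[i] = this;
}

// Only ports that called begin() are polled. Returns how many events fired.
int runSerialEvents() {
  int fired = 0;
  for (size_t i = 0; i < kMaxPorts; i++) {
    FakeSerial *p = g_activePorts[i];
    if (p != nullptr && p->available() > 0) { p->eventsSeen++; fired++; }
  }
  return fired;
}
```

In a real core the per-port event callback would also have to be reachable through the object (for example a function pointer stored at begin() time), which is part of what makes this tricky.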
@KurtE , Thank you for your good suggestion. However, it is a good idea to reduce the default buffer size of the Serial class (about 64 bytes). For EEPROM, it would be better to change it to const. What do you think? |
This issue has been closed as there has been no recent activity. Please feel free to reopen this thread if you have any further opinions to share. Thanks. |
@OpusK - makes sense to close this one. This can be an ongoing type of thing, where every so often someone should take a look at the sizes of things that are part of the system and see if there is any low-hanging fruit. Examples are Serial buffers and the like. With only 3 serial objects, maybe not as critical, but for example some Teensy boards have up to 8 serial devices... And in their case for the T4, I did do some stuff similar to what is mentioned above. The SerialX objects each go into their own source file, along with their own buffers and their own default SerialEventX function. The main loop code that did stuff like if (Serial2.available()) SerialEvent2(); |
This issue has been reopened. Currently, I have other things to deal with, so after I finish them first, I will look back at this issue. |
Sort of old and who knows what applies any more so, going to go ahead and close it out |
Not sure if this is the best place to ask? This is not totally specific to OpenCM, this equally will apply to OpenCR, XEL, Ros To Arduino... But the actual code may differ.
Again not sure if this is better to ask here or RobotSource or Robotis Forum?
But wondering if there are any suggestions on how to detect how much memory is being used by a sketch. For example, I am pretty sure that when I tried to blindly merge in the current Dynamixel SDK, my test apps were failing to run properly because the stack and heap corrupted each other.
It would be great if we could somehow find out how much non-static memory a program is using, which would help to measure reductions in usage.
I don't know, for example, whether the new operator as well as malloc() call _sbrk(). If so, we might be able to enable a test for collision... (Would maybe need to change the commented-out write statement...)
But again this does not show us how close we are getting...
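For reference, a collision test of the kind mentioned above might look roughly like this. It is a host-side simulation with a fake RAM array, not the core's _sbrk: checkedSbrk, g_ram and kGuardBytes are made-up names, and it returns nullptr on failure where a real _sbrk conventionally returns -1.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdint.h>

// Simulated RAM so the sketch runs on a PC. On the board these would be the
// linker symbol "end" and the live stack pointer.
uint8_t g_ram[1024];
uint8_t *g_heapEnd = g_ram;                   // stand-in for &end
uint8_t *g_stackPtr = g_ram + sizeof(g_ram);  // stand-in for SP

const size_t kGuardBytes = 64;  // hypothetical safety margin below the stack

// sbrk-style allocator: returns the old break, or nullptr when growing the
// heap would come within kGuardBytes of the stack.
void *checkedSbrk(size_t incr) {
  assert(g_heapEnd <= g_stackPtr);
  size_t freeBytes = static_cast<size_t>(g_stackPtr - g_heapEnd);
  if (freeBytes < kGuardBytes || incr > freeBytes - kGuardBytes) return nullptr;
  uint8_t *prev = g_heapEnd;
  g_heapEnd += incr;
  return prev;
}
```

This only catches the heap growing into the stack at allocation time. It does not catch the stack growing down into the heap later.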
Was wondering if it would make sense to have the startup code fill all memory above "end" with a standard byte or some other known value.
Again assuming that _sbrk is correct and we can hack it slightly, we can hopefully find out how far the heap has grown.
We could then scan from the heap_end back to at worst case back to current stack pointer, but basically keep walking up in memory until we hit something that is NOT our prefill values and assume the stack grew at least to there... Obviously the stack could have grown beyond that if it had used those magic values, but hopefully close enough...
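The fill-and-scan idea above, as a small sketch that can be tried on a PC with a plain array standing in for the RAM between the heap end and the stack. paintRegion and findLowWater are made-up names; 0xFF is the fill value used elsewhere in this thread.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

const uint8_t kFill = 0xFF;  // prefill value

// Fill [lo, hi) with the marker value. On a board, lo would be the current
// heap end and hi a little below the current stack pointer.
void paintRegion(uint8_t *lo, uint8_t *hi) {
  assert(lo <= hi);
  memset(lo, kFill, static_cast<size_t>(hi - lo));
}

// Walk up from lo until a byte no longer holds the marker. Everything from
// the returned address up to hi is assumed to have been touched by the stack.
const uint8_t *findLowWater(const uint8_t *lo, const uint8_t *hi) {
  const uint8_t *p = lo;
  while (p < hi && *p == kFill) p++;
  return p;
}
```

The result is a lower bound on stack usage: if the deepest stack bytes happened to hold 0xFF, the scan walks past them.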
In my test app that failed earlier. I added some init code:
Which Output:
If I subtract the two, I see they differ by 7048 bytes. That might explain the failure when the buffer size was increased from 2K to 4K: a write allocated a 4K buffer and the stuffing function also used 4K on the stack, which together need 8192 bytes, more than the 7048 available, so the two collided.
Does this make sense? Has anyone already setup something like this?