-
-
Notifications
You must be signed in to change notification settings - Fork 19.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix M100 Free Memory Checker #6302
Conversation
M100 had numerious changes and quit working. Part of the problem is the overloading of the SERIAL_PROTOCOL functions. Also, some of the address arithmatic was changed to use char *ptr and passing ptr into the SERIAL_PROTOCOL functions caused them to try to print a string instead of a number. M100 is working again. Let's keep it that way! M100 has been expanded to now have a function int free_memory_is_corrupted() that can be called from other code to see if the free space is still contiguous. It may make sense to add a flag to control its verbose nature but right now, the extra chit chat is very helpful to know int free_memory_is_corrupted() is doing the right thing and what it found at various points when it was called. A 'Show & Tell' is coming up with int free_memory_is_corrupted().
In consideration of the 32-bit Marlin, going forward we'll need to use |
Yeah... But let's leave this alone for the next couple of weeks... I need this stable and working. I'll have something very interesting to show you later today. The |
It's an issue of C++ overloading generally, and something we need to consider in a several places. For example, when we want to print an 8-bit value as an integer we have to cast it with |
Since we don't use any dynamic allocation —only globals that never move and stack-based variables that should remain relatively small— what kinds of issues are you catching with |
Interesting. All it does is define a |
Its the constexpr float stuff in ubl_G29.cpp.... the stack frame is getting corrupted. I know this sounds like a 1st year computer science student claiming "Compiler Bug!!!!" But.... I think I have the evidence. I'm trying to get things packaged up so you can duplicate what I'm seeing.... BUT VERY LITERALLY.... I'm seeing a big chunk of the stack frame get corrupted. |
In what instance is the stack getting corrupted? |
Patience!!!! I'll write it up and show you dumps of the stack after each command. But I'm trying to get things simplified way down so there is no argument about what is happening... But up at the top of ubl_G29.cpp there was this code: constexpr float ubl_3_point_1_X = UBL_PROBE_PT_1_X,
ubl_3_point_1_Y = UBL_PROBE_PT_1_Y,
ubl_3_point_2_X = UBL_PROBE_PT_2_X,
ubl_3_point_2_Y = UBL_PROBE_PT_2_Y,
ubl_3_point_3_X = UBL_PROBE_PT_3_X,
ubl_3_point_3_Y = UBL_PROBE_PT_3_Y; and with this commented out code 'active' (even if not invoked) the stack gets hammered SERIAL_PROTOCOLLNPAIR("Checkpoint 2:", free_memory_is_corrupted() );
if (code_seen('T')) {
/*
const float lx1 = LOGICAL_X_POSITION(ubl_3_point_1_X),
lx2 = LOGICAL_X_POSITION(ubl_3_point_2_X),
lx3 = LOGICAL_X_POSITION(ubl_3_point_3_X),
ly1 = LOGICAL_Y_POSITION(ubl_3_point_1_Y),
ly2 = LOGICAL_Y_POSITION(ubl_3_point_2_Y),
ly3 = LOGICAL_Y_POSITION(ubl_3_point_3_Y);
float z1 = probe_pt(lx1, ly1, false, g29_verbose_level),
z2 = probe_pt(lx2, ly2, false, g29_verbose_level),
z3 = probe_pt(lx3, ly3, true , g29_verbose_level);
// We need to adjust z1, z2, z3 by the Mesh Height at these points. Just because they are non-zero doesn't mean
// the Mesh is tilted! (We need to compensate each probe point by what the Mesh says that location's height is)
z1 -= ubl.get_z_correction(lx1, ly1);
z2 -= ubl.get_z_correction(lx2, ly2);
z3 -= ubl.get_z_correction(lx3, ly3);
*/
float z1 = probe_pt( LOGICAL_X_POSITION(ubl_3_point_1_X), LOGICAL_Y_POSITION(ubl_3_point_1_Y), false, g29_verbose_level),
z2 = probe_pt( LOGICAL_X_POSITION(ubl_3_point_2_X), LOGICAL_Y_POSITION(ubl_3_point_2_Y), false, g29_verbose_level),
z3 = probe_pt( LOGICAL_X_POSITION(ubl_3_point_3_X), LOGICAL_Y_POSITION(ubl_3_point_3_Y), true , g29_verbose_level);
// We need to adjust z1, z2, z3 by the Mesh Height at these points. Just because they are non-zero doesn't mean
// the Mesh is tilted! (We need to compensate each probe point by what the Mesh says that location's height is)
z1 -= ubl.get_z_correction(LOGICAL_X_POSITION(ubl_3_point_1_X), LOGICAL_Y_POSITION(ubl_3_point_1_Y));
z2 -= ubl.get_z_correction(LOGICAL_X_POSITION(ubl_3_point_2_X), LOGICAL_Y_POSITION(ubl_3_point_2_Y));
z3 -= ubl.get_z_correction(LOGICAL_X_POSITION(ubl_3_point_3_X), LOGICAL_Y_POSITION(ubl_3_point_3_Y));
do_blocking_move_to_xy((X_MAX_POS - (X_MIN_POS)) / 2.0, (Y_MAX_POS - (Y_MIN_POS)) / 2.0);
tilt_mesh_based_on_3pts(z1, z2, z3);
}
SERIAL_PROTOCOLLNPAIR("Checkpoint 3:", free_memory_is_corrupted() ); |
So it was crashing? If so, that does tend to look like a compiler problem. |
Some times it crashes... Some times things are just fine. It just depends on where the block of corruption happens. I can do things to move the corruption around. And it sure looks and feels like a compiler bug. But I'm very hesitant to say it is just yet. But, declaring things constexpr should not have any impact on things in the middle of the stack!!!!! Like I say.... I'll get it simplified way down, and then let you play with it. But M100 is critical to being able to see what is happening... So.... no changes to M100 until we understand this other problem!!! |
Is |
Well... Technically no. But practically speaking... Yes. If you want to see the stack frame from the previous run, you would not want to do a M100 I. The RAM memory inside the AVR does not get reset or cleared from the previous run. So... It is possible to do an 'autopsy' and look at previous stack information if you want. But practically speaking... you want to do a M100 I first thing. |
Perhaps the handling of This way you can do |
It could... I just did it that way to make turning off features closer to the top of the file. It probably would be best to have 'I' at the top. BUT... Let's wait a few days to do that. I'm might have a simplified case ready to check in for you to look at soon. |
Ok. I will hold off. I have an update of |
Unfortunately... My simplification isn't causing the problem, and while I'm getting corruption, it is not in the middle of the stack frame. It is much more close to the 'frame pointer' now. But still... You will be able to see the problem. Do you have a graphical LCD Panel? If so... That would be best because that squeezes the stack space the worst. But here is what you need to duplicate the problem. (I'm waiting until Travis says its OK to merge)
I added a G299 command to simplify things. I was hoping that would cause the problem because that would be just a tiny bit of code causing the issue and easier to examine. But that failed. Removing (or shrinking) the G299 function very possibly will move the corruption to the middle of the stack frame for you. But if that fails.... I can do cut & paste dumps of what I see at each step of the process. |
Sounds mysterious. I'm champing at the bit to solve this one. |
I'm wondering... I wonder if this is related to the LCD Panel and its indirect function calls ? If you want, I can merge so you can take a look. But I don't want to scream "Compiler Bug!" without knowing for a fact that is what is doing this. I'm definitely seeing corruption... And I can cut and paste the log of everything for you to look at. But I don't have rock solid proof yet. Arrrrgh! Travis doesn't like something. It will be tomorrow before I can make progress on this. But I'll try to simplify things and get it to a very trivial example that causes the problem to show up. |
See if you can isolate the exact line where the problem manifests and add logging. One way the stack can blow up if you call |
Not at the moment. It's still in Oregon waiting to be shipped here.
Due to U8Glib? |
Well, that PR needs a huge amount of cleanup, and should be squashed to a single commit before being merged anyway. If you need anything tested, just ping me and I will do so. |
M100 had numerious changes and quit working. Part of the problem is
the overloading of the SERIAL_PROTOCOL functions. Also, some of the
address arithmatic was changed to use char *ptr and passing ptr into the
SERIAL_PROTOCOL functions caused them to try to print a string instead
of a number. M100 is working again. Let's keep it that way!
M100 has been expanded to now have a function int
free_memory_is_corrupted() that can be called from other code to see if
the free space is still contiguous. It may make sense to add a flag to
control its verbose nature but right now, the extra chit chat is very
helpful to know int free_memory_is_corrupted() is doing the right thing
and what it found at various points when it was called. A 'Show &
Tell' is coming up with int free_memory_is_corrupted().