Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Junction Deviation (enabled) causes curve stuttering #17920

Closed
hrabbot opened this issue May 8, 2020 · 138 comments
Closed

[BUG] Junction Deviation (enabled) causes curve stuttering #17920

hrabbot opened this issue May 8, 2020 · 138 comments

Comments

@hrabbot
Copy link

hrabbot commented May 8, 2020

Bug Description

Curved extrusions stutter (multiple full stops in the middle of a continuous curve) with Junction Deviation enabled. Enabling/Disabling S Curve and Linear Advance (indepentently and in combination) does not remedy the problem. Adjustments to acceleration and junction deviation value do not remedy the problem.

Disabling Junction Deviation remedies the problem, even with S Curve and Linear Advance both set.

My Configurations

Required: Please include a ZIP file containing your Configuration.h and Configuration_adv.h files.
Working_configs.zip
Stutter_configs.zip

Expected behavior: Smooth curve motion, as expected based on baseline experience.

Actual behavior: Stutter/halting during curve motion.

Additional Information

Configs and example video attached.

@qwewer0
Copy link
Contributor

qwewer0 commented May 8, 2020

Could you isolate a half/full circle of the gcode that stutters, and upload it?

@hrabbot
Copy link
Author

hrabbot commented May 8, 2020

Could you isolate a half/full circle of the gcode that stutters, and upload it?

Here's a snippet:

G1 X221.745 Y15.794 E9.0641
G1 X221.904 Y15.798 E0.0099
G1 X222.890 Y15.846 E0.0615
G1 X223.207 Y15.877 E0.0198
G1 X224.262 Y16.034 E0.0664
G1 X224.574 Y16.096 E0.0198
G1 X225.580 Y16.348 E0.0647
G1 X225.876 Y16.437 E0.0193
G1 X226.904 Y16.802 E0.0679
G1 X227.207 Y16.927 E0.0204
G1 X228.176 Y17.385 E0.0668
G1 X228.457 Y17.536 E0.0199
G1 X229.373 Y18.086 E0.0666
G1 X229.641 Y18.265 E0.0201
G1 X230.505 Y18.908 E0.0671
G1 X230.747 Y19.107 E0.0195
G1 X231.519 Y19.807 E0.0650
G1 X231.743 Y20.030 E0.0197
G1 X232.454 Y20.814 E0.0660
G1 X232.655 Y21.059 E0.0197
G1 X233.296 Y21.921 E0.0669
G1 X233.476 Y22.190 E0.0202
G1 X234.032 Y23.118 E0.0674
G1 X234.181 Y23.398 E0.0198
G1 X234.633 Y24.355 E0.0659
G1 X234.754 Y24.647 E0.0197
G1 X235.112 Y25.646 E0.0661
G1 X235.205 Y25.954 E0.0201
G1 X235.467 Y27.006 E0.0675
G1 X235.528 Y27.314 E0.0196
G1 X235.682 Y28.352 E0.0654
G1 X235.713 Y28.669 E0.0198
G1 X235.762 Y29.654 E0.0614
G1 X235.766 Y29.813 E0.0099

@qwewer0
Copy link
Contributor

qwewer0 commented May 8, 2020

How fast did you print?

@qwewer0
Copy link
Contributor

qwewer0 commented May 8, 2020

Can't really feel stutter in the given gcode snippet, or Can feel it a little stutter all the time.

Try it out: JD_single_arc_test_2.zip

@XDA-Bam
Copy link
Contributor

XDA-Bam commented May 8, 2020

Gcode: Issue_17920.zip

Speed limit profile (not accounting for nominal_speed_sqr) should look roughly like this:
Filter_none_issue-17920_rev4
(JD 0.013, accel 500, as per OPs config)

The problem is, that we're just on the verge of 1 mm length on those segments. So we jump between regular >1 mm v_max_junction_sqr (white background)) and the <1 mm case limit_sqr (grey background).

EDITED: Realized, that the segment length is the problem, not the junction angle (that's around 177.5° for most).

@hrabbot
Copy link
Author

hrabbot commented May 8, 2020

See zipped video example. Also note blobby curves already on printbed.
stutter.zip

@daleckystepan
Copy link
Contributor

daleckystepan commented May 8, 2020

So we should come up with solution where transition between speeds will be smooth. What about get rid of 1mm block and >135° condition and do the math always? Maybe we can do it with another/faster approximation and without division at the end of calculation (I have this approximation on my github). But I'm just guessing - I don't understand the code and movement enough.

@XDA-Bam
Copy link
Contributor

XDA-Bam commented May 8, 2020

Here's the block or segment length for the GCode in discussion:
Issue_17920_block_length

@XDA-Bam
Copy link
Contributor

XDA-Bam commented May 9, 2020

So we should come up with solution where transition between speeds will be smooth. What about get rid of 1mm block and >135° condition and do the math always? ...

Dropping those if-checks sadly doesn't solve the stutter in this case. limit_sqr is bumpy itself, as you can see from the speed limit plot.

This is a general problem of the discretization and the fact that we can't look ahead. The two JD code paths (>135° && < 1 mm vs. <135° || > 1 mm) have different problems due to this, but in essence, the code is "right" in stuttering like it does. The GCode is sliced in a way, which dictates these speed changes. From the point of view of the planner, it is logically impossible to tell if this one small element it is looking at right now with that slightly larger change in angle than the one before is the beginning of a sharper curve, or just a slicer artifact inside of a smooth, continous circle. No amount of transition between the two code paths can solve this problem (in general, it might work for specific cases).

There's only one way to safely prevent this: When planning a block, look ahead a couple of blocks to see where we are going. Then, fit an analytical function to our path (backwards and ahead) and calculate the second derivative for the block currently being planned. That would require extensive changes to the planner code and quite a bit of computational resources when running. Also, I wouldn't know how to implement this in C++ 😄

@thisiskeithb
Copy link
Member

Should we revert back to classic jerk before the next release?

@Lord-Quake
Copy link
Contributor

Lord-Quake commented May 10, 2020

I'm not sure if it is necessary to default back to classic jerk since for the most part JD is seems to work.
Going from 5.0.5.3 and then looking at each commit I was able to determine that the problem was introduced with commit 2020.05.04 and I believe this could be the culprit: c55475e

Below my results with the jump from good to bad:
Junction_Deviation_Curve_Stuttering

Edit: Corrected the commit dates.

@thinkyhead
Copy link
Member

Should we revert back to classic jerk before the next release?

It's preferable for many cases, so it might be the best choice for certain example configurations. However, this kind of suggestion implies that we are no closer to a solution.

Since jerk is relatively inexpensive, perhaps the best thing would be a hybrid approach.

Although, I am still curious to see if the "angular change within X total distance" accumulator approach could also work, insofar as it can be tracked in an efficient way.

I feel I could get pretty far with this sort of algorithm experimentation if I had nothing else to do. I mean, what we ultimately want to do is produce the ideal and perfect, but horribly expensive version first. Then try to chop that up into something simpler.

But maybe Classic Jerk is superior in every way? I can't think of too many reasons to fault it.

@thinkyhead
Copy link
Member

I believe this could be the culprit

@Lord-Quake — You may be right. Does it look better if you disable JD_USE_LOOKUP_TABLE (inside of planner.h)? If so then the LUT is missing something that the old approximation has "more right."

@Lord-Quake
Copy link
Contributor

@thinkyhead Using commit 2020.05.04 I disabled

//#define JD_USE_LOOKUP_TABLE

Result is no more stuttering. The print turned out fine.

You've now got something to work with :-)

@Lord-Quake
Copy link
Contributor

Just for the sake of completeness I tried the latest commit 86c1125 and disabled
//#define JD_USE_LOOKUP_TABLE
with the same result. The stuttering has stopped, print is fine.

I did find a new issue (cosmetics only though)

@XDA-Bam
Copy link
Contributor

XDA-Bam commented May 11, 2020

Since #17938 has already been merged and might influence this, it would be nice if someone affected could check this again (with the lookup table enabled) using most recent bugfix. It may be improved, I don't think it will be fixed.

@Lord-Quake
Copy link
Contributor

Well testing with my Ender 3 Pro is over for now.....
Uploaded the firmware as requested.
While starting the print I touched the X-Stepper motor and got a big electric shock.
The X-axis no longer moves. Looks like the stepper driver or some component related to the X-axis is defective. I will have to get a new main board :-(

@Lord-Quake
Copy link
Contributor

@XDA-Bam While my Ender 3 Pro now dead atm, with which I do all my tests I decided to update my Anet A8. Since I can't compare the degree I can say the stuttering is still evident.
Maybe someone else can also report?

@lzamel
Copy link

lzamel commented May 11, 2020

Using the latest commit (1475fd3) with #define JD_USE_LOOKUP_TABLE and I'm no longer observing any stuttering. If there's any it is minimal. Before the whole machine was shaking on arcs.

@thinkyhead
Copy link
Member

By disabling JD_USE_LOOKUP_TABLE the planner is using the older JD code. The newer JD code is supposed to be more accurate and use less processing, but it also seems to produce the side effects you are seeing. We are continuing to experiment, and will do a comparison of the numbers that come out of the current code versus the older code.

@cwizard
Copy link

cwizard commented May 14, 2020

@XDA-Bam While my Ender 3 Pro now dead atm, with which I do all my tests I decided to update my Anet A8. Since I can't compare the degree I can say the stuttering is still evident.
Maybe someone else can also report?

Bro if you are in the USA I have a spare stock motherboard I can send to you to get you back up. I used bugfix for a day and a half before I realized JD was ruining my prints.

@Lord-Quake
Copy link
Contributor

@cwizard Thanks for the offer! I'm in Germany and will try to get a replacement.

@davevo22
Copy link

Hi, where I can comment on the function JD_USE_LOOKUP_TABLE? I have CONFIGURATION_H_VERSION 020005 and I have the same problem with motion stutter.
Thank you for answer

@Lord-Quake
Copy link
Contributor

@davevo22
\Marlin\src\module\planner.h

@davevo22
Copy link

I searched there but did not find it. Row number?

@Lord-Quake
Copy link
Contributor

Around line 43

@Lord-Quake
Copy link
Contributor

Lord-Quake commented Jun 4, 2020

Here are the results using Cura_SteamEngine 15.01 (Repetier-Host) in standard mode with the same settings as before.:
Stutter_Test_3
In these examples bulges are also evident (hard to see in the picture) at the same places with/out LUT albeit fewer.
I've also enclosed the g-code file I used for the above prints: Cube_40mm_V3_Standard_1Wall.zip

@XDA-Bam
Copy link
Contributor

XDA-Bam commented Jun 4, 2020

The Cura sliced files looks quite a bit different concerning segment lengths.

Histogram detail:
Cube_40mm_V3_Standard_1Wall_Cura_histo-detail

Curvature / Equiv. radius:
Cube_40mm_V3_Standard_1Wall_Cura_radius

Note that the max Y value is a bit higher than in the earlier plots. This GCode has much fewer segments below 0.5 mm. Segment length distribution isn't super consistent, but differences between consecutive segments are less pronounced than in the GCode sliced by ideaMaker. This is also evident from the sharper/thinner distribution in the histogram. A typical layer looks like this:
Cube_40mm_V3_Standard_1Wall_Cura_Layer-8

I'm still searching for a metric which is somewhat correlated with those bulges. I expected the width of the distribution of equivalent radii to be a candidate, but it doesn't look like it is 🤔

@XDA-Bam
Copy link
Contributor

XDA-Bam commented Jun 4, 2020

@Lord-Quake Could you run the standard print sliced by ideaMaker again, but this time, comment out NOMORE(vmax_junction_sqr, limit_sqr); in planner.cpp? If you do this, it should be irrelevant if you use the LUT or not. But let's use it - just to be close to the defaults.

@Lord-Quake
Copy link
Contributor

Both instances? Lines 2505 and 2389

@XDA-Bam
Copy link
Contributor

XDA-Bam commented Jun 4, 2020

There should be only a single instance of this exact line in the latest bugfix-2.0.x. This one:

NOMORE(vmax_junction_sqr, limit_sqr);

@Lord-Quake
Copy link
Contributor

Yes, you correct.... will do the test now.

@Lord-Quake
Copy link
Contributor

Well, this took a lot longer than I had wanted because my nozzle crashed in to the bed due to a little piece of filament stuck to the nozzle while leveling.....
After readjusting the printer and setting a new Z height I was able to print this.
Check it out :-)
Stutter_Test_4
Just beautiful :-)
Take a look at the round curvature, how clean it now is. Compare this to the other prints where bugles are visible. The wall is also completely free of any bulges or line distortions.

In my opinion this is how Marlin should look like :-)

I will be doing more tests, not that this a fluke but I expect the other tests will show the same results.

@qwewer0
Copy link
Contributor

qwewer0 commented Jun 4, 2020

Even if it's a stupid question, but why do we need these limits: vmax_junction_sqr and limit_sqr?
What would happen if we don't have those limits?

@XDA-Bam
Copy link
Contributor

XDA-Bam commented Jun 4, 2020

Just beautiful :-)
Take a look at the round curvature, how clean it now is. Compare this to the other prints where bugles are visible. The wall is also completely free of any bulges or line distortions.

Yeah, I agree. Also, that 100% confirms what I was suspecting:

  • The problem is not the number of tiny segments itself
  • A lack of CPU cycles is not the main problem (not totally sure the LUT code wasn't optimized out by the compiler, though)
  • The stutter & bulges are induced by limit_sqr setting excessively low junction speeds

As I said before, I also think that those excessively low junction speeds come from the fact, that the estimated radius is too low - most likely because some segments are aligned incorrectly by the slicer.

Thank you very much for testing this again! 🍻

@XDA-Bam
Copy link
Contributor

XDA-Bam commented Jun 4, 2020

Even if it's a stupid question, but why do we need these limits: vmax_junction_sqr and limit_sqr?
What would happen if we don't have those limits?

The normal JD code determines junction speeds only based on junction angles. That means, if you slice an identical curve with a different number of elements, the one with more elements would print faster (because each junction angle is smaller, when there's more elements).
The additional code computing limit_sqr uses both, segment length and junction angle, to determine junction speeds. It basically assumes, that the current segment being planned is part of a bigger circle constructed of copies of this segment and then calculates that circles' radius. If everything were sliced perfectly and we had infinite steps per mm resolution, this approach would work fine. But it isn't and we don't 😄

@Lord-Quake
Copy link
Contributor

I did some more tests. Firstly I designed a new test object for quicker prints. I call it "Boomerang" :-)

Here the file if anyone wants to test: Boomerang.zip

One test was with 1 wall (0.4mm) and the other with 2 walls (2x0.4mm).
Stutter_5
Pretty, aren't they.
Observe what happens when the fan gradually speeds up.....

Parts in full view:
Stutter_6

Looking good right? Well, yes.....

However, I think I've found a "new" issue that probably is completely unrelated to our issue.
Look at the boomerangs.... I've been observing the issue ever since I started these tests.
Can you see it? :-)

@XDA-Bam
Copy link
Contributor

XDA-Bam commented Jun 4, 2020

However, I think I've found a "new" issue that probably is completely unrelated to our issue.
Look at the boomerangs.... I've been observing the issue ever since I started these tests.
Can you see it? :-)

That line from 10 o'clock to 16 o'clock on the brim? 😀

@Lord-Quake
Copy link
Contributor

Lord-Quake commented Jun 4, 2020

(buzzer) Incorrect ;-)

@Lord-Quake
Copy link
Contributor

Closeup (for the blind) Haha....

Stutter_7

@XDA-Bam
Copy link
Contributor

XDA-Bam commented Jun 4, 2020

Blob on topmost layer?

@Lord-Quake
Copy link
Contributor

(buzzer) We have a winner 👍

Seriously though.... On every print the nozzle halts for a split second on the last layer just before the print finishes. This is new....

@XDA-Bam
Copy link
Contributor

XDA-Bam commented Jun 4, 2020

Weird. Maybe check your buffer size or turn off SLOWDOWN: If any change here influences the blob position, that narrows down the sources of potential bugs.

Back to the main topic: How do we proceed with our limit_sqr-problem? Simply removing the code will uncover the quirks of "pure" JD again. Other ideas?

@AnHardt
Copy link
Member

AnHardt commented Jun 4, 2020

If that blob is about plannerbuffersize segments before the part is ready - try a M400 after the last moving g-code of the part. Could be EEPROM activity, closing files, updating printcounter, ... , things done when finishing a print. M400 will delay that until the part is finally printed - not only finally planed.

@Lord-Quake
Copy link
Contributor

Lord-Quake commented Jun 5, 2020

@AnHardt M400 didn't help. This issue is new and I'll look into it deeper with a separate open issue entry if needed.

@XDA-Bam
Copy link
Contributor

XDA-Bam commented Jun 10, 2020

From my point of view, we have definitely identified that limit_sqr is problematic. As there is no solution yet, I wanted to take the time to quickly visualize how I understand the problem is created (by the slicer, that is). Ultimately, the goal would be to brainstrom possible solutions.

Let's look at some random curve, which we slice into segments of different length. Our hypothetical slicer works well, but it's got some quirks. Now let's look at three of those sliced segments and their junction angles:

Slicing variant A

Sliced_Curve_A

On slicing variant A you can see, that our hypothetical slicer decided to insert a small segment 2. And it distributed the junction angles Theta in a way that the angle 1->2 is close to 180° and the angle 2->3 is significantly below 180°. You could say that most of the curvature is attributed to the junction 2->3.

This is incorrect - the curvature should be distributed evenly amongst both junctions. But it is also not problematic for Marlin. The junction speed for segment 2 is computed using a low block->millimeters but divided by a very small junction_theta = (180° - Theta) << 1 and will be quite high.

const float limit_sqr = (block->millimeters * junction_acceleration) / junction_theta;

For segment 3, the junction speed will be a little bit lower than it should be for a perfectly sliced curve, but because the segment is much longer than 2, the junction speed will still be high enough to not be limiting in most real world scenarios. It is technically too low, but it will most of the time still come out at something like 60 mm/s, which happens to be enough to cause no problems during printing.

Slicing variant B

Sliced_Curve_B

On slicing variant B you can see that our slicer again inserted a small segment, but this time, the junction angles are distributed in reverse: The angle on 1->2 is quite a bit lower than 180°, while the angle for the junction 2->3 is very close to 180°.

This is just as incorrect as slicing variant A. But this time, it also creates a big problem in Marlin: The junction speed for segment 2, which is very short and is divided by an angle junction_theta = (180° - Theta) >> 1, will be significantly lower than the ones before and after. It may be 40 mm/s if we are lucky, but often enough, it can be as low as 20 mm/s and less.

I am very, very certain that this is our main problem with limit_sqr. In short: If the slicer happens to accidentally combine a tiny segment with a junction angle to the preceeding segment which is larger than the "correct" one derived from local curvature, limit_sqr is simply crushed. Thereby, the printer is forced to slow down abruptly, causing stutter and surface imperfections.

Maybe it's the slicer's fault, maybe this comes from the polygon boundaries in the STL, I'm not sure. But I currently see no clean solution. What are your ideas?

@thinkyhead
Copy link
Member

thinkyhead commented Jun 10, 2020

In each scenario you're looking at the eventual curvature / change in angle starting from point 1 and going out to point 3. In a longer scenario you could go out to point N. The total change in angle would seem to be the relevant issue.

What is interesting is that whenever you change angle in a set of 2 or more moves, the line directly from point 1 to your destination will "always" be shorter than the total of all the segment lengths. (That assumes a continuous curve and not a zigzag.) Of course, it's just the hypotenuse…. The shorter it is in relation to the total move length, the more acute the curvature.

So, anyway, given that we keep describing change in angle over multiple segments, it seems like we should be running some algorithm here that looks ahead from the start of the planner to the end of the planner, building on the current momentum, and which can account for all the angular change ahead, attenuating all the way back to the first planner segment.

@thinkyhead
Copy link
Member

…which is in a sense all that "jerk" does. It looks only at the amount of speed change being proposed on each axis by the "nominal speed" move that has been appended to the current buffer. Jerk limits are applied only on junctions where direction changes occur on one or more axes. The rest of the speed attenuation is handled in terms of acceleration and max feedrates per axis. Higher jerk values used to be the trick to get sharp corners.

So, it seems like what might work best is some combination of approaches. If there are weird edge cases that can mess with JD, clearly the long term solution is to have JD look for those cases and manage them by doing some form of averaging.

Perhaps there is some way to see these sudden jumps / drops in speed as they are about to occur and then apply extra logic to solve the dilemma….

@campiosa
Copy link

I'm just a lurker so yell at me if I'm out of order, but I don't see a possible solution to this without reading ahead in the planner (or forcing gcode to use arc segments which obviously isn't a valid solution). So the first question (since I suck at reading C/C++) is how far ahead can we read? And is that a guaranteed amount (blocks I assume) or subject to variation based on other factors?

@thinkyhead
Copy link
Member

@campiosa — We've had a good discussion on approaches to lookahead or "deviation accumulation" in another thread but I can't find the issue number at the moment….

@thinkyhead
Copy link
Member

Duplicate of #17342

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests